You can define a function in the ConfigureFields property that checks each character individually. The unicodedata module's name() function returns the Unicode name of a character; if that name contains the word CYRILLIC, the character is Cyrillic.
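import unicodedata

out1.x = unicode

def has_cyrillic(text):
    for char in unicode(text):
        # The '' default avoids a ValueError for characters that have no Unicode name (for example, a newline)
        if 'CYRILLIC' in unicodedata.name(char, ''):
            return True
    return False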
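To illustrate what the name lookup returns, here is a minimal sketch meant to be run in a plain Python 3 interpreter rather than inside the node. The helper char_name is a hypothetical name used only for this illustration. Note that unicodedata.name raises a ValueError for characters without a Unicode name (such as control characters) unless a default is passed as the second argument.

```python
import unicodedata

def char_name(char):
    # Hypothetical helper for illustration: return the Unicode name,
    # or '' when the character has no name (e.g. control characters).
    return unicodedata.name(char, '')

# A Latin letter, a Cyrillic letter, and a newline (which has no name)
for ch in ['D', '\u0414', '\n']:
    print(repr(ch), '->', repr(char_name(ch)))
```

Only the Cyrillic letter yields a name containing CYRILLIC, which is what the check relies on.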
In the ProcessRecords property you can use the function in an if statement:
text = in1.possible_cyrillic_text  # Possible Cyrillic text
if has_cyrillic(text):
    out1.x = "Contains Cyrillic characters"
else:
    out1.x = "Does not contain Cyrillic characters"
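If you want to sanity-check the logic outside the node first, the following is a rough standalone sketch. It assumes a plain Python 3 interpreter, where str is already Unicode, so the unicode() call used in the node code is not needed here. The sample string is the one from the original question; the second string is an invented ASCII-only value for comparison.

```python
import unicodedata

def has_cyrillic(text):
    """Return True if any character's Unicode name mentions CYRILLIC."""
    # The '' default keeps unnamed characters (e.g. newlines) from raising ValueError
    return any('CYRILLIC' in unicodedata.name(char, '') for char in text)

# Value taken from the original question
sample = "System of a Down - Официальная Дискография [MP3]"
# Invented ASCII-only value for comparison
plain = "System of a Down [MP3]"

print(has_cyrillic(sample))
print(has_cyrillic(plain))
```

The function returns a Boolean, so the same result could also be written to a Boolean output field and tested by a downstream Split node instead of comparing message strings.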
------------------------------
Ernest Jones
Precisely Software Inc.
PEARL RIVER NY
------------------------------
Original Message:
Sent: 04-18-2023 12:40
From: Dhinmar James Cayog
Subject: Identify records containing cyrillic characters
Hi,
I want to use a Split node to separate records that contain Cyrillic characters.
For example, if Field1 has a value of "System of a Down - Официальная Дискография [MP3]", I want the record to go to the false output pin of the Split node.
Any recommendations on how to achieve this?
Thank you.
------------------------------
Dhinmar James Cayog
Knowledge Community Shared Account
------------------------------