Data360 Analyze

Non accepted characters with "Modify Fields" for UNICODE-to-STRING conversion

1. Non accepted characters with "Modify Fields" for UNICODE-to-STRING conversion

Stephane TELLA
Posted 07-11-2019 07:03

I am facing an issue that did not occur when I worked with Lavastorm on the exact same database.

I imported a CSV file with "FileCharacterSet" set to Autodetect.

The TIERSNOM field's type is Unicode.

I am using "Modify Fields" to change the type to String.

I get the error message shown in the attachment.

I also tried setting "FileCharacterSet" to:

UNICODE BOM

UTF-8

ISO 8859-1

The issue is the same in each case.
2. RE: Non accepted characters with "Modify Fields" for UNICODE-to-STRING conversion

Employee

Adrian Williams
Posted 07-15-2019 07:03

I cannot see your attachment with the error message. Can you please re-post it and, if possible, include a sanitized sample of the data that causes the error?

When I import some sample data from a .csv file using Autodetect for the FileCharacterSet property value, the data is imported with a unicode data type. If I then set the Type property for a field in the Modify Fields node, the data is converted to the string data type.
3. RE: Non accepted characters with "Modify Fields" for UNICODE-to-STRING conversion

Stephane TELLA
Posted 07-15-2019 08:10

Here is the error message.
4. RE: Non accepted characters with "Modify Fields" for UNICODE-to-STRING conversion

Employee

Adrian Williams
Posted 07-16-2019 05:57

I assume the undefined characters in the value generating the error have acute accents. I created some test data (see attached file) and imported it using the Autodetect option on the CSV/Delimited File node. The conversion to string type in the Modify Fields node also worked as expected.

Can you open your source CSV data file in Notepad++ and check the encoding being reported for the file?
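For a file that is too large to open comfortably in an editor, a rough equivalent of that check can be scripted. The following is only a sketch in standalone Python 3.7+ (not something run inside Analyze): the helper name `sniff_encoding` and its labels are made up for illustration, and it can only tell "valid UTF-8" from "not UTF-8"; it cannot distinguish Windows-1252 from ISO-8859-1.

```python
import codecs


def sniff_encoding(path, chunk_size=1 << 20):
    """Return a rough label for a file's encoding (hypothetical helper).

    Reads the file in chunks so that very large files are not loaded
    into memory at once.
    """
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    saw_non_ascii = False
    with open(path, "rb") as handle:
        head = handle.read(4)
        # A byte order mark, if present, settles the question immediately.
        if head.startswith(codecs.BOM_UTF8):
            return "utf-8 with BOM"
        if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE,
                            codecs.BOM_UTF32_BE)):
            return "utf-16 or utf-32 with BOM"
        chunk = head + handle.read(chunk_size)
        while chunk:
            if not chunk.isascii():
                saw_non_ascii = True
            try:
                # The incremental decoder copes with multi-byte sequences
                # that are split across chunk boundaries.
                decoder.decode(chunk)
            except UnicodeDecodeError:
                return "not utf-8 (possibly windows-1252 or iso-8859-1)"
            chunk = handle.read(chunk_size)
        try:
            # Flush: fails if the file ends in the middle of a sequence.
            decoder.decode(b"", final=True)
        except UnicodeDecodeError:
            return "not utf-8 (possibly windows-1252 or iso-8859-1)"
    return "utf-8" if saw_non_ascii else "ascii"
```

Calling `sniff_encoding("source.csv")` (the file name is a placeholder) returns one of the labels above. A "not utf-8" result on a file containing accented characters would be consistent with a single-byte encoding such as Windows-1252.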

Attached files

TestData_UTF-8.csv
5. RE: Non accepted characters with "Modify Fields" for UNICODE-to-STRING conversion

Employee

Adrian Williams
Posted 07-16-2019 06:01

It would be helpful to have a small sample of the data that is causing the issue.

If required, you can use the "Submit a request" link to open a ticket and upload the data to us.
6. RE: Non accepted characters with "Modify Fields" for UNICODE-to-STRING conversion

Stephane TELLA
Posted 07-16-2019 06:50

My source is a 3 GB file.

I've identified these lines:
7. RE: Non accepted characters with "Modify Fields" for UNICODE-to-STRING conversion

Employee

Adrian Williams
Posted 07-17-2019 04:57

In Data3Sixty Analyze, can you add another CSV/Delimited File node to the canvas and configure it to import the data as before? This time, switch to the Define tab in the node properties panel, scroll down to the bottom, and add a third output pin to the node.

When you run the node there will be a single record on the out3 pin which provides details of the results of the auto-detection process. Please post this information to us.

The characters you indicate in your post are valid for the Windows-1252 code page and the ISO-8859-1 character set, so there should be no issue handling those characters in Analyze within either a unicode data type field or a string type field.
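As a quick illustration of the point about code pages, the e with an acute accent is a single byte (0xE9) in both Windows-1252 and ISO-8859-1, but a two-byte sequence in UTF-8, and it has no ASCII representation at all. The sketch below uses standalone Python 3, not an Analyze node, and the helper names are made up for illustration.

```python
E_ACUTE = "\u00e9"  # e with acute accent


def decodes_as(raw: bytes, encoding: str) -> bool:
    """True if the bytes decode to E_ACUTE under the given encoding."""
    try:
        return raw.decode(encoding) == E_ACUTE
    except UnicodeDecodeError:
        return False


def can_encode(text: str, encoding: str) -> bool:
    """True if the text is representable in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


single_byte = b"\xe9"

print(decodes_as(single_byte, "iso-8859-1"))  # single-byte Latin-1 value
print(decodes_as(single_byte, "cp1252"))      # same byte in Windows-1252
print(decodes_as(single_byte, "utf-8"))       # a lone 0xE9 is not valid UTF-8
print(E_ACUTE.encode("utf-8"))                # two bytes in UTF-8
print(can_encode(E_ACUTE, "ascii"))           # not representable in ASCII
```

If a conversion step assumes ASCII rather than one of these code pages, the last check is the situation that would produce an error for accented characters.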

I created a text file with the data you indicate causes the issues (attached). The view of the data in a Hex editor is this:

The highlighted byte is the first e with the acute accent (0xE9). When this data is viewed in Notepad++ the characters are displayed as expected:

Can you also let us know what locale your machine is configured to use?
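If a Python 3 interpreter is available on the machine, one rough way to report the locale-related defaults is sketched below. This only shows what a standalone Python process sees; it is an assumption that this matches what the Analyze server uses, and the function name is made up for illustration.

```python
import locale
import sys


def describe_text_defaults() -> dict:
    """Collect the locale and default encodings seen by this Python process."""
    return {
        # Encoding Python would use by default for text files on this machine.
        "preferred_encoding": locale.getpreferredencoding(False),
        # (language code, encoding) pair of the current locale; may contain None.
        "locale": locale.getlocale(),
        # Encoding used for file names.
        "filesystem_encoding": sys.getfilesystemencoding(),
    }


if __name__ == "__main__":
    for key, value in describe_text_defaults().items():
        print(f"{key}: {value}")
```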

Attached files

Identified_Lines_w_Extended_Chars.txt
8. RE: Non accepted characters with "Modify Fields" for UNICODE-to-STRING conversion

Employee

Adrian Williams
Posted 07-17-2019 06:22

If you want to replace the problematic characters you could use the following in a Transform node:

out1.field=in1.field.decode("ascii","replace")

or

out1.field=in1.field.decode("ascii","ignore")
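To see what the "replace" and "ignore" error handlers do to accented characters, here is a small standalone sketch. It assumes plain Python 3, where `str` is already Unicode and the conversion to ASCII therefore goes through `encode`; the Transform node's Python version and field objects may behave differently. The helper name `to_ascii` and the sample value are made up for illustration.

```python
def to_ascii(value: str, errors: str = "replace") -> str:
    """Force a Unicode string down to ASCII.

    errors="replace" substitutes "?" for each unencodable character;
    errors="ignore" drops those characters entirely.
    """
    return value.encode("ascii", errors).decode("ascii")


sample = "Société Générale"  # hypothetical value with acute accents

print(to_ascii(sample, "replace"))  # Soci?t? G?n?rale
print(to_ascii(sample, "ignore"))   # Socit Gnrale
```

Both handlers lose information, so they are best suited to cases where the downstream consumer genuinely requires ASCII and the original value is retained elsewhere.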
