Data360 Analyze

  • 1.  Locate a string and extrac

    Posted 07-06-2021 13:21

    My nightmare for a week (at night)

    I have a field "justification" which contains text.

    My objective is to locate and extract a reference (the standard format for this reference is "DS/ YYYY.99999-A.99") and to record it in a new field I have named "reference".

    This reference could be anywhere in the text, or it could be missing.

    I have tried many options, like getItem() or regexMatch(), but I think the script is more complicated than I initially thought.

    So far, I have only managed to filter the records which include the reference. I am a complete disaster.

    Do you have any clue?



  • 2.  RE: Locate a string and extrac

    Employee
    Posted 07-07-2021 01:29

    You can use the re.match() function in a Transform node:
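
    For readers without the attachment, here is a minimal sketch of the re.match() approach in plain Python. It is not the contents of the attached data flow: the function name extract_reference, the sample strings, the re.DOTALL flag, and the choice to return None when no reference is found are all assumptions made for illustration. Wiring it to the "justification" input field and the "reference" output field inside the Transform node is left out.

```python
import re

# Assumed pattern for references such as "DS/ 2021.12345-A.01".
# re.match() only matches at the start of the string, so the leading ".*"
# is what allows the reference to appear anywhere in the text.
# re.DOTALL (an assumption) lets ".*" cross line breaks in multi-line text.
PATTERN = re.compile(r'.*(DS/ \d{4}\.\d{5}-A\.\d{2}).*', re.DOTALL)


def extract_reference(justification):
    """Return the reference found in the text, or None if there is none."""
    if justification is None:  # treat an empty/null field as "no reference"
        return None
    match = PATTERN.match(justification)
    return match.group(1) if match else None


print(extract_reference('Approved, see DS/ 2021.12345-A.01 for details'))
# DS/ 2021.12345-A.01
print(extract_reference('No reference in this text'))
# None
```

    If a text contains more than one reference, the greedy leading ".*" means this sketch returns the last one. re.search() with only the bracketed part of the pattern would be an alternative that returns the first.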

     

    The Python documentation for using regular expressions provides more detail.

    See the attached example data flow.

     

     

    Attached files

    Regex_Find_Reference_in_Text - 7 Jul 2021.lna

     



  • 3.  RE: Locate a string and extrac

    Employee
    Posted 07-07-2021 01:43

    You could tighten the pattern a little to better match a year, if required:

    pattern = r'.*(DS/ [1-2]\d\d\d\.\d\d\d\d\d-A\.\d\d).*'

    Which can also be written as:

    pattern = r'.*(DS/ [1-2]\d{3}\.\d{5}-A\.\d{2}).*'
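
    A quick check in plain Python of why the dot after the year should be escaped in both forms. This is only a sketch: the helper find() and the sample strings are made up for illustration. An unescaped "." matches any character, so it would also accept a malformed reference.

```python
import re

long_form = r'.*(DS/ [1-2]\d\d\d\.\d\d\d\d\d-A\.\d\d).*'
short_form = r'.*(DS/ [1-2]\d{3}\.\d{5}-A\.\d{2}).*'
# Same as short_form, but the dot after the year is not escaped.
unescaped = r'.*(DS/ [1-2]\d{3}.\d{5}-A\.\d{2}).*'


def find(pattern, text):
    """Hypothetical helper: return the captured reference, or None."""
    match = re.match(pattern, text)
    return match.group(1) if match else None


good = 'see DS/ 2021.12345-A.01 for details'
bad = 'see DS/ 2021X12345-A.01 for details'  # "X" where the dot should be

print(find(long_form, good) == find(short_form, good))  # True
print(find(short_form, bad))   # None
print(find(unescaped, bad))    # DS/ 2021X12345-A.01
```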

     



  • 4.  RE: Locate a string and extrac

    Posted 08-02-2021 11:47

    Hello Adrian,

    Wow, great, it works.

    I was obviously on the wrong track.

    I went for the following: pattern = r'.*(CR/\d{4}.\d{5}-\D\.\d{2}).*'

    Michel
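
    A small sketch of how the pattern above behaves, in plain Python. In it, \D matches any single non-digit character, not only a letter, and the unescaped "." after the year matches any character. The tighter variant below assumes the revision is always one uppercase ASCII letter, which the thread does not confirm; the helper find() and the sample strings are hypothetical.

```python
import re

as_posted = r'.*(CR/\d{4}.\d{5}-\D\.\d{2}).*'
# Assumption: the revision is a single uppercase letter and the dot is literal.
tighter = r'.*(CR/\d{4}\.\d{5}-[A-Z]\.\d{2}).*'


def find(pattern, text):
    """Hypothetical helper: return the captured reference, or None."""
    match = re.match(pattern, text)
    return match.group(1) if match else None


ok = 'ref CR/2021.12345-B.07 attached'
odd = 'ref CR/2021.12345-#.07 attached'  # "#" is a non-digit, so \D accepts it

print(find(as_posted, ok))   # CR/2021.12345-B.07
print(find(tighter, ok))     # CR/2021.12345-B.07
print(find(as_posted, odd))  # CR/2021.12345-#.07
print(find(tighter, odd))    # None
```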