Data360 Analyze

  • 1.  String Percent Match

    Employee
    Posted 10-11-2019 07:56

    Note: This was originally posted by an inactive account. Content was preserved by moving under an admin account.

    Is there an experimental module or Python code that can take two strings and return a percentage match between them? I tried using the fuzzy nodes, but that's not what I'm looking for.



  • 2.  RE: String Percent Match

    Employee
    Posted 10-17-2019 05:22

    You could investigate some of the options described in this Stack Overflow post.

     

    Here is the equivalent of the first example in the post using a Transform node.


    #### ConfigureFields Script

    from difflib import SequenceMatcher

    #Configure all fields from input 'in1' to be mapped
    #to the corresponding fields on the output 'out1'
    out1 += in1

    out1.MatchRatio = float

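    def strSimilar(a, b):
      #Return Null if either input is Null; otherwise return the
      #similarity ratio, a float from 0 (no match) to 1 (identical).
      #Multiply by 100 if a percentage is required.
      if a is Null or b is Null:
        return Null
      return SequenceMatcher(None, a, b).ratio()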

    #### End ConfigureFields Script

     

    #### ProcessRecords Script

    #Copy all fields from input 'in1' to the corresponding output fields
    #in output 'out1'. Copies the values of the fields that have been setup
    #in the mapping defined in the ConfigureFields property
    out1 += in1

    out1.MatchRatio = strSimilar(fields.StringA, fields.StringB)

    #### End ProcessRecords Script
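
If it helps to try the comparison outside the node first, here is a minimal standalone sketch in plain Python. It assumes ordinary Python `None` in place of the node's `Null`, and the function name `percent_match` is hypothetical, chosen for illustration only. It scales the `SequenceMatcher` ratio to a percentage, since that is what the original question asked for.

```python
from difflib import SequenceMatcher


def percent_match(a, b):
    """Return the similarity of two strings as a percentage (0.0 to 100.0).

    Returns None if either input is None, mirroring the Null handling
    in the Transform node scripts.
    """
    if a is None or b is None:
        return None
    # ratio() is 2*M/T, where M is the number of matching characters
    # and T is the combined length of both strings.
    return 100.0 * SequenceMatcher(None, a, b).ratio()


if __name__ == "__main__":
    print(percent_match("abcd", "abcd"))  # identical strings
    print(percent_match("abcd", "bcde"))  # shares the run "bcd"
    print(percent_match("abcd", None))    # missing input
```

Note that the ratio depends on the lengths of both strings, so a short string contained in a much longer one still scores low. Whether that suits your data is worth checking on a few sample pairs.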

     

    Note: if you want to use third-party Python packages, you may need to use the Python node rather than the Transform node.