Data360 Analyze

 View Only
  • 1.  Expanding a List into Multiple Columns

    Employee
    Posted 11-20-2017 10:55

    When preparing data you may have to handle a field that contains a list of values that are separated by a delimiter. A previous article described how to expand the list of values to produce a record for each value in the list. However, you may instead want to expand the list values into multiple columns and output them in a single record. The attached .lna file shows how this can be achieved using the Transform node and some Python script to implement an 'Expand List to Columns' node.

    As an example, lets assume your data looks like this:

    You want to expand each of the items in the 'State List' field into separate columns in the output data using the comma (,) character as the delimiter.

    The 'ListField' property is configured with the name of the input field containing the list of values. In this case the 'Delimiter' property can be left blank as the comma character is the default delimiter.

    The node creates new output fields for the list values using the 'ItemField' property. The names of the output fields are based on a stem plus an incrementing suffix. For example if the 'ItemField' property is set to 'State' the output fields will be 'State_1', 'State_2', etc.

    The number of new fields to output is set by the 'NumColumns' property. If not configured the default is to output two fields.

    In the example the data has a varying number of items. By default, the node will treat this as an error condition as the number of items will  not align with the number of output fields. You can change this behavior using the 'UnequalElementsBehavior' property. Having set the property to 'Log' or 'Ignore' you then have the option to either include all the remaining items in the final output field or to drop the remaining items - this is determined by the 'ExtraElementsBehavior' property. The default is to include the remaining items in the final output field.

    If there are fewer items than the number of output fields the trailing fields will contain Null values.

    The node can be configured to either remove or retain any white-space characters around the items. This is set using the 'TrimItem' property. The default is to trim the values.

    Using the configured properties in this example, the node will output the following results:



  • 2.  RE: Expanding a List into Multiple Columns

    Employee
    Posted 10-04-2023 04:27
      |   view attached

    Attaching the missing zip file containing the original .lna 



    ------------------------------
    Adrian Williams
    Precisely Software Inc.
    ------------------------------