Data360 Analyze

 View Only
  • 1.  D360 Analyse - Load JSON Lines file using JSON Node

    Posted 11-23-2023 14:43
    Edited by Scott PERRY 11-24-2023 04:09

    Hi All

    Hoping you can help with the following issue. 

    I am trying to load a JSON Lines file using the D360 Analyse JSON node. It loads the data but where there is a value for a key that is a list of values it splits each value in that list into it's own row and then any data on the records (for other keys) are also put onto their own row. Is there a way to stop that from happening using settings or something? 

    See snippet of 3 lines below of a mocked up example of the JSON Lines file i am actually loading (which is 13GB...)

    it's the middle row with the list causing the issue of i.e "Underlying":{"UnderlyingInstrumentISIN":["US01","US02","US3","US4","US5","US6","US7","US8"]}

    each of these UnderlyingInstrumentISIN in the list will be put on it's own row with a full copy of all the other fields populated. I would like to keep it as a list in that field, not split onto separate rows. 

    {"TemplateVersion":"2M1","Header":{"AssetClass":"Equity","InstrumentType":"Option","UseCase":"Single_Name","Level":"InstRefDataReporting"},"ISIN":{"ISIN":"ABC1","Status":"New","StatusReason":"","LastUpdateDateTime":"2023-09-27T23:35:23","Parents":{"UPI":"ABC1UPI1"}},"Derived":{"FullName":"Equity Option Single_Name","ClassificationType":"TYPE1","CommodityDerivativeIndicator":"FALSE","IssuerorOperatoroftheTradingVenueIdentifier":"NA","ShortName":"NA/O 20240627","UnderlyingAssetType":"Single Stock","StrikePriceCurrency":"HKD"},"Attributes":{"NotionalCurrency":"HKD","ExpiryDate":"2024-06-27","UnderlyingInstrumentISIN":"HK23456","OptionType":"CALL","OptionExerciseStyle":"EURO","ValuationMethodorTrigger":"Vanilla","DeliveryType":"CASH","StrikePriceType":{"StrikePriceType":"Monetary Value","StrikePrice":12},"PriceMultiplier":1}}
    {"TemplateVersion":"1M1","Header":{"AssetClass":"Equity","InstrumentType":"Swap","UseCase":"Price_Return","Level":"InstRefDataReporting"},"ISIN":{"ISIN":"CEF1","Status":"New","StatusReason":"","LastUpdateDateTime":"2023-09-27T23:38:37","Parents":{"UPI":"CEF1UPI1"}},"Derived":{"FullName":"Equity Swap1023","ClassificationType":"S","CommodityDerivativeIndicator":"FALSE","IssuerorOperatoroftheTradingVenueIdentifier":"NA","ShortName":"NA","UnderlyingAssetType":"Basket","ISOUnderlyingInstrumentIndex":"","UnderlyingInstrumentIndexTermValue":0,"UnderlyingInstrumentIndexTermUnit":"DAYS"},"Attributes":{"NotionalCurrency":"USD","ExpiryDate":"2023-10-23","Underlying":{"UnderlyingInstrumentISIN":["US01","US02","US3","US4","US5","US6","US7","US8"]},"ReturnorPayoutTrigger":"Price","PriceMultiplier":1,"DeliveryType":"CASH"}}
    {"TemplateVersion":"2M1","Header":{"AssetClass":"Equity","InstrumentType":"Option","UseCase":"Single_Name","Level":"InstRefDataReporting"},"ISIN":{"ISIN":"EZ8","Status":"New","StatusReason":"","LastUpdateDateTime":"2023-09-27T23:38:54","Parents":{"UPI":"QZ9"}},"Derived":{"FullName":"Equity Option 27","ClassificationType":"H","CommodityDerivativeIndicator":"FALSE","IssuerorOperatoroftheTradingVenueIdentifier":"NA","ShortName":"NA HKD 20","UnderlyingAssetType":"Single Stock","StrikePriceCurrency":"HKD"},"Attributes":{"NotionalCurrency":"HKD","ExpiryDate":"2024-06-27","UnderlyingInstrumentISIN":"HK2","OptionType":"CALL","OptionExerciseStyle":"EURO","ValuationMethodorTrigger":"Vanilla","DeliveryType":"CASH","StrikePriceType":{"StrikePriceType":"M","StrikePrice":2},"PriceMultiplier":1}}

    thanks

    Scott



    ------------------------------
    .
    ------------------------------



  • 2.  RE: D360 Analyse - Load JSON Lines file using JSON Node

    Employee
    Posted 11-27-2023 15:53
      |   view attached

    Hi Scott,

    So far, when I've had this problem I have done one of the following:

    • If I don't need that part of the JSON data, I use the ability of the JSON node to split off certain parts of the JSON documents to additional outputs.  So for your example, add an output named Attributes.Underlying.UnderlyingInstrumentISIN and the list of data goes there.  Sadly, I have not been able to figure out how to join that data back with the output on the first pin.

    • If I really need the data and the JSON node does not seem to easily get what I need, I use a Transform node with the json Python library.  I created the following to parse that JSON and write out the list values as a " * " separated list.
      out1 += in1
      
      j = json.loads(in1._RawData)
      
      whatImLookingFor = Null
      if "Attributes" in j:
          attributes = j["Attributes"]
          if "Underlying" in attributes:
              underlying = attributes["Underlying"]
              if "UnderlyingInstrumentISIN" in underlying:
                  whatImLookingFor = underlying["UnderlyingInstrumentISIN"]
      
      if whatImLookingFor is not Null:
          out1.listValues = " * ".join(whatImLookingFor)
      

    I'm attaching the dataflow I used for this.



    ------------------------------
    Ernest Jones
    Precisely Software Inc.
    PEARL RIVER NY
    ------------------------------

    Attachment(s)

    zip
    JSON Adds Lines.zip   5 KB 1 version


  • 3.  RE: D360 Analyse - Load JSON Lines file using JSON Node

    Posted 11-28-2023 03:34

    Hi Ernest, thanks for the detailed reply. I will give it a try using the transform node and let you know. thanks again!



    ------------------------------
    .
    ------------------------------