Data360 Analyze

  • 1.  auto deleting file by

    Posted 01-24-2023 03:22
    Edited by Grzegorz Mazur 01-24-2023 03:31
    Hi.

    I would appreciate any advice on deleting files from an overloaded path. We currently have a path with over 57k files, and the number is still growing.
    Fetching the list of files from the UI takes around 10-15 minutes, and deleting files manually in that folder is awkward and time consuming; the app freezes and responds very slowly once we navigate to that path. Deleting 1k files, for example, is more or less impossible. Our goal is to build a new data flow, run by the scheduler, that automatically deletes files older than X. Is there any way to achieve this, other than deleting the folder from the UI?

    Thank you in advance for your advice.
    Best,
    Grzesiek


  • 2.  RE: auto deleting file by

    Posted 01-26-2023 14:46
    Read the directory using a Directory List node.

    Using the date columns from the output of the Directory List node (created or modified date), filter and split out the files that are older than X.

    Then link that to a Transform node, and in your Transform use this script to loop through and delete the files:

       Configure Fields:
        import os

       Process Records:
        os.remove(in1.FileName)

    It should loop through each file and delete it one by one
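
    As a rough sketch only, the age check could also be done inside the same Transform script instead of a separate filter step. The 30-day cutoff and the os.path guards below are assumptions for illustration; only the in1.FileName field comes from the snippet above, and it is assumed to hold the full path to each file:

       Configure Fields:
        import os
        import time

        # Assumed cutoff for this example: files not modified in the last 30 days
        CUTOFF_SECONDS = 30 * 24 * 60 * 60

       Process Records:
        # This block runs once per input record from the Directory List node
        path = in1.FileName
        # Only delete regular files that are older than the cutoff
        if os.path.isfile(path) and (time.time() - os.path.getmtime(path)) > CUTOFF_SECONDS:
            os.remove(path)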

    ------------------------------
    Alex Day
    Knowledge Community Shared Account
    ------------------------------



  • 3.  RE: auto deleting file by

    Posted 01-27-2023 05:00
    Edited by Grzegorz Mazur 01-27-2023 05:08
    Hi Alex,

    Thank you for your reply.
    In the meantime, I had a conversation with Akshita yesterday; thank you both for your help.
    I have tried the Directory List node with the path to the folder, but the outcome is "directory cannot be found".
    The directory cannot be found even though I copied and pasted the path, so there is no typo.

    I am wondering out loud whether I should pass the path on the server side instead?

    Best,
    Grzesiek



  • 4.  RE: auto deleting file by

    Posted 01-31-2023 04:11
    Edited by Henrik B 01-31-2023 04:59


  • 5.  RE: auto deleting file by

    Posted 01-31-2023 04:32
    The Transform node is designed to work in a loop already, so a single Transform node with os.remove() will delete every one of the row-separated file paths that you feed into it.

    Knowing that the Transform node loops once for each row of data that you feed into it, your example above looks like it will work, but it doesn't quite make sense. For each file that you want to delete (each record input to your Transform node), you are building another list of all the files in your temp folder and looping through each of those with your delete: instead of deleting one file, your loop deletes everything in the folder for every input record. For example, if you had 5 files to delete in your data flow, you are telling Data360 to list the files in temp and delete all of them, 5 times over.
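
    To make the difference concrete, here is a minimal sketch of the two patterns; the temp-folder path and the nested loop are placeholders standing in for the approach described above, and os is assumed to be imported in Configure Fields as in the earlier example:

       Process Records:
        # Anti-pattern: re-lists the whole temp folder and deletes everything in it,
        # and does so once for every input record
        # for name in os.listdir("/path/to/temp"):
        #     os.remove(os.path.join("/path/to/temp", name))

        # Per-record pattern: delete only the file named in the current record
        os.remove(in1.FileName)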

    ------------------------------
    Alex Day
    Knowledge Community Shared Account
    ------------------------------



  • 6.  RE: auto deleting file by

    Employee
    Posted 02-02-2023 06:55
    Can you clarify the source of the value that you have entered into the Directory List node's DirectoryName property?

    The value appears to be the stem of a Resource Path, for example the Resource Path to a Run State object generated by an Execute Data Flow node.

    If this is the case then you cannot use a Directory List node to access these objects. Run State objects are stored in the Analyze database, they are not files that exist on the machine's filesystem.

    ------------------------------
    Adrian Williams
    Precisely Software Inc.
    ------------------------------



  • 7.  RE: auto deleting file by

    Posted 02-03-2023 06:09
    Edited by Grzegorz Mazur 02-03-2023 06:14
    Hi,

    First of all, I apologize for my late reply; the rules applied in my mailbox hid your response message.
    "If this is the case then you cannot use a Directory List node to access these objects. Run State objects are stored in the Analyze database, they are not files that exist on the machine's filesystem."

    It seems this is our case. We use an 'Execute Data Flow' node to trigger downstream workflows dynamically from scheduled jobs. The scheduled jobs kick off hundreds of level 2 and level 3 workflows, and we expect some errors from L2/L3 that we want to capture and investigate. Here we have an issue: over several months we accumulate thousands of error outputs, which we want to clean up automatically once they are older than X by looping over this folder. But since they are stored in the Analyze DB and we cannot fetch a list of files from the machine's filesystem, the only option is manual work.

    Is there any alternative way for Data360 to access the Analyze DB and clean it up?
    Moreover, we set up auto-deletion of runs after X days under Settings >> Scheduling, but these files are not affected.

    Best,
    Grzesiek





  • 8.  RE: auto deleting file by

    Posted 08-21-2023 10:02

    This helped me with one of our project requirements. Thank you very much for sharing.



    ------------------------------
    clef andrin
    Knowledge Community Shared Account
    ------------------------------