Data360 Analyze

  • 1.  Directory List search takes an eternity

    Posted 10-14-2024 05:15

    Using the 'Directory List' node to search for a partial filename does not work if the folder has a large number of files; in my case, 100K+ CSV files.  I've had to kill the process because it runs for over 20 minutes, and I have several files to search for every day.

    Is there a more efficient/different node combination or Python code that works well for this situation? 

    Example filename for the re-stated Sep 2024 month-end TB as at 10th Oct:
    TB_monthend_IFRS_SETTLE_Europe_20240930_20241010_20241011015416.csv

    The partial search uses 'TB_monthend_IFRS_SETTLE_Europe_20240930_202410*'.  This works well if a small subset of files is manually copied across to a temporary folder, but that defeats the purpose of an automated scheduled solution.

    Any help greatly appreciated.



    ------------------------------
    Michael Goulbourn
    AD, Change Management
    RBC Capital Markets
    TORONTO
    ------------------------------


  • 2.  RE: Directory List search takes an eternity

    Employee
    Posted 10-15-2024 08:53

    I have not been able to reproduce the poor performance that you described. When I tested the Directory List node against a directory with 100K+ files, the node completed successfully in about 30 seconds on a desktop system. Can you check whether your machine has anti-virus software that uses on-access scanning, as this may be the cause of the delays? If so, you may want to re-test with it temporarily disabled.

    I have attached a text file containing an example of how the directory search functionality can be implemented using a Transform node and Python scripts. However, in my tests the Python approach was slower than the Directory List node.
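
    For readers without access to the attachment, here is a minimal sketch of a partial-filename search in plain Python. It is not the attached script and does not use any Transform node API; the function name find_files is hypothetical, and it assumes the search is a shell-style wildcard match against file names in a single folder (no sub-folders).

```python
import fnmatch
import os


def find_files(directory, pattern):
    """Return the sorted names of regular files in `directory` that match `pattern`.

    `pattern` is a shell-style wildcard such as 'TB_monthend_*_202410*'.
    Hypothetical helper for illustration; sub-folders are not searched.
    """
    matches = []
    # os.scandir yields entries one at a time instead of building the
    # full listing first, which helps with very large folders.
    with os.scandir(directory) as entries:
        for entry in entries:
            # Compare the name first (cheap), then confirm it is a file.
            # fnmatchcase keeps the comparison case-sensitive on all platforms.
            if fnmatch.fnmatchcase(entry.name, pattern) and entry.is_file():
                matches.append(entry.name)
    return sorted(matches)
```

    Calling find_files(folder, 'TB_monthend_IFRS_SETTLE_Europe_20240930_202410*') with the pattern from the original post would return the matching file names. Whether this is faster than the Directory List node will depend on the machine and file system, so it is worth timing both.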

    If the issue persists, you may want to create a support case.



    ------------------------------
    Adrian Williams
    Precisely Software Inc.
    ------------------------------