Assure DQ

  • 1.  Check for Duplicates - Doesn't always work

    Posted 07-17-2020 08:58

    We have the option selected on all our controls to check for duplicates before running file captures. However, this check doesn't always seem to work. We have a folder set up that receives files daily. Every day the control runs to capture the new version of a file that is placed in the folder. If we don't receive a file for that day, and the file in the folder is still the same one from the last time the control ran, the duplicate check detects this and the control fails because of a duplicate data source. This is how we expect the control to work. The problem is that if the same file that was sent to us yesterday is sent again today, the duplicate check does not recognize it as the same file, and the control runs and captures the same data twice.

    Is there a way for the duplicate check to know that we received a duplicate file? It only seems to detect the case where no new file was received at all.



  • 2.  RE: Check for Duplicates - Doesn't always work

    Employee
    Posted 09-23-2021 15:01

    Hi John,

    Based on the scenario you describe, most likely either the files have a different header (which produces a different hash value) or the duplicate file check is not looking back enough cycles to see the prior input.

    Assure builds a hash for the file, which is then compared to detect a duplicate, so the files need to be identical. If the file has a header with a date in it, and yesterday's and today's files carry different dates, then even if the records are identical we would still see them as different files because the hash values differ.
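
    To illustrate why a dated header defeats a hash-based check, here is a minimal sketch. It does not show Assure's internals: the choice of SHA-256, the sample file contents, and the helper names `file_hash` and `body_hash` are all assumptions made for the example.

```python
import hashlib


def file_hash(data: bytes) -> str:
    """Hash the full file contents (SHA-256 is an assumption for illustration)."""
    return hashlib.sha256(data).hexdigest()


def body_hash(data: bytes, header_lines: int = 1) -> str:
    """Hash only the records, skipping the first `header_lines` lines."""
    lines = data.splitlines(keepends=True)
    return hashlib.sha256(b"".join(lines[header_lines:])).hexdigest()


# Hypothetical files: identical records, but the header carries the send date.
yesterday = b"REPORT DATE 2020-07-16\nA,1\nB,2\n"
today = b"REPORT DATE 2020-07-17\nA,1\nB,2\n"

# Whole-file hashes differ because one byte in the header changed.
same_full = file_hash(yesterday) == file_hash(today)

# Hashes of the records alone match, so the data itself is a duplicate.
same_body = body_hash(yesterday) == body_hash(today)

print("full-file match:", same_full)
print("records-only match:", same_body)
```

    Running a comparison like this on two of the received files, outside of Assure, is one way to confirm whether a header difference is what makes them look distinct.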

    One good resource to troubleshoot this type of issue would be the following page:

    https://<hostname>:<port>/infogixassure/caplist.sp

    Within this "capture source list" page you can filter by control entity, control point, and source. Looking at the stored hash signatures over time should make it easier to diagnose why this is happening. Please note that this page requires "superuser" permissions to access.
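
    The other possibility mentioned above, the check not looking back far enough, can be pictured with a small sketch. This is only an illustration of the idea: the `is_duplicate` function, the `lookback` parameter, and the placeholder hash strings are hypothetical and do not correspond to actual Assure settings or data.

```python
def is_duplicate(new_hash: str, history: list[str], lookback: int) -> bool:
    """Return True if new_hash matches any of the last `lookback` stored hashes."""
    if lookback <= 0:
        return False
    return new_hash in history[-lookback:]


# Hypothetical stored signatures, oldest first.
history = ["hash_mon", "hash_tue", "hash_wed"]

# Tuesday's file arrives again on Thursday.
incoming = "hash_tue"

found_short = is_duplicate(incoming, history, lookback=1)  # only sees hash_wed
found_long = is_duplicate(incoming, history, lookback=2)   # sees hash_tue too

print("lookback=1:", found_short)
print("lookback=2:", found_long)
```

    If the stored signatures on the page do show a matching hash from an earlier cycle, a window that is too short would be the more likely explanation than a header difference.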

    Matthew Kennedy