Data360 Analyze

  • 1.  HTTP node: NoHttpResponseException

    Posted 02-04-2020 14:29

    Dear community,

    I can't get the following to work in either LAE 6.1 or Data360 Analyze (same error message in both):

    I'm trying to use the GET method in the HTTP node to request around 30,000 URLs, all from the same domain. All I want to know is the status message for each one (such as 200 OK).

    Some of these URLs are 'bad' and consistently generate the following error message:

    "A connection error occurred for GET request to http://XXXXXX: Details: org.apache.http.NoHttpResponseException: XXXXX failed to respond.
    Error Code: brain.node.http.transportProtocolError"

    If I paste such a 'bad' URL into a web browser, it gives a response like "Hmmm...can’t reach this page".

    What I would expect is that the HTTP node does NOT just stop in the middle of processing 30,000 URLs, but simply continues and collects such errors on a second output pin (for example). Unfortunately, such functionality does not exist (yet); a second pin cannot even be added...

    Is there perhaps another way to solve or circumvent this issue? The thing is: I have no way of knowing beforehand whether a URL is 'bad'.

    Any help would be greatly appreciated!

    Best regards, Bart.



  • 2.  RE: HTTP node: NoHttpResponseException

    Employee
    Posted 02-05-2020 13:15

    Note: This was originally posted by an inactive account. Content was preserved by moving under an admin account.

    I deleted my first reply and replaced it with this data flow. It gives 3 different options for checking whether a web page exists:

     

    1. Using the subprocess module - it executes a curl command and returns the output from it.

    2. Using the os module - another curl command, but you won't see the output, just a result code.

    3. Using urllib2 - probably the most correct way to do it. You must set the parameter DontRunInContainer for it to work. The documentation explains how to do this in more detail.
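
    For readers without access to the attachment, here is a rough sketch of the idea behind option 3. It is not the content of the attached data flow. It assumes plain Python 3, where urllib.request is the standard-library successor to urllib2, and the names check_url, check_all and the opener parameter are made up for illustration. The point is that every failure is caught per URL and collected in a second list (a stand-in for the "second pin"), so one bad URL does not stop the run.

```python
import http.client
import urllib.error
import urllib.request


def check_url(url, timeout=10, opener=urllib.request.urlopen):
    """Return (url, status, error) for one URL; never raises on a bad URL.

    `opener` is a hypothetical hook so the function can be tried without
    network access; by default it is the real urllib.request.urlopen.
    """
    try:
        response = opener(url, timeout=timeout)
        try:
            return (url, "%d %s" % (response.status, response.reason), None)
        finally:
            response.close()
    except urllib.error.HTTPError as exc:
        # The server did answer, just with an error status such as 404.
        return (url, "%d %s" % (exc.code, exc.reason), None)
    except (OSError, http.client.HTTPException, ValueError) as exc:
        # No usable answer at all: DNS failure, refused or dropped
        # connection, timeout, or a malformed URL.
        return (url, None, "%s: %s" % (type(exc).__name__, exc))


def check_all(urls, timeout=10, opener=urllib.request.urlopen):
    """Split results into (answered, failed) instead of stopping on errors."""
    answered, failed = [], []
    for url in urls:
        result = check_url(url, timeout, opener)
        if result[2] is None:
            answered.append(result)
        else:
            failed.append(result)
    return answered, failed
```

    Calling check_all(list_of_urls) returns two lists: the first holds the status for every URL that got an HTTP answer (including 404 and similar), the second holds the URLs that could not be reached together with the error text.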

     

    Hope this helps!

     

     

    Attached files

    Check if Web Page Exists - 5 Feb 2020.lna

     



  • 3.  RE: HTTP node: NoHttpResponseException

    Posted 02-08-2020 12:30

    Wow, Gerry, this is really, really brilliant!!!!

    Thank you so much, this absolutely solved my problem!!

    The very best regards from The Netherlands, Bart!!