  • 1.  How to acquire data from a website via HTTP calls to a SQL web tool

    Posted 11-12-2019 15:14

    How do I use LAE to replicate the process of signing in to a website, navigating to its SQL web tool, and retrieving data with that tool?

  • 2.  RE: How to acquire data from a website via HTTP calls to a SQL web tool

    Posted 12-18-2019 06:55

    Hi Mark, just posting the resolution that you and Stony came up with here so that others can use it.

    Essentially, this task requires chaining together a series of Filter and HTTP nodes. A Filter node can set up the URL and body you need, and the HTTP node that follows it sends that request, much as a browser would.
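
    For readers who want to see the same split outside LAE, here is a minimal Python sketch, not the LAE implementation. Preparing the request corresponds to the Filter node and sending it corresponds to the HTTP node. The URL and the `query` field name are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_post_request(url, fields):
    """Prepare a form-encoded POST (the part a Filter node would set up)."""
    body = urlencode(fields).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return Request(url, data=body, headers=headers, method="POST")


# Hypothetical endpoint and field name, for illustration only.
request = build_post_request(
    "https://example.com/sqltool/run",
    {"query": "SELECT 1"},
)

# Sending is the HTTP node's job. In plain Python that would be
# urllib.request.urlopen(request); it is left out so the sketch runs offline.
```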

    If the website has an official API for retrieving the data, using it will make this task easier. However, if there is no formal API, you will need to simulate the steps that a desktop browser goes through to retrieve the data.
    The first difficulty you will encounter is how to log in to the site. Any given website may use one (or more) of these schemes:

    1. Open website, no credentials
    2. Trivial login – username/password only
    3. Secured trivial – where they’ve added an SSL certificate that you must install locally before you can access the site
    4. Secured access using username/password within the headers of the HTTP standard
    5. Secured access using username/password in the body of an HTTP POST request
    6. You make one request to the site, and get back a cookie that grants you access
    7. You make one request to the site, and get back a session variable that is then used to go deeper
    8. OAuth 1, OAuth 2
    9. Kerberos
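
    To make schemes 4 and 5 more concrete, here is a hedged Python sketch. It assumes scheme 4 means HTTP Basic authentication, which is one common way of sending credentials in headers, and it assumes the login form fields are called `username` and `password`. The real names come from the target site's login form.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def basic_auth_header(username, password):
    """Scheme 4 (assumed to be HTTP Basic): build the Authorization header value."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def header_login_request(url, username, password):
    """Request carrying the credentials in the Authorization header."""
    return Request(url, headers={"Authorization": basic_auth_header(username, password)})


def body_login_request(url, username, password):
    """Scheme 5: credentials in the body of an HTTP POST.

    The field names 'username' and 'password' are assumptions.
    """
    body = urlencode({"username": username, "password": password}).encode("utf-8")
    return Request(url, data=body, method="POST")
```

    In a graph, the header or body string would be built in a Filter node and passed to the HTTP node as an input field.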

    Often, this task requires a chain of 3 to 5 HTTP nodes. However, the details are different for every website, so expect a fair bit of trial and error before it works.
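
    The chaining idea can be sketched in Python as follows. This is only an illustration of the pattern, with hypothetical helper names (`run_chain`, `make_cookie_opener`): each step builds a request from what earlier steps learned, and a shared cookie jar keeps any access cookie (scheme 6) across requests.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener():
    """Return an opener that stores cookies from responses and resends them."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def run_chain(steps, send):
    """Run request steps in order, passing what each step learns to the next.

    steps: list of (build, parse) pairs, where
        build(context) returns a request that `send` understands, and
        parse(response_text, context) stores values in the context dict.
    send: callable that performs one request and returns the response text.
    """
    context = {}
    for build, parse in steps:
        response_text = send(build(context))
        parse(response_text, context)
    return context
```

    Against a real site, `send` could be something like `lambda req: opener.open(req).read().decode("utf-8")`, using the opener from `make_cookie_opener()`. Each (build, parse) pair then plays the role of one Filter and HTTP node pair.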

    The next difficulty is that some sites make it hard to get to your files even after you are logged in, so retrieving the correct file will likely need another set of nodes. To see what is required, open Google Chrome with no page loaded, open the Developer Tools (F12), and select the Network tab. Then go through the full set of steps to download a file. The requests that Chrome traces there will most likely be the steps you need to emulate with discrete nodes. You can ignore the requests for images and logos, but the login steps should be listed there.
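
    If the trace is long, one option is to export it from the Network tab as a HAR file (a JSON format) and filter it. The sketch below assumes the standard HAR layout (`log.entries`, each with a `request` and a `response`) and skips static assets by MIME type. The list of skipped prefixes is an arbitrary choice for illustration.

```python
import json

# MIME type prefixes treated as static assets; adjust as needed.
SKIPPED_PREFIXES = ("image/", "font/", "text/css")


def interesting_requests(har_text):
    """Return (method, url, status) for HAR entries that are not static assets."""
    har = json.loads(har_text)
    kept = []
    for entry in har["log"]["entries"]:
        mime = entry["response"]["content"].get("mimeType", "")
        if mime.startswith(SKIPPED_PREFIXES):
            continue
        request = entry["request"]
        kept.append((request["method"], request["url"], entry["response"]["status"]))
    return kept
```

    The remaining entries, in order, are a first draft of the node chain to build.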

    Note: you should plan for the fact that any of the schemes above will likely only be “alive” for between 6 and 18 months. It's likely the website will make alterations, and you will often have to start over with coding the logic, so keeping this logic working in the long term will require some work.

    The attached graph provides an example of this. It works for a single website. The website used in this example requires a “token” called CSRF before you can actually issue the login request. Your target website might not require that.
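
    As a rough illustration of the CSRF step (not a reproduction of the attached graph), the Python sketch below assumes the token is delivered as a hidden form field on the login page. The field names `csrf_token`, `username` and `password` are hypothetical, and some sites deliver the token in a cookie or a meta tag instead.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        is_hidden = (attributes.get("type") or "").lower() == "hidden"
        if is_hidden and attributes.get("name"):
            self.hidden[attributes["name"]] = attributes.get("value") or ""


def login_body(login_page_html, username, password, token_field="csrf_token"):
    """Build the login POST body, echoing back the token from the login page."""
    parser = HiddenInputParser()
    parser.feed(login_page_html)
    token = parser.hidden[token_field]  # KeyError if the page has no such field
    return urlencode({token_field: token, "username": username, "password": password})
```

    In node terms: the first HTTP node fetches the login page, a Filter node extracts the token and builds the body, and the second HTTP node posts the login request.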


    Attached files