How to retrieve a value from an external page?

We have the following challenge:

We connect to a supplier's web service, and in some situations we receive a URL as the response. A number is "visible" on that page, and we need that number in our system. We could ask the customer of our application to copy and paste it from the page shown in an iframe, but that's not very customer-friendly.

What we tried:

1) Used httprequesthandler.getrequest_submit to retrieve the HTML of the page.
    That didn't work because the page uses Telerik Kendo (JavaScript) to fill in the content after the HTML has loaded, so the value is not present in the HTML as it is first delivered.

2) JavaScript
     Reading a page from a different domain with JavaScript isn't allowed in the browser.
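
To illustrate why the first attempt comes back empty, here is a minimal Python sketch. The sample HTML, the element id `alert-number` and the value `12345` are all made up for illustration; the supplier's real markup is not shown in this thread. A plain HTTP fetch only ever sees the placeholder, because the script that fills it in never runs outside a browser.

```python
from html.parser import HTMLParser


class TextById(HTMLParser):
    """Collect the text inside the first element with a given id.

    Simplification: assumes the target element contains no void tags
    such as <br>, which would throw off the depth counter.
    """

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0   # greater than 0 while inside the target element
        self.text = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.depth += 1
        elif dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.text.append(data)


def text_by_id(html, target_id):
    parser = TextById(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.text).strip()


# Hypothetical page source as a server-side HTTP request would receive it.
STATIC_HTML = """
<html><body>
  <span id="alert-number"></span>
  <script>
    // Only a browser would execute this and fill in the span.
    document.getElementById("alert-number").textContent = "12345";
  </script>
</body></html>
"""

print(repr(text_by_id(STATIC_HTML, "alert-number")))  # prints ''
```

The span is empty in the delivered HTML, which matches what we saw: the structure is there, but the value is not.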

A solution would be a service or component that renders the page completely and then returns the whole page source.

Unfortunately, the supplier doesn't have a proper return structure for their web service on their roadmap :(

I hope someone has an idea to tackle this problem...

Peter,

Maybe this post can help you out!

Is Web Scraping possible using Outsystems ?


Regards


Hi César,

Thanks for your reply. That thread actually uses the same approach as I did: httprequesthandler or ardohttp, and then retrieving the value from the response using Index or XML (which can use XPath).

But my problem is that what I retrieve is only the bare HTML structure, because the page uses some JavaScript libraries that connect to a DB and insert the value into the HTML DOM after the HTML response has been received. The URL of my test page is:

"https://nalert.ncontrol.nl/?n=2&c=dev-demo&a=pids-%7C"

If you load that page you can see my problem.

If I look at the source of the page once it's fully loaded in the browser, I see the correct value.


You need to tell your provider to have a proper UI. You aren't going to do this with any sort of ease, in OutSystems, .NET, Java, PHP, Ruby... you *might* be able to do this in Node.js. MIGHT. At best, you are going to have to do something like run webkit or the IE components, or something similar to render the page, run the JavaScript, and then provide the *resulting* HTML to you. There's no way this is happening in a reasonable fashion. Your provider needs a real API.

J.Ja

Thanks for responding!

We tried to build a solution (in a time-boxed POC), but it is indeed too complex to make it stable (and well-performing).

We asked the supplier to make a real API for this and they are planning it for the near future. Until then, we show the page in a small iframe where the customer can get the value using copy/paste (if they want it), and we will change this once the API is ready.



Hi Peter.

Looking at the network transfers on that page, I can see that some data is fetched using a REST service.

That led me to this URL: https://nalert.ncontrol.nl/api/NAlert/GetNAlert2Data?token=pids-%7C&check=dev-demo

This is a standard JSON REST web service. You could easily import it in Service Studio and read data from it. You don't even need to parse HTML and all that nasty stuff, let alone run JavaScript!
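
Outside Service Studio, the same call could be sketched in Python roughly as follows. Two things here are assumptions: the mapping of the page parameters `a` and `c` to `token` and `check` is inferred only from the two example URLs in this thread, and the JSON structure is not known here, so the decoded response is returned as-is. The function names are made up for illustration.

```python
import json
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import urlopen

# Endpoint taken from the URL found in the browser's network tab.
API_ENDPOINT = "https://nalert.ncontrol.nl/api/NAlert/GetNAlert2Data"


def api_url_from_page_url(page_url: str) -> str:
    """Build the REST URL from the page URL that the supplier returns.

    Assumption: page parameter `a` maps to `token` and `c` maps to `check`.
    """
    params = parse_qs(urlsplit(page_url).query)
    query = urlencode({"token": params["a"][0], "check": params["c"][0]})
    return f"{API_ENDPOINT}?{query}"


def fetch_alert_data(page_url: str, timeout: float = 10.0):
    """Call the REST service and return the decoded JSON, structure unchanged."""
    with urlopen(api_url_from_page_url(page_url), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    page = "https://nalert.ncontrol.nl/?n=2&c=dev-demo&a=pids-%7C"
    # Only builds and prints the URL; no network request is made here.
    print(api_url_from_page_url(page))
```

For the test page, the URL built this way is the same one shown above. Which field of the response holds the number would still have to be checked against the actual JSON.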

Solution

Hi Leonardo,

Thanks for your reply. Now I've learned a new way to inspect a page and its behavior. :)

It does indeed work, but we don't know yet whether we will receive all the correct values in a production scenario. Still, this is a very nice way to avoid loading the page, processing JavaScript, etc. We can now continue with this case.
