I want to scrape the product name, product link, shop name, and shop link from this website (https://www.bukalapak.com/c/perawatan-kecantikan/makeup-bibir?page=2&search%5Brating_gte%5D=4&search%5Btop_seller%5D=1). I fetched the HTML with a REST integration, and that worked. But when I tried to parse the text with SelectHtmlText from the Text and HTML Processing Forge asset, I got nothing. For example, I want to scrape this information (shown in the image) with this selector (
#product-explorer-container > div > div.bl-flex-container > div.bl-flex-item.bl-product-list-wrapper > div > div:nth-child(2) > div:nth-child(3) > div > div:nth-child(1) > div > div > div.bl-product-card__description > div.bl-product-card__description-name > p > a)
and I got nothing. Please help: how can I scrape information from this page, such as the shop name, the product names, the price, and the town? Thank you. Attachments: OML files.
Hi Mohamad,
It's very simple to do web scraping with OutSystems but several steps are needed: get the file, parse it to an HTML document, use the selector, etc.
You can follow a complete step-by-step tutorial, with screenshots, here.
After taking a quick look at your code, I would suggest the following:
Kind regards, João
On your second point (use shorter selectors): would I get a different result with shorter selectors than with the full path?
I would use shorter selectors for several reasons:
João said it all. For the second point, the short answer is yes. A full-path selector is only viable when you're scraping a fairly static website: it is so specific that if some JavaScript changes the DOM content, you can easily fail to get the desired content.
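To illustrate why the full path is fragile, here is a minimal Python sketch (not OutSystems code, and not the SelectHtmlText implementation). It uses only the standard library's html.parser. The two HTML snippets are made-up examples, not the real Bukalapak markup; only the class name comes from this thread. The element's path from the root changes as soon as a wrapper element is inserted, while a lookup by class name still finds it.

```python
from html.parser import HTMLParser

# Class name taken from the thread; the HTML snippets below are invented for illustration.
TARGET_CLASS = "bl-product-card__description-name"


class PathRecorder(HTMLParser):
    """Records the tag path from the root to each element carrying TARGET_CLASS.

    Assumes well-formed markup without unclosed void tags (enough for this demo).
    """

    def __init__(self):
        super().__init__()
        self.stack = []   # currently open tags
        self.paths = []   # one "a > b > c" path per matching element

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)
        classes = (dict(attrs).get("class") or "").split()
        if TARGET_CLASS in classes:
            self.paths.append(" > ".join(self.stack))

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()


def paths_to_target(html):
    parser = PathRecorder()
    parser.feed(html)
    parser.close()
    return parser.paths


# Same product card, before and after a script wraps it in an extra <section>.
before = ('<div><div class="bl-product-card__description-name">'
          '<p><a href="/p/1">Lip tint</a></p></div></div>')
after = ('<div><section><div class="bl-product-card__description-name">'
         '<p><a href="/p/1">Lip tint</a></p></div></section></div>')

path_before = paths_to_target(before)
path_after = paths_to_target(after)

# The class lookup finds the element in both versions, but the path differs,
# so a selector that hard-codes the full path would miss the second version.
print(path_before, path_after)
```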
In your example, follow what João said: get the list of items first, then cycle through them to get the details. For the selector, you can try using only the class name: '.bl-product-card__description-name'. To check it, open the Inspect tools, search the DOM (CTRL+F) in the Inspect window, and paste that selector; you'll see that it picks the names of the products.
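The "get the list, then cycle through it" idea can be sketched outside OutSystems as well. The Python example below is only an illustration under stated assumptions: it uses the standard library's html.parser, the sample HTML is invented, and the shop class name `bl-product-card__store` is a placeholder I have not verified against the real page (only `bl-product-card__description-name` comes from this thread).

```python
from html.parser import HTMLParser

NAME_CLASS = "bl-product-card__description-name"  # from the thread
SHOP_CLASS = "bl-product-card__store"             # assumed placeholder, not verified


class ProductParser(HTMLParser):
    """Builds one record per product by matching on class names only.

    Assumes well-formed markup without unclosed void tags inside the
    matched elements (enough for this demo).
    """

    def __init__(self):
        super().__init__()
        self.products = []
        self.field = None     # "name" or "shop" while inside a matching element
        self.depth = 0        # nesting depth inside the matching element
        self.in_link = False  # True while inside an <a> of the matching element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self.field is None:
            if NAME_CLASS in classes:
                # A name element starts a new product record.
                self.products.append(
                    {"name": "", "link": None, "shop": "", "shop_link": None})
                self.field, self.depth = "name", 1
            elif SHOP_CLASS in classes and self.products:
                self.field, self.depth = "shop", 1
            return
        self.depth += 1
        if tag == "a":
            self.in_link = True
            key = "link" if self.field == "name" else "shop_link"
            if self.products[-1][key] is None:
                self.products[-1][key] = attrs.get("href")

    def handle_endtag(self, tag):
        if self.field is None:
            return
        if tag == "a":
            self.in_link = False
        self.depth -= 1
        if self.depth == 0:
            self.field = None  # left the matching element

    def handle_data(self, data):
        if self.field and self.in_link:
            self.products[-1][self.field] += data.strip()


# Invented sample markup standing in for the fetched page.
SAMPLE_HTML = """
<div class="card">
  <div class="bl-product-card__description-name"><p><a href="/p/lip-tint">Lip Tint</a></p></div>
  <div class="bl-product-card__store"><a href="/u/shop-a">Shop A</a></div>
</div>
<div class="card">
  <div class="bl-product-card__description-name"><p><a href="/p/lip-balm">Lip Balm</a></p></div>
  <div class="bl-product-card__store"><a href="/u/shop-b">Shop B</a></div>
</div>
"""

parser = ProductParser()
parser.feed(SAMPLE_HTML)
parser.close()
products = parser.products

# Cycle through the list to read the details of each item.
for product in products:
    print(product["name"], product["link"], product["shop"], product["shop_link"])
```

In OutSystems the same shape would be a short class selector that returns the list, followed by a loop over the results to read each item's details.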
Miguel and Marques, thanks a lot for the information. I tried 1, and it worked. Problem solved, thanks. A shorter selector is the answer.