We have launched training.mrscraper.com, a collection of dummy websites to practice web scraping.
We will be writing a set of guides using these websites, but feel free to create your own standard and AI scrapers to practice against these sites.
You can now create webhooks for specific scrapers. 🎉
To send events to specific scrapers, create a new webhook and toggle the "Apply to specific scrapers" option.
If not toggled, the selected events will be sent for all your scrapers.
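The delivery rule described above can be sketched roughly as follows. This is only an illustration of the behaviour, not MrScraper's actual implementation: the `Webhook` class, the `should_deliver` function, the field names, and the event names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Webhook:
    """Hypothetical webhook configuration (names are assumptions)."""
    url: str
    events: set[str]
    # None models the "Apply to specific scrapers" toggle being off.
    scraper_ids: Optional[set[int]] = None


def should_deliver(webhook: Webhook, event: str, scraper_id: int) -> bool:
    """Return True if `event` from `scraper_id` should be sent to `webhook`."""
    if event not in webhook.events:
        return False
    if webhook.scraper_ids is None:
        # Toggle off: selected events are sent for every scraper.
        return True
    # Toggle on: only the chosen scrapers trigger the webhook.
    return scraper_id in webhook.scraper_ids
```

With this model, a webhook created without the toggle receives its selected events from any scraper, while one created with `scraper_ids={42}` only receives them from scraper 42.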
✨ Added a new option inside the advanced tab to prevent resources from loading.
💡 For example, disabling images and styles when a screenshot is not needed can speed up the scraper considerably.
Added a new type of parser that keeps one or all matches of a regular expression.
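As a rough illustration of what "one or all matches" means, here is a minimal Python sketch using the standard `re` module. The function name `regex_parser` and its `keep_all` parameter are assumptions made for this example and do not reflect the product's internal code.

```python
import re
from typing import Union


def regex_parser(text: str, pattern: str, keep_all: bool = False) -> Union[str, list[str], None]:
    """Keep the first match of `pattern` in `text`, or all matches if `keep_all` is set."""
    if keep_all:
        # Every non-overlapping match, in order of appearance.
        return [m.group(0) for m in re.finditer(pattern, text)]
    match = re.search(pattern, text)
    # None signals that nothing matched.
    return match.group(0) if match else None


first = regex_parser("Price: $19.99, was $24.50", r"\d+\.\d{2}")
every = regex_parser("Price: $19.99, was $24.50", r"\d+\.\d{2}", keep_all=True)
```

Here `first` is `"19.99"` and `every` is `["19.99", "24.50"]`.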
✨ It's now possible to define a timezone for a scheduled scraper!
Additionally, you can now select a default timezone in your profile settings to localize dates and timestamps.
The new results page has the following sections:
- Extracted data: View, copy and download the data from the defined extractors.
- Scraped source code: View, copy and download the HTML of the scraped website. Useful for debugging unsuccessful scrapings, along with the free HTML scraper tool.
- Screenshot (screen or full-page): If the scraper fails or the screenshot option is enabled in the scraper, a partial or full-page screenshot will be displayed here.
It's now possible to save a screenshot of the scraped page.
Screenshots are available for all paid plans, and full-page screenshots for the Ultimate and Business plans.
If a scraping fails, a screenshot is attached even if the screenshot option is disabled. Screenshots for failed scrapings are also available in the free tier.
The results page now prints the extracted data with unescaped Unicode characters, such as Japanese or Chinese text.
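To show the difference between escaped and unescaped output, here is a small Python sketch using the standard `json` module. It only illustrates the general idea; the sample data is made up, and this is not necessarily how the results page renders its output.

```python
import json

# Made-up sample of extracted data containing non-ASCII text.
data = {"title": "東京タワー", "city": "北京"}

# Default behaviour: non-ASCII characters become \uXXXX escape sequences.
escaped = json.dumps(data)

# With ensure_ascii=False the characters are written as-is, which is easier to read.
readable = json.dumps(data, ensure_ascii=False)

print(escaped)
print(readable)
```

Both strings decode to the same data; only the second one is readable at a glance.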
It's now possible to share a scraper configuration with other users (or the support team if you need help debugging your extractors).
You will find the share action on the edit/view scraper pages, in the dropdown menu.
Clicking the share button opens the configuration menu, where you can enable or disable sharing and copy the shareable link.
Fixed an error where the URL validation logic rejected URLs containing non-ASCII characters (Japanese, Chinese, German, etc.).
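If you need to handle such URLs in your own code, one common approach is to convert them to an ASCII-only form before validating or requesting them. The sketch below uses only Python's standard library and is an assumption about how one might do this, not a description of the fix itself. The function name `to_ascii_url` is hypothetical.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_ascii_url(url: str) -> str:
    """Return an ASCII-only form of `url` (userinfo and fragment encoding are not handled)."""
    parts = urlsplit(url)
    # Internationalized domain names are converted to their "xn--" (Punycode) form.
    host = (parts.hostname or "").encode("idna").decode("ascii")
    netloc = f"{host}:{parts.port}" if parts.port is not None else host
    # Percent-encode non-ASCII characters; "%" is kept so existing escapes are not doubled.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, netloc, path, query, parts.fragment))


ascii_url = to_ascii_url("https://münchen.de/straße?q=日本語")
```

URLs that are already ASCII pass through unchanged, so the function can be applied to every input.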