A reference (ID) on the scraper run request
complete
Holger
Usually, in batch jobs, you have a reference ID (e.g. productId, queryId, etc.), not only the targetUrl.
It would be nice if a scraper run request had a parameter like "reference" (e.g. a string of up to 100 characters) whose value is returned in the result, so that you know what to do with the result.
The URL is usually not the internal reference.
At the moment I encode the internal reference ID and add it to the URL as an anchor, e.g. #534554 (it is not sent to the server), and I get the information back in the result (as part of the URL, from which I can extract the ID).
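For readers who want to try the anchor workaround described above, here is a minimal sketch using only Python's standard library. The function names `attach_reference` and `extract_reference` and the example URL are made up for illustration. It assumes, as reported above, that the scraper result echoes the requested URL including its fragment. Note that it would overwrite any fragment the URL already carries.

```python
from typing import Tuple
from urllib.parse import urldefrag, urlsplit, urlunsplit


def attach_reference(target_url: str, reference_id: str) -> str:
    """Return target_url with reference_id stored in the URL fragment.

    Hypothetical helper: any existing fragment is replaced.
    """
    parts = urlsplit(target_url)
    return urlunsplit(parts._replace(fragment=reference_id))


def extract_reference(result_url: str) -> Tuple[str, str]:
    """Split a returned URL into (url_without_fragment, reference_id)."""
    clean_url, fragment = urldefrag(result_url)
    return clean_url, fragment


# Example URL and ID are placeholders, not real data.
tagged = attach_reference("https://example.com/product?id=7", "534554")
clean, ref = extract_reference(tagged)
print(tagged)
print(clean, ref)
```

The fragment is a client-side part of the URL and is not transmitted in the HTTP request, which is why the target server never sees the ID.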
Cahyo
complete
Cahyo
Closing this request since we have a new "Scraping Run" entity 👍
You can now query all results from a scraping run using the "scraping_run_id" value. More information in the API documentation:
Cahyo
in progress
Cahyo
planned
Cahyo
Hi, thanks for the suggestion.
If I understood correctly, you want all the results from a scraper /run to share the same "group or batch ID"?
And maybe the ability to query the group at a later time?
Thanks
Holger
Yes, that goes in the right direction :-)
Yes, a group or batch_id parameter on the scraper execution request that is returned in the result satisfies my requirements.
The ability to query it is not necessary for my case, but it makes perfect sense, even more so if there is the ability to query by scraper_id in the results.
I need the ability to add an internal reference (e.g. an internal articleId or keywordId) to every single requested URL. But due to our internal structure, I request a single URL per scraper/run, so a reference ID per run would fit.
The abstract scenario that is thereby solved is:
The requested scrape URL is generated from existing objects.
So there is a need to reference back to the generating object.
This is different from the classic scraping case, where you collect the data for all URLs you can grab :-)
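To make the scenario concrete, here is a rough sketch of how a caller might map a result back to the object that generated the URL. Everything in it is an assumption for illustration: the `reference` field is the parameter requested in this thread, not a confirmed API field, and the `Article` class, the request and result shapes, and the example URLs are invented. No network call is made; the result is simulated.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Article:
    article_id: str
    title: str


def build_run_request(article: Article) -> dict:
    """Build one run request per object (hypothetical payload shape)."""
    return {
        "url": f"https://example.com/articles/{article.article_id}",
        # "reference" is the hypothetical parameter requested in this thread.
        "reference": article.article_id,
    }


def resolve(result: dict, index: Dict[str, Article]) -> Optional[Article]:
    """Look up the generating object from the reference echoed in a result."""
    return index.get(result.get("reference", ""))


articles = [Article("a-17", "First"), Article("a-42", "Second")]
index = {a.article_id: a for a in articles}
requests_to_send = [build_run_request(a) for a in articles]

# Simulated result; the real result shape may differ.
fake_result = {"url": requests_to_send[1]["url"], "reference": "a-42", "data": {}}
matched = resolve(fake_result, index)
print(matched)
```

With a reference echoed per run, the lookup no longer depends on parsing the URL.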
Thanks