This has huge potential, and there is clearly a need for it across a lot of industries.
Unfortunately, my tests with some ecommerce product pages had less than stellar results.
Awesome!!! Looking for a replacement for Kimono (https://www.kimonolabs.com/) since they got acquired by Palantir. Need something to power my Slack menu bot ;)
Looks great. Congrats on the launch! I can see some nice applications for this. One suggestion - I understand that it might take a while to scrape the data, but an instant email to say it will be X minutes, or just a notification after email input would be good, to manage expectations. I used it 10 mins ago, and am on the verge of tears that I still have no email... ;-)
Just tried it. Not so great.
1. It does not capture the actual URL; it captures the root/ folder.
2. We can save as JSON and CSV? What's the use of that? We need to call the API each time, right? Or is it a one-time fetch only?
Hey ProductHunters!
Crawly is a free tool I built that uses Diffbot's automatic article extraction API to turn web content into structured data. I've used it to create a centralized database of all of our content, but you could also use it to do content audits / migrations or to analyze your competitors' content.
It's currently limited to 200 pages and to articles only, but I plan to add support for scraping products in the future. Any other features you'd like to see added?
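For anyone wondering what "turning web content into structured data" looks like in practice, here is a rough sketch of the general idea. It is not Crawly's actual code. It assumes Diffbot's v3 Article endpoint, a placeholder token, and that the response carries an `objects` list with fields such as `pageUrl`, `title`, `author`, `date`, and `text`. The helper names are made up for illustration.

```python
import csv
import io
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint for Diffbot's v3 Article API; check the official docs.
DIFFBOT_ARTICLE_ENDPOINT = "https://api.diffbot.com/v3/article"

# Assumed subset of response fields to keep in the export.
FIELDS = ["pageUrl", "title", "author", "date", "text"]


def build_request_url(page_url, token):
    """Build the API request URL for one page (token is a placeholder)."""
    query = urlencode({"token": token, "url": page_url})
    return DIFFBOT_ARTICLE_ENDPOINT + "?" + query


def fetch_article(page_url, token):
    """Call the API once and return the parsed JSON response."""
    with urlopen(build_request_url(page_url, token)) as resp:
        return json.load(resp)


def to_rows(response):
    """Flatten the response into one dict per extracted article."""
    return [
        {field: obj.get(field, "") for field in FIELDS}
        for obj in response.get("objects", [])
    ]


def to_csv(rows):
    """Serialize the flattened rows as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()
```

Once the rows are saved as JSON or CSV, you can work with them offline without calling the API again, which is why an export is handy for audits and migrations.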
@druwynings You should really update that page to clarify the "articles only" caveat, especially when the tagline on your home page is "No rules required".
Seems like a suitable solution for small data scraping tasks. I am going to give it a quick test right now! However, for larger tasks, it makes sense to go with a professional on-request service, like this one https://data-ox.com/webscraping-..., for instance. These guys can also develop a custom tool for specific needs, by the way.