Dru Wynings

Crawly - Never write another web scraper

Turn websites into data in seconds. Crawly spiders and extracts complete structured data from an entire website.

Dru Wynings
Hey ProductHunters! Crawly is a free tool I built that uses Diffbot's automatic article extraction API to turn web content into structured data. I've used it to create a centralized database of all of our content, but you could also use it for content audits and migrations, or to analyze your competitors' content. It's currently limited to 200 pages and to articles only, but I plan to add support for scraping products in the future. Any other features you'd like to see added?
Luka
@druwynings images?
Erik Dungan
@druwynings You should really update that page to clarify the "articles only" caveat. Especially when the tagline on your home page is "No rules required"
Dru Wynings
@callmeed Will do! Like I mentioned, support for products, discussions, images, and videos is in the works.
Matt Gardner
Awesome!!! Looking for a replacement for Kimono (https://www.kimonolabs.com/) since they got acquired by Palantir. Need something to power my Slack menu bot ;)
Neil Cocker
Looks great. Congrats on the launch! I can see some nice applications for this. One suggestion - I understand that it might take a while to scrape the data, but an instant email to say it will be X minutes, or just a notification after email input would be good, to manage expectations. I used it 10 mins ago, and am on the verge of tears that I still have no email... ;-)
Neil Cocker
Cannot GET /results/56ebdf254b0bfe03003ef0d8 :-(
Dru Wynings
@neilcocker Hey Neil, things should be back to normal. Servers were crumbling under the PH load :)
Dru Wynings
@neilcocker I didn't want to inundate people with unnecessary emails, but I don't want people crying either...
Neil Cocker
@druwynings Tears are over. All working now. Very impressive. Good work - this will definitely be very useful.
Eric Iannaccone
I would love to be able to scrape sports stats easily!
Rob Spectre
Diffbot is such a useful service for manipulating published content - huge fan of this team.
Yugendhar Devale
How can I try this? I think there is a server issue: I'm getting a 503 error.
Dru Wynings
@go_venky Sorry about that! Things should be back to normal *fingers crossed*.
Erik Dungan
The potential of this is really big and there are clearly needs across a lot of industries. Unfortunately, my tests with some ecommerce product pages had less than stellar results.
Dru Wynings
@callmeed Thanks Erik. If you'd like, I can set you up with a Diffbot trial account for crawling ecommerce pages. Interested?
Erik Dungan
@druwynings for sure ... I think I have a call with you next week btw :)
Sarthak Grover
Awesome! Any plans for a command-line version in the future?
Dru Wynings
@sarthakgrover To be honest, probably not. That being said, Crawly's big brother is Crawlbot (https://www.diffbot.com/products...) which has a fully-supported API (https://www.diffbot.com/dev/docs...)
Mat Newton
Love the name! Nothing else to add. Just love the name. Great movie.
Dave Lynam
Is there a way to customize what data gets scraped? The ability to mouse over elements and select/de-select them would be ideal.
Dru Wynings
@drlynam Not at the moment within Crawly. That's something we fully support using our Custom API (http://www.diffbot.com/products/...)
Christopher Leach
@druwynings Any student access at a cheaper price? I love this kind of stuff but don't have the money for it.
abdullah m. ceylan
@datarade Wow, great! My fav is Apifier, of course.
Surjith S M
Just tried it; not so great. 1. It does not capture the actual URL, only the root folder. 2. We can save as JSON and CSV, but what's the use of that? We need to call the API each time, right? Is it only a one-time fetch?
🚀 Pierre-Henry 💡
Love it! Great job
Julius Muraguri
Hello there. Can it be used to collect data from websites into Airtable?
Andrey Demchenko
Seems like a suitable solution for small data scraping tasks; I am going to test it a little right now! However, for more extensive tasks, it is reasonable to go for professional services on request, like this one, for instance: https://data-ox.com/webscraping-... These guys can also develop a custom tool for specific needs, by the way.
Dara
Looks great. Scraping data from sites is a pain. Well done.