Crawly - Never write another web scraper

Dru Wynings

Sensible Instruct

Turn websites into data in seconds. Crawly spiders and extracts complete structured data from an entire website.

Replies

Kumar Thangudu

Link Texting

I love this for its unique ability to specify parameters. Hugely useful! Adding this to my running list of scraping and crawling technologies:
http://scrapinghub.com
http://www.outwit.com/products/hub/
http://webroots.io
http://kimonolabs.com
http://grabby.io
http://fullcontact.com
http://emailhunter.co
http://clearbit.com
http://toofr.com
http://import.io
http://kimonolabs.com
http://apifier.com (my number one favorite)
http://elink.club
http://www.eliteproxyswitcher.com/ - ;)
http://www.uipath.com/
http://diffbot.com
http://cloudscrape.com
http://community.screen-scraper....
https://commoncrawl.org/
http://www.fminer.com/
https://scraperwiki.com/
http://nutch.apache.org/
http://www.ubotstudio.com/index7
http://mozenda.com
http://fivefilters.org/

9yr ago

Blaine Hatab

@datarade god mode scraping collection.

9yr ago

Cesare D. Forelli

GlanceCam

@datarade wow, thanks!

9yr ago

Robin Wouters

Mariahfy

@datarade This list should be a collection!

9yr ago

Nick Kwan

Pakible

@datarade Great list! Do any of these have @Kimonify-like capabilities for generating APIs? @skrypt

9yr ago

Dru Wynings

Sensible Instruct

Maker

@nwkwan diffbot does =)

9yr ago

Dru Wynings

Sensible Instruct

Maker

Hey ProductHunters! Crawly is a free tool I built that uses Diffbot's automatic article extraction API to turn web content into structured data. I've used it to create a centralized database of all of our content, but you could also use it for content audits and migrations, or to analyze your competitors' content. It's currently limited to 200 pages and to articles only, but I plan to add support for scraping products in the future. Any other features you'd like to see added?
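
For anyone curious what article extraction looks like at the API level, here is a minimal sketch. It assumes Diffbot's v3 Article endpoint with `token` and `url` query parameters and a response that carries an `objects` list; the helper names (`build_article_request`, `summarize_article`, `fetch_article`) and the choice of fields are illustrative only, and this is not necessarily how Crawly itself is built.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: Diffbot's v3 Article endpoint. Crawly's internals are not
# described in this thread, so treat this as an illustration only.
ARTICLE_ENDPOINT = "https://api.diffbot.com/v3/article"


def build_article_request(page_url: str, token: str) -> str:
    """Return the GET URL that asks the Article API to extract one page."""
    query = urlencode({"token": token, "url": page_url})
    return f"{ARTICLE_ENDPOINT}?{query}"


def summarize_article(response: dict) -> dict:
    """Pick a few fields from the first extracted object, if there is one."""
    objects = response.get("objects", [])
    if not objects:
        return {}
    first = objects[0]
    # Hypothetical field selection; missing fields come back as None.
    return {key: first.get(key) for key in ("title", "author", "date", "text")}


def fetch_article(page_url: str, token: str) -> dict:
    """Perform the request (needs network access and a valid token)."""
    with urlopen(build_article_request(page_url, token)) as resp:
        return summarize_article(json.load(resp))
```

Calling `fetch_article("https://example.com/some-post", "YOUR_TOKEN")` would return a small dictionary per page; a crawler would repeat this for every article URL it discovers.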

9yr ago

Luka

Penta

@druwynings images?

9yr ago

Erik Dungan

Baremetrics

@druwynings You should really update that page to clarify the "articles only" caveat, especially when the tagline on your home page is "No rules required."

9yr ago

Dru Wynings

Sensible Instruct

Maker

@callmeed Will do! Like I mentioned, support for products, discussions, images, and videos is in the works.

9yr ago

Matt Gardner

Awesome!!! Looking for a replacement for Kimono (https://www.kimonolabs.com/) since they got acquired by Palantir. Need something to power my Slack menu bot ;)

9yr ago

Neil Cocker

Ramp T-shirts

Looks great. Congrats on the launch! I can see some nice applications for this. One suggestion - I understand that it might take a while to scrape the data, but an instant email to say it will be X minutes, or just a notification after email input would be good, to manage expectations. I used it 10 mins ago, and am on the verge of tears that I still have no email... ;-)

9yr ago

Neil Cocker

Ramp T-shirts

Cannot GET /results/56ebdf254b0bfe03003ef0d8 :-(

9yr ago

Dru Wynings

Sensible Instruct

Maker

@neilcocker Hey Neil, things should be back to normal. Servers were crumbling under the PH load :)

9yr ago

Dru Wynings

Sensible Instruct

Maker

@neilcocker I didn't want to inundate people with unnecessary emails, but I don't want people crying either...

9yr ago

Neil Cocker

Ramp T-shirts

@druwynings Tears are over. All working now. Very impressive. Good work - this will definitely be very useful.

9yr ago

Dru Wynings

Sensible Instruct

Maker

@neilcocker

9yr ago

Eric Iannaccone

I would love to be able to scrape sports stats easily!

9yr ago

Rob Spectre

Formula 1 Bingo

Diffbot is such a useful service for manipulating published content - huge fan of this team.

9yr ago

Dru Wynings

Sensible Instruct

Maker

@dn0t

9yr ago

Yugendhar Devale

How can I try this? I think there is a server issue: I'm getting a 503 error.

9yr ago

Dru Wynings

Sensible Instruct

Maker

@go_venky Sorry about that! Things should be back to normal *fingers crossed*.

9yr ago

Erik Dungan

Baremetrics

The potential of this is really big and there are clearly needs across a lot of industries. Unfortunately, my tests with some ecommerce product pages had less than stellar results.

9yr ago

Dru Wynings

Sensible Instruct

Maker

@callmeed Thanks Erik. If you'd like, I can set you up with a Diffbot trial account for crawling ecommerce pages. Interested?

9yr ago

Erik Dungan

Baremetrics

@druwynings for sure ... I think I have a call with you next week btw :)

9yr ago

David Rosenberg

GIPHY

9yr ago

Dru Wynings

Sensible Instruct

Maker

@david_rosenberg

9yr ago

Sarthak Grover

Awesome! Any plans for a command-line version in the future?

9yr ago

Dru Wynings

Sensible Instruct

Maker

@sarthakgrover To be honest, probably not. That being said, Crawly's big brother is Crawlbot (https://www.diffbot.com/products...) which has a fully-supported API (https://www.diffbot.com/dev/docs...)

9yr ago

Mat Newton

Love the name! Nothing else to add. Just love the name. Great movie.

9yr ago

Dave Lynam

Bookmark OS

Is there a way to customize what data gets scraped? The ability to mouse over elements and select/de-select them would be ideal.

9yr ago

Dru Wynings

Sensible Instruct

Maker

@drlynam Not at the moment within Crawly. That's something we fully support through our Custom API (http://www.diffbot.com/products/...)

9yr ago

Christopher Leach

@druwynings Any student access at a cheaper price? I love this kind of stuff but don't have the money for it.

9yr ago

abdullah m. ceylan

@datarade Wow, great! My fav is apifier, of course.

9yr ago

Surjith S M

Just tried it; not so great. 1. It does not capture the actual URL; it captures the root/ folder. 2. We can save as JSON and CSV, but what's the use of that? We need to call the API each time, right? Is it only a one-time fetch?

9yr ago

🚀 Pierre-Henry 💡

Lifyzer

Love it! Great job

8yr ago

Julius Muraguri

Hello there. Can it be used to collect data from websites into Airtable?

5yr ago

Andrey Demchenko

Seems like a suitable solution for small data scraping tasks; I am going to test it right now! However, for more extensive tasks, it is reasonable to go for professional services on request, like this one, for instance: https://data-ox.com/webscraping-... These guys can also develop a custom tool for specific needs, by the way.

5yr ago

Dara

Looks great. Scraping data from sites is a pain. Well done.

3yr ago
