thirdweb

Just found a bug on Bughunt. ;) When you're analyzing a site and click the link to actually visit it, the href attribute is missing the ":" in the "http://"... :)

11yr ago

Paul Cleary

IdBloc

Maker

@sleinadsanoj Thanks! I think that's Node's URL parser. I'll get that one fixed :)

11yr ago
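
One common way a link ends up as "http//..." is assembling the scheme and host by hand. The thread does not show Bughunt's code (and says it runs on Node), so the following is only a minimal Python sketch of the safer pattern: let the standard library put the URL back together. The name `build_href` is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def build_href(raw: str, default_scheme: str = "http") -> str:
    """Return an absolute URL suitable for an href attribute.

    Hypothetical helper: not taken from Bughunt's code base.
    """
    raw = raw.strip()
    # Without a scheme, urlsplit would treat "example.com/page" as a path,
    # so prefix "//" to make the host parse as the network location.
    if "://" not in raw and not raw.startswith("//"):
        raw = "//" + raw
    parts = urlsplit(raw, scheme=default_scheme)
    # urlunsplit inserts the ":" and "//" itself, so they cannot be dropped
    # by a string-concatenation slip.
    return urlunsplit(parts)
```

For example, `build_href("example.com/page")` yields `http://example.com/page`, and an input that already carries a scheme is passed through unchanged.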

Murat Mutlu

Ballpark

Hunter

Here are the results for Product Hunt: http://bughunt.io/results/543bae...

11yr ago

Paul Cleary

IdBloc

Maker

@mutlu82 So cool to see someone using it like this! That crawl was started at the peak of the HN effect, so it may be a while before it completes. Unfortunately, I didn't design the MVP to handle anywhere near this kind of load!

11yr ago

Murat Mutlu

Ballpark

Hunter

@pauljohncleary Hey Paul! How long do they take on average?

11yr ago

Paul Cleary

IdBloc

Maker

@mutlu82 Good question! It usually takes 10 seconds to access and parse each link (most of that time is spent waiting to ensure the page renders fully). Checking all the links found for HTTP errors is sub-second; any links that return a text/html MIME type are then crawled with the browser (Selenium Server / PhantomJS, 10 seconds per link). So 1-2 minutes under normal circumstances. Normally we'd only run around 10 crawls at a time, but with HN/PH there are thousands of requests coming through. The bottleneck is the Selenium server; I'll probably have to kill it and restart all the "In Progress" crawls tonight. As a product that users pay for, this would never be an issue, because we're able to schedule crawls as part of a monitoring service and notify users of issues by email. That's why my focus is on product right now and not scale :) EDIT: here are some results for Product Hunt: http://bughunt.io/results/543d8c...

11yr ago
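
The crawl flow described in the last comment (check every found link for HTTP errors, then send only text/html links to the browser at roughly 10 seconds each) can be sketched as follows. This is an illustration under stated assumptions, not Bughunt's implementation: the names `LinkResult`, `check_links`, and `estimate_seconds` are hypothetical, "HTTP error" is assumed to mean a status of 400 or above, and browser renders are assumed to run one after another.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class LinkResult:
    url: str
    status: int
    broken: bool         # assumption: status >= 400 counts as an HTTP error
    needs_browser: bool  # text/html links get a full browser render


def check_links(
    links: Iterable[str],
    head: Callable[[str], Tuple[int, str]],
) -> List[LinkResult]:
    """Classify links using a caller-supplied head(url) -> (status, content_type).

    Injecting `head` keeps the sketch free of network calls; a real crawler
    would plug in an HTTP client here.
    """
    results = []
    for url in links:
        status, content_type = head(url)
        # Drop parameters such as "; charset=utf-8" before comparing.
        mime = content_type.split(";")[0].strip().lower()
        results.append(
            LinkResult(url, status, status >= 400, mime == "text/html")
        )
    return results


def estimate_seconds(results: List[LinkResult], render_seconds: float = 10.0) -> float:
    """Rough crawl time: one render for the start page plus one per HTML link."""
    html_links = sum(1 for r in results if r.needs_browser)
    return render_seconds * (1 + html_links)
```

With the 10-second figure from the comment, a start page with between 5 and 11 HTML links lands in the quoted 1-2 minute range under these assumptions; the sub-second HTTP checks are ignored in the estimate.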