Do you actually mind bots scraping data from your website?
Misha Krunic
17 replies
Hello, PH!
As I’m developing my products (the latest one being BotMeNot - https://botmenot.com/), I often wonder - do people actually mind data being scraped from their websites?
On the one hand, it’s publicly available information - meaning it’s up for grabs for any visitor. I’ve even read some people saying that if you don’t want information being collected from your website - don’t put it on the website in the first place. Do you agree or is this too harsh of a stance in your opinion?
On the other hand, it’s a very automated process of information gathering. Web scrapers are far more efficient than humans can ever hope to be. Does this make a difference to you?
Also, even if you’re not a website owner, please share your opinion!
Replies
Fabian Maume@fabian_maume
Warmup Inbox
The issue with bots is always: "How is the information used?"
Like any technology, web scraping can be used for good or evil.
Price2Spy
@fabian_maume That's true!
I guess it depends on a fair use policy. Google scrapes your website all the time, but in return you receive traffic you would not otherwise have received. On the other hand, if you are hosting a list of some kind and someone steals it, that's bad. What's missing is a fair use definition that people can adopt, like a Creative Commons version of fair use for bots.
Price2Spy
@chandan_maruthi1 Interesting point!
Sales Sparrow by True Sparrow
Hmm... At times it is about what they are going to do with the scraped data.
Price2Spy
@nishith_shah Interesting point of view. Would you say that you're indifferent to the act of scraping itself?
Sales Sparrow by True Sparrow
@price2spy Tricky question to answer. If you are scraping Twitter or PH in order to automate some manual process, like getting the Twitter handles of all the people who upvoted a PH product, that could be fine. But if you are scraping Medium blog posts and repurposing those, that wouldn't be considered fair use.
Every piece of content that was created has an owner and a copyright. You can scrape data, but you can't republish it as your own.
Do I mind bots scraping my website to gather data? Heck YES I do (apart from Google), because it puts additional stress on the server in the first place.
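For what it's worth, the usual way a site signals this to bots is a robots.txt file, and a considerate scraper can check it before fetching anything. Here is a rough sketch using Python's standard urllib.robotparser. The robots.txt content, the bot name "example-bot", the example.com URLs and the fetch_plan helper are all made up for illustration, and the default delay of one second is just an assumption.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish (an assumption for this sketch).
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def fetch_plan(parser, user_agent, urls, default_delay=1.0):
    """Return the URLs the bot may fetch and the pause to keep between requests."""
    allowed = [url for url in urls if parser.can_fetch(user_agent, url)]
    delay = parser.crawl_delay(user_agent)  # None when no Crawl-delay is set
    return allowed, float(delay) if delay is not None else default_delay


parser = build_parser(ROBOTS_TXT)
allowed, delay = fetch_plan(
    parser,
    "example-bot",  # hypothetical user agent name
    ["https://example.com/", "https://example.com/private/list"],
)
```

A scraper that only requests the URLs in `allowed` and sleeps `delay` seconds between requests puts far less stress on the server. It only helps with bots that choose to respect robots.txt, of course.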
Log Harvestor
Launching soon!
Yeah, it's definitely annoying when it comes to analytics. Plus there are some crazy malicious bots out there that do DDoS or spam your public forms. As far as scraping websites goes... I feel like anti-scraping bots would improve social media pages.
Log Harvestor
Launching soon!
@price2spy Sure! Sites like Twitter are constantly being scraped, so if you post a tweet and want to delete it later, people still have a copy. I could see it being useful for a site like Twitter to employ something like this to ensure that if a user decides to delete a tweet, then there won't be external copies with all the metadata. Idk how feasible this is.
Price2Spy
@solomon_bush Yes, there are many different types of bots actually. Can you elaborate on what you mean when it comes to social media?
Price2Spy
@solomon_bush I see, thanks for clarifying!
Hey! I believe that bots for collecting information do not have a bad effect. It is always a person's choice - to leave contacts or not. If I had my own website, I think I would use all the automation possibilities :)
Price2Spy
@antonovna Thanks for answering! You make a good point.