  • Do you actually mind bots scraping data from your website?

    Misha Krunic
    17 replies
    Hello, PH! As I’m developing my products (the latest one being BotMeNot - https://botmenot.com/), I often wonder: do people actually mind data being scraped from their websites? On the one hand, it’s publicly available information, meaning it’s up for grabs for any visitor. I’ve even read some people saying that if you don’t want information collected from your website, you shouldn’t put it on the website in the first place. Do you agree, or is this too harsh a stance in your opinion? On the other hand, it’s a highly automated process of information gathering. Web scrapers are far more efficient than humans can ever hope to be. Does this make a difference to you? Also, even if you’re not a website owner, please share your opinion!

    Replies

    Fabian Maume
    The issue with bots is always: "How is the information used?" Like any technology, web scraping can be used for good or evil.
    Chandan Maruthi
    I guess it depends on a fair use policy. Google is scraping your website all the time, but in return you receive traffic you would not have received otherwise. On the other hand, if you are hosting a list of some kind and someone steals it, that's bad. What's missing is a fair use definition that people can adopt, like a Creative Commons version of fair use for bots.
    Nishith from True Sparrow
    Sales Sparrow by True Sparrow
    Hmm... At times it is about what they are going to do with the scraped data.
    Misha Krunic
    @nishith_shah Interesting point of view. Would you say that you're indifferent to the act of scraping itself?
    Nishith from True Sparrow
    Sales Sparrow by True Sparrow
    @price2spy Tricky question to answer. If you are scraping Twitter or PH in order to automate some manual process, like getting the Twitter handles of all the people who upvoted a PH product, that could be fine. But if you are scraping Medium blog posts and repurposing those, that wouldn't be considered fair use.
    Mantas
    Every piece of content that was created has an owner and is copyrighted. You can scrape data; however, you can't republish it as your own. Do I mind bots scraping my website to gather data? Heck YES I do (apart from Google), because it is additional stress on the server in the first place.
    Solomon Bush
    Log Harvestor
    Yeah, it's definitely annoying when it comes to analytics. Plus there are some crazy malicious bots out there that do DDoS or spam your public forms. As far as scraping websites goes... I feel like anti-scraping bots would improve social media pages.
    Misha Krunic
    @solomon_bush Yes, there are many different types of bots actually. Can you elaborate on what you mean when it comes to social media?
    Solomon Bush
    Log Harvestor
    @price2spy Sure! So sites like Twitter are constantly being scraped, so if you post a tweet and want to delete it later, people still have a copy. I could see it being useful for a site like Twitter to employ something like this to ensure that if a user decides to delete a tweet, then there won't be external copies with all the metadata. Idk how feasible this is.
    Ender
    I think people who mind have a robots.txt file that says as much, no?
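
For anyone curious what honoring robots.txt looks like from the scraper's side, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `MyScraper` user agent, and the `example.com` URLs are all made up for illustration; a real bot would fetch the file from the target site's `/robots.txt` instead of using a hard-coded string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: allow Googlebot everywhere, block every other bot.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def can_scrape(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules let user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Googlebot", "MyScraper"):
        allowed = can_scrape(ROBOTS_TXT, agent, "https://example.com/page")
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Keep in mind robots.txt is only a request: well-behaved bots respect it, but nothing technically stops a scraper from ignoring it.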
    Alina Ihnatiuk
    Hey! I believe that bots for collecting information do not have a bad effect. It is always a person's choice - to leave contacts or not. If I had my own website, I think I would use all the automation possibilities :)
    dodger
    It's useful to scrape huge amounts of data for analysis, for example prices for some goods on a marketplace. Once that's done, you can analyze the data. It's useful for marketers.
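
As a rough illustration of that kind of price analysis, here is a small sketch in Python. The seller names and prices are invented sample data (not scraped from anywhere), and `summarize_prices` is a hypothetical helper, just one simple way to summarize a list of listings.

```python
from statistics import mean, median

# Hypothetical listings for one product: (seller, price in USD).
listings = [
    ("shop_a", 19.99),
    ("shop_b", 24.50),
    ("shop_c", 21.00),
    ("shop_d", 18.75),
    ("shop_e", 22.26),
]


def summarize_prices(rows):
    """Return basic price statistics for a list of (seller, price) rows."""
    prices = [price for _, price in rows]
    cheapest = min(rows, key=lambda row: row[1])
    return {
        "count": len(prices),
        "min": min(prices),
        "max": max(prices),
        "mean": round(mean(prices), 2),
        "median": round(median(prices), 2),
        "cheapest_seller": cheapest[0],
    }


if __name__ == "__main__":
    print(summarize_prices(listings))
```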