Nika

Reddit's restrictive move: is blocking archived posts protecting users or censoring history?

A few days ago I stumbled upon a statistic showing that Reddit is widely used as a source of information for AI answers.

Reddit's response: they want to block the Internet Archive's access to the site (a kind of protection against scraping).

Is restricting archive access the right way to safeguard data, or does it risk losing valuable online history that is also useful for journalists, media, etc.?

(TBH, knowing what kind of discussions are on Reddit, I am surprised it is one of the dominant sources) :D

Aleksandar Blazhev

No, I don’t think so. At the end of the day, online discussions on Twitter/Facebook/Reddit reflect the conversations of their time. When we go back to topics from 10–15 years ago, we often laugh at what we wrote. How wrong we were. Or how someone actually guessed something right. So for me, they have huge value.

If someone is old enough, they’ll remember that 20 years ago we used to communicate in forums. And I find it very funny when I come across a thread from that time in which I participated. I see nuances about events that I had forgotten. I see certain trends that could already be felt back then. And that is a treasure. A history that, to me, there is no need to lose.

After all, AI already has enough information. But for users, it’s a loss.

Helga Razinkova

Cool topic, Nika (as always 😁)!
To be honest, it would be cool if AI mentioned the sources it used to generate its answers (which I think it already does?) so end users can evaluate them and decide whether they want to trust it. I'm not sure that just deleting/archiving tons of data to safeguard AI performance is the right way. We might end up archiving the entire internet at this rate 😅

Nika

@helga_impalpable I think the source is included. But tbh, it is less visible because they show the logo of one source and add "and 15+ other sources", and then you need to expand the list. Honestly, human beings (like 90% of the population) are too lazy to take that extra step to check :D

Helga Razinkova

@busmark_w_nika Fair enough 😄 Anyway, I don't see blocking and archiving old data as a solution. I agree with Aleksandar that those old threads are a treasure trove of the wisdom of their time. It could be really great to look through such stuff and draw specific parallels or compare those approaches and ideas with today's trends, etc.
History must be respected :)

Linh Pham

Actually, Reddit users contribute surprisingly helpful and well-thought-out opinions, sometimes just to gain karma before they start self-promotion, but it still benefits the community. In many niche subreddits, you can find in-depth discussions and expert insights that aren’t easily available elsewhere. That might be why Reddit ends up as one of the top sources for AI training, since it’s a goldmine of perspectives.