I’m not so sure. A lot of businesses and individuals are training AI models right now, and sites like Reddit or Twitter are huge, very attractive collections of user-generated content. It’s not the most outrageous assumption that they’ll try to get that data for free by scraping instead of paying for API access.
But also, hasn’t that ship already sailed for several AI companies? They’ve already trained their models, so there’s no need to scrape again. They can just reuse what they got last time for their core training; it’s only the last couple of months or years they’re missing.