Perhaps it’s becoming clear that search needs to become common, cooperatively managed infrastructure, similar to Wikipedia, and that this is in the best interest of everyone but advertisers and spammers.
Too bad the Mozilla Foundation didn’t pivot to that instead of whatever the hell they’re doing with AI.
Truly. I wonder if ActivityPub could be used to create a resilient search engine that shares the cost among federated instances. We already have something like that in Lemmy and Mastodon, where federated data can be searched from any instance. If the data were pages collected by an automatic crawler and then federated across instances, each of which lets users search through it, it might resemble a search engine. Page ranking beyond text matching could even be done by people’s up/down votes instead of some arbitrary algorithm, similar to how voting works on StackExchange or Lemmy. 🤔 I’m sure someone is thinking about this.
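To make the ranking idea concrete, here is a rough sketch of "text matching first, votes second". Everything in it (the `Page` record, `text_match`, `search`) is made up for illustration and is not how Lemmy, StackExchange, or any real engine ranks results.

```python
from dataclasses import dataclass


@dataclass
class Page:
    # Hypothetical crawled-page record; field names are assumptions.
    url: str
    text: str
    upvotes: int = 0
    downvotes: int = 0


def text_match(query: str, page: Page) -> int:
    """Count how many query terms occur as whole words in the page text."""
    words = set(page.text.lower().split())
    return sum(1 for term in query.lower().split() if term in words)


def search(query: str, pages: list[Page]) -> list[Page]:
    """Return matching pages: best text match first, ties broken by net votes."""
    hits = [p for p in pages if text_match(query, p) > 0]
    return sorted(
        hits,
        key=lambda p: (text_match(query, p), p.upvotes - p.downvotes),
        reverse=True,
    )
```

With two pages that match the query equally well, the one with the higher net vote count is listed first; pages with no matching terms are dropped entirely.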
The answer to your question is no, federation is not an appropriate model for internet scale search.
Yeah I think you need a centralized system with decentralized ownership, so that no single party can fuck it up by themselves
so a Search DAO? :)
I mean yeah exactly
Yeah, decentralized ownership or democratic ownership would be another way to achieve this. A federated system, even if possible, would almost certainly be less efficient resource-wise.
Just to be clear, what I’m referring to here is that a search would occur on a single instance. E.g. searches on lemmy.world occur on the lemmy.world instance and load lemmy.world’s servers. The federated part is in building the database on lemmy.world. E.g. a crawler or a user on lemmy.ca adds a new website, and that record is federated to lemmy.world to add to its database. Another user on feddit.de upvotes a search result, and that upvote is federated to lemmy.world so that the search result shows higher for users searching on lemmy.world. In this kind of model, individual search instances could in fact be very large based on their usage. If there’s no limit to what’s federated, that would put a lower bound on the size of instances, since every instance has to hold the complete index. If there’s a limit (something dumb like federating only search records for *.fr domains), then that would allow for smaller instances that don’t have the compute and storage for the complete index.
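Here is a toy, in-memory sketch of that model: records and votes get pushed to peers, searches only ever touch the local index, and an optional domain filter lets a small instance keep a partial index. It is not ActivityPub; the `Instance` class, its methods, and the example URLs are all hypothetical, and direct method calls stand in for real network delivery.

```python
from fnmatch import fnmatch
from urllib.parse import urlparse


class Instance:
    """Hypothetical search instance; direct calls stand in for federation."""

    def __init__(self, name, domain_filter="*"):
        self.name = name
        self.domain_filter = domain_filter  # e.g. "*.fr" for a partial index
        self.index = {}   # url -> {"text": str, "score": int}
        self.peers = []   # instances this one federates to

    def accepts(self, url):
        host = urlparse(url).hostname or ""
        return fnmatch(host, self.domain_filter)

    def receive_page(self, url, text):
        # Store a federated record only if it passes the local filter.
        if self.accepts(url):
            self.index.setdefault(url, {"text": text, "score": 0})

    def receive_vote(self, url, delta):
        # Votes for pages outside the local index are ignored.
        if url in self.index:
            self.index[url]["score"] += delta

    def add_page(self, url, text):
        self.receive_page(url, text)
        for peer in self.peers:
            peer.receive_page(url, text)

    def vote(self, url, delta):
        self.receive_vote(url, delta)
        for peer in self.peers:
            peer.receive_vote(url, delta)

    def search(self, term):
        # Purely local: the query never leaves this instance.
        hits = [(u, r) for u, r in self.index.items()
                if term.lower() in r["text"].lower()]
        hits.sort(key=lambda h: h[1]["score"], reverse=True)
        return [u for u, _ in hits]
```

For example, if a `lemmy.ca` instance adds pages and a `feddit.de` instance upvotes one of them, a `lemmy.world` instance with the default filter ends up with every record and the vote applied, while an instance created with `domain_filter="*.fr"` only ever stores and ranks the `.fr` pages.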
the biggest question would be how to defend it from spammers and corporations with potentially much more money.
One answer that’s proven to work is involving a lot of people’s labor in the editorial/curation process, similar to how posting/commenting/voting/moderation work on Lemmy, and how it’s worked on Reddit and other human-driven platforms. Corporations have shown on multiple occasions that paying for this labor is not feasible, so a system that depends on it should be corpo-resistant or capital-resistant.
Well, Reddit did that and was full of shills, bots, vote manipulation, and more; this approach completely failed for them.
And they do put a lot of money into it.