Please Don’t Share Our Links on Mastodon: Here’s Why! | itsfoss.com

BuddyTheBeefalo@lemmy.ml · 2 years ago

cbarrick@lemmy.world · edit-2 2 years ago

Just put the site behind a cache, like Cloudflare, and set your cache control headers properly?

They mention that they are already using Cloudflare. I’m confused about what is actually causing the load. They don’t mention any technical details, but it does kinda sound like their cache control headers are not set properly. I’m too lazy to check for myself though…

helenslunch@feddit.nl · edit-2 1 year ago

deleted by creator

cbarrick@lemmy.world · 2 years ago

If caching is properly configured, the cache (Cloudflare) will see thousands of requests, but the VPS should only see one request.

tacofox@lemm.ee · 2 years ago

This should be front and center, caching won’t be able to make up for that…

breakingcups@lemmy.world · edit-2 2 years ago

Of course it will. Cloudflare is in front of it, and they can definitely handle this traffic as long as itsfoss bothers to set correct caching headers for Cloudflare to use. That’s the entire point of Cloudflare…

Hugh_Jeggs@lemm.ee · 2 years ago

I always downvote posts with titles like this. Here’s Why -

Darth_Mew@lemmy.world · 2 years ago

same. read more to find out!

parpol@programming.dev · 2 years ago

deleted by creator

CameronDev@programming.dev · 2 years ago

I think they just advertised how trivial it would be to take their website down…

SatyrSack@lemmy.one · 2 years ago

Direct link to article:

https://news.itsfoss.com/mastodon-link-problem/

TL;DR:

When you share a link on Mastodon, a link preview is generated for it, right?

With Mastodon being a federated platform (a part of the Fediverse), the request to generate a link preview is not generated by just one Mastodon instance. There are many instances connected to it who also initiate requests for the content almost immediately.

And, this “fediverse effect” increases the load on the website’s server in a big way.

Does Lemmy not cause this issue? Other federated software was not mentioned in the article at all.

catloaf@lemm.ee · 2 years ago

So the preview should be federated as well?

How many requests are we actually talking about here, though? Is that better or worse than everyone clicking the link?

chameleon@kbin.social · 2 years ago

Lemmy (and Kbin for that matter) very much do the same thing for posts. I don’t think they fetch URL previews for links in comments, but that doesn’t matter: posts and comments are both fairly likely to end up spreading to Mastodon/etc anyway, so even comments will trigger this cascade.

Direct example: If you go to mastodon.social, stick @fediverse@lemmy.world in the search box at the top left and click through to the profile, you can end up browsing a large Mastodon server’s view of this community, and your very link has a preview. (Unfortunately, links to federated communities just result in a redirect, so you have to navigate through Mastodon’s UI.)

BuddyTheBeefalo@lemmy.ml · edit-2 2 years ago

They say it’s fediversal in the comments on Mastodon.

taanegl@lemmy.world · edit-2 10 months ago

deleted by creator

Lvxferre [he/him]@mander.xyz · edit-2 2 years ago

That sounds a lot like a weird spin on the Slashdot effect, caused by content mirroring. It seems that it could be handled by tweaking the ActivityPub protocol to have one instance requesting to generate a link preview, and the other instances copying the link preview instead of sending their own requests.
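A rough Python sketch of the difference, counting only how many times the linked website gets hit. Everything here (`Website`, `Instance`, `federate`, the `share_preview` flag, 1000 peers) is hypothetical and is not ActivityPub or Mastodon code.

```python
from typing import Dict, List, Optional

Preview = Dict[str, str]


class Website:
    """Hypothetical site being linked to; counts preview fetches it serves."""

    def __init__(self) -> None:
        self.hits = 0

    def fetch_preview(self, url: str) -> Preview:
        self.hits += 1
        return {"url": url, "title": "Example title"}


class Instance:
    """Hypothetical fediverse server holding its own copy of each preview."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.previews: Dict[str, Preview] = {}

    def receive(self, url: str, site: Website, shipped: Optional[Preview] = None) -> None:
        # Use the preview shipped with the post if there is one,
        # otherwise fetch it from the website ourselves.
        self.previews[url] = shipped if shipped is not None else site.fetch_preview(url)


def federate(url: str, peers: List[Instance], site: Website, share_preview: bool) -> int:
    """Deliver a post containing `url` to all peers; return hits on the site."""
    home = Instance("home")
    home.receive(url, site)  # the author's instance always fetches once
    shipped = home.previews[url] if share_preview else None
    for peer in peers:
        peer.receive(url, site, shipped)
    return site.hits


URL = "https://news.itsfoss.com/mastodon-link-problem/"
every_instance_fetches = federate(
    URL, [Instance(f"peer{i}") for i in range(1000)], Website(), share_preview=False
)
one_instance_fetches = federate(
    URL, [Instance(f"peer{i}") for i in range(1000)], Website(), share_preview=True
)
print(every_instance_fetches, one_instance_fetches)  # 1001 1
```

The catch with the second mode is that every receiving instance has to trust whatever preview the sending instance attached.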

But frankly? I think that the current way that ActivityPub works is outright silly. Here’s what it does currently:

1. User is registered to instance A
2. Since A federates with B, A mirrors content from B into A
3. The backend is either specific to instance A (the site) or configured to use instance A (for a phone program)
4. When the user interacts with content from B, actually it’s the mirrored version of content from B that is hosted in A

In my opinion a better approach would be:

1. User is registered to instance A
2. Since A federates with B, B accepts login credentials from A
3. The backend is instance-agnostic, so it’s able to pull/send content from/to multiple instances at the same time
4. When the user interacts with content from B, the backend retrieves content from B, and uses the user’s A credentials to send content to B

Note that the second way would not create this “automated Slashdot effect” - only A would be pulling info from the site, and then users (regardless of their instance) would pull it from A.

Now, here’s my question: why does the ActivityPub work like in that first way, instead of this second one?

chicken@lemmy.dbzer0.com · 2 years ago

Check out Nostr, an ActivityPub alternative that does authentication separately from content; it works more like that.

Lvxferre [he/him]@mander.xyz · edit-2 2 years ago

I’m aware of Nostr. In my opinion it splits back-end and front-end tasks better than AP does, even if the latter does some things better (such as the balance between safety and censorship resistance). It’s still an interesting counterpoint to ActivityPub.

DaGeek247@fedia.io · 2 years ago

If server A makes one request, it keeps server B from being overloaded by thousands of requests from A’s users.

Lvxferre [he/him]@mander.xyz · 2 years ago

“A” Users would need to send requests to some server anyway, either A or B; that’s only diverting the load from B to A, but it isn’t alleviating or even sharing it.

Another issue with the current way that ActivityPub works is foul content that needs to be removed. Remember when some muppet posted CP in LW?

breakingcups@lemmy.world · 2 years ago

Yes, but this way demand on instances scales with user count and allows smaller instances to exist. Otherwise an errant toot on a small instance that suddenly gets popular will instantly drag that smaller instance down.

Lvxferre [he/him]@mander.xyz · 2 years ago

Got it - and that’s a fair point. I wonder however if this problem couldn’t be solved another way, especially because mirroring is itself a burden for the smaller instances.

iltg@sh.itjust.works · 2 years ago

consider that caching happens at thousands of levels on the internet. every centralized site has its content replicated many many times in geo local caches, proxies and even local browsers. caching is a very core concept for the internet. others often bash AP because it replicates a lot, but that’s kind of like explicit caching: if the whole fediverse network fetched a post from its source, millions of requests would beat small servers down constantly. big servers cache the content they intend to distribute and handle the traffic spike instead of the small instance. small instances on the other hand don’t need to replicate as much and can rely more on bigger instances, maybe cleaning their cached content often and refetching when necessary. replication is a feature, not a design flaw!

Lvxferre [he/him]@mander.xyz · 2 years ago

replication is a feature, not a design flaw!

In this case I’d argue that it’s both. (A problematic feature? A useful bug? They’re the same picture anyway.)

Because of your comment I can see the pros of the mirroring strategy, even if the cons are still there. I wonder if those pros couldn’t be “snipped” and implemented into a Nostr-like network, or if the cons can’t be ironed out from a Fediverse-like one.

Tag365@lemmy.world · 2 years ago

So why doesn’t a random follower posting a link on Mastodon cause server load issues, but a popular follower does?

Sean Tilley@lemmy.world · 2 years ago

It’s an interesting and frustrating problem. I think there are three potential ways forward, but they’re all flawed:

1. Quasi-Centralization: a project like Mastodon or a vetted Non-Profit entity operates a high-concurrency server whose sole purpose is to cache link metadata and images. Servers initially pull preview data from that, instead of the direct page.
2. We find a way to do this in some zero-trust peer-to-peer way, where multiple servers compare their copies of the same data. Whatever doesn’t match ends up not being used.
3. Servers cache link metadata and previews locally with a minimal amount of requests; any boost or reshare only reflects a proxied local preview of that link. Instead of doing this on a per-view or per-user basis, it’s simply per-instance.

I honestly think the third option might be the least destructive, even if it’s not as efficient as it could be.
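For what the second option could look like, here is a minimal Python sketch, assuming previews are flat string dictionaries and that "matching" means exact equality. The `agreed_preview` function and the `quorum` parameter are invented for illustration; no existing fediverse software is known to me to work this way.

```python
from collections import Counter
from typing import Dict, List, Optional

Preview = Dict[str, str]


def agreed_preview(copies: List[Preview], quorum: int) -> Optional[Preview]:
    """Return the preview that at least `quorum` servers agree on, else None."""
    if not copies:
        return None
    # Turn each dict into a hashable, order-independent key so copies can be counted.
    votes = Counter(tuple(sorted(copy.items())) for copy in copies)
    best, count = votes.most_common(1)[0]
    return dict(best) if count >= quorum else None


honest = {"title": "Please Don't Share Our Links on Mastodon", "image": "cover.png"}
tampered = {"title": "Totally different title", "image": "cover.png"}

print(agreed_preview([honest, dict(honest), tampered], quorum=2) == honest)  # True
print(agreed_preview([honest, tampered], quorum=2))  # None
```

Note that several servers still have to fetch the page independently for there to be anything to compare, so this reduces the load rather than bringing it down to a single request.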

Quacksalber@sh.itjust.works · 2 years ago

As I understand it, 3) already happens. What causes the load is that each connected instance is also loading and caching the preview.
