EDIT

TO EVERYONE ASKING TO OPEN AN ISSUE ON GITHUB, IT HAS BEEN OPEN SINCE JULY 6: https://github.com/LemmyNet/lemmy/issues/3504

June 24 - https://github.com/LemmyNet/lemmy/issues/3236

TO EVERYONE SAYING THAT THIS IS NOT A CONCERN: Everybody has different laws in their countries (in other words, not everyone is American), and regardless of whether an admin is liable for such content residing on their servers without their knowledge, don’t you think it’s still an issue? Are you not bothered by the fact that somebody could be sharing illegal images from your server without you ever knowing? Is that okay with you? Or are you only saying this because you’re NOT an admin? Several admins have already responded in the comments and suggested ways to solve the problem, because they are as genuinely concerned about it as I am. Thank you to all the hard-working admins. I appreciate and love you all.


ORIGINAL POST

cross-posted from: https://lemmy.ca/post/4273025

You can upload images to a Lemmy instance without anyone knowing that the image is there if the admins are not regularly checking their pictrs database.

To do this, you create a post on any Lemmy instance, upload an image, and never click the “Create” button. The post is never created but the image is uploaded. Because the post isn’t created, nobody knows that the image is uploaded.

You can also go to any post, upload a picture in a comment, copy the URL, and never post the comment. You can also upload an image as your avatar or banner and just close the tab. The image will still reside on the server.

You can (possibly) do the same with community icons and banners.
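
To make the mechanics concrete, here is a minimal sketch of what the browser effectively does when you attach an image. It assumes the instance exposes the usual pict-rs upload route that Lemmy proxies at /pictrs/image, that the multipart field is named images[], and that the jwt cookie is used for authentication; treat those details as assumptions about current Lemmy versions rather than a stable API.

```python
import requests

# Hypothetical values: point these at a test instance and a throwaway account.
INSTANCE = "https://lemmy.example.org"
JWT = "eyJ..."  # session token from POST /api/v3/user/login

# Upload an image the same way the web UI does (assumed endpoint and field
# name, based on the pict-rs API that Lemmy proxies under /pictrs/image).
with open("cat.jpg", "rb") as f:
    resp = requests.post(
        f"{INSTANCE}/pictrs/image",
        files={"images[]": ("cat.jpg", f, "image/jpeg")},
        cookies={"jwt": JWT},
        timeout=30,
    )
resp.raise_for_status()

# The response names the stored file; the image is now publicly reachable
# even though no post or comment was ever created.
filename = resp.json()["files"][0]["file"]
print(f"{INSTANCE}/pictrs/image/{filename}")
```

The same thing happens when the web UI uploads an avatar, a banner, or a comment image and the form is then abandoned.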

Why does this matter?

Because anyone can upload illegal images without the admin knowing, and the admin will be liable for it. With everything that has been going on lately, I wanted to remind all of you about this. Don’t think that disabling the image cache is enough. Bad actors can secretly stash illegal images on your Lemmy instance if you aren’t checking!

These bad actors can then share these links around and you would never know! They can report it to the FBI, and if you haven’t taken it down within a certain period (because you did not know it was there), say goodbye to your instance and see you in court.

Only your backend admins who have access to the database (or object storage or whatever) can check this, meaning non-backend admins and moderators WILL NOT BE ABLE TO MONITOR THESE, and regular users WILL NOT BE ABLE TO REPORT THESE.

Aren’t these images deleted if they aren’t used for the post/comment/banner/avatar/icon?

NOPE! The image stays uploaded! Lemmy doesn’t check whether images are actually used. Try it out yourself. Just make sure to save the link first, either by copying the link text or by clicking the image and choosing “Copy image link”.

How come this hasn’t been addressed before?

I don’t know. I am fairly certain that this has been brought up before. Nobody paid attention, but I’m bringing it up again after all the shit that happened in the past week. I can’t even find it on the GitHub issue tracker.

I’m an instance administrator, what the fuck do I do?

Check your pictrs images (good luck) or nuke them. Disable pictrs, restrict sign-ups, or watch your database like a hawk. You can also delete your instance.

Good luck.

  • Swedneck@discuss.tchncs.de · 1 year ago

    seems like the solution to this should be to automatically remove images that haven’t been posted, after like 3 minutes

    • Venat0r@lemmy.world · 1 year ago

      Or make it like 1hr and don’t let the user know the url of the uploaded image until they post it, that way it wouldn’t be able to be shared or reported.

      • squiblet@kbin.social · 1 year ago

        It’s difficult to display an image without the client knowing the URL, but it would be possible to use a temporary URL that only works for that signed-in user.
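
        One sketch of that idea, using a hypothetical /pictrs/preview route (nothing like this exists in Lemmy today): sign the image ID, the viewer’s user ID, and an expiry timestamp with a server-side secret, and refuse to serve the file unless the signature verifies and the link hasn’t expired.

        ```python
        import hashlib
        import hmac
        import time

        SECRET = b"server-side-secret"  # hypothetical signing key, never sent to clients

        def make_preview_url(image_id: str, user_id: int, ttl: int = 600) -> str:
            """Build a temporary preview URL bound to one user and an expiry time."""
            expires = int(time.time()) + ttl
            payload = f"{image_id}:{user_id}:{expires}".encode()
            sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
            return f"/pictrs/preview/{image_id}?user={user_id}&exp={expires}&sig={sig}"

        def is_valid(image_id: str, user_id: int, expires: int, sig: str) -> bool:
            """Server side: accept only an unexpired link with a matching signature."""
            payload = f"{image_id}:{user_id}:{expires}".encode()
            expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
            return hmac.compare_digest(expected, sig) and time.time() < expires
        ```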

    • KIM_JONG_JUICEBOX@lemmy.ml · 1 year ago

      Or you set a flag that says something like “incomplete image” and then only once user completes whatever operation by hitting “submit” do you then set it to complete.

      And maybe while an image is not yet complete, only the uploading user can view the image.
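
      A sketch of that flag-plus-grace-period idea against a hypothetical uploads table (Lemmy’s real schema has nothing like this yet): every upload starts as pending, the submit handler flips it, and a periodic job purges whatever is still pending after an hour.

      ```python
      import psycopg2

      # Hypothetical table: uploads(id, uploader_id, pending, uploaded_at)
      MARK_COMPLETE = """
      UPDATE uploads SET pending = false
      WHERE id = %s AND uploader_id = %s;
      """

      PURGE_STALE = """
      DELETE FROM uploads
      WHERE pending AND uploaded_at < now() - interval '1 hour'
      RETURNING id;
      """

      def mark_complete(conn, image_id: str, uploader_id: int) -> None:
          """Call from the post/comment submit handler once the image is actually used."""
          with conn, conn.cursor() as cur:
              cur.execute(MARK_COMPLETE, (image_id, uploader_id))

      def purge_stale(conn) -> list[str]:
          """Run periodically; returns IDs so the caller can also remove the files."""
          with conn, conn.cursor() as cur:
              cur.execute(PURGE_STALE)
              return [row[0] for row in cur.fetchall()]

      if __name__ == "__main__":
          conn = psycopg2.connect("dbname=lemmy user=lemmy host=localhost")
          print("purged:", purge_stale(conn))
      ```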

  • gencha@lemm.ee · 1 year ago

    This is not unique to Lemmy. You can do the same on Slack, Discord, Teams, GitHub, … Finding unused resources isn’t trivial, and you’re usually better off ignoring the noise.

    If you upload illegal content somewhere, and then tell the FBI about it, being the only person knowing the URL, let me know how that turns out.

    • bmygsbvur@lemmy.ca (OP) · 1 year ago

      Imagine if the image link is shared to other people and you aren’t aware of it. You think that’s acceptable?

      • gencha@lemm.ee · 1 year ago

        I do not. As far as I’m aware, this is usually countered through a proper way to follow through on reports. If you host user-generated content, have an abuse contact who will instantly act on reports, delete reported content, and report whatever metadata came along with the upload to the authorities if necessary.

        The bookkeeping code for keeping track of unused uploads has a cost attached to it. I claim that most providers are not willing to pay that cost proactively, and prefer to act on reports.

        I can only extrapolate from my own experience though. No idea how the industry at large really handles or reasons about this.

    • BreakDecks@lemmy.ml · 1 year ago

      I hate how everything is a double-edged sword, because this is now also the perfect tool for making sure your CSAM doesn’t trip the filter. Also, it uses CLIP, so a simple obfuscation overlay would render it useless.

        • BreakDecks@lemmy.ml · 1 year ago

          Any filter or image-processing technique that fools machine vision.

          Example: https://sandlab.cs.uchicago.edu/fawkes/

          At a high level, Fawkes “poisons” models that try to learn what you look like, by putting hidden changes into your photos, and using them as Trojan horses to deliver that poison to any facial recognition models of you.

          This could be done with any kind of image or detail, not just faces.

          • db0@lemmy.dbzer0.com · 1 year ago

            I don’t think random trolls like that would be that sophisticated, but in any case we can deal with that once we get to that point.

  • planish@sh.itjust.works · 1 year ago

    Why does Lemmy even ship its own image host? There are plenty of places to upload images you want to post that are already good at hosting images, arguably better than pictrs is for some applications. Running your own opens up whole categories of new problems like this that are inessential to running a federated link aggregator. People selfhost Lemmy and turn around and dump the images for “their” image host in S3 anyway.

    We should all get out of the image hosting business unless we really want to be there.

    • Gecko@lemmy.world · 1 year ago

      Convenience for end users and avoiding link rot are probably among the reasons.

      • BitOneZero @ .world@lemmy.world · 1 year ago

        “and avoiding link rot”

        Lemmy seems built to destroy information and rot links. Unlike Reddit, which has kept content around for 15 years, Lemmy removes all of a person’s posts and comments when they delete their account, creating a black hole.

        Not only do the deleted account’s comments disappear; all the comments other users made under those posts and comments disappear as well.

        Right now, a single user deleting one comment makes the entire branch of comment replies disappear.

        Installing an instance is quick; over 1,000 new instances went online in June because of the Reddit API change. But once an instance goes offline, all the communities hosted there are orphaned, and no cleanup code really exists to salvage any of it, because the whole system was built around deleting comments and posts. In the designers’ minds, deleting an instance is pretty much a purge of everything it ever created.

      • planish@sh.itjust.works · 1 year ago

        Seems to not be paying off though; having whole communities and instances close is pretty inconvenient.

    • squiblet@kbin.social · 1 year ago

      S3 is expensive, while if you use a third party like img.bb or imgur, you never know when they will close, accidentally lose your data, or decide to delete it.

  • 𝘋𝘪𝘳𝘬@lemmy.ml · 1 year ago

    This is how it works. Since pictrs and Lemmy are two completely different applications (they even run in two different containers with two different databases), they do not communicate, and tracking which images belong to which post or comment simply isn’t possible in the current state, I guess.

    “How come this hasn’t been addressed before?”

    This is how the Fediverse works. There are so many bad practices, so much haphazardly implemented functionality, and so much bad API documentation all over the place that I wonder why nothing has spectacularly exploded so far. We don’t even have proper data protection, and everything is replicated everywhere, causing a shitload of legal issues all over the world, but no one seems to care so far.

    • danwardvs@sh.itjust.works · 1 year ago

      This isn’t unique to Lemmy or haphazard coding. It’s a common technique for getting pictures into GitHub READMEs: you’d create a PR, upload an image, copy the link, delete the PR, and then paste the link elsewhere on GitHub for use.

  • BreakDecks@lemmy.ml · 1 year ago

    “the admin will be liable for it.”

    “These bad actors can then share these links around and you would never know! They can report it to the FBI, and if you haven’t taken it down within a certain period (because you did not know it was there), say goodbye to your instance and see you in court.”

    In most jurisdictions this is not how it would work. Even a less tech-savvy investigator would figure out that it was an online community not obviously affiliated with CSAM, and would focus on alerting you and getting the content removed.

    There’s this misunderstanding that CSAM is some sort of instant go-to-prison situation, but it really does depend on context. It’s generally not so easy to just plant illegal files and tip off the FBI, because the FBI is strategic enough not to be weaponized like that. Keep an eye on your abuse and admin email inboxes, and take action as soon as you see something, and nobody is going to shut you down or drag you to court.

    • bmygsbvur@lemmy.ca (OP) · 1 year ago

      Doesn’t change the fact that this is an issue that needs to be resolved.

      • koper@feddit.nl · 1 year ago

        It’s not. Image hosting sites have existed for decades. Websites are not liable unless they have actual knowledge of illegal content and ignore takedown requests. Stop fearmongering.

        • bmygsbvur@lemmy.ca (OP) · 1 year ago

          Doesn’t change the fact that this issue needs to be addressed. Besides, do you think all countries’ laws are the same?

      • BreakDecks@lemmy.ml · 1 year ago

        Never said otherwise, I just want to make sure we’re not scaring people away from Lemmy administration and moderation, as if they were risking going to prison as a child sex offender or something.

  • garrett@infosec.pub · 1 year ago

    Yeah, this is a big issue. I know Lemmy blew up a bit before it was truly ready for prime time, but I hope this gets cleaned up.

  • Kool_Newt@lemm.ee · 1 year ago

    This is just like how someone could put printed CSAM behind a bush in my yard or something and some authorities could decide to hold me responsible.

    • bmygsbvur@lemmy.ca (OP) · 1 year ago

      So you’re telling me you’re NOT bothered if CSAM were sitting on your server and being shared with others without your knowledge? Do you think all countries have the same laws? You don’t think any of this is an issue?

      • Kool_Newt@lemm.ee · 1 year ago

        You’re a regarded one. I won’t bother to answer such dumb questions.

        • bmygsbvur@lemmy.ca (OP) · 1 year ago

          You’re not an admin so of course you don’t care. How come every admin in this thread has expressed their concern? Because it IS a concern. :)

  • spiritedpause@sh.itjust.works · 1 year ago

    There really needs to be an option for instances to upload images to imgur using their API.

    imgur has been hosting images for years, and has the resources and experience to deal with stuff like CSAM.

    Hosting an instance shouldn’t mean, as the default and only option, opening the floodgates for anyone to upload images to your servers.

    From a liability standpoint alone, it’s an absurd thing to just expect every instance to accept.

    • bmygsbvur@lemmy.ca (OP) · 1 year ago

      Most admins aren’t in the USA. But that’s not really the issue here is it?

    • TORFdot0@lemmy.world · 1 year ago

      Are individuals granted the same Section 230 protections as organizations when it comes to self-hosting an instance? I doubt people are forming non-profits for their self-hosting endeavors.

        • TORFdot0@lemmy.world · 1 year ago

          Thank you! That’s a clear and concise explanation of section 230. I’ve always heard it in reference to big social media companies but your link clearly shows the protections extend to individuals and users as well

  • Admiral Patrick@dubvee.org · 1 year ago

    Just my two cents, but I feel it’s quite irresponsible to post a “how to exploit this platform” guide ON the platform.

    • bmygsbvur@lemmy.ca (OP) · 1 year ago

      This has been known forever. Any bad actor already knows about this. There’s no reason to hide it. I am reminding people so this can be fixed sooner. I will keep reminding people until the problem is solved.

        • bmygsbvur@lemmy.ca (OP) · 1 year ago

          Thank you. I did not see this one, but it’s almost two months old now. This is what I was talking about when I said that it was already a known issue back then; it just isn’t being addressed. I hope this post will give more attention to this problem.

      • Chickenstalker@lemmy.world · 1 year ago

        Meh. I main 4chan. All sorts of shit gets uploaded on 4chan, yet it still exists. I’m not saying nothing should be done, but there’s no need to panic. Quietly delete the images periodically. In terms of what users can do, I suggest a report system where, after a certain number of similar reports, the media gets auto-pulled for moderation.
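
        A rough sketch of that threshold idea; REPORT_THRESHOLD and the hide_image() hook are hypothetical stand-ins for whatever the real moderation pipeline would do.

        ```python
        from collections import defaultdict

        REPORT_THRESHOLD = 3  # hypothetical: pull after this many distinct reporters

        _reporters: dict[str, set[int]] = defaultdict(set)

        def hide_image(image_id: str) -> None:
            # Placeholder: a real instance would make the pictrs file unavailable
            # and put it in a moderator review queue.
            print(f"image {image_id} pulled pending review")

        def report_image(image_id: str, reporter_id: int) -> bool:
            """Record a report; returns True once the image has been auto-pulled."""
            _reporters[image_id].add(reporter_id)
            if len(_reporters[image_id]) >= REPORT_THRESHOLD:
                hide_image(image_id)
                return True
            return False
        ```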

        • bmygsbvur@lemmy.ca (OP) · 1 year ago

          “Quietly delete the images periodically.” If only that were easy for admins. You can’t even report these images, because nobody knows they are there in the first place.

  • squiblet@kbin.social · 1 year ago

    It would not be difficult to use SQL to delete any images that are not associated with a post or in use as an avatar, etc. Set that to run periodically and it would solve this problem.

    • gencha@lemm.ee · 1 year ago

      Checking every single image ID against all stored text blobs is not trivial. Most platforms don’t do this. It’s cheaper to just ignore the unused images.

      • squiblet@kbin.social · 1 year ago

        Yeah, this is only if what OP was saying was a real legal threat, which I don’t think it is.

    • bmygsbvur@lemmy.ca (OP) · 1 year ago

      I’m not knowledgeable about SQL. If you or anyone else knows how to fix this with a script, or how to build it into Lemmy, please share.

      • squiblet@kbin.social · 1 year ago

        I haven’t worked with Lemmy, but I certainly could craft a script to do that if I was familiar with the database structure. Perhaps I’ll try installing it and running an instance. In the meantime, surely there’s someone with an instance and SQL skills who could figure that out.

    • Kaldo@kbin.social · 1 year ago

      Isn’t it more likely that paths are used to reference resources like images, rather than a DB foreign key?

      • squiblet@kbin.social · 1 year ago

        Not familiar with Lemmy specifically, but usually in an app like this, while the files are of course stored on a filesystem, IDs and metadata are stored in the DB and associated with each other through relations. In this case the rule would be something like “keep every image that is associated with a valid post or in-use avatar, and delete everything else”.

        Take this random image for instance: https://lemmy.world/pictrs/image/ede63269-7b8a-42a4-a1fa-145beea682cb.jpeg
        associated with this post: https://lemmy.world/post/4130981

        Highly likely there is either an entry for post 4130981 that records that it uses ede63269-7b8a-42a4-a1fa-145beea682cb, or an image table related to the post table in which the entry for ede63269-7b8a-42a4-a1fa-145beea682cb points back to post 4130981. Whatever the specifics, it would be possible.
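
        Building on that, here is a sketch of the database half of such a script: it collects every pictrs file name still referenced by posts, comments, avatars, banners, and icons. The column names match Lemmy around 0.18 as far as I know, so verify them against your own schema; anything pict-rs hosts that is not in this list is an orphan candidate, and how you enumerate and purge those files depends on your storage setup (local disk vs. object storage), so that half is left to the admin.

        ```python
        import re
        import sys

        import psycopg2

        # Columns that can contain pictrs links, directly or inside markdown bodies.
        # Verify these names against your own Lemmy schema before trusting the output.
        QUERY = """
                  SELECT url::text           FROM post      WHERE url IS NOT NULL
        UNION ALL SELECT thumbnail_url::text FROM post      WHERE thumbnail_url IS NOT NULL
        UNION ALL SELECT body                FROM post      WHERE body IS NOT NULL
        UNION ALL SELECT content             FROM comment
        UNION ALL SELECT avatar::text        FROM person    WHERE avatar IS NOT NULL
        UNION ALL SELECT banner::text        FROM person    WHERE banner IS NOT NULL
        UNION ALL SELECT icon::text          FROM community WHERE icon IS NOT NULL
        UNION ALL SELECT banner::text        FROM community WHERE banner IS NOT NULL
        """

        # pictrs file names look like ede63269-7b8a-42a4-a1fa-145beea682cb.jpeg
        PICTRS_RE = re.compile(r"/pictrs/image/([0-9a-f\-]{36}\.\w+)")

        def referenced_images(dsn: str) -> set[str]:
            """Every pictrs file name the Lemmy database still points at."""
            referenced: set[str] = set()
            with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
                cur.execute(QUERY)
                for (value,) in cur:
                    referenced.update(PICTRS_RE.findall(value or ""))
            return referenced

        if __name__ == "__main__":
            # Usage: python referenced_images.py "dbname=lemmy user=lemmy host=localhost"
            for name in sorted(referenced_images(sys.argv[1])):
                print(name)
        ```

        From there, an admin could diff this list against whatever pict-rs reports as stored (or against the object-storage bucket) and purge the difference, ideally after a manual review.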