Hi guys, I’ve been working on a self-hostable web analytics platform since the start of this year after being frustrated with Google Analytics and Plausible.

I’ve packed a bunch of cool web analytics features into Rybbit, but I’ve tried very hard to keep the interface simple to use.

https://github.com/rybbit-io/rybbit

Check it out!

  • rekabis@lemmy.ca
    English
    arrow-up
    2
    ·
    9 hours ago

    OpenBSD does not have a docker engine. Can this be installed without docker?

    • Goldflag@lemmy.worldOP
      English
      arrow-up
      4
      ·
      12 hours ago

      From what I know, awstats gets analytics from server-side logs while Rybbit uses a client-side script. So it's not really an apples-to-apples comparison.

    • danhab99@programming.dev
      English
      arrow-up
      3
      ·
      13 hours ago

      The same advantages as all free and open source solutions: it’s free and open source. That means how much it’s going to cost your business is directly under your control. You can decide how to acquire hardware based on your business’s needs. If you want to add or change features, you can decide how to do that based on the deals you have with your programmers (like picking the developer with the best skills and the lowest cost), and then you get to control how much it costs you and how reliable the result is going to be.

      If you feel like the support you get from customer service from Amazon or Google or Microsoft is reliable enough and you don’t need more reliability then go ahead and stick with paid products. But if you already have a team of really expensive and talented engineers you might as well let them solve problems with free and open source equipment.

  • parmesancrabs@sh.itjust.works
    English
    arrow-up
    6
    ·
    24 hours ago

    Always a fan of alternate options, this looks quite tidy! I had a few thoughts / queries. Not at my system right now but I will test it out later.

    I noticed in the screenshots you have a “users” page - but with a cookieless tracking system I would have assumed it wouldn’t be reliable to identify a long-term user past individual sessions? Are you doing some hefty fingerprinting?

    Your features table has a few statements that might need adjusting. For example, GA4’s segmentation sequencing / filtering can be quite complex; I’d argue it’s not limited and is potentially more advanced than Rybbit’s (not tested yet). It also has a user exploration feature.

    Do you have any plans for drag-and-drop style report creation, so that I could create reports with any dimensions / metrics and filter accordingly? I think that would bring a lot of flexibility to the platform for an individual’s bespoke needs.

  • Lung@lemmy.world
    English
    arrow-up
    18
    ·
    1 day ago

    Wow holy crap, great work - the world badly needs this. I’m assuming the mechanism is the same: you inject a JS script into your site. I’m also very interested in pure server-side solutions for analytics, but they can’t hit all the features you did in a generic way afaik.

    • Goldflag@lemmy.worldOP
      English
      arrow-up
      20
      ·
      1 day ago

      Yea, we use a client-side script like almost everyone else. The major difference is that we don’t use cookies so you can avoid a lot of the cookie banner/GDPR nonsense.

      Rybbit definitely isn’t the first open source cookieless web analytics platform (Plausible and Umami are the two other big ones), but it’s probably the most “all-in-one” of all these alternatives.
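Editor's sketch: the thread asks how a cookieless tool can recognise a visitor at all. The code below shows one approach commonly described by cookieless analytics tools, hashing request attributes with a salt that rotates daily. It is an assumption for illustration, not Rybbit's confirmed method; `DailySaltStore` and `visitor_id` are hypothetical names.

```python
import hashlib
import secrets
from datetime import date


class DailySaltStore:
    """Keeps one random salt per calendar day and forgets the previous one."""

    def __init__(self) -> None:
        self._day = None
        self._salt = b""

    def salt_for(self, day: date) -> bytes:
        # Rotate the salt whenever the day changes, so yesterday's IDs
        # can no longer be recomputed or linked to today's.
        if day != self._day:
            self._day = day
            self._salt = secrets.token_bytes(16)
        return self._salt


def visitor_id(salt: bytes, site: str, ip: str, user_agent: str) -> str:
    """Derive a short pseudonymous ID; nothing is stored in the browser."""
    material = b"\x00".join([salt, site.encode(), ip.encode(), user_agent.encode()])
    return hashlib.sha256(material).hexdigest()[:16]


store = DailySaltStore()
ua = "Mozilla/5.0"

first = visitor_id(store.salt_for(date(2025, 1, 1)), "example.com", "203.0.113.7", ua)
again = visitor_id(store.salt_for(date(2025, 1, 1)), "example.com", "203.0.113.7", ua)
next_day = visitor_id(store.salt_for(date(2025, 1, 2)), "example.com", "203.0.113.7", ua)

print(first == again)     # same day, same attributes: same ID
print(first == next_day)  # new day, new salt: the ID changes
```

Under this scheme a visitor is only recognisable within one day, which is why long-term "users" views would need some additional mechanism.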

  • osprior@lemmy.world
    English
    arrow-up
    10
    ·
    1 day ago

    Question: is the self-hosted version less featured than the paid hosted version?

    This looks amazing btw.

    • Goldflag@lemmy.worldOP
      English
      arrow-up
      15
      arrow-down
      2
      ·
      1 day ago

      Only very slightly so. One of the reasons I created Rybbit is because platforms like Plausible and Fathom have much inferior self-hosted versions (very limited feature set and basically never updated). We have a comparison here

      • spacelord@sh.itjust.works
        English
        arrow-up
        21
        arrow-down
        1
        ·
        edit-2
        23 hours ago

        @Goldflag

        I appreciate the intent behind Rybbit, but I have to respectfully disagree with the “only very slightly so” characterization. Looking at your official comparison table, the self-hosted version is missing:

        • Pages View
        • Web Vitals
        • Email reports
        • Google Search Console integration
        • VPN/Crawler/ASN tracking
        • Google/GitHub OAuth
        • Email support

        That’s 7 significant features—which seems more than “very slightly” different.

        More importantly, this raises AGPL compliance questions. Under AGPLv3 Section 13, if users interact with modified AGPL software over a network (your cloud version), you’re required to make the complete corresponding source code available to those users. If these cloud-only features are integrated into the same AGPL-licensed codebase, withholding them from the public repo while running them as a network service appears to conflict with the license terms.

        There are really only two compliant scenarios here:

        1. These features exist in the public repo but are just marketed as “cloud-only” (in which case the comparison table’s misleading)
        2. These features are truly separate proprietary code that interfaces with Rybbit without being part of the AGPL-licensed work (which would require careful architectural separation)

        If it’s neither—if these are AGPL-covered features running in your cloud service but withheld from the repo—that’s exactly the “loophole” the AGPL was designed to close. The irony is that you criticized Plausible and Fathom for having “much inferior self-hosted versions,” yet this appears to be a similar approach.

        Could you clarify the licensing status of these cloud-only features? Are they in the public repo but disabled by default, or are they proprietary additions that don’t derive from the AGPL codebase?

        • Goldflag@lemmy.worldOP
          English
          arrow-up
          1
          ·
          21 hours ago

          Everything is in the repo and cloud features are just toggled off in the self-hosted build.

          • spacelord@sh.itjust.works
            English
            arrow-up
            6
            arrow-down
            3
            ·
            20 hours ago

            @Goldflag,

            Thanks for clarifying! Good to hear everything’s in the repo and that it’s truly AGPL compliant.

            Since as self-hosters we already carry the burden of maintenance, updates, security, and infrastructure costs that cloud users don’t, would you consider documenting how to enable the cloud features in self-hosted setups?

            I see the docs cover basic environment variables, but not for Pages View, Web Vitals, or VPN/ASN tracking. Even if some features need extra config (SMTP, OAuth creds), having that documented would help those of us willing to do the work.

            That would truly differentiate Rybbit from Plausible/Fathom—not just code parity, but empowering self-hosters with full feature access.

  • Otter@lemmy.ca
    English
    arrow-up
    12
    ·
    edit-2
    1 day ago

    You mentioned being frustrated at Plausible. What did you not like about it?

    I haven’t tried Plausible, but it seemed popular

    • Goldflag@lemmy.worldOP
      English
      arrow-up
      11
      ·
      1 day ago

      it didn’t have enough features, especially since the community version is heavily nerfed (it’s missing even funnels)

      • quick_snail@feddit.nl
        English
        arrow-up
        1
        ·
        edit-2
        11 hours ago

        I think that has the same problems, no? Or does podman do signature verification on all the layers it downloads from the container registry?

    • partofthevoice@lemmy.zip
      English
      arrow-up
      7
      ·
      13 hours ago

      Docker is a security risk? … excuse me, what? Can’t you just, idunno, secure the environment that docker runs in? Use rootless images? Use immutable images?

      And, are you asking for something that runs on bare metal? Couldn’t you just install the ISO that the dockerfile uses, then convert the dockerfile logic to an sh script?

        • partofthevoice@lemmy.zip
          English
          arrow-up
          3
          arrow-down
          1
          ·
          edit-2
          10 hours ago

          You can verify the checksum to ensure the contents pulled are exactly the same as what was published. You can also use a private container registry.

          How exactly would docker pull be any more insecure than something like pip install? Or, really anything… Let’s go with your preferred alternative, how are you going to get it on your machine in a more secure way than docker provides?

          Docker uses TLS with registries, layers and manifests have cryptographic digests, checksums, and you can verify the publisher yourself. Push it into your own registry if you want, or just don’t use latest.

          • quick_snail@feddit.nl
            English
            arrow-up
            1
            arrow-down
            2
            ·
            edit-2
            10 hours ago

            Yeah, that’s the insecurity I’m talking about.

            If you want to know how to implement this properly, look at apt. It’s a known issue in Docker; they just haven’t prioritized the fix yet (DCT).

            • partofthevoice@lemmy.zip
              English
              arrow-up
              1
              ·
              edit-2
              10 hours ago

              What are you talking about, “yeah that’s the insecurity I’m talking about.”

              I didn’t mention an insecurity and neither have you. Would you mind being a little more clear than “Docker pull is insecure?”

              Frankly, I was expressing confidence in Docker’s security. It goes without saying, though, that any user can do insecure things like download from untrusted sources. That’s not Docker’s problem though, it’s the user’s.

              Edit: I see now that you added “it’s the download that’s not verified.” Integrity is verified, so I assume you mean authorship (via signing)? I guess you’re saying that, if admin credentials are stolen from a container publisher and the thief force pushes malicious code into the registry under a pre-existing tag—then you would be exposed to that?

              Even in that case, though, a digest cannot be overwritten. Tags can. So you’d just pin the digest to avoid this one attack vector?
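Editor's sketch: the tag-versus-digest point above can be illustrated with plain SHA-256. This is a simplified model, not Docker's actual client code; the `tags` dictionary and the manifest bytes are made up for the example.

```python
import hashlib


def content_digest(blob: bytes) -> str:
    """Content-addressed identifier in the familiar 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def matches_pinned(blob: bytes, pinned: str) -> bool:
    """True only if the downloaded bytes hash to the digest we pinned."""
    return content_digest(blob) == pinned


# Hypothetical manifest contents, standing in for a real image manifest.
original = b'{"layers": ["base", "app-v1"]}'
tampered = b'{"layers": ["base", "app-evil"]}'

# Pin the digest once, after reviewing the original content.
pinned = content_digest(original)

# A tag is just a mutable pointer: whoever controls the registry entry
# can re-point it at different content.
tags = {"latest": original}
tags["latest"] = tampered

original_ok = matches_pinned(original, pinned)
retagged_ok = matches_pinned(tags["latest"], pinned)
print(original_ok)  # content still matches the pin
print(retagged_ok)  # re-pointed tag no longer matches
```

The check shows that the bytes are the ones that were pinned. It does not show who published them, which is a separate question that signatures address.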

              • quick_snail@feddit.nl
                English
                arrow-up
                1
                ·
                10 hours ago

                Checksums are not for security. You need signatures. I’m not making claims that aren’t clearly documented.

                • partofthevoice@lemmy.zip
                  English
                  arrow-up
                  1
                  ·
                  10 hours ago

                  You’re talking about authorship. Sure. But if you verify the container yourself as secure and pin the digest, what’s the issue?

      • LordKitsuna@lemmy.world
        English
        arrow-up
        1
        ·
        12 hours ago

        In its default state I think that’s fair. For example, Docker bypasses most firewalls because its rules run before your own iptables rules are processed. So if you don’t either use 127.0.0.1:port:port (many compose files offered by projects do not do this) or add specialized iptables rules to fix that up, you can end up directly exposing services without meaning to or even realizing.

        And yeah privilege escalation etc. There are solutions like what you mentioned but it can be a lot of work to set all that up so most people won’t
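Editor's sketch: a small check for the `127.0.0.1:port:port` advice above. It flags short-form port mappings that are not bound to a loopback address. The function names and sample mappings are hypothetical, and it only handles the short string syntax (`[IP:]HOST:CONTAINER[/proto]`), not the long-form mapping syntax.

```python
import ipaddress


def host_ip(mapping: str):
    """Return the host IP from a short-form port mapping, or None if absent."""
    spec = mapping.split("/", 1)[0]  # drop a trailing "/tcp" or "/udp"
    if spec.startswith("["):         # bracketed IPv6 host, e.g. "[::1]:8080:80"
        return spec[1:spec.index("]")]
    parts = spec.split(":")
    if len(parts) == 3:              # "IP:HOST_PORT:CONTAINER_PORT"
        return parts[0]
    return None                      # "HOST_PORT:CONTAINER_PORT" or "CONTAINER_PORT"


def is_loopback_only(mapping: str) -> bool:
    """True when the mapping is explicitly bound to a loopback address."""
    ip = host_ip(mapping)
    return ip is not None and ipaddress.ip_address(ip).is_loopback


# Example mappings as they might appear under "ports:" in a compose file.
mappings = [
    "3000:3000",
    "127.0.0.1:8123:8123",
    "0.0.0.0:5432:5432",
    "[::1]:9000:9000/tcp",
]

exposed = [m for m in mappings if not is_loopback_only(m)]
print(exposed)
```

Anything in `exposed` is published on all interfaces, or on a non-loopback one, and is worth a second look before deploying on a public host.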

  • StarkZarn@infosec.pub
    English
    arrow-up
    3
    ·
    1 day ago

    Glad to see you post this here. I’ve been experimenting with selfhosted analytics for a while now and have attempted your project here a couple times. The thing that kills me is the Clickhouse requirement. It makes it impossible to host on a lightweight VPS. Like why should my analytics platform require so much more compute than my simple static site? Am I missing something?

    • Goldflag@lemmy.worldOP
      English
      arrow-up
      8
      ·
      1 day ago

      Clickhouse definitely takes a lot of resources! There’s unfortunately no way around that, though in my experience it runs fine on the cheapest Hetzner instances which are like $3-4 a month for 2GB of RAM. How lightweight is your VPS?

      And yea, you don’t need ClickHouse for a simple static site. I chose ClickHouse because Postgres or MySQL does not scale well, and the main site I personally use Rybbit for sends around 20 million events a month.

      It pains me to plug my competitors, but check out Umami or Goatcounter if you want a platform that uses postgres.

      • StarkZarn@infosec.pub
        English
        arrow-up
        1
        ·
        19 hours ago

        Hey thanks so much for the engagement. I was trying to run it on a VPS that cost $35/year. 2GiB of RAM wasn’t quite enough to make it work for me, granted that was with the webserver and ancillary supporting services.

        I’ll find an opportunity to test it out though, as rybbit looks great. I appreciate the mention on the other FOSS products, that’s a good look for you. I have plenty of experience with umami already. Cheers!

    • Goldflag@lemmy.worldOP
      English
      arrow-up
      6
      ·
      1 day ago

      Posthog makes it almost impossible to actually self-host since they try to push you onto the cloud as much as possible. They say that the self-hosted version only works well up until 100k events … which is insane since their cloud free tier is 1 million events. It’s actually the reason why I built Rybbit. I tried to self-host Posthog on my server but it ran up to 100% CPU on 8 cores and didn’t even work.

      Ok posthog rant done.

      The other main difference is that Posthog has like 10+ different products all in one. Their web analytics is good, but it’s just kind of bland (imo) because it’s not their main focus.

  • solrize@lemmy.ml
    English
    arrow-up
    5
    arrow-down
    2
    ·
    1 day ago

    Aren’t there tons of these already? Piwik has been around for quite a while, plus there are others mentioned in the comments.