According to the release:

Adds experimental PostgreSQL support

The code was written by Cursor and Claude

14,997 added lines of code, and 10,202 lines removed

reviewed and heavily tested over 2-3 weeks

This makes me uneasy, especially as ntfy is an internet-facing service. I am now looking for alternatives.

Am I overreacting or do you all share the same concern?

  • henfredemars@infosec.pub
    2 days ago

    Definitely share your initial concern. Without strong review processes to ensure that every line of code follows the intent of the human developer, there’s no way of knowing what exactly is in there and the implications for the human users. And I’m not just talking about bugs.

    They say it’s reviewed, but the temptation to blindly trust is there. In this case, the developer appears to have taken some care.

    The code was written by Cursor and Claude, but reviewed and heavily tested over 2-3 weeks by me. I created comparison documents, went through all queries multiple times and reviewed the logic over and over again. I also did load tests and manual regression tests, which took lots of evenings.

    Let us hope so. Handle with care to ensure responsibility is not offloaded to a machine instead of a person.

    • Slotos@feddit.nl
      2 days ago

      The size of that changeset means that it’s inherently unreviewable.

      The commit history is something I’ve seen only in the PRs that even the most dysfunctional companies would demand a rewrite for.

      Also, 2-3 weeks of review? PostgreSQL support could be added in that time without the need for a damn “vibe check”. Hell, it would probably take less time than that.

      • Mirror Giraffe@piefed.social
        2 days ago

        To be fair they would have needed to spend time testing the manual implementation as well.

        The main problem I see is that even if this rolls out perfectly, the erratic and changing nature of LLMs still makes it pointless as a proof of concept. Next time Claude might fuck up in a fringe way that’s not covered by unit tests and is missed by manual tests.

        On the other hand, I guess I’ve been guilty myself on numerous occasions of introducing fringe bugs into production code, but at least I learn from it.

        • Slotos@feddit.nl
          1 day ago

          I made my statement as a BDD/TDD practitioner.

          The core goal of software engineering is not just to deliver said code, but to deliver it in a framework that lets others (and consequently me in a week’s time) contribute easily. This makes both future improvements and bug fixes easier.

          Dumping a ~25,000-line changeset with a git history that’s almost designed to confuse is antithetical to both engineering and open source.