• Echo Dot@feddit.uk
        2 years ago

        Well that’s an easy problem to solve by not being a useless programmer.

        • Throwaway@lemm.ee
          2 years ago

          You’d think so, but it’s just not. Pretend “Gamer” is a slur. I can type it “G A M E R”, I can type it “GAm3r”, I can type it “GMR”, or I can mix and match. It’s a never-ending battle.
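
To make that concrete, here is a minimal sketch, assuming a naive filter that matches the placeholder word literally. The pattern and the `variants` list are illustrative only and are not taken from any real filter.

```python
import re

# Hypothetical naive filter: match the placeholder word as a whole word,
# ignoring case.
pattern = re.compile(r"\bgamer\b", re.IGNORECASE)

# The spellings mentioned in the comment above.
variants = ["Gamer", "G A M E R", "GAm3r", "GMR"]

# Map each spelling to whether the literal pattern catches it.
results = {v: bool(pattern.search(v)) for v in variants}

for spelling, caught in results.items():
    print(f"{spelling!r}: {'caught' if caught else 'missed'}")
```

Only the plain spelling is caught. Every new pattern written to cover one variant leaves room for another.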

          • Echo Dot@feddit.uk
            2 years ago

            That’s because regular expressions are a terrible way to try to solve the problem. You don’t do exact matching; you do probabilistic pattern matching, and if the probability of something exceeds a preset threshold, you block it. Then you adjust that threshold based on how often the comment comes up in your data set. After that, it’s just a matter of massaging your probability values.
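
A rough sketch of the score-and-threshold idea, under some loud assumptions: the "probability" here is just a string-similarity ratio from Python's `difflib`, not a trained model, and the names `LEET`, `normalize`, `score`, `should_block`, and the `0.7` threshold are all made up for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical table of common character swaps; a real one would be larger.
LEET = str.maketrans({"3": "e", "4": "a", "0": "o", "1": "i",
                      "5": "s", "7": "t", "@": "a", "$": "s"})


def normalize(text: str) -> str:
    """Lowercase, undo common character swaps, keep letters only."""
    swapped = text.lower().translate(LEET)
    return "".join(ch for ch in swapped if ch.isalpha())


def score(text: str, term: str) -> float:
    """Best similarity (0.0 to 1.0) between the term and any window of
    the normalized text that is as long as the term."""
    cleaned = normalize(text)
    if not cleaned:
        return 0.0
    width = len(term)
    if len(cleaned) <= width:
        windows = [cleaned]
    else:
        windows = [cleaned[i:i + width]
                   for i in range(len(cleaned) - width + 1)]
    return max(SequenceMatcher(None, w, term).ratio() for w in windows)


def should_block(text: str, term: str = "gamer",
                 threshold: float = 0.7) -> bool:
    """Block when the similarity score reaches the preset threshold."""
    return score(text, term) >= threshold


for sample in ["G A M E R", "GAm3r", "GMR", "hello world"]:
    print(f"{sample!r}: blocked={should_block(sample)}")
```

The threshold is the value you would tune against your own data. A scheme this simple also flags innocent text such as "game rules", since the letters run together after normalization, so tuning the values is where most of the work ends up.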