Let’s share stories where your automation efforts have been rejected and you can’t quite understand why! Here’s mine.

  • Oliver Lowe@lemmy.sdf.org (OP) · 1 year ago

    Thanks for sharing. I did a bit of work for a NOC and know exactly what you mean about letting real work slip through your hands. I wasn’t directly responsible for managing the alarms, but it felt strange to be writing software to streamline the workflow. The whole time, I felt I could have just helped the technicians solve the problems they faced day to day, to stop the alarms going off in the first place!

    • thisisnotgoingwell@programming.dev · 1 year ago

      To be fair, most of the work that you have to do in a NOC is total bullshit. About 30% of the time you will be working on technical issues, and for most other people in the NOC, that would mean escalating the technical issues to me. Unfortunately, I had to earn my stripes, which meant working harder than everyone else: doing their work as well as handling all escalations. Eventually, I was promoted to supervisor for my efforts, but I did not want to be in a managerial role.

      The truly tiresome bulk of NOC work is the number of unnecessary alarms. Managing SNMP is a nightmare, and configuring it properly takes deep engineering knowledge. You can either tune the alarm board to show only certain alarms (which means parsing through many alarms to work out what is necessary and what isn’t), or you make sure that onboarded devices are configured locally for which SNMP traps they will send. Typically, the devices’ SNMP settings are never configured, so every alarm gets sent to the SNMP server, and the SNMP server was never tuned to know which alarms it should or shouldn’t show. So you get alarms which don’t really “mean anything” and alarms that “could potentially mean something if it’s correlated with this other alarm,” and most of the work is sifting through so much shit, only to then troubleshoot a network issue on a network that was never documented in the first place.
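
To make the "tune the alarm board" option a bit more concrete, here is a minimal sketch of what that filtering could look like. Everything in it is an assumption for illustration: the `Alarm` type, the rule tables, and the trap names are hypothetical, and a real board would read traps from whatever receiver you actually run rather than from a list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alarm:
    device: str
    trap: str  # trap name as reported by the device (illustrative values)


# Hypothetical rules: traps that are always worth showing.
ALWAYS_SHOW = {"linkDown"}

# Hypothetical rules: traps that only "mean something" when another
# trap is active on the same device at the same time.
SHOW_IF_CORRELATED = {
    "fanFailure": "temperatureHigh",
    "temperatureHigh": "fanFailure",
}


def filter_alarms(alarms):
    """Return only the alarms the board should display."""
    active = {(a.device, a.trap) for a in alarms}
    shown = []
    for a in alarms:
        if a.trap in ALWAYS_SHOW:
            shown.append(a)
        elif (a.device, SHOW_IF_CORRELATED.get(a.trap)) in active:
            # The partner trap is active on the same device.
            shown.append(a)
        # Anything else is treated as noise and dropped.
    return shown


alarms = [
    Alarm("sw1", "linkDown"),
    Alarm("sw1", "authenticationFailure"),
    Alarm("sw2", "fanFailure"),
    Alarm("sw3", "fanFailure"),
    Alarm("sw3", "temperatureHigh"),
]
shown = filter_alarms(alarms)
```

In this sketch, `sw2`'s lone `fanFailure` and `sw1`'s `authenticationFailure` are dropped, while `sw3`'s paired alarms and the `linkDown` get through. The hard part described above is not the code but deciding what belongs in those two rule tables.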