• melfie@lemy.lol

    I regularly use GH Copilot with Claude Sonnet at work, and while it’s a coin toss whether any given suggestion is actually useful, overall I do find value in it. For my own use at home, I don’t do subscriptions for software, and I’m also not giving these companies my data. I would self-host something like Qwen3 with llama.cpp, but running the flagship MoE model would basically require a $10k GPU and one hell of a PSU. I could probably self-host a smaller model, but it wouldn’t be nearly as useful, and I’m not sure it would even be worth the effort.
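    To put rough numbers on that, here’s a minimal back-of-the-envelope sketch. It assumes the flagship is Qwen3-235B-A22B quantized to roughly 4.5 bits per weight (typical of a llama.cpp Q4_K_M GGUF); both the model and the quantization level are assumptions, not measurements:

    ```python
    # Back-of-the-envelope VRAM estimate for self-hosting a flagship MoE model.
    # Assumptions: Qwen3-235B-A22B (235B total parameters) at ~4.5 bits per
    # weight (roughly a llama.cpp Q4_K_M GGUF), plus ~15% runtime overhead
    # for KV cache and activations. All figures are ballpark.

    total_params = 235e9      # total parameters (MoE keeps ALL experts resident)
    bits_per_weight = 4.5     # assumed quantization level
    overhead = 1.15           # assumed KV cache / activation overhead

    weights_gb = total_params * bits_per_weight / 8 / 1e9
    needed_gb = weights_gb * overhead

    print(f"Weights alone: ~{weights_gb:.0f} GB")   # ~132 GB
    print(f"With overhead: ~{needed_gb:.0f} GB")    # ~152 GB
    # Far beyond a 24 GB consumer card; you're into multi-GPU rigs or a
    # single ~$10k datacenter-class card with huge VRAM.
    ```

    Even though only ~22B parameters are active per token, all 235B still have to be resident in memory, which is where the $10k GPU (or CPU offloading at painful speeds) comes in.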

    Therein lies the problem. My company is paying a monthly fee for me to use Copilot that would take something like 20 years to pay for even one of the $10k GPUs I’m likely hogging for minutes at a time, and these companies are going to spend trillions building data centers full of those GPUs. It’s obvious that the price we’re paying for AI now doesn’t cover the expense of actually running it. It might once these models become efficient enough to run on a normal machine, but in that case, why run them in a data center at all instead of on the user’s local machine? I’m just not following how these new data centers are going to pay for themselves, though maybe my math is wrong, or I’m ignorant of the economies of scale of hosting these models for a large user base.
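    The payback arithmetic, as a minimal sketch. The monthly fee is an assumption (GitHub Copilot Business lists around $19/user/month and Enterprise around $39); I don’t know what my company actually pays:

    ```python
    # Payback period for one $10k GPU funded by a per-seat subscription.
    # The fee is an assumption: Copilot Business lists ~$19/user/month,
    # Enterprise ~$39; real contract pricing varies.

    gpu_cost = 10_000      # USD, one datacenter-class inference GPU
    monthly_fee = 39       # USD/user/month, assumed enterprise tier

    months = gpu_cost / monthly_fee
    print(f"~{months:.0f} months (~{months / 12:.1f} years) per GPU")
    # -> ~256 months, about 21 years. And that ignores power, cooling, and
    #    networking; the counterweight is that one GPU serves many users
    #    through batching, which is exactly the economies-of-scale question.
    ```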