I assume they all crib from the same training sets, but surely one of the billion-dollar companies behind them can make their own?

  • hexagonwin@lemmy.sdf.org
    15 hours ago

    wdym ‘disgusting’? isn’t Common Crawl just popular websites (Alexa ranking? idk) crawled and provided raw?