• abhibeckert@lemmy.world
    1 year ago

    I love the comparison of string length of the same UTF-8 string in four programming languages (only the last one is correct, by the way):

    Python 3:

    len("🤦🏼‍♂️")

    5

    JavaScript / Java / C#:

    "🤦🏼‍♂️".length

    7

    Rust:

    println!("{}", "🤦🏼‍♂️".len());

    17

    Swift:

    print("🤦🏼‍♂️".count)

    1

    • Walnut356@programming.dev
      1 year ago

      That depends on your definition of correct lmao. Rust's len() explicitly counts UTF-8 bytes (code units), not Unicode scalar values, because that's the length of the raw bytes contained in the string. There are many times where that value is more useful than the grapheme count.
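
For anyone wondering where 5, 7, and 17 come from, here is a rough Python sketch that measures the same emoji in three different units. The helper name length_views is made up for illustration, and the emoji is written as escapes so the five code points are visible. The grapheme count of 1 is left out because, as far as I know, Python's standard library has no grapheme segmentation and you would need a third-party package for that.

```python
# The facepalm emoji from the thread, spelled out as escapes:
# person facepalming, skin tone modifier, zero-width joiner,
# male sign, variation selector-16.
FACEPALM = "\U0001F926\U0001F3FC\u200D\u2642\uFE0F"


def length_views(s: str) -> dict:
    """Hypothetical helper: measure one string in three different units."""
    return {
        # Unicode code points: what Python 3's len() reports.
        "code_points": len(s),
        # UTF-16 code units: what JavaScript / Java / C# report.
        # Each unit is 2 bytes; "utf-16-le" avoids a byte-order mark.
        "utf16_units": len(s.encode("utf-16-le")) // 2,
        # UTF-8 bytes: what Rust's str::len() reports.
        "utf8_bytes": len(s.encode("utf-8")),
    }


print(length_views(FACEPALM))
# {'code_points': 5, 'utf16_units': 7, 'utf8_bytes': 17}
```

So the first three answers are all consistent with each other; they just count different things. Which one is "the length" depends on whether you are sizing a buffer, indexing into an encoded form, or counting what a user sees on screen.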