• 0 Posts
  • 10 Comments
Joined 2 years ago
Cake day: June 27th, 2023


  • The reason I tend to object to these things is that bikeshedding isn’t free: it creates work and technical debt. That raises the bar for changes we ought to make, and I think it puts that bar well above objections which are frequently specific to the US and are largely imaginary (which is my honest interpretation of most of these changes).

    That said, “genocide” is clearly unnecessarily provocative. It’s also not an industry-wide change; it’s just one function, so this particular change seems sensible to me.


  • Be cautious about trusting AI-detection tools. They’re not much better than the AI they’re trying to detect, and they’re just as prone to false positives and false negatives.

    It’s also inherently an arms race. If a tool existed which could easily and reliably detect AI-generated content, the people training these models would just use that tool in their training instead of what they already use, and the AI would quickly learn to defeat it. They also wouldn’t be worrying about their training data being contaminated by the output of existing AI, which is becoming a genuine problem right now.





  • No, I’m arguing that the extra complexity is something to avoid because it creates new attack surfaces, new opportunities for bugs, and is very unlikely to accurately deal with all of the edge cases.

    Especially when you consider that the behaviour we have was established way before there even was a Unicode standard which could have been applied, and when the alternative you want isn’t unambiguously better than what we have now.

    “What is language” is a far more insightful question than you clearly intended, because our collective best answer to that question right now is the Unicode standard, and even that’s not perfect. Making the very core of the filesystem deal with that is a can of worms which a competent engineer wouldn’t open without very good reason, and at best I’m seeing a weak and subjective reason here.
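
To make the edge-case point concrete, here is a minimal Python sketch of how case-insensitive comparison goes wrong once you leave ASCII. The two helper names are hypothetical and only for illustration, and the NFC-plus-casefold comparison is one reasonable approach, not a claim about what any real filesystem does.

```python
import unicodedata


def naive_case_insensitive_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare by lowercasing both sides."""
    return a.lower() == b.lower()


def casefold_equal(a: str, b: str) -> bool:
    """Hypothetical helper: normalize to NFC, then apply Unicode case folding."""
    def fold(s: str) -> str:
        return unicodedata.normalize("NFC", s).casefold()

    return fold(a) == fold(b)


# German: the uppercase of "ß" is "SS", so upper/lower is not a clean round trip.
print("straße".upper())                                   # STRASSE
print(naive_case_insensitive_equal("straße", "STRASSE"))  # False
print(casefold_equal("straße", "STRASSE"))                # True

# Turkish capital dotted I lowercases to TWO code points (i + combining dot).
print(len("İ".lower()))                                   # 2
```

Even the second helper ignores locale-specific rules (Turkish wants a different mapping for "I" than English does), which is exactly the kind of decision a filesystem driver would have to make for everyone.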


  • The reason, I suspect, is fundamentally that there’s no relationship between the uppercase and lowercase characters unless someone goes out of their way to create it. That requires that the filesystem contain knowledge of the alphabet, which might work if all you wanted was to handle ASCII in American English, but isn’t good for a system which needs to support the whole world.

    In fact, the UNIX filesystem isn’t ASCII. It’s also not Unicode. UNIX uses arbitrary byte strings, with special significance given to a very small number of bytes (just ‘/’ and ‘\0’, I think). That means people are free to label files in whatever way they like, and their terminals or other applications are free to render them in whatever way seems appropriate, without the filesystem having to understand Unicode.

    Adding case insensitivity would therefore add significant and unnecessary complexity to the filesystem drivers, and we’d probably take a big step backwards in support for other languages.
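
As a rough illustration of the byte-string model, here is a small Python sketch. `is_valid_component` is a hypothetical helper, and it assumes the only forbidden bytes in a name are ‘/’ and ‘\0’ as described above; it checks nothing else (length limits, reserved names like `.` and `..`, and so on).

```python
def is_valid_component(name: bytes) -> bool:
    """Hypothetical check: non-empty, and contains neither b'/' nor a NUL byte."""
    return len(name) > 0 and b"/" not in name and b"\x00" not in name


# Bytes that are not valid UTF-8 at all are still acceptable under this model.
print(is_valid_component(b"\xff\xfe"))    # True
print(is_valid_component(b"a/b"))         # False
print(is_valid_component(b"a\x00b"))      # False

# Two ways of writing "é": one precomposed code point vs. "e" + combining accent.
composed = "\u00e9".encode("utf-8")       # b'\xc3\xa9'
decomposed = "e\u0301".encode("utf-8")    # b'e\xcc\x81'

# They render the same, but as byte strings they are simply different names.
print(composed == decomposed)             # False
```

A byte-comparing filesystem never has to decide whether those two names are "the same"; a case-insensitive one would have to pick a normalization and a case-folding table and then live with that choice.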