https://en.wikipedia.org/wiki/Private_Use_Areas

I came across a Python library that mapped the ASCII range into one of these non-printable character ranges and then wrote the result into a database. If someone did that manually with a hex table, how would it be detected and mitigated?
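
For concreteness, here is a minimal sketch of the kind of mapping described above. It is not the library's actual code: the `PUA_BASE` offset and both function names are assumptions made up for illustration.

```python
# Hypothetical illustration only: shift each ASCII code point into the
# Basic Multilingual Plane Private Use Area (U+E000..U+F8FF).
PUA_BASE = 0xE000  # assumed offset, not taken from any real library


def ascii_to_pua(text: str) -> str:
    """Replace every ASCII character with a private-use code point."""
    return "".join(
        chr(PUA_BASE + ord(c)) if ord(c) < 0x80 else c for c in text
    )


def pua_to_ascii(text: str) -> str:
    """Reverse the mapping for code points in the shifted ASCII window."""
    return "".join(
        chr(ord(c) - PUA_BASE) if PUA_BASE <= ord(c) < PUA_BASE + 0x80 else c
        for c in text
    )


hidden = ascii_to_pua("DROP TABLE")
restored = pua_to_ascii(hidden)
```

Most fonts have no glyphs for these code points, so `hidden` usually renders as blanks or placeholder boxes even though it round-trips back to the original text.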

  • tal@lemmy.today
    9 hours ago

    Also, (a) userspace could have some higher-level encoding or encryption or compression that happens without the kernel seeing the non-encoded data, and (b) whatever particular Unicode encoding OP is probably thinking of isn’t the only Unicode encoding out there.
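
    A small sketch of both points, using one arbitrarily chosen private-use code point (U+E041, an assumption for illustration). The same character is a different byte sequence in each encoding, and a higher-level encoding such as Base64 hides those bytes entirely, so a byte-pattern filter at a single boundary would miss most of these forms:

```python
import base64

ch = "\ue041"  # arbitrary private-use code point, chosen for illustration

# (b) The same character as bytes in three Unicode encodings.
utf8 = ch.encode("utf-8")
utf16 = ch.encode("utf-16-le")
utf32 = ch.encode("utf-32-le")

# (a) A higher-level encoding applied in userspace: the UTF-8 byte
# pattern no longer appears anywhere in the data that gets written.
wrapped = base64.b64encode(utf8)

for name, data in [("utf-8", utf8), ("utf-16-le", utf16),
                   ("utf-32-le", utf32), ("base64(utf-8)", wrapped)]:
    print(f"{name:14} {data.hex(' ')}")
```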

    That doesn’t, strictly speaking, mean that kernel-level blocking is impossible: you could create some kind of emulated system that inspects everything. It does mean that you couldn’t just inspect data at the points where it normally enters the kernel.

    The answer that is probably most useful to OP is that if it’s a problem for his application, he should validate the input in userspace.
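
    A minimal sketch of what that userspace validation could look like in Python, assuming the application already has decoded `str` values in hand before they reach the database. The function names are hypothetical; the check relies on the standard `unicodedata` module, which reports private-use code points under the general category `Co`:

```python
import unicodedata


def find_private_use(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for every private-use character in text."""
    return [
        (i, f"U+{ord(c):04X}")
        for i, c in enumerate(text)
        if unicodedata.category(c) == "Co"  # Co = "Other, private use"
    ]


def validate(text: str) -> str:
    """Reject input containing private-use characters; otherwise pass it through."""
    hits = find_private_use(text)
    if hits:
        raise ValueError(f"private-use code points found: {hits}")
    return text


print(find_private_use("ok\ue041"))
```

    Whether to reject, strip, or just log such input is an application decision. Rejecting is the conservative choice, but it would also block legitimate private-use characters, such as icon-font glyphs.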