Are there any ways to block Unicode private use areas from coding at the kernel level?

𞋴𝛂𝛋𝛆@lemmy.world · 2 months ago

It is non-printing. It cannot be seen, scanned, or highlighted. It looks like nothing, except that the file size is large, with more hex than should be in the binary.

Trigg@lemmy.world · 2 months ago

I’m still not seeing why that is a problem. The information remains even if it has no glyphs.

𞋴𝛂𝛋𝛆@lemmy.world · 2 months ago

It does not. It can be rendered as a control character.

Trigg@lemmy.world · 2 months ago

But… so what?

𞋴𝛂𝛋𝛆@lemmy.world · 2 months ago

No one reads hex as strings IRL.

Trigg@lemmy.world · 2 months ago

But it means nothing. You can cypher in much more efficient or clever ways.

𞋴𝛂𝛋𝛆@lemmy.world · 2 months ago

Unrelated to the question or circumstance.

MartianSands@sh.itjust.works · 2 months ago

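It ought to look like a bunch of □, which is the kind of placeholder glyph generally used to indicate that the font has nothing to represent the character.

Specifically you’d expect something that looks like U+25A1 □ WHITE SQUARE, although the exact shape depends on the font’s own missing-glyph box.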

MartianSands@sh.itjust.works · 2 months ago

Also, the answer to your actual question is no. There’s definitely no way to block people from using any particular characters at the kernel level.

What you seem to be asking for is a way to absolutely forbid all software from writing certain characters to files, and/or from reading those characters. That would require the kernel to inspect all data in detail before letting other software have it, which would slow everything way down. It would also prevent anyone from reading or writing binary data which happens to contain those sequences of bytes by coincidence. Binary data includes things like the programs which make the system work, so blocking those characters would be terminal.
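To make the "by coincidence" point concrete, here is a rough Python sketch of what a byte-level filter would be doing. The function name `naive_byte_filter` and the sample byte string are invented for illustration; nothing here is a real kernel interface. It only looks for the UTF-8 byte patterns of the BMP Private Use Area (U+E000 to U+F8FF).

```python
import re

# UTF-8 byte patterns for the BMP Private Use Area, U+E000..U+F8FF:
#   EE 80..BF 80..BF  covers U+E000..U+EFFF
#   EF 80..A3 80..BF  covers U+F000..U+F8FF
PUA_UTF8 = re.compile(
    rb"\xee[\x80-\xbf][\x80-\xbf]|\xef[\x80-\xa3][\x80-\xbf]"
)


def naive_byte_filter(data: bytes) -> bool:
    """Hypothetical filter: True if the raw bytes contain a UTF-8 PUA pattern."""
    return PUA_UTF8.search(data) is not None


# Text that really does contain a private-use character.
text = "note \ue000".encode("utf-8")

# Made-up binary data: ELF magic followed by arbitrary bytes that happen to
# include EE 80 80. The filter cannot tell this apart from text.
binary_blob = bytes([0x7F, 0x45, 0x4C, 0x46, 0xEE, 0x80, 0x80, 0x00])

print(naive_byte_filter(text))         # True
print(naive_byte_filter(binary_blob))  # True, a false positive on binary data
```

A filter like this has no notion of what the bytes mean, so it would reject executables, images, or archives that contain those three-byte runs.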

tal@lemmy.today · 2 months ago

Also, (a) userspace could have some higher-level encoding or encryption or compression that happens without the kernel seeing the non-encoded data, and (b) whatever particular Unicode encoding OP is probably thinking of isn’t the only Unicode encoding out there.

That doesn’t, strictly speaking, mean that it’s impossible to have kernel-level blocking: you could create some kind of emulated system that inspects everything. But it does mean that you couldn’t just inspect data at the points where one normally enters the kernel.

The answer that is probably most useful to OP is that if it’s a problem for his application, he should validate it in userspace.
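A minimal sketch of what userspace validation could look like, in Python. The helper names `find_private_use` and `validate` are made up for this example, and it assumes the application already knows which encoding its input uses. It decodes first and then checks code points, so the same check works for UTF-8 and UTF-16 alike, and it relies on the Unicode general category `Co` (private use), which covers the BMP range and the two supplementary private use planes.

```python
import unicodedata


def find_private_use(text: str) -> list[tuple[int, str]]:
    """Return (index, 'U+XXXX') for every private-use code point in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if unicodedata.category(ch) == "Co"  # Co = private use
    ]


def validate(data: bytes, encoding: str = "utf-8") -> str:
    """Decode data and reject it if any private-use code point is present."""
    # Raises UnicodeDecodeError if the bytes are not valid in this encoding.
    text = data.decode(encoding)
    hits = find_private_use(text)
    if hits:
        raise ValueError(f"private-use code points found: {hits}")
    return text


# The same check applies regardless of how the text was encoded.
print(validate("hello".encode("utf-8")))
print(validate("hello".encode("utf-16-le"), "utf-16-le"))
print(find_private_use("ab\ue000c\U000f0000"))
```

This only covers data the application decodes itself. As noted above, anything compressed, encrypted, or otherwise re-encoded before it reaches this check would pass straight through.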

𞋴𝛂𝛋𝛆@lemmy.world · 2 months ago

Not necessarily. Turn this around. Let’s say I am working at somewhere like a chip foundry with tons of IP. I have no access to encryption tools, but I can easily shift characters to a hex range in bash and send emails.

These characters can use the control glyph, and so do not print or show up in any physical way except in hex.

This technique must be obfuscated at every serious organization from governments to industry.
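For readers unsure what "shift characters to a hex range" means, here is one possible reading as a small Python sketch (the comment mentions bash; Python is used here only for clarity). The offset `PUA_BASE` and the function names are assumptions for illustration, not a description of any specific tool. Each ASCII character is moved by a fixed offset into the BMP Private Use Area, and moved back the same way.

```python
import unicodedata

PUA_BASE = 0xE000  # first code point of the BMP Private Use Area


def shift_to_pua(text: str) -> str:
    """Map each ASCII character to the private-use code point PUA_BASE + its value."""
    if not text.isascii():
        raise ValueError("this sketch only handles ASCII input")
    return "".join(chr(PUA_BASE + ord(ch)) for ch in text)


def shift_back(text: str) -> str:
    """Reverse shift_to_pua."""
    return "".join(chr(ord(ch) - PUA_BASE) for ch in text)


hidden = shift_to_pua("chip")

# Every shifted character is in general category Co (private use).
print(all(unicodedata.category(ch) == "Co" for ch in hidden))  # True

# Each BMP private-use character takes 3 bytes in UTF-8, so 4 characters
# become 12 bytes. That is the extra size visible in a hex dump.
print(len(hidden.encode("utf-8")))  # 12

print(shift_back(hidden))  # chip
```

Whether the result is visible depends entirely on the font and the program displaying it, and a simple category check on decoded text flags every one of these characters.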

Trigg@lemmy.world · 2 months ago

Encryption can be done by hand. This isn’t the problem you appear to imagine it is.