• Lvxferre [he/him]@mander.xyz
    6 hours ago

    A few comments, in no specific order, about random related stuff.

    Sometimes characters change depending on nearby characters. Arabic provides a good example of that, as a character can take up to four forms depending on its position in the word (isolated, initial, medial, final); but you’ll see this to a greater or lesser degree elsewhere, too. Failure to implement this feature m a k es y o urtex t loo kbrok en and hard to read.
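
    To make the four forms concrete, here is a small Python sketch. It relies on the legacy "presentation form" code points that Unicode keeps for compatibility; normal Arabic text stores only the base letter (U+0628 for beh) and leaves the choice of form to the shaping engine, so this is an illustration rather than how text should be encoded. The names `BEH`, `FORMS` and `describe_forms` are made up for the example.

```python
import unicodedata

# Base letter beh, as stored in ordinary Arabic text.
BEH = "\u0628"
# Legacy compatibility code points, one per positional form of beh.
FORMS = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]


def describe_forms(forms):
    """Return (Unicode name, NFKC-folded character) for each form."""
    return [
        (unicodedata.name(ch), unicodedata.normalize("NFKC", ch))
        for ch in forms
    ]


for name, base in describe_forms(FORMS):
    # Every positional form folds back to the same base letter.
    print(f"{name} -> U+{ord(base):04X}")
```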

    A special case of the above is the ligature: two or more characters that get joined into one shape. Compare, for example, ⟨f i⟩ and ⟨fi⟩; note how the latter is missing the dot.
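
    Most ligatures are produced by the font, but Unicode also carries a few as compatibility characters, which makes them easy to poke at from code. A minimal sketch, assuming you want to fold them back to plain letters (e.g. for search); `expand_ligatures` is a hypothetical helper, and NFKC folds plenty of other compatibility characters too, so it is a blunt tool.

```python
import unicodedata

LIGATURE_FI = "\uFB01"  # a single code point that draws as a joined "fi"


def expand_ligatures(text):
    """Fold compatibility characters such as U+FB01 into plain letters."""
    return unicodedata.normalize("NFKC", text)


print(unicodedata.name(LIGATURE_FI))           # the character's Unicode name
print(len(LIGATURE_FI), len(expand_ligatures(LIGATURE_FI)))
print(expand_ligatures("\uFB01nd") == "find")
```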

    Characters can also be modified. The classic example is diacritics, e.g. ⟨c⟩ + ⟨´ ^ ¸⟩ = ⟨ć ĉ ç⟩. Diacritics tend to have simpler shapes and to look similar to other diacritics (e.g. the diaeresis and the umlaut sign nowadays look the same, although their origins differ), but that’s a tendency, not a rule.
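
    In Unicode terms, the ⟨c⟩ + diacritic sums above are a base letter followed by a combining mark, and normalization converts between that and the precomposed letter. A rough sketch with Python's standard `unicodedata` module; the helper names and the `COMBINING` table are just for illustration.

```python
import unicodedata

# Combining marks for the three diacritics in the example above.
COMBINING = {
    "acute": "\u0301",
    "circumflex": "\u0302",
    "cedilla": "\u0327",
}


def add_diacritic(base, mark):
    """Attach a combining mark and compose to a single code point if one exists."""
    return unicodedata.normalize("NFC", base + COMBINING[mark])


def strip_diacritics(text):
    """Decompose, then drop every combining mark."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print([add_diacritic("c", m) for m in COMBINING])
print(strip_diacritics("\u0107\u0109\u00e7"))
```

    Incidentally, the diaeresis and the umlaut sign share one combining code point (U+0308), which matches the "look the same nowadays" remark.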

    Sometimes characters are intrinsically associated with other characters. Modern European scripts do this a lot due to the bicameral system (paired capital and minuscule letters); curiously, Latin as used natively didn’t (the capital/minuscule distinction is Mediaeval). Sometimes the correspondence is language-specific, e.g. the capital counterpart to Latin ⟨i⟩ is usually ⟨I⟩, but for Turkish Latin it’s ⟨İ⟩.
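
    The Turkish case is a good one to try in code, because Python's `str.upper()` and `str.lower()` apply the default, language-independent mappings. Below is a minimal sketch of a language-aware wrapper; `upper_turkish` and `lower_turkish` are hypothetical helpers, and they only handle the dotted/dotless i pairs, not a full locale.

```python
# Turkish pairs: dotted i <-> İ (U+0130), dotless ı (U+0131) <-> I.
_TR_UPPER = str.maketrans({"i": "\u0130", "\u0131": "I"})
_TR_LOWER = str.maketrans({"I": "\u0131", "\u0130": "i"})


def upper_turkish(text):
    """Uppercase with Turkish i-rules, then default rules for the rest."""
    return text.translate(_TR_UPPER).upper()


def lower_turkish(text):
    """Lowercase with Turkish i-rules, then default rules for the rest."""
    return text.translate(_TR_LOWER).lower()


print("istanbul".upper())         # default mapping, dotless capital
print(upper_turkish("istanbul"))  # Turkish mapping, dotted capital
print(lower_turkish("ISI"))       # dotless lowercase
```

    Remapping before calling `.lower()` also sidesteps the default lowercase of ⟨İ⟩, which is two code points (⟨i⟩ plus a combining dot).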

    Speaking of which, variants of the “same alphabet” sometimes differ in which letters they include or exclude, or in whether they treat a sequence of characters as a single letter for collation purposes.
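
    As a sketch of the "sequence treated as one letter" case: Czech sorts the digraph ⟨ch⟩ as its own letter after ⟨h⟩. The toy sort key below assumes a simplified alphabet (plain a–z plus ⟨ch⟩, ignoring Czech's accented letters), and `ALPHABET`, `RANK` and `collation_key` are made-up names; real software would normally lean on a proper collation library instead.

```python
import re

# Simplified, hypothetical alphabet: "ch" is one letter, placed after "h".
ALPHABET = list("abcdefgh") + ["ch"] + list("ijklmnopqrstuvwxyz")
RANK = {letter: i for i, letter in enumerate(ALPHABET)}
# Try the digraph first, otherwise take a single character.
TOKEN = re.compile(r"ch|.")


def collation_key(word):
    """Split a word into letters (digraph-aware) and map them to ranks."""
    return [RANK.get(tok, len(RANK)) for tok in TOKEN.findall(word.lower())]


words = ["cihla", "chata", "hrad", "idea", "cena"]
print(sorted(words))                     # plain code point order
print(sorted(words, key=collation_key))  # "chata" now lands after "hrad"
```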

    The typical character width in relation to the height varies wildly from script to script. For example, your typical Greek/Latin/Cyrillic grapheme is still fairly readable if you make it 1:2 (width to height), although that is far from ideal (you’ll get issues with a few characters, like ⟨Щ⟩ and ⟨W⟩). On the other hand, your typical Han character becomes messy to read if not 1:1.
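
    Monospace environments such as terminals reflect this split: a cell is roughly 1:2, and Han characters conventionally take two cells, i.e. about 1:1. A rough sketch of estimating column width from Unicode's East Asian Width property; `display_width` is a hypothetical helper, and it deliberately ignores combining marks, emoji sequences and "ambiguous" width characters.

```python
import unicodedata


def display_width(text):
    """Estimate terminal columns: wide/fullwidth characters count as two cells."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


print(unicodedata.east_asian_width("W"))       # narrow Latin letter
print(unicodedata.east_asian_width("\u6f22"))  # Han character
print(display_width("abc"), display_width("\u6f22\u5b57"))
```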