This is a curated list of Unicode characters that have interesting (and maybe not widely known) features or are awesome in some other way.
The code points of the Unicode blocks Box Drawing (U+2500 to U+257F) and Block Elements (U+2580 to U+259F) cover most of your monospace command-line visualization needs.
╭───────╮
│Unicode│
│rules! │
╰┬─────┬╯
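A minimal sketch of how such a frame can be generated. The helper name `draw_box` is made up for this example, and it assumes every character in the text occupies exactly one terminal cell (which does not hold for fullwidth characters or emoji):

```python
def draw_box(lines):
    """Frame lines of text with light box-drawing characters."""
    # len() counts code points; this sketch assumes one cell per code point.
    width = max((len(line) for line in lines), default=0)
    top = "╭" + "─" * width + "╮"      # U+256D, U+2500, U+256E
    bottom = "╰" + "─" * width + "╯"   # U+2570, U+2500, U+256F
    body = ["│" + line.ljust(width) + "│" for line in lines]  # U+2502
    return "\n".join([top, *body, bottom])


print(draw_box(["Unicode", "rules!"]))
```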
U+2E2E REVERSED QUESTION MARK - the “irony mark” to express irony/sarcasm. A useful character⸮
U+D800 to U+DFFF - surrogate code points. They are reserved solely for UTF-16, which uses pairs of them to encode code points above U+FFFF. On their own they do not represent characters.
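The pairing can be sketched as follows. The function name `to_surrogate_pair` is hypothetical; the arithmetic is the standard UTF-16 split of the 20-bit offset into a high and a low half:

```python
def to_surrogate_pair(code_point):
    """Split a code point above U+FFFF into its UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF use surrogates")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # upper 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # lower 10 bits
    return high, low


high, low = to_surrogate_pair(0x1F600)
print(f"U+{high:04X} U+{low:04X}")
```

The result can be cross-checked against Python's own encoder with `"\U0001F600".encode("utf-16-be")`.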
U+FEFF ZERO WIDTH NO-BREAK SPACE - its name suggests that it can be used like U+2060 WORD JOINER, and in fact the latter was introduced to inherit its semantics. This is because U+FEFF had become a special beacon called the byte order mark, which is placed at the beginning of some UTF-8 files. In complying software (including many text editors) this character is stripped from the start of a file and handled as metadata. In non-complying software (like the PHP interpreter) this leads to all sorts of fun behaviour.
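As an illustration in Python (not taken from the original list): the plain `utf-8` codec keeps the byte order mark as a leading U+FEFF, while the `utf-8-sig` codec strips it. The sample bytes are made up for the example:

```python
import codecs

# A UTF-8 file's content that starts with a byte order mark.
data = codecs.BOM_UTF8 + "héllo".encode("utf-8")

plain = data.decode("utf-8")         # BOM survives as U+FEFF
stripped = data.decode("utf-8-sig")  # BOM is treated as metadata and removed

print(repr(plain))
print(repr(stripped))
```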
U+FFFD REPLACEMENT CHARACTER - when a character cannot be decoded or displayed (e.g., when decoding an erroneous UTF-8 sequence), this code point steps into the breach.
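A small sketch of that behaviour, using Python's `errors="replace"` decoding mode on a deliberately truncated byte sequence (the sample bytes are an assumption for illustration):

```python
# "café" in UTF-8 is b"caf\xc3\xa9"; cutting off the last byte
# leaves an incomplete two-byte sequence at the end.
broken = b"caf\xc3"

# errors="replace" substitutes U+FFFD for the undecodable part
# instead of raising UnicodeDecodeError.
text = broken.decode("utf-8", errors="replace")

print(repr(text))
```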
U+1D455 is missing. It would be an italic small “h”. It was not encoded, because it would be identical to the Planck constant ℎ (U+210E).
U+FF03 FULLWIDTH NUMBER SIGN - it is the "Japanese Hashtag" ＃. Sites like Twitter accept it as equivalent to the regular # (U+0023).
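One way software can treat both signs alike is compatibility normalization (NFKC), which maps fullwidth forms to their ordinary counterparts. This is only a sketch; whether any particular site does it this way is an assumption, and `normalize_hashtag` is a hypothetical helper:

```python
import unicodedata


def normalize_hashtag(tag):
    """Fold fullwidth forms such as U+FF03 into their ASCII equivalents."""
    return unicodedata.normalize("NFKC", tag)


print(normalize_hashtag("\uff03unicode"))
```

Note that NFKC folds many other compatibility characters too, so it changes more than just the number sign.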
U+202D LEFT-TO-RIGHT OVERRIDE and U+202E RIGHT-TO-LEFT OVERRIDE - force the direction of the text that follows them.
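Because these invisible characters can disguise what a string really says, some applications remove them from untrusted input. A minimal sketch, with `strip_bidi_overrides` as a hypothetical helper that covers only these two code points:

```python
import unicodedata

BIDI_OVERRIDES = {"\u202d", "\u202e"}  # LRO and RLO


def strip_bidi_overrides(text):
    """Drop the two directional override characters from a string."""
    return "".join(ch for ch in text if ch not in BIDI_OVERRIDES)


for ch in sorted(BIDI_OVERRIDES):
    # unicodedata reports the bidirectional class of each character.
    print(f"U+{ord(ch):04X}", unicodedata.bidirectional(ch))

print(repr(strip_bidi_overrides("abc\u202edef")))
```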
U+FE0E VARIATION SELECTOR-15 - force black-&-white emoji. If this code point follows an emoji, an explicit monochrome rendering of the emoji is requested (if the client supports it).
U+FE0F VARIATION SELECTOR-16 - force colorful emoji. If this code point follows an emoji, an explicit colorful rendering of the emoji is requested (if the client supports it).
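In code, requesting a presentation style is just string concatenation. The helper names below are made up for the example, and whether the request is honoured depends on the font and renderer:

```python
TEXT_STYLE = "\ufe0e"   # VARIATION SELECTOR-15
EMOJI_STYLE = "\ufe0f"  # VARIATION SELECTOR-16


def as_text(base):
    """Request the monochrome (text) rendering of a character."""
    return base + TEXT_STYLE


def as_emoji(base):
    """Request the colorful (emoji) rendering of a character."""
    return base + EMOJI_STYLE


heart = "\u2764"  # HEAVY BLACK HEART
print(as_text(heart), as_emoji(heart))
```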
Diacritics and combining marks: There is a host of characters that add marks, such as diacritics, to the character before them. Those are called Combining Marks. Unicode provides a handy FAQ on the details, but in a nutshell: If you add one after a character, it is placed on top of that previous one. So, a + ̊ = å. This may lead to all kinds of funny problems, because for some combinations there are pre-composed characters. Our little å here can also be encoded as U+00E5. You might note that while the pre-composed form has a length of one code point, the combination of a and combining ring has a length of two.
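The two encodings compare as different strings unless they are normalized first. A short sketch using Python's `unicodedata` module, where NFC composes and NFD decomposes:

```python
import unicodedata

decomposed = "a\u030a"   # LATIN SMALL LETTER A + COMBINING RING ABOVE
precomposed = "\u00e5"   # LATIN SMALL LETTER A WITH RING ABOVE

print(len(decomposed), len(precomposed))
print(decomposed == precomposed)

# Normalizing both sides to the same form makes them comparable.
composed = unicodedata.normalize("NFC", decomposed)
split = unicodedata.normalize("NFD", precomposed)
print(composed == precomposed, split == decomposed)
```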
Of course, one can also do fun things with those characters like this answer on StackOverflow.
The Regional Indicator Symbols U+1F1E6 to U+1F1FF correspond to the 26 Latin letters A to Z. They are used to create flag emoji. Since the Unicode Consortium didn’t feel like getting on board with international politics, the solution to flags is to combine two of these 26 characters into the respective two-letter ISO 3166-1 code for a country. Examples:
Country | ISO Code | Code Points | Emoji (if supported) |
---|---|---|---|
USA | US | U+1F1FA + U+1F1F8 | 🇺🇸 |
Germany | DE | U+1F1E9 + U+1F1EA | 🇩🇪 |
China | CN | U+1F1E8 + U+1F1F3 | 🇨🇳 |
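The mapping in the table can be sketched as a simple offset from the letter A. The function name `flag` is hypothetical, and the sketch does not check whether the two letters are an assigned country code:

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def flag(country_code):
    """Turn a two-letter country code into a pair of regional indicators."""
    code = country_code.upper()
    if len(code) != 2 or not all("A" <= ch <= "Z" for ch in code):
        raise ValueError("expected two letters A-Z")
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(ch) - ord("A")) for ch in code
    )


for code in ("US", "DE", "CN"):
    pair = flag(code)
    print(code, " + ".join(f"U+{ord(ch):X}" for ch in pair), pair)
```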
Skin color of emoji: There are five code points that control the skin color of emoji, U+1F3FB to U+1F3FF. They are called “Emoji Modifier Fitzpatrick Type” 1-2 to 6 (types 1 and 2 share one code point), with 1-2 the palest and 6 the darkest. If one of these characters follows an emoji, that emoji is meant to be rendered in the appropriate skin color of the Fitzpatrick scale. If no such modifier is added, the skin tone should be unnatural, e.g., bright yellow. Fun fact: Since the Fitzpatrick modifiers are normal code points, emoji with such skin colors have a length of 2 code points, which Twitter users noticed first. Here is a comparison chart directly from the specification:
Code | Name | Samples |
---|---|---|
U+1F3FB | EMOJI MODIFIER FITZPATRICK TYPE-1-2 | |
U+1F3FC | EMOJI MODIFIER FITZPATRICK TYPE-3 | |
U+1F3FD | EMOJI MODIFIER FITZPATRICK TYPE-4 | |
U+1F3FE | EMOJI MODIFIER FITZPATRICK TYPE-5 | |
U+1F3FF | EMOJI MODIFIER FITZPATRICK TYPE-6 | |
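Applying a modifier is again plain concatenation. A minimal sketch, where the dictionary keys and the helper `with_skin_tone` are naming choices made for this example:

```python
FITZPATRICK = {
    "1-2": "\U0001F3FB",
    "3": "\U0001F3FC",
    "4": "\U0001F3FD",
    "5": "\U0001F3FE",
    "6": "\U0001F3FF",
}


def with_skin_tone(emoji, tone):
    """Append the Fitzpatrick modifier for the given type to an emoji."""
    return emoji + FITZPATRICK[tone]


waving = "\U0001F44B"  # WAVING HAND SIGN
toned = with_skin_tone(waving, "4")
print(toned, len(waving), len(toned))
```

The modifier only has a visible effect on emoji that support it and in fonts that implement the combination; elsewhere the two code points may be drawn side by side.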
U+00A0 NO-BREAK SPACE - known as `&nbsp;` in HTML.
U+00AD SOFT HYPHEN (`&shy;`) - like ZERO WIDTH SPACE, but shows a hyphen if (and only if) a break occurs.
For better comparison of which code point has which effect, consult this table:
Effect | U+00A0 | U+00AD | U+200B | U+200D | U+2060 |
---|---|---|---|---|---|
create space | ✓ | ✗ | ✗ | ✗ | ✗ |
allow breaking | ✗ | ✓ | ✓ | ✗ | ✗ |
possible change | ✗ | ✓ | ✗ | ✓ | ✗ |
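Since most of these characters are invisible, it can help to look them up programmatically. A small sketch using `unicodedata`; the helper `describe` is hypothetical. Only U+00A0 is a space separator (`Zs`), the others are format characters (`Cf`):

```python
import unicodedata


def describe(code_point):
    """Return the Unicode name and general category of a code point."""
    ch = chr(code_point)
    return unicodedata.name(ch), unicodedata.category(ch)


for cp in (0x00A0, 0x00AD, 0x200B, 0x200D, 0x2060):
    name, category = describe(cp)
    print(f"U+{cp:04X} {name} ({category})")
```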
Smashing Magazine featured a comprehensive article on the different types of whitespace.
For plain-text gaming, Unicode is well equipped with several complete sets:
See the contribution guide for details.
To the extent possible under law, the contributors have waived all copyright and related or neighboring rights to this work. See the license file for details.