Questions tagged [unicode]

Unicode is the standard for computer representation of plain text. It encompasses the Universal Character Set, intended to unambiguously represent all characters used in human writing systems in any language, Unicode Transformation Formats (UTFs), defining standardized formats for storing and transmitting Unicode text, and standards for processing and manipulating text.

Unicode is the standard for computer representation of plain text. It encompasses:

  • the Universal Character Set (UCS), intended to unambiguously represent all characters used in human writing systems in any language,
  • Unicode Transformation Formats (UTFs), defining standardized formats for storing and transmitting Unicode text, and
  • standards for processing and manipulating Unicode text.

At the time of writing, the latest version was 6.0, published in 2011.

The Universal Character Set

Unicode assigns each character an integer code point (from 0 to 0x10FFFF) in the UCS to act as a unique reference. For example:

  • U+0041 A
  • U+0042 B
  • U+0043 C
  • ...
  • U+039B Λ
  • U+039C Μ
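As a rough illustration of the mapping above, the following Python sketch converts between characters and code points. The helper name `describe` is made up for this example; `ord` and `chr` are Python built-ins.

```python
def describe(ch: str) -> str:
    """Return the conventional U+XXXX notation for a single character."""
    # ord() gives the integer code point; pad to at least four hex digits.
    return f"U+{ord(ch):04X}"


# Print the code point of each sample character from the list above.
for ch in "ABC\u039B\u039C":
    print(describe(ch), ch)

# chr() goes the other way, from code point to character.
lambda_char = chr(0x039B)
```

Code points above U+FFFF simply come out with five or six hex digits, since the format pads but does not truncate.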

Unicode Transformation Formats

UTFs describe how to encode code points as byte sequences. The most common forms are UTF-8 (which encodes each code point as one, two, three or four bytes) and UTF-16 (which encodes each code point as two or four bytes).

Code Point          UTF-8           UTF-16 (big-endian)
U+0041              41              00 41
U+0042              42              00 42
U+0043              43              00 43
...
U+039B              CE 9B           03 9B
U+039C              CE 9C           03 9C
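The rows in the table above can be reproduced with a short Python sketch. The helper name `hex_bytes` is hypothetical; `str.encode` and the codec names `utf-8` and `utf-16-be` are part of the Python standard library.

```python
def hex_bytes(text: str, encoding: str) -> str:
    """Encode text and return its bytes as space-separated uppercase hex."""
    return " ".join(f"{b:02X}" for b in text.encode(encoding))


# One row per sample character: code point, UTF-8 bytes, UTF-16 (big-endian) bytes.
for ch in "ABC\u039B\u039C":
    print(f"U+{ord(ch):04X}", hex_bytes(ch, "utf-8"), "|", hex_bytes(ch, "utf-16-be"))

# A code point above U+FFFF needs four bytes in both encodings
# (UTF-16 represents it as a surrogate pair).
emoji = "\U0001F600"
utf8_len = len(emoji.encode("utf-8"))
utf16_len = len(emoji.encode("utf-16-be"))
```

Using `utf-16-be` rather than plain `utf-16` avoids the byte order mark that Python would otherwise prepend, so the output matches the big-endian column.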

Specification

The Unicode Consortium also defines standards for sorting and collation algorithms, rules for capitalization, character normalization and other locale-sensitive character operations.
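Two of these operations, normalization and case mapping, can be tried out with Python's standard library. This is only a small sketch: it does not cover collation or locale-specific rules, and the variable names are chosen for the example.

```python
import unicodedata

# The same visible character can be stored in two ways:
decomposed = "e\u0301"  # 'e' followed by COMBINING ACUTE ACCENT
composed = "\u00e9"     # LATIN SMALL LETTER E WITH ACUTE

# The two strings differ code point by code point...
differ_raw = decomposed != composed

# ...but normalization maps them onto a common form.
same_nfc = unicodedata.normalize("NFC", decomposed) == composed
same_nfd = unicodedata.normalize("NFD", composed) == decomposed

# Capitalization is not always one-to-one: German sharp s uppercases to "SS".
upper_street = "stra\u00dfe".upper()
```

Normalizing both sides to the same form before comparing is the usual way to avoid treating visually identical strings as different.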

Identifying Characters

698 questions
264
votes
5 answers

Which Unicode characters do smilies like ٩(•̮̮̃•̃)۶ consist of?

How do I make Unicode texts like the one in the title, or like the following, without resorting to copy+paste? ٩(-̮̮̃-̃)۶ ٩(●̮̮̃•̃)۶ ٩(͡๏̯͡๏)۶ ٩(-̮̮̃•̃). As an aside, note that in your browser the above example should look like:
Anirudh Goel
  • 2,973
  • 4
  • 20
  • 18
144
votes
8 answers

Impossible to put a zero after an aleph?

Me and a friend were joking about aleph's. Upon trying to type א0 (switch those 2 chars), they switched themselves! Any sequence of symbols does not stop this effect. Why is this!?? Try to type these with the 0 and א reversed (c&p for א): א0 א - …
Gradyn Wursten
  • 1,819
  • 2
  • 13
  • 24
128
votes
4 answers

Why shouldn’t I use Unicode characters to simulate typographic styles (such as small caps or script)?

Unicode contains various characters that look like typographically stylised variants of characters of the basic Latin alphabet and that allow one to write texts in the corresponding typographic styles without resorting to mark-up or similar. For…
Wrzlprmft
  • 2,782
  • 5
  • 19
  • 31
125
votes
5 answers

How do these icons work: ✅️?

I can see these characters as colored icons: ✅️ It only works in Firefox for me. If you can't see the characters in color, it looks like this on my system (it's probably font-dependent): I can even see them in firebug and tab titles: And…
Tomáš Zato
  • 4,282
  • 14
  • 48
  • 77
119
votes
7 answers

How to type special characters in Linux?

In Windows it is possible to type special signs from the keyboard by holding the Alt key and typing a few numbers, depending on which sign you want to use. Does it work the same way in Linux?
inothemo
  • 2,099
  • 4
  • 18
  • 14
108
votes
7 answers

Is there a unicode character for the Windows key?

I'm trying to communicate over text the Windows keyboard shortcuts. For the ones that use the Windows key, I don't want to type "Windows key +" each time. Is there a unicode character for the Windows key?
Gabriel Fair
  • 3,605
  • 12
  • 35
  • 65
90
votes
11 answers

How do you type Unicode characters using hexadecimal codes?

This is in Windows, but answers for other operating systems can be handy to others. Most guides say something to the effect of "hold down the Alt key and type in the code on the keypad". This works fine for decimal codes (like 65 for 'A'), but not…
user939
83
votes
5 answers

How do I debug an emoticon-based URL?

I came across this URL (NSFW) and need to convert this to puny code. As an experiment, I'll paste this URL here, but not sure if this will save. http://..ws/ (NSFW) How can I convert this URL to a standard DNS name so I can whois the IP space?…
makerofthings7
  • 2,255
  • 7
  • 26
  • 36
73
votes
7 answers

How can I display the  (U+F8FF, Apple logo) character on Windows?

In Apple's marketing materials, the company often refers to the Apple Watch as "Watch". If that last sentence displayed as "Watch", congratulations! You're probably using an Apple device. To demonstrate, here's what the Wikipedia page for Apple…
Stevoisiak
  • 13,555
  • 39
  • 101
  • 154
69
votes
2 answers

Setting UTF8 as default Character Encoding in Windows 7

Is there a way to set Windows 7 to globally use UTF-8 as standard? It's really annoying to set every single text editor to use it.
Baarn
  • 6,624
  • 6
  • 35
  • 64
60
votes
14 answers

What's that Unicode character in my clipboard?

Is there a quick and easy way to find the Unicode code point for any character? For example, I see a funny character on a web page, or a PDF file, or some other document. What I currently do is copy the character to the clipboard, save it to a file,…
DodgyCodeException
  • 765
  • 1
  • 7
  • 15
52
votes
5 answers

Notepad++ inserting special Unicode characters in UTF-8

What are the best ways to enter special Unicode characters into a Notepad++ document? Do I have to rely on the operating system (Windows)? Looking for a see-and-click solution. I can bring up the ASCII Insertion Panel with Edit | Character Panel —…
Bob Stein
  • 1,383
  • 1
  • 16
  • 23
52
votes
12 answers

Is there a Pac-Man-like character in ASCII or Unicode?

Simple question: is there a character that looks either like Pac-Man, or like the ghost in Pac-Man? With Google's recent Pac-Man logo, everyone should know what these look like, but in case you don't here are some sample images: If you answer "no"…
Ricket
  • 1,556
  • 4
  • 18
  • 27
48
votes
2 answers

What is this character: '*​'?

A friend pasted a command into a Slack chat room which contained the character *. This looks like a normal * but isn't: $ uniprops '*​' uniprops: no character named ‹*​› While if I run uniprops on the asterisk I get when typing on my machine, I…
terdon
  • 52,568
  • 14
  • 124
  • 170
42
votes
7 answers

Copying unicode symbols from Firefox address bar as is

Let's say I open a webpage with some Unicode characters, say, Cyrillic, in the address like this: http://ru.wikipedia.org/wiki/Функциональная_закреплённость When I try to copy it from the address bar somewhere else, it becomes unreadable…
Mirzhan Irkegulov
  • 1,384
  • 1
  • 12
  • 31