Paste Anything.
See Everything.
Every character decoded — code point, name, category, HTML entity, CSS escape and more.
How the Unicode Inspector works
Three steps from paste to full Unicode analysis — no account, no install.
Drop in anything — a LinkedIn bio, a suspicious username, an emoji, a symbol, or text copied from any website. The inspector handles every Unicode character ever defined, including zero-width and invisible characters.
Each code point gets its own card showing the character, its official Unicode code point (U+XXXX), and its full Unicode name. ZWJ emoji sequences — like family emoji — are intelligently grouped so you see the full picture.
Tap any character to see its complete profile: block, script, Unicode category, HTML entity, CSS escape sequence, UTF-8 byte values, decimal code, all four normalization forms, and confusable warnings.
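Most of the fields on such a profile can be derived from the code point alone. The following is a minimal sketch using Python's standard `unicodedata` module; it is not the inspector's own code (which runs in JavaScript), and the `profile` function and its field names are hypothetical, chosen for illustration.

```python
import unicodedata

def profile(ch: str) -> dict:
    """Build a small profile for a single code point (illustrative field names)."""
    cp = ord(ch)
    return {
        "code_point": f"U+{cp:04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        "category": unicodedata.category(ch),
        "decimal": cp,
        # Numeric character reference; named entities would need a lookup table.
        "html_entity": f"&#{cp};",
        # CSS escape: a backslash followed by the hexadecimal code point.
        "css_escape": f"\\{cp:X}",
        "utf8": " ".join(f"{b:02X}" for b in ch.encode("utf-8")),
    }

for ch in "A\u2764":
    print(profile(ch))
```

Block, script, and confusable data are not available in the standard library, so a fuller version would need the Unicode data files or a third-party package.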
What you can detect and inspect
The Unicode Inspector goes beyond just showing code points — it actively flags problems in text.
Instantly reveals when text uses Cyrillic, Greek, or Mathematical Alphanumeric characters that look identical to Latin letters. Essential for catching phishing URLs, fake brand names, and impersonation usernames. Try pasting “сlоud”: it looks like “cloud” but contains look-alike letters from another script.
Detects Zero Width Space (U+200B), Zero Width Non-Joiner (U+200C), Zero Width Joiner (U+200D), Word Joiner (U+2060), Soft Hyphen (U+00AD), the byte order mark (BOM, U+FEFF), and directional marks such as U+200E and U+200F. All are invisible to the human eye but present in the data, and all are common in copy-pasted web text and social usernames.
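A rough way to reproduce this kind of check is to look for characters in the Unicode "Format" category (Cf), which covers every character named above. This is a sketch under that assumption, not the inspector's actual rule set: Cf also contains a few visible characters, so a production check would likely use a curated list. The `find_invisibles` helper is hypothetical.

```python
import unicodedata

def find_invisibles(text: str) -> list:
    """Return (index, code point, name) for each format-category character."""
    hits = []
    for index, ch in enumerate(text):
        if unicodedata.category(ch) == "Cf":
            hits.append((index, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")))
    return hits

# A username with a zero width space hidden in the middle.
print(find_invisibles("ad\u200Bmin"))
```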
Family emoji like 👨👩👧👦 are actually multiple emoji joined by Zero Width Joiner characters. The inspector groups these into a single token so you understand what you’re really seeing, and lets you inspect every component code point individually.
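The grouping idea can be sketched in a few lines: attach a Zero Width Joiner, and whatever follows it, to the previous token. This is a simplified illustration rather than the inspector's algorithm. It handles only ZWJ (U+200D) and the emoji variation selector (U+FE0F), ignoring skin-tone modifiers and flag sequences, and the `group_zwj` name is hypothetical.

```python
ZWJ = "\u200D"   # Zero Width Joiner
VS16 = "\uFE0F"  # Variation Selector-16 (emoji presentation)

def group_zwj(text: str) -> list:
    """Split text into tokens, keeping ZWJ-joined code points together."""
    tokens = []
    join_next = False  # True when the previous code point was a ZWJ
    for ch in text:
        if tokens and (ch in (ZWJ, VS16) or join_next):
            tokens[-1] += ch
        else:
            tokens.append(ch)
        join_next = (ch == ZWJ)
    return tokens

# man + ZWJ + woman + ZWJ + girl + ZWJ + boy, followed by "!"
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467\u200D\U0001F466"
tokens = group_zwj(family + "!")
print(len(tokens))     # 2
print(len(tokens[0]))  # 7 code points in the family token
```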
Flags text that mixes scripts (Latin + Cyrillic, Latin + Greek) in a single word — a signature pattern of homograph phishing attacks, where attackers register domain names that look identical to real ones but contain substitute characters.
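Mixed-script detection can be approximated as follows. Python's standard library does not expose the Unicode Script property, so this sketch uses the first word of each character's official name (LATIN, CYRILLIC, GREEK) as a rough stand-in. That shortcut is an assumption made for illustration, and both helper names are hypothetical; a real implementation would read the script data published with the Unicode standard.

```python
import unicodedata

def scripts_in_word(word: str) -> set:
    """Approximate the scripts used by the letters in a word."""
    scripts = set()
    for ch in word:
        if ch.isalpha():
            # First word of the name, e.g. "CYRILLIC SMALL LETTER A" -> "CYRILLIC".
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def mixed_script_words(text: str) -> list:
    """Return the whitespace-separated words that mix more than one script."""
    return [word for word in text.split() if len(scripts_in_word(word)) > 1]

# "p\u0430ypal.com" uses a Cyrillic "а" (U+0430) in place of the Latin "a".
print(mixed_script_words("visit p\u0430ypal.com or paypal.com"))
```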
Shows all four Unicode normalization forms for any character. Critical for developers dealing with string comparison bugs — two strings that look identical can fail equality checks if their normalization forms differ. Accented characters are especially common culprits.
Every character shows its HTML entity (&#10084;), CSS escape (\2764), UTF-8 byte sequence, and decimal code point, all copy-ready in one click. Indispensable for front-end developers, email designers, and anyone working with special characters in code.
Who uses the Unicode Character Inspector
From cybersecurity researchers to front-end developers — anyone working with text at a deeper level.
Cybersecurity researchers and fraud analysts use it to analyse suspicious URLs, usernames, and phishing text. Paste a domain that looks like “paypal.com” and instantly see which characters are Cyrillic or Greek imposters.
Front-end and backend developers use it to debug encoding issues, find the correct HTML entity or CSS escape for a special character, and understand normalization differences causing string comparison failures in their code.
Content moderators and trust & safety teams use it to identify Unicode abuse — hidden characters embedded in usernames or bios, look-alike characters used to bypass filters, and invisible text used to manipulate word counts or pattern matching.
Designers and typographers use it to identify the exact Unicode characters in a symbol or ligature, find the correct code point for a decorative character, and understand which Unicode blocks contain the symbols they’re looking for.
Curious people use it to understand what’s really inside emoji, why some characters look the same but behave differently in searches, and what those weird mathematical bold characters in LinkedIn posts actually are under the hood.
Frequently asked questions
A Unicode code point is the unique number assigned to each character in the Unicode standard. It is written as U+XXXX (e.g. U+0041 for the letter A, U+1F600 for 😀). As of version 15.1, the Unicode standard defines 149,813 characters across 161 scripts. Every device that supports Unicode uses the same code points, ensuring consistent text across all platforms and languages.
A confusable character is a character from one Unicode script that looks visually identical (or nearly identical) to a character from another script. For example, the Cyrillic letter а (U+0430) looks exactly like the Latin a (U+0061) but is a completely different code point. Attackers use this to register fake domain names like pаypal.com (with a Cyrillic “а”) that look identical to the real site but point to a phishing page.
A Zero Width Space (U+200B) is an invisible character with no visible width. Legitimate uses include line-breaking hints in long words and separators in certain scripts. However, it is frequently abused to bypass text filters (inserting it inside a banned word like “spam”), to create usernames that appear identical to existing ones, or to embed hidden text inside visible content. Our inspector shows a ∅ symbol where invisible characters appear.
A Zero Width Joiner (ZWJ) sequence is a group of emoji joined by U+200D (ZWJ) characters that combines into a single visible emoji. For example, 👨👩👧👦 (family) is actually 7 code points: man emoji + ZWJ + woman emoji + ZWJ + girl emoji + ZWJ + boy emoji. Similarly, 👩💻 (woman technologist) is a woman emoji + ZWJ + laptop emoji. Our inspector detects and groups these automatically.
Unicode has four normalization forms for representing equivalent characters. NFC (Canonical Decomposition, followed by Canonical Composition) is the most common and the default for web content — it composes characters like é as a single code point. NFD decomposes them into base character + combining accent (two code points). NFKC and NFKD additionally apply compatibility mappings, which means characters like the mathematical bold 𝗔 are mapped to plain A. This matters enormously for developers comparing strings.
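The effect of the four forms is easy to see with Python's `unicodedata.normalize`. The sketch below is an illustration only; the `normalization_table` helper is hypothetical, and the bold example assumes the character shown above is U+1D5D4 (MATHEMATICAL SANS-SERIF BOLD CAPITAL A).

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

def normalization_table(text: str) -> dict:
    """Map each normalization form to the code points it produces."""
    return {
        form: [f"U+{ord(c):04X}" for c in unicodedata.normalize(form, text)]
        for form in FORMS
    }

composed = "\u00E9"     # é as one code point
decomposed = "e\u0301"  # e + COMBINING ACUTE ACCENT

print(composed == decomposed)  # False: they look identical but differ
print(unicodedata.normalize("NFC", composed)
      == unicodedata.normalize("NFC", decomposed))  # True after normalizing

print(normalization_table(composed))
print(normalization_table("\U0001D5D4"))  # only NFKC and NFKD map to plain A
```

Normalizing both sides to the same form before comparing is the usual fix for equality checks that fail on visually identical strings.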
Most Unicode lookup tools require you to know the code point in advance, show one character at a time, and have interfaces built for specialists. Our inspector analyses any pasted text character by character, intelligently groups emoji sequences, actively warns you about spoofing and hidden characters, shows normalization forms side by side, and provides copy-ready HTML entities and CSS escapes — all with a clean interface that works for non-developers too.
No. Everything runs entirely in your browser. The text you paste is never sent to any server and is not stored anywhere. The Unicode analysis, character lookup, and detection all happen locally in JavaScript on your device. You can even use it offline once the page has loaded.
Unicode is the standard that assigns a number (code point) to every character. UTF-8 is the most common way to encode those numbers as bytes for storage and transmission. ASCII characters (A–Z, 0–9, basic punctuation) use 1 byte in UTF-8. Code points from U+0080 to U+07FF, which include Western European accented letters as well as Greek, Cyrillic, Hebrew, and Arabic, use 2 bytes. The rest of the Basic Multilingual Plane (up to U+FFFF), including most Chinese, Japanese, and Korean characters, uses 3 bytes. Supplementary characters above U+FFFF, including most emoji, use 4 bytes. Our inspector shows the exact UTF-8 byte sequence for every character.
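The byte lengths are straightforward to confirm with Python's built-in `str.encode`. This is a small sketch with one sample character per length; the `utf8_hex` helper is a hypothetical name.

```python
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of a character as space-separated hex."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))

samples = ["A", "\u00E9", "\u4E2D", "\U0001F600"]  # A, é, 中, 😀
for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X}", len(encoded), "byte(s):", utf8_hex(ch))
```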