🔍 Unicode Inspector

Paste Anything.
See Everything.

Every character decoded — code point, name, category, HTML entity, CSS escape and more.

How the Unicode Inspector works

Three steps from paste to full Unicode analysis — no account, no install.

1. Paste or type any text

Drop in anything — a LinkedIn bio, a suspicious username, an emoji, a symbol, or text copied from any website. The inspector handles every Unicode character ever defined, including zero-width and invisible characters.

2. Every character fans out into its own card

Each code point gets its own card showing the character, its official Unicode code point (U+XXXX), and its full Unicode name. ZWJ emoji sequences — like family emoji — are intelligently grouped so you see the full picture.

3. Click any card for deep analysis

Tap any character to see its complete profile: block, script, Unicode category, HTML entity, CSS escape sequence, UTF-8 byte values, decimal code, all four normalization forms, and confusable warnings.
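
As a rough illustration of what such a profile contains, here is a minimal Python sketch built on the standard library's unicodedata module. The profile function and its field names are assumptions made for this example, not the inspector's actual code, and block and script are left out because the standard library does not expose them.

```python
import unicodedata

def profile(ch: str) -> dict:
    """Return a small profile for a single character (one code point)."""
    cp = ord(ch)
    return {
        "code_point": f"U+{cp:04X}",                  # e.g. U+2764
        "decimal": cp,                                # decimal code point
        "name": unicodedata.name(ch, "<unnamed>"),    # official Unicode name
        "category": unicodedata.category(ch),         # general category, e.g. "So"
        "utf8": " ".join(f"{b:02X}" for b in ch.encode("utf-8")),
        "nfkd": unicodedata.normalize("NFKD", ch),    # one of the four forms
    }

heart = profile("\u2764")
for field, value in heart.items():
    print(f"{field:>10}: {value}")
```

For the heart symbol this reports U+2764, decimal 10084, the name HEAVY BLACK HEART, category So, and the UTF-8 bytes E2 9D A4.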

What you can detect and inspect

The Unicode Inspector goes beyond just showing code points — it actively flags problems in text.

⚠️ Confusable / spoofed characters

Instantly reveals when text uses Cyrillic, Greek, or Mathematical Unicode characters that look identical to Latin letters. Essential for catching phishing URLs, fake brand names, and impersonation usernames. Try pasting “сlоud”: it looks like “cloud”, but some of its letters are Cyrillic look-alikes rather than Latin.
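
The idea can be sketched in a few lines of Python. The LOOKALIKES table below is a deliberately tiny, hypothetical sample chosen for this example; a real checker would rely on the full confusables data published by the Unicode Consortium, and nothing here is taken from the inspector's own code.

```python
# Hypothetical, deliberately tiny look-alike table (illustration only).
LOOKALIKES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER
    "\u03bf": "o",  # GREEK SMALL LETTER OMICRON
}

def find_lookalikes(text):
    """Return (index, character, Latin look-alike) for each flagged character."""
    return [(i, ch, LOOKALIKES[ch]) for i, ch in enumerate(text) if ch in LOOKALIKES]

def skeleton(text):
    """Replace every known look-alike with the Latin letter it imitates."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in text)

spoofed = "\u0441l\u043eud"           # Cyrillic es, Latin l, Cyrillic o, Latin u, d
print(spoofed == "cloud")             # False: different code points
print(find_lookalikes(spoofed))       # positions 0 and 2 are flagged
print(skeleton(spoofed) == "cloud")   # True once look-alikes are mapped back
```

Comparing skeletons instead of raw strings is one way to catch two names that look the same on screen but differ in the data.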

👻 Invisible and zero-width characters

Detects Zero Width Space (U+200B), Zero Width Non-Joiner (U+200C), Zero Width Joiner (U+200D), Word Joiner (U+2060), Soft Hyphen (U+00AD), BOM (U+FEFF), and directional marks such as U+200E and U+200F. All of these are invisible to the human eye but present in the data. They are common in copy-pasted web text and social usernames.
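
A minimal Python sketch of this kind of scan is shown below. The INVISIBLE table and the find_invisible function are names invented for this example, and the table covers only the characters named above, so treat it as a starting point rather than a complete list.

```python
# Characters that render with no visible glyph (a partial list for illustration).
INVISIBLE = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u200c": "ZERO WIDTH NON-JOINER",
    "\u200d": "ZERO WIDTH JOINER",
    "\u2060": "WORD JOINER",
    "\u00ad": "SOFT HYPHEN",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE (BOM)",
    "\u200e": "LEFT-TO-RIGHT MARK",
    "\u200f": "RIGHT-TO-LEFT MARK",
}

def find_invisible(text):
    """Return (index, code point, name) for every invisible character found."""
    return [
        (i, f"U+{ord(ch):04X}", INVISIBLE[ch])
        for i, ch in enumerate(text)
        if ch in INVISIBLE
    ]

username = "ad\u200bmin\ufeff"   # prints like "admin" but is 7 code points long
for hit in find_invisible(username):
    print(hit)
```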

🔗 ZWJ emoji sequences

Family emoji like 👨‍👩‍👧‍👦 are actually multiple emoji joined by Zero Width Joiner characters. The inspector groups these into a single token so you understand what you’re really seeing, and lets you inspect every component code point individually.

🚩 Mixed-script homograph attacks

Flags text that mixes scripts (Latin + Cyrillic, Latin + Greek) in a single word — a signature pattern of homograph phishing attacks, where attackers register domain names that look identical to real ones but contain substitute characters.
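
A mixed-script check can be approximated as follows. Python's standard library has no script property, so this sketch assumes that the first word of a letter's Unicode name (LATIN, CYRILLIC, GREEK) is a good enough stand-in for its script. That shortcut, and both function names, are assumptions for illustration only.

```python
import unicodedata

def scripts_in_word(word):
    """Collect a rough script label for every letter in the word."""
    scripts = set()
    for ch in word:
        if unicodedata.category(ch).startswith("L"):   # letters only
            # Heuristic: "CYRILLIC SMALL LETTER A" -> "CYRILLIC"
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def mixed_script_words(text):
    """Return the whitespace-separated words that mix more than one script."""
    return [w for w in text.split() if len(scripts_in_word(w)) > 1]

message = "visit p\u0430ypal now"    # the second letter is Cyrillic
print(mixed_script_words(message))
print(scripts_in_word("p\u0430ypal"))
```

Only the spoofed word is returned, because the other words are written entirely in Latin letters.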

🔄 Normalization forms (NFC / NFD / NFKC / NFKD)

Shows all four Unicode normalization forms for any character. Critical for developers dealing with string comparison bugs — two strings that look identical can fail equality checks if their normalization forms differ. Accented characters are especially common culprits.
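
The comparison bug is easy to reproduce. This short Python sketch uses unicodedata.normalize from the standard library; the same_text helper is a name made up for the example.

```python
import unicodedata

composed = "caf\u00e9"      # é as one code point (U+00E9)
decomposed = "cafe\u0301"   # e followed by COMBINING ACUTE ACCENT (U+0301)

def same_text(a, b, form="NFC"):
    """Compare two strings after bringing both to the same normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

print(composed == decomposed)            # False: 4 code points vs 5
print(same_text(composed, decomposed))   # True once both are normalized
```

Normalizing both sides to the same form before comparing is a common way to avoid this class of bug.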

📋 HTML entities and CSS escapes

Every character shows its HTML entity (for example &#10084; for ❤), CSS escape (\2764), UTF-8 byte sequence, and decimal code point, all copy-ready in one click. Indispensable for front-end developers, email designers, and anyone working with special characters in code.
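
Both notations are derived directly from the code point, as this small Python sketch shows. The three function names are invented for the example, and the sketch produces numeric character references only, not named entities.

```python
def html_entity(ch):
    """Decimal numeric character reference, e.g. &#10084;"""
    return f"&#{ord(ch)};"

def html_entity_hex(ch):
    """Hexadecimal numeric character reference, e.g. &#x2764;"""
    return f"&#x{ord(ch):X};"

def css_escape(ch):
    """CSS escape: a backslash followed by the hex code point, e.g. \\2764.
    In a stylesheet, add a space after it if the next character is a hex digit."""
    return f"\\{ord(ch):X}"

heart = "\u2764"
print(html_entity(heart), html_entity_hex(heart), css_escape(heart))
```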

Who uses the Unicode Character Inspector

From cybersecurity researchers to front-end developers — anyone working with text at a deeper level.

Security

Cybersecurity researchers and fraud analysts use it to analyse suspicious URLs, usernames, and phishing text. Paste a domain that looks like “paypal.com” and instantly see which characters are Cyrillic or Greek imposters.

Dev

Front-end and backend developers use it to debug encoding issues, find the correct HTML entity or CSS escape for a special character, and understand normalization differences causing string comparison failures in their code.

Content

Content moderators and trust & safety teams use it to identify Unicode abuse — hidden characters embedded in usernames or bios, look-alike characters used to bypass filters, and invisible text used to manipulate word counts or pattern matching.

Design

Designers and typographers use it to identify the exact Unicode characters in a symbol or ligature, find the correct code point for a decorative character, and understand which Unicode blocks contain the symbols they’re looking for.

Curious

Curious people use it to understand what’s really inside emoji, why some characters look the same but behave differently in searches, and what those weird mathematical bold characters in LinkedIn posts actually are under the hood.

Frequently asked questions

What is a Unicode code point?

A Unicode code point is the unique number assigned to each character in the Unicode standard. It is written as U+XXXX (e.g. U+0041 for the letter A, U+1F600 for 😀). As of Unicode 15.1, the standard defines 149,813 characters across 161 scripts. Every device that supports Unicode uses the same code points, so the same text is identified consistently across platforms and languages.

What is a confusable character and why is it dangerous?

A confusable character is a character from one Unicode script that looks visually identical (or nearly identical) to a character from another script. For example, the Cyrillic letter а (U+0430) looks exactly like the Latin a (U+0061) but is a completely different code point. Attackers use this to register fake domain names like pаypal.com (with a Cyrillic “а”) that look identical to the real site but point to a phishing page.

What is a Zero Width Space and why would someone use it?

A Zero Width Space (U+200B) is an invisible character with no visible width. Legitimate uses include line-breaking hints in long words and separators in certain scripts. However, it is frequently abused to bypass text filters (inserting it inside a banned word like “s​pam”), to create usernames that appear identical to existing ones, or to embed hidden text inside visible content. Our inspector shows a ∅ symbol where invisible characters appear.
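
The filter-bypass trick looks like this in practice. The banned-word list, the naive filter, and the strip_zero_width helper are all hypothetical and kept minimal on purpose.

```python
BANNED = {"spam"}                                   # hypothetical word list
ZERO_WIDTH = {"\u200b", "\u200c", "\u2060", "\ufeff"}

def naive_filter(text):
    """Flag text that contains a banned word as a plain substring."""
    return any(word in text.lower() for word in BANNED)

def strip_zero_width(text):
    """Drop zero-width characters before filtering.
    U+200D (ZWJ) is left alone here because emoji sequences depend on it."""
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)

evasive = "buy s\u200bpam now"                      # looks like "buy spam now"
print(naive_filter(evasive))                        # False: the filter is bypassed
print(naive_filter(strip_zero_width(evasive)))      # True: caught after cleaning
```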

What is a ZWJ emoji sequence?

A Zero Width Joiner (ZWJ) sequence is a group of emoji joined by U+200D (ZWJ) characters that combines into a single visible emoji. For example, 👨‍👩‍👧‍👦 (family) is actually 7 code points: man emoji + ZWJ + woman emoji + ZWJ + girl emoji + ZWJ + boy emoji. Similarly, 👩‍💻 (woman technologist) is a woman emoji + ZWJ + laptop emoji. Our inspector detects and groups these automatically.
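
You can verify the count yourself with a few lines of Python. The describe helper is a name made up for this sketch; splitting on U+200D is a simple way to recover the component emoji, not necessarily how the inspector groups them.

```python
import unicodedata

ZWJ = "\u200d"
# man + ZWJ + woman + ZWJ + girl + ZWJ + boy
family = "\U0001F468\u200d\U0001F469\u200d\U0001F467\u200d\U0001F466"

def describe(sequence):
    """List (code point, Unicode name) for every code point in the sequence."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>")) for ch in sequence]

components = family.split(ZWJ)      # the four emoji without the joiners

print(len(family))                  # 7 code points in total
print(len(components))              # 4 visible components
for code_point, name in describe(family):
    print(code_point, name)
```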

What are the four Unicode normalization forms?

Unicode has four normalization forms for representing equivalent characters. NFC (Canonical Decomposition, followed by Canonical Composition) is the most common and the default for web content — it composes characters like é as a single code point. NFD decomposes them into base character + combining accent (two code points). NFKC and NFKD additionally apply compatibility mappings, which means characters like the mathematical bold 𝗔 are mapped to plain A. This matters enormously for developers comparing strings.
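
To see the four forms side by side, here is a short Python sketch using the standard unicodedata module. The all_forms helper is a name chosen for the example, and the styled letter used is U+1D5D4, MATHEMATICAL SANS-SERIF BOLD CAPITAL A.

```python
import unicodedata

FORMS = ("NFC", "NFD", "NFKC", "NFKD")

def all_forms(text):
    """Return the text in each of the four normalization forms."""
    return {form: unicodedata.normalize(form, text) for form in FORMS}

e_acute = all_forms("\u00e9")        # é as a single code point
bold_a = all_forms("\U0001D5D4")     # styled mathematical letter A

for label, forms in (("e-acute", e_acute), ("bold A", bold_a)):
    for form, value in forms.items():
        code_points = " ".join(f"U+{ord(ch):04X}" for ch in value)
        print(f"{label:>8} {form:<5} {code_points}")
```

The canonical forms (NFC, NFD) leave the styled letter untouched, while the compatibility forms (NFKC, NFKD) fold it to a plain A (U+0041).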

How is this different from other Unicode lookup tools?

Most Unicode lookup tools require you to know the code point in advance, show one character at a time, and have interfaces built for specialists. Our inspector analyses any pasted text character by character, intelligently groups emoji sequences, actively warns you about spoofing and hidden characters, shows normalization forms side by side, and provides copy-ready HTML entities and CSS escapes — all with a clean interface that works for non-developers too.

Does the inspector store or send my text anywhere?

No. Everything runs entirely in your browser. The text you paste is never sent to any server and is not stored anywhere. The Unicode analysis, character lookup, and detection all happen locally in JavaScript on your device. You can even use it offline once the page has loaded.

What is UTF-8 and how does it relate to Unicode?

Unicode is the standard that assigns a number (code point) to every character. UTF-8 is the most common way to encode those numbers as bytes for storage and transmission. ASCII characters (A–Z, 0–9, basic punctuation) use 1 byte in UTF-8. Code points from U+0080 to U+07FF, which cover accented Latin letters as well as Greek, Cyrillic, Hebrew, and Arabic, use 2 bytes. The rest of the Basic Multilingual Plane (up to U+FFFF), including most Chinese, Japanese, and Korean characters and older symbols such as ❤, uses 3 bytes. Supplementary characters above U+FFFF, including most emoji, use 4 bytes. Our inspector shows the exact UTF-8 byte sequence for every character.
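
The byte lengths are easy to check in Python, where str.encode produces the UTF-8 bytes directly. The utf8_bytes helper and the sample characters are choices made for this example.

```python
def utf8_bytes(ch):
    """Return the UTF-8 encoding of a character as space-separated hex bytes."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))

# One sample per encoded length: A, é, 中, 😀
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]
lengths = [len(ch.encode("utf-8")) for ch in samples]

for ch, n in zip(samples, lengths):
    print(f"U+{ord(ch):04X}  {n} byte(s)  {utf8_bytes(ch)}")
```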
