ASCII Explained — The Character Encoding Every Developer Should Know
ASCII (American Standard Code for Information Interchange) was first published in 1963 and became the first widely adopted standard for mapping text characters to numeric values. Over 60 years later it remains foundational: it is a subset of UTF-8, it supplies the characters used in the syntax of nearly every programming language, and understanding the ASCII table helps when debugging encoding issues, sorting problems, and character manipulation logic.
Convert between characters, ASCII codes, decimal, hex, and binary with our free ASCII Converter tool. For understanding the hex values of ASCII codes, our Hex Converter and Binary Converter tools provide instant conversions.
The ASCII Table — What You Need to Know
ASCII defines 128 characters using 7-bit encoding, covering values from 0 to 127. These fall into two main groups:
Control characters (0 to 31, plus 127): Non-printable characters that control how text is displayed or transmitted. The most important ones for developers: null character (0), horizontal tab (9), line feed, also called newline (10), carriage return (13), escape (27), and delete (127).
Printable characters (32 to 126): Space (32), punctuation marks, digits 0 through 9, uppercase letters A through Z, and lowercase letters a through z.
Key numeric relationships you should remember: Digit 0 = code 48. Digit 9 = code 57. To get the numeric value of a digit character, subtract 48. Letter A = code 65. Letter Z = code 90. Letter a = code 97. Letter z = code 122. The difference between an uppercase letter and its lowercase counterpart is always exactly 32. This difference of 32 is used in case-conversion algorithms: to convert uppercase to lowercase, add 32 to the code. To convert lowercase to uppercase, subtract 32. The offset only applies to the 26 letters, so check that the character is in the A to Z or a to z range before adding or subtracting.
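The arithmetic above can be sketched in a few lines of Python. The function names here are illustrative, not part of any library, and the sketch deliberately handles only ASCII digits and letters:

```python
def digit_value(ch: str) -> int:
    """Return the numeric value of a single ASCII digit character."""
    if len(ch) != 1 or not "0" <= ch <= "9":
        raise ValueError(f"not an ASCII digit: {ch!r}")
    return ord(ch) - 48  # ord("0") is 48


def to_lower_ascii(ch: str) -> str:
    """Lowercase one character, touching only A-Z (codes 65 to 90)."""
    if len(ch) == 1 and "A" <= ch <= "Z":
        return chr(ord(ch) + 32)
    return ch


def to_upper_ascii(ch: str) -> str:
    """Uppercase one character, touching only a-z (codes 97 to 122)."""
    if len(ch) == 1 and "a" <= ch <= "z":
        return chr(ord(ch) - 32)
    return ch


print(digit_value("7"))     # 7
print(to_lower_ascii("A"))  # a
print(to_upper_ascii("z"))  # Z
print(to_lower_ascii("!"))  # ! (unchanged: not a letter)
```

In real code, the built-in str.lower() and str.upper() are usually the better choice because they also handle non-ASCII letters.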
ASCII in Programming
Getting the ASCII Code of a Character
JavaScript: "A".charCodeAt(0) returns 65. "a".charCodeAt(0) returns 97. Python: ord("A") returns 65. ord("a") returns 97. C: (int) 'A' gives 65. Characters are just integers in C.
Getting the Character from a Code
JavaScript: String.fromCharCode(65) returns "A". String.fromCharCode(97) returns "a". Python: chr(65) returns "A". chr(97) returns "a". C: (char) 65 gives 'A'.
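As a rough illustration of how the two directions fit together, the following Python sketch prints the decimal, hex, and binary forms of a character and then converts a code back. The helper name describe is made up for this example:

```python
def describe(ch: str) -> str:
    """Show one character with its decimal, hex, and 7-bit binary code."""
    code = ord(ch)
    return f"{ch!r}: dec={code} hex={code:02X} bin={code:07b}"


print(describe("A"))  # 'A': dec=65 hex=41 bin=1000001
print(describe("a"))  # 'a': dec=97 hex=61 bin=1100001

# Round trip: code -> character -> code
assert ord(chr(65)) == 65
```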
String Sorting and ASCII Values
When programming languages sort strings by default, they compare character by character using ASCII (or Unicode) code values. This means uppercase letters sort before lowercase letters because A through Z (65 to 90) all have lower code values than a through z (97 to 122). The string "Zebra" sorts before "apple" in default string comparison because uppercase Z (90) has a lower code than lowercase a (97).
When sorting mixed-case strings for user display, always normalise case first: sort strings by their lowercase or uppercase equivalent, not their raw ASCII values. Most languages provide locale-aware sorting options that handle this correctly.
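A small Python sketch of the difference, using a made-up word list. It relies on str.casefold as the sort key, which is one reasonable way to normalise case; fully locale-aware ordering (for example with locale.strxfrm) depends on the system locale and is not shown here:

```python
words = ["apple", "Zebra", "banana", "Cherry"]

# Default comparison uses code values, so uppercase sorts first.
raw_order = sorted(words)
print(raw_order)      # ['Cherry', 'Zebra', 'apple', 'banana']

# Normalising case in the sort key gives the order users expect.
display_order = sorted(words, key=str.casefold)
print(display_order)  # ['apple', 'banana', 'Cherry', 'Zebra']
```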
ASCII vs Unicode vs UTF-8
ASCII covers only 128 characters: English letters, digits, basic punctuation, and control codes. It has no accented characters, no non-Latin scripts, no emoji, and no symbols beyond a small set such as @, # and $. This limitation became a serious problem as computing spread globally.
Unicode solves this by assigning a unique code point to each character across the world's writing systems. As of Unicode 16.0 it defines over 150,000 characters covering more than 160 scripts, including emoji, historical scripts, and mathematical symbols. U+0041 is the capital letter A, U+00E9 is the letter e with acute accent, U+1F600 is the grinning face emoji, U+4E2D is the Chinese character for middle.
UTF-8 is the most common encoding of Unicode. It uses 1 to 4 bytes per character. The critical property: the first 128 Unicode code points (U+0000 to U+007F) are encoded in UTF-8 as single bytes that are identical to their ASCII values. Any valid ASCII text is therefore valid UTF-8. This backward compatibility is why UTF-8 became the dominant encoding on the web and why ASCII, despite being over 60 years old, remains directly relevant today.
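The byte counts are easy to check in Python. This sketch encodes the four code points mentioned above and confirms that pure ASCII text produces the same bytes under both encodings:

```python
# A, e with acute accent, Chinese "middle", grinning face emoji
samples = ["A", "\u00e9", "\u4e2d", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {len(encoded)} byte(s): {encoded.hex(' ')}")

byte_lengths = [len(ch.encode("utf-8")) for ch in samples]
print(byte_lengths)  # [1, 2, 3, 4]

# ASCII-only text is byte-for-byte the same in ASCII and UTF-8.
same_bytes = "hello".encode("ascii") == "hello".encode("utf-8")
print(same_bytes)    # True
```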
You can verify character encoding in text-based tools. Our Character Counter tool counts characters and bytes, useful for understanding encoding differences. Our IDN Punycode Converter converts international domain names containing non-ASCII characters to their ASCII-compatible encoding.
Control Characters — The Invisible Ones
The control characters (0 to 31) are often the source of mysterious bugs, especially when handling text from different operating systems or external sources.
The newline situation: Unix and Linux use line feed (code 10, written as \n) to end lines. Windows uses carriage return plus line feed (codes 13 and 10, written as \r\n). Classic Mac OS used only carriage return (code 13, written as \r). When a Windows text file is opened on Linux without conversion, the carriage return characters can appear as visible characters (shown as ^M in many editors) at the end of each line. When a Unix file is opened in some older Windows programs, all lines run together.
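One common way to deal with mixed line endings is to normalise everything to a single line feed before processing. The following is a minimal sketch, with normalize_newlines as an illustrative name rather than a standard function:

```python
def normalize_newlines(text: str) -> str:
    """Convert Windows (CRLF) and classic Mac (CR) line endings to LF."""
    # Replace CRLF first so the lone-CR pass does not create extra lines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


mixed = "windows\r\nold mac\runix\n"
print(repr(normalize_newlines(mixed)))  # 'windows\nold mac\nunix\n'
```

When reading files, Python's open() in text mode already performs this translation by default, so a helper like this is mostly useful for text that arrives from other sources such as network payloads.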
The tab character (code 9) is the source of the spaces-versus-tabs debate. A tab is a single character but displays as variable width depending on editor settings. Our Text Separator tool can help process text that uses tabs as delimiters.
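To see that a tab is one character with a display-dependent width, a short Python sketch can compare the raw length with the expanded forms. The sample line is invented for illustration:

```python
line = "id\tname"

print(len(line))                # 7: the tab counts as one character
print(repr(line.expandtabs(4))) # 'id  name' (tab stop every 4 columns)
print(repr(line.expandtabs(8))) # 'id      name' (tab stop every 8 columns)

# Splitting on the tab recovers the delimited fields.
fields = line.split("\t")
print(fields)                   # ['id', 'name']
```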
Null bytes (code 0) terminate strings in C and C++. A null byte embedded in data being passed to a C library can silently truncate the data at that point — a common source of security vulnerabilities in systems that mix high-level languages with C code.
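The truncation effect can be observed from Python with the standard ctypes module, which exposes C-style buffers. In this sketch the payload is a made-up example; the value attribute reads the buffer the way C string functions would, stopping at the first null byte, while raw returns every byte:

```python
import ctypes

# Hypothetical input with an embedded null byte.
payload = b"user=alice\x00;role=admin"

buf = ctypes.create_string_buffer(payload)

as_c_string = buf.value  # what C code treating this as a string would see
all_bytes = buf.raw      # the full buffer, including a trailing terminator

print(as_c_string)       # b'user=alice'
print(len(payload), len(as_c_string))  # 22 10
```

A simple defensive measure, under the assumption that null bytes are never legitimate in the input, is to reject any value where b"\x00" in payload is true before handing it to C code.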

