ASCII Explained — The Character Encoding Every Developer Should Know

ASCII (American Standard Code for Information Interchange) was established in 1963 as the first widely adopted standard for mapping text characters to numeric values. Over 60 years later it remains foundational: it is a subset of UTF-8, it defines the characters used in every programming language syntax, and understanding the ASCII table helps when debugging encoding issues, sorting problems, and character manipulation logic.

Convert between characters, ASCII codes, decimal, hex, and binary with our free ASCII Converter tool. For understanding the hex values of ASCII codes, our Hex Converter and Binary Converter tools provide instant conversions.

The ASCII Table — What You Need to Know

ASCII defines 128 characters using 7-bit encoding, covering values from 0 to 127. These fall into two main groups:

Control characters (0 to 31, plus 127): Non-printable characters that control how text is displayed or transmitted. The most important ones for developers: null character (0), horizontal tab (9), line feed, which is the Unix newline (10), carriage return (13), escape (27), and delete (127).

Printable characters (32 to 126): Space (32), punctuation marks, digits 0 through 9, uppercase letters A through Z, and lowercase letters a through z.
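As a rough illustration of the ranges above, the following Python sketch labels a code as control or printable. The function name classify_ascii is hypothetical and chosen only for this example.

```python
def classify_ascii(code: int) -> str:
    """Return 'control', 'printable', or 'non-ascii' for an integer code."""
    if not 0 <= code <= 127:
        return "non-ascii"
    # Codes 0 to 31 and code 127 (delete) are control characters.
    if code < 32 or code == 127:
        return "control"
    # Everything else in the 7-bit range (32 to 126) is printable.
    return "printable"


for ch in ["\t", "\n", " ", "A", "~", "\x7f"]:
    print(repr(ch), ord(ch), classify_ascii(ord(ch)))
```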

Key numeric relationships you should remember: Digit 0 = code 48. Digit 9 = code 57. To get the numeric value of a digit character, subtract 48. Letter A = code 65. Letter Z = code 90. Letter a = code 97. Letter z = code 122. Each lowercase letter sits exactly 32 above its uppercase counterpart, and case-conversion algorithms rely on this: to convert an uppercase letter to lowercase, add 32 to its code; to convert a lowercase letter to uppercase, subtract 32. The arithmetic is only valid for the letters A to Z and a to z, so check the range before applying it.
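The arithmetic can be sketched in Python as follows. The helper names digit_value, to_lower_ascii, and to_upper_ascii are made up for this example; in real code the built-in int(), str.lower(), and str.upper() are usually the better choice.

```python
def digit_value(ch: str) -> int:
    """Numeric value of a single ASCII digit character ('0' to '9')."""
    if len(ch) != 1 or not "0" <= ch <= "9":
        raise ValueError(f"not an ASCII digit: {ch!r}")
    return ord(ch) - 48  # '0' is code 48


def to_lower_ascii(ch: str) -> str:
    """Add 32 to uppercase ASCII letters; leave everything else alone."""
    if len(ch) == 1 and "A" <= ch <= "Z":
        return chr(ord(ch) + 32)
    return ch


def to_upper_ascii(ch: str) -> str:
    """Subtract 32 from lowercase ASCII letters; leave everything else alone."""
    if len(ch) == 1 and "a" <= ch <= "z":
        return chr(ord(ch) - 32)
    return ch


print(digit_value("7"))     # 7
print(to_lower_ascii("A"))  # a
print(to_upper_ascii("z"))  # Z
```

The range checks matter: adding 32 to a digit or punctuation mark would produce an unrelated character.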

ASCII in Programming

Getting the ASCII Code of a Character

JavaScript: "A".charCodeAt(0) returns 65. "a".charCodeAt(0) returns 97. Python: ord("A") returns 65. ord("a") returns 97. C: (int) 'A' gives 65. Characters are just integers in C.

Getting the Character from a Code

JavaScript: String.fromCharCode(65) returns "A". String.fromCharCode(97) returns "a". Python: chr(65) returns "A". chr(97) returns "a". C: (char) 65 gives 'A'.

String Sorting and ASCII Values

When programming languages sort strings by default, they compare character by character using ASCII (or Unicode) code values. This means uppercase letters sort before lowercase letters because A through Z (65 to 90) all have lower code values than a through z (97 to 122). The string "Zebra" sorts before "apple" in default string comparison because uppercase Z (90) has a lower code than lowercase a (97).

When sorting mixed-case strings for user display, always normalise case first: sort strings by their lowercase or uppercase equivalent, not their raw ASCII values. Most languages provide locale-aware sorting options that handle this correctly.
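A small Python sketch of the difference, using a made-up word list. It normalises case with str.casefold as the sort key; full locale-aware collation is a separate topic and is not shown here.

```python
words = ["apple", "Zebra", "banana", "Cherry"]

# Default comparison uses code values, so all uppercase initials come first.
raw_order = sorted(words)
print(raw_order)      # ['Cherry', 'Zebra', 'apple', 'banana']

# Comparing case-folded keys gives the order a reader would expect.
display_order = sorted(words, key=str.casefold)
print(display_order)  # ['apple', 'banana', 'Cherry', 'Zebra']
```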

ASCII vs Unicode vs UTF-8

ASCII covers only 128 characters — English letters, digits, and basic punctuation. It has no accented characters, no non-Latin scripts, no emoji, no special symbols. This limitation became a serious problem as computing spread globally.

Unicode solves this by aiming to assign a unique code point to every character in every writing system, modern and historical. As of Unicode 16.0 it defines roughly 155,000 characters across 168 scripts, including emoji and mathematical symbols. For example, U+0041 is the capital letter A, U+00E9 is the letter e with acute accent, U+1F600 is the grinning face emoji, and U+4E2D is the Chinese character for middle.

UTF-8 is the most common encoding of Unicode. It uses 1 to 4 bytes per character. The critical property: the first 128 Unicode code points (U+0000 to U+007F) are encoded in UTF-8 as single bytes identical to their ASCII values. Any valid ASCII text is therefore valid UTF-8. This backward compatibility is why UTF-8 became the dominant encoding on the web and why ASCII, despite being over 60 years old, remains directly relevant today.
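To see the variable byte length and the ASCII compatibility in practice, here is a short Python sketch. The sample characters are the four code points mentioned above, written as escapes so the snippet does not depend on the file's own encoding.

```python
# A, e with acute accent, the Chinese character for middle, grinning face.
samples = ["\u0041", "\u00e9", "\u4e2d", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-8")
    # ascii(ch) keeps the output printable on any terminal encoding.
    print(f"U+{ord(ch):04X} {ascii(ch)}: {len(encoded)} byte(s) -> {encoded.hex(' ')}")

byte_lengths = [len(ch.encode("utf-8")) for ch in samples]
print(byte_lengths)  # [1, 2, 3, 4]

# Pure ASCII text produces exactly the same bytes under both encodings.
ascii_text = "Hello, ASCII!"
same_bytes = ascii_text.encode("ascii") == ascii_text.encode("utf-8")
print(same_bytes)    # True
```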

You can verify character encoding in text-based tools. Our Character Counter tool counts characters and bytes, useful for understanding encoding differences. Our IDN Punycode Converter converts international domain names containing non-ASCII characters to their ASCII-compatible encoding.

Control Characters — The Invisible Ones

The control characters (0 to 31) are often the source of mysterious bugs, especially when handling text from different operating systems or external sources.

The newline situation: Unix and Linux use line feed (code 10, written as backslash n) to end lines. Windows uses carriage return plus line feed (codes 13 and 10, written as backslash r backslash n). Old Mac OS used only carriage return (code 13). When a Windows text file is opened on Linux without conversion, the carriage return characters appear as visible characters — shown as caret-M in many editors — at the end of each line. When a Unix file is opened in some Windows programs, all lines run together.
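One common way to handle mixed line endings is to normalise everything to line feed before processing. A minimal Python sketch, assuming the text is already decoded into a string (normalize_newlines is a hypothetical helper name):

```python
def normalize_newlines(text: str) -> str:
    """Convert CRLF (Windows) and lone CR (old Mac OS) line endings to LF."""
    # Replace CRLF first so the CR of each pair is not converted a second time.
    return text.replace("\r\n", "\n").replace("\r", "\n")


windows_text = "first\r\nsecond\r\n"
old_mac_text = "first\rsecond\r"

print(repr(normalize_newlines(windows_text)))  # 'first\nsecond\n'
print(repr(normalize_newlines(old_mac_text)))  # 'first\nsecond\n'
```

The order of the two replacements matters: handling lone carriage returns first would turn each CRLF pair into two line feeds.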

The tab character (code 9) is the source of the spaces-versus-tabs debate. A tab is a single character but displays as variable width depending on editor settings. Our Text Separator tool can help process text that uses tabs as delimiters.
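The following Python sketch shows that a tab is one character whose displayed width depends on the tab stop, and how tab-delimited text can be split. The sample strings are invented for illustration.

```python
line = "a\tb"

# The tab counts as a single character in the string.
print(len(line))  # 3

# Expanding to spaces depends on the tab stop the viewer assumes.
width_4 = line.expandtabs(4)
width_8 = line.expandtabs(8)
print(repr(width_4))  # 'a   b'
print(repr(width_8))  # 'a       b'

# Splitting on the tab character recovers tab-delimited fields.
fields = "name\tage\tcity".split("\t")
print(fields)  # ['name', 'age', 'city']
```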

Null bytes (code 0) terminate strings in C and C++. A null byte embedded in data being passed to a C library can silently truncate the data at that point — a common source of security vulnerabilities in systems that mix high-level languages with C code.
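The truncation can be demonstrated from Python with the standard ctypes module, which exposes C-style string semantics. This is only a sketch with a made-up payload; has_embedded_null is a hypothetical guard, not a complete defence.

```python
import ctypes

payload = b"user=alice\x00;role=admin"

# A C-style buffer holds every byte, but reading it as a C string
# stops at the first null byte.
buf = ctypes.create_string_buffer(payload)
as_c_string = buf.value
print(as_c_string)  # b'user=alice'
print(len(payload), len(as_c_string))


def has_embedded_null(data: bytes) -> bool:
    """Return True if data contains a null byte that C code would stop at."""
    return b"\x00" in data


print(has_embedded_null(payload))  # True
```

Rejecting or escaping such input before it reaches C code is one way to avoid the mismatch between what the high-level language sees and what the C library sees.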

Frequently Asked Questions

What is the difference between ASCII 10 and ASCII 13?

ASCII 10 is the Line Feed character (LF, written as backslash n). ASCII 13 is the Carriage Return character (CR, written as backslash r). Unix and Linux use LF alone to end lines. Windows uses CR followed by LF (CRLF). Old Mac OS used CR alone. This difference causes the classic cross-platform issue where Windows text files show stray characters at line endings on Unix systems or appear as one long line without line breaks in some programs.

Why does ASCII have only 128 characters?

ASCII was designed as a 7-bit encoding in the 1960s when memory was extremely expensive and 7-bit serial transmission was common. The 8th bit in a byte was often reserved for error-checking (parity bit). Using 7 bits gives 128 possible values, which was sufficient for English text at the time. Extended ASCII encodings use all 8 bits for 256 values, but there was never a single standard for the upper 128: different manufacturers used codes 128 to 255 differently. Unicode later solved this fragmentation properly.
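As an illustration of how a spare 8th bit could carry parity, here is a Python sketch of even parity. It assumes the parity bit sits in the most significant bit, which is a choice made for this example; real links varied in parity type and bit placement. The name with_even_parity is hypothetical.

```python
def with_even_parity(code: int) -> int:
    """Return an 8-bit value: the 7-bit code plus an even-parity bit in bit 7."""
    if not 0 <= code <= 0x7F:
        raise ValueError("code must fit in 7 bits (0 to 127)")
    ones = bin(code).count("1")
    # Set the parity bit only when the count of 1 bits is odd,
    # so the total number of 1 bits in the byte is always even.
    parity_bit = ones % 2
    return (parity_bit << 7) | code


print(2 ** 7)                                     # 128 possible 7-bit values
print(format(with_even_parity(ord("A")), "08b"))  # 01000001
print(format(with_even_parity(ord("C")), "08b"))  # 11000011
```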

How do I remove non-ASCII characters from text?

The simplest approach in most languages: filter out characters whose code point is greater than 127. In JavaScript: text.replace(/[^\x00-\x7F]/g, "") removes all non-ASCII characters. In Python: text.encode("ascii", errors="ignore").decode("ascii") strips non-ASCII. For removing emoji specifically from text, our Emojis Remover tool handles this instantly without writing code.
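A short Python sketch of the filtering approach, with a made-up sample string. The helper name strip_non_ascii is hypothetical. Note that stripping drops accented letters entirely rather than replacing them with unaccented ones.

```python
def strip_non_ascii(text: str) -> str:
    """Keep only characters whose code point is in the ASCII range 0 to 127."""
    return "".join(ch for ch in text if ord(ch) <= 127)


sample = "caf\u00e9 \U0001F600 ok"

stripped = strip_non_ascii(sample)
print(repr(stripped))      # 'caf  ok'

# The encode/decode round trip gives the same result.
round_trip = sample.encode("ascii", errors="ignore").decode("ascii")
print(stripped == round_trip)  # True

# str.isascii() (Python 3.7+) checks whether any stripping is needed at all.
print(sample.isascii(), stripped.isascii())  # False True
```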

What is the difference between ASCII and binary?

ASCII is a character encoding standard that assigns numeric values to characters. Binary is a number system using only 0 and 1. The ASCII code for letter A is the decimal number 65. In binary, 65 is written as 01000001. They are different levels of the same underlying representation: the character A is stored as the number 65, which the computer stores in memory as the binary pattern 01000001. Our Binary Converter converts between binary and decimal.
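The same relationship can be checked in a few lines of Python. The function name describe is made up for this sketch; it shows one character as decimal, hex, and 8-bit binary.

```python
def describe(ch: str) -> dict:
    """Return the decimal, hex, and 8-bit binary forms of one character's code."""
    code = ord(ch)
    return {
        "decimal": code,
        "hex": format(code, "02X"),
        "binary": format(code, "08b"),
    }


print(describe("A"))  # {'decimal': 65, 'hex': '41', 'binary': '01000001'}

# Going the other way: parse the binary digits, then map the number to a character.
print(chr(int("01000001", 2)))  # A
```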