ASCII

American character encoding standard

ASCII (pronounced "az-kee", "ass-key" if American), is a table of characters for computers. It is binary code used by electronic equipment to handle text using the English alphabet, numbers, and other common symbols. ASCII is an abbreviation for American Standard Code for Information Interchange.[1] ASCII was developed in the 1960s and was based on earlier codes used by telegraph systems.

The 95 graphic ASCII characters, numbered 32 to 126 (decimal)

The code includes definitions for 128 characters, which are assigned numbers from 0 to 127. Numbers 32 to 126 stand for letters, numbers and symbols such as abc, ABC, 123, and ?&!. Numbers 0 to 31 and the number 127 stand for characters that control how text is processed and are not printed.

ASCII uses 7 binary digits (bits) to represent characters. The bits 1000001 (65 in normal base-10 numbers) represent the upper-case letter A, 1000010 represents B, 1000011 represents C, and so on. Using the table below, you can look up a number in the Decimal column and see the character it represents in the Char column. Programmers often use hexadecimal (base-16 numbers), and you can look that up in the Hex column.

An ASCII computer file uses one byte for each character. A byte has 8 bits, so can hold an ASCII character with one bit spare. In the past, the spare bit was sometimes used as a parity bit to help check if the data had been corrupted.

ASCII has no formatting control (for bold or Italics, etc.) [2] Sometimes someone talks about a file or document in ASCII, meaning it is in plain text.

Standard ASCII is still commonly used, particularly in computer software and HTML files. Until 2010 it was the standard for URLs. Often a web site that has fields for entering text will only take ASCII text. Any special markups for bold or centered text, etc. will show up incorrectly.


ASCII-Table-wide

Extended ASCII

change

ASCII does not have diacritics (marks that are added to a letter, like the dots (umlauts) above vowels in German, or the tilde (~) above the 'n' for the 'ñ' used in Spanish). It was only meant for English and doesn't work well for most other languages. Some English words borrowed from other languages use these marks as well, like resumé (see Appendix:English words with diacritics).

This led to some systems using 8 bit (a full byte) instead of 7 bits. The proper name for systems that use 8 bits is called extended ASCII. Eight bits allows for 256 characters. The first 128 characters must be the same as for ASCII and the rest are usually used for alphabetic letters with accents, for example like É, È, Î and Ü. This solves the problem for a few languages that are based on the Latin alphabet, although not all extended ASCII systems are the same. Other alphabets, like the Greek alphabet, Cyrillic alphabet need a different set of characters. And some systems like those using Chinese characters still do not work, as they use thousands of characters. So Unicode was created to have one common system for all languages.

References

change
  1. "ASCII Character Set". Robelle. Retrieved 5 April 2013.
  2. "ASCII table & description". ASCIITable. Retrieved 5 April 2013.

Other websites

change