Module Charset

Inheritance graph
Charset Locale.Charset
Description

The Charset module supports a wide variety of different character sets, and it is flexible in regard of the names of character sets it accepts. The character case is ignored, as are the most common non-alaphanumeric characters appearing in character set names. E.g. "iso-8859-1" works just as well as "ISO_8859_1". All encodings specified in RFC 1345 are supported.

First of all the Charset module is capable of handling the following encodings of Unicode:

  • utf7, utf8, utf16, utf16be, utf16le, utf32, utf32be, utf32le, utf75 and utf7½

    UTF encodings

  • shiftjis, euc-kr, euc-cn and euc-jp

Most, if not all, of the relevant code pages are represented, as the following list shows. Prefix the numbers as noted in the list to get the wanted codec:

  • 037, 038, 273, 274, 275, 277, 278, 280, 281, 284, 285, 290, 297, 367, 420, 423, 424, 437, 500, 819, 850, 851, 852, 855, 857, 860, 861, 862, 863, 864, 865, 866, 868, 869, 870, 871, 880, 891, 903, 904, 905, 918, 932, 936, 950 and 1026

    These may be prefixed with "cp", "ibm" or "ms".

  • 1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257 and 1258

    These may be prefixed with "cp", "ibm", "ms" or "windows"

  • mysql-latin1

    The default charset in MySQL, similar to cp1252.

+359 more.

Note

In Pike 7.8 and earlier this module was named Locale.Charset.