Network Working Group
Request for Comments: 2319
Category: Informational
KOI8-U Working Group
April 1998

Ukrainian Character Set KOI8-U

Status of this Memo

This memo provides information for the Internet community. It does not specify an Internet standard of any kind. Distribution of this memo is unlimited.

Copyright Notice

Copyright © The Internet Society (1998). All Rights Reserved.

Abstract

This document provides information about the character encoding KOI8-U (KOI8 Ukrainian), which is a de facto standard in the Ukrainian Internet community. KOI8-U is compatible with KOI8-R (RFC 1489) in all Russian letters and extends it with four Ukrainian letters whose positions comply with ISO-IR-111. The official site of the KOI8-U Working Group is http://www.net.ua.

Introduction

This document provides information about the character encoding KOI8-U (KOI8 Ukrainian), which is widely used in the Ukrainian Internet community for mail and news exchange as well as for presenting WWW information resources in the Ukrainian language.

The specification of the proposed koi8-u standard was originally presented by Igor Sviridov from Kiev and Stas Vorony from Kharkiv, and was officially adopted by the conference of Postmasters of Ukrainian Internet Service Providers in Slavsk in autumn 1992. Later, in June 1995, the specification was completed with the addition of UKRAINIAN GHE WITH UPTURN.

KOI8-U (KOI8 Ukrainian) is a de facto standard supported by many operating systems and Internet user applications through encoding tables, fonts, and locale support.

   MIME character set name: koi8-u
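As an illustration only, the following sketch shows how the charset name can be used to select a codec. It assumes a Python 3 interpreter, whose standard library registers a codec under the name koi8-u; the sample word and the variable names are not part of this memo.

```python
import codecs

# Look up the codec by its MIME charset name (assumption: Python's
# bundled "koi8-u" codec follows the table given in this memo).
info = codecs.lookup("koi8-u")

text = "Київ"                      # sample word, chosen for illustration
encoded = text.encode("koi8-u")    # one byte per character
decoded = encoded.decode("koi8-u")

print(info.name, encoded.hex(), decoded)
```

Each character of the sample word occupies a single byte, and decoding with the same charset name restores the original text.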

Relation to other RFCs

   This standard is based on several published standards: RFC 1489
   (with which it is fully compatible in all Russian letters), RFC 1345,
   ISO-IR-111, and ISO 10646.

Compatibility with other character sets

The lower part of the KOI8-U character set is an exact copy of ASCII, as it is in KOI8-R and other codepages that extend ASCII.
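The ASCII compatibility of the lower half can be checked with a short sketch such as the one below. It assumes the koi8-u codec bundled with Python 3; the variable names are illustrative.

```python
# Bytes 0..127 should decode to the same characters under ASCII and KOI8-U.
lower_half = bytes(range(128))

ascii_text = lower_half.decode("ascii")
koi8u_text = lower_half.decode("koi8-u")

same_lower_half = ascii_text == koi8u_text
print(same_lower_half)
```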

   The upper part of the KOI8-U character set contains all Russian
   letters defined in KOI8-R and four Ukrainian letters (#164, #180 -
   ukr. ie; #166, #182 - ukr. i; #167, #183 - ukr. yi; #173, #189 - ukr.
   ghe with upturn) whose positions comply with ISO-IR-111.
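The code positions listed above can be explored with a minimal sketch like the following. It assumes the koi8-u codec bundled with Python 3 and takes the first number of each pair as the small letter and the second as the capital letter, as in the table later in this memo; the dictionary and variable names are illustrative.

```python
# (small, capital) code positions of the four Ukrainian letters
positions = {
    "ie": (164, 180),
    "i": (166, 182),
    "yi": (167, 183),
    "ghe with upturn": (173, 189),
}

# Decode each pair of byte values into a two-character string.
letters = {name: bytes(pair).decode("koi8-u") for name, pair in positions.items()}

for name, pair in letters.items():
    small, capital = positions[name]
    print(f"{name}: #{small} {pair[0]}  #{capital} {pair[1]}")
```

In every pair the capital letter sits 16 positions after the small one.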

BOX DRAWINGS elements in the other positions (those not used by Ukrainian letters) are the same as in the KOI8-R character set. All differences between KOI8-R and KOI8-U are listed in Appendix A.

Specification of the upper part of KOI8-U codepage

The characters in the upper half of the KOI8-U codepage are described according to the ISO 10646 Universal Character Set (UCS).

The KOI8-U charset table in RFC 1345 format is given in Appendix B.

    <decimal>  <hex-code>  <UCS>     <description>
   
   128       80      U2500      BOX DRAWINGS  LIGHT HORIZONTAL
   129       81      U2502      BOX DRAWINGS  LIGHT VERTICAL
   130       82      U250C      BOX DRAWINGS  LIGHT DOWN AND RIGHT
   131       83      U2510      BOX DRAWINGS  LIGHT DOWN AND LEFT
   132       84      U2514      BOX DRAWINGS  LIGHT UP AND RIGHT
   133       85      U2518      BOX DRAWINGS  LIGHT UP AND LEFT
   134       86      U251C      BOX DRAWINGS  LIGHT VERTICAL AND RIGHT
   135       87      U2524      BOX DRAWINGS  LIGHT VERTICAL AND LEFT
   136       88      U252C      BOX DRAWINGS  LIGHT DOWN AND HORIZONTAL
   137       89      U2534      BOX DRAWINGS  LIGHT UP AND HORIZONTAL
   138       8A      U253C      BOX DRAWINGS  LIGHT VERTICAL AND
                                 HORIZONTAL
   139       8B      U2580      UPPER HALF BLOCK
   140       8C      U2584      LOWER HALF BLOCK
   141       8D      U2588      FULL BLOCK
   142       8E      U258C      LEFT HALF BLOCK
   143       8F      U2590      RIGHT HALF BLOCK
   144       90      U2591      LIGHT SHADE
   145       91      U2592      MEDIUM SHADE
   146       92      U2593      DARK SHADE
   147       93      U2320      TOP HALF INTEGRAL
   148       94      U25A0      BLACK SQUARE
   149       95      U2219      BULLET OPERATOR
   150       96      U221A      SQUARE ROOT
   151       97      U2248      ALMOST EQUAL TO
   152       98      U2264      LESS THAN OR EQUAL TO
   153       99      U2265      GREATER THAN OR EQUAL TO
   154       9A      U00A0      NO-BREAK SPACE
   155       9B      U2321      BOTTOM HALF INTEGRAL
   156       9C      U00B0      DEGREE SIGN
   157       9D      U00B2      SUPERSCRIPT TWO
   158       9E      U00B7      MIDDLE DOT
   159       9F      U00F7      DIVISION SIGN
   160       A0      U2550      BOX DRAWINGS  DOUBLE HORIZONTAL
   161       A1      U2551      BOX DRAWINGS  DOUBLE VERTICAL
   162       A2      U2552      BOX DRAWINGS  DOWN SINGLE AND RIGHT
                                 DOUBLE
   163       A3      U0451      CYRILLIC SMALL LETTER IO
   164       A4      U0454      CYRILLIC SMALL LETTER UKRAINIAN IE
   165       A5      U2554      BOX DRAWINGS  DOUBLE DOWN AND RIGHT
   166       A6      U0456      CYRILLIC SMALL LETTER BYELORUSSIAN-
                                 UKRAINIAN I
   167       A7      U0457      CYRILLIC SMALL LETTER YI (UKRAINIAN)
   168       A8      U2557      BOX DRAWINGS  DOUBLE DOWN AND LEFT
   169       A9      U2558      BOX DRAWINGS  UP SINGLE AND RIGHT DOUBLE
   170       AA      U2559      BOX DRAWINGS  UP DOUBLE AND RIGHT SINGLE
   171       AB      U255A      BOX DRAWINGS  DOUBLE UP AND RIGHT
   172       AC      U255B      BOX DRAWINGS  UP SINGLE AND LEFT DOUBLE
   173       AD      U0491      CYRILLIC SMALL LETTER GHE WITH UPTURN
   174       AE      U255D      BOX DRAWINGS  DOUBLE UP AND LEFT
   175       AF      U255E      BOX DRAWINGS  VERTICAL SINGLE AND
                                 RIGHT DOUBLE
   176       B0      U255F      BOX DRAWINGS  VERTICAL DOUBLE AND
                                 RIGHT SINGLE
   177       B1      U2560      BOX DRAWINGS  DOUBLE VERTICAL AND RIGHT
   178       B2      U2561      BOX DRAWINGS  VERTICAL SINGLE AND
                                 LEFT DOUBLE
   179       B3      U0401      CYRILLIC CAPITAL LETTER IO
   180       B4      U0404      CYRILLIC CAPITAL LETTER UKRAINIAN IE
   181       B5      U2563      BOX DRAWINGS  DOUBLE VERTICAL AND LEFT
   182       B6      U0406      CYRILLIC CAPITAL LETTER
                                 BYELORUSSIAN-UKRAINIAN I
   183       B7      U0407      CYRILLIC CAPITAL LETTER YI (UKRAINIAN)
   184       B8      U2566      BOX DRAWINGS  DOUBLE DOWN AND HORIZONTAL
   185       B9      U2567      BOX DRAWINGS  UP SINGLE AND
                                 HORIZONTAL DOUBLE
   186       BA      U2568      BOX DRAWINGS  UP DOUBLE AND
                                 HORIZONTAL SINGLE
   187       BB      U2569      BOX DRAWINGS  DOUBLE UP AND HORIZONTAL
   188       BC      U256A      BOX DRAWINGS  VERTICAL SINGLE AND
                                 HORIZONTAL DOUBLE
   189       BD      U0490      CYRILLIC CAPITAL LETTER GHE WITH UPTURN
   190       BE      U256C      BOX DRAWINGS  DOUBLE VERTICAL AND
                                 HORIZONTAL
   191       BF      U00A9      COPYRIGHT SIGN
   192       C0      U044E      CYRILLIC SMALL LETTER YU
   193       C1      U0430      CYRILLIC SMALL LETTER A
   194       C2      U0431      CYRILLIC SMALL LETTER BE
   195       C3      U0446      CYRILLIC SMALL LETTER TSE
   196       C4      U0434      CYRILLIC SMALL LETTER DE
   197       C5      U0435      CYRILLIC SMALL LETTER IE
   198       C6      U0444      CYRILLIC SMALL LETTER EF
   199       C7      U0433      CYRILLIC SMALL LETTER GHE
   200       C8      U0445      CYRILLIC SMALL LETTER KHA
   201       C9      U0438      CYRILLIC SMALL LETTER I
   202       CA      U0439      CYRILLIC SMALL LETTER SHORT I
   203       CB      U043A      CYRILLIC SMALL LETTER KA
   204       CC      U043B      CYRILLIC SMALL LETTER EL
   205       CD      U043C      CYRILLIC SMALL LETTER EM
   206       CE      U043D      CYRILLIC SMALL LETTER EN
   207       CF      U043E      CYRILLIC SMALL LETTER O
   208       D0      U043F      CYRILLIC SMALL LETTER PE
   209       D1      U044F      CYRILLIC SMALL LETTER YA
   210       D2      U0440      CYRILLIC SMALL LETTER ER
   211       D3      U0441      CYRILLIC SMALL LETTER ES
   212       D4      U0442      CYRILLIC SMALL LETTER TE
   213       D5      U0443      CYRILLIC SMALL LETTER U
   214       D6      U0436      CYRILLIC SMALL LETTER ZHE
   215       D7      U0432      CYRILLIC SMALL LETTER VE
   216       D8      U044C      CYRILLIC SMALL LETTER SOFT SIGN
   217       D9      U044B      CYRILLIC SMALL LETTER YERU
   218       DA      U0437      CYRILLIC SMALL LETTER ZE
   219       DB      U0448      CYRILLIC SMALL LETTER SHA
   220       DC      U044D      CYRILLIC SMALL LETTER E
   221       DD      U0449      CYRILLIC SMALL LETTER SHCHA
   222       DE      U0447      CYRILLIC SMALL LETTER CHE
   223       DF      U044A      CYRILLIC SMALL LETTER HARD SIGN
   224       E0      U042E      CYRILLIC CAPITAL LETTER YU
   225       E1      U0410      CYRILLIC CAPITAL LETTER A
   226       E2      U0411      CYRILLIC CAPITAL LETTER BE
   227       E3      U0426      CYRILLIC CAPITAL LETTER TSE
   228       E4      U0414      CYRILLIC CAPITAL LETTER DE
   229       E5      U0415      CYRILLIC CAPITAL LETTER IE
   230       E6      U0424      CYRILLIC CAPITAL LETTER EF
   231       E7      U0413      CYRILLIC CAPITAL LETTER GHE
   232       E8      U0425      CYRILLIC CAPITAL LETTER KHA
   233       E9      U0418      CYRILLIC CAPITAL LETTER I
   234       EA      U0419      CYRILLIC CAPITAL LETTER SHORT I
   235       EB      U041A      CYRILLIC CAPITAL LETTER KA
   236       EC      U041B      CYRILLIC CAPITAL LETTER EL
   237       ED      U041C      CYRILLIC CAPITAL LETTER EM
   238       EE      U041D      CYRILLIC CAPITAL LETTER EN
   239       EF      U041E      CYRILLIC CAPITAL LETTER O
   240       F0      U041F      CYRILLIC CAPITAL LETTER PE
   241       F1      U042F      CYRILLIC CAPITAL LETTER YA
   242       F2      U0420      CYRILLIC CAPITAL LETTER ER
   243       F3      U0421      CYRILLIC CAPITAL LETTER ES
   244       F4      U0422      CYRILLIC CAPITAL LETTER TE
   245       F5      U0423      CYRILLIC CAPITAL LETTER U
   246       F6      U0416      CYRILLIC CAPITAL LETTER ZHE
   247       F7      U0412      CYRILLIC CAPITAL LETTER VE
   248       F8      U042C      CYRILLIC CAPITAL LETTER SOFT SIGN
   249       F9      U042B      CYRILLIC CAPITAL LETTER YERU
   250       FA      U0417      CYRILLIC CAPITAL LETTER ZE
   251       FB      U0428      CYRILLIC CAPITAL LETTER SHA
   252       FC      U042D      CYRILLIC CAPITAL LETTER E
   253       FD      U0429      CYRILLIC CAPITAL LETTER SHCHA
   254       FE      U0427      CYRILLIC CAPITAL LETTER CHE
   255       FF      U042A      CYRILLIC CAPITAL LETTER HARD SIGN
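A table of this shape can be regenerated programmatically. The sketch below is one possible approach, assuming the koi8-u codec and the unicodedata module of Python 3; the function name is hypothetical, and the character names come from the Unicode database shipped with the interpreter, so a few descriptions may be worded slightly differently from those above.

```python
import unicodedata

def upper_half_rows():
    """Return (decimal, hex-code, UCS, description) for bytes 128..255."""
    rows = []
    for code in range(128, 256):
        char = bytes([code]).decode("koi8-u")
        rows.append((code, f"{code:02X}", f"U{ord(char):04X}",
                     unicodedata.name(char)))
    return rows

rows = upper_half_rows()

# Print the first few rows in the column layout used above.
for decimal, hex_code, ucs, description in rows[:4]:
    print(f"{decimal:<9} {hex_code:<7} {ucs:<10} {description}")
```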

Security Considerations

This memo raises no known security issues.

Acknowledgments

The present edition of this document was prepared jointly by the KOI8-U Working Group and is the result of wide discussion in the Ukrainian USENET newsgroup ukr.nodes and of consensus reached among the majority of Ukrainian ISPs.

Special thanks to:

Andrew Chernov <ache@astral.msk.su>, author of RFC 1489, the first Internet RFC describing the KOI8-R Russian character set;

Igor Sviridov <sia@nest.org>, for the initial work on establishing and supporting the KOI8-U character set and its implementation in the first e-mail products.

Many people have contributed to the early work on koi8-u encoding:

          Stanislav V. Voronyi <stas@uanet.kharkov.ua>
          Serge Vakulenko <vak@zebub.msk.su>
          Lena Savchenko <epsav@eps.computerland.kiev.ua>
          Igor Romanenko <igor@carrier.kiev.ua>
          Ruslan Belkin <rus@UA.net>
          Andrey Blohintsev <bag@UA.net>

References

   [1]  Chernov, A., "Registration of a Cyrillic Character Set", RFC
        1489, July 1993.
   
   [2]  UNICODE 2.0 CHARACTER DATABASE. -
        ftp://unicode.org/pub/2.0-Update/UnicodeData-2.0.14.txt
   
   [3]  Ukrainian letters in koi8-u and other character sets. -
        ftp://ftp.ua.net/pub/info/encodings/koi8-u/ukr_chars_in_koi8-u_and_others.txt,
        June 1995.
   
   [4]  ECMA-CYRILLIC. - ftp://dkuug.dk/i18n/charmaps.all/ECMA-CYRILLIC
   
   [5]  Simonsen, K., "Character Mnemonics & Character Sets", RFC 1345,
        June 1992.

KOI8-U Working Group List

   Coordinator:
   Alexander Yeremenko <koi8-u@sita.kiev.ua>
   
   Yuri Demchenko <demch@cad.ntu-kpi.kiev.ua>
   Victor Forsyuk <victor@gu.net>
   Taras Heychenko <tasic@lucky.net>
   Pavel Gulchuk <gul@lucky.net>
   Dmitry Kohmanyuk <dk@farm.org>
   Boris Mostovoy <vms@breaker.gu.net>
   Helen Panchenko <elena@alex-ua.com>
   Igor Romanenko <igor@lucky.net>
   Eugene Sherstobitov <gene@lucky.net>
   Andrew Stesin <stesin@gu.net>
   Igor Sviridov <sia@nest.org>
   Roman A. Tkachuk <roman@bit.ternopil.ua>

APPENDIX A

DIFFERENCES BETWEEN KOI8-U AND KOI8-R (RFC 1489)

KOI8-U is compatible with KOI8-R in all Cyrillic letters and extends it with four Ukrainian letters: UKRAINIAN IE (#164, #180), BYELORUSSIAN-UKRAINIAN I (#166, #182), UKRAINIAN YI (#167, #183), and UKRAINIAN GHE WITH UPTURN (#173, #189).

      <decimal> <hex-code>  <UCS>    <description>
    
    164       A4      U0454      CYRILLIC SMALL LETTER UKRAINIAN IE
    166       A6      U0456      CYRILLIC SMALL LETTER
                                 BYELORUSSIAN-UKRAINIAN I
    167       A7      U0457      CYRILLIC SMALL LETTER YI (UKRAINIAN)
    173       AD      U0491      CYRILLIC SMALL LETTER UKRAINIAN GHE
                                 WITH UPTURN
    180       B4      U0404      CYRILLIC CAPITAL LETTER UKRAINIAN IE
    182       B6      U0406      CYRILLIC CAPITAL LETTER
                                 BYELORUSSIAN-UKRAINIAN I
    183       B7      U0407      CYRILLIC CAPITAL LETTER YI (UKRAINIAN)
    189       BD      U0490      CYRILLIC CAPITAL LETTER UKRAINIAN GHE
                                 WITH UPTURN
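The list of differences can be cross-checked by decoding every byte value under both charsets. The following is a minimal sketch, assuming the koi8-r and koi8-u codecs bundled with Python 3 match RFC 1489 and this memo respectively; the function name is hypothetical.

```python
def koi8_differences():
    """Return (decimal, KOI8-R UCS, KOI8-U UCS) for positions that differ."""
    diffs = []
    for code in range(256):
        raw = bytes([code])
        in_koi8r = raw.decode("koi8-r")
        in_koi8u = raw.decode("koi8-u")
        if in_koi8r != in_koi8u:
            diffs.append((code, f"U{ord(in_koi8r):04X}", f"U{ord(in_koi8u):04X}"))
    return diffs

diffs = koi8_differences()
for code, koi8r_ucs, koi8u_ucs in diffs:
    print(f"{code}  {code:02X}  KOI8-R {koi8r_ucs}  KOI8-U {koi8u_ucs}")
```

Under these assumptions the loop reports only the eight positions tabulated above.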

APPENDIX B

KOI8-U charset table in RFC1345 format

&charset KOI8-U
&rem source: RFC 2319
&rem Mibenum: 2088
&rem source: http://www.net.ua/KOI8-U/
&bits 8
&code 0
NU SH SX EX ET EQ AK BL BS HT LF VT FF CR SO SI
DL D1 D2 D3 D4 NK SY EB CN EM SB EC FS GS RS US
SP ! " Nb DO % & ' ( ) * + , - . /
0 1 2 3 4 5 6 7 8 9 : ; < = > ?
At A B C D E F G H I J K L M N O
P Q R S T U V W X Y Z <( // )> '> _
'! a b c d e f g h i j k l m n o
p q r s t u v w x y z (! !! !) '? DT
hh vv dr dl ur ul vr vl dh uh vh TB LB FB lB RB
.S :S ?S Iu fS Sb RT ?2 =< >= NS Il DG 2S .M -:
HH VV dR io ie DR ii yi LD uR Ur UR uL g3 UL vR
Vr VR vL IO IE VL II YI DH uH Uh UH vH G3 VH Co
ju a= b= c= d= e= f= g= h= i= j= k= l= m= n= o=
p= ja r= s= t= u= z% v= %' y= z= s% je sc c% ='
JU A= B= C= D= E= F= G= H= I= J= K= L= M= N= O=
P= JA R= S= T= U= Z% V= %" Y= Z= S% JE Sc C% ="
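The way such a table maps mnemonics to code positions can be sketched as follows. The sketch assumes, as the "&code 0" line suggests, that the mnemonics are assigned to consecutive code positions starting at 0; the TABLE constant simply repeats the rows above, and all names are illustrative.

```python
# Mnemonic rows copied from the table above. Splitting on whitespace
# makes the line layout irrelevant.
TABLE = r"""
NU SH SX EX ET EQ AK BL BS HT LF VT FF CR SO SI
DL D1 D2 D3 D4 NK SY EB CN EM SB EC FS GS RS US
SP ! " Nb DO % & ' ( ) * + , - . /
0 1 2 3 4 5 6 7 8 9 : ; < = > ?
At A B C D E F G H I J K L M N O
P Q R S T U V W X Y Z <( // )> '> _
'! a b c d e f g h i j k l m n o
p q r s t u v w x y z (! !! !) '? DT
hh vv dr dl ur ul vr vl dh uh vh TB LB FB lB RB
.S :S ?S Iu fS Sb RT ?2 =< >= NS Il DG 2S .M -:
HH VV dR io ie DR ii yi LD uR Ur UR uL g3 UL vR
Vr VR vL IO IE VL II YI DH uH Uh UH vH G3 VH Co
ju a= b= c= d= e= f= g= h= i= j= k= l= m= n= o=
p= ja r= s= t= u= z% v= %' y= z= s% je sc c% ='
JU A= B= C= D= E= F= G= H= I= J= K= L= M= N= O=
P= JA R= S= T= U= Z% V= %" Y= Z= S% JE Sc C% ="
"""

tokens = TABLE.split()  # token at index n is the mnemonic for code n

# Mnemonics are case-sensitive: "ie" is the small letter, "IE" the capital.
for mnemonic in ("ie", "ii", "yi", "g3", "IE", "II", "YI", "G3"):
    code = tokens.index(mnemonic)
    print(f"{mnemonic}: {code} (hex {code:02X})")
```

With 16 mnemonics per row, the row number gives the high hexadecimal digit of a code and the column gives the low digit.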

Full Copyright Statement

Copyright © The Internet Society (1998). All Rights Reserved.

This document and translations of it may be copied and furnished to others, and derivative works that comment on or otherwise explain it or assist in its implementation may be prepared, copied, published and distributed, in whole or in part, without restriction of any kind, provided that the above copyright notice and this paragraph are included on all such copies and derivative works. However, this document itself may not be modified in any way, such as by removing the copyright notice or references to the Internet Society or other Internet organizations, except as needed for the purpose of developing Internet standards in which case the procedures for copyrights defined in the Internet Standards process must be followed, or as required to translate it into languages other than English.

The limited permissions granted above are perpetual and will not be revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.