Internationalization features
The Tru64 UNIX internationalization environment, tools, and localization
features enable you to develop and execute internationalized software
without re-engineering the user application.
The Tru64 UNIX internationalization features conform to The Open Group's
CAE specifications for system interfaces and headers (XSH Issue 5),
curses (XCURSES Issue 4.2), and commands and utilities (XCU Issue 5).
These specifications align with current POSIX and ISO C standards.
The operating system also supports the X Input and Output Methods (XIM
and XOM), with functions implemented according to the X11R6 specification.
Tru64 UNIX Version 5.1A conforms to the Chinese Character Input Standard
GB18030-2000, which went into effect on September 1, 2001.
Internationalization features are included as part of the base operating
system installation or with installation of the optional Worldwide
Language Support subsets. The following lists some of the more important
internationalization features offered with Tru64 UNIX. For a complete
description of features, see the Internationalization chapter of the
Tru64 UNIX Technical Overview. For a full description of the Tru64
UNIX implementation of these features, see Writing Software for the
International Market and the language-specific Technical Reference
Guides on the Tru64 UNIX Programming Documentation bookshelf.
- 32-bit wide character support
- XPG4 Worldwide Portability Interfaces (WPI)
- Multibyte Support Extensions (MSE) of the ISO C standard
  (ISO/IEC 9899:1990/Amendment 1:1994(E))
- Internationalized commands, including the asort utility for
  sorting text in ideographic languages
- Internationalized X/Open Curses library (libcurses)
- Iconv library (libiconv, an international codeset conversion
  library)
- Locale utilities, including locale creation
- Date, time, currency, and numeric formats in the native languages
- Character classification (isupper, islower, iscntrl, and the other
  is* functions) and support for Asian user-defined characters
- Collation: character sort order of the codeset
- Yes and No responses in the native language
- Fonts for supported character sets
- TTY driver support for various input functions for native
  languages, including support for BSD line disciplines and
  STREAMS terminal drivers for processing Asian languages
- Translated CDE and Motif user interface
- Keymaps for local keyboards
- Support for all language variants using the North American
  keyboard
- Input method support for Hebrew and Asian languages
- Printing in the native languages
Supported languages, locales, and character sets
Locales whose names include @ucs4 or UTF-8 support characters encoded
as defined by the ISO 10646 standard and the Unicode Standard. Characters
are processed internally using a 32-bit wchar_t data type. In addition,
for @ucs4 locales, data is in UCS-4 form (also known as UTF-32).
The first 256 values of ISO 10646 and the Unicode Standard contain
the same characters as are contained in the ISO 8859-1 (Latin-1) character
set. In all ISO 8859-1 locales, characters are zero-padded to 32 bits
in wchar_t and so are identical to the UCS-4 process code. ISO 8859-1
locales differ from UTF-8 locales with the same base name in terms
of data file encoding.
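The relationship between Latin-1 and UCS-4 described above can be checked with a short sketch. It is an illustration only: Python's "latin-1" and "utf-32-be" codecs stand in for the Tru64 UNIX locale machinery, big-endian byte order is chosen purely for readability, and the helper name latin1_to_ucs4_bytes is hypothetical.

```python
# Sketch: Latin-1 byte values equal the first 256 ISO 10646 code points,
# so widening each Latin-1 byte to 32 bits yields its UCS-4 value.

def latin1_to_ucs4_bytes(data: bytes) -> bytes:
    """Zero-pad each Latin-1 byte to a 4-byte big-endian UCS-4 unit."""
    return b"".join(b"\x00\x00\x00" + bytes([value]) for value in data)

all_latin1 = bytes(range(256))

# Assumption: Python's "utf-32-be" codec stands in for UCS-4 process code.
via_codec = all_latin1.decode("latin-1").encode("utf-32-be")
via_padding = latin1_to_ucs4_bytes(all_latin1)

# The process code matches, but the data file encoding differs:
# e-acute (U+00E9) is one byte in Latin-1 and two bytes in UTF-8.
e_acute_latin1 = "\u00e9".encode("latin-1")
e_acute_utf8 = "\u00e9".encode("utf-8")
```

Comparing via_codec with via_padding shows that zero-padding alone reproduces the UCS-4 form, while the two e-acute byte strings show where an ISO 8859-1 locale and a UTF-8 locale with the same base name diverge.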
UTF-8 and Latin-9 (ISO 8859-15) locales support the euro currency
symbol.
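As a rough illustration of why the codeset matters for the euro symbol, the following sketch encodes U+20AC in several codesets. Python's built-in codecs are used as stand-ins for the operating system's codesets, and the encodable helper is a hypothetical name introduced for this example.

```python
EURO = "\u20ac"  # EURO SIGN

def encodable(text: str, codec: str) -> bool:
    """Return True if every character in text exists in the codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

# Latin-9 (ISO 8859-15) assigns the euro to a single byte.
euro_latin9 = EURO.encode("iso8859-15")

# UTF-8 represents the euro as a three-byte sequence.
euro_utf8 = EURO.encode("utf-8")

# Latin-1 (ISO 8859-1) has no euro character at all.
euro_in_latin1 = encodable(EURO, "latin-1")
```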
To install all of these locales, you must install the optional Worldwide
Language Support subset as well as the base operating system.
Some locale names have one or more @modifier suffixes. A locale with
the suffix @ucs4 is for use by applications that require internal process
code to be in UCS-4 format. See the Unicode.5 reference page for more
information on UCS-4. The English locale names that include cp850 support
character encoding in PC code-page format.
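The naming pattern used by these locales (language, territory, codeset, and zero or more @modifier suffixes) can be taken apart with a small sketch. This is not a Tru64 UNIX interface: LocaleName, parse_locale_name, and wants_ucs4_process_code are hypothetical names, and the parser assumes the language_TERRITORY.codeset@modifier layout seen in the locale names in this document.

```python
from typing import NamedTuple, Tuple

class LocaleName(NamedTuple):
    language: str
    territory: str
    codeset: str
    modifiers: Tuple[str, ...]

def parse_locale_name(name: str) -> LocaleName:
    """Split a name such as zh_CN.dechanzi@pinyin@ucs4 into its parts."""
    base, *modifiers = name.split("@")
    lang_territory, _, codeset = base.partition(".")
    language, _, territory = lang_territory.partition("_")
    return LocaleName(language, territory, codeset, tuple(modifiers))

def wants_ucs4_process_code(name: str) -> bool:
    """True if the locale name carries the @ucs4 modifier."""
    return "ucs4" in parse_locale_name(name).modifiers

parsed = parse_locale_name("zh_CN.dechanzi@pinyin@ucs4")
# parsed.codeset is "dechanzi"; parsed.modifiers is ("pinyin", "ucs4")
```

A name without a suffix, such as en_US.cp850, parses to an empty modifier tuple.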
For the most up-to-date list of supported languages and locales, refer
to the l10n_intro(5) reference page.
European, Middle Eastern, and North American locales
| Catalan | ca_ES.ISO8859-1, ca_ES.ISO8859-15, ca_ES.UTF-8 |
| Czech | cs_CZ.ISO8859-2, cs_CZ.ISO8859-2@ucs4 |
| Danish | da_DK.ISO8859-1, da_DK.ISO8859-15, da_DK.UTF-8 |
| Dutch | nl_NL.ISO8859-1, nl_NL.ISO8859-15, nl_NL.UTF-8 |
| Belgian Dutch (Flemish) | nl_BE.ISO8859-1, nl_BE.ISO8859-15, nl_BE.UTF-8 |
| US English | en_US.ISO8859-1, en_US.ISO8859-15, en_US.cp850, en_US.UTF-8, en_US.UTF-8@euro |
| GB English | en_GB.ISO8859-1, en_GB.ISO8859-15, en_GB.UTF-8 |
| European English | en_EU.UTF-8@euro |
| Finnish | fi_FI.ISO8859-1, fi_FI.ISO8859-15, fi_FI.UTF-8 |
| Belgian French | fr_BE.ISO8859-1, fr_BE.ISO8859-15, fr_BE.UTF-8 |
| French | fr_FR.ISO8859-1, fr_FR.ISO8859-15, fr_FR.UTF-8 |
| Canadian French | fr_CA.ISO8859-1, fr_CA.ISO8859-15, fr_CA.UTF-8 |
| Swiss French | fr_CH.ISO8859-1, fr_CH.ISO8859-15, fr_CH.UTF-8 |
| German | de_DE.ISO8859-1, de_DE.ISO8859-15, de_DE.UTF-8 |
| Swiss German | de_CH.ISO8859-1, de_CH.ISO8859-15, de_CH.UTF-8 |
| Greek | el_GR.ISO8859-7, el_GR.ISO8859-7@ucs4, el_GR.UTF-8 |
| Hebrew | he_IL.ISO8859-8, he_IL.ISO8859-8@ucs4, iw_IL.ISO8859-8 |
| Hungarian | hu_HU.ISO8859-2, hu_HU.ISO8859-2@ucs4 |
| Icelandic | is_IS.ISO8859-1, is_IS.ISO8859-15, is_IS.UTF-8 |
| Italian | it_IT.ISO8859-1, it_IT.ISO8859-15, it_IT.UTF-8 |
| Lithuanian | lt_LT.ISO8859-4, lt_LT.ISO8859-4@ucs4 |
| Norwegian | no_NO.ISO8859-1, no_NO.ISO8859-15, no_NO.UTF-8 |
| Polish | pl_PL.ISO8859-2, pl_PL.ISO8859-2@ucs4 |
| Portuguese | pt_PT.ISO8859-1, pt_PT.ISO8859-15, pt_PT.UTF-8 |
| Russian | ru_RU.ISO8859-5, ru_RU.ISO8859-5@ucs4 |
| Slovak | sk_SK.ISO8859-2, sk_SK.ISO8859-2@ucs4 |
| Slovene | sl_SI.ISO8859-2, sl_SI.ISO8859-2@ucs4 |
| Spanish | es_ES.ISO8859-1, es_ES.ISO8859-15, es_ES.UTF-8 |
| Swedish | sv_SE.ISO8859-1, sv_SE.ISO8859-15, sv_SE.UTF-8 |
| Turkish | tr_TR.ISO8859-9, tr_TR.ISO8859-9@ucs4 |
Asian locales
| Simplified Chinese - PRC | zh_CN.dechanzi, zh_CN.dechanzi@ucs4, zh_CN.dechanzi@pinyin, zh_CN.dechanzi@pinyin@ucs4, zh_CN.dechanzi@radical, zh_CN.dechanzi@radical@ucs4, zh_CN.dechanzi@stroke, zh_CN.dechanzi@stroke@ucs4, zh_CN.UTF-8, zh_CN.GBK, zh_CN.GB18030 |
| Traditional Chinese - Hong Kong | zh_HK.big5, zh_HK.dechanyu, zh_HK.dechanyu@ucs4, zh_HK.dechanzi (Simplified), zh_HK.dechanzi@ucs4, zh_HK.eucTW, zh_HK.eucTW@ucs4, zh_HK.UTF-8 |
| Japanese | ja_JP.eucJP, ja_JP.SJIS, ja_JP.SJIS@ucs4, ja_JP.deckanji, ja_JP.deckanji@ucs4, ja_JP.sdeckanji, ja_JP.UTF-8 |
| Traditional Chinese - Taiwan | zh_TW.big5, zh_TW.big5@chuyin, zh_TW.big5@radical, zh_TW.big5@stroke, zh_TW.dechanyu, zh_TW.dechanyu@ucs4, zh_TW.dechanyu@chuyin, zh_TW.dechanyu@chuyin@ucs4, zh_TW.dechanyu@radical, zh_TW.dechanyu@radical@ucs4, zh_TW.dechanyu@stroke, zh_TW.dechanyu@stroke@ucs4, zh_TW.eucTW, zh_TW.eucTW@ucs4, zh_TW.eucTW@chuyin, zh_TW.eucTW@chuyin@ucs4, zh_TW.eucTW@radical, zh_TW.eucTW@radical@ucs4, zh_TW.eucTW@stroke, zh_TW.eucTW@stroke@ucs4, zh_TW.UTF-8 |
| Korean | ko_KR.deckorean, ko_KR.deckorean@ucs4, ko_KR.eucKR, ko_KR.KSC5601, ko_KR.UTF-8 |
| Thai | th_TH.TACTIS |
Standard and nonstandard codeset support
| ISO8859-1 | Western European Languages |
| ISO8859-2 | Eastern European Languages |
| ISO8859-3 | Other Latin Languages |
| ISO8859-4 | Northern European Languages |
| ISO8859-5 | Latin/Cyrillic |
| ISO8859-6 | Latin/Arabic |
| ISO8859-7 | Latin/Greek |
| ISO8859-8 | Latin/Hebrew |
| ISO8859-9 | Latin/Turkish |
| ISO8859-15 | Similar to ISO8859-1, but includes euro symbol support |
| SJIS | Shift JIS Kanji |
| big5 | Big 5 |
| TACTIS | Thai Industrial Standard |
| UTF-8 | Universal Code Set Transformation Format |
| DEC Hanzi | Simplified Chinese |
| DEC Hanyu | Traditional Chinese |
| GBK | Simplified Chinese, extension of GB2312-80 |
| GB18030 | Extension of GBK |
| Telecode | Traditional Chinese Telecode |
| DEC Kanji | Japanese |
| Super DEC Kanji | Japanese, extended |
| eucJP | Extended UNIX Code (Japanese) |
| eucTW | Extended UNIX Code (Taiwanese) |
| eucKR | Extended UNIX Code (Korean) |
| JIS7 | 7-bit JIS Kanji |
| jiskanji7 | 7-bit JIS Kanji |
| sbig5 | Shift Big 5 |
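The "extension of" relationships among the Simplified Chinese codesets in the table (GB2312-80, GBK, GB18030) can be illustrated with a brief sketch. Python's "gb2312", "gbk", and "gb18030" codecs are assumed here as stand-ins for the corresponding operating system codesets, and the encodable helper is a hypothetical name.

```python
def encodable(text: str, codec: str) -> bool:
    """Return True if every character in text exists in the codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False

hanzi = "\u6c49\u5b57"     # two common Simplified Chinese characters
traditional = "\u9f8d"     # a Traditional Chinese character
rare = "\U00020000"        # an ideograph outside the Basic Multilingual Plane

# GBK extends GB2312-80: it adds characters that GB2312-80 lacks.
gbk_adds_traditional = (not encodable(traditional, "gb2312")
                        and encodable(traditional, "gbk"))

# GB18030 extends GBK: text that GBK can encode keeps the same bytes...
same_bytes = hanzi.encode("gbk") == hanzi.encode("gb18030")

# ...and characters beyond GBK are reachable through four-byte sequences.
rare_gb18030 = rare.encode("gb18030")
```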
Unicode support
Tru64 UNIX supports the Unicode Standard, Version 3.0, and the
ISO 10646 standard through a set of UCS-4 and UTF-8 based locales.
Codeset conversion among the UCS-4 (UTF-32), UCS-2 (UTF-16), and
UTF-8 formats is provided for all supported codesets. Converters are
also provided between Unicode and a number of single-byte PC code
pages, and from those PC code pages to the ISO Latin codeset. For
more information on the Unicode locales, see the Unicode(5)
reference page.
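The kinds of conversion listed above can be sketched as follows. The example uses Python's codecs as an assumed stand-in for the operating system's codeset converters; the convert helper is hypothetical, and big-endian variants are chosen so that no byte order mark is emitted.

```python
def convert(data: bytes, from_code: str, to_code: str) -> bytes:
    """Decode data from one codeset and re-encode it in another."""
    return data.decode(from_code).encode(to_code)

# Four characters: Latin capital A, e-acute, euro sign, a CJK ideograph.
text = "A\u00e9\u20ac\u4e2d"

utf8 = text.encode("utf-8")                        # 1 + 2 + 3 + 3 bytes
ucs4 = convert(utf8, "utf-8", "utf-32-be")         # 4 bytes per character
utf16 = convert(ucs4, "utf-32-be", "utf-16-be")    # 2 bytes per character here
round_trip = convert(utf16, "utf-16-be", "utf-8")

# PC code page to ISO Latin-1: byte 0x82 in code page 850 is e-acute.
pc_to_latin1 = convert(b"\x82", "cp850", "latin-1")
```

The byte counts differ by format, but the round trip through all three Unicode forms returns the original UTF-8 data unchanged.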
Euro currency support
Tru64 UNIX supports the euro currency symbol. Locales that use the
UTF-8 or Latin-9 (ISO 8859-15) codesets support the euro character,
while locales with an @euro suffix define the local currency sign to
be the euro character. The en_EU.UTF-8@euro locale is an English
locale that provides the euro symbol, a comma as the decimal
separator, and a period as the thousands separator. Printer support
for the euro character is enabled by a generic PostScript print
filter (wwpsof). Keyboard entry of the euro character is supported by
key sequences defined in keymaps and through use of the Compose key.
In addition, codeset converters convert file data between the various
encoding formats that support the euro character. See the euro(5) and
wwpsof(8) reference pages for more information.
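The numeric conventions described for en_EU.UTF-8@euro (comma as decimal separator, period as thousands separator) can be approximated with a minimal sketch. It does not call the system locale database: format_euro is a hypothetical helper, and placing the symbol before the amount with two decimal places is an assumption made for illustration rather than a statement of what the locale defines.

```python
def format_euro(amount: float, symbol: str = "\u20ac") -> str:
    """Format a non-negative amount with '.' for thousands and ',' for decimals."""
    # Format with US-style separators first, for example 1,234,567.50 ...
    us_style = f"{amount:,.2f}"
    # ... then swap the two separator characters in a single pass.
    swapped = us_style.translate(str.maketrans(",.", ".,"))
    # Assumption: the currency symbol precedes the amount.
    return f"{symbol}{swapped}"

price = format_euro(1234567.5)
```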
Japanese input system for Tru64 UNIX
Tru64 UNIX provides a Japanese input system called 'dxjim' as part of
the operating system. It is an essential component for entering
Japanese characters (Kana and Kanji). The 'dxjim' input system works
only on Tru64 UNIX and Japanese ULTRIX. For users who are accustomed
to PC-type input systems and want a similar Japanese input system on
Tru64 UNIX, two additional input systems are provided: WX3 and VJE
for Tru64 UNIX. These products were developed by independent software
vendors in Japan (A.I.Soft Inc. and VACS Corp., respectively) and are
popular in the PC world.