- Skip to content
- Accessibility Policy
Frequently Asked Questions
Open all Close all
-
NLS_LANG Parameter Fundamentals
A locale is a set of information addressing linguistic and cultural requirements that corresponds to a given language and country. Traditionally, the data associated with a locale provides support for formatting and parsing of dates, times, numbers, and currencies, etc. Providing current and correct locale data has historically been the responsibility of each platform owner or vendor, leading to inconsistencies and errors in locale data.
Setting the NLS_LANG environment parameter is the simplest way to specify locale behavior for Oracle software. It sets the language and territory used by the client application and the database server. It also indicates the client’s character set, which corresponds to the character set for data to be entered or displayed by a client program.
NLS_LANG is set as a local environment variable on UNIX platforms. NLS_LANG is set in the registry on Windows platforms.
The NLS_LANG parameter has three components: language, territory, and character set. Specify it in the following format, including the punctuation:
NLS_LANG = language_territory.charsetEach component of the NLS_LANG parameter controls the operation of a subset of globalization support features:
Language
Specifies conventions such as the language used for Oracle messages, sorting, day names, and month names. Each supported language has a unique name; for example, AMERICAN , FRENCH , or GERMAN . The language argument specifies default values for the territory and character set arguments. If the language is not specified, then the value defaults to AMERICANTerritory
Specifies conventions such as the default date, monetary, and numeric formats. Each supported territory has a unique name; for example, AMERICA , FRANCE , or CANADA . If the territory is not specified, then the value is derived from the language value.Charset
Specifies the character set used by the client application (normally the Oracle character set that corresponds to the user’s terminal character set or the OS character set). Each supported character set has a unique acronym, for example, US7ASCII , WE8ISO8859P1 , WE8DEC , WE8MSWIN1252 , or JA16EUC . Each language has a default character set associated with it.Note:
All components of the
definition are optional; any item that is not specified uses its default value. If you specify territory or character set, then you
NLS_LANG
must include the preceding delimiter [underscore (_) for territory, period (.) for character set]. Otherwise, the value is parsed as a language name.For example, to set only the territory portion of
, use the following format:
NLS_LANG
NLS_LANG=_JAPAN
The remainder of this document will focus on the charset component of the NLS_LANG setting, as it is the least understood and most important piece to set correctly.
-
Common NLS_LANG Myths
- Setting the NLS_LANG to the character set of the database MAY be correct but IS often not correct. DO NOT assume that NLS_LANG needs to be the same as the database character set. THIS IS OFTEN NOT TRUE.
- The character set defined with the NLS_LANG parameter does NOT CHANGE your client’s character set. It is used to let Oracle know what character set you are USING on the client side, so Oracle can do the proper conversion. You cannot change the character set of your client by using a different NLS_LANG!
- If you don’t set the NLS_LANG on the client it uses the NLS_LANG of the server. This is also NOT true! For example, if the Oracle Installer does not populate NLS_LANG , and it is not otherwise set then its value by default is A MERICAN_AMERICA.US7ASCII . The language is AMERICAN , the territory is AMERICA , and the character set is US7ASCII.
- Setting the LANGUAGE and TERRITORY parameters of NLS_LANG has nothing to do with the ability to store characters in a database. A NLS_LANG set to JAPANESE_JAPAN.WE8MSWIN1252 will not allow you to store Japanese, as WE8MSWIN1252 doesn’t support Japanese characters. However a NLS_LANG set to AMERICAN_AMERICA.JA16SJIS will allow you to store Japanese providing the input data is truly JA16SJIS and if the database is also in a character set that can store Japanese like UTF8 or JA16SJIS)
-
Checking the current NLS_LANG Setting
In many cases the NLS_LANG has been already set during the Oracle install or thereafter manually. To be sure you can use these methods to get back the value of NLS_LANG for SQL*Plus:
On UNIX:
SQL> HOST ECHO $NLS_LANG
This returns the value of the parameter.
On Windows:
On Windows you have two possible options, normally the NLS_LANG is set in the registry, but it can also be set in the environment, however this is not often done. The value in the environment takes precedence over the value in the registry and is used for ALL Oracle_Homes on the server. Also note that any USER environment variable takes precedence over any SYSTEM environment variable (this is Windows behavior, and has nothing to do with Oracle) if setTo check if it’s set in the environment:
SQL> HOST ECHO %NLS_LANG%
If this reports just %NLS_LANG% back, the variable is not set in the environment.
If it’s set it reports something like
ENGLISH_UNITED KINGDOM.WE8ISO8859P1
If NLS_LANG is not set in the environment, check the value in the registry:SQL>@.[%NLS_LANG%].
If you get something like:
Unable to open file.[ENGLISH_UNITED KINGDOM.WE8ISO8859P1].
The «file name» between the braces is the value of the registry parameter.If you get this as result:
Unable to open file «.[%NLS_LANG%].» then the parameter NLS_LANG is also not set in the registry.Note the @.[%NLS_LANG%]. technique reports the NLS_LANG known by the SQL*Plus executable, it will not read the registry itself. But if you run the HOST command first and the NLS_LANG is not set in the environment then you can be sure the variable is set in the registry if the @.[%NLS_LANG%]. returns a valid value.
All other NLS parameters can be retrieved by a:
SELECT * FROM NLS_SESSION_PARAMETERS;
Note:
SELECT USERENV (‘language’) FROM DUAL; gives the session’s <Language>_<territory> but the DATABASE character set not the client, so the value returned is not the client’s complete NLS_LANG setting!
-
The Priority of NLS Parameters related to NLS_LANG
This section explains the order in which NLS parameters are taken into account in the database client/server model. (This does NOT cover Thin JDBC connections)
There are 3 levels at which you can set NLS parameters: Database, Instance and Session. If a parameter is defined at more than one level then the rules on which one takes precedence are quite straightforward:
- 1. NLS database settings are superseded by NLS instance settings
- 2. NLS database & NLS instance settings are superseded by NLS session settings
-
Session Parameters
SELECT * from NLS_SESSION_PARAMETERS;
These are the settings used for the current SQL session.
These reflect (in this order):
- 1) The values of NLS parameters set by «ALTER SESSION «
ALTER SESSION set NLS_DATE_FORMAT = ‘DD/MM/YYYY’; - 2) If there is no explicit «ALTER SESSION » statement done then it reflects the setting of the corresponding NLS parameter on the client derived from the NLS_LANG variable.
- 3) If NLS_LANG is specified with only the part then AMERICAN is used as default .
So if you set NLS_LANG=_BELGIUM. WE8MSWIN1252 then you get this:
PARAMETER VALUE
NLS_LANGUAGE AMERICAN
NLS_TERRITORY BELGIUM
NLS_CURRENCY <euro sign here>
NLS_ISO_CURRENCY BELGIUMNote:
The difference between
NLS_LANG=_BELGIUM.WE8MSWIN1252 (correct) andNLS_LANG=BELGIUM.WE8MSWIN1252 (incorrect), you need to set the «_» as separator.
- 4) If NLS_LANG is specified with only the part then the defaults to a setting based on .
So if you set NLS_LANG=ITALIAN_.WE8MSWIN1252 then you get this:
PARAMETER VALUE
NLS_LANGUAGE ITALIAN
NLS_TERRITORY ITALY
NLS_CURRENCY <euro sign here>
NLS_ISO_CURRENCY ITALYNote:
Note the difference between
NLS_LANG=ITALIAN_.WE8MSWIN1252 (correct) andNLS_LANG=ITALIAN.WE8MSWIN1252 (incorrect), you need to set the «_» as separator.
- 5) If NLS_LANG is specified without the _ part then the _ part defaults to AMERICAN_AMERICA.
So if you set NLS_LANG=.WE8MSWIN1252 then you get this:
PARAMETER VALUE
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CURRENCY $
NLS_ISO_CURRENCY AMERICANote: The difference between NLS_LANG=.WE8MSWIN1252 (correct) and NLS_LANG=WE8MSWIN1252 (incorrect), you need to set the "." as separator.
- 6) If the NLS_LANG is set (either like in point 3, 4 or 5) then parameters like
NLS_SORT, NLS_DATE_FORMAT, etc. can be set as a «standalone» setting and will overrule the defaults derived from NLS_LANG _ part.
So if you set NLS_LANG=AMERICAN_AMERICA.WE8MSWIN1252 and NLS_ISO_CURRENCY=FRANCE then you get this:
PARAMETER VALUE
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CURRENCY $
NLS_ISO_CURRENCY FRANCEDefaults:
* If NLS_DATE_LANGUAGE or NLS_SORT are not set then they are derived from
NLS_LANGUAGE.
* If NLS_CURRENCY, NLS_DUAL_CURRENCY, NLS_ISO_CURRENCY, NLS_DATE_FORMAT, NLS_TIMESTAMP_FORMAT, NLS_TIMESTAMP_TZ_FORMAT, NLS_NUMERIC_CHARACTERS are not set then they are derived from NLS_TERRITORY - 7) If the NLS_LANG is not set at all, then it defaults to
<Language>_<Territory>.US7ASCII and the values for the
<Language>_<Territory> part used are the ones found inNLS_INSTANCE_PARAMETERS. Parameters like NLS_SORT defined as «standalone» on the client side are ignored.
Note: * If set, client parameters (NLS_SESSION_PARAMETERS) always take precedence over NLS_INSTANCE_PARAMETERS and NLS_DATABASE_PARAMETERS. * This behavior cannot be disabled on/from the server, so a parameter set on the client always has precedence above an instance or database parameter. * NLS_LANG cannot be changed by ALTER SESSION, NLS_LANGUAGE and NLS_TERRITORY can. However NLS_LANGUAGE and /or NLS_TERRITORY cannot be set as "standalone" parameters in the environment or registry on the client. * NLS_SESSION_PARAMETERS is NOT visible for other sessions. If you need to trace this then you have to use a logon trigger to create your own logging table (based on session parameters) * The <clients characterset> part of NLS_LANG is NOT shown in any system table or view. * On Windows you have two possible options, normally the NLS_LANG is set in the registry, but it can also be set in the environment, however this is not often done and generally not recommended to do so. The value in the environment takes precedence over the value in the registry and is used for ALL Oracle_Homes on the server if defined as a system environment variable. * NLS_LANGUAGE in the session parameters also declares the language for the client error messages. * You cannot "set" a NLS parameter in an SQL script; you need to use ALTER SESSION.
- 1) The values of NLS parameters set by «ALTER SESSION «
-
Instance Parameters
SELECT * from NLS_INSTANCE_PARAMETERS;
These are the settings in the init.ora of the database at the moment that the database was started or set through ALTER SYSTEM.
If the parameter is not explicitly set in the init.ora or defined by ALTER SYSTEM then its value is NOT derived from a «higher» parameter (we are talking about parameters like NLS_SORT that derive a default from NLS_LANGUAGE in NLS_SESSION_PARAMETERS, this is NOT the case for NLS_INSTANCE_PARAMETERS)
Note: * NLS_LANG is not an init.ora parameter; NLS_LANGUAGE and NLS_TERRITORY are so you need to set NLS_LANGUAGE and NLS_TERRITORY separately. * You cannot define the or NLS_LANG in the init.ora The client characterset is defined by the NLS_LANG on the client OS (see above). * You cannot define the database characterset in the init.ora. The database characterset is defined by the "Create Database" command. * These settings take precedence above the NLS_DATABASE_PARAMETERS. * These values are used for the NLS_SESSION_PARAMETERS if the client the NLS_LANG is NOT set. * Oracle strongly recommends that you set the NLS_LANG on the client at least to NLS_LANG=.<clients characterset> * The NLS_LANGUAGE in the instance parameters also declares the language for the server error messages in alert.log and in trace files.
-
Database Parameters
SELECT * from NLS_DATABASE_PARAMETERS;
Defaults to AMERICAN_AMERICA if there are no parameters explicitly set in the init.ora during database creation time. If there is parameters set in the init.ora during database creation you see them here. There is no way to change these after the database creation. Do NOT attempt to update system tables to bypass these settings! These settings are used to give the database a default if the INSTANCE and SESSION parameters are not set.
Note:
* NLS_LANG is not an init.ora parameter, NLS_LANGUAGE and NLS_TERRITORY are.
So you need to set NLS_LANGUAGE and NLS_TERRITORY separately.
* These parameters are overridden by NLS_INSTANCE_PARAMETERS and NLS_SESSION_PARAMETERS.
* You cannot define the <clients character set> or NLS_LANG in the init.ora. The client character set is defined by the NLS_LANG on the client OS.
* You cannot define the database character set in the init.ora.
The database (national) character set NLS_(NCHAR)_CHARACTERSET) is defined by the «Create Database» command.
* The NLS_CHARACTERSET and NLS_NCHAR_CHARACTERSET parameters cannot be overridden by instance or session parameters.
They are defined by the value specified in the «CREATE DATABASE command and are not intended to be changed afterwards dynamically. Do NOT update system tables to change the character set. This can corrupt your database and potentially make it impossible to open the database again.
* Setting the NLS_LANG during the creation of the database does not influence the NLS_DATABASE_PARAMETERS.
* The NLS_LANG set during the database creation has NO impact on the database National Characterset.
Additional SELECT statements:
A) SELECT name,value$ from sys.props$ where name like ‘%NLS%’;
This gives the same info as NLS_DATABASE_PARAMETERS.
You should use NLS_DATABASE_PARAMETERS instead of props$.
Note the UPPERCASE ‘%NLS%’B) SELECT * from v$nls_parameters;
This view shows the current session parameters and the *DATABASE* characterset as seen in the NLS_DATABASE_PARAMETERS view.C) SELECT name,value from v$parameter where name like ‘%NLS%’;
This view gives the same information as NLS_INSTANCE_PARAMETERS.
Note the LOWERCASE ‘%NLS%’D) SELECT userenv (‘language’) from dual;
and
SELECT sys_context(‘userenv’,’language’) from dual;
Both these SELECT statements give the session’s _ and the
DATABASE character set. The database character set is not the same as the character set of the NLS_LANG that you started this connection with! So don’t be fooled, although the output of this query looks like the value of a NLS_LANG variable, it is NOT.E) SELECT userenv (‘lang’) from dual;
This SELECT gives the short code that Oracle uses for the Language defined by NLS_LANGUAGE setting for this session. If NLS_LANGUAGE is set to French then this will return «F», if NLS_LANGUAGE is set to English then this will return «GB»
If NLS_LANGUAGE is set to American then this will return «US», and so on…F) SHOW parameter NLS%
This will give the same as the NLS_INSTANCE_PARAMETERS -
An example of a wrong NLS_LANG setup
A database is created on a UNIX system with the US7ASCII character set. A Windows client connecting to the database works with the WE8MSWIN1252 character set (regional settings -> Western Europe /ACP 1252) and the DBA, use the UNIX shell (ROMAN8) to work on the database. The NLS_LANG is set to american_america.US7ASCII on the clients and the server.
Note: This is an INCORRECT setup to explain character set conversion, don't use it in your environment!
A very important point (as mentioned before):
When the client NLS_LANG character set is set to the same value as the database character set, Oracle assumes that the data being sent or received are of the same (correct) encoding, so no conversions or validations may occur for performance reasons. The data is just stored as delivered by the client, bit by bit.
From Windows insert an ‘é’ (LATIN SMALL LETTER E WITH ACUTE) into a table NLS_TEST containing one column ‘TEST’ of the type ‘char’.
As long as you insert into and select from the column on Windows with the WE8MSWIN1252 character set everything runs smoothly. No conversion is done and 8 bits are inserted and read back, even if the character set of the database is defined as 7 bits. This happens because a byte is 8 bits and Oracle is ALWAYS using 8 bits even with a 7 bit character set. In a correct setup the most Significant Bit is just not used and only 7 bits are taken into account.
For one reason or another you need to insert from the UNIX server. When you SELECT from tables where data is inserted by the Windows clients you get a ‘Ò’ (LATIN CAPITAL LETTER O WITH TILDE) instead of the ‘é’.
If you insert ‘é’ on the UNIX server and you SELECT the row at the Windows client you get an ‘Å’ (LATIN CAPITAL LETTER A WITH RING ABOVE) back.
Bottom line is that you have INCORRECT data in the database. You store the numeric value for ‘é’ of the WE8MSWIN1252 character set in the database but you tell Oracle this is US7ASCII data, so Oracle is NOT converting anything and just stores the numeric value (again: Oracle thinks that the client is giving US7ASCII codes because the NLS_LANG is set to US7ASCII, and the database character set is also US7ASCII -> no conversion done).
When you SELECT the same column back on the UNIX server, Oracle is again expecting that the value is correct and passes the value to the UNIX terminal without any conversion.
Now the problem is that in the WE8MSWIN1252 character set the ‘é’ has the hexadecimal value ‘E9’and in the Roman8 character set the hexadecimal value for ‘é’ is ‘C5’. Oracle just passes the value stored in the database (‘E9’) to the UNIX terminal, and the UNIX terminal thinks this is the letter ‘?’ because in its (Roman8) character set the hexadecimal value ‘E9’ is representing the letter ‘Ò’. So instead of the ‘é’ you get ‘Ò’ on the UNIX terminal screen.
The inverse (the insert on the UNIX and the SELECT on the Windows client) is the same story, but you get other results.
The solution is creating the database with a character set that contains ‘é’
(WE8MSWIN1252, WE8ISO89859P1, UTF-8, etc.) and setting the NLS_LANG on the client to WE8MSWIN1252 and on the server to WE8ROMAN8. If you then insert an ‘é’ on both sides, you will get an ‘é’ back regardless of where you SELECT them. Oracle knows then that a hexadecimal value of ‘C5’ inserted by the UNIX and an ‘E9’ from a WE8MSWIN1252 client are both ‘é’ and inserts ‘é’ into the database (the code in the database depends on the character set you have chosen).
You don’t have to switch between UNIX, Windows or other OS clients to run into this kind of problem. The same problem appears if you add Windows clients that are using another character set and have an incorrect NLS_LANG set.
-
How to setup the NLS_LANG Properly for UNIX
To specify the locale behavior of your client Oracle software, you have to set your NLS_LANG parameter. It sets the language, territory and also the character set of your client. You need to check the locale environment settings to set your NLS_LANG 3rd field (character set) in accordance with it. To do this, use the «locale» command like this:
$ locale
Example of output: LANG=fr_FR LC_CTYPE="fr_FR.iso885915@euro" LC_COLLATE="fr_FR.iso885915@euro" LC_MONETARY="fr_FR.iso885915@euro" LC_NUMERIC="fr_FR.iso885915@euro" LC_TIME="fr_FR.iso885915@euro" LC_MESSAGES="fr_FR.iso885915@euro" LC_ALL=fr_FR.iso885915@euro
The output of this command is not exactly the same on all the Unix environments. On some platforms, it can be useful to use the following syntax to have more details about the codepage really used:
$ locale LC_CTYPE | head
Example of output in a HP-UX environment : "" "" "iso885915" "" Example of output in a Linux environment: upper;lower;alpha;digit;xdigit;space;print;graph;blank;cntrl; punct;alnum; combining;combining_level3 toupper;tolower;totitle 16 1 ISO-8859-15 70 84 1 0
In these cases, the NLS_LANG 3rd field should be set to WE8ISO8859P15. On Solaris, AIX, TRU64, this syntax doesn’t give interesting complementary information. To find more details about these settings:
On Solaris, look in /usr/lib/locale
On AIX, look in /usr/lib/nls/README
On TRU64, look in /usr/lib/nls
On HP-UX, look in /usr/lib/nls/config
On Linux, look in /usr/share/locale/locale.aliasTo set a chosen value for these «locale» settings, it’s needed to know which values are available. To know that, use the following syntax:
$ locale -a
Then, when you have chosen a value, for example UTF-8 on Linux, you can set it like this:
$ export LC_ALL=UTF-8
or
% setenv LC_ALL UTF-8
Example of output after the setenv: $ locale LANG=fr_FR LC_CTYPE="UTF-8" LC_NUMERIC="UTF-8" LC_TIME="UTF-8" LC_COLLATE="UTF-8" LC_MONETARY="UTF-8" LC_MESSAGES="UTF-8" LC_PAPER="UTF-8" LC_NAME="UTF-8" LC_ADDRESS="UTF-8" LC_TELEPHONE="UTF-8" LC_MEASUREMENT="UTF-8" LC_IDENTIFICATION="UTF-8" LC_ALL=UTF-8 $ locale LC_CTYPE | head upper;lower;alpha;digit;xdigit;space;print; graph;blank;cntrl;punct;alnum;combining;combining_level3 toupper;tolower;totitle 16 6 UTF-8 70 84 1 0 1
In this case, the 3rd field (character set) of NLS_LANG should be set to UTF8.
% setenv NLS_LANG American_America.UTF8
-
How to setup the NLS_LANG Properly for Windows and DOS Code Pages
On Windows systems, the encoding scheme (character set) is specified by a code page. Code pages are defined to support specific languages or groups of languages, which share common writing systems. From Oracle point of view the terms code page and character set mean the same. Note that in non Chinese-Japanese-Korean environments, the Windows GUI and DOS command prompt do not use the same code page.
As a result Windows uses 2 different character sets for the ANSI (sqlplusw.exe) and the OEM (dos box — sqlplus.exe) environments.
-
Where to set the NLS_LANG in Windows
In the Registry:
On Windows systems, you should make sure that you have set an NLS_LANG registry subkey for each of your Oracle Homes:
You can easily modify this subkey with the Windows Registry Editor:
Start -> Run…
Type «regedit», and click «ok»
Edit the following registry entry:For Oracle version 7:
HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE
For Oracle Database versions 8, 8i and 9i:
HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\HOMEx\
where «x» is the unique number identifying the Oracle home.
HOME0 is the first installation
For Oracle Database 10g:
HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\KEY_
There you have an entry with name NLS_LANG
When starting an Oracle tools, like SQL*Plusw, it will read the content of the oracle.key file located in the same directory to determine which registry tree will be used, therefore which NLS_LANG subkey will be used.
Note:
Some people are confused by finding a NLS_LANG set to «NA» in HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE when no version 7 was installed. This is used for backwards compatibility, and can be ignored.
As a System or User Environment Variable, in System properties:
Although the Registry is the primary repository for settings on Windows, it is not the only place where parameters can be set. Even if not at all recommended, you can set the NLS_LANG as a System or User Environment Variable in the System properties.
This setting will be used for ALL Oracle homes.
To check and modify them:
Right-click the ‘My Computericon -> ‘Properties’
Select the ‘Advanced Tab -> Click on ‘Environment Variables’
The ‘User Variables list contains the settings for the specific OS user currently logged on and the ‘System variables system-wide variables for all users.
Since these environment variables take precedence over the parameters already set in your Registry, you should not set Oracle parameters at this location unless you have a very good reason.
As an Environment variable defined in the command prompt:
Before using an Oracle command line tool you need to MANUALLY SET the NLS_LANG parameter. In an MS-DOS command prompt, use the set command, for example:
C:\> set NLS_LANG=american_america.WE8PC850
-
Determine your Windows ANSI code page
Now that you know what the NLS_LANG is currently set to you can check to see if it properly agrees with the current ANSI code page. The ACP (ANSI Code Page) is defined by the «default locale» setting of Windows, so if you have a UK Windows 2000 client and you want to input Cyrillic (Russian) you need to change the ACP (by changing the «default locale») in order to be able to input Russian.
You’ll find its value in the registry:
Start -> Run…
Type «regedit», and click «ok»
Browse the following registry entry:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\NLS\CodePage\
There you have (all the way down) an entry with as name ACP. The value of ACP is your current GUI Codepage, for the mapping to the Oracle name. Since there are many registry entries with very similar names, please make sure that you are looking at the right place in the registry.
Additionally, the following URL provides a list of the default code pages for all Windows versions:
http://www.microsoft.com/globaldev/reference/
(under the REFERENCE tab on the left of the page)OEM = the command line codepage, ANSI = the GUI codepage
Note that the Honk Kong HKSCS is listed here:
http://www.microsoft.com/hk/hkscs/
Find the correspondent Oracle client character set:
Find the Oracle client character set in the table below based on the ACP you found above. Note that there is only ONE CORRECT value for a given ACP.
ANSI CodePage (ACP) Oracle Client character set (3rd part of NLS_LANG) 1250
EE8MSWIN1250
1251
CL8MSWIN1251
1252
WE8MSWIN1252
1253
EL8MSWIN1253
1254
TR8MSWIN1254
1255
IW8MSWIN1255
1256
AR8MSWIN1256
1257
BLT8MSWIN1257
1258
VN8MSWIN1258
874
TH8TISASCII
932
JA16SJIS
936
ZHS16GBK
949
KO16MSWIN949
950
ZHT16MSWIN950 — except for Hong Kong (see below)
This is the character set used by the GUI SQL*Plus (sqlplusW.exe/ plus80W.exe / plus33W.exe ) that you start through the Windows start menu. Please note the difference between the GUI SQL*Plus and the «DOS mode» SQL*Plus.
You can use UTF8 as Oracle client character set (=NLS_LANG) on Windows NT, 2000 and XP but you will be limited to use only client programs that explicitly support this configuration. This is because the user interface of Win32 is not UTF8, therefore the client program have to perform explicit conversions between UTF8 (used on Oracle side) and UTF16 (used on Win32 side).
Set it in your Registry:
Use the Windows Registry Editor to set up the NLS_LANG in your Oracle Home with the value you have just found above.
-
The correct NLS_LANG for Windows Command Line Operations
MS-DOS mode uses, with a few exceptions like CJK, (Japanese, Korean, Simplified Chinese, and Traditional Chinese) a different code page (called OEM code page) than Windows GUI (ANSI code page). Meaning that before using an Oracle command line tool such as SQL*Plus (sqlplus.exe/ plus80.exe / plus33.exe ) en svrmgrl in a command prompt then you need to MANUALLY SET the NLS_LANG parameter as an environment variable with the set DOS command BEFORE using the tool.
For Japanese, Korean, Simplified Chinese, and Traditional Chinese, the MS-DOS OEM code page (CJK) is identical to the ANSI code page meaning that, in this particular case, there is no need to set the NLS_LANG parameter in MS-DOS mode.
In all other cases, you need to set it in order to overwrite the NLS_LANG registry key already matching the ANSI code page. The new «MS-DOS dedicated» NLS_LANG needs to match the MS-DOS OEM code page that could be retrieved by typing chcp in a Command Prompt:
C:\> chcp
Active code page: 437
C:\> set NLS_LANG=american_america.US8PC437
If the NLS_LANG parameter for the MS-DOS mode session is not set appropriately, error messages and data can be corrupted due to incorrect character set conversion.
Use the following list to find the Oracle character set that fits to your MS-DOS code page in use on your locale system:
MS-DOS codepage Oracle Client character set (3rd part of NLS_LANG) 437
US8PC437
737
EL8PC737
850
WE8PC850
852
EE8PC852
857
TR8PC857
858
WE8PC858
861
IS8PC861
862
IW8PC1507
865
N8PC865
866
RU8PC866
-
List of common NLS_LANG settings used in the Windows Registry:
Note: this is the correct setting for the GUI SQL*Plus version, (sqlplusW.exe/ plus80W.exe / plus33W.exe )
if you are testing with «special» characters please DO use the GUI and not the «DOS box» sqlplus.exe !
Operating System Locale NLS_LANG Value Arabic (U.A.E.)
ARABIC_UNITED ARAB EMIRATES.AR8MSWIN1256
Bulgarian
BULGARIAN_BULGARIA.CL8MSWIN1251
Catalan
CATALAN_CATALONIA.WE8MSWIN1252
Chinese (PRC)
SIMPLIFIED CHINESE_CHINA.ZHS16GBK
Chinese (Taiwan)
TRADITIONAL CHINESE_TAIWAN.ZHT16MSWIN950
Chinese (Hong Kong HKCS)
TRADITIONAL CHINESE_HONG KONG.ZHT16HKSCS
Chinese (Hong Kong HKCS2001)
TRADITIONAL CHINESE_HONG KONG.ZHT16HKSCS2001 (new in 10gR1)
Croatian
CROATIAN_CROATIA.EE8MSWIN1250
Czech
CZECH_CZECH REPUBLIC.EE8MSWIN1250
Danish
DANISH_DENMARK.WE8MSWIN1252
Dutch (Netherlands)
DUTCH_THE NETHERLANDS.WE8MSWIN1252
Dutch (Belgium)
DUTCH_BELGIUM.WE8MSWIN1252
English (United Kingdom)
ENGLISH_UNITED KINGDOM.WE8MSWIN1252
English (United States)
AMERICAN_AMERICA.WE8MSWIN1252
Estonian
ESTONIAN_ESTONIA.BLT8MSWIN1257
Finnish
FINNISH_FINLAND.WE8MSWIN1252
French (Canada)
CANADIAN FRENCH_CANADA.WE8MSWIN1252
French (France)
FRENCH_FRANCE.WE8MSWIN1252
German (Germany)
GERMAN_GERMANY.WE8MSWIN1252
Greek
GREEK_GREECE.EL8MSWIN1253
Hebrew
HEBREW_ISRAEL.IW8MSWIN1255
Hungarian
HUNGARIAN_HUNGARY.EE8MSWIN1250
Icelandic
ICELANDIC_ICELAND.WE8MSWIN1252
Indonesian
INDONESIAN_INDONESIA.WE8MSWIN1252
Italian (Italy)
ITALIAN_ITALY.WE8MSWIN1252
Japanese
JAPANESE_JAPAN.JA16SJIS
Korean
KOREAN_KOREA.KO16MSWIN949
Latvian
LATVIAN_LATVIA.BLT8MSWIN1257
Lithuanian
LITHUANIAN_LITHUANIA.BLT8MSWIN1257
Norwegian
NORWEGIAN_NORWAY.WE8MSWIN1252
Polish
POLISH_POLAND.EE8MSWIN1250
Portuguese (Brazil)
BRAZILIAN PORTUGUESE_BRAZIL.WE8MSWIN1252
Portuguese (Portugal)
PORTUGUESE_PORTUGAL.WE8MSWIN1252
Romanian
ROMANIAN_ROMANIA.EE8MSWIN1250
Russian
RUSSIAN_CIS.CL8MSWIN1251
Slovak
SLOVAK_SLOVAKIA.EE8MSWIN1250
Spanish (Spain)
SPANISH_SPAIN.WE8MSWIN1252
Swedish
SWEDISH_SWEDEN.WE8MSWIN1252
Thai
THAI_THAILAND.TH8TISASCII
Spanish (Mexico)
MEXICAN SPANISH_MEXICO.WE8MSWIN1252
Spanish (Venezuela)
LATIN AMERICAN SPANISH_VENEZUELA.WE8MSWIN1252
Turkish
TURKISH_TURKEY.TR8MSWIN1254
Ukrainian
UKRAINIAN_UKRAINE.CL8MSWIN1251
Vietnamese
VIETNAMESE_VIETNAM.VN8MSWIN1258
-
List of common NLS_LANG settings used in the Command Prompt (DOS box)
Note: this is the correct setting for the DOS BOX SQL*Plus version, (sqlplus.exe/ plus80.exe / plus33.exe )
Operating System Locale Oracle Client character set (3rd part of NLS_LANG) Arabic
AR8ASMO8X
Catalan
WE8PC850
Chinese (PRC)
ZHS16GBK
Chinese (Taiwan)
ZHT16MSWIN950
Czech
EE8PC852
Danish
WE8PC850
Dutch
WE8PC850
English (United Kingdom)
WE8PC850
English (United States)
US8PC437
Finnish
WE8PC850
French
WE8PC850
German
WE8PC850
Greek
EL8PC737
Hebrew
IW8PC1507
Hungarian
EE8PC852
Italian
WE8PC850
Japanese
JA16SJIS
Korean
KO16MSWIN949
Norwegian
WE8PC850
Polish
EE8PC852
Portuguese
WE8PC850
Romanian
EE8PC852
Russian
RU8PC866
Slovak
EE8PC852
Slovenian
EE8PC852
Spanish
WE8PC850
Swedish
WE8PC850
Turkish
TR8PC857
-
Other Frequently asked questions regarding NLS_LANG
What does the LANGUAGE component of the NLS_LANG parameter control?
The language component of the NLS_LANG parameter controls the operation of a subset of globalization support features. It specifies conventions such as the language used for Oracle messages, sorting, day names, and month names. Each supported language has a unique name; for example, AMERICAN , FRENCH , or GERMAN . The language argument specifies default values for the territory and character set arguments. If the language is not specified, then the value defaults to AMERICAN .
What does the TERRITORY component of the NLS_LANG parameter control?
The territory component of the NLS_LANG parameter controls the operation of a subset of globalization support features. It specifies conventions such as the default date, monetary, and numeric formats. Each supported territory has a unique name; for example, AMERICA , FRANCE , or CANADA . If the territory is not specified, then the value is derived from the language value.
-
How to see what’s really stored in the database?
To find the real numeric value for a character stored in the database use the dump command:
The syntax of the function call is:
DUMP( <value> [, <format> [, <offset> [, <length> ] ] ] )
where:
value — is the value to be displayed
format — is a number which describes the format in which bytes of the value are to be displayed: 8 — means octal, 10 — means decimal, 16 — means hexadecimal; other values between 0 and 16 mean decimal; values greater then 16 are a little confusing and mean: print bytes as ASCII characters if they correspond to printable ASCII codes, print them as «^x» if they correspond to ASCII control codes and print them in hexadecimal otherwise; adding 1000 to the format number will add character set information for the character data type values to the return value offset — is the offset of the first byte of the value to display; negative values mean counting from the end length — is the number of bytes to display. So for example,
SQL> SELECT DUMP(col,1016)FROM table;
Typ=1 Len=39 CharacterSet=UTF8: 227,131,143,227,131,170
returns the value of a column consisting of 3 Japanese characters in UTF8 encoding . For example the 1st char is 227(*255)+131. You will probably need to convert this to UCS2 to verify the codepoint value with the Unicode Standard codepage.
Where is the Character Conversion Done?
Normally conversion is done at client side for performance reasons. This is true from Version 8.0.4 onwards. If the database is using a character set not known by the client then the conversion is done at server side. This is true from Version 8.1.6 onwards.
Windows SQL*Plus is not showing all my extended characters?
You see black squares instead of the characters you probably don’t have the right font defined for your codepage. A font is a collection of glyphs (from «hieroglyphs») that share common appearance (typeface, character size). A font is used by the operating system to convert a numeric value into a graphical representation on screen. A font does not necessarily contain a graphical representation for all numeric values defined in the code page you are using. That’s why you sometimes get black squares on the screen if you change fonts and the new font has no representation for a certain symbol.
The Windows «Character Set Map» utility can be used to see which glyphs are parts of a certain font.
On Windows 2000 and XP:
Start -> Run…
Type «charmap», and click «ok».
I get a question mark or inverted question mark when selecting back just inserted characters?
When characters are converted between the client and the database character set, or vice versa, the character should exist in both. If it does not exist in the character set being converted to (the destination) then a replacement character is used. Some character sets have specific replacement characters defined when translating from other specific character sets but where this is not done a default replacement character, such as a ?, is used. Conversion from a replacement character back to the original character is not possible.
Is iSQL*Plus the only UTF8/Unicode enabled client we support?
On Windows OS, yes, On Unix OS’s, no. All the database utilties, including Import, Export, SQL*Loader, SQL*Plus, can act as a UTF-8 client if the OS locale is UTF-8 (e.g., en_US.UTF-8 on Linux) and NLS_LANG character set is set to UTF8 or AL32UTF8.
How to check the code points managed by a UNIX Operating System?
To know which code point is generated for a character in a Unix Environment, you can use the «od» command:
$ echo "" | od -xc
You can also check the character corresponding to a code point using the «echo» command like this:
for Solaris, AIX, HP-UX, TRU64:
$echo '\0351'
for Linux:
$echo -e '\0351'
You can use Locale Builder or a printed code page (see links below) to verify that your native code page and NLS_LANG setting properly correspond. If there is any ambiguity then use the command above to get the values for more than one character. For printed code page:
http://www.unicode.org
http://www.iso.org
http://czyborra.com/charsets/iso8859.html
-
What about command line tools like SQL*Loader, Import, Export, utilities?
Typically the NLS_LANG needs to match the MS-DOS OEM code page that could be retrieved by typing chcp in a Command Prompt:
C:\> chcp
Active code page: 437
C:\> set NLS_LANG=american_america.US8PC437
For tools like SQL*Loader you can temporarily change the NLS_LANG to the character set of the FILE you are loading. An alternative to changing NLS_LANG is to specify the character set of the data in the datafile using the characterset keyword in the .ctl file. In that case, SQL*Loader will interpret the data in the datafile as that character set regardless of the client character set setting of NLS_LANG. Here is an example .ctl file for utf16. This example ships in the demo area:
-- Copyright (c) 2001 by Oracle Corporation -- NAME -- ulcase11.ctl - Load Data in the Unicode Character Set UTF-16 -- DESCRIPTION -- Loads data in the Unicode character set UTF-16. The data is in little -- Endean byte order. This means that depending on whether SQL*Loader is -- running on a little Endean or a big Endean system, it will have to -- byte swap the UTF-16 character data as necessary. This load uses -- character length semantics, the default for the character set UTF-16. -- -- This case study is modeled after case study 3 (ulcase3), which loads -- variable length delimited (terminated and enclosed) data. -- -- RETURNS -- -- NOTES -- None -- MODIFIED (MM/DD/YY) -- rpfau 02/06/01 - Merged rpfau_sqlldr_add_case_study_11 -- rpfau 01/30/01 - Creation -- LOAD DATA CHARACTERSET utf16 BYTEORDER little INFILE ulcase11.dat REPLACE INTO TABLE EMP FIELDS TERMINATED BY X'002c' OPTIONALLY ENCLOSED BY X'0022' (empno integer external (5), ename, job, mgr, hiredate DATE(20) "DD-Month-YYYY", sal, comm, deptno CHAR(5) TERMINATED BY ":", projno, loadseq SEQUENCE(MAX,1) )
In Oracle9i the Export utility always exports user data, including Unicode data, in the character set of the database. The Import utility automatically converts the data to the character set of the target database.
In Oracle8i the Export utility exports user data converting them from the database character set to the character set of the NLS_LANG of the Export session. The Import utility first converts the data to the character set of the NLS_LANG of the Import session and then converts them to the character set of the target database. Care must be taken that the character set of the NLS_LANG for Export and Import sessions contain all characters to be migrated. This character set is usually chosen to be either the source database or the target database character set and it is usually the same for both Export and Import sessions. This choice is recommended especially with multibyte character sets, which pose some restrictions on export files. The Oracle8i conversions to and from the NLS_LANG character set happen in Oracle9i for DDL statements contained in the Export file.
-
What about database links?
The NLS_LANG on the server (or client) has no influence on character set conversion through a database link. Oracle will do the conversion from the character set of the source database to the character set of the target database (or reverse).
-
What about Multiple Homes on Windows?
There is nothing special with NLS_LANG and the multiple homes on Windows. The parameter taken into account is the one specified in the ORACLE_HOME registry key used by the executable. If the NLS_LANG is set in the environment, it takes precedence over the value in the registry and is used for ALL Oracle_Homes on the server/client.
The NLS_LANG can be found in these registry keys:
HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE
or
HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\HOMEx
-
Is there an Oracle Unicode Client on Windows?
On Windows there are two kinds of tools / applications:
- 1) A fully Unicode enabled applications which accepts Unicode codepoints and which can render them. It’s the application that needs to deal with the Unicode. Windows provides the Unicode API but the GUI system itself is NOT Unicode «by nature».
A fully Unicode application can only show one glyph for a given Unicode code point. So there is NO confusion possible here, this application will need to use a full Unicode font. If you have a full Unicode application, then you need to set the NLS_LANG to UTF8.
Note that there are currently not many applications like this and if not explicitly stated by the vendor it’s most likely an ANSI application. So don’t set the NLS_LANG to UTF8 if you are not sure!The only Unicode capable client that is included in the Oracle database is iSQL*Plus.
- 2) Standard ANSI application (like sqlplusw.exe) cannot use Unicode code points. So the Unicode code point stored in the database needs to be CONVERTED to an ANSI code point based on the correct setting of the NLS_LANG. This allows Oracle to map the Unicode code point to the character set of the client. If the Unicode code point does not have a mapping to the character set of the client then a replacement character is used.
- 1) A fully Unicode enabled applications which accepts Unicode codepoints and which can render them. It’s the application that needs to deal with the Unicode. Windows provides the Unicode API but the GUI system itself is NOT Unicode «by nature».
-
What is a Character set or Code Page?
A character set is just an agreement on what numeric value a symbol has. A computer does not know ‘A’ or ‘B ‘, it only knows the (binary) numeric value for that symbol, defined in the character set used by its Operating System (OS) or in hardware (firmware) for terminals. A computer can only manipulate numbers, which is why there is a need for character sets. An example is ‘ASCII’, an old 7 bit character set, ‘ROMAN8’ a 8 bit character set on UNIX or ‘UTF8’ a multibyte character set.
A code page is the name for the Windows/DOS encoding schemes, for Oracle NLS you can consider it the same as a character set. You also have to distinguish between a FONT and a character set/codepage. A font is used by the OS to convert a numeric value into a graphical ‘print’ on screen. The Wingdings Font on Windows is the best example of a font where an ‘A’ is NOT shown as an ‘A’ on screen, but for the OS the numeric value represents an ‘A‘. So you don’t SEE it as an ‘A’, but for Windows it’s an ‘A’ and will be saved (or used) as an ‘A’.
To better understand the explanation above, just open MS Word, choose the Wingdings Font, type your name (you will see symbols) and save this as html, if you open the html file with Notepad you will see that in the <style> section the fonts are declared and lower in the <body> section you will find your name in plain text but with style=’font-family: Wingdings’ attribute. If you open it in Internet Explorer or Netscape, you will again see the Wingdings symbols. It’s the presentation that changes, not the data itself.
It’s also possible that you don’t see with a particular font, all the symbols defined in the codepage you are using, just because the creator of the FONT did not include a graphical representation for all the symbols in that font. That’s why you sometimes get black squares on the screen if you change fonts. On Windows you can use the ‘Character Map’ tool to see all the symbols defined in a font.
-
Why Are There Different Character sets?
Two main reasons:
Historically vendors have defined different ‘character sets’ for their hardware and software, mainly because there were no official standards.
New character sets have been defined to support new languages. With an 8 bit character set, you are limited in the number of symbols you can support so there are different character sets for different written languages.
-
What is the difference between 7 bit, 8 bit and Unicode Character sets?
A 7 bit character set only knows 128 symbols (2^7)
An 8 bit character set knows 256 symbols (2^8)
Unicode (UTF-8) is a multibyte character set. Unicode has the capability to define over a million characters. For more information on Unicode see the white paper Oracle Unicode Database Support (PDF)
-
How to choose the right database character set?
A basic consideration for choosing a character set is to make sure it can handle any language that needs to be supported immediately and in the indeterminate future. Another overlooked consideration is to think about what applications and technologies you may want to utilize or interact with the database. Use locale builder (from Oracle Database 9i onwards) to view what characters are defined for a particular Oracle character set.
Choosing Unicode as the database character set ensures a strong foundation for whatever is built into and on top of the database. Oracle recommends using Unicode for all new system deployment. Migrating legacy systems to Unicode is also recommended. Deploying your systems today in Unicode offers many advantages in usability, compatibility, and extensibility. Oracle’s comprehensive support allows you to deploy high-performing systems faster and more easily while leveraging the true power of Unicode. Even if you don’t need to support multilingual data today or have any requirement for Unicode, it is still likely to be the best choice for a new system in the long run and will ultimately save you time and money and give you competitive advantages.
How to set NLS_LANG value for Oracle to support all special characters you need in your language/application.
- Download source code — 652 B
Introduction
Many times, when you have an Oracle application and you have to support special characters like ö,ä,ü,é,è
or currency symbols (e.g. €
), you encounter problems with proper display. Mostly, this problem is caused by improper setting of NLS_LANG
value.
NLS_LANG
sets the language and territory used by the client application and the database server. It also sets the client’s character set, which is the character set for data entered or displayed by a client program.
Character Set of Database
When an Oracle Database is created, the DBA has to specify the CHARACTER SET
and the NATIONAL CHARACTER SET
.
Nowadays, the default values are:
AL32UTF8
forCHARACTER SET
andAL16UTF16
forNATIONAL CHARACTER SET
These Database character sets define which characters (in which format) can be stored in CHAR
, CLOB
, VARCHAR2
resp. in NCHAR
, NCLOB
, NVARCHAR2
column. On an existing database, you can query the values with:
SELECT * FROM NLS_DATABASE_PARAMETERS WHERE PARAMETER LIKE '%CHARACTERSET'; PARAMETER VALUE ========================================== NLS_CHARACTERSET AL32UTF8 NLS_NCHAR_CHARACTERSET AL16UTF16 2 row(s) selected.
The database character sets do not define if and how charaters are displayed in your client application!
Some Facts of NLS_LANG
Format of NLS_LANG
definition is NLS_LANG = LANGUAGE_TERRITORY.CHARSET
All components of the NLS_LANG
definition are optional; any item that is not specified uses its default value. If you specify territory or character set, then you must include the preceding delimiter [underscore (_
) for territory, period (.
) for character set]. Otherwise, the value is parsed as a language name.
Following definitions are all valid:
NLS_LANG=.WE8ISO8859P1
NLS_LANG=_GERMANY
NLS_LANG=AMERICAN
NLS_LANG=ITALIAN_.WE8MSWIN1252
NLS_LANG=_BELGIUM.US7ASCII
If NLS_LANG
value is not provided, then Oracle defaults it to AMERICAN_AMERICA.US7ASCII
.
LANGUAGE
and TERRITORY
set the default value for many other NLS Parameters, see this table to get an overview. CHARSET
is used to let Oracle know what character set you are using on the client side, so Oracle can do the proper conversion. Setting the LANGUAGE
and TERRITORY
parameters of NLS_LANG
has nothing to do with the ability to store characters in a database. Here, you see a list of available Languages, Territories and Character Sets.
You can change the language and territory of your session by:
ALTER SESSION SET NLS_LANGUAGE = '...'; respective ALTER SESSION SET NLS_TERRITORY = '...';
However, you cannot change your client charset
with any SQL command, it is set only by the NLS_LANG
value.
Some setting can be explicitly set in SQL functions, for example:
SELECT TO_CHAR(SYSDATE, 'DD Month', 'NLS_DATE_LANGUAGE = FRENCH') FROM dual;
other can not, e.g.:
SELECT TRUNC(SYSDATE, 'DY', 'NLS_TERRITORY = AMERICA') AS FIRST_DAY_OF_WEEK FROM dual;
does not work.
You cannot query your client charset
by any dictionary or dynamic performance view or any other SQL command. Also, dictionary view NLS_SESSION_PARAMETERS
shows the database character set, not the client character set!
You can run query:
SELECT CLIENT_CHARSET FROM V$SESSION_CONNECT_INFO;
However, the values appear not reliable. Sometimes, it shows NULL
or «unknown».
Definition of NLS_LANG
NLS_LANG
can be set by Environment
variable (e.g. SET NLS_LANG=AMERICAN_AMERICA.WE8MSWIN1252
) or by your Registry at HKEY_LOCAL_MACHINE\Software\Oracle\KEY_{ORACLE_HOME_NAME}\NLS_LANG
, resp. HKEY_LOCAL_MACHINE\Software\Wow6432Node\Oracle\KEY_{ORACLE_HOME_NAME}\NLS_LANG
for 32-bit Oracle Client on a 64-bit Windows. The Environment
variable takes precedence over Registry entry.
You can interrogate existing values with:
Windows: reg query HKEY_LOCAL_MACHINE\Software\Oracle\KEY_{ORACLE_HOME_NAME} /f NLS_LANG reg query HKEY_LOCAL_MACHINE\Software\Wow6432Node\Oracle\KEY_{ORACLE_HOME_NAME} /f NLS_LANG set NLS_LANG Unix/Linux: echo $NLS_LANG
Proper Value of NLS_LANG
Usually, the values for LANGUAGE
and TERRITORY
are obvious and less critical in the application. The most interesting is the CHARACTER SET
value. Many times, you read in forums (and sometimes even in official documentation): «The client NLS_LANG
character set must be the same value as the database character set» — This is simply not true! Consider the database has two character sets, the «normal» and the national character set. On the client side, you have only one value, so actually they cannot be equal. Some character sets are available only on Client side which also vindicates my statement.
There are two requirements for the NLS_LANG
character set:
- The NLS_LANG character set must support the characters you like to use in your application.
- The NLS_LANG character set must match the character set (or encoding) of your application.
Some applications/drivers load NLS_LANG
definition when at launch and derive their character set from NLS_LANG
value. In such case, it becomes easier and only the first requirement applies.
NLS_LANG with SQL*Plus
SQL*Plus inherits the character set from the terminal session where you started it. On Windows, you get the current character set (here called «Codepage
«) with chcp
, the Linux/Unix equivalent is locale charmap
or echo $LANG
. Thus, a proper setting would be for example:
C:\>chcp Active Codepage: 850. C:\>set NLS_LANG=.WE8PC850 C:\>sqlplus ...
With chcp
, you can also change your codepage, e.g., chcp 1252
. You can use the small batch file to change the codepage
of your command line window permanently:
@ECHO off SET ROOT_KEY="HKEY_CURRENT_USER" FOR /f "skip=2 tokens=3" %%i in _ ('reg query HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage /v OEMCP') do set OEMCP=%%i ECHO. ECHO ............................................... ECHO Select Codepage ECHO ............................................... ECHO. ECHO 1 - CP1252 ECHO 2 - UTF-8 ECHO 3 - CP850 ECHO 4 - ISO-8859-1 ECHO 5 - ISO-8859-15 ECHO 6 - US-ASCII ECHO. ECHO 9 - Reset to System Default (CP%OEMCP%) ECHO 0 - EXIT ECHO. SET /P CP="Select a Codepage: " if %CP%==1 ( echo Set default Codepage to CP1252 reg add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "chcp 1252" /f ) else if %CP%==2 ( echo Set default Codepage to UTF-8 reg add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "chcp 65001" /f ) else if %CP%==3 ( echo Set default Codepage to CP850 reg add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "chcp 850" /f ) else if %CP%==4 ( echo Set default Codepage to ISO-8859-1 add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "chcp 28591" /f ) else if %CP%==5 ( echo Set default Codepage to ISO-8859-15 add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "chcp 28605" /f ) else if %CP%==5 ( echo Set default Codepage to ASCII add "%ROOT_KEY%\Software\Microsoft\Command Processor" /v Autorun /t REG_SZ /d "chcp 20127" /f ) else if %CP%==9 ( echo Reset Codepage to System Default reg delete "%ROOT_KEY%\Software\Microsoft\Command Processor" /v AutoRun /f ) else if %CP%==0 ( echo Bye ) else ( echo Invalid choice pause )
Note, the settings will apply only for the current user. If you like to set it for all users, replace line:
SET ROOT_KEY="HKEY_CURRENT_USER"
by:
SET ROOT_KEY="HKEY_LOCAL_MACHINE"
Be careful with codepage UTF-8 (chcp 65001
) there is a bug, see this discussion. I do not know whether this has been fixed in more recent Windows / SQL*Plus versions.
NLS_LANG with .sql Files
When you run sql files in SQL*Plus, check the save options of your editor. Typically, you can choose values like ISO-8859-1
, UTF-8
, ANSI
, CP1252
as encoding. Term «ANSI» denotes the default Windows code pages. On a western PC, this is CP1252
.
You can interrogate default Windows code pages with:
C:\>reg query HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage /v ACP HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage ACP REG_SZ 1252 C:\>
or read «ANSI Codepage» from this table National Language Support (NLS) API Reference for any locale.
You must set character set of NLS_LANG
according to the encoding of your text editor. Here is a list of available Code Pages.
NLS_LANG in Your .NET Application
- ODP.NET Managed Driver is not
NLS_LANG
sensitive.
It is only .NET locale sensitive. (See Data Provider for .NET Developer’s Guide) - ODBC, ODP.NET and OLE DB providers from Oracle read
NLS_LANG
value when they are loaded and inherit the definition, resp. ensures proper character conversion for any client/database character setting. - ODBC, ADO.NET and OLE DB providers from Microsoft also read
NLS_LANG
value when they are loaded. However, they have some limitations, especially in terms of Unicode.
How to Determine the Character Set of My Application If Not Known?
First of all, you should consult the documentation of your application and used drivers.
I developed the following approach if you still have no clue about the used character set.
- Set your
NLS_LANG
toNLS_LANG=.AL32UTF8
- Connect with SQL*Plus to a database with UTF-8 support, i.e., character set
AL32UTF8
When the client character set is equal to the database character set, then no character conversion takes place and all bytes are transferred «as they are» - In your application, run a query with special character like this:
select dump('€') from dual; DUMP('€') ----------------- Typ=96 Len=1: 164
Then you can estimate the character set with a function written in C# like this:
byte[] o = new byte[] { 164 }; foreach ( var enc in Encoding.GetEncodings() ) { var convertedString = enc.GetEncoding().GetBytes("€"); if ( convertedString.SequenceEqual(o) ) Console.WriteLine(String.Format("{0}\t{1}\t{2}", enc.CodePage, enc.Name, enc.DisplayName)); }
The function will print a list of potential character sets used by your application. Sometimes, the printout gives you obviously used character set, sometimes you have to use more other special characters. Some Codepages differ only in a single character!
What To Do If My Characters Are Still Not Properly Displayed?
- Check carefully the documentation of your application and used drivers. Perhaps they are old and do not support Unicode yet. Make an update to the latest version of drivers.
- Check if your font supports desired characters. You can use for example this page Font Support for Unicode Characters to verify used fonts.
- Check the real content of your database. Run query like
SELECT DUMP(THE_COLUMN, 1016) FROM ...
to see the bytes in the table. Perhaps the data have been inserted by a client with wrongNLS_LANG
definition. Don’t be scared, usually you have to investigate only a few characters/bytes to get a result.
- SiteMap
«Prev
Next»
- Extended Features
- Partitioned Tables
- Oracle Partitioning Explained
- Partitioning Advantages
- Table Range Partitioning
- Index Partitioned Table
- Local Index Partition
- Global Index Partition
- Modifying Partition Features
- Dropping Oracle Partitions
- Splitting Merging Partitions
- Clustering Tables
- Clustered Tables Advantages
- Creating Oracle Cluster
- Size Cluster Properly
- Creating Cluster Key
- Oracle Hash Cluster
- Creating Hash Cluster
- Deleting Hash Cluster
- Clustering Conclusion
- Index Organized
- Index Organized Tables
- Advantages Disadvantages IOT
- Creating Table Index
- Allocate Overflow Space
- Deleting Altering IOT
- Reorganize IOT
- IOT Conclusion
- Auditing Features
- Auditing Oracle Entities
- Enable Audit Feature
- Audit SQL Statement
- Access Audit Trail
- Oracle System Privileges
- Auditing Database Objects
- Purge Audit Trail
- Oracle Audit Conclusion
- Language Support
- Oracle National LanguageSupport
- Nls_lang Parameter
- Define Character Set
- National Character Set
- Character Set Conversion
- Character Sets Conclusion
- Using Profiles
- Product Profile
- Profile Table
- Limit Command Access
- Security Scheme
- SQL WildCards
- SQL Wildcard Characters
- Removing Restriction
- Target Profile
- SqlPlus Utility
Lesson 3 | The NLS_LANG Parameter |
Objective | Select a runtime environment for Oracle |
A user can set a language environment for his or her own runtime environment through the use of the NLS_LANG
parameter.
The `NLS_LANG` parameter is an important environment variable in Oracle that specifies the language, territory, and character set used by a client application. It affects how data is displayed, stored, and interpreted in a particular session. Here’s how a user can set the `NLS_LANG` parameter for their runtime environment:
- Understanding `NLS_LANG` Structure:
The `NLS_LANG` parameter has the following structure:
NLS_LANG = language_territory.characterset
- `language`: Specifies the language for Oracle messages and day/month names. Example: `AMERICAN`, `FRENCH`.
- `territory`: Defines the conventions for date and numeric formats. Example: `AMERICA`, `FRANCE`.
- `characterset`: Specifies the character set used by the client application. Example: `WE8ISO8859P1`, `AL32UTF8`.
- Setting the `NLS_LANG` Parameter:
The `NLS_LANG` parameter can be set in the environment of the operating system, typically before starting the Oracle client application. Here’s how to do it on different operating systems:
On Unix/Linux:
You can set `NLS_LANG` in the shell environment using the `export` command:
export NLS_LANG=AMERICAN_AMERICA.AL32UTF8
On Windows: You can set `NLS_LANG` in the Command Prompt or permanently in the System Environment Variables:
Using Command Prompt:
cmdset NLS_LANG=AMERICAN_AMERICA.AL32UTF8
Using System Environment Variables:
- Go to Control Panel > System > Advanced System Settings.
- Click on «Environment Variables».
- Under «System Variables» or «User Variables», click «New».
- Add `NLS_LANG` as the variable name and the desired value (e.g., `AMERICAN_AMERICA.AL32UTF8`) as the variable value.
-
Verifying the Setting: After setting `NLS_LANG`, you can verify that it is correctly set by running:
On Unix/Linux:echo $NLS_LANG
On Windows: (cmd)
echo %NLS_LANG%
- Impact of `NLS_LANG`:
- Language: Determines the language for Oracle messages and the day and month names.
- Territory: Influences default date and numeric formats.
- Character Set: Defines the character encoding used by the client, ensuring that characters are correctly interpreted and stored.
- Example of Use: If you want your client environment to use American English with U.S. date formats and UTF-8 character encoding, you would set:
NLS_LANG=AMERICAN_AMERICA.AL32UTF8
This setup would ensure that your Oracle session handles data according to these regional settings and character encoding.
By properly setting `NLS_LANG`, users can ensure that their Oracle client applications interact correctly with the database, particularly in terms of language, territory-based formatting, and character set compatibility.
Component of NLS_LANG
There are three components of the NLS_LANG
parameter:
- Language determines the language used by Oracle for error messages, day names, and month names.
- Territory determines the local conventions, such as date formats and currency symbols.
- Charset indicates the actual character set used by the client.
The language and territory portions of the parameter are optional. If a language or territory is not specified, these portions default to American. However, if you have a language setting, you must also have a territory setting and separate the two with an underscore. The territory value is separated from the charset value by a period.
- Example:
If you wanted to setNLS_LANG
to support Canadian French, you would use the strinNLS_LANG = FRENCH_CANADA.WE8DEC
- Setting NLS_LANG:
You set theNLS_LANG
parameter as an environment variable for your particular operating system. If you do not specify a value for theNLS_LANG
parameter, Oracle will use the default value for language and territory for the database you are using, which are specified in the initialization file for that database with the parametersNLS_LANGUAGE
and NLS_TERRITORY, respectively. You can change the value for theNLS_LANGUAGE
or theNLS_TERRITORY
parameters at any time in a session with theALTER SESSION
command. You have learned some of the important national language parameters, but additional parameters are worth mentioning.
Other National Language Parameters
You can use a number of other national language parameters to indicate specific attributes of the national language setting:
NLS_CALENDAR
indicates the calendar system used for the language.NLS_CURRENCY
indicates the local currency symbol.NLS_DATE_FORMAT
indicates the local date format.NLS_DATE_LANGUAGE
indicates the default language used for dates.NLS_ISO_CURRENCY
indicates the ISO international currency symbol.NLS_SORT
indicates the sort order for the language.
NLS_CALENDAR specifies which calendar system Oracle uses. It can have one of the following values:
- Arabic Hijrah
- English Hijrah
- Gregorian
- Japanese Imperial
- Persian
- ROC Official (Republic of China)
- Thai Buddha
For example, suppose NLS_CALENDAR is set to «Japanese Imperial», the date format is «E YY-MM-DD».
(«E» is the date format element for the abbreviated era name.) If the date is May 15, 1997, then the SYSDATE is displayed as follows:
SELECT SYSDATE FROM DUAL; SYSDATE -------- H 09-05-15
Each of these can be set in the Oracle initialization file or on a session-by-session basis.
You do not have to set any of them unless you wish their values to be different from the normal values for NLS_LANGUAGE
and NLS_TERRITORY
.
In the next lesson, you will learn about character sets, which determine how data is stored and interpreted in the Oracle database.
National Language Support — Quiz
Sometimes, it is difficult to deal with Windows Platform as it drains the hell out of us!! Recently, I have come across a situation where one of my client’s requirement was to input Arabic language into Oracle Database [or] read/retrieve the output into Various client Applications such as PL/SQL Developer/SQL Developer/TOAD etc.. Inputting the non-english language into Database has never been difficult as we are given plenty of language options within our beloved Oracle Database, but, the problem lies within the complex Windows OS when user wanted to view the data in his/her beloved language, in Applications such as PL/SQL Developer or SQL Developer etc.. This post is all about it.
Character Set, Character encoding, & Code point
Yes, it is a group of characters that is recognised by the Hardware through the OS Interface. Every character is allocated a number, called a code point, these code points will be represented in the computer by one or more bytes. So, every character has to be unlocked or cracked by reading this code in order to either understandable by machine or readable by the User, which is called Character encoding
How Windows OS deals with this Character Set?
In Windows, Character Sets are called Code Pages (CP). A Code page is set of characters that used to be the conventional approach of Character encoding within Windows OS, was replaced by «Unicode» later point in time otherwise still very much compatible to use. There are basically 3 types of Code Pages we see within Windows Registry,
- ACP
ACP is ANSI Code Page used for native non-unicode Application using a Graphical User Interface on Windows Systems.
- OEMCP
OEMCP is used by Win32 Applications which is originated from IBM PC Architecture
- MACCP
Indicates the default code page for characters in non-Unicode Macintosh programs.
How to do it?
Step (1) : Change the ANSI Code Page (ACP) (Change the Locale)
Changing the Code Page could be same as changing the locale of the system for almost all times. It can be done within the registry or Control Panel settings, but Control Panel method is strongly recommended. Changing it within the registry may cause system to malfunction or even get crashed on the boot that obviously requires a Backup to restore, and Recovery which is quite dangerous and time consuming.
This post’s objective is to enable the non-english(Arabic in this case) code page for third party GUI applications, so reflecting the ACP to desired code page number will do the job for us. Following is the procedure I have followed successfully, the desired ANSI code page for Arabic language is 1256. You may check what is desired for your environment here
- Start->Control Panel
- Double click the Regional Language and Options setting
- Choose Administrative tabe
- Click on Change System Locale
- Choose the desired locale, Arabic(Lebanon) in this case
- It will prompt you to restart to reflect the change in effect, restart the system
How to determine the change if reflected, please follow below procedure,
- Start->run
- Type «regedit», and enter, it will take you to the Windows Registry
- HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage, Scroll down to the bottom
- Below screenshot locates the same,
Step (2) : Change the NLS_LANG Parameter within the Windows Registry of ORACLE_HOME
Our next step is to take care of our Oracle environment respect to Operating System or what we need,
Please follow procedure below for the same to get it done,
- Start->run
- Type «regedit», and enter, it will take you to the Windows Registry
- Then, HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\KEY_<oracle_home_name>
- Change the NLS_LANG to the desired value, in this case, it is «AMERICAN_AMERICA.AR8MSWIN1256».
- Below screenshot locates the same,
Step (3) : Change NLS_LANG in System’s Environment Variables to the same value that has been set in the ORACLE_HOME registry
- My Computer->Right Click->Properties
- Choose advanced system settings
- Click on Environment Variables
- Click New on both User Variables and System Variables, and add NLS_LANG to the same value that has been set in both ORACLE_HOME registry and Windows Code Pages, im this case, it is «AMERICAN_AMERICA.AR8MSWIN1256»
- Below screenshot locates the same,
Step (4) : Set the Character Set of the Database to the same character set that has been set as above during the Database Creation
Below screenshot locates the same,
Step (5) : Check on PL/SQL Developer or SQL Developer If Database is able take the Arabic, and Client PL/SQL Developer is able to read it and display in Arabic
Below screenshot shows the same,
I wish this helps those who seek to set correct NLS_LANG on Windows platform, Your feedback on the blog post is very much appreciated. Many Many Thanks!!
NLS_LANG is set during installation from currently Oracle home or default language of the operating system.
On windows platform, its defined in registry “HKEY_LOCAL_MACHINE\SOFTWARE\ORACLE\HOMEID\NLS_LANG”
On unix platform, Set environment variable as “setenv NLS_LANG FRENCH_FRANCE.WE8ISO8859P1”
NLS_LANG parameter syntax:
NLS_LANG = LANGUAGE_TERRITORY.CHARACTER_SET
LANGUAGE : Specifies the language and conventions for displaying messages, day name, and month name.
TERRITORY : Specifies the territory and conventions for calculating week and day numbers.
CHARACTER_SET : Controls the character set used for displaying messages.
Example:
NLS_LANG = AMERICAN_AMERICA.WE8MSWIN1252
NLS_LANG = FRENCH_CANADA.WE8ISO8859P1
NLS_LANG = JAPANESE_JAPAN.JA16EUC
Check current value of these parameters
set line 200 pages 0
Col parameter for a20
col value for a20
select parameter,value from V$NLS_PARAMETERS where parameter in ('NLS_LANGUAGE','NLS_TERRITORY','NLS_CHARACTERSET');
Parameter Value
------------------ -----------
NLS_LANGUAGE AMERICAN
NLS_TERRITORY AMERICA
NLS_CHARACTERSET AL32UTF8
Find list of values for following parameter get from the view V$NLS_VALID_VALUES.
set line 200 pages 0
Col parameter for a20
col value for a20
select parameter,value from v$NLS_VALID_VALUES;
--Get all valid value for these parameters
NLS_LANGUAGE
NLS_SORT
NLS_TERRITORY
NLS_CHARACTERSET
Table for defining language and NLS_LANG value
Language NLS_LANG Value
Arabic ARABIC_UNITED ARAB EMIRATES.AR8MSWIN1256
Brazilian Portuguese BRAZILIAN PORTUGUESE_BRAZIL.WE8ISO8859P1
Bulgarian BULGARIAN_BULGARIA.CL8MSWIN1251
Canadian French CANADIAN FRENCH_CANADA.WE8ISO8859P1
Catalan CATALAN_CATALONIA.WE8ISO8859P1
Croatian CROATIAN_CROATIA.EE8MSWIN1250
Czech CZECH_CZECH REPUBLIC.EE8MSWIN1250
Danish DANISH_DENMARK.WE8ISO8859P1
Dutch DUTCH_THE NETHERLANDS.WE8ISO8859P1
Egyptian ARABIC_UNITED ARAB EMIRATES.AR8MSWIN1256
English AMERICAN_AMERICA.WE8ISO8859P1
English(United Kingdom) ENGLISH_UNITED KINGDOM.WE8ISO8859P1
Estonian ESTONIAN_ESTONIA.BLT8MSWIN1257
Finnish FINNISH_FINLAND.WE8ISO8859P1
French FRENCH_FRANCE.WE8ISO8859P1
German GERMAN_GERMANY.WE8ISO8859P1
Greek GREEK_GREECE.EL8MSWIN1253
Hebrew HEBREW_ISRAEL.IW8MSWIN1255
Hungarian HUNGARIAN_HUNGARY.EE8MSWIN1250
Icelandic ICELANDIC_ICELAND.WE8ISO8859P1
Indonesian INDONESIAN_INDONESIA.WE8ISO8859P1
Italian ITALIAN_ITALY.WE8ISO8859P1
Japanese JAPANESE_JAPAN.JA16SJIS
Korean KOREAN_KOREA.KO16KSC5601
Latin American Spanish LATIN AMERICAN SPANISH_AMERICA.WE8ISO8859P1
Latvian LATVIAN_LATVIA.BLT8MSWIN1257
Lithuanian LITHUANIAN_LITHUANIA.BLT8MSWIN1257
Mexican Spanish MEXICAN SPANISH_MEXICO.WE8ISO8859P1
Norwegian NORWEGIAN_NORWAY.WE8ISO8859P1
Polish POLISH_POLAND.EE8MSWIN1250
Portuguese PORTUGUESE_PORTUGAL.WE8ISO8859P1
Romanian ROMANIAN_ROMANIA.EE8MSWIN1250
Russian RUSSIAN_CIS.CL8MSWIN1251
Simplified Chinese SIMPLIFIED CHINESE_CHINA.ZHS16GBK
Slovak SLOVAK_SLOVAKIA.EE8MSWIN1250
Spanish SPANISH_SPAIN.WE8ISO8859P1
Swedish SWEDISH_SWEDEN.WE8ISO8859P1
Thai THAI_THAILAND.TH8TISASCII
Traditional Chinese TRADITIONAL CHINESE_TAIWAN.ZHT16BIG5
Turkish TURKISH_TURKEY.WE8ISO8859P9
Ukrainian UKRAINIAN_UKRAINE.CL8MSWIN1251
Vietnamese VIETNAMESE_VIETNAM.VN8VN3
Others AMERICAN_AMERICA.US7ASCII
Note:
NLS_DATE_FORMAT Derived from NLS_TERRITORY
NLS_DATE_LANGUAGE derived from NLS_LANGUAGE
NLS_TIMESTAMP_FORMAT derived from NLS_TERRITORY
NLS_LANGUAGE derived from NLS_LANG
For more details:
https://docs.oracle.com/cd/A84870_01/doc/nt.816/a73010/appc.htm