Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Locale support #36

Open
leosurban opened this issue Feb 17, 2017 · 22 comments
Open

Locale support #36

leosurban opened this issue Feb 17, 2017 · 22 comments
Assignees

Comments

@leosurban
Copy link

Hi,
I have trouble with our locales. We are using czech locale for Central Europe, CP1250 on Windows, with PC-LATIN-2 on DOS command window.

So my settings is:
SET DB_LOCALE=cs_CZ.cp1250
SET CLIENT_LOCALE=cs_CZ.pc-latin-2

This leads to correct result:

echo SELECT nazevmater FROM cenmat | dbaccess mydb
All characters are correctly displayed, including diactritic, see screenshot.

Same setting, same window:

node ixtest.js

var opts = {
  database : 'mydb@ixlur',
  username : 'informix',
  password : '***'
}; 
var informix = require( 'informix' )( opts );
informix
  .query( "select nazevmater from cenmat" )
  .then( function ( cursor ) 
  {
    return  cursor.fetchAll();
    
  } )
  .then( function ( results )
  {
     console.log( 'results:', results );
  })
  .catch( function ( err ) 
  {
    console.log( err );
  } );

It is working but diacritic characters are missing (? is displayed, see screenshot).

My database is encoded as CP1250, so DB_LOCALE is correct.
In our Windows application with CP1250 is all ok, in DBACCESS, with pc-latin-2, too.

With node-informix I tried CLIENT_LOCALE set to cs_CZ.cp1250, cs_CZ.pc-latin-2, cs_CZ.UTF8, result same - ? characters.

But your application (ESQL) read locales, because when I set wrong locale:
SET CLIENT_LOCALE=cs_CZ.bad
then error is displayed:

node ixtest.js
Error: [-23101] Unable to load locale categories.
at Error (native)

python -V
Python 2.7.13

node -v
v6.7.0

npm list informix
node-api@ C:\buffer\jsrest
`-- [email protected]

node-informix-locales1
node-informix-locales2

@uatuko uatuko added the bug label Feb 17, 2017
@uatuko
Copy link
Member

uatuko commented Feb 18, 2017

@leosurban, can you give a sample SQL script to create a DB and a table with some data for me to test this out (similar to test/support/test-db.sql)?

Most likely this is due to ASCII/UTF-8 conversion issues 😃.

@uatuko uatuko self-assigned this Feb 18, 2017
@leosurban
Copy link
Author

leosurban commented Feb 18, 2017

Command script for Windows (for dbimport, dbexport, dbaccess, ...) ixlur.cmd:
Important for LOCALE is DB_LOCALE and CLIENT_LOCALE.
Last command is setting console locale for switch window to PC-LATIN-2 (=code page 852).

set INFORMIXDIR=C:\PROGRA~1\IBMINF~1
set REGMACHINE=\\LUR-PC
set INFORMIXSERVER=ixlur
set ONCONFIG=onconfig.ixlur
set INFORMIXSQLHOSTS=C:\Program Files\IBM Informix Software Bundle\etc\sqlhosts.ixlur
set GL_USEGLU=1
set PATH=%INFORMIXDIR%\bin;%PATH%
set DB_LOCALE=cs_CZ.cp1250
set CLIENT_LOCALE=cs_CZ.pc-latin-2
set DBDATE=dmy4.
set DBMONEY=.Kc
mode con codepage select=852

LocaleTest.exp.zip

unzip LocaleTest.zip

dbimport LocaleTest

echo select somename from exampletable | dbaccess LocaleTest

Database selected.
somename  Pink earthworm
somename  Růžová žížala
somename  Beehive
somename  Včelí úl
4 row(s) retrieved.
Database closed.

To be sure I attached screenshot with correct result:
locale_1

node ixtest.js

node ixtest.js

[ [ 1, 'Pink earthworm', 42783 ],
  [ 2, 'R??ov? ???ala', 42783 ],
  [ 3, 'Beehive', 42783 ],
  [ 4, 'V?el? ?l', 42783 ] ]

// note: I replaced invalid characters with ?, see screenshot.

locale_2

Javascript Node/Code:

var opts = {
  database : 'LocaleTest@ixlur',
  username : 'informix',
  password : '***'
};
var informix = require( 'informix' )( opts );
informix
  .query( "select id, somename, somedate from exampletable" )
  .then( function ( cursor ) 
  {
    return  cursor.fetchAll();    
  } )
  .then( function ( results )
  {
   console.log(results);
  })
  .catch( function ( err ) 
  {
    console.log( err );
  } );

If you will have some problem to simulate it (I do not know if you are using Windows, if your computer has support for EastEurope codesets etc..) I can prepare IX server with this database, with communication port opened to internet. I can also prepare similar setting for client on Unix/Linux OS (ISO-LATIN-2 code set).

Thanks.

@leosurban
Copy link
Author

And ODT file (Zipped) because with Unicode encoded contetent may be better way than UNL single-codeset file. See end of document for INSERT statements.

LocaleTest.odt.zip
.

@uatuko
Copy link
Member

uatuko commented Feb 18, 2017

It seems to work OK with UTF-8.

Using dbaccess (data from LocaleTest.odt):

create table exampletable
(
  id integer not null,
 somename nvarchar(100) not null,
  somedate date not null,
  primary key(id)
  somename nvarchar(100) not null,
  somedate date not null,
  primary key(id)
> );

Table created.

insert into exampletable values(1,"Pink earthworm", today);

1 row(s) inserted.

insert into exampletable values(2,"Růžová žížala", today);

1 row(s) inserted.

insert into exampletable values(3,"Beehive", today);

1 row(s) inserted.

insert into exampletable values(4,"Včelí úl", today);

1 row(s) inserted.

> select * from exampletable;



id        1
somename  Pink earthworm
somedate  02/18/2017

id        2
somename  Růžová žížala
somedate  02/18/2017

id        3
somename  Beehive
somedate  02/18/2017

id        4
somename  Včelí úl
somedate  02/18/2017

4 row(s) retrieved.

> 

Using Node.js:

informix                                                                                                                                                                                                                                                                                                                      
    .query( 'select * from exampletable' )                                                                                                                                                                                                                                                                                    
    .then( ( cursor ) => {                                                                                                                                                                                                                                                                                                    
        return cursor.fetchAll( { close : true } );                                                                                                                                                                                                                                                                           
    } )                                                                                                                                                                                                                                                                                                                       
    .then( ( results ) => {                                                                                                                                                                                                                                                                                                   
        console.log( results );                                                                                                                                                                                                                                                                                               
    } )                                                                                                                                                                                                                                                                                                                       
    .catch( ( err ) => {                                                                                                                                                                                                                                                                                                      
        console.log( err );                                                                                                                                                                                                                                                                                                   
    } );   
$ node tmp/test.js 
[ [ 1, 'Pink earthworm', '2017-02-18T00:00:00.000Z' ],
  [ 2, 'Růžová žížala', '2017-02-18T00:00:00.000Z' ],
  [ 3, 'Beehive', '2017-02-18T00:00:00.000Z' ],
  [ 4, 'Včelí úl', '2017-02-18T00:00:00.000Z' ] ]

Here's a UTF-8 data unload for reference:
exampletable.txt

@uatuko
Copy link
Member

uatuko commented Feb 18, 2017

@leosurban, can you set the windows command prompt and CLIENT_LOCAL to utf-8 and check whether it resolves the issue?

Although ESQL/C support CLIENT_LOCAL when retrieving data, within Node.js all strings are treated as utf-8 so CLIENT_LOCAL should be set to utf-8. But there's no conversion from utf-8 to PC-LATIN-2 which I believe is why you are seeing encoding issues on command prompt.

Either setting command prompt to utf-8 or using a conversion library for display purposes should resolve your issue.

@leosurban
Copy link
Author

Sorry, no, it is not working.

mode con codepage select=65001
set CLIENT_LOCALE=cs_CZ.UTF8
echo select somename from exampletable | dbaccess LocaleTest
… correct result …

node ixtest.js
… bad result …

What is setting of your Linux (?) terminal when you receive this (correct) result?

@leosurban
Copy link
Author

locale_3
locale_4

@uatuko
Copy link
Member

uatuko commented Feb 18, 2017 via email

@leosurban
Copy link
Author

With Lucida Console font is the result the same. And because DBACCESS result with UTF-8 setting is correct (see screenshot), I mean that UTF8 is displayed correctly.
locale_5

I tried also your exampletable.txt - correct.
locale_6

Only result from ESQLC/JS is wrong :(

@uatuko
Copy link
Member

uatuko commented Feb 18, 2017

Looks like there's a conversion issue somewhere on windows. Don't have access to a windows PC now to debug this but will try to take a look later this week.

@leosurban
Copy link
Author

Probably. I created small test with important result.

var opts = {
  database : 'LocaleTest@ixlur',
  username : 'informix',
  password : '***'
}; 
var informix = require( 'informix' )( opts );
var moment = require('moment');
{
  //build string "Ruzova zizala" with diacritic with unicode characters
  console.log("Static string with diactritic:");
  var str = "R"+String.fromCharCode(367)+String.fromCharCode(382)+"ov"+String.fromCharCode(225)
           +" "
           +String.fromCharCode(382)+String.fromCharCode(237)+String.fromCharCode(382)+"ala"; 
  for (var i=0; i<str.length; i++)
  {
    console.log( str[i] + '...('+ str.charCodeAt(i) + ')' );
  }
}
// now retrieve tha same string from ESQLC
informix
  .query( "select somename from exampletable where id=2" )
  .then( function ( cursor ) 
  {
    return  cursor.fetchAll();    
  } )
  .then( function ( results )
  {
    // display this "static" string...
    console.log("The same string from SQL result:");
    var str = results[0][0];
    for (var i=0; i<str.length; i++)
    {
      console.log( str[i] + '...('+ str.charCodeAt(i) + ')' );
    }
  })
  .catch( function ( err ) 
  {
    console.log( err );
  } );

Result:
locale_7

What is important:

  1. Unicode string is correctly
  2. that CLIENT_LOCALE does not matter. All variants (cs_CZ.UTF-8, cs_CZ.cp1250 and cs_CZ.pc-latin-2) return the same values, for diacritic characters return value 65533.

Thank you.
L.

@uatuko
Copy link
Member

uatuko commented Feb 18, 2017

Tested this on a Windows 7 VM and encoding works fine.

Node.js command prompt:

ConEmu shell:

However my DB is in en_us.819 (default?).

> select * from systables where tabid = 90;



tabname           GL_COLLATE
owner            
partnum          0
tabid            90
rowsize          0
ncols            0
nindexes         0
nrows            0.00
created          02/18/2017
version          0
tabtype          
locklevel        
npused           0.00
fextsize         0
nextsize         0
flags            0
site             en_US.819
dbname           
type_xid         0
am_id            0
pagesize         0
ustlowts         
secpolicyid      0
protgranularity  
statchange       0
statlevel        

1 row(s) retrieved.

@uatuko
Copy link
Member

uatuko commented Feb 18, 2017

So need to check,

  • Is this an issue with DB_LOCALE being cs_CZ.cp1250?
    • cs_CZ.UTF8 seems to work, most likely this issue is related to some additional features presented in cs_CZ.cp1250 (see #issuecomment-280939778).
  • Is this an issue with Windows 8/10?
  • Can we reproduce this in Linux/Mac?

@leosurban
Copy link
Author

However my DB is in en_us.819 (default?).

This is wrong and probably reason, why it is working for you.
You must have some czech locale on database side to have correct collation (ordering, lower/uppercase converting etc). I can be any cs_CZ locale ( cs_CZ.cp1250, cs_CZ.pc-latin-2, cs_CZ.8859-2, probable cs_CZ.UTF8 too). Your locale en_US.819 is 8859-1, standard ASCII.

This will not work in your database:
locale_8

You need set DB_LOCALE before you import/create database. It will not change nothing in your current databases. Only this one will have this settings.

I am sure, that this is not problem with "czech" locale and I can prepare the same example for russian locales and CP1251. I used czech locale because I need it and for you is better to test because there is mix of diacritic and "normal" characters.

echo select * from systables where tabid = 90 | dbaccess LocaleTest
tabname GL_COLLATE
owner
partnum 0
tabid 90
rowsize 0
ncols 0
nindexes 0
nrows 0.00
created 18.02.2017
version 0
tabtype
locklevel
npused 0.00
fextsize 0
nextsize 0
flags 0
site cs_CZ.1250
dbname
type_xid 0
am_id 0
pagesize 0
ustlowts
secpolicyid 0
protgranularity
statchange 0
statlevel
1 row(s) retrieved.

@uatuko
Copy link
Member

uatuko commented Feb 19, 2017

Took a little bit of time to update my Informix images to include extended GLS support but finally managed to reproduce this issue on Linux 😃.

  1. DB_LOCALE=cs_CZ.UTF8
> create database gls with buffered log;

Database created.

> select * from systables where tabid = 90;



tabname           GL_COLLATE
owner            
partnum          0
tabid            90
rowsize          0
ncols            0
nindexes         0
nrows            0.00
created          19/02/17
version          0
tabtype          
locklevel        
npused           0.00
fextsize         0
nextsize         0
flags            0
site             cs_CZ.57372
dbname           
type_xid         0
am_id            0
pagesize         0
ustlowts         
secpolicyid      0
protgranularity  
statchange       0
statlevel        

1 row(s) retrieved.

If I don't set CLIENT_LOCALE, both dbaccess and Node.js will display invalid characters but setting CLIENT_LOCALE=en_gb.utf8 (or to cs_CZ.UTF8) will solve both dbaccess and Node.js display issues.

dbaccess

node.js

@uatuko
Copy link
Member

uatuko commented Feb 19, 2017

I can't get cs_CZ.cp1250, cs_CZ.pc-latin-2, cs_CZ.8859-2 locales to work. They throw -23101/-23104 errors on my 12.10 FC8DE installation (client SDK 4.10 FC8DE) even through I selected all available GLS components during installation.

@leosurban, If you have a DB that I can access I can look into this more (probably will need a client SDK that supports your locales as well) but it does seem to suggest a settings mismatch is the likely culprit.

For reference, you can use the docker images I used (docker pull nukedzn/informix:12.10.FC8DE) to validate my results.

@leosurban
Copy link
Author

leosurban commented Feb 20, 2017

@uditha-atukorala Please send me mail to **** at kolin.cz to get access to test server. Thanks.

(UA) Edit: Mask email.

@uatuko
Copy link
Member

uatuko commented Feb 26, 2017

Looking at $INFORMIXDIR/gls/lc11/os, on Linux/Mac there aren't any OS specific files.

$ ls $INFORMIXDIR/gls/lc11/os
C  C.lc  POSIX  POSIX.lc  _readme
$ head -15 $INFORMIXDIR/gls/lc11/os/POSIX.lc 
##########################################################################
#                                                                        #
#  Licensed Materials - Property of IBM                                  #
#                                                                        #
#  "Restricted Materials of IBM"                                         #
#                                                                        #
#  IBM Informix GLS                                                      #
#  (c) Copyright IBM Corporation 1996, 2004 All rights reserved.         #
#                                                                        #
##########################################################################
<operating_system> "portable" 
<source_version>		100
<modified_date>		"05-04-2004"

On Windows there are a bunch of files which seems to be OS specific.

e.g. czech.lc

##########################################################################
#                                                                        #
#  Licensed Materials - Property of IBM                                  #
#                                                                        #
#  "Restricted Materials of IBM"                                         #
#                                                                        #
#  IBM Informix GLS                                                      #
#  (c) Copyright IBM Corporation 1996, 2004 All rights reserved.         #
#                                                                        #
##########################################################################
<operating_system>  "winnt351"
<source_version>                100
<modified_date>         "05-04-2004"

@taviroquai
Copy link

taviroquai commented Sep 19, 2019

Hello,

I need to use node-informix for a new project and I'm having issues with locales.

The following configuration works well in PHP:

  1. Database in en_US.819
  2. Client connection with option "DB_LOCALE=en_US.819"
  3. Queries results converted successfully using php function "uft8_encode()"

Using node-informix I can't convert query results to UTF-8 correctly.
I have tried:

  1. Buffer.from(value, 'latin1').toString('utf-8');
  2. NPM package "utf8", function "encode"

Characters like the "MASCULINE ORDINAL INDICATOR" and latin accents appear wrong.

Example String: Drtº
Here is what it looks like in the console:
image

Here here what it looks like in the browser with "meta charset=utf-8":
image

Would it be possible to pass the LC_LOCALE param to the connection options?

Any help is appreciated. Thanks.

@taviroquai
Copy link

Hello,

My previous comment looks it is solved for now with the following environment variables:
CLIENT_LOCALE="en_US.utf8"
DB_LOCALE="en_US.819"

@ctahk0
Copy link

ctahk0 commented Sep 24, 2019

Hi,
did you ever solve this cs_CZ.cp1250 locale?
I have exactly the same problem like @leosurban and can't find any solution.
Thanks

@uatuko
Copy link
Member

uatuko commented Sep 24, 2019

Unfortunately not, I didn't have a client SDK with the necessary locale to test this fully.

If at all possible, using a UTF8 variant seem to solve the issue (#issuecomment-280939778).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants