Garbled Unicode input #41

Open
ghost opened this issue Jul 28, 2017 · 8 comments

@ghost

ghost commented Jul 28, 2017

I'm running this:

liblouis.translateString(
  ['controlchars.cti','unicode.dis','vi-g1.ctb'].join(','),
  'Trăm năm trong cõi người ta'
);

My logs show this:

[ALL] Inbuf=0x0054... <TRUNCATED FOR CLARITY>
Tr m n m trong c�i ng  i ta

It looks like it can't take in Unicode characters directly. Any thoughts on how to best handle this?

@ghost
Author

ghost commented Jul 28, 2017

Oh, I made some progress by converting it to UTF-8.

liblouis.translateString(
  ['controlchars.cti','unicode.dis','vi-g1.ctb'].join(','),
  unescape(encodeURIComponent('Trăm năm trong cõi người ta'))
);

Logs show it correctly:

[ALL] Performing translation: tableList=controlchars.cti,unicode.dis,vi-g1.ctb, inlen=68
easy-api.js:520 [ALL] Inbuf=0x0054 0x0072 0x00C4 0x0083 0x006D 0x0020 0x006E 0x00C4 0x0083 0x006D 0x0020 0x0074 0x0072 0x006F 0x006E 0x0067 0x0020 0x0063 0x00C3 0x00B5 0x0069 0x0020 0x006E 0x0067 0x00C6 0x00B0 0x00E1 0x00BB 0x009D 0x0069 0x0020 0x0074 0x0061 0x0000 0x004B 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 ~ Trăm năm trong cõi người ta
easy-api.js:520 [DEBUG] found table controlchars.cti
easy-api.js:520 [DEBUG] found table unicode.dis
easy-api.js:520 [DEBUG] found table vi-g1.ctb
easy-api.js:520 [DEBUG] found table text_nabcc.dis
easy-api.js:520 [DEBUG] found table latinLetterDef6Dots.uti
easy-api.js:520 [DEBUG] found table digits6DotsPlusDot6.uti
easy-api.js:520 [DEBUG] found table litdigits6DotsPlusDot6.uti
easy-api.js:520 [ALL] Translation complete: outlen=62
easy-api.js:520 [ALL] Outbuf=0x2828 0x281E 0x2817 0x2804 0x2873 0x282D 0x2834 0x2834 0x2809 0x2832 0x2804 0x2804 0x2873 0x282D 0x2834 0x2834 0x2826 0x2812 0x2804 0x280D 0x0020 0x281D 0x2804 0x2873 0x282D 0x2834 0x2834 0x2809 0x2832 0x2804 0x2804 0x2873 0x282D 0x2834 0x2834 0x2826 0x003E 0x0000 0x280D 0x0020 0x281E 0x2817 0x2815 0x281D 0x0019 0x0000 0x2809 0x2828 0x2824 0x2801 0x2804 0x2873 0x282D 0x2834 0x2834 0x2803 0x2822 0x2804 0x280A 0x0020 0x281D 0x281B ~                                     >

But I'm still getting an "abort() at Error" message:

build-no-tables-utf16.js:60648 Uncaught abort() at Error
    at jsStackTrace (http://localhost:8000/viet-braille/build-no-tables-utf16.js:1104:13)
    at stackTrace (http://localhost:8000/viet-braille/build-no-tables-utf16.js:1121:12)
    at Object.abort (http://localhost:8000/viet-braille/build-no-tables-utf16.js:60642:44)
    at _abort (http://localhost:8000/viet-braille/build-no-tables-utf16.js:1725:22)
    at _free (http://localhost:8000/viet-braille/build-no-tables-utf16.js:57648:3)
    at Object.asm._free (http://localhost:8000/viet-braille/build-no-tables-utf16.js:60273:19)
    at LiblouisEasyApi.translateString (http://localhost:8000/viet-braille/easy-api.js:233:13)
    at LiblouisEasyApi.translateString (http://localhost:8000/viet-braille/easy-api.js:484:24)
    at onmessage (http://localhost:8000/viet-braille/liblouis-viet-worker.js:16:31)
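
For readers unfamiliar with the unescape(encodeURIComponent(...)) idiom used above, here is a minimal sketch of what it does to a string. The helper names toUtf8ByteString and dumpCodeUnits are made up for illustration and are not part of easy-api.js. It only explains why the Inbuf dump shows 0x00C4 0x0083 where ă (U+0103) used to be; it does not establish which encoding liblouis actually expects.

```javascript
// Re-encode a JavaScript string so that each character of the result
// carries one UTF-8 byte of the input (a so-called "binary string").
// Note: unescape() is deprecated but still available in browsers and Node.js.
function toUtf8ByteString(str) {
  return unescape(encodeURIComponent(str));
}

// Hex-dump the UTF-16 code units of a string, in the style of the Inbuf log.
function dumpCodeUnits(str) {
  return Array.from({ length: str.length }, (_, i) =>
    '0x' + str.charCodeAt(i).toString(16).toUpperCase().padStart(4, '0')
  ).join(' ');
}

const raw = 'Tr\u0103m'; // "Trăm"; ă is U+0103

// One code unit per character:
console.log(dumpCodeUnits(raw));
// 0x0054 0x0072 0x0103 0x006D

// After the trick, ă has become its two UTF-8 bytes, C4 83:
console.log(dumpCodeUnits(toUtf8ByteString(raw)));
// 0x0054 0x0072 0x00C4 0x0083 0x006D
```

In other words, the "converted" string is longer than the original and no longer contains U+0103 at all, which matches the Inbuf line in the log above.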

@reiner-dolp
Collaborator

Which build and easy-api version are you using?

@ghost
Author

ghost commented Jul 28, 2017

I'm using easy-api.js from the 0.3.0 release, and the liblouis-build is from commit-b78eff.

@reiner-dolp
Collaborator

reiner-dolp commented Jul 28, 2017

Have you seen issue #3? We are passing UCS-2 as UTF-8 into liblouis, which is implemented here for 16-bit builds and here for 32-bit builds. I would be happy to accept a pull request that adds a proper conversion :)

However, I am currently not 100% sure that's your issue.
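
As an illustration of the distinction at stake, here is a minimal sketch of turning a JavaScript string into fixed-width values for a 16-bit or a 32-bit build. The helper names toUtf16CodeUnits and toUtf32CodePoints are hypothetical and are not taken from easy-api.js; this is one possible reading of "a proper conversion", not the project's implementation.

```javascript
// Hypothetical helpers for illustration only; not part of easy-api.js.

// 16-bit build: one array entry per UTF-16 code unit. Characters outside
// the Basic Multilingual Plane occupy two entries (a surrogate pair).
function toUtf16CodeUnits(str) {
  const units = new Uint16Array(str.length);
  for (let i = 0; i < str.length; i++) {
    units[i] = str.charCodeAt(i);
  }
  return units;
}

// 32-bit build: one array entry per Unicode code point. Array.from()
// iterates a string by code point, so surrogate pairs are joined.
function toUtf32CodePoints(str) {
  return Uint32Array.from(Array.from(str, (ch) => ch.codePointAt(0)));
}

console.log(toUtf16CodeUnits('\u0103'));  // ă as a single 16-bit unit
console.log(toUtf32CodePoints('\u0103')); // ă as a single 32-bit value
```

All Vietnamese letters in the test string of this issue lie in the Basic Multilingual Plane, so for them the two functions return the same numbers and UCS-2 and UTF-16 coincide. The difference only shows up for characters such as emoji.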

@ghost
Author

ghost commented Jul 28, 2017

Actually, before you asked which version I was using, I had been getting the "latest" build from npm, so it was really easy-api 0.2.0 with some older liblouis-build. Once I realized that, I downloaded 0.3.0 and commit-b78eff. It was still giving me an error, so I removed the UTF-8 encoding and passed the string in directly again. And now it's working!

@ghost ghost closed this as completed Jul 28, 2017
@reiner-dolp
Collaborator

reiner-dolp commented Jul 28, 2017

I can actually reproduce the issue using v0.4.0/master branch. I am keeping this issue open. Code for reproduction:

<!doctype html>
<script src="node_modules/liblouis/easy-api.js"></script>
<script>
louAsync = new LiblouisEasyApiAsync({
	easyapi: "node_modules/liblouis/easy-api.js",
	capi: "node_modules/liblouis-build/build-no-tables-utf16.js"
});

louAsync.enableOnDemandTableLoading("node_modules/liblouis-build/tables/", function() {});
louAsync.setLogLevel(louAsync.LOG.ALL, function() {});


louAsync.registerLogCallback(function(logLevel, msg){
	console.log(logLevel, msg);
});

louAsync.version(function(version) {
	console.info("Liblouis Version:", version);
});

louAsync.translateString(
  ['controlchars.cti','unicode.dis','vi-g1.ctb'].join(','),
  'Trăm năm trong cõi người ta', function(msg) { console.log(msg); }
);
</script>

Chrome Dev Tools Log:

Liblouis Version: 3.2.0
(index):22 ALL Performing translation: tableList=controlchars.cti,unicode.dis,vi-g1.ctb, inlen=68
(index):22 ALL Inbuf=0x0054 0x0072 0x00C4 0x0192 0x006D 0x0020 0x006E 0x00C4 0x0192 0x006D 0x0020 0x0074 0x0072 0x006F 0x006E 0x0067 0x0020 0x0063 0x00C3 0x00B5 0x0069 0x0020 0x006E 0x0067 0x00C6 0x00B0 0x00E1 0x00BB 0x009D 0x0069 0x0020 0x0074 0x0061 0x0000 0x004B 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 0x0000 ~ Tr� m n� m trong cõi người ta
(index):22 DEBUG found table controlchars.cti
(index):22 DEBUG found table unicode.dis
(index):22 DEBUG found table vi-g1.ctb
(index):22 DEBUG found table text_nabcc.dis
(index):22 DEBUG found table latinLetterDef6Dots.uti
(index):22 DEBUG found table digits6DotsPlusDot6.uti
(index):22 DEBUG found table litdigits6DotsPlusDot6.uti
(index):22 ALL Translation complete: outlen=62
(index):22 ALL Outbuf=0x2828 0x281E 0x2817 0x2804 0x2873 0x282D 0x2834 0x2834 0x2809 0x2832 0x2804 0x2804 0x2873 0x282D 0x2834 0x2802 0x2814 0x2806 0x2804 0x280D 0x0020 0x281D 0x2804 0x2873 0x282D 0x2834 0x2834 0x2809 0x2832 0x2804 0x2804 0x2873 0x282D 0x2834 0x2802 0x2814 0x003E 0x0000 0x280D 0x0020 0x281E 0x2817 0x2815 0x281D 0x0019 0x0000 0x2809 0x2828 0x2824 0x2801 0x2804 0x2873 0x282D 0x2834 0x2834 0x2803 0x2822 0x2804 0x280A 0x0020 0x281D 0x281B ~ 
Uncaught abort() at Error
    at jsStackTrace (http://localhost:8089/node_modules/liblouis-build/build-no-tables-utf16.js:1104:13)
    at stackTrace (http://localhost:8089/node_modules/liblouis-build/build-no-tables-utf16.js:1121:12)
    at Object.abort (http://localhost:8089/node_modules/liblouis-build/build-no-tables-utf16.js:60642:44)
    at _abort (http://localhost:8089/node_modules/liblouis-build/build-no-tables-utf16.js:1725:22)
    at _free (http://localhost:8089/node_modules/liblouis-build/build-no-tables-utf16.js:57648:3)
    at Object.asm._free (http://localhost:8089/node_modules/liblouis-build/build-no-tables-utf16.js:60273:19)
    at LiblouisEasyApi.translateString (http://localhost:8089/node_modules/liblouis/easy-api.js:233:13)
    at LiblouisEasyApi.translateString (http://localhost:8089/node_modules/liblouis/easy-api.js:484:24)
    at Object.lou (blob:http://localhost:8089/f4202d7d-72d2-4c60-ac6e-6671107f883f:11:26)
    at self.onmessage (blob:http://localhost:8089/f4202d7d-72d2-4c60-ac6e-6671107f883f:21:23)

@reiner-dolp reiner-dolp reopened this Jul 28, 2017
@reiner-dolp reiner-dolp added this to the v0.4.0 milestone Jul 28, 2017
@baohouse

baohouse commented Oct 22, 2023

Testing on 0.4.0 with liblouis-build#commit-4c8fbf

translateString works: "Trăm năm trong cõi n\n"

easy-api.js:522 [0] Performing translation: tableList=controlchars.cti,vi-vn-g0.utb, inlen=88
easy-api.js:522 [0] Inbuf=0x00000054 0x00000072 0x00000103 0x0000006D 0x00000020 0x0000006E 0x00000103 0x0000006D 0x00000020 0x00000074 0x00000072 0x0000006F 0x0000006E 0x00000067 0x00000020 0x00000063 0x000000F5 0x00000069 0x00000020 0x0000006E 0x0000000A 0x00000000 0x00000060 0x0000001B 0x6474696C 0x74696769 0x6F443673 0x752E7374 0x00006974 0x00000063 0x00002828 0x0000281E 0x00002817 0x0000281C 0x0000280D 0x00002800 0x0000281D 0x0000281C 0x0000280D 0x00002800 0x0000281E 0x00002817 0x00002815 0x0000281D 0x0000281B 0x00002800 0x00002809 0x00002824 0x00002815 0x0000280A 0x00002800 0x0000281D 0x0000281B 0x0000000A 0x6C2D6976 0x65747465 0x65647372 0x74752E66 0x00000069 0x00000063 0x00000054 0x00000072 0x00000103 0x0000006D 0x00000020 0x0000006E 0x00000103 0x0000006D 0x00000020 0x00000074 0x00000072 0x0000006F 0x0000006E 0x00000067 0x00000020 0x00000063 0x000000F5 0x00000069 0x00000020 0x0000006E 0x00000067 0x0000000A 0x00000000 0x0000001B 0x702D6976 0x73636E75 0x2E666564 0x00697475 ~ Tr m n m trong c�i n

easy-api.js:522 [0] Translation complete: outlen=23
easy-api.js:522 [0] Outbuf=0x00002828 0x0000281E 0x00002817 0x0000281C 0x0000280D 0x00002800 0x0000281D 0x0000281C 0x0000280D 0x00002800 0x0000281E 0x00002817 0x00002815 0x0000281D 0x0000281B 0x00002800 0x00002809 0x00002824 0x00002815 0x0000280A 0x00002800 0x0000281D 0x0000000A ~                    
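
Since the rendered braille after the "~" does not survive copying well, here is a small sketch for turning the hex values of an Outbuf dump back into text. The function names decodeBuffer and brailleDots are made up for this example. It relies only on the Unicode Braille Patterns block, where U+2800 plus bit n-1 represents dot n.

```javascript
// Turn a list of code points (as printed in the Outbuf log) into a string.
function decodeBuffer(values) {
  return String.fromCodePoint(...values);
}

// List the raised dots of a Unicode braille cell, e.g. 0x281E -> "2345".
function brailleDots(codePoint) {
  const bits = codePoint - 0x2800;
  let dots = '';
  for (let dot = 1; dot <= 8; dot++) {
    if (bits & (1 << (dot - 1))) dots += dot;
  }
  return dots;
}

// First five cells of the Outbuf line above.
const outbuf = [0x2828, 0x281E, 0x2817, 0x281C, 0x280D];
console.log(decodeBuffer(outbuf));
console.log(outbuf.map(brailleDots).join(' ')); // 46 2345 1235 345 134
```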

But as soon as I add the letter g, it aborts: "Trăm năm trong cõi ng\n"

easy-api.js:522 [0] Performing translation: tableList=controlchars.cti,vi-vn-g0.utb, inlen=92
easy-api.js:522 [0] Inbuf=0x00000054 0x00000072 0x00000103 0x0000006D 0x00000020 0x0000006E 0x00000103 0x0000006D 0x00000020 0x00000074 0x00000072 0x0000006F 0x0000006E 0x00000067 0x00000020 0x00000063 0x000000F5 0x00000069 0x00000020 0x0000006E 0x00000067 0x0000000A 0x00000000 0x0000001B 0x702D6976 0x73636E75 0x2E666564 0x00697475 0x00000000 0x00000049 0x005288D8 0x0000967C 0x74746170 0x736E7265 0x6974632E 0x69756F00 0x61742F73 0x73656C62 0x00000028 0x00000011 0x00009644 0x00009644 0x00000038 0x00000012 0x00524C70 0x00000000 0x00000048 0x00000022 0x69617262 0x2D656C6C 0x74746170 0x736E7265 0x6974632E 0x61742000 0x20656C62 0x0000003B 0x00000000 0x0052DCF8 0x0000001D 0x746E6F63 0x636C6F72 0x73726168 0x6974632E 0x2D69762C 0x672D6E76 0x74752E30 0x64207962 0x6E696665 0x28206465 0x0000003B 0x00000000 0x00528DC8 0x0000001D 0x746E6F63 0x636C6F72 0x73726168 0x6974632E 0x2D69762C 0x672D6E76 0x74752E30 0x65727062 0x65646563 0x2E65636E 0x0000101B 0x00000000 0x00000000 0x00000001 0x00000002 0x00000003 0x00000004 0x00000005 0x00000006 ~ Tr m n m trong c�i ng

easy-api.js:522 [0] Translation complete: outlen=24
easy-api.js:522 [0] Outbuf=0x00002828 0x0000281E 0x00002817 0x0000281C 0x0000280D 0x00002800 0x0000281D 0x0000281C 0x0000280D 0x00002800 0x0000281E 0x00002817 0x00002815 0x0000281D 0x0000281B 0x00002800 0x00002809 0x00002824 0x00002815 0x0000280A 0x00002800 0x0000281D 0x0000281B 0x0000000A ~         

3a92b9c30d10b7e46ec9a11d051c1f77.js:68888 Uncaught abort() at Error
    at jsStackTrace (http://localhost:8080/build/3a92b9c30d10b7e46ec9a11d051c1f77.js:1104:13)
    at stackTrace (http://localhost:8080/build/3a92b9c30d10b7e46ec9a11d051c1f77.js:1121:12)
    at Object.abort (http://localhost:8080/build/3a92b9c30d10b7e46ec9a11d051c1f77.js:68882:44)
    at _abort (http://localhost:8080/build/3a92b9c30d10b7e46ec9a11d051c1f77.js:1726:22)
    at _free (http://localhost:8080/build/3a92b9c30d10b7e46ec9a11d051c1f77.js:65837:3)
    at asm._free (http://localhost:8080/build/3a92b9c30d10b7e46ec9a11d051c1f77.js:68548:19)
    at LiblouisEasyApi.translateString (http://localhost:8080/build/1e2709bdd659364377c2f9f5bc2dab80.js:233:13)
    at LiblouisEasyApi.translateString (http://localhost:8080/build/1e2709bdd659364377c2f9f5bc2dab80.js:484:24)
    at Object.lou (blob:http://localhost:8080/18e7ad47-e460-44fd-bbea-65bf2f2d5af6:11:26)
    at self.onmessage (blob:http://localhost:8080/18e7ad47-e460-44fd-bbea-65bf2f2d5af6:21:23)
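
One observation about the Inbuf dumps above: the 32-bit words that follow the terminating 0x00000000 look like ASCII text packed in little-endian order. The sketch below unpacks them; wordsToAscii is a hypothetical helper written for this comment, and it assumes the little-endian byte order used by Emscripten's heap. The result suggests the dump continues past the end of the input into neighbouring heap memory. Whether that over-read is limited to logging or is related to the abort is not established here.

```javascript
// Unpack 32-bit words into ASCII, least significant byte first.
// NUL bytes are skipped; other non-printable bytes are shown as ".".
function wordsToAscii(words) {
  let text = '';
  for (const word of words) {
    for (let shift = 0; shift < 32; shift += 8) {
      const byte = (word >>> shift) & 0xff;
      if (byte === 0) continue;
      text += byte >= 0x20 && byte < 0x7f ? String.fromCharCode(byte) : '.';
    }
  }
  return text;
}

// Words copied from the Inbuf line of the failing "...ng\n" run.
console.log(wordsToAscii([0x702D6976, 0x73636E75, 0x2E666564, 0x00697475]));
// vi-puncsdef.uti

// Words copied from the Inbuf line of the working "...n\n" run.
console.log(wordsToAscii([0x6474696C, 0x74696769, 0x6F443673, 0x752E7374, 0x00006974]));
// litdigits6Dots.uti
```

Both strings look like table file names, i.e. data that does not belong to the input string.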

I tried stripping out the line breaks and simplifying to just the vi-vn-g0.utb table, and got one character further:

"Trăm năm trong cõi ngư"

easy-api.js:522 [0] Performing translation: tableList=vi-vn-g0.utb, inlen=92
easy-api.js:522 [0] Inbuf=0x00000054 0x00000072 0x00000103 0x0000006D 0x00000020 0x0000006E 0x00000103 0x0000006D 0x00000020 0x00000074 0x00000072 0x0000006F 0x0000006E 0x00000067 0x00000020 0x00000063 0x000000F5 0x00000069 0x00000020 0x0000006E 0x00000067 0x000001B0 0x00000000 0x0000001B 0x702D6976 0x73636E75 0x2E666564 0x00697475 0x00000000 0x00000049 0x00528880 0x0000967C 0x7420676E 0x736E6172 0x6974616C 0x203A6E6F 0x6C626174 0x73694C65 0x69763D74 0x2D6E762D 0x752E3067 0x202C6274 0x656C6E69 0x32393D6E 0x00524C00 0x00000000 0x00000048 0x00000022 0x69617262 0x2D656C6C 0x74746170 0x736E7265 0x6974632E 0x61742000 0x20656C62 0x00000023 0x00000000 0x0052DCA0 0x0000000C 0x762D6976 0x30672D6E 0x6274752E 0x61684320 0x00000023 0x00000000 0x00528D70 0x0000000C 0x762D6976 0x30672D6E 0x6274752E 0x752E7365 0x0000101B 0x00000000 0x00000000 0x00000001 0x00000002 0x00000003 0x00000004 0x00000005 0x00000006 0x00000007 0x00000008 0x00000009 0x0000000A 0x0000000B 0x0000000C 0x0000000D 0x0000000E 0x0000000F 0x00000010 0x00000010 0x00000011 ~ Tr m n m trong c�i ng 
easy-api.js:522 [0] Translation complete: outlen=24
easy-api.js:522 [0] Outbuf=0x00002828 0x0000281E 0x00002817 0x0000281C 0x0000280D 0x00002800 0x0000281D 0x0000281C 0x0000280D 0x00002800 0x0000281E 0x00002817 0x00002815 0x0000281D 0x0000281B 0x00002800 0x00002809 0x00002824 0x00002815 0x0000280A 0x00002800 0x0000281D 0x0000281B 0x00002833 ~                         

And then it breaks as soon as I add the next letter, ờ:

"Trăm năm trong cõi ngườ"

easy-api.js:522 [0] Performing translation: tableList=vi-vn-g0.utb, inlen=96
easy-api.js:522 [0] Inbuf=0x00000054 0x00000072 0x00000103 0x0000006D 0x00000020 0x0000006E 0x00000103 0x0000006D 0x00000020 0x00000074 0x00000072 0x0000006F 0x0000006E 0x00000067 0x00000020 0x00000063 0x000000F5 0x00000069 0x00000020 0x0000006E 0x00000067 0x000001B0 0x00001EDD 0x00000000 0x00000068 0x00000023 0x762D6976 0x30672D6E 0x6274752E 0x00000000 0x00527F18 0x00000000 0x00000020 0x000004B1 0x00527EF8 0x00527EF8 0x005283C8 0x00000000 0x0000974C 0x00000004 0x005283C8 0x005283C8 0x00000000 0x00000000 0x00000000 0x00000004 0x00000000 0x00000000 0x00000015 0x00000007 0x00000008 0x00527F90 0x00000400 0x00000000 0x00000000 0x00000003 0x00000000 0x00000000 0xFFFF0000 0xFFFFFFFF 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0xC6207379 0xA1BBE1B0 0x2D363320 0x36353231 0x3634322D 0x776C610A 0x20737961 0xBBE1B0C6 0x2D3620A3 0x36353231 0x3634322D 0x6C610A0A 0x73796177 0xBAE17920 0x353320BF 0x3433312D 0x312D3635 0x610A3632 0x7961776C 0xE1792073 0x352081BB 0x33312D36 0x2D363534 0x0A363231 0x61776C61 0x79207379 ~ Tr m n m trong c�i ng  
easy-api.js:522 [0] Translation complete: outlen=26
easy-api.js:522 [0] Outbuf=0x00002828 0x0000281E 0x00002817 0x0000281C 0x0000280D 0x00002800 0x0000281D 0x0000281C 0x0000280D 0x00002800 0x0000281E 0x00002817 0x00002815 0x0000281D 0x0000281B 0x00002800 0x00002809 0x00002824 0x00002815 0x0000280A 0x00002800 0x0000281D 0x0000281B 0x00002830 0x00002833 0x0000282A ~                           
3a92b9c30d10b7e46ec9a11d051c1f77.js:68888 Uncaught abort() at Error
    at jsStackTrace (http://localhost:8080/build/3a92b9c30d10b7e46ec9a11d051c1f77.js:1104:13)
    at stackTrace (http://localhost:8080/build/3a92b9c30d10b7e46ec9a11d051c1f77.js:1121:12)
    at Object.abort (http://localhost:8080/build/3a92b9c30d10b7e46ec9a11d051c1f77.js:68882:44)
    at _abort (http://localhost:8080/build/3a92b9c30d10b7e46ec9a11d051c1f77.js:1726:22)
    at _free (http://localhost:8080/build/3a92b9c30d10b7e46ec9a11d051c1f77.js:65837:3)
    at asm._free (http://localhost:8080/build/3a92b9c30d10b7e46ec9a11d051c1f77.js:68548:19)
    at LiblouisEasyApi.translateString (http://localhost:8080/build/1e2709bdd659364377c2f9f5bc2dab80.js:233:13)
    at LiblouisEasyApi.translateString (http://localhost:8080/build/1e2709bdd659364377c2f9f5bc2dab80.js:484:24)
    at Object.lou (blob:http://localhost:8080/d502f23f-f2af-42f3-9afc-210422cde721:11:26)
    at self.onmessage (blob:http://localhost:8080/d502f23f-f2af-42f3-9afc-210422cde721:21:23)
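
A side note on the numbers in these four runs: the reported inlen values are consistent with (number of characters + 1) × 4, that is, a byte count for a 32-bit build including a terminator, whereas outlen matches the number of Outbuf entries shown. This is an inference from the logs only, and expectedInlen below is a hypothetical helper used to check the arithmetic, not a function from easy-api.js.

```javascript
// Byte size of a NUL-terminated buffer holding one fixed-width value
// per code point. bytesPerChar is 4 for a 32-bit build.
function expectedInlen(str, bytesPerChar) {
  const codePoints = Array.from(str).length; // iterate by code point
  return (codePoints + 1) * bytesPerChar;
}

// The four inputs from this comment, written with escapes so the
// precomposed characters are unambiguous.
const runs = [
  ['Tr\u0103m n\u0103m trong c\u00F5i n\n', 88],
  ['Tr\u0103m n\u0103m trong c\u00F5i ng\n', 92],
  ['Tr\u0103m n\u0103m trong c\u00F5i ng\u01B0', 92],
  ['Tr\u0103m n\u0103m trong c\u00F5i ng\u01B0\u1EDD', 96],
];

for (const [text, logged] of runs) {
  console.log(expectedInlen(text, 4), 'computed vs', logged, 'logged');
}
```

If inlen really is a byte count, printing inlen entries of 4 bytes each would read roughly four times past the input, which would fit the trailing non-input data visible in the Inbuf lines.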

@baohouse

baohouse commented Oct 23, 2023

I don't know what happened. I was running a copy of easy-api.js that I had manually copied over so I could edit it, and got the abort error above. Just out of curiosity, I disabled these four lines inside the translateString method:

this.capi._free(outbuff_ptr);
this.capi._free(inbuff_ptr);
this.capi._free(bufflen_ptr);
this.capi._free(strlen_ptr);

And I was able to get it to translate without an abort! Then I restored those lines and... it was able to run just fine without an abort. This makes no sense! I mean, it's working now, but it does bother me when something just starts working without a logical explanation, because we'll never know if the bug resurfaces.
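
For anyone debugging this further: an abort inside _free can indicate that allocator bookkeeping was overwritten earlier, for example by writing past the end of a buffer, rather than a problem with the free call itself. That is not confirmed as the cause here. The sketch below shows one way to keep every _malloc paired with exactly one _free. makeFakeCapi is a hypothetical stand-in for the Emscripten module so the example runs on its own; withBuffers is likewise made up and is not how easy-api.js is written.

```javascript
// Hypothetical stand-in for an Emscripten module. It only tracks which
// pointers are live, so leaks and invalid frees become visible.
// A real build provides its own _malloc and _free.
function makeFakeCapi() {
  let next = 8;
  const live = new Map();
  return {
    live,
    _malloc(size) {
      const ptr = next;
      next += size;
      live.set(ptr, size);
      return ptr;
    },
    _free(ptr) {
      if (!live.delete(ptr)) {
        throw new Error('abort(): invalid free of ' + ptr);
      }
    },
  };
}

// Allocate one buffer per requested size, run fn with the pointers, and
// free each buffer exactly once, even if fn throws.
function withBuffers(capi, sizes, fn) {
  const ptrs = [];
  try {
    for (const size of sizes) {
      ptrs.push(capi._malloc(size));
    }
    return fn(ptrs);
  } finally {
    for (const ptr of ptrs) {
      capi._free(ptr);
    }
  }
}

// Usage: four buffers, mirroring the four pointers freed in the lines above.
const capi = makeFakeCapi();
withBuffers(capi, [4, 4, 96, 192], ([strlenPtr, bufflenPtr, inPtr, outPtr]) => {
  console.log('allocated', strlenPtr, bufflenPtr, inPtr, outPtr);
});
console.log('live allocations after call:', capi.live.size);
```

The buffer sizes in the usage example are arbitrary placeholders. With a real module, the input and output sizes would have to cover every value written, including the terminator.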

Anyway, I have a working single-page app at
https://labs.baohouse.net/viet-braille
with the tagged source at
https://github.com/baohouse/labs.baohouse.net/tree/do5np8

Right now I copy over the files from liblouis-build into my own folder, so ideally I would migrate over to using 0.4.0's EasyApiAsync and just have Webpack automatically copy over the necessary files into the production build. One bug at a time. 🐛
