STR #3355
Application: | FLTK Library |
Status: | 1 - Closed w/Resolution |
Priority: | 1 - Request for Enhancement, e.g. asking for a feature |
Scope: | 3 - Applies to all machines and operating systems |
Subsystem: | FLUID |
Summary: | Support generation of UTF-8 file from FLUID |
Version: | 1.4-feature |
Created By: | JYG |
Assigned To: | matt |
Fix Version: | 1.4.0 |
Fix Commit: | b490ce3463e9008d03224feb44c8b365a8e21954 |
Trouble Report Files:
Trouble Report Comments:
|
| FLUID generates .cxx files in which UTF-8 strings are written as ASCII with octal escapes. It is annoying to see "\303\251" instead of "é", and it makes searching for a string in the code impossible. I think FLUID should have an option for more modern file generation that writes UTF-8 files, with or without a BOM. | |
|
#2 | AlbrechtS 05:24 Nov 22, 2016 |
| For more information and the full discussion of this topic please see this thread in fltk.general: https://groups.google.com/forum/#!topic/fltkgeneral/gf0Z3BW-zuc
This is an edited excerpt of one of my replies:
There can always be characters inside a string that must be quoted: control characters (decimal 0-31, e.g. 10 = 0x0a = <LF> = '\n') and DEL (decimal 127). The current fluid code also quotes all values in the range 128 to 255.
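To make the quoting rule above concrete, here is a minimal sketch of octal quoting. It is not fluid's actual code: the function name `quote_octal` is hypothetical, and quoting `"` and `\` as well is my assumption about what any C string writer needs to do.

```cpp
#include <cassert>  // for checking the examples
#include <cstdio>
#include <string>

// Hypothetical helper, NOT fluid's real implementation.
// Returns the body of a C string literal for `s`, writing control
// characters (0-31), DEL (127) and all bytes 128-255 as octal escapes.
std::string quote_octal(const std::string &s) {
  std::string out;
  for (unsigned char c : s) {  // bytes are read unsigned, i.e. 0..255
    if (c == '"' || c == '\\') {
      out += '\\';             // assumption: quote and backslash are escaped too
      out += static_cast<char>(c);
    } else if (c < 32 || c >= 127) {
      char buf[5];
      // Always three digits, so a digit that follows is never absorbed.
      std::snprintf(buf, sizeof buf, "\\%03o", static_cast<unsigned>(c));
      out += buf;
    } else {
      out += static_cast<char>(c);  // printable ASCII stays as is
    }
  }
  return out;
}
```

With this sketch, the two UTF-8 bytes of "é" (0xC3 0xA9) come out as `\303\251`, which is the form the original report complains about.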
I did not write the code, but I can only assume that this [was done because it] is always safe for all compilers...
The patch I append should work for all Unicode characters if the compiler interprets strings as UTF-8.
Now to the patch: I attach three files to this post for later reference:
(1) test.fl: a fluid file with all ISO-8859-1 characters encoded as UTF-8 (only the extended range, not the ASCII part; Unicode range U+00A0 to U+00FF). This is also a subset of Microsoft's Windows Codepage 1252 ("Western").
(2) main.cxx: a main program to compile test.cxx. This #include's test.cxx and indirectly test.h generated by fluid from test.fl.
(3) fluid_write_code_utf8.patch: the patch against FLTK 1.3.4 (stable release).
This patch basically does three things:
- Fix reading character string bytes "unsigned", i.e. in range 0-255.
- Don't limit line length to avoid breaking lines inside UTF-8 char's.
- Write all ASCII and UTF-8 characters literally, i.e. without quoting.
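The first and third points can be sketched as follows. This is only an illustration under my own assumptions, not the contents of fluid_write_code_utf8.patch: the name `quote_utf8_literal` is hypothetical, and the input is assumed to be valid UTF-8.

```cpp
#include <cassert>  // for checking the examples
#include <cstdio>
#include <string>

// Hypothetical helper, NOT the posted patch.
// Writes printable ASCII and all UTF-8 bytes (128-255) literally and
// quotes only control characters, DEL, '"' and '\\'.
std::string quote_utf8_literal(const std::string &s) {
  std::string out;
  // Iterating as unsigned char keeps every byte in 0..255. With a plain
  // (possibly signed) char, bytes >= 128 could appear as negative values
  // and be mistaken for control characters by a `c < 32` test.
  for (unsigned char c : s) {
    if (c == '"' || c == '\\') {
      out += '\\';
      out += static_cast<char>(c);
    } else if (c < 32 || c == 127) {
      char buf[5];
      std::snprintf(buf, sizeof buf, "\\%03o", static_cast<unsigned>(c));
      out += buf;
    } else {
      out += static_cast<char>(c);  // ASCII 32..126 and UTF-8 bytes
    }
  }
  return out;
}
```

The generated source is then only correct if the compiler treats the file as UTF-8, which is the caveat stated above.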
You may use this patch if it works for you. Note that this is tested with the posted test cases, but I'm not sure if this will be okay for all users and compilers.
A "complete" solution would split strings (limit line length) w/o breaking inside UTF-8 characters and would presumably have an option to switch literal UTF-8 output on and off (on: literal/new vs. off: octal-quoted/old behavior).
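Splitting without breaking inside a UTF-8 character could look roughly like the sketch below. It relies on the fact that UTF-8 continuation bytes always have the bit pattern 10xxxxxx. The names `split_utf8` and `is_continuation` are hypothetical, the input is assumed to be valid UTF-8, and escape sequences (which must not be split either) are not handled here.

```cpp
#include <algorithm>
#include <cassert>  // for checking the examples
#include <cstddef>
#include <string>
#include <vector>

// True for UTF-8 continuation bytes (bit pattern 10xxxxxx).
static bool is_continuation(char c) {
  return (static_cast<unsigned char>(c) & 0xC0) == 0x80;
}

// Hypothetical helper: splits `s` into chunks of at most `max_bytes`
// bytes, moving each split point back to the start of a character.
std::vector<std::string> split_utf8(const std::string &s,
                                    std::size_t max_bytes) {
  std::vector<std::string> chunks;
  std::size_t start = 0;
  while (start < s.size()) {
    std::size_t end = std::min(start + max_bytes, s.size());
    // Back up while the split would land inside a multi-byte character.
    while (end < s.size() && end > start && is_continuation(s[end])) --end;
    if (end == start) {
      // max_bytes is smaller than this character: emit the whole
      // character anyway so the loop always makes progress.
      end = start + 1;
      while (end < s.size() && is_continuation(s[end])) ++end;
    }
    chunks.push_back(s.substr(start, end - start));
    start = end;
  }
  return chunks;
}
```

Each chunk could then be written as its own string literal on a separate line, since adjacent string literals are concatenated by the compiler.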
Note: the posted patch is for FLTK 1.3 and contains only the minimal changes. The complete solution should be in FLTK 1.4 with an option to switch formats as described above. | |
|
#3 | matt 12:29 Dec 17, 2021 |
| Fixed in Git repository. | |