path: root/charset
Commit message [Author, Age]
* Update .gitignore. [Simon Tatham, 2017-05-24]
  Also updates libcharset to the latest revision, which updates _its_ .gitignore (and pulls in other previous fixes too).
* Replace Halibut's makefiles with autotools. [Simon Tatham, 2017-05-20]
  This commit updates the libcharset submodule to incorporate the autotools-ification that I just pushed to that subproject, and builds on it by replacing Halibut's own makefile system similarly with an autotools setup.

  The new Makefile.am incorporates both of the old Makefile and doc/Makefile, so a single run of 'make' should now build Halibut itself and all the formats of its own documentation, which also means that the automake-generated 'make install' target can do the right thing in terms of putting an appropriate subset of those documentation formats in the assorted installation directories.

  The old Makefiles are gone, as is release.sh (which is now obsolete because autotools's 'make dist' doesn't do anything obviously wrong).

  The bob build script is comprehensively rewritten, but should still work - even the clang-based Windows build can use the autotools-generated makefile system, provided I do the libcharset build with a manual override of bin_PROGRAMS to prevent it trying to build the libcharset supporting utilities (which are not completely Windows-portable).
* Update to the latest libcharset commit. [Simon Tatham, 2017-05-15]
  Most of the changes since Halibut's last update have been to libcharset's supporting utility collection or have added extra API functions that Halibut doesn't need, but one actually relevant thing that this change brings in is the expanded set of easy-to-type character set encoding names, so that for example you can now say -Ctext-charset:mac-roman where you would previously have had to put a space in the middle of 'Mac Roman' and faff about with quoting on the shell command line.
* Pull in libcharset update with parallel-build fixes. [Simon Tatham, 2016-03-16]
* Silly of me to overlook it: another obvious way you might like to [Simon Tatham, 2012-07-19]
  specify characters to 'confuse' is to just put them on the command line in the system multibyte encoding! In a UTF-8 terminal environment this may very well be the easiest thing. [originally from svn r9584] [this svn revision also touched charset,filter,timber]
* A slightly silly new utility: 'confuse'. You provide it with some [Simon Tatham, 2012-07-18]
  Unicode values (typically two of them), and it finds cases in which the provided characters are all encoded as the same thing in different charsets and prints those charsets. So if you encounter, for example, some piece of text which has U+0153 LATIN SMALL LIGATURE OE where you might have expected U+00A3 POUND SIGN, simply run 'confuse 153 a3' and it'll tell you which character sets the sender and receiver of the text might have got confused between. [originally from svn r9581] [this svn revision also touched charset,filter,timber]
* Mechanism for iterating over all supported charsets. [Simon Tatham, 2012-07-18]
  [originally from svn r9580] [this svn revision also touched charset,filter,timber]
* Fix an integer-type mismatch between %04x in a printf format string [Simon Tatham, 2012-05-03]
  and a long int. Spotted by Ubuntu 12.04's gcc, and probably would have caused trouble on 64-bit machines. [originally from svn r9489] [this svn revision also touched charset,filter,timber]
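The kind of mismatch described in this entry can be sketched as follows. The helper name `format_codepoint` is hypothetical and does not come from libcharset; it only shows a `long` value being paired with the `l` length modifier instead of plain `%04x`.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper (not from libcharset): formats a value held in a
 * long as at least four hex digits. %04x expects an unsigned int, so
 * passing a long is a type mismatch; %04lx with an unsigned long
 * argument matches the type on both 32-bit and 64-bit platforms. */
int format_codepoint(char *buf, size_t len, long value)
{
    return snprintf(buf, len, "%04lx", (unsigned long)value);
}
```

Calling `format_codepoint(buf, sizeof buf, 0x153L)` writes the four digits `0153` into `buf`.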
* Correct a comment. [Simon Tatham, 2011-11-09]
  I had wrongly believed my TYPECHECK macro double-evaluated one of its arguments and hence would cause side effects to happen twice. But in fact I've just realised that although it double-_expands_ the argument, it doesn't double-_evaluate_ it: the two expansions occur in mutually exclusive branches of a ?:, and hence cannot both be executed. So I've removed the comment that says my macro is rubbish. My macro is in fact great :-) [originally from svn r9328] [this svn revision also touched charset,filter,timber]
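The distinction between double expansion and double evaluation can be shown with a small sketch. The macro `EVAL_ONCE_DEMO` below is a made-up stand-in written for illustration; the real TYPECHECK macro is not quoted in this log and may look quite different.

```c
/* Hypothetical macro in the spirit of the description (the real
 * TYPECHECK may differ): `expr` is expanded twice, but the two copies
 * sit in mutually exclusive arms of ?:, so only one is ever evaluated. */
#define EVAL_ONCE_DEMO(cond, expr) ((cond) ? (expr) : (expr))

/* Returns how many times the side effect ran, whichever arm is taken. */
int count_evaluations(int cond)
{
    int counter = 0;
    (void)EVAL_ONCE_DEMO(cond, counter++);
    return counter;
}
```

Under these assumptions `count_evaluations` returns 1 for any value of `cond`, because exactly one arm of the conditional operator is executed.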
* Merge PuTTY r9326, adding CP852 support. [Simon Tatham, 2011-10-14]
  [originally from svn r9327] [r9326 == c72d4b413f024e3c50645caceaddbb65401fb06a in putty repository] [this svn revision also touched charset,filter,timber]
* I've just seen the MIME charset name 'x-sjis' in the wild. Add it to [Simon Tatham, 2009-04-17]
  the list. [originally from svn r8498] [this svn revision also touched charset,filter,timber]
* ctype functions require their argument to be EOF or representable as an [Ben Harris, 2009-01-11]
  unsigned char. On platforms where char is signed, passing plain char won't cut it. Make sure we cast chars to unsigned char before passing them to tolower(). [originally from svn r8404] [this svn revision also touched charset,filter,timber]
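A minimal sketch of the cast this entry describes is given below. The function `lowercase_in_place` is a hypothetical example, not code from libcharset.

```c
#include <ctype.h>

/* Hypothetical helper: lowercases a NUL-terminated string in place.
 * The cast to unsigned char keeps the argument to tolower() within the
 * range the C standard requires, even where plain char is signed and a
 * byte such as 0xE9 would otherwise be passed as a negative int. */
void lowercase_in_place(char *s)
{
    for (; *s; s++)
        *s = (char)tolower((unsigned char)*s);
}
```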
* Oh, all right. Put in the implicit zero elements at the ends of the [Simon Tatham, 2008-11-21]
  initialisers, so that gcc stops whining. [originally from svn r8311] [this svn revision also touched charset,filter,timber]
* I've just had some spam in Windows-874, a Thai SBCS. Add libcharset [Simon Tatham, 2008-08-21]
  support for it. [originally from svn r8151] [this svn revision also touched charset,filter,timber]
* Just in case sbcsgen.pl is fed an sbcs.dat with the wrong line [Simon Tatham, 2008-07-09]
  endings, remove \r from input lines. [originally from svn r8113] [this svn revision also touched charset,filter,timber]
* Add the ability to pass a NULL output buffer and/or an unlimited [Simon Tatham, 2007-08-05]
  output length to charset_{to,from}_unicode, permitting convenient dry-running of conversions to determine the required output length and/or test for the presence of difficult characters. [originally from svn r7677] [this svn revision also touched charset,filter,timber]
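The dry-run idea can be illustrated with a toy converter. This is a sketch under loud assumptions: `toy_utf8_encode` is invented for this example, handles only code points below U+0800, and does not reproduce the signature or behaviour of charset_to_unicode or charset_from_unicode.

```c
#include <stddef.h>

/* Toy stand-in, NOT libcharset's API: encodes code points below U+0800
 * as UTF-8. If `out` is NULL nothing is stored, but the return value
 * still counts the bytes that would be needed, which is the dry run. */
size_t toy_utf8_encode(const unsigned int *in, size_t n, char *out)
{
    size_t used = 0, i;
    for (i = 0; i < n; i++) {
        if (in[i] < 0x80) {
            if (out)
                out[used] = (char)in[i];
            used += 1;
        } else {
            if (out) {
                out[used] = (char)(0xC0 | (in[i] >> 6));
                out[used + 1] = (char)(0x80 | (in[i] & 0x3F));
            }
            used += 2;
        }
    }
    return used;
}
```

A caller would first pass NULL to learn the required length, allocate a buffer of that size, and then call the function again with the real buffer.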
* Add rule to compile emacsenc.c. Noticed by David Leonard. [Ben Harris, 2007-04-30]
  [originally from svn r7495] [this svn revision also touched charset,filter,timber]
* Add a mechanism for translating to and from the coding system symbols [Ben Harris, 2007-04-09]
  used by GNU Emacs. This is likely to be useful for generating or interpreting "coding:" entries in file local variables. [originally from svn r7455] [this svn revision also touched charset,filter,timber]
* I've apparently had this lying around for months but forgotten to [Simon Tatham, 2006-06-13]
  commit it. Add `-i' option to cstable, which causes charset names to be output as CS_* constants where meaningful. (Doesn't apply to MBCS base charsets, because CS_* constants identify _encodings_.) [originally from svn r6728] [this svn revision also touched charset,filter,timber]
* Remove an outright lie I've just noticed in the comment at the top [Simon Tatham, 2006-05-18]
  of this file! [originally from svn r6705] [this svn revision also touched charset,filter,timber]
* sbcsgen.pl was giving different results on different machines in the case [Jacob Nevins, 2006-04-26]
  where two SBCS code points mapped to a single Unicode point. Changed so that by default it favours the lower SBCS code point. On ixion, this highlighted ambiguities in CS_MAC_THAI, CS_MAC_SYMBOL, and CS_VISCII. Guessed at a preference for the first two and added "sortpriority" directives. (No idea about VISCII.) [originally from svn r6641] [this svn revision also touched charset,filter,putty,timber]
* CP866 is popular and small. Add it to both the general and PuTTY [Jacob Nevins, 2005-12-18]
  implementations of libcharset, since we've had at least one request for it in PuTTY. [originally from svn r6499] [this svn revision also touched charset,filter,putty,timber]
* Reinstate the DEPLANARISE macros, this time in what I believe is a [Simon Tatham, 2005-11-15]
  genuinely portable form. (Thanks to IWJ for ideas.) While I'm here, add a couple of explicit `unsigned' casts and U suffixes to prevent more pedantic compilers from warning. [originally from svn r6463] [this svn revision also touched charset,filter,timber]
* Fix various compiler warnings and errors. In particular, my cunning [Simon Tatham, 2005-11-13]
  auto-type-checking DEPLANARISE and REPLANARISE macros have turned out to only work in gcc, which is a shame. [originally from svn r6455] [this svn revision also touched charset,filter,timber]
* write_utf8() is used in iso2022.c as of r6378; declare it. [Jacob Nevins, 2005-10-23]
  (Fixes a warning in iso2022.c. There are lots more.) [originally from svn r6424] [r6378 == 41e50e9f2e3e67da805c5d9037cc650f363e5279] [this svn revision also touched charset,filter,timber]
* Working ISO 2022 output function. Outputs full ISO 2022 (not sure [Simon Tatham, 2005-10-07]
  what that's useful for but it seemed a pity not to do it) and compound text. I've completely removed the compound text implementation from iso2022s.c in favour of using the more flexible iso2022.c, meaning we can cope with nastiness such as DOCS. This is largely untested: I've checked it on small examples as I went along, but it lacks anything resembling a proper test suite. [originally from svn r6378] [this svn revision also touched charset,filter,timber]
* PostScript StandardEncoding might occasionally come in handy. While [Simon Tatham, 2005-10-06]
  I'm here, I've updated the URL to the Adobe Glyph List. [originally from svn r6376] [this svn revision also touched charset,filter,timber]
* Correct the URL of the X Registry to the one given in the Registry, which [Ben Harris, 2005-09-26]
  works (unlike our old one). [originally from svn r6358] [this svn revision also touched charset,filter,timber]
* Never loop up to _and including_ lenof(array). [Simon Tatham, 2005-09-26]
  [originally from svn r6357] [this svn revision also touched charset,filter,timber]
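The off-by-one this entry warns about can be sketched as follows. The `lenof` definition here is the common element-count idiom and is an assumption; the project's own definition is not shown in this log, and `table` and `sum_table` are invented for the example.

```c
#include <stddef.h>

/* Assumed definition: the usual element-count idiom, not quoted from
 * the source. */
#define lenof(a) (sizeof(a) / sizeof((a)[0]))

static const int table[] = { 10, 20, 30 };

/* Sums every element. The loop condition is i < lenof(table); writing
 * i <= lenof(table) would read table[3], one element past the end. */
int sum_table(void)
{
    int total = 0;
    size_t i;
    for (i = 0; i < lenof(table); i++)
        total += table[i];
    return total;
}
```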
* Correct copy and paste error. [Simon Tatham, 2005-09-24]
  [originally from svn r6354] [this svn revision also touched charset,filter,timber]
* EUC-TW implementation, plus an explanation of why ISO-2022-CN is difficult. [Ben Harris, 2005-09-24]
  [originally from svn r6353] [this svn revision also touched charset,filter,timber]
* Space-saving restructure of the CNS 11643 data tables. Reduces the [Simon Tatham, 2005-09-24]
  RO data size in cns11643.o from 400k to 240k. Relies on there being at most seven planes (7*94*94 <= 64k) and on the character set not encoding any Unicode code point above U+40000; if either of these becomes untrue later on we can always fall back to the previous approach, or to somewhere between that and here. The new version passes all the same tests as the old one did, and generates the same output under the new `cstable -v'. I'm confident that I haven't broken it. [originally from svn r6351] [this svn revision also touched charset,filter,timber]
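The arithmetic behind the seven-plane limit can be checked with a short sketch. The names and the flattening order below are assumptions made for illustration and do not describe the actual layout of the cns11643 tables.

```c
/* Illustrative only: these names and this layout are assumptions, not
 * the real cns11643 table format. */
enum { PLANES = 7, ROWS = 94, COLS = 94 };

/* Flattens zero-based (plane, row, col) into a single index. With
 * seven planes the largest result is 7*94*94 - 1 = 61851, which fits
 * in 16 bits because 7*94*94 = 61852 <= 65536. */
unsigned int cns_index(unsigned int plane, unsigned int row, unsigned int col)
{
    return (plane * ROWS + row) * COLS + col;
}
```

An eighth plane would push the maximum index to 8*94*94 - 1 = 70687, which no longer fits in 16 bits, which is why the entry notes the dependence on the plane count.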
* Fix a couple of warnings. [Ben Harris, 2005-09-24]
  [originally from svn r6350] [this svn revision also touched charset,filter,timber]
* Introduce the -v flag which outputs the actual index of each code [Simon Tatham, 2005-09-24]
  point in every charset. [originally from svn r6349] [this svn revision also touched charset,filter,timber]
* Add support for CNS 11643. [Ben Harris, 2005-09-24]
  [originally from svn r6348] [this svn revision also touched charset,filter,timber]
* Include CNS 11643 in the cstable diagnostic utility. [Simon Tatham, 2005-09-24]
  [originally from svn r6347] [this svn revision also touched charset,filter,timber]
* IRG source T3 includes not only plane 3 of CNS 11643, but also "some additional [Ben Harris, 2005-09-24]
  characters". We now filter out the latter from our mapping table. [originally from svn r6345] [this svn revision also touched charset,filter,timber]
* CNS 11643 goes above the BMP, so the test code should take that into [Simon Tatham, 2005-09-24]
  account when checking the reverse mapping for every potentially relevant Unicode character. [originally from svn r6343] [this svn revision also touched charset,filter,timber]
* Add a mapping table for CNS 11643-1992. It's a bit big, and nothing [Ben Harris, 2005-09-24]
  uses it yet. [originally from svn r6342] [this svn revision also touched charset,filter,timber]
* Support for the ESC $ ( 0 and ESC $ ( 1 sets that Emacs uses to embed [Ben Harris, 2005-09-21]
  Big5 in COMPOUND_TEXT. Emacs does lots of other rude things to COMPOUND_TEXT, but this one is supported by Xlib as well. [originally from svn r6336] [this svn revision also touched charset,filter,timber]
* Add support for COMPOUND_TEXT extended segments encoding ISO 8859-14, [Ben Harris, 2005-09-21]
  ISO 8859-15, and BIG5. [originally from svn r6335] [this svn revision also touched charset,filter,timber]
* Add two new SBCSes: BS 4730 (alias UK-ASCII) and DEC graphics (alias VT100 [Ben Harris, 2005-09-18]
  line-drawing). I think this means that libcharset supports all the character sets that PuTTY supports, which is nice. [originally from svn r6330] [this svn revision also touched charset,filter,timber]
* When documenting s0 and s1, get them the right way around. [Ben Harris, 2005-09-18]
  [originally from svn r6329] [this svn revision also touched charset,filter,timber]
* 1: Better documentation of how read_iso2022() stores its state. [Ben Harris, 2005-09-18]
  2: Minimal write_iso2022(): it can't encode anything, but promises not to segfault. [originally from svn r6328] [this svn revision also touched charset,filter,timber]
* Ben points out that ESC ( J in ISO-2022-JP should encode the [Simon Tatham, 2005-09-18]
  _bottom_ half of JIS X 0201 (the one that's almost identical to ASCII, equivalent to the bottom half of Shift-JIS), not the top half. [originally from svn r6327] [this svn revision also touched charset,filter,timber]
* Make read_utf8(), like read_sbcs(), accessible to the rest of the library, [Ben Harris, 2005-09-18]
  so it can be used directly in iso2022.c. [originally from svn r6326] [this svn revision also touched charset,filter,timber]
* Undo another change that leaked through with the ISO-2022 commit. [Ben Harris, 2005-09-18]
  [originally from svn r6325] [this svn revision also touched charset,filter,timber]
* Update comment to reflect state of DOCS support. [Ben Harris, 2005-09-18]
  [originally from svn r6324] [this svn revision also touched charset,filter,timber]
* Undo accidental change in previous commit. [Ben Harris, 2005-09-18]
  [originally from svn r6323] [this svn revision also touched charset,filter,timber]
* Support for using DOCS to switch to and from UTF-8 mode. [Ben Harris, 2005-09-17]
  [originally from svn r6321] [this svn revision also touched charset,filter,timber]