path: root/charset
Commit message (author, date)
* Silly of me to overlook it: another obvious way you might like to specify characters to 'confuse' is to just put them on the command line in the system multibyte encoding! In a UTF-8 terminal environment this may very well be the easiest thing. [originally from svn r9584] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2012-07-19)
* A slightly silly new utility: 'confuse'. You provide it with some Unicode values (typically two of them), and it finds cases in which the provided characters are all encoded as the same thing in different charsets and prints those charsets. So if you encounter, for example, some piece of text which has U+0153 LATIN SMALL LIGATURE OE where you might have expected U+00A3 POUND SIGN, simply run 'confuse 153 a3' and it'll tell you which character sets the sender and receiver of the text might have got confused between. [originally from svn r9581] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2012-07-18)
* Mechanism for iterating over all supported charsets. [originally from svn r9580] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2012-07-18)
* Fix an integer-type mismatch between %04x in a printf format string and a long int. Spotted by Ubuntu 12.04's gcc, and probably would have caused trouble on 64-bit machines. [originally from svn r9489] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2012-05-03)
* Correct a comment. I had wrongly believed my TYPECHECK macro double-evaluated one of its arguments and hence would cause side effects to happen twice. But in fact I've just realised that although it double-_expands_ the argument, it doesn't double-_evaluate_ it: the two expansions occur in mutually exclusive branches of a ?:, and hence cannot both be executed. So I've removed the comment that says my macro is rubbish. My macro is in fact great :-) [originally from svn r9328] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2011-11-09)
* Merge PuTTY r9326, adding CP852 support. [originally from svn r9327] [r9326 == c72d4b413f024e3c50645caceaddbb65401fb06a in putty repository] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2011-10-14)
* I've just seen the MIME charset name 'x-sjis' in the wild. Add it to the list. [originally from svn r8498] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2009-04-17)
* ctype functions require their argument to be EOF or representable as an unsigned char. On platforms where char is signed, passing plain char won't cut it. Make sure we cast chars to unsigned char before passing them to tolower(). [originally from svn r8404] [this svn revision also touched charset,filter,timber] (Ben Harris, 2009-01-11)
* Oh, all right. Put in the implicit zero elements at the ends of the initialisers, so that gcc stops whining. [originally from svn r8311] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2008-11-21)
* I've just had some spam in Windows-874, a Thai SBCS. Add libcharset support for it. [originally from svn r8151] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2008-08-21)
* Just in case sbcsgen.pl is fed an sbcs.dat with the wrong line endings, remove \r from input lines. [originally from svn r8113] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2008-07-09)
* Add the ability to pass a NULL output buffer and/or an unlimited output length to charset_{to,from}_unicode, permitting convenient dry-running of conversions to determine the required output length and/or test for the presence of difficult characters. [originally from svn r7677] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2007-08-05)
* Add rule to compile emacsenc.c. Noticed by David Leonard. [originally from svn r7495] [this svn revision also touched charset,filter,timber] (Ben Harris, 2007-04-30)
* Add a mechanism for translating to and from the coding system symbols used by GNU Emacs. This is likely to be useful for generating or interpreting "coding:" entries in file local variables. [originally from svn r7455] [this svn revision also touched charset,filter,timber] (Ben Harris, 2007-04-09)
* I've apparently had this lying around for months but forgotten to commit it. Add `-i' option to cstable, which causes charset names to be output as CS_* constants where meaningful. (Doesn't apply to MBCS base charsets, because CS_* constants identify _encodings_.) [originally from svn r6728] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2006-06-13)
* Remove an outright lie I've just noticed in the comment at the top of this file! [originally from svn r6705] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2006-05-18)
* sbcsgen.pl was giving different results on different machines in the case where two SBCS code points mapped to a single Unicode point. Changed so that by default it favours the lower SBCS code point. On ixion, this highlighted ambiguities in CS_MAC_THAI, CS_MAC_SYMBOL, and CS_VISCII. Guessed at a preference for the first two and added "sortpriority" directives. (No idea about VISCII.) [originally from svn r6641] [this svn revision also touched charset,filter,putty,timber] (Jacob Nevins, 2006-04-26)
* CP866 is popular and small. Add it to both the general and PuTTY implementations of libcharset, since we've had at least one request for it in PuTTY. [originally from svn r6499] [this svn revision also touched charset,filter,putty,timber] (Jacob Nevins, 2005-12-18)
* Reinstate the DEPLANARISE macros, this time in what I believe is a genuinely portable form. (Thanks to IWJ for ideas.) While I'm here, add a couple of explicit `unsigned' casts and U suffixes to prevent more pedantic compilers from warning. [originally from svn r6463] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2005-11-15)
* Fix various compiler warnings and errors. In particular, my cunning auto-type-checking DEPLANARISE and REPLANARISE macros have turned out to only work in gcc, which is a shame. [originally from svn r6455] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2005-11-13)
* write_utf8() is used in iso2022.c as of r6378; declare it. (Fixes a warning in iso2022.c. There are lots more.) [originally from svn r6424] [r6378 == 41e50e9f2e3e67da805c5d9037cc650f363e5279] [this svn revision also touched charset,filter,timber] (Jacob Nevins, 2005-10-23)
* Working ISO 2022 output function. Outputs full ISO 2022 (not sure what that's useful for but it seemed a pity not to do it) and compound text. I've completely removed the compound text implementation from iso2022s.c in favour of using the more flexible iso2022.c, meaning we can cope with nastiness such as DOCS. This is largely untested: I've checked it on small examples as I went along, but it lacks anything resembling a proper test suite. [originally from svn r6378] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2005-10-07)
* PostScript StandardEncoding might occasionally come in handy. While I'm here, I've updated the URL to the Adobe Glyph List. [originally from svn r6376] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2005-10-06)
* Correct the URL of the X Registry to the one given in the Registry, which works (unlike our old one). [originally from svn r6358] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-26)
* Never loop up to _and including_ lenof(array). [originally from svn r6357] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2005-09-26)
* Correct copy and paste error. [originally from svn r6354] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2005-09-24)
* EUC-TW implementation, plus an explanation of why ISO-2022-CN is difficult. [originally from svn r6353] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-24)
* Space-saving restructure of the CNS 11643 data tables. Reduces the RO data size in cns11643.o from 400k to 240k. Relies on there being at most seven planes (7*94*94 <= 64k) and on the character set not encoding any Unicode code point above U+40000; if either of these becomes untrue later on we can always fall back to the previous approach, or to somewhere between that and here. The new version passes all the same tests as the old one did, and generates the same output under the new `cstable -v'. I'm confident that I haven't broken it. [originally from svn r6351] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2005-09-24)
* Fix a couple of warnings. [originally from svn r6350] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-24)
* Introduce the -v flag which outputs the actual index of each code point in every charset. [originally from svn r6349] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2005-09-24)
* Add support for CNS 11643. [originally from svn r6348] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-24)
* Include CNS 11643 in the cstable diagnostic utility. [originally from svn r6347] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2005-09-24)
* IRG source T3 includes not only plane 3 of CNS 11643, but also "some additional characters". We now filter out the latter from our mapping table. [originally from svn r6345] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-24)
* CNS 11643 goes above the BMP, so the test code should take that into account when checking the reverse mapping for every potentially relevant Unicode character. [originally from svn r6343] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2005-09-24)
* Add a mapping table for CNS 11643-1992. It's a bit big, and nothing uses it yet. [originally from svn r6342] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-24)
* Support for the ESC $ ( 0 and ESC $ ( 1 sets that Emacs uses to embed Big5 in COMPOUND_TEXT. Emacs does lots of other rude things to COMPOUND_TEXT, but this one is supported by XLib as well. [originally from svn r6336] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-21)
* Add support for COMPOUND_TEXT extended segments encoding ISO 8859-14, ISO 8859-15, and BIG5. [originally from svn r6335] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-21)
* Add two new SBCSes: BS 4730 (alias UK-ASCII) and DEC graphics (alias VT100 line-drawing). I think this means that libcharset supports all the character sets that PuTTY supports, which is nice. [originally from svn r6330] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-18)
* When documenting s0 and s1, get them the right way around. [originally from svn r6329] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-18)
* 1: Better documentation of how read_iso2022() stores its state. 2: Minimal write_iso2022(): it can't encode anything, but promises not to segfault. [originally from svn r6328] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-18)
* Ben points out that ESC ( J in ISO-2022-JP should encode the _bottom_ half of JIS X 0201 (the one that's almost identical to ASCII, equivalent to the bottom half of Shift-JIS), not the top half. [originally from svn r6327] [this svn revision also touched charset,filter,timber] (Simon Tatham, 2005-09-18)
* Make read_utf8(), like read_sbcs(), accessible to the rest of the library, so it can be used directly in iso2022.c. [originally from svn r6326] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-18)
* Undo another change that leaked through with the ISO-2022 commit. [originally from svn r6325] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-18)
* Update comment to reflect state of DOCS support. [originally from svn r6324] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-18)
* Undo accidental change in previous commit. [originally from svn r6323] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-18)
* Support for using DOCS to switch to and from UTF-8 mode. [originally from svn r6321] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-17)
* Reasonably complete ISO 2022 support. Huge and hairy, but it seems to largely work. It might even be useful for something. [originally from svn r6320] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-17)
* Use standard "WILD" markers for unregistered Big 5 aliases. [originally from svn r6319] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-17)
* Fix stupid typo. [originally from svn r6318] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-17)
* Names for ASCII and JIS X 0201 that appear both in the X registry and in the usual X fonts. [originally from svn r6317] [this svn revision also touched charset,filter,timber] (Ben Harris, 2005-09-17)
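
Note on r9489 (printf format mismatch): the kind of bug that entry describes can be sketched as below. This is an illustration only, not the code from the commit; the helper name `format_codepoint` is hypothetical.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not part of libcharset): format a code point
 * held in a long as at least four lower-case hex digits.
 */
static int format_codepoint(char *buf, size_t len, long cp)
{
    /*
     * Passing a long where "%04x" expects an unsigned int is undefined
     * behaviour, and is most likely to misbehave where long is wider
     * than int (typical 64-bit Unix). The 'l' length modifier plus a
     * matching unsigned long argument keeps format and type in step.
     */
    return snprintf(buf, len, "%04lx", (unsigned long)cp);
}
```

Calling `format_codepoint(buf, sizeof buf, 0x153L)` writes `0153` into `buf`.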
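
Note on r9328 (double expansion versus double evaluation): the distinction drawn in that entry can be shown with a toy macro. The real TYPECHECK macro is not reproduced in this log, so `EITHER_WAY`, `next_value` and `count_evaluations` below are hypothetical names used only for illustration.

```c
/*
 * Hypothetical macro: `expr` is expanded twice in the source text, but
 * the two copies sit in mutually exclusive branches of ?:, so at most
 * one of them is ever evaluated.
 */
#define EITHER_WAY(cond, expr) ((cond) ? (expr) : (expr))

static int calls = 0;

/* A function with a visible side effect: it counts its own calls. */
static int next_value(void)
{
    return ++calls;
}

/* Returns how many times next_value() actually ran for a given cond. */
static int count_evaluations(int cond)
{
    calls = 0;
    (void)EITHER_WAY(cond, next_value());
    return calls;
}
```

Whichever branch is taken, `count_evaluations` reports a single call, so the side effect happens once.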
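
Note on r8404 (ctype arguments): a minimal sketch of the cast that entry calls for, using a made-up case-insensitive comparison. The function name `names_equal_nocase` is an assumption, not a libcharset function.

```c
#include <ctype.h>

/*
 * Hypothetical helper: compare two charset names, ignoring case.
 * tolower() requires an argument that is EOF or representable as
 * unsigned char, so each plain char is cast before the call. Without
 * the cast, a byte such as 0xE9 becomes negative where char is signed.
 */
static int names_equal_nocase(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    /* Equal only if both strings ended at the same point. */
    return *a == *b;
}
```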
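
Note on r6357 (loop bounds): the rule in that entry, sketched with an assumed definition of `lenof` as the usual element-count macro. The table contents and the function `sum_table` are invented for the example.

```c
#include <stddef.h>

/* Assumed definition: number of elements in a true array. */
#define lenof(x) (sizeof(x) / sizeof(*(x)))

/* Made-up data for the example. */
static const int table[] = { 3, 1, 4, 1, 5 };

static int sum_table(void)
{
    int total = 0;
    size_t i;

    /*
     * Valid indices are 0 .. lenof(table)-1, so the test must be '<'.
     * Writing 'i <= lenof(table)' would read one element past the end.
     */
    for (i = 0; i < lenof(table); i++)
        total += table[i];
    return total;
}
```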
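
Note on r6351 (CNS 11643 table size): the arithmetic behind "7*94*94 <= 64k" can be checked with a small sketch. The log does not say how the tables are actually indexed, so the flattening below and the name `cns_flat_index` are assumptions that only illustrate why seven 94x94 planes fit in a 16-bit index.

```c
#include <stdint.h>

/*
 * Hypothetical flattening of (plane, row, column) into one index,
 * with plane in 0..6 and row and column in 0..93. There are
 * 7 * 94 * 94 = 61852 positions, which is below 65536, so every
 * index fits in a uint16_t.
 */
static uint16_t cns_flat_index(unsigned plane, unsigned row, unsigned col)
{
    return (uint16_t)((plane * 94u + row) * 94u + col);
}
```

The largest index, for plane 6, row 93, column 93, is 61851. An eighth plane would push the count to 70688 and overflow 16 bits, which is why the entry flags the seven-plane limit as something the restructure relies on.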