halibut - My halibut tree

	Commit message	Author	Age
*	New utility function rdaddc_rep().	Simon Tatham	2017-05-30
	This is a function I should have introduced a lot earlier while writing the CHM output code, because I ended up with quite a lot of annoying loops to add zero-padding of various sizes by going round and round on the one-byte rdaddc().
*	New output mode to write CHM files directly.	Simon Tatham	2017-05-13
	I became aware a few months ago that enough is known about CHM files that free software _can_ write them without benefit of the MS HTML Help compiler - in particular there's a thing called 'chmcmd' in the Free Pascal Compiler software distribution which is more or less a drop-in replacement for hhc.exe itself. But although depending on chmcmd would be a bit nicer than depending on hhc.exe, Halibut has always preferred to do the whole job itself if it can. So here's my own from-scratch code to generate CHM directly from Halibut source.

	The new output mode is presented as a completely separate top-level thing independent of HTML mode. Of course, in reality, the two back ends share all of the HTML-generation code, differing only in a few configuration defaults and the minor detail of what will be _done_ with each chunk of HTML as it's generated (this is what the recent refactoring in b3db1cce3 was in aid of). But even so, the output modes are properly independent from a user-visible-behaviour perspective: they use parallel sets of config directives rather than sharing the same ones (you can set \cfg{html-foo} and \cfg{chm-foo} independently, for a great many values of 'foo'), and you can run either or neither or both as you choose in a given run of Halibut.

	The old HTML Help support, in the form of some config directives for HTML mode to output the auxiliary files needed by hhc.exe, is still around and should still work the same as it always did. I have no real intention of removing it, partly for the reasons stated in the manual (someone might find it useful to have Halibut generate the .HHP file once and then make manual adjustments to it, so that they can change styling options that the direct CHM output doesn't permit), and mostly because it wouldn't save a great deal of code or complexity in any case - the big two of the three auxiliary files (the HHC and HHK) have to be generated _anyway_ to go inside the .CHM, so all the code would have to stay around regardless.
*	Add \s for 'strong' text, i.e. bold rather than italics. I've missed	Simon Tatham	2013-03-10
	this a couple of times in Halibut markup recently (in particular, it's handy to have a typographical distinction between 'this term is emphasised because it's new' and 'this term is emphasised because I want you to pay attention to it'), so here's an implementation, basically parallel to \e. One slight oddity is that strong text in headings will not be distinguished in some output formats, since they already use bolded text for their headings. [originally from svn r9772]
*	Revamp of the Halibut error handling mechanism.	Simon Tatham	2012-08-29
	I'm not quite sure why I ever thought it was a good idea to have a central variadic error() function taking an integer error code followed by some list of arguments that depend on that code. It now seems obvious to me that it's a much more sensible idea to have a separate function per error, so that we can check at compile time that the arguments to each error call are of the right number and type! So I've done that instead. A side effect is that the errors are no longer formatted into a fixed-size buffer before going to stderr, so I can remove all the %.200s precautions in the format strings. [originally from svn r9639]
*	Enable Halibut to read a .but file from standard input, by supplying	Simon Tatham	2009-10-24
	the special filename '-'. [originally from svn r8728]
*	Cope with TrueType fonts with slightly broken cmaps, just ignoring code points	Ben Harris	2007-12-02
	that can't be resolved (apart from warning about it). [originally from svn r7800]
*	Add a --list-fonts option, since getting PostScript names out of TrueType	Ben Harris	2007-02-13
	fonts is difficult. [originally from svn r7281]
*	Improved error handling in sfnt support. No more calls to abort()!	Ben Harris	2007-02-11
	[originally from svn r7269]
*	Add a "const" to the argument of dupstr() so that GCC doesn't complain when	Ben Harris	2007-02-04
	we pass it a constant string. [originally from svn r7218]
*	Add support for using TrueType fonts, including embedding in PostScript but	Ben Harris	2007-02-03
	not yet in PDF. There's a lot of cleaning up to be done, especially in the area of error, but I think it would be better committed gradually. [originally from svn r7198]
*	Add support for PFB files. This seems to have caused me to completely	Ben Harris	2007-01-27
	rewrite the Type 1 font support, and I'm sure the result is more complex than it needs to be, but it seems to work correctly, so I shouldn't complain. [originally from svn r7175]
*	Fix the behaviour of constructions like \e{index \i{term}} -- the index tag	Jacob Nevins	2007-01-01
	was causing emphasis to be broken (particularly noticeable in text-like backends). There are some instances of this in the PuTTY manual, for instance. Unfortunately, \k references in similar situations still aren't quite right, but fixing that will be more involved, and I haven't found any instances yet. [originally from svn r7049]
*	Support for the MS HTML Help system in the HTML back end. As yet I	Simon Tatham	2006-12-11
	don't know how to write out a .CHM directly, but I am at least able to have the HTML back end write out the three auxiliary files which enable a .CHM to be generated using the MS HTML Help compiler. [originally from svn r6991]
*	Correct embedding of Type 1 fonts in PDF. Error cases (e.g. invalid Type 1	Ben Harris	2006-12-09
	fonts) may not be well handled, and may emit invalid PDF. [originally from svn r6974]
*	Add support for compressed PDF streams, using Simon's new deflate library.	Ben Harris	2006-11-30
	[originally from svn r6931]
*	Fairly ropey font-embedding support. In particular, the PDF output is	Ben Harris	2006-05-14
	technically incorrect, though it works perfectly well with xpdf. To do it properly requires actually parsing the unencrypted part of a Type 1 font, which will be a bit tedious in C. [originally from svn r6685]
*	Initial support for adding fonts at run-time. Currently we only support	Ben Harris	2006-05-13
	loading AFM files, we recognise them by name, and we can't embed fonts in the output (which is also invalid, though accepted by xpdf, in the PDF case). Oh, and there's no documentation. Still, it's a start. [originally from svn r6681]
*	Add font-selection mechanism to the paper backend. Since we have no way to	Ben Harris	2006-05-08
	load font metrics dynamically, we're restricted to the fonts whose metrics are compiled into Halibut. Font structures aren't reused when the same font is specified twice, nor are unused fonts removed from the output. Finally, the default configuration overflows lines in the manual, but this would need a change to Halibut's grammar to fix. Still, what's there works. [originally from svn r6667]
*	Just to be on the safe side about avoiding other portability hazards	Simon Tatham	2005-11-13
	in future, add `-ansi -pedantic' to the Halibut default compile options and fix the few resulting warnings (mostly signed/unsigned char mismatches and commas at the ends of enums). The one remaining warning I'm still seeing is `missing initializer' for the big table in charset/iso2022.c, but I think the code genuinely is more readable this way, and I haven't found a gcc option to disable that specific warning. [originally from svn r6458]
*	`version' needs to be declared `extern'.	Simon Tatham	2005-11-13
	[originally from svn r6457]
*	`style.c' appears to have been around since 1999 and never had	Simon Tatham	2005-11-12
	anything in it! In its current form it presents the portability hazards of an empty structure and an empty source file. Therefore, I'm removing it; if I ever have a clear idea of what a user style mechanism ought to look like, it might make a reappearance, but don't hold your breath. [originally from svn r6453]
*	Remove the error message `no text found in paragraph'. Aaron Brown	Simon Tatham	2005-09-29
	points out that it's perfectly possible to generate an empty paragraph using legal Halibut syntax: a paragraph containing nothing but a \#{...} comment will do the job, and is quite likely to happen if you've commented out a load of Halibut code. Therefore, an empty paragraph is now silently ignored rather than being an error condition in itself; if you create an empty paragraph due to it containing an unrecognised directive, then you'll get an error for _that_ and only that. [originally from svn r6361]
*	input.c was capable of generating a paragraph structure with no text	Simon Tatham	2005-04-12
	in it, if the input paragraph contained (say) an unrecognised control command and nothing else. Such paragraphs can confuse back ends later on, so input.c should refrain from generating them. Added a check and a polite error message (just in case the user manages to generate an empty paragraph using otherwise legal syntax). [originally from svn r5629]
*	Ability to specify multiple arguments to \cfg{html-template-fragment};	Jacob Nevins	2005-03-08
	Halibut will output fragment names in all specified formats. (I forget now precisely why I thought this was necessary, but it seems potentially useful.) Also ensure that legal fragment names are generated even if none of the characters from the original turn out to be legal (e.g., %k with an entirely numeric keyword), and correct an untruth I inserted in the documentation of this. (This commit hits more than just the HTML backend as I've generalised an error message, and fixed a fault in the info backend's error handling while there.) [originally from svn r5457]
*	Add a `--list-charsets' option to Halibut to enumerate canonical names of known	Jacob Nevins	2005-02-18
	character sets. (Also make libcharset `return_in_enum' values saner.) [originally from svn r5341] [this svn revision also touched charset,filter,timber]
*	Changes/additions to input character set handling:	Jacob Nevins	2005-02-17
	- After discussion with Simon, change the default input charset back to ASCII, rather than trying to work it out from the locale, for the sake of promoting .but file portability.
	- Add a new command-line option "--input-charset=csname", which overrides the ASCII default for all input files (since there's no other way to use a non-ASCII-compatible input file).
	- Output a warning if -Cinput-charset:foo is specified that it has no effect.
	- Update the docs to match all this. Also try to clarify some other things in this area that caught me out.
	[originally from svn r5332]
*	Sort out error handling everywhere a charset name is converted into	Simon Tatham	2004-06-27
	an integer charset ID. [originally from svn r4317]
*	Introduce a configurable option to select the HTML flavour. Also	Simon Tatham	2004-06-20
	fiddle with various small aspects of the output so that it actually validates in all supported flavours. [originally from svn r4307]
*	The Halibut manual contained at least one instance of two index	Simon Tatham	2004-06-13
	terms (intentionally) differing only in case, which were being silently folded into one by the case-insensitive index tag comparison. Halibut now warns in this situation (but then folds them anyway, which I think is better than silently generating an index containing many case-distinct forms of the same word - I imagine it's very easy to do that by mistake). The manual has been fixed to explicitly define distinct keywords (in the case I spotted and in five other cases picked up by the new warning!), and also documents this issue and how to work with it. [originally from svn r4279]
*	Enforce that \q may not be used anywhere within \c. It shouldn't be	Simon Tatham	2004-06-12
	necessary since the whole point of \c should be that the user wants to exercise exact control over the glyphs used, and forbidding it has the useful effect of relieving some backends of having to make difficult decisions: it means the text backend doesn't have to nest two pairs of identical quotes, and the paper backends don't have to downgrade their quote characters if (as is perfectly plausible) the fixed-pitch font doesn't support the same range as the body text fonts. [originally from svn r4277]
*	Switch the memory allocation macros from the Halibut ones	Simon Tatham	2004-06-12
	(mknew/mknewa/resize) to the PuTTY ones (snew/snewn/sresize). snewn and mknewa have their arguments opposite ways round; this may make the change initially painful but in the long term will free me of a nasty context switch every time I move between codebases. Also sresize takes an explicit type operand which is used to cast the return value from realloc, thus enforcing that it must be correct, and arranging that if anyone tries to compile Halibut with a C++ compiler there should be a lot less pain. [originally from svn r4276]
*	Initial checkin of the shiny new rewritten-from-scratch HTML back	Simon Tatham	2004-06-12
	end. There's a lot more _potentiality_ for new features than there are actual new features just yet, but future highlights include: configurable flavour of HTML (3.2, 4, XHTML Transitional or Strict), proper character set support (this is half way there already), and more flexible allocation of sections between multiple HTML files. Meanwhile, immediate benefits include correct handling of special characters within `author' and `description' strings, omission of the filename part in hyperlinks within the same HTML file (in particular, this means a single output file is now totally independent of its filename), and hyperlinks to the index from the top-level contents page (I'm amazed nobody has complained at the lack of this yet!). There are no doubt some shiny new bugs as well, but I'll never find them unless people start using the thing... [originally from svn r4275]
*	All measurements in the paper backend are now configurable, as are	Simon Tatham	2004-05-23
	bullet and quote characters. [originally from svn r4249]
*	Character-set-isation and configurability in the WinHelp backend.	Simon Tatham	2004-05-23
	Newly configurable things are: bullet and quote characters as usual, the ": " that goes between a section number and its title, the "." coming after numbered-list item numbers, and the text "Title page" that appears at the top of the .cnt file. [originally from svn r4248]
*	Enhance the text backend to support configurable quote characters,	Simon Tatham	2004-04-23
	configurable emphasis characters, various other configurable bits which have been marked FIXME in the code for a while, and also to warn when a code paragraph line is too long (because that was the only other thing labelled FIXME). Fallback options are implemented, and defaults set accordingly. A UTF-8 text output file now looks like proper UTF-8. [originally from svn r4128]
*	Rewrite ustrftime(), so that (a) it uses wcsftime() where available,	Simon Tatham	2004-04-22
	and (b) it doesn't trip over strange Unicode characters in the format string. [originally from svn r4120]
*	Instead of traversing a list of paragraphs, mark_attr_ends() now	Simon Tatham	2004-04-22
	merely traverses a list of words, and main() takes responsibility for applying it to each paragraph in the document. This is so that it can _also_ be applied to the display form of each index entry, which Jacob spotted wasn't previously being done. [originally from svn r4117]
*	bk_text and bk_info both need to know the on-screen width of	Simon Tatham	2004-04-22
	characters in order to wrap and align them properly. Therefore, they should be using wcwidth(). So here are a couple of wrappers on wcwidth(), one which filters out the Unicode characters not representable in the target charset, and one which converts _from_ a charset to Unicode before calling wcwidth(). bk_text and bk_info should now align correctly even in the face of unsupported characters and Japanese. [originally from svn r4116]
*	Support the locale-supplied character set where appropriate. It's	Simon Tatham	2004-04-22
	used for converting command-line -C directives into Unicode; it's used for outputting Unicode strings to stderr in error messages; and it's used as the default character set for input files (although I'd be inclined to recommend everyone use \cfg{input-charset} in all their source files to ensure their portability). [originally from svn r4114]
*	Charset support for the info backend (\cfg{info-charset}). (This	Simon Tatham	2004-04-21
	checkin touches other files because a function in bk_text.c turned out to be of more general use so I moved it out into ustring.c.) [originally from svn r4111]
*	Infrastructure changes for character set support. ustrtoa,	Simon Tatham	2004-04-20
	ustrfroma, utoa_dup and ufroma_dup now take a charset parameter, and also have a variety of subtly distinct forms. Also, when a \cfg directive is seen in the input file, the precise octet strings for each parameter are kept in their original form as well as being translated into Unicode, so that when they represent filenames they can be used verbatim. [originally from svn r4097]
*	Support for \cfg{input-charset}. Input files can now be in ASCII,	Simon Tatham	2004-04-19
	8859-*, UTF-8, or a variety of more fun encodings including various multibyte ones. [originally from svn r4095]
*	And now the page numbers in the index are PDF cross-references too.	Simon Tatham	2004-04-14
	Funny, I thought that would be as hard again as the main index processing, and it turned out to be nearly trivial. [originally from svn r4073]
*	Initial work on PS and PDF output. Because these two backends share	Simon Tatham	2004-04-13
	an enormous amount of preprocessing and differ only in their final output form, I've introduced a new type of layer called a `pre-backend' (bk_paper.c is one). This takes all the information passed to a normal backend and returns an arbitrary void *, which is cached by the front end and passed on to any backend(s) which state a desire for the output of that particular pre-backend. Thus, all the page layout is done only once, and the PS and PDF backends process the same data structures into two output files. Note that these backends are _very_ unfinished; all sorts of vital things such as section numbers, list markers, and title formatting are missing, the paragraph justification doesn't quite work, and advanced stuff like indexes and PDF interactive features haven't even been started. But this basic framework generates valid output files and is a good starting point, so I'm checking it in. [originally from svn r4058]
*	The Emacs and Jed info readers don't like my index format: Info menu	Simon Tatham	2004-04-10
	items of the form `* stuff: Section 1.2.' are parsed by standalone info as `Section 1.2' followed by a period, but are parsed by other readers as `Section 1' followed by a period and then some spare text. Therefore, I've changed strategy, and the index is now full of *Note cross-references rather than menu items. On the plus side, this means there are no longer any special characters which we can't tolerate in an index entry; on the minus side, my shiny new infrastructure for tracking the filepos of index entries is now rendered pointless. I'll leave it in, though, since it may come in handy again. [originally from svn r4053]
*	Info backend now takes care to avoid magic characters in node names	Simon Tatham	2004-04-10
	and index terms (the Info format doesn't like them). In the course of this I've had to introduce some infrastructure for carrying a filepos forward from the definition of every RHS index term so that a particular backend can provide a usefully localised report of which index term had a problem. [originally from svn r4051]
*	Add a config directive to generate the INFO-DIR-ENTRY things that	Simon Tatham	2004-04-09
	appear to be used to automatically construct /usr/info/dir. [originally from svn r4049]
*	Added an info(1) backend, which constructs .info files directly	Simon Tatham	2004-04-09
	without going through the .texi source stage. A few things left to do, notably documentation, but the basics all seem to be there. [originally from svn r4047]
*	GCC 3.0 doesn't like you not including <string.h> if you use things in it.	James Aylett	2004-04-01
	We do, so let's. [originally from svn r4029]
*	Arrange a mechanism whereby each backend can be passed a filename	Simon Tatham	2004-04-01
	from its command-line option (`--text=foo.txt') and automatically convert it into one or more notional \cfg directives. In the HTML case this mechanism enables single-file mode as well as setting the filename. [originally from svn r4018]
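
Illustration for the 2017-05-30 entry, which describes rdaddc_rep() as a replacement for loops that call the one-byte rdaddc() repeatedly to emit padding. The sketch below shows the general idea only. The rdstringc layout, the rd_ensure() helper, the growth policy and the exact signatures are assumptions made for this example, and are not taken from the Halibut source.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a growable byte string; the real Halibut
 * structure and allocation macros may differ. */
typedef struct {
    size_t pos, size;
    char *text;
} rdstringc;

/* Hypothetical helper: make room for `extra` more bytes plus a NUL. */
static void rd_ensure(rdstringc *rs, size_t extra)
{
    assert(rs != NULL);
    if (rs->pos + extra + 1 > rs->size) {
        size_t newsize = rs->pos + extra + 128;
        char *newtext = realloc(rs->text, newsize);
        if (!newtext)
            abort();            /* sketch only: no graceful recovery */
        rs->text = newtext;
        rs->size = newsize;
    }
}

/* Append a single byte. */
static void rdaddc(rdstringc *rs, char c)
{
    rd_ensure(rs, 1);
    rs->text[rs->pos++] = c;
    rs->text[rs->pos] = '\0';
}

/* Append `repeat` copies of the same byte in one call, instead of
 * looping over rdaddc() at every call site. */
static void rdaddc_rep(rdstringc *rs, char c, size_t repeat)
{
    rd_ensure(rs, repeat);
    memset(rs->text + rs->pos, c, repeat);
    rs->pos += repeat;
    rs->text[rs->pos] = '\0';
}
```

With something of this shape, a call site that needs four bytes of zero padding can write rdaddc_rep(&rs, 0, 4) rather than a hand-written loop around rdaddc(&rs, 0).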