summaryrefslogtreecommitdiff
path: root/huffman.h
diff options
context:
space:
mode:
authorSimon Tatham <anakin@pobox.com>2017-05-10 07:10:14 +0100
committerSimon Tatham <anakin@pobox.com>2017-05-13 18:22:09 +0100
commit715a3bef377aeee898c427be99b1acf440b4a5e5 (patch)
tree6ff1d6eecb42df91a4e174644adb0954da44d9fa /huffman.h
parente446ba3cf1f72dca390e9c9a5fe987f3dcccd440 (diff)
downloadhalibut-715a3bef377aeee898c427be99b1acf440b4a5e5.zip
halibut-715a3bef377aeee898c427be99b1acf440b4a5e5.tar.gz
halibut-715a3bef377aeee898c427be99b1acf440b4a5e5.tar.bz2
halibut-715a3bef377aeee898c427be99b1acf440b4a5e5.tar.xz
Factor LZ77 and Huffman routines out of deflate.c.
The general routines for analysing a buffer into an LZ77ish stream of literals and matches, and for constructing a Huffman tree in canonical format, now live in their own source files so that they can be reused for other similar compression formats. Deflate-specific details like the exact file encoding are left in deflate.c.
Diffstat (limited to 'huffman.h')
-rw-r--r--huffman.h33
1 files changed, 33 insertions, 0 deletions
diff --git a/huffman.h b/huffman.h
new file mode 100644
index 0000000..91faa8e
--- /dev/null
+++ b/huffman.h
@@ -0,0 +1,33 @@
+/*
+ * huffman.h: Huffman tree-building routines common to Deflate and LZX.
+ */
+
+/*
+ * Take an input array 'freqs' of size 'nsyms' giving each symbol's
+ * frequency. Return an output array 'lengths' of the same size giving
+ * each symbol's code length. Enforce during construction that no code
+ * length is greater than 'limit'.
+ *
+ * The 'freqs' array may be modified, as a side effect of the limit
+ * enforcement.
+ */
+void build_huffman_tree(int *freqs, unsigned char *lengths,
+ int nsyms, int limit);
+
+/*
+ * Given a sequence of Huffman code lengths, compute the actual code
+ * values. Each one is returned in codes[i] and occupies the bottom
+ * lengths[i] bits of the integer. They are in natural big-endian
+ * format, i.e. the initial bit of the code, governing the choice of
+ * direction at the root node of the notional decode tree, is in the
+ * most significant bit position.
+ *
+ * Returns the maximum code length found. Can also return -1 to
+ * indicate the table was overcommitted (too many or too short codes
+ * to exactly cover the possible space), which is a fatal error (the
+ * output codes will not form a usable Huffman tree), or -2 to
+ * indicate it was undercommitted (too few or too long codes), which
+ * is a non-fatal error (the resulting tree would be usable, just
+ * inefficient).
+ */
+int compute_huffman_codes(const unsigned char *lengths, int *codes, int nsyms);