puzzles/dsf.c, branch devel

Actually rewrite the dsf implementation.

2023-04-20T17:42:50+00:00

This rewrite improves the core data structure implementation in two
ways. Firstly, when merging two equivalence classes, we check their
relative sizes, and choose the larger class's canonical element to be
the overall root of the new class tree. This minimises the number of
overlong paths to the root after the merge. Secondly, we defer path
compression until _after_ the two classes are merged, rather than
doing it beforehand (by using edsf_canonify as a subroutine) and then
wastefully having to do it again afterwards.
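
As a rough illustration of those two ideas (not the actual code in
dsf.c), here is a minimal sketch of size-based root selection with
path compression deferred until after the merge. The type sketch_dsf,
its field layout and every sketch_* function name are assumptions made
up for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical layout, not the real one in dsf.c: parent[i] == i
 * marks a root, and size[] is only meaningful at roots. */
typedef struct sketch_dsf {
    size_t n;
    size_t *parent;
    size_t *size;
} sketch_dsf;

sketch_dsf *sketch_new(size_t n)
{
    sketch_dsf *d = malloc(sizeof(*d));
    assert(d != NULL && n > 0);
    d->n = n;
    d->parent = malloc(n * sizeof(*d->parent));
    d->size = malloc(n * sizeof(*d->size));
    assert(d->parent != NULL && d->size != NULL);
    for (size_t i = 0; i < n; i++) {
        d->parent[i] = i;   /* every element starts as its own root */
        d->size[i] = 1;
    }
    return d;
}

void sketch_free(sketch_dsf *d)
{
    if (d) {
        free(d->parent);
        free(d->size);
        free(d);
    }
}

/* Walk up to the root without modifying the tree. */
static size_t sketch_root(const sketch_dsf *d, size_t i)
{
    assert(i < d->n);
    while (d->parent[i] != i)
        i = d->parent[i];
    return i;
}

/* Repoint every element on the path from i straight at root. */
static void sketch_compress(sketch_dsf *d, size_t i, size_t root)
{
    while (i != root) {
        size_t next = d->parent[i];
        d->parent[i] = root;
        i = next;
    }
}

size_t sketch_canonify(sketch_dsf *d, size_t i)
{
    size_t root = sketch_root(d, i);
    sketch_compress(d, i, root);
    return root;
}

void sketch_merge(sketch_dsf *d, size_t a, size_t b)
{
    size_t ra = sketch_root(d, a), rb = sketch_root(d, b);
    if (ra != rb) {
        /* The larger class's root becomes the overall root. */
        if (d->size[ra] < d->size[rb]) {
            size_t tmp = ra; ra = rb; rb = tmp;
        }
        d->parent[rb] = ra;
        d->size[ra] += d->size[rb];
    }
    /* Compress once, after the merge, towards the final root. */
    sketch_compress(d, a, ra);
    sketch_compress(d, b, ra);
}
```

Because sketch_merge only reads the trees until the final root is
known, each path is rewritten once per merge rather than twice.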

The size-based root selection was what we _used_ to do, and delivers
the better asymptotic performance. I reverted it so that Keen could
track the min of each equivalence class. But since then I've realised
you can have the asymptotic goodness _and_ min-tracking if you store
the minima separately from the main data structure. So now Keen does
that, and other clients don't have to pay the cost.
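
One way to picture storing the minima separately is an optional side
array indexed by root, allocated only when the client asks for it.
This is a sketch under that assumption, not the real layout; min_dsf
and all min_dsf_* names are hypothetical, and path compression is left
out to keep it short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct min_dsf {
    int n;
    int *parent;  /* parent[i] == i at a root */
    int *size;    /* class size, valid at roots only */
    int *min;     /* smallest member, valid at roots; NULL if untracked */
} min_dsf;

min_dsf *min_dsf_new(int n, bool track_min)
{
    min_dsf *d = malloc(sizeof(*d));
    assert(d != NULL && n > 0);
    d->n = n;
    d->parent = malloc(n * sizeof(int));
    d->size = malloc(n * sizeof(int));
    d->min = track_min ? malloc(n * sizeof(int)) : NULL;
    assert(d->parent != NULL && d->size != NULL);
    assert(!track_min || d->min != NULL);
    for (int i = 0; i < n; i++) {
        d->parent[i] = i;
        d->size[i] = 1;
        if (d->min)
            d->min[i] = i;
    }
    return d;
}

void min_dsf_free(min_dsf *d)
{
    if (d) {
        free(d->parent);
        free(d->size);
        free(d->min);   /* free(NULL) is a no-op */
        free(d);
    }
}

int min_dsf_root(const min_dsf *d, int i)
{
    assert(0 <= i && i < d->n);
    while (d->parent[i] != i)
        i = d->parent[i];
    return i;
}

void min_dsf_merge(min_dsf *d, int a, int b)
{
    int ra = min_dsf_root(d, a), rb = min_dsf_root(d, b);
    if (ra == rb)
        return;
    /* Root choice depends only on size, never on the minimum. */
    if (d->size[ra] < d->size[rb]) {
        int tmp = ra; ra = rb; rb = tmp;
    }
    d->parent[rb] = ra;
    d->size[ra] += d->size[rb];
    /* Only min-tracking clients pay for this extra bookkeeping. */
    if (d->min && d->min[rb] < d->min[ra])
        d->min[ra] = d->min[rb];
}

int min_dsf_minimal(const min_dsf *d, int i)
{
    assert(d->min != NULL);  /* client must have asked for tracking */
    return d->min[min_dsf_root(d, i)];
}
```

The point of the design is that the root of a class no longer has to
be its minimum, so the size-based root choice stays untouched.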

Similarly, the flip tracking is now a cost that only users of flip
dsfs have to pay, because a normal one doesn't store that information
at all.

Reorganise the dsf API into three kinds of dsf.

2023-04-20T17:39:41+00:00

This prepares for separating out the auxiliary functionality (flip
tracking and minimum tracking), and perhaps leaves room for adding
more of it in future.

The previous name 'edsf' was too vague: the 'e' stood for 'extended',
and didn't say anything about _how_ it was extended. It's now called a
'flip dsf', since it tracks whether elements in the same class are
flipped relative to each other. More importantly, clients that are
going to use the flip tracking must say so when they allocate the dsf.
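
For readers unfamiliar with the idea, flip tracking can be sketched as
one extra bit per element, recording whether that element is inverted
relative to its parent. The code below is an illustration only: the
flip_dsf type, the fixed FLIP_MAX capacity and the flip_* functions
are hypothetical, and size-based root selection and path compression
are omitted for brevity.

```c
#include <assert.h>
#include <stdbool.h>

#define FLIP_MAX 32  /* hypothetical fixed capacity, for brevity */

typedef struct flip_dsf {
    int n;
    int parent[FLIP_MAX];  /* parent[i] == i at a root */
    bool flip[FLIP_MAX];   /* is i inverted relative to parent[i]? */
} flip_dsf;

void flip_init(flip_dsf *d, int n)
{
    assert(0 < n && n <= FLIP_MAX);
    d->n = n;
    for (int i = 0; i < n; i++) {
        d->parent[i] = i;
        d->flip[i] = false;
    }
}

/* Return the root of i's class, and set *inverse to whether i is
 * flipped relative to that root. */
int flip_canonify(const flip_dsf *d, int i, bool *inverse)
{
    bool inv = false;
    assert(0 <= i && i < d->n);
    while (d->parent[i] != i) {
        inv = (inv != d->flip[i]);  /* accumulate parity along the path */
        i = d->parent[i];
    }
    *inverse = inv;
    return i;
}

/* Declare a and b equivalent; 'inverse' says whether they are flipped
 * relative to each other. */
void flip_merge(flip_dsf *d, int a, int b, bool inverse)
{
    bool ia, ib;
    int ra = flip_canonify(d, a, &ia);
    int rb = flip_canonify(d, b, &ib);
    if (ra == rb) {
        /* Already connected: the new fact must agree with the old. */
        assert((ia != ib) == inverse);
        return;
    }
    d->parent[rb] = ra;
    /* Choose rb's bit so that a and b end up related by 'inverse'. */
    d->flip[rb] = ((ia != ib) != inverse);
}
```

Two elements of the same class are flipped relative to each other
exactly when flip_canonify reports different parities for them.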

And Keen's need to track the minimal element of an equivalence class
is going to become a non-default feature, so there needs to be a new
kind of dsf that specially tracks those, and Keen will have to call it.

While I'm here, I've renamed the three dsf creation functions so that
they start with 'dsf_' like all the rest of the dsf API.

Introduce a new dsf_equivalent() function.

2023-04-20T17:39:35+00:00

Not very interesting, but the idiom for checking equivalence via two
calls to dsf_canonify is cumbersome enough to be worth abbreviating.
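
The abbreviation amounts to something like the following sketch. It
uses a bare parent array as a stand-in for a dsf, and the names
sketch_canonify and sketch_equivalent are hypothetical; the signature
of the real dsf_equivalent is not shown here.

```c
#include <stdbool.h>

/* Stand-in dsf: parent[i] == i marks the root of a class. */
static int sketch_canonify(const int *parent, int i)
{
    while (parent[i] != i)
        i = parent[i];
    return i;
}

/* The idiom being abbreviated: two canonify calls and a comparison. */
bool sketch_equivalent(const int *parent, int a, int b)
{
    return sketch_canonify(parent, a) == sketch_canonify(parent, b);
}
```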

Remove conditioned-out dsf diagnostic code.

2023-04-20T16:30:03+00:00

print_dsf was declared in puzzles.h, but never called, and its
definition was commented out, so it probably wouldn't have worked any
more anyway. The other commented-out printfs in that file don't look
very useful either, and they just mean more stuff that will need
messing about with as I continue to refactor.

Remove size parameter from dsf init and copy functions.

2023-04-20T16:30:03+00:00

Now that the dsf knows its own size internally, there's no need to
tell it again when one is copied or reinitialised.

This makes dsf_init much more about *re*initialising a dsf, since now
dsfs are always allocated using a function that will initialise them
anyway. So I think it deserves a rename.

Store a size field inside the DSF type.

2023-04-20T16:30:01+00:00

This permits bounds-checking of all inputs to dsf_canonify and
dsf_merge, so that any out-of-range values will provoke assertion
failure instead of undefined behaviour.
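
A minimal sketch of how a stored size makes that check possible. The
bounded_dsf type and bounded_* names are hypothetical, and the merge
is deliberately naive, since only the assertions matter here.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct bounded_dsf {
    int size;     /* number of elements, fixed at allocation */
    int *parent;  /* parent[i] == i at a root */
} bounded_dsf;

bounded_dsf *bounded_new(int size)
{
    bounded_dsf *d = malloc(sizeof(*d));
    assert(d != NULL && size > 0);
    d->size = size;
    d->parent = malloc(size * sizeof(int));
    assert(d->parent != NULL);
    for (int i = 0; i < size; i++)
        d->parent[i] = i;
    return d;
}

void bounded_free(bounded_dsf *d)
{
    if (d) {
        free(d->parent);
        free(d);
    }
}

int bounded_canonify(const bounded_dsf *d, int i)
{
    /* An out-of-range index fails here rather than reading off the
     * end of the array. */
    assert(0 <= i && i < d->size);
    while (d->parent[i] != i)
        i = d->parent[i];
    return i;
}

void bounded_merge(bounded_dsf *d, int a, int b)
{
    /* Both inputs are range-checked inside bounded_canonify. */
    int ra = bounded_canonify(d, a);
    int rb = bounded_canonify(d, b);
    d->parent[rb] = ra;
}
```

Note that assert compiles to nothing when NDEBUG is defined, so a
check like this only protects builds that keep assertions enabled.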

Actually make DSF an opaque structure type.

2023-04-20T16:23:23+00:00

This makes good on all the previous preparatory commits, which I did
separately so that each one individually has a reasonably readable
diff, and all the mechanical changes are separated out from the
rewrites that needed actual thought.

Still no functional change, however: the DSF type wraps nothing but
the same int pointer that 'DSF *' used to store directly.
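
The general shape of an opaque structure type in C looks roughly like
this. It is a single-file sketch of the pattern, not the contents of
puzzles.h or dsf.c: the sketch_dsf_* function names and the
placeholder initialisation are assumptions.

```c
#include <assert.h>
#include <stdlib.h>

/* What a header might expose: an incomplete type, so clients can
 * hold a 'DSF *' but cannot look inside it. */
typedef struct DSF DSF;
DSF *sketch_dsf_new(int n);
void sketch_dsf_free(DSF *dsf);

/* What only the implementation file would see: for now, a wrapper
 * around the same int pointer as before. */
struct DSF {
    int *p;
};

DSF *sketch_dsf_new(int n)
{
    DSF *dsf = malloc(sizeof(*dsf));
    assert(dsf != NULL && n > 0);
    dsf->p = malloc(n * sizeof(int));
    assert(dsf->p != NULL);
    for (int i = 0; i < n; i++)
        dsf->p[i] = i;  /* placeholder contents for the sketch */
    return dsf;
}

void sketch_dsf_free(DSF *dsf)
{
    if (dsf) {
        free(dsf->p);
        free(dsf);
    }
}
```

Once clients can no longer index the array directly, the
representation behind the pointer can change without touching them.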

Declare all dsfs as a dedicated type name 'DSF'.

2023-04-20T16:23:21+00:00

In this commit, 'DSF' is simply a typedef for 'int', so that the new
declaration form 'DSF *' translates to the same type 'int *' that dsfs
have always had. So all we're doing here is mechanically changing type
declarations throughout the code.

Use a dedicated copy function to copy dsfs.

2023-04-20T16:21:54+00:00

Previously we were duplicating the contents of a dsf using straight-up
memcpy. Now there's a dsf_copy function wrapping the same memcpy.

For the moment, this still has to take a size parameter, because the
size isn't stored inside the dsf itself. But once we make a proper
data type, it will be.

Use a dedicated free function to free dsfs.

2023-04-20T16:21:12+00:00

No functional change: currently, this just wraps the previous sfree
call.