This final post (apart from a wrap-up) is about the largest and most
complex script, a 6,253-line file integrity monitoring program called
sigtree.pl, first written in January 2000. It was inspired by the
original free version of tripwire and by OpenBSD’s mtree, which the
OpenBSD daily security script uses to compare file attributes against
a specification and report divergences, mainly for checking
permissions on key system files. Two of the main issues with tripwire
were keeping it up to date and handling the high volume of regular,
expected changes. The modern solution
to this issue is ensuring that changes are authorized/expected and that
configurations meet policies rather than verifying file hashes. I think
that is a better solution for enterprises, but I developed an alternate
solution that keeps the granular monitoring and relies on the fact that
I’m the only administrator of my systems and recognize my own changes. That solution
is a dual-specification model, where I have a primary specification for
each system and a secondary one, where the secondary one is automatically
checked, changes reported via email, and then updated each night, so I
get a daily report of changes per host. The primary specification is
updated less frequently and manually, either weekly or every few weeks.
sigtree.pl Overview
In addition to primary and secondary specifications (or “specs”),
sigtree.pl defines “sets” containing “trees.” A tree is generally a
directory and all of its contents (recursively), though it can also be
an individual file. Each tree’s state is stored in a spec (primary and
secondary). Sets define keyword lists of attributes that are monitored
by recording them in the spec’s contents, and exceptions can be made
for any individual item or subtree in the tree to assign a different
set. The sets that define the monitored attributes are known as
“primary sets.” The default config defines primary sets by type of
file (root’s files, system binaries, system libraries, kernel files,
ignored files, sigtree files, logs, critical logs, documentation, web
files, and device files), and two secondary sets (daily and
weekly). Log files do not include file hashes, since those can change
extremely frequently for log files. Critical log files include a
keyword called “mtimestasis” which has an appended time interval
(default is “mtimestasis-24h”); this means that it is significant if
the file’s modification time does NOT change within a 24-hour
period. That is, this is a case of identifying a potentially
suspicious absence of change.
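The “mtimestasis” idea can be illustrated with a minimal sketch, written here in Python rather than sigtree.pl’s Perl. The function names and the unit suffixes other than “h” are assumptions for illustration, not sigtree.pl’s actual implementation.

```python
import os
import re
import time

# Assumed unit suffixes; only "h" (as in "mtimestasis-24h") appears in the post.
_UNITS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def parse_stasis_keyword(keyword):
    """Return the interval in seconds from a keyword such as 'mtimestasis-24h'."""
    match = re.fullmatch(r"mtimestasis-(\d+)([smhd])", keyword)
    if match is None:
        raise ValueError(f"not an mtimestasis keyword: {keyword!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]

def is_stale(path, keyword, now=None):
    """True if the file's mtime has NOT changed within the keyword's interval."""
    now = time.time() if now is None else now
    return now - os.stat(path).st_mtime > parse_stasis_keyword(keyword)
```

A critical log that should be written to continuously would be reported when is_stale() returns True, which is the reverse of the usual “report on change” logic.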
File hashes supported are SHA-2 or SHA-3 with configurable digest lengths.
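As a rough illustration of a configurable hash family and length, here is a small Python sketch using the standard hashlib module. The “family” and “bits” parameters and the lookup table are hypothetical and are not sigtree.pl’s configuration syntax.

```python
import hashlib

# Hypothetical mapping from (family, digest bits) to hashlib algorithm names.
_ALGORITHMS = {
    ("sha2", 224): "sha224", ("sha2", 256): "sha256",
    ("sha2", 384): "sha384", ("sha2", 512): "sha512",
    ("sha3", 224): "sha3_224", ("sha3", 256): "sha3_256",
    ("sha3", 384): "sha3_384", ("sha3", 512): "sha3_512",
}

def file_digest(path, family="sha2", bits=256, chunk_size=65536):
    """Hash a file in chunks so that large files are not read into memory at once."""
    hasher = hashlib.new(_ALGORITHMS[(family, bits)])
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()
```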
The secondary sets (“daily” and “weekly” by default) do not determine what attributes are monitored for a tree but can be used to perform operations on a subset of file system contents; e.g., you can check the more critical files in the “daily” set on a daily basis and the less critical files in the “weekly” set on a weekly basis. I run three checks: a weekly check of all files against my primary specs, a daily check of the more critical files against my secondary specs, and a weekly check of all files against my secondary specs on a day opposite the primary spec check. Effectively, all files get checked twice a week and the more critical files daily.
The three main operations are “initialize”, which creates specs for the specified sets; “check”, which checks for changes between the spec and the current file system state; and “update”, which updates the spec for any changes identified in the last “check” (but also compares to the current file system state and notes if a change is no longer relevant). If “update” is used on a subset of what was last checked, then only the files in that subset are updated, and the remaining changes identified by “check” are preserved for a future “update”.
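The relationship between “check” and “update” can be sketched in Python, treating a spec as a plain dictionary from path to recorded attributes. This is a simplified model under that assumption; the function names and data layout are hypothetical and do not reflect sigtree.pl’s on-disk formats.

```python
def check(spec, current):
    """Return {path: (recorded, observed)} for every path whose attributes differ."""
    changes = {}
    for path in spec.keys() | current.keys():
        recorded, observed = spec.get(path), current.get(path)
        if recorded != observed:
            changes[path] = (recorded, observed)
    return changes

def update(spec, changes, current, subset=None):
    """Apply pending changes to spec; return the changes still pending."""
    remaining = {}
    for path, (recorded, _observed) in changes.items():
        if subset is not None and path not in subset:
            remaining[path] = changes[path]  # preserved for a future update
            continue
        now = current.get(path)
        if now == recorded:
            continue  # reverted since the check: no longer relevant
        if now is None:
            spec.pop(path, None)  # file has been removed
        else:
            spec[path] = now  # record the current state
    return remaining
```

Calling update() with a subset leaves the other pending changes in the returned dictionary, mirroring the way an update on part of the last check keeps the rest for later.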
There are also primary and secondary specs for the set of spec files themselves, which are included in each of the above operations but can also be initialized, checked, or updated separately with “initialize_specs”, “check_specs”, and “update_specs”, respectively. It is also possible to check an individual file with “check_file”. The spec-of-specs is always the last to be initialized or updated, so that it properly reflects the changes made to the individual specs and their respective signature files.
A final operation, “changes”, reports the details of the changes identified in the last check which have not yet been updated to the specs.
For additional security, the recommended setup keeps primary specs
locked with system immutable flags and secondary specs locked with
user immutable flags, and all specs are signed. When signatures are
used, both the initialize and update functions require the passphrase
to the private key (signify or gpg; signify is recommended) in
order to do the signing; the check function checks the signatures on
the specs for validity. With this setup, initialize and update for
primary specs on OpenBSD require shutting down the system to Insecure
Mode (single-user mode).
Although sigtree.pl uses immutable flags (and monitors them), it
manages the flags on its files directly rather than via syslock.
Spec Location, Parallel Processing
The specs are, by default, placed in
/var/db/sigtree/specs/<hostname> and
/var/db/sigtree/secondary/<hostname>, with the <hostname>
directory and the specs inside locked with immutable flags, but no
locking on the specs or secondary directories. This permits the
use of rsync to collect primary and secondary specs from multiple
hosts into /var/db/sigtree. I’ve thought about but have not yet
built the capability to check specs against a slightly different
directory location (such as backup snapshots or an /altroot backup).
In August 2024 I added parallel processing to the “initialize” and “check” operations (the two most time-consuming operations) in order to speed them up. The parallel workers obtain the file attributes for individual trees, including computing the file hashes as required for files in each tree; the results are then saved to individual specs (for “initialize”) or to a “changed file” (for “check”). In the former case each worker builds a complete spec, and in the latter case the workers’ changes are aggregated into a single “changed file.”
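The per-tree division of work can be sketched in Python. Note the substitution: this sketch uses a thread pool from concurrent.futures for brevity, whereas sigtree.pl uses worker processes. The function names and the attributes collected are assumptions.

```python
import hashlib
import os
import stat
from concurrent.futures import ThreadPoolExecutor

def scan_tree(root):
    """Worker: collect (mode, size, SHA-256) for each regular file under one tree."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if not stat.S_ISREG(st.st_mode):
                continue  # this sketch skips symlinks and special files
            with open(path, "rb") as fh:
                digest = hashlib.sha256(fh.read()).hexdigest()
            state[path] = (st.st_mode, st.st_size, digest)
    return state

def parallel_initialize(trees, workers=4):
    """Each worker builds one complete spec; the result maps tree -> spec."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(trees, pool.map(scan_tree, trees)))

def parallel_check(specs, workers=4):
    """Scan every tree in parallel and aggregate all differences into one dict."""
    trees = list(specs)
    changed = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for tree, state in zip(trees, pool.map(scan_tree, trees)):
            recorded = specs[tree]
            for path in recorded.keys() | state.keys():
                if recorded.get(path) != state.get(path):
                    changed[path] = (recorded.get(path), state.get(path))
    return changed
```

The single dictionary returned by parallel_check() plays the role of the aggregated “changed file” in this simplified model.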
Use of LLMs for Security Assessment and Feature Development
The main security changes suggested by LLMs for sigtree.pl were to
change calls to external commands to avoid use of the shell, e.g., by
replacing backticks with “open” pipes. I also used LLMs to ensure that
Linux and macOS support was consistent and complete.
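The shell-avoidance principle is the same in any language: pass the command and its arguments as a list so that no shell ever parses them. In Perl that is the list form of a pipe “open”; the following is an analogous Python sketch, with a hypothetical helper name, not code from sigtree.pl.

```python
import subprocess
import sys

def run_without_shell(argv):
    """Run argv directly, with no shell involved, and return its stdout as text."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout

# A file name containing shell metacharacters is passed through as one literal argument.
hostile = "name; echo injected $(id)"
echoed = run_without_shell(
    [sys.executable, "-c", "import sys; print(sys.argv[1])", hostile]
)
```

Because no shell interprets the argument, the “;” and “$(id)” in the file name are never executed; with backticks or a shell string they could be.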
As with reportnew, the biggest feature addition for sigtree.pl done
with LLMs was to add privilege separation, which was complicated by
the existing use of worker processes. In using Claude to come up with
a design for privilege separation, I went through 83 iterations of
the design document before it covered all features the way
that I wanted without introducing problems. Even so, it introduced
one major bug that appeared on only one OpenBSD host and whose root
cause Claude was unable to identify. The symptoms were that
the communication between unprivileged worker processes and the
privileged process would work for a while but then start failing
with evident corruption in the interprocess communications channel.
Claude kept suggesting additional debugging output and trying
variations of changes to the protocol, some of which were related
to differences between OpenBSD and Linux. After going around in
circles for a few hours, I took the dog for a walk and thought
about it, and realized Claude was completely missing the root
cause, which was that the algorithm it produced for assigning a
Unix socket from an array to new worker processes was not re-using
the first available free slot but was just assigning slots in
sequence regardless of the order that worker processes terminated
(i.e., it assumed they would start and terminate in the same
sequence). This was a case where Claude created a problem it
was unable to correctly debug or solve, but was able to fix after
I identified the actual issue. Some of Claude’s proposed solutions
to the problem were not just irrelevant to the underlying issue but
were terrible ideas, like passing the entire contents of specs
across the privileged/nonprivileged boundary rather than just
file descriptors (it wrongly suggested that OpenBSD was somehow
mangling the communications because of problems with file descriptor
passing).
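The slot-assignment fix described above amounts to always handing a new worker the first free slot rather than the next index in sequence. A minimal Python sketch of that idea follows; the class and the use of worker names in place of Unix sockets are hypothetical, not the actual sigtree.pl code.

```python
class SlotPool:
    """Fixed-size array of slots; None marks a free slot."""

    def __init__(self, size):
        self._slots = [None] * size

    def acquire(self, worker_id):
        """Assign the first free slot, whatever order earlier workers exited in."""
        for index, owner in enumerate(self._slots):
            if owner is None:
                self._slots[index] = worker_id
                return index
        raise RuntimeError("no free slot available")

    def release(self, index):
        """Mark a slot free again when its worker terminates."""
        self._slots[index] = None

    def owner(self, index):
        """Return the worker currently holding a slot, or None if it is free."""
        return self._slots[index]
```

A purely sequential counter would hand a new worker a slot that may still belong to a running worker whenever workers finish out of order, which is consistent with two processes sharing one channel and corrupting it.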
I started work on privilege separation on January 5, 2026, had a mostly
complete implementation by January 24 (with Claude design revision 50),
finished design revision 83 by January 28, then fixed some
additional issues involving OpenBSD and Linux differences by February 1.
A final issue introduced by these changes was that they broke signing
of specs using gpg, because the passphrase prompt from
gpg-agent doesn’t work in the privileged process, which has no
connected tty. This required building a separate helper script,
gpg-noagent, to invoke gpg in batch mode, since the necessary
options can’t otherwise be passed to the PGP::Sign Perl module. This is
one of several reasons to prefer using signify (the others include
that there’s much less complex code involved).
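For readers unfamiliar with running gpg without a tty, here is a hedged Python sketch of the kind of command line a batch-mode wrapper might build. This is an assumption about the general approach, not the contents of gpg-noagent; the function name and file names are hypothetical, though the gpg options shown are standard ones.

```python
def gpg_batch_sign_argv(key_id, spec_path, sig_path):
    """Build a gpg command that signs without prompting on a tty.

    The passphrase would be written to the child's stdin (fd 0) by the caller.
    """
    return [
        "gpg",
        "--batch", "--yes",              # never prompt interactively
        "--pinentry-mode", "loopback",   # take the passphrase from the caller
        "--passphrase-fd", "0",          # read the passphrase from stdin
        "--local-user", key_id,          # which secret key signs
        "--output", sig_path,
        "--detach-sign", spec_path,
    ]

# Hypothetical key id and file names, for illustration only.
argv = gpg_batch_sign_argv("sigtree@example.org", "etc.spec", "etc.spec.sig")
```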
Getting Started and Further Reading
sigtree.pl is available on my website
and on GitHub. Sample
sigtree.conf config files are provided for OpenBSD, Linux, and macOS along with the
above-mentioned gpg-noagent script if you want to use GPG signing. The GitHub README
has more information about usage and implementation.
This covers all of the scripts I intended to cover in this series (though not all that I’ve written or that I’ve publicly released). I’ll have one more post with a wrap-up and overview.