The last post was about a tool for locking and unlocking files using immutable and append-only flags; this one is about copying files from one place to another, between hosts or (sometimes) on the same host, as a building block for automations without creating unnecessary security exposures. Both of these components will be used in later tools.
rsync Basics
rsync is a common tool used for efficiently copying files and
directories from one location to another, including between systems
and for backup purposes. I’ve used it for almost as long as I’ve been
managing home systems. The base use case is to synchronize files owned
by the same user, but for my main use cases, distributing
administration-related configuration files and performing backups,
more access is required. Since I don’t allow root logins,
I need a non-privileged remote user with enough access to deploy files
where they belong and to copy everything I want backed
up. The most common way to restrict a non-privileged remote user to
specific functions is to limit it to specific authorized commands that
can be sent via SSH (secure shell), and to configure sudo or doas
on the remote system for that non-privileged user to allow it to do
what it needs to do, as specifically as possible. Specific commands
are defined in the destination user’s .ssh/authorized_keys file. I
used to use a very simple wrapper on the remote side that would only
allow rsync to run, but came to the conclusion that I wanted
stricter controls that limited what rsync could do. (An alternative
tool for restricting remote SSH commands that I came across during
research for this post is
sshdo.) I
set up a non-privileged user, _rsyncu, on each host where it’s
needed and use separate SSH keys with specific commands for each
specific function, and define privileged commands that it might need
to execute, as narrowly as possible, using doas (I formerly used the
more popular sudo but switched to doas when OpenBSD did; it is
simpler and has had fewer vulnerabilities).
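As a rough illustration of the forced-command approach, here is a minimal Python sketch of a wrapper that only lets rsync server requests through. It is not the wrapper described above or rrsync itself; the function names `parse_forced_command` and `run_from_env` are invented for this example, and a real wrapper would also check the individual options and paths. The one real interface it relies on is the `SSH_ORIGINAL_COMMAND` environment variable, which sshd sets to the command the client requested when a forced command is configured.

```python
import os
import shlex


def parse_forced_command(original: str) -> list[str]:
    """Return the argv for an rsync server request, or raise ValueError."""
    # shlex.split raises ValueError itself on unbalanced quotes.
    argv = shlex.split(original)
    if not argv or argv[0] != "rsync":
        raise ValueError("only rsync is permitted")
    if "--server" not in argv[1:]:
        raise ValueError("only rsync --server requests are permitted")
    return argv


def run_from_env() -> None:
    """Hypothetical entry point when installed as command= in authorized_keys."""
    # sshd puts the command the client asked for in SSH_ORIGINAL_COMMAND.
    argv = parse_forced_command(os.environ.get("SSH_ORIGINAL_COMMAND", ""))
    # exec rsync directly; no shell ever interprets the request.
    os.execvp(argv[0], argv)
```

Splitting the request into an argument list and exec'ing it directly, rather than passing the string to a shell, is the design choice that matters here: shell metacharacters in the client's request are never interpreted.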
rsync-client.pl / rsync-server.pl
My first foray into building something that provided more specific
controls and enabled automated synchronization was to write a script
in 2003 that runs on both sides of an SSH connection, initiating
rsync commands to set up the session and corresponding rsync
server-side commands on the remote host, called via a specific
authorized command, and using a common config file on each side. This
script, rsync-client.pl and rsync-server.pl (two names for the same
script), is intended for automated synchronization of files (common
config files between systems, deploying web server files from a
staging server to a production server, pulling data files from remote
systems for additional processing, doing backups, and so forth)
without human intervention. This is done using SSH keys on the client
side that are protected by file access permissions but not encrypted
with a passphrase (since there is no one to enter the passphrase). The
risk of key theft is mitigated by the limitation of access and
capability of the non-privileged account, monitoring and alerting on
usage of the key, and monitoring and alerting on commands executed by
the non-privileged account. (Those monitoring and alerting
capabilities are done in my environment using my reportnew script that
will be covered in a blog post on June 16.) A possible alternative to
a passphraseless SSH key would be storing a private key on separate
hardware (a hardware security module (HSM), a security key, a secure
enclave, a credential vault) and setting up a mechanism that allowed
the non-privileged account to make use of it but not extract it, but
that would not prevent use on the compromised host and I get similar
mitigation by restricting use of the key to the specific host in the
authorized_keys file along with the restricted command, for much
lower cost. Another mitigation is to make the credentials extremely
short-lived, which requires controls over the rotation process. I’ve
occasionally thought about, but have not used, SSH certificates.
The config file for the script is defined to support multiple hosts so
that the same config file can be used on a set of hosts. An entry in
the config identifies source and destination hosts, files and
directories to be sent from the source to the destination via rsync,
commands to be executed before and after the synchronization, a
specific SSH identity to use (optional), and whether doas or sudo
is required on the source or destination side. The synchronization can
be from the source to the destination (push) or from the destination
to the source (pull), or both. One of my use cases is to push Response
Policy Zones (RPZ) from my primary internal domain name system (DNS)
resolver to secondary servers. RPZ is used to allow DNS-based blocking
or redirection on specific domain names or patterns, e.g., to block
tracking domains, ads, and malware. In this use case, setup and
cleanup commands in the config use sysunlock (covered in the previous
blog post) to unlock the old RPZ files before synchronizing and lock
the updated RPZ files afterward. Another use case is that I push macro
include files and their corresponding detached signature files used
with reportnew to other hosts, and similarly unlock the old ones
(user immutability flags), copy the new ones, and re-lock.
I keep my rsync.conf file for rsync-client.pl/rsync-server.pl
locked with syslock and system immutability flags so that it cannot
be changed even by root without shutting down into single user mode.
The file permissions on the config file are root-only, as the config
file potentially reveals file and network topology information.
These scripts support the use of ED25519 and ECDSA SSH keys; support for DSA and RSA keys was removed since these have lower security and I no longer have any systems that only work with those.
rrsync
After writing an early version of this script, I came across rrsync,
a relatively simple wrapper intended for use as the SSH authorized
command, which has been distributed alongside rsync since the
mid-2000s. It requires no config file; all the configuration is done in
the forced command line using command line options (such as -ro for
read only or -doas if rrsync needs to be called using doas, as
it does in order to access files not otherwise accessible by the
non-privileged user). For example, the line in my authorized_keys
file for backups with rsnapshot looks like this:
command="/usr/local/sbin/rrsync -doas -ro / 2>>/home/_rsyncu/rrsync.err",from="<backup-host>" ssh-ed25519 <public-key>
The distributed version of rrsync was originally written in perl in
2004, but was shifted to a python version in December 2021. I liked it
as a wrapper for use with rsnapshot for backups but wanted to
continue using perl, which I’m much more experienced with using than
python, so I started updating my own copy in February 2022 to replace
a much more primitive rsync_wrapper.sh shell script I had been
using for that purpose. Since then, I have kept the perl version in
feature parity with the python version and also added the use of
OpenBSD’s pledge and unveil to further limit the blast radius of any
vulnerability. This wrapper provides the ability to limit what options
can be supplied to rsync on the server side, to limit it to read-only
(exactly what you want for a backup use case), or to limit it to a
specific directory (e.g., if you want to build something to move files
to remote hosts for later installation, which is how I use it with my
distribute and install scripts that will be described in the next
blog post in this series). I also added a -doas option for
cases where the remote rsync command needs to be executed with doas
for root privileges (as in the backup use case given above). All
commands and errors are logged (and those logs monitored and alerts
generated with reportnew).
Alternatives
Some of the functionality of these scripts is now redundant with
functionality built into rsync’s rsyncd daemon mode, but daemon mode
doesn’t have some of the capabilities you get from a wrapper:
forcing the use of SSH, restricting specific options, sanitizing
arguments, logging each command, etc. rsyncd can be used with
chroot, which may be stricter than the directory limits imposed by
rrsync, but is probably weaker than the directory limits imposed by
OpenBSD’s unveil. I prefer to have these additional features, so
haven’t really explored use of rsyncd. The above scripts can be used
with other alternatives to rsync itself, such as OpenBSD’s
openrsync. The big weakness in openrsync is that it doesn’t
support rsnapshot’s use of rsync’s --link-dest flag, which
enables it to save space by using hard links for files that haven’t
changed since the last backup. (A hard link essentially makes a file
appear to be in multiple places without having an extra copy of the
file.)
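The hard-link idea can be shown in a few lines of Python. This is only a sketch of the underlying file system behavior, not of how rsync or rsnapshot implement --link-dest; the file names and the `link_unchanged` helper are made up for the example.

```python
import os
import tempfile


def link_unchanged(previous: str, current: str) -> None:
    """Make `current` a hard link to `previous` instead of copying the data."""
    os.link(previous, current)


with tempfile.TemporaryDirectory() as root:
    # Hypothetical names standing in for the same file in two backup runs.
    old = os.path.join(root, "daily.1-passwd")
    new = os.path.join(root, "daily.0-passwd")
    with open(old, "w") as fh:
        fh.write("unchanged contents\n")

    link_unchanged(old, new)

    # Both names refer to one inode, so the data is stored only once.
    same_inode = os.stat(old).st_ino == os.stat(new).st_ino
    link_count = os.stat(old).st_nlink
    print(same_inode, link_count)
```

Deleting either name leaves the data reachable through the other, which is why older snapshots can be rotated out without touching newer ones.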
rsync-altroot.pl
One last script included with my rsync-tools package is a very simple
script, rsync-altroot.pl, which is used for keeping an /altroot
disk and its filesystem partitions up-to-date with the production
system. This was originally used as a simple form of redundancy so
that a system could be booted from the altroot disk in the event of a
disk failure. I just use it as another form of daily backup. I’ve
periodically found it handy to address cases of accidental deletion,
especially of key files in /etc, though I’ve come closer to
eliminating such cases through broad use of syslock. There are three
options to the script, -m to mount the altroot file systems, -r to
perform the rsync, and -u to unmount the altroot file
system. Without options, it does those three steps in sequence. The
configuration for the altroot file system is found in /etc/fstab;
rsync-altroot.pl expects to see lines in the same format as fstab
with the altroot disklabel unique identifiers (DUIDs) specified so that
it knows which altroot file systems to mount. (DUIDs are persistent
unique identifiers for disks that remain stable even if a disk is
moved to a different controller.)
Use of LLMs for Security Assessment
I’ve used Claude, Gemini, and ChatGPT to review all of these scripts,
and the most serious issue was a vulnerability identified by Gemini in
rrsync, where the order of operations (path validation before glob
expansion) meant that there was a path traversal vulnerability which
made it possible to escape a restricted directory to, e.g., grab the
system password file. (Specifically, the path validation would pass
for a constructed string which, after glob expansion, would point at
the password file outside of the restricted directory.) It is
noteworthy that this vulnerability, while exploitable on Linux or
macOS, was mitigated by OpenBSD’s unveil, which would block access
outside of the restricted directory. Claude has been my favored LLM
for security assessment, and I use the paid (Pro) version, while I use
only free versions of the others. Each has found different
vulnerabilities and issues, and I like to use them to cross-validate
each other. The free Gemini has often confabulated nonexistent
vulnerabilities and ChatGPT on one occasion denied that an issue
Claude identified was real (but quickly changed its position after
being given more information, which is a feature I find extremely
annoying when the wording of both positions expresses high confidence).
Beyond security assessment, I used Claude to assist in cleaning up
rsync-client.pl/rsync-server.pl, to remove use of shell command
execution and glob expansion and generate more error messages and
warnings where appropriate.
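The order-of-operations problem described above (validating a path before glob expansion) can be sketched in Python. This is not the actual rrsync bug or its fix: the function names are invented, and a symlink is used here only because it makes the escape easy to reproduce in a self-contained example. The point is the ordering: a check on the unexpanded pattern says nothing about what the pattern expands to.

```python
import glob
import os
import tempfile


def is_inside(base: str, path: str) -> bool:
    """True if `path`, with symlinks resolved, is `base` or lies beneath it."""
    base_real = os.path.realpath(base)
    path_real = os.path.realpath(path)
    return os.path.commonpath([base_real, path_real]) == base_real


def expand_unsafe(base: str, pattern: str) -> list[str]:
    """Validate the unexpanded pattern, then expand it (the risky order)."""
    candidate = os.path.join(base, pattern)
    if not is_inside(base, candidate):
        raise PermissionError(pattern)
    return glob.glob(candidate)


def expand_safe(base: str, pattern: str) -> list[str]:
    """Expand first, then validate every concrete result."""
    matches = glob.glob(os.path.join(base, pattern))
    return [m for m in matches if is_inside(base, m)]


with tempfile.TemporaryDirectory() as root:
    base = os.path.join(root, "restricted")
    outside = os.path.join(root, "outside")
    os.mkdir(base)
    os.mkdir(outside)
    with open(os.path.join(outside, "secret"), "w") as fh:
        fh.write("not for export\n")
    # A link inside the restricted directory that points out of it.
    os.symlink(outside, os.path.join(base, "link"))

    leaked = expand_unsafe(base, "l*/secret")  # check passes, match escapes
    kept = expand_safe(base, "l*/secret")      # match is filtered out
```

A kernel-enforced restriction such as unveil is a useful backstop precisely because it does not depend on getting this ordering right in application code.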
It’s worth noting that the author and maintainer of rsync, Andrew
Tridgell, has recently come under fire for his use of LLMs in
development of that project. Between December 2025 and May 2026 I was
using rsync version 3.4.1, but today in June 2026 I’m using version
3.4.4. Version 3.4.3, which had hundreds of AI-assisted commits, upset some
people by breaking some incremental backup use cases (though I
noticed no issues for rsnapshot). Versions 3.4.2 and 3.4.3 fixed six
CVEs (CVE-2026-29518, CVE-2026-43617, CVE-2026-43618, CVE-2026-43619,
CVE-2026-43620, and CVE-2026-45232); 3.4.4 fixed CVE-2026-41035. This
led the author of a Go-based rsync replacement to
compare which bugs his version and openrsync avoided. Andrew
Tridgell has responded to criticism here and I am fairly sympathetic to
his position, which is prioritizing security.
On a meta-note, since the last blog post I used the free Gemini Flash model to assist in migrating this blog from a decades-old Blogger instance to a self-hosted static site generated by Hugo with the PaperMod theme. I put in a request to download my Blogger archive on June 3 and received it on June 5, figured out whether I wanted to use Hugo or Zola, how I wanted to support comments, tags, search, and redirection from the old site, and performed the migration and implementation with Gemini assistance. The extraction of posts and comments was fairly smooth, but required identification and cleanup of comments awaiting moderation, which contained a lot of spam. The toughest parts were getting the redirection to work properly and getting the new commenting with Bluesky and Mastodon integration to work properly, as well as learning the details of Hugo and its interactions with PaperMod, which were sometimes frustrating. There’s still some cleanup required on some of the redirections, which I’m fixing as I see 404 errors in my logs. Since the comments are just posts on Bluesky and Mastodon there’s no moderation beyond Bluesky’s block capability and the on-platform moderation, but I have the ability to filter what appears on the site if necessary to avoid spam or harassment. Now that I have all my content on my own servers I’ll also put some work into cleaning up old links, broken images, etc. in archived content that have arisen from the inevitable link rot that comes with the passage of time.
The process of migration was quite quick and mostly easy, but I did go through many, many iterations of python scripts for doing various tasks and saw some of the worst LLM confabulation I’ve seen in months during a few bottlenecks (after mostly quite good results)–e.g., blaming an absurd bug in a URL checking script on Hugo (which wasn’t even involved), and falsely claiming that blog post Markdown front matter already contained original page permalink URLs (so there’s no need to find them).
Getting Started and Further Reading
These three rsync-related tools are available on my
website and on
GitHub. A sample
rsync.conf config file is provided for
rsync-client.pl/rsync-server.pl and the GitHub README contains
more details about usage and implementation. All were written
initially for OpenBSD but also work on Linux and macOS. rrsync in
particular is used as a key component of the distribute/install tools
described in the next post.