Backups and Recovery
Backups
Crymap uses a “filesystem as a database” model for its storage. This means that no exotic tooling is required to back up the Crymap data. Even simple tools such as rsync are sufficient.
A proper backup system should be able to handle the following:
- Binary file content
- Unicode file names
- Symlinks must be backed up as symlinks, not copies of what they point to, and must retain their exact target
- Cyclic and dangling symlinks must be tolerated
Exact file modes and owners do not need to be preserved, but it is better if they are. While Crymap creates hard links, it does not require them to be preserved, and a backup will function properly if the hard-linked files are reconstructed as separate files. Crymap does not use file timestamps or extended attributes for anything important.
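As a rough illustration of these requirements, the sketch below copies a user data directory with Python's standard `shutil.copytree`, keeping symlinks as symlinks. The names `backup_user_dir` and `symlink_targets` are hypothetical helpers, not part of Crymap, and a real deployment would more likely rely on rsync or a dedicated backup tool.

```python
import os
import shutil


def backup_user_dir(src: str, dst: str) -> None:
    """Copy the directory tree at src into a new directory dst.

    symlinks=True recreates each symlink as a symlink with the same target
    string instead of following it, so dangling and cyclic links are copied
    without error. File contents are copied as raw bytes.
    """
    shutil.copytree(src, dst, symlinks=True)


def symlink_targets(root: str) -> dict:
    """Map each symlink's path (relative to root) to its raw target string."""
    targets = {}
    # os.walk does not descend into symlinked directories by default.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                targets[os.path.relpath(path, root)] = os.readlink(path)
    return targets
```

Comparing `symlink_targets(src)` with `symlink_targets(dst)` after a copy is one simple way to check that a backup tool retained every link and its exact target.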
Crymap is designed to be tolerant of anomalies resulting from the backup process not capturing an atomic snapshot of the user data directory. There is one requirement: the backup system must be able to process a user’s whole data directory in less than 24 hours. If this requirement is met, no data which existed before the backup started will be lost, apart from one corner case.
The corner case has to do with mailbox renaming, since Crymap uses filesystem directories to model the mailbox hierarchy. Consider a user with the below mailbox hierarchy.
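The 24-hour requirement can be monitored by timing each backup run. The following is a minimal sketch, assuming the backup is started from a Python wrapper; `timed_backup`, `within_grace_period`, and `GRACE_PERIOD` are hypothetical names introduced here for illustration.

```python
import time
from datetime import timedelta

# Limit stated in the text: a user's directory must be processed in under 24 hours.
GRACE_PERIOD = timedelta(hours=24)


def within_grace_period(elapsed_seconds: float) -> bool:
    """Return True if a backup run of this duration finished in under 24 hours."""
    return timedelta(seconds=elapsed_seconds) < GRACE_PERIOD


def timed_backup(run_backup) -> float:
    """Call run_backup() and return how many seconds it took."""
    start = time.monotonic()  # monotonic clock is unaffected by wall-clock changes
    run_backup()
    return time.monotonic() - start
```

A wrapper could call `timed_backup` around the per-user backup step and raise an alert whenever `within_grace_period` returns `False` for the measured duration.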
|-+ INBOX
|
|-+ In Progress
| |
| \-+ TPS Reports
|
\-+ Archive
The backup system might happen to process these directories in this order: Archive, INBOX, In Progress. When it goes through Archive, it makes a backup in which Archive has no child mailboxes. Now, suppose that while it is working on INBOX, the user moves TPS Reports into Archive:
|-+ INBOX
|
|-+ In Progress
|
\-+ Archive
  |
  \-+ TPS Reports
Now, when the backup system finishes INBOX and moves on to In Progress, it makes a backup of In Progress which has no child mailboxes. Now we have a backup which does not contain TPS Reports anywhere, even though that mailbox always existed somewhere!
This is a problem inherent in any filesystem-based backup system, and means it is important to have multiple backups available in case one backup runs at just the wrong time.
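The race described above can be reproduced with a toy model. The sketch below is only a simulation using in-memory dictionaries, not Crymap code: the function names are hypothetical, and each top-level mailbox is represented as a list of its child mailboxes.

```python
def snapshot(tree: dict, name: str) -> dict:
    """Copy one top-level mailbox's child list as the backup sees it right now."""
    return {name: list(tree[name])}


def simulate_backup_with_rename() -> dict:
    """Replay the Archive, INBOX, In Progress ordering with a move in between."""
    # Live hierarchy: top-level mailbox -> list of child mailboxes.
    live = {"INBOX": [], "In Progress": ["TPS Reports"], "Archive": []}
    backup = {}

    # The backup visits Archive first; it has no children yet.
    backup.update(snapshot(live, "Archive"))

    # While INBOX is being processed, the user moves TPS Reports into Archive.
    backup.update(snapshot(live, "INBOX"))
    live["In Progress"].remove("TPS Reports")
    live["Archive"].append("TPS Reports")

    # In Progress is visited last; its child has already moved away.
    backup.update(snapshot(live, "In Progress"))
    return backup
```

In the dictionary returned by `simulate_backup_with_rename`, every mailbox has an empty child list, so TPS Reports is absent from the backup even though it was present in the live hierarchy at every moment.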
This corner case does not affect moving individual messages between mailboxes, because Crymap is able to retain the messages in their old location for the 24-hour grace period.
Restoring from Backup
Due to the “filesystem as a database” model, restoring from backup is usually just a matter of recovering the files and putting them back in the correct place, possibly after fixing permissions if the backup system does not preserve them.
If the backup is not in a fully consistent state, mail applications may have difficulty resynchronising. It may be necessary to do a “repair” or fully reconfigure the application.
It is also possible to retrieve parts of a user’s account (i.e. mailboxes) and insert them into the current account by placing the directories in an appropriate place under the mail directory within the user data directory. If there are missing RSA keys, they can also be retrieved from the keys directory in the backup and added to the current filesystem. No special configuration or notification to Crymap is needed for these operations; simply putting the files/directories into place is sufficient.
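A partial restore of this kind might be scripted as below. This is a sketch under stated assumptions: the helper names are hypothetical, `mailbox_path` is assumed to be the mailbox directory's path relative to the mail directory exactly as it appears in the backup, and the keys directory is assumed to contain plain files. Nothing is overwritten in the live account.

```python
import os
import shutil


def restore_mailbox(backup_user_dir: str, live_user_dir: str, mailbox_path: str) -> str:
    """Copy one mailbox directory from the backup into the live mail directory."""
    # Assumption: mailbox_path is relative to "mail" and identical in both trees.
    src = os.path.join(backup_user_dir, "mail", mailbox_path)
    dst = os.path.join(live_user_dir, "mail", mailbox_path)
    if os.path.lexists(dst):
        raise FileExistsError(f"refusing to overwrite existing mailbox: {dst}")
    # symlinks=True keeps symlinks as symlinks with their exact targets.
    shutil.copytree(src, dst, symlinks=True)
    return dst


def restore_missing_keys(backup_user_dir: str, live_user_dir: str) -> list:
    """Copy key files that exist in the backup but not in the live account."""
    src_dir = os.path.join(backup_user_dir, "keys")
    dst_dir = os.path.join(live_user_dir, "keys")
    os.makedirs(dst_dir, exist_ok=True)
    restored = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        dst = os.path.join(dst_dir, name)
        # Assumption: keys are regular files; existing live keys are left alone.
        if os.path.isfile(src) and not os.path.lexists(dst):
            shutil.copy2(src, dst)
            restored.append(name)
    return restored
```

If the script runs as a different user than Crymap does, ownership and permissions of the copied files may still need to be fixed afterwards.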
Objects from a user account are readable from only that user account. For example, in case of a system failure, you cannot set up a new system, create user accounts in it, and then expect to be able to drop data from the backups into those new user accounts. The user accounts themselves must be restored from backup.