Checking integrity and consistency

Imagine your repository is saved on a server that has a faulty hard drive, or even worse, attackers get privileged access and modify the files in your repository with the intention to make you restore malicious data:

echo "boom" > /srv/rustic-repo/index/de30f3231ca2e6a59af4aa84216dfe2ef7339c549dc11b09b84000997b139628

Trying to restore a snapshot which has been modified as shown above will yield an error:

$ rustic restore c23e491f /tmp/restore-work
...
Fatal: unable to load index de30f323: load <index/de30f3231c>: invalid data returned

In order to detect these things before they become a problem, it’s a good idea to regularly use the check command to test whether your repository is healthy and consistent, and that your precious backup data is unharmed. There are two types of checks that can be performed:

  • Structural consistency and integrity, e.g. snapshots, trees and pack files (default)
  • Integrity of the actual data that you backed up (enabled with flags, see below)

To verify the structure of the repository, issue the check command. If the repository is damaged like in the example above, check will detect this and yield the same error as when you tried to restore:

$ rustic check
...
load indexes
error: error loading index de30f323: load <index/de30f3231c>: invalid data returned
Fatal: LoadIndex returned errors

If the repository structure is intact, rustic will show that no errors were found:

$ rustic check
...
load indexes
check all packs
check snapshots, trees and blobs
no errors were found

By default, the check command does not verify that the actual pack files on disk in the repository are unmodified, because doing so requires reading a copy of every pack file in the repository. To tell rustic to also verify the integrity of the pack files in the repository, use the --read-data flag:

$ rustic check --read-data
...
load indexes
check all packs
check snapshots, trees and blobs
read all data
[0:00] 100.00%  3 / 3 items
duration: 0:00
no errors were found

Note: Since --read-data has to download all pack files in the repository, beware that it might incur higher bandwidth costs than usual and also that it takes more time than the default check.

Alternatively, use the --read-data-subset parameter to check only a subset of the repository pack files at a time. It supports three ways to select a subset. One selects a specific part of pack files, the second and third selects a random subset of the pack files by the given percentage or size.

Use --read-data-subset=n/t to check a specific part of the repository pack files at a time. The parameter takes two values, n and t. When the check command runs, all pack files in the repository are logically divided in t (roughly equal) groups, and only files that belong to group number n are checked. For example, the following commands check all repository pack files over 5 separate invocations:

rustic check --read-data-subset=1/5
rustic check --read-data-subset=2/5
rustic check --read-data-subset=3/5
rustic check --read-data-subset=4/5
rustic check --read-data-subset=5/5

Use --read-data-subset=x% to check a randomly chosen subset of the repository pack files. It takes one parameter, x, the percentage of pack files to check as an integer or floating point number. This will not guarantee to cover all available pack files after sufficient runs, but it is easy to automate checking a small subset of data after each backup. For a floating point value the following command may be used:

rustic check --read-data-subset=2.5%

When checking bigger subsets you most likely want to specify the percentage as an integer:

rustic check --read-data-subset=10%

Use --read-data-subset=nS to check a randomly chosen subset of the repository pack files. It takes one parameter, nS, where ‘n’ is a whole number representing file size and ‘S’ is the unit of file size (K/M/G/T) of pack files to check. Behind the scenes, the specified size will be converted to percentage of the total repository size. The behaviour of the check command following this conversion will be the same as the percentage option above. For a file size value the following command may be used:

rustic check --read-data-subset=50M
rustic check --read-data-subset=10G
Last change: 2024-11-18, commit: 2230cc8