Repository Format

All data is stored in a restic repository. A repository is able to store data of several different types, which can later be requested based on an ID. This so-called “storage ID” is the SHA-256 hash of the content of a file. All files in a repository are only written once and never modified afterwards. Writing should occur atomically to prevent concurrent operations from reading incomplete files. This allows accessing and even writing to the repository with multiple clients in parallel. Only the prune operation removes data from the repository.

Repositories consist of several directories and a top-level file called config. For all other files stored in the repository, the name for the file is the lower case hexadecimal representation of the storage ID, which is the SHA-256 hash of the file’s contents. This allows for easy verification of files for accidental modifications, like disk read errors, by simply running the program sha256sum on the file and comparing its output to the file name. If the prefix of a filename is unique amongst all the other files in the same directory, the prefix may be used instead of the complete filename.

Apart from the files stored within the keys directory, all files are encrypted with AES-256 in counter mode (CTR). The integrity of the encrypted data is secured by a Poly1305-AES message authentication code (sometimes also referred to as a “signature”).

In the first 16 bytes of each encrypted file the initialisation vector (IV) is stored. It is followed by the encrypted data and completed by the 16 byte MAC. The format is: IV || CIPHERTEXT || MAC. The complete encryption overhead is 32 bytes. For each file, a new random IV is selected.

The file config is encrypted this way and contains a JSON document like the following:

{
    "version": 2,
    "id": "5956a3f67a6230d4a92cefb29529f10196c7d92582ec305fd71ff6d331d6271b",
    "chunker_polynomial": "25b468838dcb75"
}

After decryption, restic first checks that the version field contains a version number that it understands, otherwise it aborts. At the moment, the version is expected to be 1 or 2. The list of changes in the repository format is contained in the section “Changes” below.

The field id holds a unique ID which consists of 32 random bytes, encoded in hexadecimal. This uniquely identifies the repository, regardless if it is accessed via a remote storage backend or locally. The field chunker_polynomial contains a parameter that is used for splitting large files into smaller chunks (see below).

Last change: 2024-11-04, commit: 9c3225c