<!DOCTYPE html> <html class="writer-html5" lang="en" > <head> <meta charset="utf-8" /><meta name="generator" content="Docutils 0.17.1: http://docutils.sourceforge.net/" /> <meta name="viewport" content="width=device-width, initial-scale=1.0" /> <title>Terminology — restic 0.12.1 documentation</title> <link rel="stylesheet" href="_static/pygments.css" type="text/css" /> <link rel="stylesheet" href="_static/css/restic.css" type="text/css" /> <link rel="shortcut icon" href="_static/favicon.ico"/> <script data-url_root="./" id="documentation_options" src="_static/documentation_options.js"></script> <script src="_static/jquery.js"></script> <script src="_static/underscore.js"></script> <script src="_static/doctools.js"></script> <script src="_static/js/theme.js"></script> <link rel="index" title="Index" href="genindex.html" /> <link rel="search" title="Search" href="search.html" /> </head> <body class="wy-body-for-nav"> <div class="wy-grid-for-nav"> <nav data-toggle="wy-nav-shift" class="wy-nav-side"> <div class="wy-side-scroll"> <div class="wy-side-nav-search" > <a href="index.html" class="icon icon-home"> restic <img src="_static/logo.png" class="logo" alt="Logo"/> </a> <div class="version"> 0.12.1 </div> <div role="search"> <form id="rtd-search-form" class="wy-form" action="search.html" method="get"> <input type="text" name="q" placeholder="Search docs" /> <input type="hidden" name="check_keywords" value="yes" /> <input type="hidden" name="area" value="default" /> </form> </div> </div><div class="wy-menu wy-menu-vertical" data-spy="affix" role="navigation" aria-label="Navigation menu"> <ul> <li class="toctree-l1"><a class="reference internal" href="010_introduction.html">Introduction</a></li> <li class="toctree-l1"><a class="reference internal" href="020_installation.html">Installation</a></li> <li class="toctree-l1"><a class="reference internal" href="030_preparing_a_new_repo.html">Preparing a new repository</a></li> <li class="toctree-l1"><a class="reference 
internal" href="040_backup.html">Backing up</a></li> <li class="toctree-l1"><a class="reference internal" href="045_working_with_repos.html">Working with repositories</a></li> <li class="toctree-l1"><a class="reference internal" href="050_restore.html">Restoring from backup</a></li> <li class="toctree-l1"><a class="reference internal" href="060_forget.html">Removing backup snapshots</a></li> <li class="toctree-l1"><a class="reference internal" href="070_encryption.html">Encryption</a></li> <li class="toctree-l1"><a class="reference internal" href="075_scripting.html">Scripting</a></li> <li class="toctree-l1"><a class="reference internal" href="080_examples.html">Examples</a></li> <li class="toctree-l1"><a class="reference internal" href="090_participating.html">Participating</a></li> <li class="toctree-l1"><a class="reference internal" href="100_references.html">References</a></li> <li class="toctree-l1"><a class="reference internal" href="110_talks.html">Talks</a></li> <li class="toctree-l1"><a class="reference internal" href="faq.html">FAQ</a></li> <li class="toctree-l1"><a class="reference internal" href="manual_rest.html">Manual</a></li> <li class="toctree-l1"><a class="reference internal" href="developer_information.html">Developer Information</a></li> </ul> </div> </div> </nav> <section data-toggle="wy-nav-shift" class="wy-nav-content-wrap"><nav class="wy-nav-top" aria-label="Mobile navigation menu" > <i data-toggle="wy-nav-top" class="fa fa-bars"></i> <a href="index.html">restic</a> </nav> <div class="wy-nav-content"> <div class="rst-content"> <div role="navigation" aria-label="Page navigation"> <ul class="wy-breadcrumbs"> <li><a href="index.html" class="icon icon-home"></a> »</li> <li>Terminology</li> <li class="wy-breadcrumbs-aside"> <a href="_sources/design.rst.txt" rel="nofollow"> View page source</a> </li> </ul> <hr/> </div> <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article"> <div itemprop="articleBody"> 
<section id="terminology"> <h1>Terminology<a class="headerlink" href="#terminology" title="Permalink to this headline">¶</a></h1> <p>This section introduces terminology used in this document.</p> <p><em>Repository</em>: All data produced during a backup is sent to and stored in a repository in a structured form, for example in a file system hierarchy with several subdirectories. A repository implementation must be able to fulfill a number of operations, e.g. list the contents.</p> <p><em>Blob</em>: A Blob combines a number of data bytes with identifying information like the SHA-256 hash of the data and its length.</p> <p><em>Pack</em>: A Pack combines one or more Blobs, e.g. in a single file.</p> <p><em>Snapshot</em>: A Snapshot stands for the state of a file or directory that has been backed up at some point in time. The state here means the content and meta data like the name and modification time for the file or the directory and its contents.</p> <p><em>Storage ID</em>: A storage ID is the SHA-256 hash of the content stored in the repository. This ID is required in order to load the file from the repository.</p> </section> <section id="repository-format"> <h1>Repository Format<a class="headerlink" href="#repository-format" title="Permalink to this headline">¶</a></h1> <p>All data is stored in a restic repository. A repository is able to store data of several different types, which can later be requested based on an ID. This so-called “storage ID” is the SHA-256 hash of the content of a file. All files in a repository are only written once and never modified afterwards. This allows accessing and even writing to the repository with multiple clients in parallel. Only the <code class="docutils literal notranslate"><span class="pre">prune</span></code> operation removes data from the repository.</p> <p>Repositories consist of several directories and a top-level file called <code class="docutils literal notranslate"><span class="pre">config</span></code>. 
For all other files stored in the repository, the name for the file is the lower case hexadecimal representation of the storage ID, which is the SHA-256 hash of the file’s contents. This allows for easy verification of files for accidental modifications, like disk read errors, by simply running the program <code class="docutils literal notranslate"><span class="pre">sha256sum</span></code> on the file and comparing its output to the file name. If the prefix of a filename is unique amongst all the other files in the same directory, the prefix may be used instead of the complete filename.</p> <p>Apart from the files stored within the <code class="docutils literal notranslate"><span class="pre">keys</span></code> directory, all files are encrypted with AES-256 in counter mode (CTR). The integrity of the encrypted data is secured by a Poly1305-AES message authentication code (sometimes also referred to as a “signature”).</p> <p>In the first 16 bytes of each encrypted file the initialisation vector (IV) is stored. It is followed by the encrypted data and completed by the 16 byte MAC. The format is: <code class="docutils literal notranslate"><span class="pre">IV</span> <span class="pre">||</span> <span class="pre">CIPHERTEXT</span> <span class="pre">||</span> <span class="pre">MAC</span></code>. The complete encryption overhead is 32 bytes. 
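</p> <p>The two properties described above, file names that equal the SHA-256 hash of the contents and the <code class="docutils literal notranslate"><span class="pre">IV</span> <span class="pre">||</span> <span class="pre">CIPHERTEXT</span> <span class="pre">||</span> <span class="pre">MAC</span></code> framing, can be illustrated with a minimal Python sketch. The function names <code class="docutils literal notranslate"><span class="pre">verify_storage_id</span></code> and <code class="docutils literal notranslate"><span class="pre">split_encrypted</span></code> are hypothetical and not part of restic; the sketch only slices the bytes and does not decrypt or authenticate anything.</p> <div class="highlight-python notranslate"><div class="highlight"><pre><span></span>import hashlib

IV_SIZE = 16    # bytes of IV at the start of each encrypted file
MAC_SIZE = 16   # bytes of MAC at the end of each encrypted file
OVERHEAD = IV_SIZE + MAC_SIZE  # the 32 bytes of encryption overhead


def verify_storage_id(name, content):
    """Return True if name is the lower case hex SHA-256 of content."""
    return hashlib.sha256(content).hexdigest() == name


def split_encrypted(content):
    """Split the raw bytes of an encrypted file into (iv, ciphertext, mac)."""
    # A valid file holds at least the IV and the MAC.
    if len(content[:OVERHEAD]) != OVERHEAD:
        raise ValueError("file is shorter than the encryption overhead")
    iv = content[:IV_SIZE]
    ciphertext = content[IV_SIZE:-MAC_SIZE]
    mac = content[-MAC_SIZE:]
    return iv, ciphertext, mac
</pre></div> </div> <p>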
For each file, a new random IV is selected.</p> <p>The file <code class="docutils literal notranslate"><span class="pre">config</span></code> is encrypted this way and contains a JSON document like the following:</p> <div class="highlight-json notranslate"><div class="highlight"><pre><span></span><span class="p">{</span><span class="w"></span> <span class="w"> </span><span class="nt">"version"</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w"></span> <span class="w"> </span><span class="nt">"id"</span><span class="p">:</span><span class="w"> </span><span class="s2">"5956a3f67a6230d4a92cefb29529f10196c7d92582ec305fd71ff6d331d6271b"</span><span class="p">,</span><span class="w"></span> <span class="w"> </span><span class="nt">"chunker_polynomial"</span><span class="p">:</span><span class="w"> </span><span class="s2">"25b468838dcb75"</span><span class="w"></span> <span class="p">}</span><span class="w"></span> </pre></div> </div> <p>After decryption, restic first checks that the version field contains a version number that it understands, otherwise it aborts. At the moment, the version is expected to be 1. The field <code class="docutils literal notranslate"><span class="pre">id</span></code> holds a unique ID which consists of 32 random bytes, encoded in hexadecimal. This uniquely identifies the repository, regardless if it is accessed via SFTP or locally. 
The field <code class="docutils literal notranslate"><span class="pre">chunker_polynomial</span></code> contains a parameter that is used for splitting large files into smaller chunks (see below).</p> <section id="repository-layout"> <h2>Repository Layout<a class="headerlink" href="#repository-layout" title="Permalink to this headline">¶</a></h2> <p>The <code class="docutils literal notranslate"><span class="pre">local</span></code> and <code class="docutils literal notranslate"><span class="pre">sftp</span></code> backends are implemented using files and directories stored in a file system. The directory layout is the same for both backend types.</p> <p>The basic layout of a repository stored in a <code class="docutils literal notranslate"><span class="pre">local</span></code> or <code class="docutils literal notranslate"><span class="pre">sftp</span></code> backend is shown here:</p> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span>/tmp/restic-repo ├── config ├── data │ ├── 21 │ │ └── 2159dd48f8a24f33c307b750592773f8b71ff8d11452132a7b2e2a6a01611be1 │ ├── 32 │ │ └── 32ea976bc30771cebad8285cd99120ac8786f9ffd42141d452458089985043a5 │ ├── 59 │ │ └── 59fe4bcde59bd6222eba87795e35a90d82cd2f138a27b6835032b7b58173a426 │ ├── 73 │ │ └── 73d04e6125cf3c28a299cc2f3cca3b78ceac396e4fcf9575e34536b26782413c │ [...] 
├── index │ ├── c38f5fb68307c6a3e3aa945d556e325dc38f5fb68307c6a3e3aa945d556e325d │ └── ca171b1b7394d90d330b265d90f506f9984043b342525f019788f97e745c71fd ├── keys │ └── b02de829beeb3c01a63e6b25cbd421a98fef144f03b9a02e46eff9e2ca3f0bd7 ├── locks ├── snapshots │ └── 22a5af1bdc6e616f8a29579458c49627e01b32210d09adb288d1ecda7c5711ec └── tmp </pre></div> </div> <p>A local repository can be initialized with the <code class="docutils literal notranslate"><span class="pre">restic</span> <span class="pre">init</span></code> command, e.g.:</p> <div class="highlight-console notranslate"><div class="highlight"><pre><span></span><span class="gp">$ </span>restic -r /tmp/restic-repo init </pre></div> </div> <p>The local and sftp backends will auto-detect and accept all layouts described in the following sections, so that remote repositories mounted locally e.g. via fuse can be accessed. The layout auto-detection can be overridden by specifying the option <code class="docutils literal notranslate"><span class="pre">-o</span> <span class="pre">local.layout=default</span></code>, valid values are <code class="docutils literal notranslate"><span class="pre">default</span></code> and <code class="docutils literal notranslate"><span class="pre">s3legacy</span></code>. 
The option for the sftp backend is named <code class="docutils literal notranslate"><span class="pre">sftp.layout</span></code>, for the s3 backend <code class="docutils literal notranslate"><span class="pre">s3.layout</span></code>.</p> </section> <section id="s3-legacy-layout"> <h2>S3 Legacy Layout<a class="headerlink" href="#s3-legacy-layout" title="Permalink to this headline">¶</a></h2> <p>Unfortunately, during development the AWS S3 backend used slightly different paths (directory names use singular instead of plural for <code class="docutils literal notranslate"><span class="pre">key</span></code>, <code class="docutils literal notranslate"><span class="pre">lock</span></code>, and <code class="docutils literal notranslate"><span class="pre">snapshot</span></code> files), and the pack files are stored directly below the <code class="docutils literal notranslate"><span class="pre">data</span></code> directory. The S3 Legacy repository layout looks like this:</p> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span>/config /data ├── 2159dd48f8a24f33c307b750592773f8b71ff8d11452132a7b2e2a6a01611be1 ├── 32ea976bc30771cebad8285cd99120ac8786f9ffd42141d452458089985043a5 ├── 59fe4bcde59bd6222eba87795e35a90d82cd2f138a27b6835032b7b58173a426 ├── 73d04e6125cf3c28a299cc2f3cca3b78ceac396e4fcf9575e34536b26782413c [...] 
/index ├── c38f5fb68307c6a3e3aa945d556e325dc38f5fb68307c6a3e3aa945d556e325d └── ca171b1b7394d90d330b265d90f506f9984043b342525f019788f97e745c71fd /key └── b02de829beeb3c01a63e6b25cbd421a98fef144f03b9a02e46eff9e2ca3f0bd7 /lock /snapshot └── 22a5af1bdc6e616f8a29579458c49627e01b32210d09adb288d1ecda7c5711ec </pre></div> </div> <p>The S3 backend understands and accepts both forms, new backends are always created with the default layout for compatibility reasons.</p> </section> </section> <section id="pack-format"> <h1>Pack Format<a class="headerlink" href="#pack-format" title="Permalink to this headline">¶</a></h1> <p>All files in the repository except Key and Pack files just contain raw data, stored as <code class="docutils literal notranslate"><span class="pre">IV</span> <span class="pre">||</span> <span class="pre">Ciphertext</span> <span class="pre">||</span> <span class="pre">MAC</span></code>. Pack files may contain one or more Blobs of data.</p> <p>A Pack’s structure is as follows:</p> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">EncryptedBlob1</span> <span class="o">||</span> <span class="o">...</span> <span class="o">||</span> <span class="n">EncryptedBlobN</span> <span class="o">||</span> <span class="n">EncryptedHeader</span> <span class="o">||</span> <span class="n">Header_Length</span> </pre></div> </div> <p>At the end of the Pack file is a header, which describes the content. The header is encrypted and authenticated. <code class="docutils literal notranslate"><span class="pre">Header_Length</span></code> is the length of the encrypted header encoded as a four byte integer in little-endian encoding. Placing the header at the end of a file allows writing the blobs in a continuous stream as soon as they are read during the backup phase. 
This reduces code complexity and avoids having to re-write a file once the pack is complete and the content and length of the header are known.</p> <p>All the blobs (<code class="docutils literal notranslate"><span class="pre">EncryptedBlob1</span></code>, <code class="docutils literal notranslate"><span class="pre">EncryptedBlobN</span></code> etc.) are authenticated and encrypted independently. This enables repository reorganisation without having to touch the encrypted Blobs. It also allows efficient indexing, since only the header needs to be read in order to find out which Blobs are contained in the Pack. Since the header is authenticated, authenticity of the header can be checked without having to read the complete Pack.</p> <p>After decryption, a Pack’s header consists of the following elements:</p> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">Type_Blob1</span> <span class="o">||</span> <span class="n">Length</span><span class="p">(</span><span class="n">EncryptedBlob1</span><span class="p">)</span> <span class="o">||</span> <span class="n">Hash</span><span class="p">(</span><span class="n">Plaintext_Blob1</span><span class="p">)</span> <span class="o">||</span> <span class="p">[</span><span class="o">...</span><span class="p">]</span> <span class="n">Type_BlobN</span> <span class="o">||</span> <span class="n">Length</span><span class="p">(</span><span class="n">EncryptedBlobN</span><span class="p">)</span> <span class="o">||</span> <span class="n">Hash</span><span class="p">(</span><span class="n">Plaintext_BlobN</span><span class="p">)</span> <span class="o">||</span> </pre></div> </div> <p>This is enough to calculate the offsets for all the Blobs in the Pack. Length is the length of the encrypted Blob as a four byte integer in little-endian format. 
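</p> <p>As a rough sketch of how the offsets follow from the header, the Python snippet below walks an already decrypted header and accumulates the Blob lengths. It assumes that each entry is exactly 37 bytes (one type byte, a four byte little-endian length and a raw 32 byte SHA-256 hash) and that the Blobs are stored back to back from the start of the Pack. The names <code class="docutils literal notranslate"><span class="pre">header_length</span></code> and <code class="docutils literal notranslate"><span class="pre">parse_header</span></code> are hypothetical, and the type byte is returned as a plain integer without interpretation.</p> <div class="highlight-python notranslate"><div class="highlight"><pre><span></span>ENTRY_SIZE = 1 + 4 + 32  # type byte, length, SHA-256 hash (assumed raw bytes)


def header_length(pack_tail):
    """Read Header_Length from the last four bytes of a Pack."""
    return int.from_bytes(pack_tail[-4:], "little")


def parse_header(plaintext_header):
    """Return (type, offset, length, hash_hex) for each Blob in the Pack."""
    if len(plaintext_header) % ENTRY_SIZE != 0:
        raise ValueError("header is not a whole number of entries")
    entries = []
    offset = 0  # assumption: the first Blob starts at the beginning of the Pack
    for pos in range(0, len(plaintext_header), ENTRY_SIZE):
        entry = plaintext_header[pos:pos + ENTRY_SIZE]
        blob_type = entry[0]
        length = int.from_bytes(entry[1:5], "little")
        blob_hash = entry[5:].hex()
        entries.append((blob_type, offset, length, blob_hash))
        offset += length  # the next Blob follows directly after this one
    return entries
</pre></div> </div> <p>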
The type field is a one byte field and labels the content of a blob according to the following table:</p> <table class="docutils align-default"> <colgroup> <col style="width: 42%" /> <col style="width: 58%" /> </colgroup> <thead> <tr class="row-odd"><th class="head"><p>Type</p></th> <th class="head"><p>Meaning</p></th> </tr> </thead> <tbody> <tr class="row-even"><td><p>0</p></td> <td><p>data</p></td> </tr> <tr class="row-odd"><td><p>1</p></td> <td><p>tree</p></td> </tr> </tbody> </table> <p>All other types are invalid, more types may be added in the future.</p> <p>For reconstructing the index or parsing a pack without an index, first the last four bytes must be read in order to find the length of the header. Afterwards, the header can be read and parsed, which yields all plaintext hashes, types, offsets and lengths of all included blobs.</p> </section> <section id="indexing"> <h1>Indexing<a class="headerlink" href="#indexing" title="Permalink to this headline">¶</a></h1> <p>Index files contain information about Data and Tree Blobs and the Packs they are contained in and store this information in the repository. When the local cached index is not accessible any more, the index files can be downloaded and used to reconstruct the index. The files are encrypted and authenticated like Data and Tree Blobs, so the outer structure is <code class="docutils literal notranslate"><span class="pre">IV</span> <span class="pre">||</span> <span class="pre">Ciphertext</span> <span class="pre">||</span> <span class="pre">MAC</span></code> again. 
The plaintext consists of a JSON document like the following:</p> <div class="highlight-json notranslate"><div class="highlight"><pre><span></span>{ "supersedes": [ "ed54ae36197f4745ebc4b54d10e0f623eaaaedd03013eb7ae90df881b7781452" ], "packs": [ { "id": "73d04e6125cf3c28a299cc2f3cca3b78ceac396e4fcf9575e34536b26782413c", "blobs": [ { "id": "3ec79977ef0cf5de7b08cd12b874cd0f62bbaf7f07f3497a5b1bbcc8cb39b1ce", "type": "data", "offset": 0, "length": 25 },{ "id": "9ccb846e60d90d4eb915848add7aa7ea1e4bbabfc60e573db9f7bfb2789afbae", "type": "tree", "offset": 38, "length": 100 }, { "id": "d3dc577b4ffd38cc4b32122cabf8655a0223ed22edfd93b353dc0c3f2b0fdf66", "type": "data", "offset": 150, "length": 123 } ] }, [...] ] } </pre></div> </div> <p>This JSON document lists Packs and the blobs contained therein. In this example, the Pack <code class="docutils literal notranslate"><span class="pre">73d04e61</span></code> contains two data Blobs and one tree Blob; the plaintext hash of each is listed in its entry.</p> <p>The field <code class="docutils literal notranslate"><span class="pre">supersedes</span></code> lists the storage IDs of index files that have been replaced with the current index file. This happens when index files are repacked, for example when old snapshots are removed and Packs are recombined.</p> <p>There may be an arbitrary number of index files, containing information on non-disjoint sets of Packs. The number of packs described in a single file is chosen so that the file size is kept below 8 MiB.</p> </section> <section id="keys-encryption-and-mac"> <h1>Keys, Encryption and MAC<a class="headerlink" href="#keys-encryption-and-mac" title="Permalink to this headline">¶</a></h1> <p>All data stored by restic in the repository is encrypted with AES-256 in counter mode and authenticated using Poly1305-AES. To encrypt new data, 16 bytes are first read from a cryptographically secure pseudorandom number generator as a random nonce. 
This is used both as the IV for counter mode and the nonce for Poly1305. This operation needs three keys: a 32 byte key for AES-256 encryption, a 16 byte AES key and a 16 byte key for Poly1305. For details see the original paper <a class="reference external" href="https://cr.yp.to/mac/poly1305-20050329.pdf">The Poly1305-AES message-authentication code</a> by Dan Bernstein. The data is then encrypted with AES-256 and afterwards a message authentication code (MAC) is computed over the ciphertext; everything is then stored as IV || CIPHERTEXT || MAC.</p> <p>The directory <code class="docutils literal notranslate"><span class="pre">keys</span></code> contains key files. These are simple JSON documents which contain all data that is needed to derive the repository’s master encryption and message authentication keys from a user’s password. The JSON document from the repository can be pretty-printed for example by using the Python module <code class="docutils literal notranslate"><span class="pre">json</span></code> (shortened to increase readability):</p> <div class="highlight-default notranslate"><div class="highlight"><pre><span></span>$ python -mjson.tool /tmp/restic-repo/keys/b02de82* { "hostname": "kasimir", "username": "fd0", "kdf": "scrypt", "N": 65536, "r": 8, "p": 1, "created": "2015-01-02T18:10:13.48307196+01:00", "data": "tGwYeKoM0C4j4/9DFrVEmMGAldvEn/+iKC3te/QE/6ox/V4qz58FUOgMa0Bb1cIJ6asrypCx/Ti/pRXCPHLDkIJbNYd2ybC+fLhFIJVLCvkMS+trdywsUkglUbTbi+7+Ldsul5jpAj9vTZ25ajDc+4FKtWEcCWL5ICAOoTAxnPgT+Lh8ByGQBH6KbdWabqamLzTRWxePFoYuxa7yXgmj9A==", "salt": "uW4fEI1+IOzj7ED9mVor+yTSJFd68DGlGOeLgJELYsTU5ikhG/83/+jGd4KKAaQdSrsfzrdOhAMftTSih5Ux6w==" } </pre></div> </div> <p>When the repository is opened by restic, the user is prompted for the repository password. 
This is then used with <code class="docutils literal notranslate"><span class="pre">scrypt</span></code>, a key derivation function (KDF), and the supplied parameters (<code class="docutils literal notranslate"><span class="pre">N</span></code>, <code class="docutils literal notranslate"><span class="pre">r</span></code>, <code class="docutils literal notranslate"><span class="pre">p</span></code> and <code class="docutils literal notranslate"><span class="pre">salt</span></code>) to derive 64 key bytes. The first 32 bytes are used as the encryption key (for AES-256) and the last 32 bytes are used as the message authentication key (for Poly1305-AES). These last 32 bytes are divided into a 16 byte AES key <code class="docutils literal notranslate"><span class="pre">k</span></code> followed by 16 bytes of secret key <code class="docutils literal notranslate"><span class="pre">r</span></code>. The key <code class="docutils literal notranslate"><span class="pre">r</span></code> is then masked for use with Poly1305 (see the paper for details).</p> <p>Those keys are used to authenticate and decrypt the bytes contained in the JSON field <code class="docutils literal notranslate"><span class="pre">data</span></code> with AES-256 and Poly1305-AES as if they were any other blob (after removing the Base64 encoding). If the password is incorrect or the key file has been tampered with, the computed MAC will not match the last 16 bytes of the data, and restic exits with an error. Otherwise, the data yields a JSON document which contains the master encryption and message authentication keys for this repository (encoded in Base64). 
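</p> <p>The derivation step described above, 64 bytes from <code class="docutils literal notranslate"><span class="pre">scrypt</span></code> split into a 32 byte encryption key, a 16 byte key <code class="docutils literal notranslate"><span class="pre">k</span></code> and 16 bytes of <code class="docutils literal notranslate"><span class="pre">r</span></code>, might look roughly like the following Python sketch. It uses <code class="docutils literal notranslate"><span class="pre">hashlib.scrypt</span></code> from the standard library. The function name <code class="docutils literal notranslate"><span class="pre">derive_keys</span></code>, the dictionary keys, the UTF-8 encoding of the password and the <code class="docutils literal notranslate"><span class="pre">maxmem</span></code> headroom are assumptions made for illustration. The masking of <code class="docutils literal notranslate"><span class="pre">r</span></code> and the decryption of the <code class="docutils literal notranslate"><span class="pre">data</span></code> field are left out.</p> <div class="highlight-python notranslate"><div class="highlight"><pre><span></span>import base64
import hashlib


def derive_keys(password, salt_b64, n, r, p):
    """Derive the keys that unlock a key file from the user's password."""
    salt = base64.b64decode(salt_b64)  # salt is stored Base64 encoded
    raw = hashlib.scrypt(
        password.encode("utf-8"),      # assumption: password bytes are UTF-8
        salt=salt,
        n=n,
        r=r,
        p=p,
        # scrypt needs roughly 128 * r * (n + p + 2) bytes; allow twice that.
        maxmem=2 * 128 * r * (n + p + 2),
        dklen=64,
    )
    return {
        "encrypt": raw[:32],   # encryption key for AES-256
        "mac_k": raw[32:48],   # 16 byte AES key k for Poly1305-AES
        "mac_r": raw[48:64],   # 16 bytes of secret key r, not yet masked
    }
</pre></div> </div> <p>With the parameters from the key file shown earlier, this would be called as <code class="docutils literal notranslate"><span class="pre">derive_keys(password,</span> <span class="pre">salt,</span> <span class="pre">65536,</span> <span class="pre">8,</span> <span class="pre">1)</span></code>, which uses about 64 MiB of memory.</p> <p>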
The command <code class="docutils literal notranslate"><span class="pre">restic</span> <span class="pre">cat</span> <span class="pre">masterkey</span></code> can be used as follows to decrypt and pretty-print the master key:</p> <div class="highlight-console notranslate"><div class="highlight"><pre><span></span><span class="gp">$ </span>restic -r /tmp/restic-repo cat masterkey <span class="go">{</span> <span class="go"> "mac": {</span> <span class="go"> "k": "evFWd9wWlndL9jc501268g==",</span> <span class="go"> "r": "E9eEDnSJZgqwTOkDtOp+Dw=="</span> <span class="go"> },</span> <span class="go"> "encrypt": "UQCqa0lKZ94PygPxMRqkePTZnHRYh1k1pX2k2lM2v3Q="</span> <span class="go">}</span> </pre></div> </div> <p>All data in the repository is encrypted and authenticated with these master keys. For encryption, the AES-256 algorithm in counter mode is used. For message authentication, Poly1305-AES is used as described above.</p> <p>A repository can have several different passwords, with a key file for each. This way, the password can be changed without having to re-encrypt all data.</p> </section> <section id="snapshots"> <h1>Snapshots<a class="headerlink" href="#snapshots" title="Permalink to this headline">¶</a></h1> <p>A snapshot represents a directory with all files and sub-directories at a given point in time. For each backup that is made, a new snapshot is created. A snapshot is a JSON document that is stored in an encrypted file below the directory <code class="docutils literal notranslate"><span class="pre">snapshots</span></code> in the repository. The filename is the storage ID. 
This string is unique and used within restic to uniquely identify a snapshot.</p> <p>The command <code class="docutils literal notranslate"><span class="pre">restic</span> <span class="pre">cat</span> <span class="pre">snapshot</span></code> can be used as follows to decrypt and pretty-print the contents of a snapshot file:</p> <div class="highlight-console notranslate"><div class="highlight"><pre><span></span><span class="gp">$ </span>restic -r /tmp/restic-repo cat snapshot 251c2e58 <span class="go">enter password for repository:</span> <span class="go">{</span> <span class="go"> "time": "2015-01-02T18:10:50.895208559+01:00",</span> <span class="go"> "tree": "2da81727b6585232894cfbb8f8bdab8d1eccd3d8f7c92bc934d62e62e618ffdf",</span> <span class="go"> "dir": "/tmp/testdata",</span> <span class="go"> "hostname": "kasimir",</span> <span class="go"> "username": "fd0",</span> <span class="go"> "uid": 1000,</span> <span class="go"> "gid": 100,</span> <span class="go"> "tags": [</span> <span class="go"> "NL"</span> <span class="go"> ]</span> <span class="go">}</span> </pre></div> </div> <p>Here it can be seen that this snapshot represents the contents of the directory <code class="docutils literal notranslate"><span class="pre">/tmp/testdata</span></code>. The most important field is <code class="docutils literal notranslate"><span class="pre">tree</span></code>. When the meta data (e.g. the tags) of a snapshot change, the snapshot needs to be re-encrypted and saved. This will change the storage ID, so in order to relate these seemingly different snapshots, a field <code class="docutils literal notranslate"><span class="pre">original</span></code> is introduced which contains the ID of the original snapshot, e.g. 
after adding the tag <code class="docutils literal notranslate"><span class="pre">DE</span></code> to the snapshot above it becomes:</p> <div class="highlight-console notranslate"><div class="highlight"><pre><span></span><span class="gp">$ </span>restic -r /tmp/restic-repo cat snapshot 22a5af1b <span class="go">enter password for repository:</span> <span class="go">{</span> <span class="go"> "time": "2015-01-02T18:10:50.895208559+01:00",</span> <span class="go"> "tree": "2da81727b6585232894cfbb8f8bdab8d1eccd3d8f7c92bc934d62e62e618ffdf",</span> <span class="go"> "dir": "/tmp/testdata",</span> <span class="go"> "hostname": "kasimir",</span> <span class="go"> "username": "fd0",</span> <span class="go"> "uid": 1000,</span> <span class="go"> "gid": 100,</span> <span class="go"> "tags": [</span> <span class="go"> "NL",</span> <span class="go"> "DE"</span> <span class="go"> ],</span> <span class="go"> "original": "251c2e5841355f743f9d4ffd3260bee765acee40a6229857e32b60446991b837"</span> <span class="go">}</span> </pre></div> </div> <p>Once introduced, the <code class="docutils literal notranslate"><span class="pre">original</span></code> field is not modified when the snapshot’s meta data is changed again.</p> <p>All content within a restic repository is referenced according to its SHA-256 hash. Before saving, each file is split into variable sized Blobs of data. The SHA-256 hashes of all Blobs are saved in an ordered list which then represents the content of the file.</p> <p>In order to relate these plaintext hashes to the actual location within a Pack file, an index is used. If the index is not available, the headers of all Pack files can be read instead.</p> </section> <section id="trees-and-data"> <h1>Trees and Data<a class="headerlink" href="#trees-and-data" title="Permalink to this headline">¶</a></h1> <p>A snapshot references a tree by the SHA-256 hash of the JSON string representation of its contents. 
Trees and data are saved in pack files in a subdirectory of the directory <code class="docutils literal notranslate"><span class="pre">data</span></code>.</p> <p>The command <code class="docutils literal notranslate"><span class="pre">restic</span> <span class="pre">cat</span> <span class="pre">blob</span></code> can be used to inspect the tree referenced above (piping the output of the command to <code class="docutils literal notranslate"><span class="pre">jq</span> <span class="pre">.</span></code> so that the JSON is indented):</p> <div class="highlight-console notranslate"><div class="highlight"><pre><span></span><span class="gp">$ </span>restic -r /tmp/restic-repo cat blob 2da81727b6585232894cfbb8f8bdab8d1eccd3d8f7c92bc934d62e62e618ffdf <span class="p">|</span> jq . <span class="go">enter password for repository:</span> <span class="go">{</span> <span class="go"> "nodes": [</span> <span class="go"> {</span> <span class="go"> "name": "testdata",</span> <span class="go"> "type": "dir",</span> <span class="go"> "mode": 493,</span> <span class="go"> "mtime": "2014-12-22T14:47:59.912418701+01:00",</span> <span class="go"> "atime": "2014-12-06T17:49:21.748468803+01:00",</span> <span class="go"> "ctime": "2014-12-22T14:47:59.912418701+01:00",</span> <span class="go"> "uid": 1000,</span> <span class="go"> "gid": 100,</span> <span class="go"> "user": "fd0",</span> <span class="go"> "inode": 409704562,</span> <span class="go"> "content": null,</span> <span class="go"> "subtree": "b26e315b0988ddcd1cee64c351d13a100fedbc9fdbb144a67d1b765ab280b4dc"</span> <span class="go"> }</span> <span class="go"> ]</span> <span class="go">}</span> </pre></div> </div> <p>A tree contains a list of entries (in the field <code class="docutils literal notranslate"><span class="pre">nodes</span></code>) which contain meta data like a name and timestamps. 
When the entry references a directory, the field <code class="docutils literal notranslate"><span class="pre">subtree</span></code> contains the plain text ID of another tree object.</p> <p>When the command <code class="docutils literal notranslate"><span class="pre">restic</span> <span class="pre">cat</span> <span class="pre">blob</span></code> is used, the plaintext ID is needed to print a tree. The tree referenced above can be dumped as follows:</p> <div class="highlight-console notranslate"><div class="highlight"><pre><span></span><span class="gp">$ </span>restic -r /tmp/restic-repo cat blob b26e315b0988ddcd1cee64c351d13a100fedbc9fdbb144a67d1b765ab280b4dc <span class="go">enter password for repository:</span> <span class="go">{</span> <span class="go"> "nodes": [</span> <span class="go"> {</span> <span class="go"> "name": "testfile",</span> <span class="go"> "type": "file",</span> <span class="go"> "mode": 420,</span> <span class="go"> "mtime": "2014-12-06T17:50:23.34513538+01:00",</span> <span class="go"> "atime": "2014-12-06T17:50:23.338468713+01:00",</span> <span class="go"> "ctime": "2014-12-06T17:50:23.34513538+01:00",</span> <span class="go"> "uid": 1000,</span> <span class="go"> "gid": 100,</span> <span class="go"> "user": "fd0",</span> <span class="go"> "inode": 416863351,</span> <span class="go"> "size": 1234,</span> <span class="go"> "links": 1,</span> <span class="go"> "content": [</span> <span class="go"> "50f77b3b4291e8411a027b9f9b9e64658181cc676ce6ba9958b95f268cb1109d"</span> <span class="go"> ]</span> <span class="go"> },</span> <span class="go"> [...]</span> <span class="go"> ]</span> <span class="go">}</span> </pre></div> </div> <p>This tree contains a file entry. 
This time, the <code class="docutils literal notranslate"><span class="pre">subtree</span></code> field is not present and the <code class="docutils literal notranslate"><span class="pre">content</span></code> field contains a list with one plaintext SHA-256 hash.</p> <p>The command <code class="docutils literal notranslate"><span class="pre">restic</span> <span class="pre">cat</span> <span class="pre">blob</span></code> can also be used to extract and decrypt data given a plaintext ID, e.g. for the data mentioned above:</p> <div class="highlight-console notranslate"><div class="highlight"><pre><span></span><span class="gp">$ </span>restic -r /tmp/restic-repo cat blob 50f77b3b4291e8411a027b9f9b9e64658181cc676ce6ba9958b95f268cb1109d <span class="p">|</span> sha256sum <span class="go">enter password for repository:</span> <span class="go">50f77b3b4291e8411a027b9f9b9e64658181cc676ce6ba9958b95f268cb1109d -</span> </pre></div> </div> <p>As can be seen from the output of the program <code class="docutils literal notranslate"><span class="pre">sha256sum</span></code>, the hash matches the plaintext hash from the list included in the tree above, so the correct data has been returned.</p> </section> <section id="locks"> <h1>Locks<a class="headerlink" href="#locks" title="Permalink to this headline">¶</a></h1> <p>The restic repository structure is designed in a way that allows parallel access by multiple instances of restic and even parallel writes. However, there are some functions that work more efficiently with, or even require, exclusive access to the repository. In order to implement these functions, restic processes are required to create a lock on the repository before doing anything.</p> <p>Locks come in two types: exclusive and non-exclusive. At most one process can have an exclusive lock on the repository, and during that time there must not be any other locks (exclusive or non-exclusive).
There may be multiple non-exclusive locks in parallel.</p> <p>A lock is a file in the subdir <code class="docutils literal notranslate"><span class="pre">locks</span></code> whose filename is the storage ID of the contents. It is encrypted and authenticated the same way as other files in the repository and contains the following JSON structure:</p> <div class="highlight-json notranslate"><div class="highlight"><pre><span></span><span class="p">{</span><span class="w"></span> <span class="w"> </span><span class="nt">"time"</span><span class="p">:</span><span class="w"> </span><span class="s2">"2015-06-27T12:18:51.759239612+02:00"</span><span class="p">,</span><span class="w"></span> <span class="w"> </span><span class="nt">"exclusive"</span><span class="p">:</span><span class="w"> </span><span class="kc">false</span><span class="p">,</span><span class="w"></span> <span class="w"> </span><span class="nt">"hostname"</span><span class="p">:</span><span class="w"> </span><span class="s2">"kasimir"</span><span class="p">,</span><span class="w"></span> <span class="w"> </span><span class="nt">"username"</span><span class="p">:</span><span class="w"> </span><span class="s2">"fd0"</span><span class="p">,</span><span class="w"></span> <span class="w"> </span><span class="nt">"pid"</span><span class="p">:</span><span class="w"> </span><span class="mi">13607</span><span class="p">,</span><span class="w"></span> <span class="w"> </span><span class="nt">"uid"</span><span class="p">:</span><span class="w"> </span><span class="mi">1000</span><span class="p">,</span><span class="w"></span> <span class="w"> </span><span class="nt">"gid"</span><span class="p">:</span><span class="w"> </span><span class="mi">100</span><span class="w"></span> <span class="p">}</span><span class="w"></span> </pre></div> </div> <p>The field <code class="docutils literal notranslate"><span class="pre">exclusive</span></code> defines the type of lock. 
When a new lock is to be created, restic checks all locks in the repository. When a lock is found, restic tests whether it is stale, which is the case for locks with timestamps older than 30 minutes. If the lock was created on the same machine, restic also tests younger locks by sending a signal to the process that created the lock to check whether it is still alive. If that fails, restic assumes that the process is dead and considers the lock to be stale.</p> <p>When a new lock is to be created and no other conflicting locks are detected, restic creates a new lock, waits, and checks if other locks appeared in the repository. Depending on the type of the other locks and the lock to be created, restic either continues or fails.</p> </section> <section id="backups-and-deduplication"> <h1>Backups and Deduplication<a class="headerlink" href="#backups-and-deduplication" title="Permalink to this headline">¶</a></h1> <p>To create a backup, restic scans the source directory for all files, sub-directories and other entries. The data from each file is split into variable length Blobs cut at offsets defined by a sliding window of 64 bytes. The implementation uses Rabin Fingerprints for implementing this Content Defined Chunking (CDC). An irreducible polynomial is selected at random and saved in the file <code class="docutils literal notranslate"><span class="pre">config</span></code> when a repository is initialized, so that watermark attacks are much harder.</p> <p>Files smaller than 512 KiB are not split; Blobs are 512 KiB to 8 MiB in size. The implementation aims for 1 MiB Blob size on average.</p> <p>For modified files, only modified Blobs have to be saved in a subsequent backup.
This even works if bytes are inserted or removed at arbitrary positions within the file.</p> </section> <section id="threat-model"> <h1>Threat Model<a class="headerlink" href="#threat-model" title="Permalink to this headline">¶</a></h1> <p>The design goals for restic include being able to securely store backups in a location that is not completely trusted (e.g., a shared system where others can potentially access the files or, in the case of the system administrator, even modify or delete them).</p> <p>General assumptions:</p> <ul class="simple"> <li><p>The host system a backup is created on is trusted. This is the most basic requirement, and it is essential for creating trustworthy backups.</p></li> <li><p>The user uses an authentic copy of restic.</p></li> <li><p>The user does not share the repository password with an attacker.</p></li> <li><p>The restic backup program is not designed to protect against attackers deleting files at the storage location. There is nothing that can be done about this. If this needs to be guaranteed, get a secure location without any access from third parties.</p></li> <li><p>The whole repository is re-encrypted if a key is leaked. With the current key management design, it is impossible to securely revoke a leaked key without re-encrypting the whole repository.</p></li> <li><p>Advances in cryptographic attacks against the cryptographic primitives used by restic (i.e., AES-256-CTR-Poly1305-AES and SHA-256) have not occurred. Such advances could render the confidentiality or integrity protections provided by restic useless.</p></li> <li><p>Sufficient advances in computing have not occurred to make brute-force attacks against restic’s cryptographic protections feasible.</p></li> </ul> <p>The restic backup program guarantees the following:</p> <ul class="simple"> <li><p>Unencrypted content of stored files and metadata cannot be accessed without a password for the repository.
Everything except the metadata included for informational purposes in the key files is encrypted and authenticated. The cache is also encrypted to prevent metadata leaks.</p></li> <li><p>Modifications to data stored in the repository (due to bad RAM, broken hard disk, etc.) can be detected.</p></li> <li><p>Data that has been tampered with will not be decrypted.</p></li> </ul> <p>With the aforementioned assumptions and guarantees in mind, the following are examples of things an adversary could achieve in various circumstances.</p> <p>An adversary with read access to your backup storage location could:</p> <ul class="simple"> <li><p>Attempt a brute-force password-guessing attack against a copy of the repository (even more reason to use long, 30+ character passwords).</p></li> <li><p>Infer which packs probably contain trees via file access patterns.</p></li> <li><p>Infer the size of backups by using creation timestamps of repository objects.</p></li> </ul> <p>An adversary with network access could:</p> <ul class="simple"> <li><p>Attempt to DoS the server storing the backup repository or the network connection between client and server.</p></li> <li><p>Determine from where you create your backups (i.e., the location where the requests originate).</p></li> <li><p>Determine where you store your backups (i.e., which provider/target system).</p></li> <li><p>Infer the size of backups by using creation timestamps of repository objects.</p></li> </ul> <p>The following are examples of the implications associated with violating some of the aforementioned assumptions.</p> <p>An adversary who compromises (via malware, physical access, etc.)
the host system making backups could:</p> <ul class="simple"> <li><p>Render the entire backup process untrustworthy (e.g., intercept password, copy files, manipulate data).</p></li> <li><p>Create snapshots (containing garbage data) which cover all modified files and wait until a trusted host has used forget often enough to forget all correct snapshots.</p></li> <li><p>Create a garbage snapshot for every existing snapshot with a slightly different timestamp and wait until forget has run, thereby removing all correct snapshots at once.</p></li> </ul> <p>An adversary with write access to your files at the storage location could:</p> <ul class="simple"> <li><p>Delete or manipulate your backups, thereby impairing your ability to restore files from the compromised storage location.</p></li> <li><p>Determine which files belong to which snapshot (e.g., based on the timestamps of the stored files). When only these files are deleted, the particular snapshot vanishes and all snapshots depending on data that has been added in the snapshot cannot be restored completely. Restic is not designed to detect this attack.</p></li> </ul> <p>An adversary who compromises a host system with append-only access to the backup repository could:</p> <ul class="simple"> <li><p>Render new backups untrustworthy <em>after</em> the host has been compromised (due to having complete control over new backups). An attacker cannot delete or manipulate old backups. As such, restoring old snapshots created <em>before</em> a host compromise remains possible. <em>Note: It is <strong>not</strong> recommended to ever run forget automatically for an append-only backup to which a potentially compromised host has access because an attacker using fake snapshots could cause forget to remove correct snapshots.</em></p></li> </ul> <p>An adversary who has a leaked key for a repository which has not been re-encrypted could:</p> <ul class="simple"> <li><p>Decrypt existing and future backup data.
If multiple hosts back up into the same repository, an attacker will get access to the backup data of every host.</p></li> </ul> </section> </div> </div> <footer> <hr/> <div role="contentinfo"> <p>© Copyright 2024, restic authors.</p> </div> Built with <a href="https://www.sphinx-doc.org/">Sphinx</a> using a <a href="https://github.com/readthedocs/sphinx_rtd_theme">theme</a> provided by <a href="https://readthedocs.org">Read the Docs</a>. </footer> </div> </div> </section> </div> <script> jQuery(function () { SphinxRtdTheme.Navigation.enable(true); }); </script> </body> </html>