<style>body{max-width:650px;margin:20px auto;color:#252525}</style>
<h1>s3-bsync</h1>
<p>Bidirectional tool for syncing local filesystem directories with S3
buckets. Written by <a href="https://joshstock.in">Josh Stockin</a>.</p>
<h3>Behavior</h3>
<p>After an initial sync (in which conflicts and files not common to both hosts
are handled manually), the S3 bucket takes precedence. Files with the same size
and modification time on both hosts are skipped. A newer copy of a file always
overwrites the older copy, regardless of any changes made to the older copy.
(In other words, <strong>there is no manual conflict resolution after the first
sync. Conflicting files are handled automatically as described here.</strong>
This script is meant to run without input or output by default, for example in
a cron job.) Untracked files, whether in S3 or on the local machine, are copied
to the opposite host and tracked. Tracked files that are moved or removed on
either host are moved or removed on the other host, with the tracking adjusted
accordingly. Ultimately, after a sync, the <code>.state.s3sync</code> state
tracking file should match the contents of the S3 bucket's synced directories.</p>
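<p>The comparison rule above can be sketched in Python as follows. This is an
illustration only, not the tool's actual code: the names <code>Action</code>,
<code>FileInfo</code> and <code>decide</code> are hypothetical, and the handling
of two copies with equal modification times but different sizes (S3 wins) is an
assumption drawn from the statement that the bucket takes precedence.</p>
<pre><code class="language-python">from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    SKIP = "skip"          # copies are considered identical
    UPLOAD = "upload"      # local copy overwrites the S3 copy
    DOWNLOAD = "download"  # S3 copy overwrites the local copy


@dataclass(frozen=True)
class FileInfo:
    size: int   # size in bytes
    mtime: int  # modification time, seconds since the epoch


def decide(local: FileInfo, remote: FileInfo) -> Action:
    """Pick a sync action for a file that exists on both hosts."""
    # Same size and modification time: nothing to do.
    if local.size == remote.size and local.mtime == remote.mtime:
        return Action.SKIP
    # The newer copy always wins, whatever changed in the older one.
    if local.mtime > remote.mtime:
        return Action.UPLOAD
    # Assumption: when the times are equal but the sizes differ,
    # the S3 copy takes precedence.
    return Action.DOWNLOAD
</code></pre>
<p>For example, <code>decide(FileInfo(10, 200), FileInfo(10, 100))</code> selects
an upload, because the local copy is newer.</p>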
<h3>Installation</h3>
<p>Depends on <code>python3</code> and <code>aws-cli</code>. Both can be installed with your package
manager.</p>
<p>Install with <code>python3 setup.py install</code>. Root permissions are not required.
<em>This program does not manage S3 authentication or <code>aws-cli</code> credentials. You
must set these up yourself with the <code>aws configure</code> command, or by other means
such as IAM/S3 policy.</em></p>
<h4>Source files</h4>
<p><code>setup.py</code> manages installation metadata.</p>
<h4>Created files and .s3syncignore</h4>
<p>Sync information is stored in <code>~/.state.s3sync</code> by default, but this
location can be reconfigured. The file uses the binary s3sync file format
specified later in this document. To intentionally ignore untracked files,
use a <code>.s3syncignore</code> file, in the same manner as
<a href="https://git-scm.com/docs/gitignore"><code>.gitignore</code></a>.</p>
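<p>As a rough illustration of ignore-file matching, the sketch below filters
paths against patterns read from a <code>.s3syncignore</code>-style file. It is
a deliberately simplified subset of <code>.gitignore</code> behavior and is not
taken from the tool itself: it handles blank lines, <code>#</code> comments and
shell-style wildcards only, and does not handle negation (<code>!</code>),
anchoring, directory-only patterns or <code>**</code>. The function names are
hypothetical.</p>
<pre><code class="language-python">from fnmatch import fnmatchcase
from typing import Iterable, List


def load_patterns(lines: Iterable[str]) -> List[str]:
    """Keep the non-empty lines that are not comments."""
    patterns = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(relative_path: str, patterns: Iterable[str]) -> bool:
    """Match the whole relative path, or its final component, against each pattern."""
    name = relative_path.rsplit("/", 1)[-1]
    return any(
        fnmatchcase(relative_path, pattern) or fnmatchcase(name, pattern)
        for pattern in patterns
    )
</code></pre>
<p>With the patterns <code>*.tmp</code> and <code>cache/*</code>, the path
<code>notes/draft.tmp</code> would be ignored and <code>notes/readme.md</code>
would not.</p>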
<h2>s3sync file format</h2>
<p>The <code>.state.s3sync</code> file saved in the home directory records the state of
tracked objects from the specified S3 buckets and key prefixes as of the last sync.</p>
<h3>Control bytes</h3>
<pre><code>90 - Begin bucket block
91 - End bucket block
92 - Begin directory map
93 - End directory map
94 - Begin object block
95 - End object block
96 - ETag type MD5
97 - ETag type null-terminated string (non-MD5)
98
99
9A - Begin metadata block
9B - End metadata block
9C
9D - File signature byte
9E
9F - File signature byte
</code></pre>
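<p>The table above maps naturally onto an enumeration. The following sketch
reads the listed values as hexadecimal; the class and member names are
hypothetical, chosen only for illustration. The bytes without a description
(98, 99, 9C, 9E) are left out.</p>
<pre><code class="language-python">from enum import IntEnum


class ControlByte(IntEnum):
    BEGIN_BUCKET = 0x90
    END_BUCKET = 0x91
    BEGIN_DIRECTORY_MAP = 0x92
    END_DIRECTORY_MAP = 0x93
    BEGIN_OBJECT = 0x94
    END_OBJECT = 0x95
    ETAG_MD5 = 0x96
    ETAG_STRING = 0x97
    BEGIN_METADATA = 0x9A
    END_METADATA = 0x9B


# The two file signature bytes, followed by ASCII "S3" (53 33),
# make up the 4-byte signature used in the header.
SIGNATURE = bytes([0x9D, 0x9F]) + b"S3"


def describe(byte: int) -> str:
    """Return the control byte's name, or a placeholder for any other value."""
    try:
        return ControlByte(byte).name
    except ValueError:
        return "UNASSIGNED_OR_DATA"
</code></pre>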
<h3>File structure</h3>
<pre><code>Header {
  File signature - 4 bytes - 9D 9F 53 33
  File version - 1 byte - 01
}

Metadata block {
  Begin metadata block control byte - 9A
  Last synced time - 8 bytes uint
  End metadata block control byte - 9B
}

Bucket block {
  Begin bucket block control byte - 90
  Bucket name - null-terminated string
  Directory map {
    Begin directory map block control byte - 92
    Path to local directory - null-terminated string
    S3 key prefix - null-terminated string
    Recursive sync - 1 byte boolean
    End directory map block control byte - 93
  }...
  Recorded object {
    Begin object block control byte - 94
    Key - null-terminated string
    Last modified time - 8 bytes uint
    ETag type - 96 or 97
    ETag - 16 bytes or null-terminated string
    File size - 8 bytes uint
    End object block control byte - 95
  }...
  End bucket block control byte - 91
}...
</code></pre>
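<p>To make the layout concrete, here is a minimal Python sketch that encodes a
state file with this structure and reads back its header and metadata block.
It is not the project's serializer. The specification above does not state the
byte order of integers or the text encoding of strings, so big-endian and UTF-8
are assumptions, and every function name is hypothetical. Only MD5 ETags are
encoded.</p>
<pre><code class="language-python">from typing import Iterable, Tuple

SIGNATURE = bytes.fromhex("9d9f5333")
VERSION = 1


def cstr(text: str) -> bytes:
    """Null-terminated string (UTF-8 is an assumption)."""
    return text.encode("utf-8") + b"\x00"


def u64(value: int) -> bytes:
    """8-byte unsigned integer (big-endian is an assumption)."""
    return value.to_bytes(8, "big")


def encode_header_and_metadata(last_synced: int) -> bytes:
    header = SIGNATURE + bytes([VERSION])
    metadata = b"\x9a" + u64(last_synced) + b"\x9b"
    return header + metadata


def encode_directory_map(local_path: str, prefix: str, recursive: bool) -> bytes:
    return b"\x92" + cstr(local_path) + cstr(prefix) + bytes([int(recursive)]) + b"\x93"


def encode_object(key: str, mtime: int, md5: bytes, size: int) -> bytes:
    if len(md5) != 16:
        raise ValueError("an MD5 ETag must be exactly 16 bytes")
    # 0x96 marks the ETag as a raw 16-byte MD5 digest.
    return b"\x94" + cstr(key) + u64(mtime) + b"\x96" + md5 + u64(size) + b"\x95"


def encode_bucket(name: str, maps: Iterable[bytes], objects: Iterable[bytes]) -> bytes:
    return b"\x90" + cstr(name) + b"".join(maps) + b"".join(objects) + b"\x91"


def decode_header_and_metadata(data: bytes) -> Tuple[int, int]:
    """Return (file version, last synced time) from the start of a state file."""
    if data[:4] != SIGNATURE:
        raise ValueError("not an s3sync state file")
    # Slices rather than indexes, so truncated input fails the check cleanly.
    if data[5:6] != b"\x9a" or data[14:15] != b"\x9b":
        raise ValueError("malformed metadata block")
    return data[4], int.from_bytes(data[6:14], "big")


# Example with placeholder values: one bucket, one directory map, one object.
state = encode_header_and_metadata(1700000000) + encode_bucket(
    "example-bucket",
    [encode_directory_map("/home/user/docs", "docs/", True)],
    [encode_object("docs/a.txt", 1700000000, bytes(16), 42)],
)
</code></pre>
<p>Calling <code>decode_header_and_metadata(state)</code> on the example returns
the version and last synced time that were written. A full reader would continue
from byte 15 and walk the bucket blocks using the control bytes.</p>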
<h2>Copyright</h2>
<p>This program is copyrighted by <a href="https://joshstock.in/">Joshua Stockin</a> and
licensed under the <a href="LICENSE">MIT License</a>.</p>
<p>A form of the following should be present in each source file.</p>
<pre><code class="language-txt">s3-bsync Copyright (c) 2021 Joshua Stockin
&lt;https://joshstock.in&gt;
&lt;https://git.joshstock.in/s3-bsync&gt;

This software is licensed and distributed under the terms of the MIT License.
See the MIT License in the LICENSE file of this project's root folder.

This comment block and its contents, including this disclaimer, MUST be
preserved in all copies or distributions of this software's source.
</code></pre>
<p>&lt;<a href="https://joshstock.in">https://joshstock.in</a>&gt; | <a href="mailto:josh@joshstock.in">josh@joshstock.in</a> | joshuas3#9641</p>