<style>body{max-width:650px;margin:20px auto;color:#252525}</style>
<h1>s3-bsync</h1>
<p>Bidirectional tool for syncing local filesystem directories with S3
buckets. Written by <a href="https://joshstock.in">Josh Stockin</a>.</p>
<h3>Behavior</h3>
<p>After an initial sync (in which conflicts and files not common to both hosts
are handled manually), the S3 bucket maintains precedence. Files with the same
size and modification time on both hosts are ignored. A newer copy of a file
always overwrites the corresponding older copy, regardless of any changes made
to the older copy. (In other words, <strong>there is no manual
conflict resolution after the first sync. Conflicting files are handled
automatically as described here.</strong> This script is meant to run without input
or output by default, for example in a cron job.) Untracked files, whether in
S3 or on the local machine, are copied to the opposite host and tracked.
Tracked files that are moved or removed on either host are moved or removed on
the corresponding host, with the tracking adjusted accordingly. Ultimately,
after a sync, the <code>.state.s3sync</code> state tracking file should match the contents
of the S3 bucket's synced directories.</p>
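<p>The rules above can be sketched as a small decision function. This is only an
illustration, not the tool's actual code: the <code>FileInfo</code> type, the
<code>decide</code> function, and the action names are hypothetical. Letting the
S3 copy win when modification times are equal but sizes differ is an assumption
drawn from the statement that the bucket maintains precedence. Moved files are
not covered.</p>
<pre><code class="language-python">from collections import namedtuple

# Hypothetical record of one file's state on a single host.
FileInfo = namedtuple("FileInfo", ["size", "mtime"])


def decide(local, remote, tracked):
    """Pick an action for one key; None means the file is absent on that host."""
    if local is None and remote is None:
        return "forget" if tracked else "skip"
    if local is None:
        # Tracked and gone locally: remove from S3. Untracked: copy down.
        return "delete_remote" if tracked else "download"
    if remote is None:
        # Tracked and gone from S3: remove locally. Untracked: copy up.
        return "delete_local" if tracked else "upload"
    if local == remote:
        return "skip"        # same size and modification time: ignored
    if local.mtime == remote.mtime:
        return "download"    # assumption: S3 takes precedence on a tie
    # The newer copy always overwrites the older one.
    newest = max(local.mtime, remote.mtime)
    return "upload" if local.mtime == newest else "download"


print(decide(FileInfo(10, 200), FileInfo(12, 100), True))  # upload
print(decide(None, FileInfo(1, 1), False))                 # download
</code></pre>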
<h3>Installation</h3>
<p>Depends on <code>python3</code> and <code>aws-cli</code>. Both can be installed with your package
manager.</p>
<p>Install with <code>python3 setup.py install</code>. Root permissions are not required.
<em>This program does not manage S3 authentication or <code>aws-cli</code> credentials. You
must do this yourself with the <code>aws configure</code> command, or by other means such as
IAM/S3 policy.</em></p>
<h4>Source files</h4>
<p><code>setup.py</code> manages installation metadata.</p>
<h4>Created files and .s3syncignore</h4>
<p>The default file used to store sync information is <code>~/.state.s3sync</code>, but this
location can be reconfigured. The file uses the binary s3sync file format
specified later in this document. To intentionally ignore
untracked files, use a <code>.s3syncignore</code> file, in the same manner as
<a href="https://git-scm.com/docs/gitignore"><code>.gitignore</code></a>.</p>
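<p>As a rough illustration of ignore matching, the sketch below filters paths
against patterns using Python's <code>fnmatch</code> module. It covers only simple
glob patterns, comments, and blank lines. It does not implement full
<code>.gitignore</code> semantics (negation, directory-only patterns, anchoring),
and the helper names are hypothetical rather than part of s3-bsync.</p>
<pre><code class="language-python">import fnmatch


def load_patterns(text):
    """Parse ignore-file text, skipping blank lines and '#' comments."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(path, patterns):
    """True if the path or its final component matches any pattern."""
    name = path.rsplit("/", 1)[-1]
    return any(
        fnmatch.fnmatchcase(path, p) or fnmatch.fnmatchcase(name, p)
        for p in patterns
    )


patterns = load_patterns("# build output\n*.pyc\n\nnotes/draft.txt\n")
print(is_ignored("src/module.pyc", patterns))   # True
print(is_ignored("notes/draft.txt", patterns))  # True
print(is_ignored("notes/final.txt", patterns))  # False
</code></pre>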
<h2>s3sync file format</h2>
<p>The <code>.state.s3sync</code> file, saved in the home directory by default, defines the state of tracked
objects from the specified S3 buckets and key prefixes used in the last sync.</p>
<h3>Control bytes</h3>
<pre><code>90 - Begin bucket block
91 - End bucket block
92 - Begin directory map
93 - End directory map
94 - Begin object block
95 - End object block
96 - ETag type MD5
97 - ETag type null-terminated string (non-MD5)
98
99
9A - Begin metadata block
9B - End metadata block
9C
9D - File signature byte
9E
9F - File signature byte
</code></pre>
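<p>For code that reads or writes the format, the control bytes listed above could be
collected in an enumeration. The class and member names here are illustrative
choices, not identifiers taken from s3-bsync. Bytes without a description in the
table (98, 99, 9C, 9E) are left out.</p>
<pre><code class="language-python">from enum import IntEnum


class Control(IntEnum):
    """Control bytes of the s3sync format (member names are illustrative)."""
    BEGIN_BUCKET = 0x90
    END_BUCKET = 0x91
    BEGIN_DIRECTORY_MAP = 0x92
    END_DIRECTORY_MAP = 0x93
    BEGIN_OBJECT = 0x94
    END_OBJECT = 0x95
    ETAG_MD5 = 0x96
    ETAG_STRING = 0x97
    BEGIN_METADATA = 0x9A
    END_METADATA = 0x9B


# File signature: the two signature bytes followed by 53 33.
SIGNATURE = bytes([0x9D, 0x9F, 0x53, 0x33])

print(Control(0x94).name)   # BEGIN_OBJECT
print(SIGNATURE.hex())      # 9d9f5333
</code></pre>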
<h3>File structure</h3>
<pre><code>Header {
    File signature - 4 bytes - 9D 9F 53 33
    File version - 1 byte - 01
}

Metadata block {
    Begin metadata block control byte - 9A
    Last synced time - 8 bytes uint
    End metadata block control byte - 9B
}

Bucket block {
    Begin bucket block control byte - 90
    Bucket name - null-terminated string
    Directory map {
        Begin directory map block control byte - 92
        Path to local directory - null-terminated string
        S3 key prefix - null-terminated string
        Recursive sync - 1 byte boolean
        End directory map block control byte - 93
    }...
    Recorded object {
        Begin object block control byte - 94
        Key - null-terminated string
        Last modified time - 8 bytes uint
        ETag type - 96 or 97
        ETag - 16 bytes or null-terminated string
        File size - 8 bytes uint
        End object block control byte - 95
    }...
    End bucket block control byte - 91
}...
</code></pre>
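<p>To make the layout above concrete, here is a minimal sketch that encodes the
header, the metadata block, and one recorded object, then reads the header and
metadata back. The specification does not state byte order, string encoding, or
time units, so big-endian integers, UTF-8 strings, and Unix timestamps are
assumptions, and all function names are hypothetical.</p>
<pre><code class="language-python">import struct

SIGNATURE = bytes([0x9D, 0x9F, 0x53, 0x33])
VERSION = 1


def cstr(text):
    # Null-terminated string; UTF-8 is an assumption.
    return text.encode("utf-8") + b"\x00"


def u64(value):
    # 8-byte unsigned int; "!" selects big-endian, which is an assumption.
    return struct.pack("!Q", value)


def pack_header_and_metadata(last_synced):
    """Header (signature, version) followed by the metadata block."""
    header = SIGNATURE + bytes([VERSION])
    return header + b"\x9a" + u64(last_synced) + b"\x9b"


def pack_object(key, mtime, md5_digest, size):
    """One recorded object with an MD5 ETag (ETag type byte 96)."""
    if len(md5_digest) != 16:
        raise ValueError("MD5 ETag must be 16 bytes")
    return (b"\x94" + cstr(key) + u64(mtime) + b"\x96"
            + md5_digest + u64(size) + b"\x95")


def unpack_header_and_metadata(data):
    """Return (version, last_synced) or raise ValueError."""
    if len(data[:15]) != 15:
        raise ValueError("truncated file")
    if data[:4] != SIGNATURE:
        raise ValueError("bad file signature")
    version = data[4]
    if data[5] != 0x9A or data[14] != 0x9B:
        raise ValueError("malformed metadata block")
    (last_synced,) = struct.unpack("!Q", data[6:14])
    return version, last_synced


blob = pack_header_and_metadata(1609459200)
print(unpack_header_and_metadata(blob))  # (1, 1609459200)
</code></pre>
<p>A full reader would continue from byte 15 with the bucket blocks, dispatching on
each control byte until the end of the file.</p>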
<h2>Copyright</h2>
<p>This program is copyrighted by <a href="https://joshstock.in/">Joshua Stockin</a> and
licensed under the <a href="LICENSE">MIT License</a>.</p>
<p>A form of the following should be present in each source file.</p>
<pre><code class="language-txt">s3-bsync Copyright (c) 2021 Joshua Stockin
&lt;https://joshstock.in&gt;
&lt;https://git.joshstock.in/s3-bsync&gt;

This software is licensed and distributed under the terms of the MIT License.
See the MIT License in the LICENSE file of this project's root folder.

This comment block and its contents, including this disclaimer, MUST be
preserved in all copies or distributions of this software's source.
</code></pre>
<p>&lt;<a href="https://joshstock.in">https://joshstock.in</a>&gt; | <a href="mailto:josh@joshstock.in">josh@joshstock.in</a> | joshuas3#9641</p>