## s3-bsync

Bidirectional syncing tool to sync local filesystem directories with S3
buckets. Developed by [Josh Stockin](https://joshstock.in) and licensed under
the MIT License.

**Work in progress (v0.1.0). Not in a functional or usable state. Do NOT use
this unless you know what you are doing.**

### Behavior

After an initial sync (in which conflicts and files not common to both hosts
are handled manually), the S3 bucket maintains precedence. Files with the same
size and modify time on both hosts are ignored. A newer copy of a file always
overwrites the corresponding older copy, regardless of changes made to the
older copy. (In other words, **there is no manual conflict resolution after
the first sync. Conflicting files are handled automatically as described
here.** This script is meant to run without input or output by default, in a
cron job for example.) Untracked files, whether in S3 or on the local machine,
are copied to the opposite host and tracked. Tracked files that are moved or
removed on either host are moved or removed on the corresponding host, with
the tracking adjusted accordingly. Ultimately, after a sync, the
`.state.s3sync` state tracking file should match the contents of both the S3
bucket and the local synced directories.

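The rules above can be read as a small decision function. The following is a minimal sketch, not the program's actual implementation: the `FileInfo` type, the `decide` function, and the action names are hypothetical, moves are not modeled, and it assumes that a tie in modify time with differing sizes is resolved in favor of S3 (one reading of "the S3 bucket maintains precedence").

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileInfo:
    """Size and modify time of one copy of a file (hypothetical type)."""
    size: int
    mtime: int  # modify time as an integer timestamp


def decide(local: Optional[FileInfo], remote: Optional[FileInfo], tracked: bool) -> str:
    """Return the action for one path; `None` means the copy does not exist."""
    if local is None and remote is None:
        return "untrack" if tracked else "skip"
    if local is None:
        # Tracked but gone locally: removed here. Untracked: new on S3.
        return "delete_remote" if tracked else "download"
    if remote is None:
        # Tracked but gone from S3: removed there. Untracked: new locally.
        return "delete_local" if tracked else "upload"
    if local.size == remote.size and local.mtime == remote.mtime:
        return "skip"  # same size and modify time on both hosts
    if local.mtime > remote.mtime:
        return "upload"  # newer local copy overwrites the older S3 copy
    # S3 copy is newer, or the times tie: assume S3 takes precedence.
    return "download"
```

A caller would run `decide` once per path found in the local directory, the bucket, or the state file, then apply the returned actions (or only print them in a dry run).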
### Installation

Depends on `python3` and `aws-cli`. Both can be installed with your package
manager. The Python modules `pip` and `setuptools` are required if you want to
install on your system path using one of the methods listed below. The Python
module `python-gnupg` is optionally required if you wish to use the GPG
encryption options.

Install with one of the following:

* `./install.sh [<python interpreter>?]` (Preferred)
* `python3 -m pip install .`
* `python3 ./setup.py install` (Not recommended)

Uninstall with one of the following:

* `./install.sh uninstall [<python interpreter>?]` (Preferred)
* `python3 -m pip uninstall s3-bsync`

`install.sh` is a frontend for `pip (un)install`, configured by setuptools in
`setup.py`. The script automatically performs compatibility checks on the
Python interpreter and other required dependencies.

Root permissions are not required. *This program does not manage S3
authentication or `aws-cli` credentials. You must do this yourself with the
`aws configure` command, or through some other means of IAM/S3 policy.*

### Usage

```
usage: s3-bsync [--help] [--version] [--init] [--debug] [--dryrun] [--file SYNCFILE]
                [--dump] [--purge] [--overwrite] [--dir PATH S3_DEST] [--rmdir RMPATH]

Bidirectional syncing tool to sync local filesystem directories with S3 buckets.

optional arguments:
  --help, -h, -?      Display this help message and exit.
  --version, -v       Display program and version information and exit.

program behavior:
  The program runs in sync mode by default.

  --init, -i          Run in initialize (edit) mode. This allows tracking file
                      management and directory options to be used. (default: False)
  --debug             Enables debug mode, which prints program information to stdout.
                      (default: False)
  --dryrun            Run program logic without making changes. Useful when paired with
                      debug mode to see what changes would be made. (default: False)

tracking file management:
  Configuring the tracking file.

  --file SYNCFILE     The s3sync state file used to store tracking and state
                      information. It should resolve to an absolute path. (default:
                      ['~/.state.s3sync'])
  --dump              Dump s3sync state file configuration and exit. (default: False)
  --purge             Deletes the tracking configuration file if it exists and exits.
                      Requires init mode. (default: False)
  --overwrite         Overwrite tracking file with new directory maps instead of
                      appending. Requires init mode. (default: False)

directory mapping:
  Requires initialize mode to be enabled.

  --dir PATH S3_DEST  Directory map to detail which local directory corresponds to S3
                      bucket and key prefix. Can be used multiple times to set multiple
                      directories. Local directories must be absolute. S3 destination in
                      `s3://bucket-name/prefix` format. Example: `--dir
                      /home/josh/Documents s3://joshstockin/Documents`
  --rmdir RMPATH      Remove tracked directory map by local directory identifier.
                      Running `--rmdir /home/josh/Documents` would remove the directory
                      map from the s3syncfile and stop tracking/syncing that directory.
```

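To make the `--dir PATH S3_DEST` rules concrete, here is a minimal sketch of how one such pair could be validated and split into a bucket name and key prefix. The `parse_dir_map` function is hypothetical and is not taken from the program's source; it only encodes the constraints stated in the help text (absolute local path, `s3://bucket-name/prefix` destination) and the file format's rule that the stored prefix has no trailing `/`.

```python
import os
from typing import Tuple


def parse_dir_map(path: str, s3_dest: str) -> Tuple[str, str, str]:
    """Validate one PATH/S3_DEST pair; return (local path, bucket, key prefix)."""
    if not os.path.isabs(path):
        raise ValueError(f"local directory must be absolute: {path!r}")
    scheme = "s3://"
    if not s3_dest.startswith(scheme):
        raise ValueError(f"S3 destination must start with {scheme!r}: {s3_dest!r}")
    # Everything up to the first "/" is the bucket; the rest is the key prefix.
    bucket, _, prefix = s3_dest[len(scheme):].partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name: {s3_dest!r}")
    # The prefix is stored without a trailing slash.
    return path, bucket, prefix.rstrip("/")
```

With the example from the help text, `parse_dir_map("/home/josh/Documents", "s3://joshstockin/Documents")` yields the local path, the bucket `joshstockin`, and the prefix `Documents`.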
#### Source files

* `setup.py` manages installation metadata.
* `install.sh` handles installation and uninstallation using pip.

#### Created files and .s3syncignore

The default file used to store sync information is `~/.state.s3sync`, but this
location can be reconfigured with `--file`. The file uses the binary s3sync
file format specified later in this document. If you want to intentionally
ignore untracked files, use a `.s3syncignore` file, in the same manner as
[`.gitignore`](https://git-scm.com/docs/gitignore).

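As an illustration of `.gitignore`-style matching, the sketch below uses shell-style wildcards from Python's standard `fnmatch` module. It is an assumption about how such a file could be applied, not the program's implementation, and it covers only a subset of the `.gitignore` syntax: comments and blank lines are skipped, but negation (`!`), directory-only patterns, and `**` are not handled. The function names are hypothetical.

```python
import fnmatch
from typing import Iterable, List


def load_patterns(lines: Iterable[str]) -> List[str]:
    """Keep non-empty, non-comment lines from an ignore file."""
    patterns = []
    for line in lines:
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(rel_path: str, patterns: Iterable[str]) -> bool:
    """Return True if the relative path or its file name matches any pattern."""
    name = rel_path.rsplit("/", 1)[-1]
    return any(
        fnmatch.fnmatchcase(rel_path, pattern) or fnmatch.fnmatchcase(name, pattern)
        for pattern in patterns
    )
```

Paths are expected relative to the synced directory and `/`-separated, so the same patterns can be applied to local files and to S3 keys with the prefix removed.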
## s3sync file format

The `.state.s3sync` file, saved in the home directory by default, defines the
state of tracked objects from the specified S3 buckets and key prefixes as of
the last sync.

### Control bytes

* `90` - Begin bucket block
* `91` - End bucket block
* `92` - Begin directory map
* `93` - End directory map
* `94` - Begin object block
* `95` - End object block
* `96` - ETag type MD5
* `97` - ETag type null-terminated string (non-MD5)
* `98` - unassigned
* `99` - unassigned
* `9A` - Begin metadata block
* `9B` - End metadata block
* `9C` - unassigned
* `9D` - File signature byte
* `9E` - unassigned
* `9F` - File signature byte

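In code, the control bytes listed above map naturally onto an integer enumeration. This is a sketch only: the `Control` class and its member names are hypothetical, and the values with no listed meaning are left out.

```python
from enum import IntEnum


class Control(IntEnum):
    """Control bytes of the s3sync file format (member names are illustrative)."""
    BEGIN_BUCKET = 0x90
    END_BUCKET = 0x91
    BEGIN_DIRECTORY_MAP = 0x92
    END_DIRECTORY_MAP = 0x93
    BEGIN_OBJECT = 0x94
    END_OBJECT = 0x95
    ETAG_MD5 = 0x96
    ETAG_STRING = 0x97  # null-terminated, non-MD5 ETag
    BEGIN_METADATA = 0x9A
    END_METADATA = 0x9B
    SIGNATURE_1 = 0x9D
    SIGNATURE_2 = 0x9F


def describe(byte: int) -> str:
    """Return the control byte's name, or a hex placeholder if it has none."""
    try:
        return Control(byte).name
    except ValueError:
        return f"0x{byte:02X}"
```

Because `IntEnum` members compare equal to plain integers, a parser can test `data[i] == Control.BEGIN_OBJECT` directly on bytes read from the file.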
### File structure

Version 1 of the s3sync file format.

```
Header {
    File signature - 4 bytes - 9D 9F 53 33
    File version - 1 byte - 01
}
Metadata block {
    Begin metadata block control byte - 9A
    Last synced time - 8 bytes uint
    End metadata block control byte - 9B
}
Bucket block {
    Begin bucket block control byte - 90
    Bucket name - null-terminated string
    Directory map {
        Begin directory map block control byte - 92
        Path to local directory - null-terminated string
        S3 key prefix (no `/` termination) - null-terminated string
        Compress (gzip level) - 0-11 (1 byte)
        Recursive sync - 1 byte boolean
        GPG encryption enabled - 1 byte boolean
        GPG encryption email - null-terminated string
        End directory map block control byte - 93
    }...
    Recorded object {
        Begin object block control byte - 94
        Key - null-terminated string
        Last modified time - 8 bytes uint
        ETag type - 96 or 97
        ETag - 16 bytes or null-terminated string
        File size - 8 bytes uint
        End object block control byte - 95
    }...
    End bucket block control byte - 91
}...
```

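The layout above can be exercised with Python's `struct` module. The sketch below packs the header plus metadata block and a single recorded object. Several details are assumptions, because the format description does not state them: integers are written big-endian, strings are UTF-8, and the ETag arrives as the hex string S3 reports. All function names are hypothetical.

```python
import struct

SIGNATURE = bytes([0x9D, 0x9F, 0x53, 0x33])
VERSION = 1


def pack_preamble(last_synced: int) -> bytes:
    """Header followed by the metadata block (15 bytes in total)."""
    # ">Q" is an 8-byte unsigned integer; big-endian is an assumption.
    return SIGNATURE + bytes([VERSION]) + b"\x9a" + struct.pack(">Q", last_synced) + b"\x9b"


def unpack_preamble(data: bytes) -> int:
    """Validate the header and metadata block; return the last synced time."""
    if len(data) < 15 or data[:4] != SIGNATURE:
        raise ValueError("not an s3sync file")
    if data[4] != VERSION:
        raise ValueError(f"unsupported version {data[4]}")
    if data[5] != 0x9A or data[14] != 0x9B:
        raise ValueError("malformed metadata block")
    (last_synced,) = struct.unpack(">Q", data[6:14])
    return last_synced


def pack_object(key: str, mtime: int, etag: str, size: int) -> bytes:
    """Encode one recorded object block."""
    try:
        raw = bytes.fromhex(etag)
    except ValueError:
        raw = b""
    if len(raw) == 16:
        etag_field = b"\x96" + raw  # MD5 ETag stored as 16 raw bytes
    else:
        # Anything else (for example a multipart ETag) is stored as a string.
        etag_field = b"\x97" + etag.encode("utf-8") + b"\x00"
    return (
        b"\x94"
        + key.encode("utf-8") + b"\x00"
        + struct.pack(">Q", mtime)
        + etag_field
        + struct.pack(">Q", size)
        + b"\x95"
    )
```

A full writer would wrap the directory maps and recorded objects in a bucket block between the `90` and `91` control bytes; those parts follow the same pattern and are omitted here.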
## Copyright

This program is copyrighted by [Joshua Stockin](https://joshstock.in/) and
licensed under the [MIT License](LICENSE).

A form of the following should be present in each source file.

```txt
s3-bsync Copyright (c) 2022 Joshua Stockin
<https://joshstock.in>
<https://git.joshstock.in/s3-bsync>

This software is licensed and distributed under the terms of the MIT License.
See the MIT License in the LICENSE file of this project's root folder.

This comment block and its contents, including this disclaimer, MUST be
preserved in all copies or distributions of this software's source.
```

<<https://joshstock.in>> | [josh@joshstock.in](mailto:josh@joshstock.in) | joshuas3#9641