Bidirectional syncing tool to sync local filesystem directories with S3 buckets. (Incomplete)

Latest Commit

{#}TimeHashSubjectAuthor#(+)(-)GPG?
1316 Jun 2022 21:050fae871Update serialized data handlersJosh Stockin111G

## s3-bsync

Bidirectional syncing tool to sync local filesystem directories with S3
buckets. Developed by [Josh Stockin](https://joshstock.in) and licensed under
the MIT License.

**Work in progress (v0.1.0). Not in a functional or usable state. Do NOT use
this unless you know what you are doing.**

### Behavior

After an initial sync (in which conflicts and files not common to both hosts
are handled manually), the S3 bucket maintains precedence. Files with the same
size and modify time on both hosts are ignored. A newer copy of a file always
overwrites the corresponding older copy, regardless of any changes made to the
older copy. (In other words, **there is no manual conflict resolution after the
first sync. Conflicting files are handled automatically as described here.**
This script is meant to run without input or output by default, in a cron job
for example.) Untracked files, whether in S3 or on the local machine, are
copied to the opposite host and tracked. Tracked files that are moved or
removed on either host are moved or removed on the corresponding host, with the
tracking adjusted accordingly. Ultimately, after a sync, the `.state.s3sync`
state tracking file should match the contents of both the S3 bucket and the
local synced directories.
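The rule for a file that exists on both hosts can be sketched roughly as follows. This is an illustration only, not the project's code: the names `Action`, `FileInfo`, and `decide` are hypothetical, and letting S3 win when the modify times tie but the sizes differ is an assumption drawn from "the S3 bucket maintains precedence".

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    SKIP = "skip"          # copies are considered identical
    UPLOAD = "upload"      # local copy replaces the S3 copy
    DOWNLOAD = "download"  # S3 copy replaces the local copy


@dataclass(frozen=True)
class FileInfo:
    size: int   # size in bytes
    mtime: int  # modify time, seconds since the epoch


def decide(local: FileInfo, remote: FileInfo) -> Action:
    """Pick an action for a file present both locally and in S3."""
    # Same size and modify time on both hosts: ignore the file.
    if local.size == remote.size and local.mtime == remote.mtime:
        return Action.SKIP
    # The newer copy always overwrites the older one.
    if local.mtime > remote.mtime:
        return Action.UPLOAD
    # Remote is newer, or the times tie with different sizes
    # (assumption: S3 takes precedence on ties).
    return Action.DOWNLOAD
```

For example, `decide(FileInfo(10, 200), FileInfo(12, 100))` selects an upload because the local copy is newer, whatever the remote copy contains.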

### Installation

Depends on `python3` and `aws-cli`. Both can be installed with your package
manager. Requires the Python modules `pip` and `setuptools` if you want to
install on your system path using one of the methods listed below. The Python
module `python-gnupg` is optionally required if you wish to use the GPG
encryption options.

Install with one of the following:

* `./install.sh [<python interpreter>?]` (Preferred)
* `python3 -m pip install .`
* `python3 ./setup.py install` (Not recommended)

Uninstall with one of the following:

* `./install.sh uninstall [<python interpreter>?]` (Preferred)
* `python3 -m pip uninstall s3-bsync`

`install.sh` is a frontend for `pip (un)install`, configured by setuptools in
`setup.py`. The script automatically performs compatibility checks on the
Python interpreter and other required dependencies.

Root permissions are not required. *This program does not manage S3
authentication or `aws-cli` credentials. You must do this yourself with the
`aws configure` command, or through some other means of IAM/S3 policy.*

### Usage

```
usage: s3-bsync [--help] [--version] [--init] [--debug] [--dryrun] [--file SYNCFILE]
                [--dump] [--purge] [--overwrite] [--dir PATH S3_DEST] [--rmdir RMPATH]

Bidirectional syncing tool to sync local filesystem directories with S3 buckets.

optional arguments:
  --help, -h, -?      Display this help message and exit.
  --version, -v       Display program and version information and exit.

program behavior:
  The program runs in sync mode by default.

  --init, -i          Run in initialize (edit) mode. This allows tracking file
                      management and directory options to be used. (default: False)
  --debug             Enables debug mode, which prints program information to stdout.
                      (default: False)
  --dryrun            Run program logic without making changes. Useful when paired with
                      debug mode to see what changes would be made. (default: False)

tracking file management:
  Configuring the tracking file.

  --file SYNCFILE     The s3sync state file used to store tracking and state
                      information. It should resolve to an absolute path. (default:
                      ['~/.state.s3sync'])
  --dump              Dump s3sync state file configuration and exit. (default: False)
  --purge             Deletes the tracking configuration file if it exists and exits.
                      Requires init mode. (default: False)
  --overwrite         Overwrite tracking file with new directory maps instead of
                      appending. Requires init mode. (default: False)

directory mapping:
  Requires initialize mode to be enabled.

  --dir PATH S3_DEST  Directory map to detail which local directory corresponds to S3
                      bucket and key prefix. Can be used multiple times to set multiple
                      directories. Local directories must be absolute. S3 destination in
                      `s3://bucket-name/prefix` format. Example: `--dir
                      /home/josh/Documents s3://joshstockin/Documents`
  --rmdir RMPATH      Remove tracked directory map by local directory identifier.
                      Running `--rmdir /home/josh/Documents` would remove the directory
                      map from the s3syncfile and stop tracking/syncing that directory.
```
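To make the `--dir PATH S3_DEST` format concrete, here is a minimal sketch of how such a pair might be validated and split into a bucket name and key prefix. The function names `parse_s3_dest` and `parse_dir_map` are hypothetical and are not taken from the s3-bsync source; the sketch only encodes the two rules stated in the help text (absolute local path, `s3://bucket-name/prefix` destination).

```python
import os.path
from typing import Tuple
from urllib.parse import urlparse


def parse_s3_dest(dest: str) -> Tuple[str, str]:
    """Split 's3://bucket-name/prefix' into (bucket, key_prefix)."""
    parsed = urlparse(dest)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected s3://bucket-name/prefix, got {dest!r}")
    # Drop surrounding slashes so the prefix carries no `/` termination.
    return parsed.netloc, parsed.path.strip("/")


def parse_dir_map(path: str, dest: str) -> Tuple[str, str, str]:
    """Validate one --dir pair; return (local_path, bucket, key_prefix)."""
    if not os.path.isabs(path):
        raise ValueError(f"local directory must be absolute: {path!r}")
    bucket, prefix = parse_s3_dest(dest)
    return path, bucket, prefix
```

With the example from the help text, `parse_dir_map("/home/josh/Documents", "s3://joshstockin/Documents")` yields the local path, the bucket `joshstockin`, and the prefix `Documents`.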

#### Source files

* `setup.py` manages installation metadata.
* `install.sh` handles installation and uninstallation using pip.

#### Created files and .s3syncignore

The default file used to store sync information is `~/.state.s3sync`, but this
location can be changed with the `--file` option. The file uses the binary
s3sync file format specified later in this document. If you want to
intentionally ignore untracked files, use a `.s3syncignore` file, in the same
manner as [`.gitignore`](https://git-scm.com/docs/gitignore).
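As a rough illustration of ignore-file matching, the sketch below handles only a small subset of `.gitignore` behavior: blank lines and `#` comments are skipped, and a path is ignored when the whole path or any single component matches a glob pattern. Negation (`!`), anchoring, and directory-only patterns are not covered. The function names are hypothetical, and how s3-bsync itself reads `.s3syncignore` is not specified here.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath
from typing import List


def load_patterns(text: str) -> List[str]:
    """Return the non-empty, non-comment lines of ignore-file text."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]


def is_ignored(rel_path: str, patterns: List[str]) -> bool:
    """True if the relative path or any of its components matches a pattern."""
    parts = PurePosixPath(rel_path).parts
    return any(
        fnmatch(rel_path, pattern) or any(fnmatch(part, pattern) for part in parts)
        for pattern in patterns
    )
```

Given the patterns `*.tmp` and `build`, this treats `notes/draft.tmp` and `build/out.bin` as ignored and leaves `docs/readme.md` alone.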

## s3sync file format

The `.state.s3sync` file saved in the home directory defines the state of
tracked objects from the specified S3 buckets and key prefixes used in the last
sync.

### Control bytes

    90 - Begin bucket block
    91 - End bucket block
    92 - Begin directory map
    93 - End directory map
    94 - Begin object block
    95 - End object block
    96 - ETag type MD5
    97 - ETag type null-terminated string (non-MD5)
    98
    99
    9A - Begin metadata block
    9B - End metadata block
    9C
    9D - File signature byte
    9E
    9F - File signature byte
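One way to carry these values in code is an `IntEnum`, shown here as a sketch. The member names are hypothetical; only the byte values come from the table above, and the bytes listed without a description (98, 99, 9C, 9E) are left out.

```python
from enum import IntEnum


class Control(IntEnum):
    """Control bytes of the s3sync format (member names are illustrative)."""

    BEGIN_BUCKET = 0x90
    END_BUCKET = 0x91
    BEGIN_DIRECTORY_MAP = 0x92
    END_DIRECTORY_MAP = 0x93
    BEGIN_OBJECT = 0x94
    END_OBJECT = 0x95
    ETAG_MD5 = 0x96
    ETAG_STRING = 0x97  # null-terminated, non-MD5 ETag
    BEGIN_METADATA = 0x9A
    END_METADATA = 0x9B
    SIGNATURE_FIRST = 0x9D
    SIGNATURE_SECOND = 0x9F
```

Because the members are integers, they can be compared directly against bytes read from a file, for example `data[0] == Control.SIGNATURE_FIRST`.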

### File structure

Version 1 of the s3sync file format.

```
Header {
    File signature - 4 bytes - 9D 9F 53 33
    File version - 1 byte - 01
}
Metadata block {
    Begin metadata block control byte - 9A
    Last synced time - 8 bytes uint
    End metadata block control byte - 9B
}
Bucket block {
    Begin bucket block control byte - 90
    Bucket name - null-terminated string
    Directory map {
        Begin directory map block control byte - 92
        Path to local directory - null-terminated string
        S3 key prefix (no `/` termination) - null-terminated string
        Compress (gzip level) - 0-11 (1 byte)
        Recursive sync - 1 byte boolean
        GPG encryption enabled - 1 byte boolean
        GPG encryption email - null-terminated string
        End directory map block control byte - 93
    }...
    Recorded object {
        Begin object block control byte - 94
        Key - null-terminated string
        Last modified time - 8 bytes uint
        ETag type - 96 or 97
        ETag - 16 bytes or null-terminated string
        File size - 8 bytes uint
        End object block control byte - 95
    }...
    End bucket block control byte - 91
}...
```
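The fixed-size start of the file (Header plus Metadata block, 15 bytes in total) could be written and read along these lines. This is a sketch under stated assumptions, not the project's serializer: the layout above does not give a byte order or string encoding, so big-endian integers and UTF-8 strings are assumed, and the function names are hypothetical.

```python
import struct
from typing import Tuple

SIGNATURE = bytes.fromhex("9D9F5333")  # file signature from the Header
VERSION = 1
BEGIN_METADATA, END_METADATA = 0x9A, 0x9B
PREAMBLE_SIZE = 15  # 4 signature + 1 version + 1 begin + 8 time + 1 end


def encode_preamble(last_synced: int) -> bytes:
    """Serialize the Header followed by the Metadata block."""
    # ">Q" is a big-endian unsigned 64-bit integer (byte order is assumed).
    return (
        SIGNATURE
        + bytes([VERSION, BEGIN_METADATA])
        + struct.pack(">Q", last_synced)
        + bytes([END_METADATA])
    )


def decode_preamble(data: bytes) -> int:
    """Validate the Header and Metadata block; return the last synced time."""
    if len(data) < PREAMBLE_SIZE:
        raise ValueError("file too short")
    if data[:4] != SIGNATURE:
        raise ValueError("bad file signature")
    if data[4] != VERSION:
        raise ValueError(f"unsupported version {data[4]}")
    if data[5] != BEGIN_METADATA or data[14] != END_METADATA:
        raise ValueError("malformed metadata block")
    (last_synced,) = struct.unpack(">Q", data[6:14])
    return last_synced


def read_cstring(data: bytes, offset: int) -> Tuple[str, int]:
    """Read a null-terminated string; return (text, offset past the NUL)."""
    end = data.index(b"\x00", offset)
    # UTF-8 is an assumption; the format does not name an encoding.
    return data[offset:end].decode("utf-8"), end + 1
```

A parser for the bucket blocks would continue from offset 15, using `read_cstring` for the bucket name, local path, key prefix, and object keys.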

## Copyright

This program is copyrighted by [Joshua Stockin](https://joshstock.in/) and
licensed under the [MIT License](LICENSE).

A form of the following should be present in each source file.

```txt
s3-bsync Copyright (c) 2022 Joshua Stockin
<https://joshstock.in>
<https://git.joshstock.in/s3-bsync>

This software is licensed and distributed under the terms of the MIT License.
See the MIT License in the LICENSE file of this project's root folder.

This comment block and its contents, including this disclaimer, MUST be
preserved in all copies or distributions of this software's source.
```

&lt;<https://joshstock.in>&gt; | [josh@joshstock.in](mailto:josh@joshstock.in) | joshuas3#9641