
s3-bsync / 94fa0b0

Bidirectional syncing tool to sync local filesystem directories with S3 buckets. (Incomplete)

Latest Commit

{#} | Time | Hash | Subject | Author | #(+)(-)GPG?
10 | 16 Jun 2022 12:08 | 94fa0b0 | Clean up repo, begin work on class abstraction and serialization | Josh Stockin | 1461G

Blob @ s3-bsync / README.md

# s3-bsync

Bidirectional syncing tool to sync local filesystem directories with S3
buckets. Written by [Josh Stockin](https://joshstock.in).

**Work in progress. Not in a functional state. Do NOT use this.**

### Behavior

After an initial sync (manually handling conflicts and uncommon files), the S3
bucket takes precedence. Files with the same size and modification time on
both hosts are ignored. A newer copy of a file always overwrites the
corresponding older copy, regardless of any changes made to the older copy.
(In other words, **there is no manual conflict resolution after the first
sync. Conflicting files are handled automatically as described here.** This
script is meant to run without input or output by default, in a cron job for
example.) Untracked files, whether in S3 or on the local machine, are copied
to the opposite host and tracked. Tracked files that are moved or removed on
either host are moved or removed on the other host, with the tracking adjusted
accordingly. Ultimately, after a sync, the `.state.s3sync` state tracking file
should match the contents of the S3 bucket's synced directories.
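
The comparison rules above can be sketched in Python. This is only an
illustration, not the project's code: `Action`, `FileInfo`, and `decide` are
hypothetical names, and the tie-break for equal modification times with
different sizes (S3 wins) is an assumption drawn from "the S3 bucket takes
precedence". Moves and removals of tracked files are not modeled here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    IGNORE = "ignore"      # nothing to do
    UPLOAD = "upload"      # local copy replaces the S3 copy
    DOWNLOAD = "download"  # S3 copy replaces the local copy


@dataclass(frozen=True)
class FileInfo:
    size: int   # size in bytes
    mtime: int  # modification time, seconds since the epoch


def decide(local: Optional[FileInfo], remote: Optional[FileInfo]) -> Action:
    """Pick an action for one path that exists on at least one host."""
    if local is None and remote is None:
        return Action.IGNORE
    if remote is None:
        return Action.UPLOAD    # present only locally: copy to S3
    if local is None:
        return Action.DOWNLOAD  # present only in S3: copy to local
    if local.size == remote.size and local.mtime == remote.mtime:
        return Action.IGNORE    # same size and modification time
    if local.mtime > remote.mtime:
        return Action.UPLOAD    # the newer copy always wins
    # S3 copy is newer, or the times tie and the sizes differ.
    # Assumption: S3 precedence decides the tie.
    return Action.DOWNLOAD
```

For example, `decide(FileInfo(10, 200), FileInfo(12, 100))` selects
`Action.UPLOAD`, because the local copy is newer.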

### Installation

Depends on `python3` and `aws-cli`. Both can be installed with your package
manager. Requires Python modules `pip` and `setuptools` if you want to install
on your system path using one of the methods listed below.

Install with one of the following:

* `./install.sh [interpreter?]` (Preferred)
* `python3 -m pip install .`
* `./setup.py` (Not recommended)

Uninstall with one of the following:

* `./install.sh uninstall [interpreter?]` (Preferred)
* `python3 -m pip uninstall s3-bsync`

`install.sh` is a frontend for `pip (un)install`, configured by setuptools in
`setup.py`.

Root permissions are not required. *This program does not manage S3
authentication or `aws-cli` credentials. You must do this yourself with the
`aws configure` command, or through some other means of IAM/S3 policy.*

### Usage

```
usage: s3-bsync [-h] [-v] [-i] [--debug] [--file SYNCFILE] [--dump] [--dryrun] [--purge]
                [--overwrite] [--dir PATH S3_DEST]

Bidirectional syncing tool to sync local filesystem directories with S3 buckets.

optional arguments:
  -h, -?, --help       Display this help message and exit.
  -v, --version        Display program and version information and exit.

program behavior:
  The program runs in sync mode by default.

  -i, --init           Run in initialize mode. This allows tracking file management and
                       directory options to be used. (default: False)
  --debug              Enables debug mode, which prints program information to stdout.
                       (default: False)
  --file SYNCFILE      The s3sync state file used to store tracking and state
                       information. It should resolve to an absolute path. (default:
                       ['~/.state.s3sync'])
  --dump               Dump s3sync state file configuration. --dryrun implicitly enabled.
                       (default: False)
  --dryrun             Run program logic without making changes. Useful when paired with
                       debug mode to see what changes would be made. (default: False)

tracking file management:
  Requires initialize mode to be enabled.

  --purge              Deletes the default (if not otherwise specified with --file)
                       tracking configuration file if it exists. (default: False)
  --overwrite          Overwrite tracking file with new directory maps instead of
                       appending. (default: False)

directory mapping:
  Requires initialize mode to be enabled.

  --dir PATH S3_DEST   Directory map to detail which local directory corresponds to S3
                       bucket and key prefix. Can be used multiple times to set multiple
                       directories. Local directories must be absolute. S3 destination in
                       `s3://bucket-name/prefix` format. Example: `--dir
                       /home/josh/Documents s3://joshstockin/Documents`
```
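
As an illustration of the `--dir PATH S3_DEST` format described in the help
text, the following sketch splits one directory map into a local path, a
bucket name, and a key prefix. `parse_dir_map` is a hypothetical helper, not
part of s3-bsync, and it assumes POSIX-style local paths like the one in the
help example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def parse_dir_map(path: str, s3_dest: str) -> tuple:
    """Validate one PATH / S3_DEST pair and return (path, bucket, prefix)."""
    # The help text requires local directories to be absolute.
    if not PurePosixPath(path).is_absolute():
        raise ValueError(f"local directory must be absolute: {path!r}")
    parsed = urlparse(s3_dest)
    # Expect the s3://bucket-name/prefix form.
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected s3://bucket-name/prefix, got {s3_dest!r}")
    # Drop surrounding slashes so the prefix has no `/` termination.
    prefix = parsed.path.strip("/")
    return path, parsed.netloc, prefix
```

With the example from the help text,
`parse_dir_map("/home/josh/Documents", "s3://joshstockin/Documents")` yields
the bucket `joshstockin` and the prefix `Documents`.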

#### Source files

`setup.py` manages installation metadata.
`install.sh` handles installation and uninstallation using pip.

#### Created files and .s3syncignore

The default file used to store sync information is `~/.state.s3sync`, but this
location can be reconfigured. The file uses the binary s3sync file format
specified later in this document. If you want to intentionally ignore
untracked files, use a `.s3syncignore` file, in the same manner as
[`.gitignore`](https://git-scm.com/docs/gitignore).
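
The README does not say which parts of the `.gitignore` syntax a
`.s3syncignore` file supports. The sketch below is therefore only a rough
approximation using shell-style globs from Python's `fnmatch` module; it
skips blank lines and `#` comments but does not handle negation (`!`) or
directory-only patterns. `load_patterns` and `is_ignored` are hypothetical
names.

```python
from fnmatch import fnmatchcase


def load_patterns(text: str) -> list:
    """Return the non-empty, non-comment lines of an ignore file."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(rel_path: str, patterns: list) -> bool:
    """Check a `/`-separated relative path against the glob patterns."""
    name = rel_path.rsplit("/", 1)[-1]
    # Try each pattern against the whole path and against the file name.
    return any(
        fnmatchcase(rel_path, pattern) or fnmatchcase(name, pattern)
        for pattern in patterns
    )
```

Given the patterns `*.log` and `build/*`, this treats `notes/a.log` and
`build/out.bin` as ignored and leaves `docs/readme.md` alone.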

## s3sync file format

The `.state.s3sync` file saved in the home directory defines the state of
tracked objects from the specified S3 buckets and key prefixes as of the last
sync.

### Control bytes

    90 - Begin bucket block
    91 - End bucket block
    92 - Begin directory map
    93 - End directory map
    94 - Begin object block
    95 - End object block
    96 - ETag type MD5
    97 - ETag type null-terminated string (non-MD5)
    98
    99
    9A - Begin metadata block
    9B - End metadata block
    9C
    9D - File signature byte
    9E
    9F - File signature byte
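
For reference, the assigned control bytes could be written as a Python
`IntEnum`. The member names below are invented for this sketch; only the
numeric values come from the table. The unassigned values (98, 99, 9C, 9E)
are deliberately left out.

```python
from enum import IntEnum


class Control(IntEnum):
    """Assigned control bytes of the s3sync format (names are illustrative)."""
    BEGIN_BUCKET = 0x90
    END_BUCKET = 0x91
    BEGIN_DIRECTORY_MAP = 0x92
    END_DIRECTORY_MAP = 0x93
    BEGIN_OBJECT = 0x94
    END_OBJECT = 0x95
    ETAG_MD5 = 0x96
    ETAG_STRING = 0x97
    BEGIN_METADATA = 0x9A
    END_METADATA = 0x9B
    SIGNATURE_FIRST = 0x9D
    SIGNATURE_SECOND = 0x9F
```

Looking up a raw byte, for example `Control(0x94)`, returns
`Control.BEGIN_OBJECT`, and an unassigned value such as `0x98` raises
`ValueError`.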

### File structure

Version 1 of the s3sync file format.

```
Header {
    File signature - 4 bytes - 9D 9F 53 33
    File version - 1 byte - 01
}
Metadata block {
    Begin metadata block control byte - 9A
    Last synced time - 8 bytes uint
    End metadata block control byte - 9B
}
Bucket block {
    Begin bucket block control byte - 90
    Bucket name - null-terminated string
    Directory map {
        Begin directory map block control byte - 92
        Path to local directory - null-terminated string
        S3 key prefix (no `/` termination) - null-terminated string
        Compress (gzip level) - 0-11 (4 bytes)
        Recursive sync - 1 byte boolean
        GPG encryption enabled - 1 byte boolean
        GPG encryption email - null-terminated string
        End directory map block control byte - 93
    }...
    Recorded object {
        Begin object block control byte - 94
        Key - null-terminated string
        Last modified time - 8 bytes uint
        ETag type - 96 or 97
        ETag - 16 bytes or null-terminated string
        File size - 8 bytes uint
        End object block control byte - 95
    }...
    End bucket block control byte - 91
}...
```
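
To make the layout concrete, here is a minimal sketch that serializes the
header, the metadata block, and one recorded object with Python's `struct`
module. The specification above does not state a byte order or a string
encoding, so big-endian integers and UTF-8 strings are assumptions, and all
function names are hypothetical.

```python
import struct

SIGNATURE = bytes([0x9D, 0x9F, 0x53, 0x33])  # file signature from the header
VERSION = 1


def pack_header_and_metadata(last_synced: int) -> bytes:
    """Header (signature, version) followed by the metadata block."""
    # Assumption: 8-byte unsigned integers are big-endian (">Q").
    return (SIGNATURE + bytes([VERSION])
            + b"\x9a" + struct.pack(">Q", last_synced) + b"\x9b")


def unpack_header_and_metadata(data: bytes) -> int:
    """Validate the first 15 bytes and return the last synced time."""
    if len(data) < 15 or data[:4] != SIGNATURE:
        raise ValueError("not an s3sync file")
    if data[4] != VERSION:
        raise ValueError(f"unsupported version: {data[4]}")
    if data[5] != 0x9A or data[14] != 0x9B:
        raise ValueError("malformed metadata block")
    (last_synced,) = struct.unpack(">Q", data[6:14])
    return last_synced


def pack_etag(etag: str) -> bytes:
    """ETag type byte plus value: 16 raw bytes for MD5, else a string."""
    try:
        raw = bytes.fromhex(etag)
    except ValueError:
        raw = b""
    if len(etag) == 32 and len(raw) == 16:
        return b"\x96" + raw
    # Assumption: strings are UTF-8 and terminated by a single NUL byte.
    return b"\x97" + etag.encode("utf-8") + b"\x00"


def pack_object(key: str, mtime: int, etag: str, size: int) -> bytes:
    """One recorded object block, fields in the order listed above."""
    return (b"\x94" + key.encode("utf-8") + b"\x00"
            + struct.pack(">Q", mtime)
            + pack_etag(etag)
            + struct.pack(">Q", size) + b"\x95")
```

Because integer and MD5 fields may themselves contain values in the 90-9F
range, a reader written along these lines would need to walk the fields in
order instead of scanning for control bytes.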

## Copyright

This program is copyrighted by [Joshua Stockin](https://joshstock.in/) and
licensed under the [MIT License](LICENSE).

A form of the following should be present in each source file.

```txt
s3-bsync Copyright (c) 2022 Joshua Stockin
<https://joshstock.in>
<https://git.joshstock.in/s3-bsync>

This software is licensed and distributed under the terms of the MIT License.
See the MIT License in the LICENSE file of this project's root folder.

This comment block and its contents, including this disclaimer, MUST be
preserved in all copies or distributions of this software's source.
```

&lt;<https://joshstock.in>&gt; | [josh@joshstock.in](mailto:josh@joshstock.in) | joshuas3#9641