s3-bsync / dac3807

Bidirectional syncing tool to sync local filesystem directories with S3 buckets. (Incomplete)

Latest Commit

| # | Time | Hash | Subject | Author |
|---|------|------|---------|--------|
| 5 | 18 Oct 2021 20:59 | dac3807 | Create install script; set up source directory | Josh Stockin |

Blob @ s3-bsync / README.md

# s3-bsync

Bidirectional syncing tool to sync local filesystem directories with S3
buckets. Written by [Josh Stockin](https://joshstock.in).

**Work in progress. Not in a functional state. Do NOT use this.**

### Behavior

After an initial sync (manually handling conflicts and uncommon files), the S3
bucket maintains precedence. Files with the same size and modify time on both
hosts are ignored. A newer copy of a file always overwrites the corresponding
older copy, regardless of any changes made to the older copy. (In other words,
**there is no manual conflict resolution after the first sync. Conflicting
files are handled automatically as described here.** This script is meant to
run without input or output by default, in a cron job for example.) Untracked
files, either in S3 or on the local machine, are copied to the opposite host
and tracked. Tracked files that are moved or removed on either host are moved
or removed on the corresponding host, with the tracking adjusted accordingly.
Ultimately, after a sync, the `.state.s3sync` state tracking file should match
the contents of the S3 bucket's synced directories.

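The per-file rules above can be sketched in Python as follows. This is only an illustration, not code from this project: the names `Action`, `FileInfo` and `decide` are hypothetical, and the tie-break (equal modify times but different sizes resolve in favor of S3) is an assumed reading of "the S3 bucket maintains precedence". Moves and removals of tracked files are not covered.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    SKIP = "skip"          # nothing to do
    UPLOAD = "upload"      # local copy replaces or creates the S3 copy
    DOWNLOAD = "download"  # S3 copy replaces or creates the local copy


@dataclass(frozen=True)
class FileInfo:
    size: int   # size in bytes
    mtime: int  # modify time, seconds since the epoch


def decide(local: Optional[FileInfo], remote: Optional[FileInfo]) -> Action:
    """Choose an action for one path that exists on at least one host."""
    if local is None and remote is None:
        raise ValueError("path exists on neither host")
    # Untracked files are copied to the opposite host.
    if remote is None:
        return Action.UPLOAD
    if local is None:
        return Action.DOWNLOAD
    # Same size and modify time on both hosts: ignored.
    if local.size == remote.size and local.mtime == remote.mtime:
        return Action.SKIP
    # The newer copy always overwrites the older one.
    if local.mtime > remote.mtime:
        return Action.UPLOAD
    # Remote is newer, or the times tie with differing sizes.
    # Assumption: S3 takes precedence in the tie case.
    return Action.DOWNLOAD
```

For example, `decide(FileInfo(10, 200), FileInfo(12, 100))` selects an upload because the local copy is newer, whatever the S3 copy contains.
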
### Installation

Depends on `python3` and `aws-cli`. Both can be installed with your package
manager. Requires Python modules `pip` and `setuptools` if you want to install
on your system path using one of the methods listed below.

Install with one of the following:

* `./install.sh [interpreter?]` (Preferred)
* `python3 -m pip install .`
* `./setup.py` (Not recommended)

Uninstall with one of the following:

* `./install.sh uninstall [interpreter?]` (Preferred)
* `python3 -m pip uninstall s3-bsync`

`install.sh` is a frontend for `pip (un)install`, configured by setuptools in
`setup.py`.

Root permissions are not required. *This program does not manage S3
authentication or `aws-cli` credentials. You must do this yourself with the
`aws configure` command, or through some other means of IAM/S3 policy.*

#### Source files

* `setup.py` manages installation metadata.
* `install.sh` handles installation and uninstallation using pip.

#### Created files and .s3syncignore

The default file used to store sync information is `~/.state.s3sync`, but this
location can be reconfigured. The file uses the binary s3sync file format
specified later in this document. If you want to intentionally ignore
untracked files, use a `.s3syncignore` file, in the same manner as
[`.gitignore`](https://git-scm.com/docs/gitignore).

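As a rough illustration of how ignore patterns might be applied, the sketch below reads pattern lines and tests relative paths against them. It is deliberately simplified and is not this project's implementation: `load_patterns` and `is_ignored` are hypothetical names, and the matching uses Python's `fnmatch` wildcards, which cover only part of the `.gitignore` syntax (no `!` negation, no `**`, no directory-only patterns).

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import List


def load_patterns(text: str) -> List[str]:
    """Return the non-empty, non-comment lines of an ignore file."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(rel_path: str, patterns: List[str]) -> bool:
    """Check a '/'-separated relative path against every pattern.

    A pattern matches if it matches either the whole relative path or
    just the final path component.
    """
    name = PurePosixPath(rel_path).name
    return any(
        fnmatchcase(rel_path, pattern) or fnmatchcase(name, pattern)
        for pattern in patterns
    )
```

With the patterns `*.o` and `secret.key`, the paths `build/out.o` and `dir/secret.key` would be skipped while `notes.txt` would still be synced.
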
## s3sync file format

The `.state.s3sync` file saved in the home directory defines the state of
tracked objects from the specified S3 buckets and key prefixes used in the
last sync.

### Control bytes

    90 - Begin bucket block
    91 - End bucket block
    92 - Begin directory map
    93 - End directory map
    94 - Begin object block
    95 - End object block
    96 - ETag type MD5
    97 - ETag type null-terminated string (non-MD5)
    98
    99
    9A - Begin metadata block
    9B - End metadata block
    9C
    9D - File signature byte
    9E
    9F - File signature byte

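One way to carry these values in code is an enumeration, sketched below. The values are read as hexadecimal, which the `9A` to `9F` entries imply. The names `Control` and `is_control_byte` are hypothetical, and the unassigned values (`98`, `99`, `9C`, `9E`) are left out on purpose.

```python
from enum import IntEnum


class Control(IntEnum):
    """Assigned control bytes of the s3sync format (hypothetical names)."""
    BEGIN_BUCKET = 0x90
    END_BUCKET = 0x91
    BEGIN_DIRECTORY_MAP = 0x92
    END_DIRECTORY_MAP = 0x93
    BEGIN_OBJECT = 0x94
    END_OBJECT = 0x95
    ETAG_MD5 = 0x96
    ETAG_STRING = 0x97
    BEGIN_METADATA = 0x9A
    END_METADATA = 0x9B
    SIGNATURE_FIRST = 0x9D
    SIGNATURE_SECOND = 0x9F


def is_control_byte(value: int) -> bool:
    """Return True only for values that have an assigned meaning."""
    try:
        Control(value)
    except ValueError:
        return False
    return True
```

A parser could then compare each marker it reads against `Control.BEGIN_OBJECT` and similar members instead of bare numbers.
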
### File structure

Version 1 of the s3sync file format.

```
Header {
    File signature - 4 bytes - 9D 9F 53 33
    File version - 1 byte - 01
}
Metadata block {
    Begin metadata block control byte - 9A
    Last synced time - 8 bytes uint
    End metadata block control byte - 9B
}
Bucket block {
    Begin bucket block control byte - 90
    Bucket name - null-terminated string
    Directory map {
        Begin directory map block control byte - 92
        Path to local directory - null-terminated string
        S3 key prefix - null-terminated string
        Recursive sync - 1 byte boolean
        End directory map block control byte - 93
    }...
    Recorded object {
        Begin object block control byte - 94
        Key - null-terminated string
        Last modified time - 8 bytes uint
        ETag type - 96 or 97
        ETag - 16 bytes or null-terminated string
        File size - 8 bytes uint
        End object block control byte - 95
    }...
    End bucket block control byte - 91
}...
```

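The layout above could be written and read back roughly as follows. This sketch covers only the header, the metadata block and a single recorded object, and it rests on assumptions the specification does not state: integers are big-endian, strings are UTF-8, and all function names are hypothetical.

```python
import io
import struct

SIGNATURE = bytes([0x9D, 0x9F, 0x53, 0x33])
VERSION = 1


def read_cstring(buf):
    """Read bytes up to a null terminator and decode them as UTF-8."""
    out = bytearray()
    while True:
        byte = buf.read(1)
        if not byte:
            raise ValueError("unexpected end of data inside string")
        if byte == b"\x00":
            return out.decode("utf-8")
        out += byte


def expect(buf, value):
    """Consume one byte and fail unless it equals the given control byte."""
    byte = buf.read(1)
    if byte != bytes([value]):
        raise ValueError(f"expected {value:02X}, got {byte.hex() or 'EOF'}")


def dump_preamble(last_synced):
    """Serialize the header followed by the metadata block."""
    header = SIGNATURE + bytes([VERSION])
    # Assumption: 8-byte unsigned integers are stored big-endian.
    return header + b"\x9A" + struct.pack(">Q", last_synced) + b"\x9B"


def load_preamble(buf):
    """Validate the header and return the last synced time."""
    if buf.read(4) != SIGNATURE:
        raise ValueError("bad file signature")
    if buf.read(1) != bytes([VERSION]):
        raise ValueError("unsupported file version")
    expect(buf, 0x9A)
    (last_synced,) = struct.unpack(">Q", buf.read(8))
    expect(buf, 0x9B)
    return last_synced


def dump_object(key, mtime, md5_digest, size):
    """Serialize one recorded object that carries an MD5 ETag."""
    if len(md5_digest) != 16:
        raise ValueError("an MD5 digest is exactly 16 bytes")
    return (
        b"\x94" + key.encode("utf-8") + b"\x00"
        + struct.pack(">Q", mtime)
        + b"\x96" + md5_digest
        + struct.pack(">Q", size)
        + b"\x95"
    )


def load_object(buf):
    """Parse one recorded object into (key, mtime, etag, size)."""
    expect(buf, 0x94)
    key = read_cstring(buf)
    (mtime,) = struct.unpack(">Q", buf.read(8))
    etag_type = buf.read(1)
    if etag_type == b"\x96":
        etag = buf.read(16).hex()  # MD5 ETag, returned as hex text
    elif etag_type == b"\x97":
        etag = read_cstring(buf)   # non-MD5 ETag string
    else:
        raise ValueError("unknown ETag type")
    (size,) = struct.unpack(">Q", buf.read(8))
    expect(buf, 0x95)
    return key, mtime, etag, size
```

Under these assumptions, `dump_preamble(0)` yields 15 bytes: the 4-byte signature, the version byte, and the 10-byte metadata block. Bucket blocks and directory maps would follow the same pattern of a begin byte, fields in order, and an end byte.
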
## Copyright

This program is copyrighted by [Joshua Stockin](https://joshstock.in/) and
licensed under the [MIT License](LICENSE).

A form of the following should be present in each source file.

```txt
s3-bsync Copyright (c) 2021 Joshua Stockin
<https://joshstock.in>
<https://git.joshstock.in/s3-bsync>

This software is licensed and distributed under the terms of the MIT License.
See the MIT License in the LICENSE file of this project's root folder.

This comment block and its contents, including this disclaimer, MUST be
preserved in all copies or distributions of this software's source.
```

&lt;<https://joshstock.in>&gt; | [josh@joshstock.in](mailto:josh@joshstock.in) | joshuas3#9641