# s3-bsync

Bidirectional tool for syncing local filesystem directories with S3
buckets. Written by [Josh Stockin](https://joshstock.in).

### Behavior

After an initial sync (in which conflicts and files present on only one host
are handled manually), the S3 bucket maintains precedence. Files with the same
size and modify time on both hosts are ignored. A newer copy of a file always
overwrites the corresponding older copy, regardless of any changes made to the
older copy. (In other words, **there is no manual conflict resolution after
the first sync. Conflicting files are handled automatically as described
here.** This script is meant to run without input or output by default, in a
cron job for example.) Untracked files, whether in S3 or on the local machine,
are copied to the opposite host and tracked. Tracked files that are moved or
removed on either host are moved or removed on the corresponding host, with
the tracking adjusted accordingly. Ultimately, after a sync, the
`.state.s3sync` state tracking file should match the contents of the S3
bucket's synced directories.

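The comparison rules above can be sketched as a small decision function. This is only an illustration, not the tool's actual code: the `FileInfo` type, the `decide_action` name, and the action strings are hypothetical, and the tie-break for equal modify times with different sizes (S3 wins) is an assumption drawn from the "S3 bucket maintains precedence" rule. Moves and removals of tracked files are not covered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical record: size in bytes, modify time in epoch seconds."""
    size: int
    mtime: int


def decide_action(local: Optional[FileInfo], remote: Optional[FileInfo]) -> str:
    """Pick an action for one path; None means the file is absent on that host."""
    if local is None and remote is None:
        return "skip"
    if remote is None:
        return "upload"      # present only locally: copy to S3
    if local is None:
        return "download"    # present only in S3: copy to the local machine
    if local.size == remote.size and local.mtime == remote.mtime:
        return "skip"        # same size and modify time: ignored
    if local.mtime > remote.mtime:
        return "upload"      # newer local copy overwrites the S3 copy
    # Remote is newer, or the modify times tie with different sizes.
    # Assumption: S3 takes precedence in the tie case.
    return "download"
```

For example, `decide_action(FileInfo(10, 200), FileInfo(12, 100))` returns `"upload"`, because the local copy has the later modify time.
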
### Installation

Depends on `python3` and `aws-cli`. Both can be installed with your package
manager.

Install with `python3 setup.py install`. Root permissions are not required.
*This program does not manage S3 authentication or `aws-cli` credentials. You
must do this yourself with the `aws configure` command, or by other means such
as IAM/S3 policies.*

#### Source files

`setup.py` manages installation metadata.

#### Created files and .s3syncignore

The default file used to store sync information is `~/.state.s3sync`, but this
location can be reconfigured. The file uses the binary s3sync file format
specified later in this document. To intentionally ignore untracked files, use
a `.s3syncignore` file, in the same manner as
[`.gitignore`](https://git-scm.com/docs/gitignore).

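As a rough illustration of how ignore patterns might be applied, the sketch below reads pattern lines and tests paths against them with the standard library's `fnmatch`. It is a simplification, not the tool's matcher: `fnmatch` only approximates `.gitignore` rules (no negation with `!`, no directory anchoring), and the `load_patterns` and `is_ignored` helpers are hypothetical names.

```python
from fnmatch import fnmatch
from typing import List


def load_patterns(text: str) -> List[str]:
    """Return non-empty, non-comment lines from ignore-file contents."""
    patterns = []
    for line in text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            patterns.append(line)
    return patterns


def is_ignored(path: str, patterns: List[str]) -> bool:
    """Match a '/'-separated path, or its final component, against each pattern."""
    name = path.rsplit("/", 1)[-1]
    return any(fnmatch(path, p) or fnmatch(name, p) for p in patterns)


patterns = load_patterns("# scratch files\n*.tmp\n\nbuild/*\n")
print(is_ignored("notes/draft.tmp", patterns))
print(is_ignored("src/main.py", patterns))
```
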
## s3sync file format

The `.state.s3sync` file saved in the home directory defines the state of
tracked objects from the specified S3 buckets and key prefixes used in the
last sync.

### Control bytes

90 - Begin bucket block
91 - End bucket block
92 - Begin directory map
93 - End directory map
94 - Begin object block
95 - End object block
96 - ETag type MD5
97 - ETag type null-terminated string (non-MD5)
98
99
9A - Begin metadata block
9B - End metadata block
9C
9D - File signature byte
9E
9F - File signature byte

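For reference, the assigned control bytes can be written down as Python constants. This is a sketch only: the `ControlByte` enum and its member names are hypothetical, and the values are read as hexadecimal. The bytes listed without a meaning (98, 99, 9C, 9E) are left out.

```python
from enum import IntEnum


class ControlByte(IntEnum):
    """Assigned control bytes from the list above (member names are illustrative)."""
    BEGIN_BUCKET = 0x90
    END_BUCKET = 0x91
    BEGIN_DIRECTORY_MAP = 0x92
    END_DIRECTORY_MAP = 0x93
    BEGIN_OBJECT = 0x94
    END_OBJECT = 0x95
    ETAG_MD5 = 0x96
    ETAG_STRING = 0x97
    BEGIN_METADATA = 0x9A
    END_METADATA = 0x9B
    SIGNATURE_FIRST = 0x9D
    SIGNATURE_SECOND = 0x9F


# The 4-byte file signature 9D 9F 53 33: two signature bytes followed by ASCII "S3".
FILE_SIGNATURE = bytes([ControlByte.SIGNATURE_FIRST, ControlByte.SIGNATURE_SECOND]) + b"S3"
```
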
### File structure

```
Header {
    File signature - 4 bytes - 9D 9F 53 33
    File version - 1 byte - 01
}

Metadata block {
    Begin metadata block control byte - 9A
    Last synced time - 8 bytes uint
    End metadata block control byte - 9B
}

Bucket block {
    Begin bucket block control byte - 90
    Bucket name - null-terminated string
    Directory map {
        Begin directory map block control byte - 92
        Path to local directory - null-terminated string
        S3 key prefix - null-terminated string
        Recursive sync - 1 byte boolean
        End directory map block control byte - 93
    }...
    Recorded object {
        Begin object block control byte - 94
        Key - null-terminated string
        Last modified time - 8 bytes uint
        ETag type - 96 or 97
        ETag - 16 bytes or null-terminated string
        File size - 8 bytes uint
        End object block control byte - 95
    }...
    End bucket block control byte - 91
}...
```

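To make the layout concrete, here is a minimal sketch that encodes and decodes the header, the metadata block, and a single recorded object with Python's `struct` module. It rests on assumptions the format description does not settle: integers are treated as big-endian, strings as UTF-8, and all function names are hypothetical. Bucket blocks and directory maps are omitted for brevity.

```python
import struct
from io import BytesIO

SIGNATURE = bytes.fromhex("9d9f5333")  # 9D 9F 'S' '3'
VERSION = 1


def read_cstring(buf: BytesIO) -> str:
    """Read bytes up to a null terminator (UTF-8 is an assumption)."""
    out = bytearray()
    while True:
        byte = buf.read(1)
        if not byte:
            raise ValueError("unexpected end of data inside string")
        if byte == b"\x00":
            return out.decode("utf-8")
        out += byte


def expect(buf: BytesIO, value: int) -> None:
    """Consume one byte and check that it is the given control byte."""
    byte = buf.read(1)
    if byte != bytes([value]):
        raise ValueError(f"expected {value:02X}, got {byte.hex() or 'EOF'}")


def encode_prefix(last_synced: int) -> bytes:
    """Header followed by the metadata block (big-endian is an assumption)."""
    return (SIGNATURE + bytes([VERSION])
            + b"\x9a" + struct.pack(">Q", last_synced) + b"\x9b")


def decode_prefix(buf: BytesIO) -> int:
    """Validate the header and return the last synced time."""
    if buf.read(4) != SIGNATURE:
        raise ValueError("bad file signature")
    expect(buf, VERSION)
    expect(buf, 0x9A)
    (last_synced,) = struct.unpack(">Q", buf.read(8))
    expect(buf, 0x9B)
    return last_synced


def encode_object(key: str, mtime: int, md5_digest: bytes, size: int) -> bytes:
    """One recorded object with an MD5 ETag (type byte 96)."""
    if len(md5_digest) != 16:
        raise ValueError("an MD5 digest is 16 bytes")
    return (b"\x94" + key.encode("utf-8") + b"\x00"
            + struct.pack(">Q", mtime)
            + b"\x96" + md5_digest
            + struct.pack(">Q", size) + b"\x95")


def decode_object(buf: BytesIO):
    """Return (key, mtime, etag, size); MD5 ETags are returned as hex text."""
    expect(buf, 0x94)
    key = read_cstring(buf)
    (mtime,) = struct.unpack(">Q", buf.read(8))
    etag_type = buf.read(1)
    if etag_type == b"\x96":
        etag = buf.read(16).hex()
    elif etag_type == b"\x97":
        etag = read_cstring(buf)
    else:
        raise ValueError("unknown ETag type byte")
    (size,) = struct.unpack(">Q", buf.read(8))
    expect(buf, 0x95)
    return key, mtime, etag, size
```

A round trip such as `decode_object(BytesIO(encode_object("photos/a.jpg", 1620000000, bytes(16), 2048)))` should give back the same key, time, and size, with the digest rendered as 32 hex characters.
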
## Copyright

This program is copyrighted by [Joshua Stockin](https://joshstock.in/) and
licensed under the [MIT License](LICENSE).

A form of the following should be present in each source file.

```txt
s3-bsync Copyright (c) 2021 Joshua Stockin
<https://joshstock.in>
<https://git.joshstock.in/s3-bsync>

This software is licensed and distributed under the terms of the MIT License.
See the MIT License in the LICENSE file of this project's root folder.

This comment block and its contents, including this disclaimer, MUST be
preserved in all copies or distributions of this software's source.
```

<<https://joshstock.in>> | [josh@joshstock.in](mailto:josh@joshstock.in) | joshuas3#9641