More robust metadata syncing #74

Open
evgmik opened this issue Dec 29, 2020 · 0 comments
Labels
enhancement Not a bug, it's a new feature or improvement

Comments

@evgmik
Collaborator

evgmik commented Dec 29, 2020

Below are quotes from a conversation with @sahib; the quoted text belongs to @evgmik.

Here is the scenario: Ali and Bob made changes in their repos. They
were syncing from time to time. Let's even assume that they have fully
synced metadata.

Ali accidentally nukes his ~/.brig/metadata.tgz.locked, but he can still
synchronize with Bob, and he does. So on his side the content is restored
and he is still tracking Bob.
The problem is on Bob's side: when he asks for a diff or a sync, there
will be an error message:

diff: No commit with index `3` found

This happens because Ali has only one diff (after the first sync). If Ali
makes enough commits, a patch with the proper number will exist, but it
will carry the wrong metadata, which is assumed to be the same. So sync
is dangerous.
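
The failure mode described above can be reproduced with a toy model. This is only a sketch, assuming patches are addressed by their position in a list and that the peer asks for everything after its last known index; `fetch_patches_after` and the patch strings are hypothetical and are not brig's actual code.

```python
def fetch_patches_after(history, last_known_index):
    """Return all patches after last_known_index (hypothetical helper)."""
    if last_known_index >= len(history):
        raise LookupError(f"No commit with index `{last_known_index}` found")
    return history[last_known_index + 1:]


# Ali's history before the accident; Bob has applied indices 0..3.
ali_history = ["add a.txt", "edit a.txt", "add b.txt", "edit b.txt"]
bob_last_index = 3
before = fetch_patches_after(ali_history, bob_last_index)  # nothing new yet

# Ali loses his metadata and restores from Bob: one diff after the first sync.
ali_history = ["restore from bob"]
try:
    fetch_patches_after(ali_history, bob_last_index)
    error = None
except LookupError as exc:
    error = str(exc)

# Once Ali has made enough new commits the index exists again, but it now
# names a commit from an unrelated history, and nothing flags the mismatch.
ali_history += ["add c.txt", "edit c.txt", "remove a.txt", "add d.txt"]
patches = fetch_patches_after(ali_history, bob_last_index)
```

In this model the first request after the accident fails loudly, while the later one succeeds and hands Bob a patch computed against a base he never had.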

What I suggest is to attach a hash to every diff message. If the last
known diff is missing, we go back in history until we find the common
ancestor; in the worst case that would be the empty repo state (which
should have the same hash in any repo for any user). This way we can
recover from the destroyed-metadata case with minimal losses.
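
The recovery idea can be sketched as follows. This is a minimal illustration, assuming each side can list its commit hashes from newest to oldest and that the empty repository state hashes to a fixed, well-known value. `EMPTY_REPO_HASH`, `find_common_ancestor`, and the short hash strings are hypothetical names, not brig's API.

```python
import hashlib

# Assumption: every repo starts from the same empty state, so its hash is
# identical for all users and always serves as a last-resort ancestor.
EMPTY_REPO_HASH = hashlib.sha256(b"").hexdigest()


def find_common_ancestor(local_known, remote_history):
    """Walk local_known (newest first) and return the first hash that the
    remote still has. Falls back to the empty repo state."""
    remote_set = set(remote_history)
    for commit_hash in local_known:
        if commit_hash in remote_set:
            return commit_hash
    return EMPTY_REPO_HASH


# Hashes Bob remembers having received from Ali, newest first.
bob_known = ["h3", "h2", "h1", EMPTY_REPO_HASH]

# Ali lost only the newest commit Bob knows about and committed again.
ali_partial = ["h9", "h2", "h1", EMPTY_REPO_HASH]
# Ali's metadata was destroyed and rebuilt: only the empty root is shared.
ali_rebuilt = ["r2", "r1", EMPTY_REPO_HASH]

ancestor_partial = find_common_ancestor(bob_known, ali_partial)
ancestor_rebuilt = find_common_ancestor(bob_known, ali_rebuilt)
```

Here `ancestor_partial` is `"h2"`, so only the history after `h2` has to be exchanged again, while `ancestor_rebuilt` falls back to the empty repo state and forces a full resync.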

I think you have a point here, although I'm not so much worried about
the scenario above. But the patch number is an additional concept we
might not need. It is also additional state that might get out of sync
or be calculated wrongly because we introduced a bug. We already have
hashes indicating "diffs" - those are just the commit hashes.

So it would be nicer if we could change the patch API from this:

interface Sync {
    fetchPatch    @1 (fromIndex :Int64) -> (data :Data);
    fetchPatches  @5 (fromIndex :Int64) -> (data :Data);
}

to this:

interface Sync {
    # If "to" is empty, fetch complete diff until staging commit:
    fetchPatch    @1 (from :String, to :String) -> (data :Data);
    fetchPatches  @5 (from :String, to :String) -> (data :Data);
}
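
To make the `from`/`to` semantics concrete, here is a rough in-memory model of the hash-addressed call. It is a sketch only: `Commit`, `fetch_patch`, and the sample history are hypothetical, `from_hash` is treated as exclusive and `to_hash` as inclusive (the proposal does not specify this), and the staging commit is modeled as the last entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    hash: str
    patch: str


def fetch_patch(history, from_hash, to_hash=""):
    """Return patches after from_hash up to and including to_hash.

    An empty to_hash means "everything up to the newest (staging) commit".
    An unknown hash is an explicit error instead of a silent mismatch.
    """
    hashes = [c.hash for c in history]
    if from_hash not in hashes:
        raise LookupError(f"no commit with hash {from_hash!r}")
    start = hashes.index(from_hash) + 1
    if to_hash == "":
        end = len(history)
    elif to_hash in hashes:
        end = hashes.index(to_hash) + 1
    else:
        raise LookupError(f"no commit with hash {to_hash!r}")
    return [c.patch for c in history[start:end]]


history = [
    Commit("aaa", "add a.txt"),
    Commit("bbb", "edit a.txt"),
    Commit("ccc", "add b.txt"),
    Commit("ddd", "staging changes"),
]

up_to_staging = fetch_patch(history, "aaa")
bounded = fetch_patch(history, "aaa", "ccc")
```

Unlike an index, a hash that the remote no longer has cannot accidentally resolve to an unrelated commit; the lookup simply fails and the caller can fall back to an older hash.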

There is one downside to this approach (which, I think, is why I chose
patch numbers instead): the commit hashes on the metadata copy of the
remote will not be the same as on the remote, since the copy does not
have to be complete (some folders might be missing, for example). So we
need to store and trust the commit hashes coming from the remote. Using
indices was an easy way to work around that storage.
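
The downside can be illustrated with a small sketch. It assumes, purely for illustration, that a commit hash is derived from the paths and content hashes the commit covers. `commit_hash`, `RemoteCursor`, and the file names are hypothetical and do not reflect how brig actually computes or stores hashes.

```python
import hashlib


def commit_hash(entries):
    """Hash a mapping of path -> content hash in a stable order."""
    digest = hashlib.sha256()
    for path in sorted(entries):
        digest.update(f"{path}={entries[path]}\n".encode())
    return digest.hexdigest()


# The remote commits everything; the local copy skips the /photos folder.
remote_entries = {"/docs/a.txt": "c1", "/photos/b.jpg": "c2"}
local_copy = {
    path: content
    for path, content in remote_entries.items()
    if not path.startswith("/photos/")
}

remote_hash = commit_hash(remote_entries)
local_hash = commit_hash(local_copy)


class RemoteCursor:
    """Remembers, per remote, the last commit hash that remote reported."""

    def __init__(self):
        self._last_seen = {}

    def record(self, remote_name, reported_hash):
        self._last_seen[remote_name] = reported_hash

    def next_from(self, remote_name):
        # Empty string: no hash has been stored for this remote yet.
        return self._last_seen.get(remote_name, "")


cursor = RemoteCursor()
cursor.record("ali", remote_hash)
```

Because `local_hash` differs from `remote_hash`, the value to send as `from` on the next fetch would have to be the stored hash the remote reported, not one recomputed from the partial copy.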

@evgmik added the enhancement label (Not a bug, it's a new feature or improvement) on Dec 29, 2020