Rethinking pins #91

Open
evgmik opened this issue Feb 5, 2021 · 6 comments
Labels
proposal Proposals question Issue tracker used as a support platform.

Comments

@evgmik
Collaborator

evgmik commented Feb 5, 2021

I was thinking about what a pin means within brig.

First of all we have two types of pins:

  1. implicit/regular
  2. explicit

I always thought that explicit stands for "keep this file at all cost, with all its versions". But it seems that explicit
actually means "keep this version of the file at all cost". Which scenario are we enforcing?

Second question: what is the role of a regular pin? It seems quite useless in the presence of the explicit one. It seems that it is used internally by the repinner to mark versions to be kept cached. Which brings us to the 3rd type of pin:

  3. Cached-or-not flag, i.e. is the file pinned at the backend.

So it seems we have too many pin types which serve a somewhat similar function: pinning the file at the backend, which is the only thing that counts during use. Of course, I might be wrong in my deductions, which are based on reverse engineering.

My main issue: to "invite" a pin during synchronization, you have to pin explicitly (well, we have a sync type which pulls everything on sync, but I think it is not the main brig use case). But an explicit pin has the meaning of "keep me", which is somewhat orthogonal to "get me".

I am still thinking about this situation, but I would say we need to explicitly assign pins for different tasks.

  1. get/download this file/dir (keep old versions as long as the pinner allows); should be toggleable by the user. Right now IT IS NOT SETTABLE.
  2. keep the latest version at all cost, never unpin the latest version
  3. keep this version at all cost (not sure how to dig it in history, to clear at the later time)

The logic of how we set pins 1-3 should be rethought as well. Currently, if you sync the repo's metadata (without setting pins) and then do brig cat fileCachedInRemote, the file will be shown to the user, but the funny thing is that after a while the repinner removes it from the backend cache. Technically this is correct: the user did not ask for it to be pinned. On the other hand, there is clearly a demand to have the file, unless there is pressure to clear space.

So maybe we need a 4th type of pin:

  4. keep unless local space is needed.

Finally, setting any of the above pins should immediately trigger a pin at the backend. Removing a brig pin removes the backend pin only if all the others are unset as well.
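A minimal Go sketch of that last rule, assuming the four proposed pin types are stored as independent flags per file. All names here (PinReason, PinFetch, BackendPinned, ...) are hypothetical and do not exist in brig; the point is only that the backend pin is derived from "is any reason still set?".

```go
package main

import "fmt"

// PinReason is a hypothetical bit set of reasons to keep a file pinned.
// None of these names are part of brig.
type PinReason uint8

const (
	PinFetch       PinReason = 1 << iota // 1. get/download this file or directory
	PinKeepLatest                        // 2. never unpin the latest version
	PinKeepVersion                       // 3. keep this exact version
	PinKeepIfSpace                       // 4. keep unless local space is needed
)

// Set returns the reasons with r added.
func (p PinReason) Set(r PinReason) PinReason { return p | r }

// Clear returns the reasons with r removed.
func (p PinReason) Clear(r PinReason) PinReason { return p &^ r }

// BackendPinned reports whether the backend pin must exist:
// it stays as long as at least one reason is still set.
func (p PinReason) BackendPinned() bool { return p != 0 }

func main() {
	var p PinReason
	p = p.Set(PinFetch).Set(PinKeepLatest)
	fmt.Println(p.BackendPinned()) // true

	p = p.Clear(PinFetch)
	fmt.Println(p.BackendPinned()) // true: PinKeepLatest is still set

	p = p.Clear(PinKeepLatest)
	fmt.Println(p.BackendPinned()) // false: no reason left, backend pin can go
}
```

Whether the 4th kind ("keep unless space is needed") should count as a full backend pin or as a weaker cache marker is left open here.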

I was thinking about the above from the point of view of a mobile user, who has little disk space and needs to sync only
portions of the main repo.

Does it all make sense?

@evgmik evgmik added question Issue tracker used as a support platform. proposal Proposals labels Feb 5, 2021
@sahib
Owner

sahib commented Feb 5, 2021

I welcome this discussion; the whole pinning concept in brig is not perfectly thought through and could use a review.

> I always thought that explicit stands for "keep this file at all cost, with all its versions". But it seems that explicit
> actually means "keep this version of the file at all cost". Which scenario are we enforcing?

IIRC, the pin is only for the version. I agree that it would make more sense to make it per file. We would need a way to say that we only want (e.g.) 10 versions of this file. This could be a new hint introduced for a file or directory.

> Second question: what is the role of a regular pin? It seems quite useless in the presence of the explicit one. It seems that it is used internally by the repinner to mark versions to be kept cached.

The semantics are that this pin was set by brig, without the user's explicit consent. Explicit pins tell you that a user actually told the system to keep the file. The current repinner therefore does not touch explicit pins, since they have stronger guarantees.

> Cached or not flag, i.e. is the file pinned at backend.

I wouldn't call this pinned, since files without any pin, but with "IsCached" are "in transit" and might get deleted at any moment.

> My main issue: to "invite" a pin during synchronization, you have to pin explicitly (well, we have a sync type which pulls everything on sync, but I think it is not the main brig use case). But an explicit pin has the meaning of "keep me", which is somewhat orthogonal to "get me".

What do you mean by "invite"?

> I am still thinking about this situation, but I would say we need to explicitly assign pins for different tasks.

Good suggestion. As written above, this could be combined with the hint system to tell brig some details. So we could set things like this (option naming / syntax just for the sake of discussion):

$ brig hints set /share   --keep-versions 1
$ brig hints set /private --keep-versions 10
$ brig hints set /archive --keep-versions inf

Or maybe even limit the maximum size of a directory before old versions get killed:

$ brig hints set /share   --max-cached-size 1G

This would allow us to have only one kind of pin, but a highly configurable way to decide which file should be kept. Since hints (and pins) are always local they only apply to us. The remote side can have their set of hints that enforce the rules they wish for.
The information above would give the repinner a lot of logic to work on.
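To make the idea concrete, here is a rough Go sketch of how a repinner could turn such hints into a decision. Everything in it is an assumption for the sake of discussion: the Hint type, its fields, and keepCount are hypothetical names, the size limit is applied to one file's versions rather than to a whole directory, and the newest version is assumed to survive even when it alone exceeds the budget.

```go
package main

import "fmt"

// Hint is hypothetical; its fields mirror the --keep-versions and
// --max-cached-size options floated above.
type Hint struct {
	KeepVersions  int   // number of versions to keep; negative means "inf"
	MaxCachedSize int64 // size budget in bytes; 0 means no limit
}

// keepCount returns how many versions (sizes in bytes, newest first)
// a repinner would leave pinned under the given hint.
func keepCount(sizes []int64, h Hint) int {
	var total int64
	kept := 0
	for _, size := range sizes {
		// Stop once the version limit is reached.
		if h.KeepVersions >= 0 && kept >= h.KeepVersions {
			break
		}
		// Assumption: the newest version is kept even if it alone
		// exceeds the budget; older ones must fit into what is left.
		if h.MaxCachedSize > 0 && kept > 0 && total+size > h.MaxCachedSize {
			break
		}
		total += size
		kept++
	}
	return kept
}

func main() {
	sizes := []int64{400, 300, 300, 200} // newest first

	fmt.Println(keepCount(sizes, Hint{KeepVersions: 1}))                      // 1
	fmt.Println(keepCount(sizes, Hint{KeepVersions: 10}))                     // 4
	fmt.Println(keepCount(sizes, Hint{KeepVersions: -1, MaxCachedSize: 800})) // 2
}
```

Versions beyond the returned count would be candidates for unpinning; everything else stays pinned without the user ever touching a pin directly.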

> [...] but the funny thing is that after a while the repinner removes it from the backend cache.

I think this can be considered a bug. When a file is downloaded, it should only be cached. The user who synced with another user should add their own rules on what files/versions to keep. If brig cat pinned the files it outputs, that would be counter-intuitive...

> Finally, setting any of the above pins should immediately trigger a pin at the backend. Removing a brig pin removes the backend pin only if all the others are unset as well. I was thinking about the above from the point of view of a mobile user, who has little disk space and needs to sync only portions of the main repo.

As discussed in another ticket, we need a queue of pins that should be carried out. But I think I would vote for something similar to how hints do this: when setting a pin, a user can --apply it directly or wait until the repinner thinks it's time. Best would probably be to rename --recode to --apply and have it enforce not only the encoding but also the pin status.

> Does it all make sense?

The current pin situation does not make sense, therefore your proposal makes sense. However, instead of adding more pin types, I would reduce the number of different pins to 1 and make it configurable in another way.

@evgmik
Collaborator Author

evgmik commented Feb 5, 2021

> > My main issue: to "invite" a pin during synchronization, you have to pin explicitly (well, we have a sync type which pulls everything on sync, but I think it is not the main brig use case). But an explicit pin has the meaning of "keep me", which is somewhat orthogonal to "get me".
>
> What do you mean by "invite"?

"invite" was used in the sense of setting the pin.

I also like the hints idea and switching --recode to --apply.

@sahib
Owner

sahib commented Feb 6, 2021

Good, then let's tackle that as one item. I think it's an important thing and will make brig a lot better.

The following things should be considered in the pin redesign:

  • Implement a hint system for pins, making pins transparent for users.
  • Remove distinction between explicit and implicit pins.
  • Re-implement re-pinner to use the hints.
  • Reduce the number of bytes written per file update (see issue "fix: stabilize & improve the FUSE layer", #49)
  • Queue pins/unpins so they get carried out without blocking (see issue "Freeze in current develop branch", #87)
  • Badger should be updated to a more recent version. The new version is not backwards compatible, so better now than later.
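For the queueing item, a minimal Go sketch of what "carried out without blocking" could look like: callers enqueue pin/unpin operations and a single background worker applies them in order. The pinOp and pinQueue types and the apply callback are hypothetical; in brig the callback would be whatever talks to the backend, and here a map stands in for it.

```go
package main

import (
	"fmt"
	"sync"
)

// pinOp is a hypothetical queued operation; hash stands in for a
// backend content hash.
type pinOp struct {
	hash  string
	unpin bool
}

// pinQueue hands operations to a single background worker so that
// callers do not wait for the backend.
type pinQueue struct {
	ops chan pinOp
	wg  sync.WaitGroup
}

// newPinQueue starts the worker; apply is called once per operation,
// in the order the operations were enqueued.
func newPinQueue(size int, apply func(pinOp)) *pinQueue {
	q := &pinQueue{ops: make(chan pinOp, size)}
	q.wg.Add(1)
	go func() {
		defer q.wg.Done()
		for op := range q.ops {
			apply(op)
		}
	}()
	return q
}

// Enqueue returns immediately unless the buffer is full.
func (q *pinQueue) Enqueue(op pinOp) { q.ops <- op }

// Close stops accepting work and waits for queued operations to finish.
func (q *pinQueue) Close() {
	close(q.ops)
	q.wg.Wait()
}

func main() {
	// Stand-in for backend state; only the worker touches it until Close returns.
	pinned := map[string]bool{}

	q := newPinQueue(16, func(op pinOp) {
		if op.unpin {
			delete(pinned, op.hash)
		} else {
			pinned[op.hash] = true
		}
	})

	q.Enqueue(pinOp{hash: "QmA"})
	q.Enqueue(pinOp{hash: "QmB"})
	q.Enqueue(pinOp{hash: "QmA", unpin: true})
	q.Close()

	fmt.Println(len(pinned), pinned["QmB"]) // 1 true
}
```

A single worker keeps pin and unpin of the same hash in order; a persistent queue that survives daemon restarts would need more than this.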

Anything I forgot?

@evgmik
Collaborator Author

evgmik commented Feb 6, 2021

> Anything I forgot?

I think you covered it well. One more item to track:

  • Preservation of pin hints during synchronization; currently strange things are happening with it.

I will address badger in a separate issue.

@sahib
Owner

sahib commented Feb 26, 2021

Seems IPFS v0.8 adds a way to add a pin in the background: https://github.com/ipfs/go-ipfs/releases/tag/v0.8.0
Didn't try it yet, but this would solve our issue with blocking pins, which is quite nice.

@evgmik
Copy link
Collaborator Author

evgmik commented Feb 27, 2021

There is also remote pinning, from which our remote push might benefit. I.e., the backend might start the IPFS download before the metadata is synchronized.
