How to backup scorch online? #1396
Comments
Currently, no. We are aware this is a big limitation.
So the current best practice is to close the index and tar / copy the folder from disk?
mosuka/blast is the only open source project I can find that attempts to use bleve with Raft. The technique used there for snapshotting / backups is to iterate through index items with
What's your preferred approach for snapshots of a scorch index in the Raft use case? I can think of a few...
Dump was only suitable for debugging; it was never intended for backup (and it had the same limitation of needing exclusive access to the k/v store).
Generally, online backup should be more straightforward to implement for scorch. All of the index data is written to immutable segments, meaning that if you can guarantee they won't go away for a period, you can simply copy them. That is just the raw index data, though. You also need the index meta-data to know which set of files represents a snapshot, as well as the deleted bitmaps associated with them. This is the part that is still sub-optimal in scorch, because we're using BoltDB to store the meta-data, and that comes with limitations on exclusive access.
Because of that, an out-of-process backup isn't really doable in the short term. But the existing read/write scorch instance could reasonably have a method to Backup() into some other location. Unfortunately, it won't be as simple as adding it to the Reader method: even though that represents a logical snapshot of the index, some of its segments may still be in memory. Instead, it would probably need to be a new top-level method that finds the most recent fully-persisted snapshot in the root bolt, then copies it and the root.bolt itself into the new location. I don't think it's a whole lot of work, roughly:
Thanks Marty, that's helpful. Would it be possible / convenient to get a snapshot of the current state of the index and then wait for the segments to become persistent before proceeding with the rest of the backup as you described?
@mschoch digging deeper... it seems to me that if
Unfortunately no, the way Bleve works, when those in-memory segments eventually get persisted, they are introduced as a new snapshot. This simplifies the code most places because once you get a snapshot, it's immutable.
No,
@mschoch Thanks again for your helpful response. I decided to go white box and thoroughly reviewed the Scorch.Batch, introducerLoop, and introduceSegment. I updated the README.md and renamed some variables to make it easier to read. PR is here: #1452 Near the end of that review... i ran into the I'm still looking to see where that is closed. Maybe I'll find it tomorrow. At that point, the Backup function follows your outline.
When recreating the DB from the backup, however, I'm worried that not all the required metadata will be rehydrated. Should I save off more than the snapshot data, or is that it? This might not be a fully online backup per se, since we need to stop the world until the current root is persisted, but it might be a good place to start. Just demonstrating backup / recovery would be a win IMO, and this might be just enough for me.
I'm working on a PR now. It doesn't look for a persistent snapshot; it just writes snapshot data to a writer.
I'm doing my best, but I cannot provide feedback at the pace you're moving forward. I think you're going down the wrong path trying to block updates; it should not be necessary. Aside from the uninteresting case of a brand new index, there is always at least one fully-persisted snapshot on disk. Backup should just add a ref count to that and/or prevent its files from being deleted (we have some additional maps used for that), and then back up that already-on-disk snapshot.
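To make the ref-count idea concrete, here is a minimal Go sketch of a guard that keeps a set of files from being deleted while a backup holds a reference. It is an illustration under stated assumptions, not scorch's internal bookkeeping: the `fileGuard` type and its `Protect` and `CanDelete` methods are hypothetical names, and a real cleanup routine would have to consult such a guard before removing a file.

```go
package main

import "sync"

// fileGuard tracks how many in-flight backups reference each file.
// Hypothetical type, for illustration only.
type fileGuard struct {
	mu   sync.Mutex
	refs map[string]int
}

func newFileGuard() *fileGuard {
	return &fileGuard{refs: make(map[string]int)}
}

// Protect increments the ref count of every named file and returns a
// release function that undoes it. Calling release more than once is safe.
func (g *fileGuard) Protect(names []string) (release func()) {
	// Copy the slice so later changes by the caller cannot affect release.
	names = append([]string(nil), names...)

	g.mu.Lock()
	for _, n := range names {
		g.refs[n]++
	}
	g.mu.Unlock()

	var once sync.Once
	return func() {
		once.Do(func() {
			g.mu.Lock()
			defer g.mu.Unlock()
			for _, n := range names {
				if g.refs[n]--; g.refs[n] <= 0 {
					delete(g.refs, n)
				}
			}
		})
	}
}

// CanDelete reports whether no backup currently references the file.
func (g *fileGuard) CanDelete(name string) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.refs[name] == 0
}
```

With this shape, a backup calls `Protect` on the snapshot's files, copies them, and then calls the returned release function. Writers are never blocked; only deletion of the protected files is deferred.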
Is there any way to back up a scorch index online, i.e. take a snapshot of the current scorch index without closing it and copy the snapshot to another location?