Regression: Adding a lot of files to MFS will slow ipfs down significantly #8694
The shellscript I'm using is open source, so you should be able to reproduce this:
This will rsync the arch package mirror and loop over the files and import this into the local MFS. Just make sure you have enough space in ~ for the download (69GB) and on the IPFS node to write this into the storage. |
@RubenKelevra do you know which version caused a regression? Have you tried with v0.11.0? v0.12.0 is a very targeted release which should not have disturbed much so understanding when this issue emerged would be very helpful. |
Hey @aschmahmann, I started the import on 0.11 yesterday. As soon as I'm home I can report if this is happening there too. While an offline import works without slowdown, I still sometimes get errors back, which look like the ipfs add returns too quickly and the This seems to be a separate issue which is probably not a regression, as I never tried importing it offline before. |
I can confirm this issue for 0.11 as well, so it's not a new thing.
The next step for me is to try the binary from dist.ipfs.io to rule out any build issues. |
I can confirm the issue for the binary from dist.ipfs.io as well. |
Thanks that's very helpful. Is this a v0.10.0 -> v0.11.0 thing? When was the last known version before the behavior started changing? In any event, having a more minimal reproduction would help (e.g. making a version of the script that works from a local folder rather than relying on rsync). If this is v0.11.0 related then my suspicion is that you have directories that were small enough you could transfer them through go-ipfs previously, but large enough that MFS will now automatically shard them (could be confirmed by looking at your MFS via If so then what exactly about sharded directories + MFS is causing the slow down should be looked at. Some things I'd start with investigating are:
|
I think the last time I ran a full import I was on 0.9.1. I just started the import to make sure that's correct.
Sure, if you want to avoid any rsync, just comment out L87. I think that should work. The script will still expect a repository directory like from Manjaro or Arch to work properly, but you can just reuse the same repository without having to update it between each try.
Sounds like a reasonable suspicion, but on the other hand, this shouldn't lead to response times of minutes for simple operations. I feel like we're dealing with some kind of locked operation which gets "overwritten" with new data fed into ipfs while it's running. So we pile up tasks behind a lock. This would explain why it starts fast and gets slower and slower until it's basically down to a crawl. |
Ah and additionally, I used sharding previously just for testing, but decided against it. So the import was running fine with sharding previously (like with 0.4 or something). Previously, there was no need for sharding, which makes me wonder why IPFS would do sharding if it's not necessary. |
@aschmahmann I've installed 0.9.1 from dist.ipfs.io and I can confirm, the bug is not present in this version. |
Ok, so to clarify your performance/testing looks like:
TLDR: Two reasons. 1) Serializing the block to check if it exceeds the limit before re-encoding it is expensive, so having some conservative estimate is reasonable 2) Maxing out the block size isn't necessarily optimal. For example, if you keep writing blocks up to 1MB in size then every time you add an entry you create a duplicate block of similar size which can lead to a whole bunch of wasted space that you may/may not want to GC depending on how accessible you want your history to be. #7022 (comment) Thanks for your testing work so far. If you're able to keep going here, understanding if v0.10.0 is ✔️ or ❌ would be helpful. Additionally/alternatively, you could try v0.11.0 and jack up the internal variable controlling the auto-sharding threshold to effectively turn it off by doing I also realized this internal flag was missing from the docs 🤦 so I put up #8723 |
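To put rough numbers on the second reason, here is a back-of-the-envelope sketch. It assumes a hypothetical fixed size per directory entry, ignores the real protobuf framing, and assumes no garbage collection, so treat the result as an order-of-magnitude illustration only. The function names are made up for this example.

```python
def final_block_bytes(n_entries: int, entry_bytes: int) -> int:
    """Size of the directory block once all entries are in it (toy model)."""
    return n_entries * entry_bytes


def history_bytes(n_entries: int, entry_bytes: int) -> int:
    """Total bytes of all intermediate directory blocks, if nothing is GC'd.

    The k-th version of the block holds k entries, so the total is
    entry_bytes * (1 + 2 + ... + n) = entry_bytes * n * (n + 1) / 2.
    """
    return entry_bytes * n_entries * (n_entries + 1) // 2


if __name__ == "__main__":
    n, e = 10_000, 100  # assumed: 10,000 entries of 100 bytes each
    print("final block:", final_block_bytes(n, e), "bytes")
    print("all versions:", history_bytes(n, e), "bytes")
```

Under these assumptions a directory that ends up as a single block of about 1 MB leaves roughly 5 GB of superseded blocks behind if one entry is added at a time and every intermediate version is written out.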
We're going to close this because we don't have additional info to dig in further. Feel free to reopen with the requested info if this is still an issue. Thanks. |
@aschmahmann was this fixed? I updated to the 0.13rc1, and I ran into serious performance issues again. Have you tried to add many files to the MFS in simple |
I think this was more due to large repinning operations by the cluster daemon, as the MFS folders need to be pinned locally on every change. I created a ticket on the cluster project for this. Furthermore, I see (at least with a few file changes) no large hangs when using 1 MB sharding. But I haven't yet tested the full import I originally had trouble with – and what this ticket is about. |
@aschmahmann I can confirm this issue with the suggested (1 MB should never be exceeded on my datasets, as sharding wasn't previously necessary to store the folders.) The changes to the MFS are grinding to a halt after a lot of consecutive operations, where single All other MFS operations are blocked as well, so you get response times in the minutes for simple @BigLep please reopen as this isn't fixed and can be reproduced |
I'll take my project pacman.store with the package mirrors for Manjaro, Arch Linux etc. down until this is solved. I don't like running 0.9 anymore due to the age and would need to downgrade the whole server again. I just cannot share days old packages, even weeks due to safety concerns, so I don't like to do any harm here. The URLs will just return empty directories for now. |
Yeah. CPU load piles up and a simple I think there's just something running concurrently and somehow work needs to be done again and again to make the change to the MFS, as other parts of it are still changing. But that's just a guess. Could be anything else, really. |
@lidel We might be conflating different topics in the same issue here, let's have a new issue for that report when it comes and please ping me. |
@schomatis ack, moved |
@dhyaniarun1993 I am confident that this is another issue (I couldn't find the issue, so if you want, open a new one even though we know what this is). |
@dhyaniarun1993 I don't want to spam this issue so I'll mark our conversation off-topic FYI, pls open a new issue. The resolution (walking from
Kubo will fetch EDIT: github wont let me hide my own messages ... :'( |
I have this same issue. I'm maintaining a package mirror with approximately Here's the question. I already have a table of names and cids of |
@Jorropo : what are the next steps here? |
Looking at this:
[pprof graphs: heap, cpu, block, allocations]
It seems to me that this is related to sharding. Funnily enough, my computer froze while writing this and I lost the profile data, which I had in a RAM folder; it also showed a bunch of goroutines waiting on a leveldb select(). |
@hsanjuan Thanks for taking a look at this! Would be cool to bring the mirror back online! |
Running same test with pebbleds as backend instead of leveldb:
And OK: this is the issue:
I will work on a fix tomorrow, but doing |
This is a mitigation for increased MFS memory usage in the course of many write operations. The underlying issue is the unbounded growth of the mfs directory cache in boxo. In the latest boxo version, this cache can be cleared by calling Flush() on the folder. In order to trigger that, we call Flush() on the parent folder of the file/folder where the write operations are happening. Not flushing the parent folder allows the cache to grow unbounded. Then, any read operation on that folder or its parents (e.g. stat) will trigger a sync operation to match the cache to the underlying unixfs structure (and obtain the correct node CID). This sync operation must visit every item in the cache. When the cache has grown too much and the underlying unixfs folder has switched into a HAMT, the operation can take minutes. Thus, we should clear the cache often, and the Flush flag is a good indicator that we can let it go. Users can always run with --flush=false and flush at regular intervals during their MFS writes if they want to extract some performance. Fixes #8694, #10588.
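To illustrate why an unflushed cache makes things start fast and then crawl, here is a toy model of the behaviour described above. It is not boxo's code: it only assumes that every write adds one entry to a per-directory cache, that every read (e.g. a stat) or flush visits each cached entry once, and that a flush empties the cache. The function name and parameters are made up for this sketch.

```python
def simulate_visits(n_writes: int, flush_every: int = 0) -> int:
    """Count cache entries visited over n_writes writes, with a stat after each.

    flush_every=0 means the cache is never flushed.
    """
    cached = 0   # entries currently held in the directory cache
    visits = 0   # total entries visited by sync operations
    for i in range(1, n_writes + 1):
        cached += 1        # the write adds one entry to the cache
        visits += cached   # the following stat syncs every cached entry
        if flush_every and i % flush_every == 0:
            visits += cached  # the flush syncs as well ...
            cached = 0        # ... and then drops the cache
    return visits


if __name__ == "__main__":
    for k in (0, 100, 1):
        print(f"flush_every={k}: {simulate_visits(1000, k)} visits")
```

In this model, 1000 writes without any flush cost 500,500 visits (quadratic growth), flushing every 100 writes costs 51,500, and flushing after every write costs 2,000. Real costs depend on the HAMT and datastore, which the model ignores.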
Ok, summary for everyone looking here:
The HAMT slowdown part is something to keep in mind when working with MFS folders with a large number of files in them, but it is not a bug. |
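For anyone who wants to follow the --flush=false advice from the fix description, here is a minimal sketch of batching MFS writes and flushing at regular intervals. It only builds the command lines and does not run them. The batch size of 100, flushing the MFS root `/`, and the helper name are assumptions made for illustration, not recommendations from the maintainers.

```python
from typing import Iterable, Iterator, List, Tuple


def batched_cp_commands(
    items: Iterable[Tuple[str, str]],
    flush_every: int = 100,   # assumed batch size, tune for your workload
    flush_path: str = "/",    # assumed: flush the MFS root
) -> Iterator[List[str]]:
    """Yield `ipfs files cp --flush=false` commands with periodic flushes.

    items is an iterable of (cid, mfs_path) pairs.
    """
    count = 0
    for cid, mfs_path in items:
        yield ["ipfs", "files", "cp", "--flush=false", f"/ipfs/{cid}", mfs_path]
        count += 1
        if count % flush_every == 0:
            yield ["ipfs", "files", "flush", flush_path]
    if count % flush_every != 0:
        # Flush whatever is left over from the last, partial batch.
        yield ["ipfs", "files", "flush", flush_path]


if __name__ == "__main__":
    demo = [(f"cid{i}", f"/mirror/pkg{i}") for i in range(5)]
    for cmd in batched_cp_commands(demo, flush_every=2):
        print(" ".join(cmd))
```

Each yielded list can be passed to `subprocess.run(cmd, check=True)` against a running node. The final flush makes sure nothing is left unflushed when the import ends.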
Checklist
Installation method
built from source
Version
Config
Description
I'm running 2a871ef compiled by go 1.17.6 on Arch Linux for some days on one of my servers.
I had trouble with my MFS datastore after updating (I couldn't delete a file). So I reset my datastore and started importing the data again.
I'm using a shell script that adds the files and folders individually. Because of #7532, I can't use `ipfs files write` but instead use `ipfs add`, followed by an `ipfs files cp /ipfs/$cid /path/to/file` and an `ipfs pin rm $cid`.
For the `ipfs add`, I set `size-65536` as the chunker, `blake2b-256` as the hashing algorithm, and use raw-leaves.
After 3 days, there was basically no IO on the machine and ipfs was using around 1.6 cores pretty consistently without any real progress. At that time only this one script was running against the API, with no concurrency. The automatic garbage collector of ipfs is off.
There are no experimental settings activated and I'm using flatfs.
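For readers without the script at hand, the per-file sequence described above can be sketched roughly as follows. This is not the original shell script: the helper names are hypothetical, and the exact flag spelling (`--quieter`, `--chunker`, `--hash`, `--raw-leaves`) is my assumption of how the stated settings map onto `ipfs add`.

```python
import subprocess
from typing import Callable, List

Runner = Callable[[List[str]], str]


def run_ipfs(args: List[str]) -> str:
    """Run the ipfs CLI and return its stripped stdout (needs a running node)."""
    result = subprocess.run(
        ["ipfs", *args], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def import_file(local_path: str, mfs_path: str, run: Runner = run_ipfs) -> str:
    """Add a file, link it into MFS, then drop the pin created by the add."""
    cid = run([
        "add", "--quieter",          # print only the final CID
        "--chunker=size-65536",
        "--hash=blake2b-256",
        "--raw-leaves",
        local_path,
    ])
    run(["files", "cp", f"/ipfs/{cid}", mfs_path])  # reference the CID from MFS
    run(["pin", "rm", cid])                         # MFS now keeps the data alive
    return cid
```

The `run` parameter exists only so the sequence can be exercised without a node, for example by passing a function that records the argument lists instead of executing them.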
I did some debugging, all operations were still working, just extremely slow:
and
This is while my script was still running on the API and waiting minutes on each response.
Here's my memory dump etc. while the issue occurred: /ipfs/QmPJ1ec2CywWLFeaHFaTeo6g56S5Bqi3g3MEF1a3JrL8zk
Here's a dump after I stopped the import of files and the CPU usage dropped down to like 0.3 cores: /ipfs/QmbotJhgzc2SBxuvGA9dsCFLbxd836QBNFYkLhdqTCZwrP
Here's what the memory looked like as the issue occurred (according to `atop 1`):
The machine got 10 dedicated cores from an AMD EPYC 7702 and 1 TB SSD storage via NAS.