Memory Constrained environments #343
Comments
Yes, I understand... I had an Do you have recurring tasks configured at all? You could skip the scheduler if not. Another question: are you starting Solid Queue via I'll see if I get to bring back the
I do have recurring tasks.
For what it's worth, this was a very good call. That being said, per the README, the dispatcher really isn't doing that much anymore. Unless you've got big plans for the dispatcher, it seems the Supervisor has the maintenance task thread (or a second thread) that could be doing what's left of the dispatcher's job.
I think it depends on your setup 🤔 If you use delayed jobs, then the dispatcher will be making sure they get dispatched. This also applies to jobs automatically retried via Active Job with delay (the default). We do use them heavily and run several of them separate from workers, but perhaps you don't? In that case, would it help to just not run the dispatcher? You can achieve that by not configuring it at all, but I imagine you do need it in some cases 🤔 Although I imagine you need the concurrency maintenance task 😬 I think the async mode was a good idea for this case, TBH.
I'm starting to worry this last exchange is falling further and further into the "I didn't fully understand" side of things, again :-( I have to admit to being a bit flummoxed between the complexity required for a competent Async Job subsystem and the desire to fit things into the itty bitty tiny box of small scale and affordable cloud based deployments. Given where SolidQueue currently sits with memory utilization, I'm going to have a good think on the trade-offs between running just 1 worker and assuming it's going to recycle (on OOM) for almost every execution vs. just facing the fact that I have to push that memory dial to the right and eat the bill 😢
So sorry, this is my fault. I call it Are you starting Solid Queue via
That one I actually understood. Adding complexity to the code so it can be configured to run both ways seems like a big lift. I know you had it before, but every line of code in SolidQueue is a line that has to be supported, tested, and will eventually be used in an unexpected way. I would guess Threads are ok for very light / IO-intensive workloads, but given the GVL I simply don't understand where the tradeoffs are on Threads vs. Processes. I shouldn't have started this conversation without a better understanding.
I've switched to bin/jobs. I can't thank you enough for being willing to engage, and tolerate / put up with my learning curve on some of these issues.
Oh, no, no please, it's me who should thank you for your patience and help to make Solid Queue better! 🙏 ❤️ The reason I asked about
I'll look into the Eager vs. Lazy tradeoffs. Thanks for that. Once I get worker recycling working / finished, I'll have more suggestions to share that have helped. For example, SolidQueue.on_start { GC.auto_compact = true } helps and is shared between forks.
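For anyone wanting to try the GC.auto_compact suggestion, here is a minimal sketch. The helper name enable_auto_compaction is hypothetical (it is not part of Solid Queue or Ruby); it only wraps the standard GC.auto_compact setter so that Rubies or platforms without compaction support don't raise at boot.

```ruby
# Hypothetical helper (not part of Solid Queue): turn on automatic heap
# compaction when the running Ruby supports it.
# Returns true if compaction is enabled afterwards, false otherwise.
def enable_auto_compaction
  # GC.auto_compact= exists since Ruby 3.0; older Rubies lack it.
  return false unless GC.respond_to?(:auto_compact=)

  GC.auto_compact = true
  GC.auto_compact == true
rescue NotImplementedError
  # Raised on platforms where heap compaction is not available.
  false
end

puts "auto_compact enabled: #{enable_auto_compaction}"

# Wiring it up as described in the comment above, e.g. in an initializer:
#   SolidQueue.on_start { enable_auto_compaction }
```

Because the setting is applied before the supervisor forks, forked processes would inherit it, which matches the "shared between forks" remark.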
There's also
Oh that looks interesting! Thank you.
What's the easiest way to do this? Would an empty
@hms If you're running in a super constrained environment, you could just use the puma plugin that we use in development. Then everything is running off that single Rails process. Just make sure you keep WEB_CONCURRENCY = 1. Another option is to stop getting fleeced by cloud providers charging ridiculous prices for tiny hosts 😄. Rails 8 is actually about answering that question in the broad sense.
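As a rough illustration of the puma plugin approach, the fragment below shows what the relevant line in config/puma.rb might look like. The SOLID_QUEUE_IN_PUMA guard is an assumption for this sketch (an opt-in environment variable, so the plugin only loads where you want jobs to run inside the web process); drop the guard to always load it.

```ruby
# config/puma.rb (fragment, a sketch rather than a complete config)
#
# Runs the Solid Queue supervisor inside the Puma process, so web requests
# and jobs share a single Rails process. Keep WEB_CONCURRENCY at 1, as
# suggested above, so Puma does not fork extra web workers.
#
# Assumption: SOLID_QUEUE_IN_PUMA is an opt-in flag chosen for this example.
plugin :solid_queue if ENV["SOLID_QUEUE_IN_PUMA"]
```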
@majkelcc yes! If you have no recurring tasks defined at all (the default), then the scheduler will be automatically disabled. Alternatively, if you're running jobs in more than one place and want to disable it in one of them (this is what we do in HEY), then you can pass
Oh, how I hate Heroku and the games they play with radically inappropriate "starter" resource sizing (you think Apple is bad), all in an effort to prop up their already insanely high prices and force me into upgrades (does that make it "Insanely high(2) prices"?). And yes, I'm very jealous of your new monster Dells and the fact you got off the treadmill (want to rent me a small slice for something I can afford...). But as a solo developer, who is extremely grateful for the technical compression the Rails community and you have delivered over the years, I cannot put a price on the value of A) Not having to worry about anything DevOps; B) The comfort of 10+ years of using a system and feeling like you know all of its corners. I'm very much crossing my fingers that Rails 8 reduces the moving parts enough that the learning curve of a new deployment strategy comes within reach.
@hms You're the ideal target for the progress we're bringing to the deployment story in Rails 8. Stay tuned for Rails World! But in the meantime, I'd try with the puma plugin approach.
@dhh In my case, I have at least split the web server from the SolidQueue environments, so I'm living large with 512MB x2. I'm just bristling at the fact that I have to go from $9 to $50 a month to double that memory.
Highway robbery. Selling 512MB instances in 2024 is something.
@hms have you had any success with When I deploy using either a separate worker or with the Puma plugin, I immediately exceed the 512MB quota. (Highway robbery... I know).
I've been focusing on other SQ PRs. But I have a "recycle Workers" PR in my back pocket. The only question is whether the SQ team will accept one of my other PRs, since the implementation has to change a little if they do. I tried all sorts of ways to constrain the memory footprint of Workers using the AWS S3 gem, without any luck. It generates an unrecoverable memory footprint. I'm guessing it's degenerate memory fragmentation rather than a leak, but either way, this means I'll require one of: an unlimited Heroku budget, or getting off my arse and pushing my recycling PR ASAP. GC.auto_compact = true:
This was a good thread to find. I enjoy solid_queue for sure, but I noticed that while running the puma plugin, RAM usage goes far too far over the normal dyno memory threshold within a matter of time. This is a super small app, so just tossing more money/RAM at it doesn't seem to be the win. For now I'll run a separate dyno for solid_queue and avoid all the R-errors from the platform.
As an added data point, we're running a (pretty much vanilla) Rails 8 app on the smallest Digital Ocean droplet (1GB of memory) and solid_queue 1.1.2 is taking up 48% of that memory, matching the values reported by @hms in the original post. Our app is currently very barebones but we do have one recurring job. It doesn't make sense to me that the job framework would use almost 500MB of memory. I'll look into puma plugin mode, but I'd love to see solid_queue use less memory by default.
Adding another data point: we just introduced ActiveJob + SolidQueue to an old and large codebase, which has a memory footprint of about 500MB when running as a web server using Puma. When starting a SolidQueue worker with bin/jobs in production (using eager loading), 4 processes are started that each use close to 500MB, and we've seen memory usage reach 1.6GB at some point. There's probably something in our codebase that makes it hard for processes to share memory efficiently (breaking Ruby's copy-on-write mechanism when forking processes). At this point it's not something we'll be able to address, since our codebase is so large. We ended up using the rake task instead of bin/jobs to bypass eager loading, so that the worker is the only process that ends up with the entire app in memory, and our SolidQueue server appears to stabilize at 650MB instead. For comparison, we are trying to migrate from Sidekiq, which in our case only has a memory footprint of 400MB for the same application, so SolidQueue appears to have a 4x memory footprint when using bin/jobs and a 1.6x footprint when using the rake task. Would it make sense to only have the worker load the entire Rails app, and have the supervisor / dispatcher / scheduler be super lightweight processes instead? I can imagine that there's some value in forking the workers from the same process to reduce memory footprint when running multiple worker processes, but at the same time I imagine that the 90% use case is running a single multi-threaded worker process anyway?
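The ratios quoted in that comment can be checked with a few lines of arithmetic. This is only a worked example using the figures reported above; the constant and method names are made up for illustration.

```ruby
# Figures reported in the comment above (all in MB).
SIDEKIQ_MB       = 400.0  # Sidekiq footprint for the same application
BIN_JOBS_PEAK_MB = 1600.0 # peak seen with bin/jobs (4 processes, eager loading)
RAKE_TASK_MB     = 650.0  # steady state when started via the rake task

# How many times larger `observed_mb` is than `baseline_mb`, to one decimal.
def footprint_ratio(observed_mb, baseline_mb)
  (observed_mb / baseline_mb).round(1)
end

puts "bin/jobs vs Sidekiq:  #{footprint_ratio(BIN_JOBS_PEAK_MB, SIDEKIQ_MB)}x"
puts "rake task vs Sidekiq: #{footprint_ratio(RAKE_TASK_MB, SIDEKIQ_MB)}x"

# If the 4 processes shared nothing at all, 4 x ~500MB would be the ceiling;
# the observed peak sits somewhat below that.
no_sharing_ceiling_mb = 4 * 500
puts "no-sharing ceiling:   #{no_sharing_ceiling_mb}MB (observed peak: #{BIN_JOBS_PEAK_MB.to_i}MB)"
```

The observed 1.6GB peak is fairly close to the 2GB no-sharing ceiling, which is consistent with the poster's guess that copy-on-write sharing is not helping much in their codebase.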
@Jell, yeah, I still think the I'd recommend not to migrate from Sidekiq in your case since it seems it was working well, right?
@rosa actually we do want to migrate away from Sidekiq because we'd like to use a background job system that would allow us to act as a transactional outbox (the sharp tool described in https://github.com/rails/solid_queue?tab=readme-ov-file#jobs-and-transactional-integrity), and we also want to have better guarantees of "at least once processing" (which would require Sidekiq Pro). So we're looking at migrating to either SolidQueue or GoodJob. The higher memory footprint is not an issue for us per se; it just doesn't feel good to have to justify the cost increase. And I didn't mean to compare to Sidekiq to say "Sidekiq is better", just to give what I think is a fair benchmark? We much prefer ActiveJob + SolidQueue over Sidekiq overall.
From my understanding of your explanation on the other threads, I believe I agree with you @rosa: an async mode with a single process would be perfect for our use case, and I believe it should give us a memory footprint that is on par with the industry standard. Thanks a lot for your patience and your effort 🙏
@rosa
At the risk of your crafting a Voodoo doll of me, and using it every time I reach out... Without knowing / understanding your design criteria and objectives, I'm at risk of asking poor questions or making bad suggestions, but here it goes anyway.... I'm going to apologize in advance for being "That Guy".
With the new V0.9 release (no tasks, nothing run), a fresh startup of SolidQueue, I see the following memory footprint (OSX):
Once the Jobs actually do something of value, the worker reliably grows to 200MB plus (I'm looking at you, ActiveRecord...). For those of us running on cloud services and a shoestring budget, that's already tight. In my case, I run a second Worker to isolate high memory jobs so I can "recycle on OOM" while still servicing everything else via the other worker.
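To make the "recycle" idea concrete, here is a minimal sketch of the kind of check a worker could run between jobs. Both helpers (current_rss_mb, recycle_worker?) and the 450MB threshold are hypothetical; nothing here is part of Solid Queue, and how a worker would actually be restarted is left out.

```ruby
# Hypothetical helpers, not part of Solid Queue.

# Resident set size of a process in MB, or nil when it cannot be determined.
def current_rss_mb(pid = Process.pid)
  status_file = "/proc/#{pid}/status"
  if File.readable?(status_file)
    # Linux: the relevant line looks like "VmRSS:   123456 kB".
    line = File.foreach(status_file).find { |l| l.start_with?("VmRSS:") }
    return line.split[1].to_i / 1024.0 if line
  end

  # macOS and other Unixes: ps reports RSS in kilobytes.
  kb = `ps -o rss= -p #{pid} 2>/dev/null`.strip
  kb.empty? ? nil : kb.to_i / 1024.0
end

# True when the measured RSS is known and above the limit.
def recycle_worker?(rss_mb, limit_mb)
  !rss_mb.nil? && rss_mb > limit_mb
end

LIMIT_MB = 450 # assumed threshold, leaving headroom under a 512MB quota

rss = current_rss_mb
puts "RSS: #{rss ? rss.round(1) : 'unknown'} MB, recycle? #{recycle_worker?(rss, LIMIT_MB)}"
```

Checking proactively like this would let a worker exit cleanly after finishing a job, rather than waiting for the platform to kill it mid-job on OOM.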
I can purchase my way into additional memory resources at a cost of 10x (literally) what I'm paying now. And it only goes up from there. So this issue is real and painful for me, and I would guess a bunch of other folks running on shoestring budgets.
I'm sure there are use-cases where larger deployments would want a Dispatcher without a Supervisor, so I think I understand the rationale for the current design. But it would be nice if there was a way, via configuration, to have a SuperDispatcherVisor... have the supervisor take on the dispatcher's responsibilities and allow us to reclaim 110MB+.