regression: massive dependency tree on import #255
Comments
This repo is an internal library, so dependencies we need for Prometheus anyway aren't a problem per se, nor are issues that arise from 3rd parties importing it directly, as such usage isn't supported. However, nothing in this repo should be depending on all of that stuff, and for example the entirety of Nats should not be getting compiled into Prometheus. We only depend on the logging bits of go-kit. Can you explain how this is happening? Also, it is not generally appropriate to fork our direct dependencies over issues like this. If there's an issue, let's tackle it more directly. I believe you'll find the circular import (which isn't actually circular, as Go forbids those) is between client_golang and common, which we are aware of and which is much trickier to resolve. |
I think this assessment is incorrect. What Go downloads and what is actually vendored are two different things. If I check prior to #242 and run
If I pull master, and do the same
The net change for the current dependencies is smaller by ~15k LoC. |
I am not sure it's as simple as that? Anyone using client_golang is also using I have some concerns that once this propagates out to more and more downstream dependencies, the entire Go ecosystem will end up importing this bloat. A huge number of libraries import Prometheus, so if any single module in a dependency chain imports a Prometheus version with this dependency on go-kit v0.10, we will get all of these dependencies as well. |
While this is true, none of it is irreversible. Also, technically the bloat does not matter: if you don't import a package that imports that bloat, it will NOT end up as a dependency of your module. I don't want to speculate; it would be nice to see the exact other problems this situation can cause. As for the solution, it's not that easy to remove the https://github.com/go-kit/kit/tree/master/log dependency by forking, etc. The reason is that everyone depends on go-kit anyway, so there will always be some helper or similar using it. It would need to be a bigger upstream change. Can we at least try asking go-kit to move the log package to a separate sub-module first? They probably won't accept it, but it will at least highlight the problem on their side. There is also a roadmap, go-kit/kit#843, which tells us they plan to clean up the packages a bit (dropping unnecessary ones). |
I disagree with this. A Prometheus import is present in virtually every single Go project's dependency tree. Once this propagates to a few key packages (prometheus/client_go being an obvious one, other common targets like grpc will follow) I don't see how it can be prevented: someone will always be importing the bloated version. I think it's fair to say that there is a difference between a direct dependency (i.e. shows up in vendor), an indirect dependency (shows up in go.sum, must be downloaded, etc.), and no dependency. Claiming that an indirect dependency and no dependency are close enough that it doesn't really matter if we import a ton of bloat is reasonable for certain projects, I think, and ultimately this is the Prometheus project's decision to make for their own projects. However, I encourage you to consider that by making this choice you are essentially forcing this decision on the majority of the Go ecosystem. No project will be able to decide they want to avoid downloading GBs of dependencies if they want to depend on any project that imports Prometheus. Personally, I don't think that is the correct decision, and I will be very disappointed to bring in even more dependencies next time we update our Prometheus dependency. |
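To make the "indirect dependency" category above a bit more concrete, here is a rough sketch of counting the distinct modules named in a go.sum file. It assumes the usual three-field line layout (module path, version with an optional /go.mod suffix, hash); the function name distinctModules and the sample entries are made up for illustration, and the hashes are placeholders rather than real checksums.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strings"
)

// distinctModules returns the sorted set of module paths that appear in the
// given go.sum content. Each well-formed line is expected to look like
// "<module> <version>[/go.mod] <hash>"; anything else is skipped.
// (Hypothetical helper, not part of any Go tooling.)
func distinctModules(goSum string) []string {
	seen := make(map[string]struct{})
	sc := bufio.NewScanner(strings.NewReader(goSum))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) != 3 {
			continue // blank or malformed line
		}
		seen[fields[0]] = struct{}{}
	}
	mods := make([]string, 0, len(seen))
	for m := range seen {
		mods = append(mods, m)
	}
	sort.Strings(mods)
	return mods
}

func main() {
	// Illustrative excerpt only: versions and hashes are placeholders.
	sample := `example.com/logging v0.10.0 h1:placeholder=
example.com/logging v0.10.0/go.mod h1:placeholder=
example.com/messaging v1.9.1/go.mod h1:placeholder=
`
	mods := distinctModules(sample)
	fmt.Println(len(mods), "distinct modules in go.sum:")
	for _, m := range mods {
		fmt.Println("  " + m)
	}
}
```

Note that a module listed only with a /go.mod entry (like the second one in the sample) was needed just to resolve the module graph, which is the "must be downloaded but never compiled" case being debated here.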
Did you file an issue in go-kit? One of the issues in 0.10 was needed for prometheus. |
Even if Prometheus would hold back from requiring recent versions of go-kit/kit, someone in the dependency tree will. The probability might be approaching 1 slightly more slowly, but go-kit/kit (or parts thereof) are popular enough to not really make a dent in the big picture.
If this is a problem that can be solved, it needs to be solved in go-kit/kit. But I do think this is a bit of a wild-goose chase. I think it's the nature of the Go dependency management as it has evolved that you will have to look at most of the popular open source projects written in Go to resolve the dependencies (and at most of their releases on top of that, see our little problem of pseudo-circular dependencies). You'll have half of the universe in your cache eventually, and I guess "that's fine" 🐶 |
And if that is really an issue, we should keep the vendor directories in our projects, as that prevents users from having these dependencies in their cache. |
One of the major issues with Go modules is that even if a dependency is not part of the final artifact, Go will still download it (or parts of it, based on whether it has a go.mod file or not). This is particularly annoying if you use go mod download in a dockerfile for example to propagate cache: then it will download everything. I agree that go-kit should probably improve the situation (by splitting up the monorepo or by introducing submodules) |
I'm a newcomer to Go, so unfortunately I can't offer any solutions, but I wanted to note that an exploded dependency tree is also problematic in organizations where all direct and indirect dependencies must go through open source license review. So far in my review, all of the dependencies have permissive licenses, but it's a lot to ask of our lawyers. The important metric for me is the number of dependencies (
I understand the nuance of dependency being in go.sum versus actually being compiled, but it doesn't make a difference for license review (since you're just 1 import away from using the code once it's in |
@peplin you can strip down the number of licenses to review by using this tool: https://github.com/mitchellh/golicense Under the hood it uses
I wonder why. If a piece of code never makes it to the binary why is it a subject of a license check?
Isn't that true if an import is not in |
The reason I initially reported this was that we were doing the same license checking as @peplin. I agree with @sagikazarmark that it's not really valid; we switched to Despite licenses there is still some impact to these dependencies, although it is much smaller than I originally thought; basically just go.sum noise which is irrelevant, and |
We in the Caddy project would also like to see the dependency tree shrink considerably. It makes development more tedious, and every single line of that go.sum represents a new point of failure. All it takes is one mistake in the huge chain (a bad commit, network error, repository misconfiguration, go.mod misconfiguration, you name it) and the Go tooling stops working on the whole project, making it practically impossible to dev with. |
etcd 3.5 (which properly adopts modules, and is itself a dependency of go-kit; see etcd-io/etcd#12498) should improve the situation considerably. But the ultimate solution for this particular issue would be extracting the logging library from the core go-kit repository. |
This will be fixed upstream in Go 1.17: golang/go#36460 |
go-kit/kit#1055 might also help |
Just a note to correct an unfortunately extremely common misconception: |
That being said, |
This right here is my main issue with huge monorepos or dependency-thirsty projects. You can still have a single git repo codebase for multiple modules in Go. I've also seen pgx (jackc/pgx#977) use go-kit for the logging bits of code. pgx is then used by GORM, so dependency hell can easily propagate.
Everything below is just a rant.
It is a shame not to see more developers take this (managing your dependencies) seriously. It adds unnecessary bloat and risks: what happens when an N-th degree, random dependency is removed and your build pipelines start failing? How long will it take for the fix to propagate? Can you maintain or replace a dependency if it is no longer maintained? If the code you need from a monolith repo is a manageable amount, and not likely to need updates, then just copy it into your codebase. |
@gouthamve the Oh! yes, see the kubernetes PR above of what we end up seeing, this is bad! please help fix! |
We are monitoring the upstream situation and we will move to the independent logging when we feel the time is right, in cooperation with go kit maintainers. |
For the record - #302 (comment) thanks @roidelapluie and @peterbourgon for your quick responses to close out the work in progress PR. Looking forward to seeing progress on this issue in due course. |
The go-kit/log is stable, which means that pull requests are welcome. |
thanks @roidelapluie filed #304 |
closed by #304 |
released in v0.26.0 |
thanks a ton @roidelapluie |
Thanks for the great work here! The work in this repo is done, but the whole ecosystem needs to update in order for this to have any impact on projects. For us in particular, contrib.go.opencensus.io/exporter/prometheus -> github.com/prometheus/statsd_exporter -> github.com/prometheus/client_golang (done already!). I will send some PRs |
Pulling in the same changes as done in prometheus/common#255 Signed-off-by: John Howard <[email protected]>
Awesome, thanks @howardjohn ! |
here's a list of repos that depend on prometheus/common >= v0.11.0 and <= v0.25.0 : |
Yup, I've been working on fixing a few places where we have some recursive dependencies like |
thanks @SuperQ ! |
statsd exporter 0.20.3 is out with this change! |
github.com/go-kit/kit is only being used for the log packages and brings in a lot of dependencies as compared to github.com/go-kit/log. This would help determine how useful switching would be. For reference see kubernetes/kubernetes#102144 and prometheus/common#255 (comment)
Importing github.com/prometheus/common now causes import of 7 million lines of code since #242. Prior to that, only 1.5 MLOC were imported. This includes:
The root of this is ultimately github.com/go-kit/kit, which has recently started importing a ton of stuff. Note that client_golang has not yet updated, so the ecosystem is likely not yet fully impacted unless projects update prometheus/common directly.
We use it only for ~100 lines of logging code, so we can almost just replace it entirely. However, github.com/prometheus/common actually depends on older version(s) of github.com/prometheus/common due to circular imports. The circular import can be removed if the github.com/mwitkow/go-conntrack dependency is dropped. go-conntrack offers tracing and monitoring (using Prometheus). Our usage is only the tracing. Therefore, we could remove this circular import with a fork of go-conntrack if desired.