Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Go 1.22 vendored workspaces support #553

Merged
merged 24 commits into from
Jan 7, 2025

Conversation

eskultety
Copy link
Member

@eskultety eskultety commented Jun 10, 2024

Go 1.22 introduced another toolkit feature that affects cachi2 directly - vendored workspaces meaning that projects can now combine the vendoring feature (local module replacements) and workspaces (controlling local modules centrally wrt/ dependencies and language version requirements).

Go enabled this via a new subcommand go work vendor which needs to be run instead of the existing go mod vendor. This looks like the only difference and so we should be able to address it very simply by running that modified vendoring command when we detect the project uses workspaces.

Given the low complexity of the code no design doc accompanies this PoC as the changes should be straightforward, provided we finish the workspaces support in #457.

Note this PoC is based on work proposed in #457 so as such we need to wait for that PR before we consider this to be merged.

EDIT: Our existing workspaces code didn't account for the fact that go list isn't workspace aware and so we've been missing some module packages in the output SBOM (as noted by comment #553 (comment)). This PR addresses that.

References: #398

Notes:

  • I successfully built a modified (re-enabled workspaces which upstream disables by the means of GOWORK=off) downstream version of hypershift
  • I successfully prefetched upstream kubernetes and grafana repos

Maintainers will complete the following section

  • Commit messages are descriptive enough
  • Code coverage from testing does not decrease and new code is covered
  • Docs updated (if applicable)
  • Docs links in the code are still valid (if docs were updated)

Note: if the contribution is external (not from an organization member), the CI
pipeline will not run automatically. After verifying that the CI is safe to run:

@eskultety eskultety mentioned this pull request Jun 17, 2024
4 tasks
@eskultety
Copy link
Member Author

eskultety commented Jun 26, 2024

Since v1:

@chmeliik @brunoapimentel I'm ambivalent on the decision to test this in an e2e test (I did make sense to me to verify building with everything vendored, i.e. we didn't need to fetch anything, also there's a lack of real projects to test the feature with), so let me know if you want me to model it as a basic integration test or e2e is fine.

Note: Due to the ^above the integration test currently points to my own fork and based on the decision and the overall acceptance of the integration test I will push the change to the test origin repo.

@eskultety eskultety changed the title POC Go 1.22 support (including vendored workspaces) Go 1.22 support (including vendored workspaces) Jun 26, 2024
@@ -1525,6 +1527,7 @@ def parse_module_line(line: str) -> ParsedModule:
def _vendor_deps(
go: Go,
app_dir: RootedPath,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we still need to handle the app_dir vs. workspace root mismatch

                packages=({"path": "hi/hiii", "type": "gomod"},),
                flags=["--gomod-vendor"],

This means that the app_dir will be hi/hiii and so cachi2 will try to parse hi/hiii/vendor/modules.txt, which doesn't exist. We should parse the root vendor/modules.txt.

Copy link
Contributor

@chmeliik chmeliik Jun 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested with https://github.com/cachito-testing/gomod-pandemonium/tree/go-1.22-workspace-vendoring

cachi2 --log-level debug fetch-deps '{"type": "gomod", "path": "."}' --gomod-vendor
cp cachi2-output/bom.json root-bom.json

cachi2 --log-level debug fetch-deps '{"type": "gomod", "path": "terminaltor"}' --gomod-vendor
cp cachi2-output/bom.json submodule-bom.json

get_module_purls() {               
    jq < "$1" '.components[].purl | select(test("type=module"))' -r | sort -u
}

comm -23 <(get_module_purls root-bom.json) <(get_module_purls submodule-bom.json)
pkg:golang/github.com/Azure/[email protected]?type=module
pkg:golang/github.com/google/[email protected]?type=module
pkg:golang/github.com/go-task/[email protected]?type=module
pkg:golang/github.com/Microsoft/[email protected]?type=module
pkg:golang/golang.org/x/[email protected]?type=module

^ currently, these are only identified if you're processing the workspace root

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even after fixing the modules.txt parsing, there's still a problem with reporting packages:

get_nonstd_purls() {                    
    jq < "$1" '.components[].purl | select(test("^pkg:golang/[^/]*\\.[^/]*/"))' -r | sort -u
}

comm -23 <(get_nonstd_purls root-bom.json) <(get_nonstd_purls submodule-bom-fixed.json)
pkg:golang/github.com/cachito-testing/[email protected]?type=package
pkg:golang/github.com/cachito-testing/gomod-pandemonium/[email protected]?type=package
pkg:golang/github.com/cachito-testing/retrodep/v2/retrodep/[email protected]?type=package
pkg:golang/github.com/cachito-testing/retrodep/v2/[email protected]?type=package
pkg:golang/github.com/Masterminds/[email protected]?type=package
pkg:golang/github.com/op/[email protected]?type=package
pkg:golang/github.com/pkg/[email protected]?type=package
pkg:golang/golang.org/x/sys/[email protected]?type=package
pkg:golang/golang.org/x/tools/go/[email protected]?type=package
pkg:golang/gopkg.in/[email protected]?type=package

It looks like go list ./... isn't "workspace-aware", so we will need to run that in the workspace root as well. We can probably make that part of the bigger rework

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed in latest revision. Should be resolved I think.

@chmeliik
Copy link
Contributor

Since v1:

* rebased on top of [Add support for go workspaces #457](https://github.com/containerbuildsystem/cachi2/pull/457)

* added an e2e integration test (to verify that a project can actually be built with a vendored workspace)

@chmeliik @brunoapimentel I'm ambivalent on the decision to test this in an e2e test (I did make sense to me to verify building with everything vendored, i.e. we didn't need to fetch anything, also there's a lack of real projects to test the feature with), so let me know if you want me to model it as a basic integration test or e2e is fine.

Note: Due to the ^above the integration test currently points to my own fork and based on the decision and the overall acceptance of the integration test I will push the change to the test origin repo.

Basic would have been enough IMO, but there's no harm in keeping it e2e when it's already e2e

@pierDipi
Copy link

pierDipi commented Aug 8, 2024

@eskultety do you know when we can start using Go 1.22 in Konflux ?

I'm getting

2024-08-08 09:19:07,145 INFO go.mod reported versions: '1.22'[go], '1.22.6'[toolchain]
2024-08-08 09:19:07,145 ERROR Failed to fetch gomod dependencies
2024-08-08 09:19:07,149 ERROR PackageManagerError: Required/recommended Go toolchain version '1.22.6' is not supported yet.
Error: PackageManagerError: Required/recommended Go toolchain version '1.22.6' is not supported yet.
  Please lower your required/recommended Go version and retry the request. You may also want to open a feature request on adding support for this version.

and I'm assuming the feature request has been already requested somewhere given this PR

@eskultety
Copy link
Member Author

@eskultety do you know when we can start using Go 1.22 in Konflux ?

I'm getting

2024-08-08 09:19:07,145 INFO go.mod reported versions: '1.22'[go], '1.22.6'[toolchain]
2024-08-08 09:19:07,145 ERROR Failed to fetch gomod dependencies
2024-08-08 09:19:07,149 ERROR PackageManagerError: Required/recommended Go toolchain version '1.22.6' is not supported yet.
Error: PackageManagerError: Required/recommended Go toolchain version '1.22.6' is not supported yet.
  Please lower your required/recommended Go version and retry the request. You may also want to open a feature request on adding support for this version.

and I'm assuming the feature request has been already requested somewhere given this PR

@pierDipi do you use vendored workspaces? If so, this work is still in progress, if not, this PR is not the right place to discuss - #591

@eskultety eskultety changed the title Go 1.22 support (including vendored workspaces) Go 1.22 vendored workspaces support Dec 8, 2024
@eskultety
Copy link
Member Author

eskultety commented Dec 8, 2024

Since v2:

  • complete rework
  • address the fact that go list isn't workspace aware and run it under every workspaced module's context and aggregate the package lists
  • refactor _resolve_gomod such that the related logic is separated into tiny helpers, improving the test isolation in return
  • refactor _test_resolve_gomod to rely more on mocked output rather than carefully constructed ordered list of low level go commands
  • updated the unit test gomod mocking script and generated data with go 1.23 and with properly set up workspaces repository

Copy link
Collaborator

@a-ovchinnikov a-ovchinnikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some first-pass observations

cachi2/core/package_managers/gomod.py Outdated Show resolved Hide resolved
cachi2/core/package_managers/gomod.py Outdated Show resolved Hide resolved
cachi2/core/package_managers/gomod.py Outdated Show resolved Hide resolved
cachi2/core/package_managers/gomod.py Outdated Show resolved Hide resolved
tests/unit/package_managers/test_gomod.py Show resolved Hide resolved
Copy link
Collaborator

@a-ovchinnikov a-ovchinnikov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with several non-blocking nitpicks, suggestions and a couple of small questions. The change is about refactoring existing code, the bulk of it is test data update.

hack/mock-unittest-data/gomod.sh Outdated Show resolved Hide resolved
cachi2/core/package_managers/gomod.py Show resolved Hide resolved
cachi2/core/package_managers/gomod.py Outdated Show resolved Hide resolved
cachi2/core/package_managers/gomod.py Outdated Show resolved Hide resolved
cachi2/core/package_managers/gomod.py Outdated Show resolved Hide resolved
cachi2/core/package_managers/gomod.py Show resolved Hide resolved
@brunoapimentel
Copy link
Contributor

brunoapimentel commented Jan 3, 2025

Wow, great (and brave) job on doing all this refactoring. The overall solution LGTM.

I fetched the dependencies on K8s using both your branch here and a slightly modified version of main (just removing the workspace vendoring restriction and using go work vendor instead of go mod vendor), and with your branch, the resulting SBOM contained 290 extra unique dependencies.

I will do a more nitpickish review on Monday and give you my final ack.

@eskultety eskultety force-pushed the go122_workspaces branch 2 times, most recently from b26ca1f to 92374c6 Compare January 6, 2025 12:57
cachi2/core/package_managers/gomod.py Show resolved Hide resolved
cachi2/core/package_managers/gomod.py Outdated Show resolved Hide resolved
tests/unit/package_managers/test_gomod.py Outdated Show resolved Hide resolved
cachi2/core/package_managers/gomod.py Show resolved Hide resolved
Returns the name of the gomod test mocking directory that will be
useful in future gomod testing patches.

Signed-off-by: Erik Skultety <[email protected]>
- replace the class-level Go() mock in the function signature with a
  local mock object
        --> there has to be a reason for class definitions to be mocked

- adjust workspace tests module paths
        --> reduces confusion on workspace directory vs workspace
            module directory

Signed-off-by: Erik Skultety <[email protected]>
- drop the "nested workspaces" test case
    --> it's unclear how that was supposed to work, i.e where would the
        go.work be hosted if the main module was in the 'mainmod'
        subdirectory which means we wouldn't even find the go.work file
        if app_dir == 'mainmod'
    --> nested workspaces don't seem to be supported in a clean way
        (i.e. without manual intervention) [1]

- adjust module path strings so that the test paths follow the
  test ID: 'app_root_is_workspace'

- replace RootedPath(tmp_path) occurrence with rooted_tmp_path fixture

[1] golang/go#53938

Signed-off-by: Erik Skultety <[email protected]>
Future patches will make changes to both the mocked data as well as the
mock script that generates it, so this way it's going to be clearer how we
got the mock changes and what caused them. Besides, our mock data were
created with version of Go that is no longer supported by upstream, so
it's time to catch up anyway with the latest language version release.

Signed-off-by: Erik Skultety <[email protected]>
The original workspaces integrating work happened around the 1.21
release targeting only the 1.18 behaviour which introduced the concept
of workspaces. However, 1.22 improved the concept and so we should test
with data specifically targeting the 1.22 changes.

Signed-off-by: Erik Skultety <[email protected]>
We need to list all packages in the tracked workspaces, however, that
cannot be done with a single

    $ go list -deps -json=-json=ImportPath,Module,Standard,Deps ./...

command because 'go list' isn't fully workspace aware and so we need to
run that command under every module tracked in a workspace to get the
full picture of the list of packages (which we need to report in the
SBOM).

Signed-off-by: Erik Skultety <[email protected]>
_vendor_deps:
    - drop the mention of 'can_make_changes' argument - the actual
      argument was dropped already, this is just a docstring leftover
    - is no longer allowed to make vendor-related changes to the
      repository directory. This is a tiny leftover from commit
      ba75c2b

_parse_workspace_module:
    - The replace info being relative to the processed module doesn't
      sound right:

        dir: <snip>/kubernetes.git/staging/src/k8s.io/sample-controller
        replace: ./staging/src/k8s.io/sample-controller

    It's always relative to the workspace, i.e. the go.work file.
    Therefore, adjust the doc string.

Fixes:
  - ba75c2b
  - 6f6bb42

Signed-off-by: Erik Skultety <[email protected]>
The argument in question in this function is 'main_module_version'
which hasn't been used since introduced in commit 6f6bb42 .

Fixes: 6f6bb42

Signed-off-by: Erik Skultety <[email protected]>
Snippet from the docs [1]:

    If the vendor directory is present in the main module’s root
    directory, it will be used automatically if the go version in the
    main module’s go.mod file is 1.14 or higher. To explicitly enable
    vendoring, invoke the go command with the flag -mod=vendor. To
    disable vendoring, use the flag -mod=readonly or -mod=mod.

This bit from [2] also being important since we only ever used it with
the 'list' command:

    This would mean that go commands other than go mod ... and go get
    no longer update versions in go.mod. If go.mod needs to be fixed,
    these commands would report that error and stop. And the errors in
    go list should be reported only for the packages that can't be
    resolved - the overall go list operation should succeed.

With -mod readonly being the default.

[1] https://go.dev/ref/mod#vendoring
[2] golang/go#40728

Signed-off-by: Erik Skultety <[email protected]>
Execute all symlink checks in a single loop. This is a purely cosmetic
change.

Signed-off-by: Erik Skultety <[email protected]>
It makes more sense to validate the queried modules before querying
packages which requires another 'go' call. Not only that, such a 'go'
call will be more expensive in the future when supporting Go workspace
vendoring since all workspaces have to be traversed individually to
query packages.

Signed-off-by: Erik Skultety <[email protected]>
The alias based definition doesn't provide any links to documentation
nor was it accompanied with a descriptive commit message making it hard
to follow what's going on, more specifically, which bits are custom
implementation and which are pydantic primitives.

The simplification proposed by this patch follows the examples in the
official docs [1],[2].

[1] https://docs.pydantic.dev/latest/api/config/#pydantic.config.ConfigDict.alias_generator
[2] https://docs.pydantic.dev/latest/api/config/#pydantic.config.ConfigDict.populate_by_name

Signed-off-by: Erik Skultety <[email protected]>
The model represents the structure of the go.work file (relevant fields
only) which is going to be leveraged in future patches which will add a
wrapping class and logic over anything go.work related.

Signed-off-by: Erik Skultety <[email protected]>
Introduce a wrapping UserDict class over the ParsedGoWork model adding
convenience methods we are relying on in the code:
    - getting relative workspace module paths
    - querying the absolute path of the go.work file
    - querying the containing directory path of the go.work file

This wrapping class will enable us to isolate some of the workspace
related code which means we'll be able to declutter the
test_resolve_gomod behemoth of a test somewhat.

Signed-off-by: Erik Skultety <[email protected]>
The test case is covered by the recently added TestGoWork test class.

Signed-off-by: Erik Skultety <[email protected]>
Where previously we only passed a simple path to the go.work file we
now pass a GoWork instance to have better control of the data we need
to work with during _resolve_gomod. The code isn't necessarily more
simplified, but it's more isolated, meaning that we e.g. don't need to
pull workspace related data from parsed modules which will help the
making the unit test suite more resilient to code changes.

Additionally, update integration test data accordingly.

Signed-off-by: Erik Skultety <[email protected]>
The logic (and tests) have been fully fused into the GoWork wrapping
class and there are no more callers of the original left.

Signed-off-by: Erik Skultety <[email protected]>
Declutter the already hard-to-follow _resolve_gomod function.
Naturally, we have to pass some more arguments to the extracted logic
compared to the original inline version of it.

This patch also allows us to:
- test the logic in isolation
- simplify the existing unnecessarily complex negative test
- add a proper positive test
- reduce the horribly error-prone mock_run_cmd list of return values
  via side effects by isolating the functionality in a dedicated helper

Note the patch is best viewed with --color-moved.

Signed-off-by: Erik Skultety <[email protected]>
Makes the code more readable without having to pass a suspicious
command list around. This also further improves the unit test suite in
that individual Go commands should be isolated in dedicated
helpers/helper classes that are tested separately and mocked everywhere
else.

Signed-off-by: Erik Skultety <[email protected]>
This patch falls into the category of decluttering the already big
_resolve_gomod function. Same as with the recent _go_list_deps
introduction this function will make the unit tests more resilient and
isolated. While the function as proposed here doesn't do much at the
moment, most of the logic will be added in a future patch where we'll
iterate over all go workspace modules and query module packages, since
the 'go list' command isn't fully workspace aware.

Signed-off-by: Erik Skultety <[email protected]>
This patch builds upon the previous one by expanding the
_parse_packages helper function to query all module packages used by a
project.
So far we haven't been producing a complete SBOM when it comes to
workspaces since we only recently found out that the 'go list' command
isn't really workspace aware and so in order to build the full picture
of all packages inside all workspace modules, we have to run the
command in a loop in every workspace module and aggregate the results
(possibly with duplicate entries which will be filtered out by SBOM
 formatter) as opposed to simply running the

    'go list -e -deps ./...'

command from the workspace containing directory (i.e. the one with
go.work file), hoping that go would list the module packages
recursively.

This patch also needs more tweaks to the test_resolve_gomod function as
now with the isolation we need to mock all the go_list_deps calls made
in a loop from within _parse_packages.

The expected mocked gomod results data had to be adjusted as well due
to this patch. Additionally, related integration test data are updated
as well.

Signed-off-by: Erik Skultety <[email protected]>
Introduced by Go 1.22 and controlled via 'go work vendor'. We already
can work with plain module vendoring. Workspace vendoring is the same
but relying on the logic backed by the go.work file.
Until now we weren't able to process workspaces properly and correctly,
but with the recent changes introduced in previous patches to address
that we're now fully ready to flip the switch on vendored workspaces
support.

Signed-off-by: Erik Skultety <[email protected]>
@eskultety eskultety added this pull request to the merge queue Jan 7, 2025
Merged via the queue into containerbuildsystem:main with commit 724af38 Jan 7, 2025
14 checks passed
@eskultety eskultety deleted the go122_workspaces branch January 7, 2025 10:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants