Options for representing release tags without bloom? #92

Closed
mikepurvis opened this issue Dec 21, 2016 · 9 comments

@mikepurvis
Contributor

mikepurvis commented Dec 21, 2016

Following the discussion in #65 and with the additional context of my ROSCon talk, I'm hoping to reopen a discussion about how to represent release tags in a distribution.yaml file without requiring bloom or a second dedicated GBP repo. My overall goals are:

  • For internal-only repos where their only destination is into rosinstall_generator-based bundle workspaces, eliminate the bloom-release step, while retaining the catkin_prepare_release step (and the overall notion of per-package versioning).
  • An internal rosdistro should be able to contain conventional release stanzas for upstream packages, and have those interoperate with these unbloomed internal packages.
  • Maintain the idea of a "latest releases" build distinct from a "latest devel" build.
  • Minimize overall disruption to the current tooling (especially rosdistro and rosinstall_generator).

It was rightly pointed out in #65 that simply putting source repo URLs into the release stanza URL doesn't work for multi-package repos. So that's out, leaving a handful of other potential options:

  1. Add a new field to the source stanza for "tag of current release". This would also address rosinstall_generator#33 ("--upstream option broken for custom upstream tags"), the issue that rosinstall_generator (and now rosdistro_freeze_source) can currently only guess at the upstream tag which corresponds to a given release.
    • Drawback: doing/supporting this properly requires changes across a bunch of tools, though I don't think it's actually a regressing change, and the new field would obviously be optional.
  2. Allow the release stanza to exist with version string only and no URL (such a stanza would be used as it currently is by rosdistro_freeze_source and rosinstall_generator, but ignored when the URL is required, for example by rosdistro_build_cache).
    • Drawback: breaks rosdistro API, since release stanza no longer guaranteed to have URL field.
  3. Handle it entirely using the existing source stanza. Set version to the branch name as it is today, and make rosdistro_freeze_source able to do something like "latest tag", which would need to examine recent commits on each version branch, looking for the most recent commit which is a tagged version (unsure if this can be done efficiently).
    • Drawback: to lock into a specific release you need to either remove newer tags, or permanently set source/version to the tag you want to lock to. Losing the idea of a separate development and release version probably doesn't matter in this instance since holding back a version would probably apply to an upstream repo that you're not actively developing on anyway.
  4. Keep using bloom and a release repo, but shorten the time taken by modifying the tracks.yaml to eliminate all git-bloom-generate steps other than the first rosrelease one (example).
    • Drawback: doesn't really meet the main objective, users constantly prompted to update the default bloom actions.

Sorry for the big brain dump, but I'm hoping for some guidance here on how to best move this forward in a way that will work to benefit Clearpath's internal tooling, other users doing similar things, and the upstream repos that make up public ROS.

@dirk-thomas
Member

dirk-thomas commented Dec 21, 2016

I took the liberty of editing your post and changed the unordered list into an ordered one. That makes it easier to refer to the proposed options.

I have a few questions to clarify the goals. Some options can maybe be ruled out based on the answers:

  • How should the tag name in the source repo pointing to the latest release be updated? If bloom is being used I assume it can do that together with what it already does in the generated PR. But if bloom is not used, is the expectation that the user creates a PR to update that release tag manually?

  • Should it be possible to specify a release tag without having a source entry?

  • Should it be possible to have a "normal" release entry in parallel to the latest release tag name? If yes, that would rule out option 2.

In general I am not in favor of requiring heuristic / crawling of any kind (as described in option 3). Simply because of the complexity as well as the non-deterministic result.

Also just to mention it: the approach of referring to a release using a tag in the source repo doesn't work for packages which rely on additional information being augmented in the release repo.

@mikepurvis
Contributor Author

How should the tag name in the source repo pointing to the latest release be updated? If bloom is being used I assume it can do that together with what it already does in the generated PR. But if bloom is not used, is the expectation that the user creates a PR to update that release tag manually?

Some combination of manual commits or a new lightweight tool. We (Clearpath) already are unable to take advantage of bloom's automatic PR generator since our rosdistro is on an internal GitLab instance, and prior to that on Bitbucket (see ros-infrastructure/bloom#257), so having to manually mutate the distribution.yaml is familiar territory.

Should it be possible to specify a release tag without having a source entry?

Not a url-less tag, as proposed in option 2. The first three states are valid today; in a world where option 2 has been implemented, the fourth state becomes valid, but it would never be valid to specify a release version without having either a release bloom URL specified or a source URL specified:

  • source entry present (cache contains only source repo package xmls)
  • release entry present, with bloom URL (cache contains package xmls from GBP tags)
  • source and release entry present (sum of the above)
  • source entry present, release entry present, but without bloom URL (cache contains source repo package XMLs from the source version, and also release package XMLs from the source repo, but at the release version pointer), e.g.:
control_msgs:
  release:
    version: 1.3.1  # Note no debian increment, since this is a source repo tag.
  source:
    type: git
    url: https://github.com/ros-controls/control_msgs.git
    version: indigo-devel
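
The four states listed above can be told apart mechanically. The following is a minimal sketch only, assuming each repository entry has already been loaded into a plain dict; `classify_entry` and the state names it returns are hypothetical and are not part of the rosdistro API.

```python
def classify_entry(entry):
    """Return which of the four states a repository entry is in.

    `entry` is a plain dict shaped like one repository block of a
    distribution.yaml. The function name and the returned labels are
    hypothetical, used here for illustration only.
    """
    source = entry.get('source') or {}
    release = entry.get('release') or {}
    has_source = bool(source.get('url'))
    has_release = bool(release)
    has_release_url = bool(release.get('url'))

    # Never valid: a release version with neither a bloom URL nor a source URL.
    if has_release and not has_release_url and not has_source:
        raise ValueError('release version given without a release or source URL')
    if has_source and not has_release:
        return 'source-only'
    if has_release_url and not has_source:
        return 'release-only'
    if has_source and has_release_url:
        return 'source-and-release'
    if has_source and has_release:
        # The new state that option 2 would make valid.
        return 'source-with-urlless-release'
    raise ValueError('entry has no usable source or release stanza')


control_msgs = {
    'release': {'version': '1.3.1'},
    'source': {
        'type': 'git',
        'url': 'https://github.com/ros-controls/control_msgs.git',
        'version': 'indigo-devel',
    },
}
state = classify_entry(control_msgs)
```

A tool that requires the release URL (such as a cache builder) could skip entries in the last state, while tools that only need the version string could still use them.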

Of course the interesting/tricky thing here is ensuring that this is still a valid release cache state, since if the release tag and source branch repo states have substantially diverged, it may not be enough to just look up package.xml files in the same paths, but rather the repo should be cloned again and re-spidered.

That being the case, maybe it's silly to try to repurpose the release stanza in this way, and something more in line with option 3 is the least disruptive way forward?

Should it be possible to have a "normal" release entry in parallel to the latest release tag name? If yes, that would rule out option 2.

I'm not totally sure what the scenario is here; could you supply an example?

the approach of referring to a release using a tag in the source repo doesn't work for packages which rely on additional information being augmented in the release repo.

Yes, packages which have those requirements would have to be bloomed as usual. And indeed, interoperability is important here— on the far side of the changes proposed, I would expect our distribution.yaml to be a mix of release stanzas referencing GBP repos (for upstream/open stuff) and then internal repos.

@mikepurvis
Contributor Author

mikepurvis commented Dec 22, 2016

A combination of option 1 and option 3 (call it 5) would be to add a new attribute to the source stanza, but not otherwise change much about how cache building or rosinstall_generator works, eg:

control_msgs:
  source:
    type: git
    url: https://github.com/ros-controls/control_msgs.git
    version: indigo-devel
    release_version: 1.3.1

In a world where this is a valid distribution.yaml, and we want to generate a "latest releases" build from a file which looks like this, the process would be:

  1. Clone the rosdistro as-is.
  2. Switch the version strings to be the release version strings.
  3. Update a local rosdistro cache to catch cases where the package XMLs have changed between current development and latest release.
  4. Do a rosinstall_generator --upstream-development as with a latest devel build, but now it's the release tags.
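
Step 2 of this process could look roughly like the sketch below. It assumes the distribution file has already been parsed into a dict with a top-level `repositories` key; `pin_to_release_versions` is a hypothetical helper, and `release_version` is the attribute proposed in this comment, not an existing rosdistro field.

```python
import copy


def pin_to_release_versions(distribution):
    """Return a copy of `distribution` with source versions set to release tags.

    Repositories without a source stanza, or without the proposed
    `release_version` attribute, are left unchanged.
    """
    pinned = copy.deepcopy(distribution)
    for repo in pinned.get('repositories', {}).values():
        source = repo.get('source')
        if source and 'release_version' in source:
            # Point the source checkout at the release tag instead of the branch.
            source['version'] = source['release_version']
    return pinned


distribution = {
    'repositories': {
        'control_msgs': {
            'source': {
                'type': 'git',
                'url': 'https://github.com/ros-controls/control_msgs.git',
                'version': 'indigo-devel',
                'release_version': '1.3.1',
            },
        },
    },
}
pinned = pin_to_release_versions(distribution)
```

Working on a deep copy keeps the original development versions intact, so the same parsed file can feed both the "latest devel" and "latest releases" builds.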

This has the big advantage of requiring very little modification to the existing tooling— indeed, it would even be possible for us to just have an out-of-band distribution-releases.yaml file with this extra info, but that would be much less convenient than having it right there in the distribution.yaml.

I guess the main question would be what follow-on changes would be possible/desirable to take better advantage of a new source/release_version attribute? Some possibilities:

  • When bloom receives a custom tag/hash at release time, it could populate the field in the resulting PR so that information is not lost as it currently is.
  • Make rosdistro_freeze_source --release-tag use the field if available instead of guessing from release version as it does today.
  • Make rosinstall_generator --upstream use the field if available instead of guessing from the release version as it does today.

@dirk-thomas
Member

Should it be possible to have a "normal" release entry in parallel to the latest release tag name? If yes, that would rule out option 2.

I'm not totally sure what the scenario is here; could you supply an example?

E.g. I have released a repo using bloom but used crazy tag names somehow. If the distribution file would still let me mention that release tag name in the upstream repo that could be helpful. Then the user can decide where to get the code from (gbp, source-tag, source-branch).

@dirk-thomas
Member

Adding an additional release_version attribute to the source entry (which must point to either a tag or hash) sounds good. But independent of how we specify this, shouldn't the cache also contain the pkg locations and manifests for that case (since they might be different from the version branch)? This information would be necessary for the same reasons as why we have a cache for the source part in the first place.

@mikepurvis
Contributor Author

mikepurvis commented Dec 22, 2016

I have released a repo using bloom but used crazy tag names somehow.

Yes, a perfect example of this is geographic_info, as given in ros-infrastructure/rosinstall_generator#33. So with release_version, its distribution entry would be like the following, which would allow rosinstall_generator --upstream to do the right thing for this repo (it currently does not):

  geographic_info:
    release:
      packages:
      - geodesy
      - geographic_info
      - geographic_msgs
      tags:
        release: release/indigo/{package}/{version}
      url: https://github.com/ros-geographic-info/geographic_info-release.git
      version: 0.4.0-0
    source:
      type: git
      url: https://github.com/ros-geographic-info/geographic_info.git
      version: master
      release_version: geographic_info-0.4.0
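
To illustrate how a tool might prefer the new field over guessing, here is a minimal sketch. `upstream_tag` is a hypothetical helper, and the fallback (stripping the trailing increment from the release version) is only an assumption about what "guessing" amounts to; it is not taken from the rosinstall_generator source.

```python
def upstream_tag(repo_entry):
    """Pick the upstream tag for a repository entry (a plain dict).

    Prefers the proposed source/release_version attribute and only
    falls back to a guess when that attribute is absent.
    """
    source = repo_entry.get('source') or {}
    release = repo_entry.get('release') or {}
    if source.get('release_version'):
        return source['release_version']
    version = release.get('version')
    if version is None:
        return None
    # Assumed guess: drop the trailing "-<increment>" packaging suffix.
    return version.rsplit('-', 1)[0]


geographic_info = {
    'release': {'version': '0.4.0-0'},
    'source': {
        'type': 'git',
        'url': 'https://github.com/ros-geographic-info/geographic_info.git',
        'version': 'master',
        'release_version': 'geographic_info-0.4.0',
    },
}
tag = upstream_tag(geographic_info)
```

With the field present the custom tag name is returned as-is; without it, the guess for this entry would be `0.4.0`, which does not match the repository's actual tag naming.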

The cost of caching the release_version package XMLs is probably not terribly high, especially if in the public ROS case there aren't many repos where it is even used. And of course they would deduplicate well, just as the source/release ones do already, so the storage cost wouldn't be that high either.

My only concern about trying to do that right off the bat is that we'd be pretty much adding a new, third code path to the rosinstall_generator/rosdistro_build_cache logic, whereas the proposal above allows such an attribute to be added and useful immediately, with an expectation that a user who needs "perfect" package XML resolution will need to rebuild a local "source releases" cache in order to get it.

@dirk-thomas
Member

The cost of caching the release_version package XMLs is probably not terribly high, especially if in the public ROS case there aren't many repos where it is even used. And of course they would deduplicate well, just as the source/release ones do already.

Once bloom adds / updates release_version when it generates the release PR, I would expect this to change, and sooner or later all releases will have a release_version (probably some even without having a source.version).

My only concern about trying to do that right off the bat is that we'd be pretty much adding a new, third code path to the rosinstall_generator/rosdistro_build_cache logic, whereas the proposal above allows such an attribute to be added and useful immediately, with an expectation that a user who needs "perfect" package XML resolution will need to rebuild a local "source releases" cache in order to get it.

You are right that the decision about caching or not is not tied to actually storing the tag name somehow. I would just expect the information to be useful for the same reasons why you added source caching in the first place. Could the operations be done without a cache? Sure, but it would likely be painfully slow and resource intensive 😉

This would be the rough outline for adding source.release_version:

  • write an REP about the proposed change (could be done after having prototyped the implementation)
  • python-rosdistro needs to read this new optional attribute and expose it in the API
  • it needs to be checked if python-rosdistro as well as other tools (e.g. ros_buildfarm) are able to handle the case where a source entry contains a type, url, and release_version but no version correctly.
    • if any tool needs to be modified and deployed to all users before these data can be added to the distribution file, that might slow down the rollout
  • optionally add caching for the release_version

Anything I am missing?

@mikepurvis
Contributor Author

Plan sounds good.

Keeping the existing version attribute as compulsory would considerably simplify matters, and avoid any possibility of a regression. If we do want to make it optional, perhaps there's a way to design the Python side of the rosdistro API to minimize breakage, eg in Repository, make the setup something like:

    self.doc_repository = DocRepositorySpecification(self.name, doc_data) if doc_data else None
    self.release_repository = ReleaseRepositorySpecification(self.name, release_data) if release_data else None
    self.source_repository = SourceRepositorySpecification(self.name, source_data) if source_data and source_data.get('version') else None
    self.source_release_repository = SourceReleaseRepositorySpecification(self.name, source_data) if source_data else None

That diverges the Python API from the yaml structure, perhaps to an unacceptable degree, but it does also maintain the present contract that source_repository will only be an object that has a valid version and url.

(Another, less intense workaround might be to simply return the value of release_version in the API for version, if version is unset.)
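
That lighter workaround might be sketched as follows. `SourceSpec` is a hypothetical stand-in written for this example, not the actual rosdistro class, and it assumes the source stanza arrives as a plain dict.

```python
class SourceSpec:
    """Hypothetical stand-in for a source repository specification."""

    def __init__(self, name, data):
        self.name = name
        self.type = data['type']
        self.url = data['url']
        self._version = data.get('version')
        self.release_version = data.get('release_version')

    @property
    def version(self):
        # Fall back to release_version when version is unset, so callers
        # that only read .version keep getting a usable value.
        if self._version is not None:
            return self._version
        return self.release_version


only_release = SourceSpec('control_msgs', {
    'type': 'git',
    'url': 'https://github.com/ros-controls/control_msgs.git',
    'release_version': '1.3.1',
})
both = SourceSpec('control_msgs', {
    'type': 'git',
    'url': 'https://github.com/ros-controls/control_msgs.git',
    'version': 'indigo-devel',
    'release_version': '1.3.1',
})
```

The trade-off is that callers can no longer tell from `version` alone whether they received a branch or a tag; they would need to compare it with `release_version` to find out.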

@mikepurvis
Contributor Author

To provide an update here: We've had some more discussions internally, and things are still a bit in flux, but I'm not sure we'll be able to make the (not insignificant) investment in tooling proposed above. The reality is, option 3 gets us where we need to go, and although it is inefficient to do it in the general git case (a local clone is required), all three of GitHub, GitLab, and Bitbucket have simple API calls to grab a list of commit hashes on a branch, so the actual cost of looking up the most recent tag is pretty low.

https://developer.github.com/v3/repos/commits/#list-commits-on-a-repository

https://developer.atlassian.com/bitbucket/api/2/reference/resource/repositories/%7Busername%7D/%7Brepo_slug%7D/commits

https://docs.gitlab.com/ce/api/commits.html#list-repository-commits
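
Once the commit list is in hand, the "most recent tagged commit" lookup itself is small. The sketch below assumes the commit hashes (newest first) and a hash-to-tag mapping have already been fetched from one of the APIs above; the fetching is not shown, and `latest_tagged_commit` is a hypothetical helper.

```python
def latest_tagged_commit(commit_shas, tags_by_sha):
    """Return (sha, tag) for the newest tagged commit, or None.

    commit_shas: commit hashes on the version branch, newest first.
    tags_by_sha: mapping of commit hash to tag name.
    """
    for sha in commit_shas:
        tag = tags_by_sha.get(sha)
        if tag is not None:
            return sha, tag
    # No commit in the examined window carries a tag.
    return None


# Made-up hashes and tags, purely for illustration.
commits = ['c3', 'c2', 'c1']
tags = {'c2': '1.3.1', 'c1': '1.3.0'}
result = latest_tagged_commit(commits, tags)
```

Because only the commits that were fetched are examined, a branch whose last tag lies further back than that window would yield `None`, so a caller would need to decide whether to fetch more history or report an error.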
