Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New attributes for license tag #347

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

ralph-lange
Copy link
Contributor

... as basis for automatic creation of copyright files in binary packages according to https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/.

Questions to be discussed in this PR in particular:

  • Is the copyright attribute really required? If vendor packages would include the third-party source code directly, the copyright attribute would not be required.
  • Should this extension go into the existing version 3 (and thus REP-0149) or should the version number be increased?

Copy link
Contributor

@clalancette clalancette left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea overall, but I do think the implementation needs some changes to make it more generally useful.

rep-0149.rst Outdated
@@ -346,8 +346,8 @@ Example
</description>
<maintainer email="[email protected]">Someone</maintainer>

<license>BSD</license>
<license file="LICENSE">LGPL</license>
<license file="LICENSE">BSD-3-Clause</license>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One suggestion I have here is to add a type attribute to the license tag. The type attribute would describe what kind of license tag is being used, either an SPDX identifier, or a freeform identifier. If no type attribute is specified, we would assume freeform.

Thus you could have:

  • <license>BSD</license> - a common license tag today, which can be assumed to be a "freeform" license tag (and thus nothing can mechanically be determined about it)
  • <license type='freeform'>BSD</license> - which is the same as above but more explicit
  • <license type='spdx'>BSD-3-Clause</license> - which is very explicitly using the SPDX identifier, and thus can be mechanically verified

With this in place, we could put in linters (for instance) that would complain when the "type" is SPDX, but the actual text doesn't match a known SPDX identifier. We could also have linters that would complain if any packages in the workspace don't use the "spdx" type (though we'd have to do a lot of work to enable that by default).

Does that make sense?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it not be simpler to require the use of SPDX identifiers?

Other efforts to standardise licensing information (like REUSE) have gone that way and it makes parsing and validation much more straightforward.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, yes, we would require SPDX identifiers.

The issue is backwards compatibility. There are thousands of ROS (1 & 2) packages out there in the wild, and almost none of the package.xml files in them follow SPDX. Explicitly putting a type field in will make it easier for tools (and humans) to determine whether the field is expected to be SPDX or not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of introducing a type attribute. There are many ROS 1 packages that simply specify BSD as license identifier, which is not enough information to infer the exact license.

which can be assumed to be a "freeform" license tag (and thus nothing can mechanically be determined about it)

I would be a bit more optimistic for the linter and in particular for generating a debian/copyright file:

  1. <license file="path_to_license.txt">BSD</license>: This is enough information to generate a Debian copyright file. The generator (and the linter) should check whether path_to_license.txt really contains a BSD license text.
  2. <license>BSD</license> (or <license source-files="*">BSD</license>) with a file LICENSE in the root of the repo: Again, this is enough information to generate a Debian copyright file in my opinion.
  3. <license>Some SPDX full name or SPDX identifier</license>: Again enough information to generate a Debian copyright file (and to validate it against a potential LICENSE file in the package or repo root).
  4. <license file="custom_license_text.txt">MyCustomLicenseIdentifier</license>: Also this would allow the generation of a debian/copyright file according to https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/. ("If there are licenses present in the package without a standard short name, an arbitrary short name may be assigned for these licenses. These arbitrary names are only guaranteed to be unique within a single copyright file.")

Even the case that a debian/copyright file is generated where the license short name does not match the license text I do not consider dramatic: It simply repeats the inconsistent license names and texts from the source code and should be discovered by relevant tools applied to the debian/copyright file.

As a consequence, the type="spdx" attribute is not really necessary but its use would introduce a safety layer that prevents that a typo in an SPDX license identifier leads to the interpretation as a custom license identifier (i.e., as case 4).

What we must prevent in any case is that the future debian/copyright file generator refines the license specification for which there is no evidence - which is the case for some ROS 1 packages that specify <license>BSD</license> without any further license headers or LICENSE file.

Copy link
Contributor

@gavanderhoorn gavanderhoorn Apr 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally, yes, we would require SPDX identifiers.

The issue is backwards compatibility. There are thousands of ROS (1 & 2) packages out there in the wild, and almost none of the package.xml files in them follow SPDX. Explicitly putting a type field in will make it easier for tools (and humans) to determine whether the field is expected to be SPDX or not.

but don't we have the same problem with the attributes and elements proposed here? All the packages in existence today would also not have those.

if the package format version would be increased, tools could just assume "v4 manifest->spdx".

Anything below that does not (have to) conform to that.

If the concern is "there may be licenses in use for which there are no SPDX identifiers", then SPDX has a standard approach for that in place IIRC.


Edit: absence of the type attribute could of course also be used, but it seems much cleaner to me to be able to say "format 4 manifests must use SPDX identifiers" than "if the license tag doesn't have the type attribute then it's OK to not use SPDX identifiers".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for the type attribute.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for completeness: The SPDX Specification also describes a solution for naming custom licenses (cf. https://spdx.dev/spdx-specification-21-web-version/#h.1v1yuxt), but I still prefer the explicit type attribute to distinguish between licenses from the SPDX License List and custom ones.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type attribution with default looks good to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introduced the type attribute as proposed by @clalancette. (Sorry, I should have updated the PR before yesterday's discussion in the TSC meeting.)

Is the description of the new attribute sufficient or should I give explicit explanations of the two possible values freeform and spdx?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@clalancette, are you fine if I mark the discussion on the type attribute as resolved, cf. my last question on the explanation of the attribute in the REP?

rep-0149.rst Show resolved Hide resolved
rep-0149.rst Outdated Show resolved Hide resolved
@nuclearsandwich nuclearsandwich self-requested a review April 21, 2022 19:59
@ros-discourse
Copy link

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2022-04-21/25293/1

@ros-discourse
Copy link

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2023-01-19/29423/1

rep-0149.rst Outdated
@@ -346,8 +346,8 @@ Example
</description>
<maintainer email="[email protected]">Someone</maintainer>

<license>BSD</license>
<license file="LICENSE">LGPL</license>
<license file="LICENSE">BSD-3-Clause</license>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

type attribution with default looks good to me.

rep-0149.rst Outdated Show resolved Hide resolved
@ros-discourse
Copy link

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-4-20-2023/31087/1

rep-0149.rst Show resolved Hide resolved
<https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/>`_.
Consequently, more general tags should be given first.

``type="LICENSE-IDENTIFIER-TYPE"`` *(optional)*
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to eliminate this type attribute. I haven't seen it in any other package managers. We can still state that the license SHOULD be an SPDX identifier (and linter can throw a warning).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that this is important to have as a way to suppress linter errors. In particular, I think that we should change it to be spdx is the default type. As that should be our recommendation. But if you have a non-spdx license you can set it to be "freeform" explicitly and then the linter will be suppressed and not complain about the string not matching spdx.

This will encourage spdx by it being the shortest path to usage. And if they aren't using an spdx license people can be given a linting warning. And the fix for the linter is to either fix their license declaration to be spdx compatible, or declare that it's freeform with the non-default type attribute.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If spdx is the default that works 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am also in favor of making spdx the standard. @clalancette understandably argued that this would lead to very many error messages for older packages.

Idea: We make freeform the default (as currently stated in this PR and approved by the TSC yesterday) but make a prominent info message in the linter tooling if the type-attribute is not specified explicitly and the license identifier is not from the SPDX list. Furthermore, we raise awareness for the license documentation topic by a Discourse post and a ROSCon talk. Then, in about a year, we can consider changing the default of the type attribute to SPDX (by a new PR to this REP) to force the maintainers of the remaining packages to resolve/clarify the license documentation. What do you think, @amacneil, @clalancette ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for doing a followup/tick tock process to make it the default with strong linting/warning. But lets close this out first.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the way I would do this is the following:

  1. Make freeform the default for now.
  2. Raise awareness with the Discourse post and ROSCon talk.
  3. Update the core packages (everything in https://github.com/ros2/ros2/blob/rolling/ros2.repos) to use SPDX.
  4. Add in a linter warning in ament_copyright for packages that don't specify the type and the identifier is not in SPDX.
  5. Later on, consider changing the default to SPDX.

@ros-discourse
Copy link

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2023-05-18/31587/1

should be a very rare case - have to be replaced by a ``?`` in the filename
pattern as described at the end of Section 6.9 of the `Machine-readable
debian/copyright file specification V1.0
<https://www.debian.org/doc/packaging-manuals/copyright-format/1.0/>`_.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this required? Since you describe above that multiple space-separated paths are not allowed, I don't understand why we cannot support spaces in paths.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's how the Machine-readable debian/copyright file specification V1.0 states it. We could do differently, but then we would make the process how the package.xml is translated to the copyright file less transparent. And I consider spaces in source file/folder names an absolute exception.
@ct2034 has built a linter for the new package.xml format, which will of course include a check for space characters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we would simply replace space characters by ? during the generation of the copyright file, this could even change the set of matched files. This must be prevented. Therefore I would force those developers, who want to use space characters in the filenames/paths, to think about it themselves.

@ros-discourse
Copy link

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/ros-2-tsc-meeting-minutes-2023-06-15/32038/1

@ros-discourse
Copy link

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/changes-in-the-package-xml-about-licensing/32118/1

@ros-discourse
Copy link

This pull request has been mentioned on ROS Discourse. There might be relevant details there:

https://discourse.ros.org/t/changes-in-the-package-xml-about-licensing/32118/3

@peci1
Copy link
Contributor

peci1 commented Jul 3, 2023

Should this extension go into the existing version 3 (and thus REP-0149) or should the version number be increased?

I think increasing schema version is inevitable. Current catkin with format 3 cannot read the files with added type attribute.

@ralph-lange ralph-lange requested a review from tfoote July 9, 2024 09:43
Copy link

@ct2034 ct2034 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am happy with these changes. They match what was accepted in the TSC Meeting 2023-06-15. What are the next steps to get this merged?

@clalancette
Copy link
Contributor

I am happy with these changes. They match what was accepted in the TSC Meeting 2023-06-15. What are the next steps to get this merged?

In short: I think a big mistake was made here, and I was a major part of that mistake.

During the discussion for this REP, I pushed hard for making these changes without bumping the package.xml version number. My thinking at the time is that avoiding this bump would save us the (substantial) work across the ROS tooling to make the change.

However, I have been convinced in the meantime that it is a bad idea to add in new features like this, even in a backwards-compatible way. It's just too unclear whether a compliant implementation is using the "new" standard, or the "old" one.

Thus, I think this needs to be revisited with a thought to bumping the package.xml version format. I'm very sorry for my big mistake here, but I do think that we need to make this correct before implementing and merging this.

@ralph-lange
Copy link
Contributor Author

Hi @clalancette, no problem. Please let us know what has to be done for bumping the package.xml version number. Changing this PR to create a new REP document instead of updating REP-149, right? Plus a decision request to the ROS PMC?

@destogl
Copy link
Contributor

destogl commented Jan 15, 2025

Is there any news about next steps on this proposal?

@ct2034
Copy link

ct2034 commented Jan 15, 2025

Thanks for the ping @destogl. I wrote a summary in ros-infrastructure/bloom#709 (comment). I think I will now try to bring this topic in the Infra and / or ROS PMC. I will update you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.