Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move CUDA-dependent packages to a separate repository #3135

Open
3 tasks done
Tracked by #3222
esteve opened this issue Mar 22, 2023 · 14 comments
Open
3 tasks done
Tracked by #3222

Move CUDA-dependent packages to a separate repository #3135

esteve opened this issue Mar 22, 2023 · 14 comments
Labels
status:stale Inactive or outdated issues. (auto-assigned) type:new-feature New functionalities or additions, feature requests.

Comments

@esteve
Copy link
Contributor

esteve commented Mar 22, 2023

Checklist

  • I've read the contribution guidelines.
  • I've searched other issues and no duplicate issues were found.
  • I've agreed with the maintainers that I can plan this task.

Description

As discussed in autowarefoundation/autoware#3222 (comment), in order to be able to build Autoware.Universe with the available packages in the ROS index, we decided to move those packages that depend on CUDA to a separate repository

Purpose

Be able to compile a subset of the Autoware.universe without depending on CUDA

Possible approaches

Move the the packages that depend on CUDA to a separate repository

Definition of done

There are no packages that depend on CUDA in Autoware.universe

@miursh
Copy link
Contributor

miursh commented Mar 23, 2023

Is there any options not to separate CUDA dependent packages?
Because of reasons below, I would not want to separate just a part of perception packages

  • If some of perception packages are placed in the different repository, it is very confusing for developers.
    • For example, when we want to add some change, we might need to make 3-4 PRs to different repository, such as new perception repo, autoware.unvierse, autoware_launch, and perception utils (new repo too?)
  • Very difficult to understand especially for new users
  • There might be some dependency between autoware.universe and separated packages. If there is many dependency, github CI might not work well.

@xmfcx
Copy link
Contributor

xmfcx commented Mar 24, 2023

Name of the new cuda packages repository

What are your naming choices about this new repository?

I'd go for cuda_autoware_pkgs, what are your suggestions?

Make naming consistent

Also after moving the packages, we should make the naming conventions uniform too.

And for that my proposal is:

  • cuda_trt_package_name
  • cuda_package_name (we don't have any cuda only packages yet, just convention)
  • cuda_tvm_package_name (this is also preserved for future, we will have a wide discussion on what to do with tvm packages in general soon)

Handling of TVM packages

Also we should move tvm packages to tvm_autoware_pkgs and they will contain tvm packages for non-cuda targets.

This topic will also be discussed by @ambroise-arm @angry-crab @mitsudome-r @esteve @kenji-miyake and me.

to separate or not to separate

Edit: @miursh @yukkysaito I've just saw your comment after refreshing. Yes, I will think about not separating too. I understand the concerns, this is an issue with all separated repositories.

@BonoloAWF BonoloAWF added the type:new-feature New functionalities or additions, feature requests. label Mar 24, 2023
@BonoloAWF BonoloAWF added this to the Mar - Apr Milestone milestone Mar 24, 2023
@kenji-miyake
Copy link
Contributor

I'm sorry for joining late.

To give you some additional information, splitting repositories helps us reduce CI time. There is no need to test all packages with CUDA.

container-suffix:
- ""
- -cuda

But I understand splitting repositories may increase maintenance costs and troubles.

I think it's okay not to split repositories right now, we should do the following at least:

I'll discuss how to proceed with this also in TIER IV.

@esteve
Copy link
Contributor Author

esteve commented Apr 18, 2023

@kenji-miyake another reason for splitting the CUDA-related packages to another repository is because when/if we submit Autoware to the ROS buildfarm that Open Robotics maintains (https://build.ros.org/), all the packages in the repository are submitted and they need to be distributable and not covered by a proprietary license.

However, following @mitsudome-r 's rationale of only submitting Autoware.core and not all of Autoware.universe to the ROS buildfarm, we can either leave CUDA packages in universe, or move them in a separate repository, but we can't migrate them to Autoware.core.

@kenji-miyake
Copy link
Contributor

However, following @mitsudome-r 's rationale of only submitting Autoware.core and not all of Autoware.universe to the ROS buildfarm, we can either leave CUDA packages in universe, or move them in a separate repository, but we can't migrate them to Autoware.core.

Yes, I think so, too.

@kenji-miyake
Copy link
Contributor

As a result of TIER IV's internal discussion, we'd like to propose the following actions:

  • To resolve the build issues in Debian package release, we'll rename existing packages, and don't move packages to another repository yet.
    • Having a unified naming convention can help us ignore specific packages in CI.
    • Moving packages to another repository has significant impacts, so we need step-by-step improvements.

For that,


@xmfcx @esteve cc @mitsudome-r What do you think about this?
If it looks good, we'd appreciate it if @esteve -san would work on these tasks as part of the Debian package issue.

Regarding the long-term milestones, I think as follows:

  • We should continue discussing how we should utilize TVM.
  • Each one should try to gain a deeper understanding around TVM through learning/experiments.

@xmfcx
Copy link
Contributor

xmfcx commented Apr 24, 2023

I agree that we can start by renaming.

  • For CUDA/TensorRT packages, use _trt.
  • For TVM packages, use _tvm.

I'd prefer prefix over suffix. But I'm OK either way.

Also merging of _nodes double-packages will simplify the maintenance in general.

@esteve
Copy link
Contributor Author

esteve commented May 15, 2023

I agree that we can start by renaming.

  • For CUDA/TensorRT packages, use _trt.
  • For TVM packages, use _tvm.

I'd prefer prefix over suffix. But I'm OK either way.

I also agree with using suffixes, makes it clear that they are an extension to the base packages or that they are based on other packages.

Also merging of _nodes double-packages will simplify the maintenance in general.

I'm afraid I don't follow you here, can you elaborate? Thanks.

@xmfcx
Copy link
Contributor

xmfcx commented May 15, 2023

@esteve
Copy link
Contributor Author

esteve commented May 15, 2023

@xmfcx thanks for elaborating.

I can't clearly remember what was the rationale for having _nodes packages, I think it was inherited from Autoware.Auto because Apex.AI wanted the core functionality separate from the node code because it was incompatible with Apex.OS, but I might be making this up, I don't remember.

Anyway, I agree with you, it'd simplify our tree, especially now that we're enforcing all nodes to be components, so they all have uniform instantiation.

@xmfcx
Copy link
Contributor

xmfcx commented May 15, 2023

I can't clearly remember what was the rationale for having _nodes packages, I think it was inherited from Autoware.Auto because Apex.AI wanted the core functionality separate from the node code because it was incompatible with Apex.OS, but I might be making this up, I don't remember.

From what I remember, it was to separate ROS libraries from the rest of the packages. And it's a good practice in theory because the library can also be reused by other packages too. But in the implementation, this wasn't adhered strictly. So I think it lost its purpose along the way.

Found the source:
https://autowarefoundation.gitlab.io/autoware.auto/AutowareAuto/contributor-guidelines.html#autotoc_md24

This design pattern helps to promote separation of concerns and code re-use. The core and the ROS node(s) can be implemented in separate packages; e.g. foo and foo_nodes. There are some trivial cases where a simple ROS Node that does not require a "core" are acceptable but these should be the exception, not the rule.

Also I used to be 3-tier before (but I think it's just a matter of semantics for the executable):
https://gitlab.com/autowarefoundation/autoware.auto/AutowareAuto/-/merge_requests/255#note_316711298

@stale
Copy link

stale bot commented Jul 15, 2023

This pull request has been automatically marked as stale because it has not had recent activity.

@stale stale bot added the status:stale Inactive or outdated issues. (auto-assigned) label Jul 15, 2023
@esteve
Copy link
Contributor Author

esteve commented Jan 21, 2025

@xmfcx @idorobotics @mitsudome-r sorry for reviving a stale issue, but it's not clear to me whether we reached an agreement here. I'm working on adding support for building CUDA-enabled packages in our buildfarm and it'd be good to sort this topic out. Thanks.

@xmfcx
Copy link
Contributor

xmfcx commented Jan 21, 2025

How to structure the separated CUDA packages. The reason to separate CUDA packages is because CUDA is hardware specific.

For both Universe and Core, 2 new repositories:

  • autoware/src/universe/autoware.universe_cuda
  • autoware/src/core/autoware.core_cuda (if necessary, in the future)

Additional:

  • We should update the installation instructions as well for separated packages.
  • Update the autoware.repos and autoware-nightly.repos files to have them.

Pros:

  • Easier to submit Universe to build farm
  • Easier to maintain CI
  • Easier to generate debian packages

Cons:

  • More work to maintain with Universe
    • Multiple PRs

@YoshiRi @miursh @youtalk @oguzkaganozt

In the future:

  • CUDA dependent packages should fail if there is no CUDA.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status:stale Inactive or outdated issues. (auto-assigned) type:new-feature New functionalities or additions, feature requests.
Projects
None yet
Development

No branches or pull requests

5 participants