Undefined references when using Trilinos built with Trilinos_ENABLE_EXPLICIT_INSTANTIATION=OFF #13005
Comments
Actually, I take back what I wrote about shared libraries: they are being generated even with ETI turned off, it is just that no shared libraries without version suffix are being installed for some reason. After manually creating those symlinks and linking again to all the libraries, the list of undefined references is greatly reduced, only one issue remains (in multiple similar instances):
Any hints what I might be missing? |
After digging a bit, I found that the issue arises from some explicit static definitions within Tpetra. The following patch fixes the problem for me:
diff --git a/packages/tpetra/core/compat/Tpetra_KokkosCompat_ClassicNodeAPI_Wrapper.cpp b/packages/tpetra/core/compat/Tpetra_KokkosCompat_ClassicNodeAPI_Wrapper.cpp
index 1a725fd6b59..b389be5ee36 100644
--- a/packages/tpetra/core/compat/Tpetra_KokkosCompat_ClassicNodeAPI_Wrapper.cpp
+++ b/packages/tpetra/core/compat/Tpetra_KokkosCompat_ClassicNodeAPI_Wrapper.cpp
@@ -1,6 +1,8 @@
#include <Tpetra_KokkosCompat_ClassicNodeAPI_Wrapper.hpp>
#include "Teuchos_ParameterList.hpp"
+#ifndef HAVE_TPETRA_EXPLICIT_INSTANTIATION
+
namespace Tpetra {
namespace KokkosCompat {
@@ -54,5 +56,5 @@ namespace KokkosCompat {
} // namespace KokkosCompat
} // namespace Tpetra
-
+#endif
diff --git a/packages/tpetra/core/compat/Tpetra_KokkosCompat_ClassicNodeAPI_Wrapper.hpp b/packages/tpetra/core/compat/Tpetra_KokkosCompat_ClassicNodeAPI_Wrapper.hpp
index ee06186d44e..70510b09f12 100644
--- a/packages/tpetra/core/compat/Tpetra_KokkosCompat_ClassicNodeAPI_Wrapper.hpp
+++ b/packages/tpetra/core/compat/Tpetra_KokkosCompat_ClassicNodeAPI_Wrapper.hpp
@@ -47,7 +47,11 @@ public:
KokkosDeviceWrapperNode () = delete;
//! Human-readable name of this Node.
+#ifndef HAVE_TPETRA_EXPLICIT_INSTANTIATION
+ static std::string name () { return "N/A without ETI"; }
+#else
static std::string name ();
+#endif
};
#ifdef KOKKOS_ENABLE_SYCL
Would it be possible to have this (or some better solution, e.g. using |
@trilinos/tpetra |
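For readers following along, the idea behind the patch can be sketched in a single file. This is only an illustration, not the Tpetra code: the macro `DEMO_HAVE_EXPLICIT_INSTANTIATION`, the tag types, and the class `WrapperNode` are hypothetical stand-ins. With the macro undefined (mimicking ETI turned off), `name()` is defined inline in the header, so no library symbol is needed at link time.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for HAVE_TPETRA_EXPLICIT_INSTANTIATION.
// It is deliberately left undefined to mimic a build with ETI turned off.
// #define DEMO_HAVE_EXPLICIT_INSTANTIATION

namespace demo {

// Hypothetical execution-space tags (assumptions for this sketch).
struct SerialTag {};
struct OpenMPTag {};

template <class Tag>
class WrapperNode {
public:
  WrapperNode() = delete;

#ifndef DEMO_HAVE_EXPLICIT_INSTANTIATION
  // Header-only fallback: defined inline, so nothing must be linked in.
  static std::string name() { return "N/A without ETI"; }
#else
  // Declaration only: a definition per supported Tag must exist in a library.
  static std::string name();
#endif
};

#ifdef DEMO_HAVE_EXPLICIT_INSTANTIATION
// In a real project these specializations would live in a .cpp file that is
// compiled into the library; if that object is missing at link time, calls
// to name() end up as undefined references.
template <>
inline std::string WrapperNode<SerialTag>::name() { return "Serial/Wrapper"; }
template <>
inline std::string WrapperNode<OpenMPTag>::name() { return "OpenMP/Wrapper"; }
#endif

// Small self-check that exercises the branch selected above.
inline void self_check() {
  assert(!WrapperNode<OpenMPTag>::name().empty());
}

}  // namespace demo
```

Calling `demo::WrapperNode<demo::OpenMPTag>::name()` in this configuration returns the inline fallback string; defining the macro before the class switches to the out-of-line definitions.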
@sekoenig How are you linking your project against Tpetra? I built with ETI off but I do see the OpenMP wrapper node
|
My exact linking setup is admittedly a bit intricate: since a (small) part of my project is written in Haskell, I funnel linking through ghc so that the Haskell runtime library is properly added. I also use MPI (OpenMPI on Debian testing on my development machine), and therefore I add the output of Initially I was using shared libraries; as part of tracking this problem I switched to static libs, but that didn't change anything. From your output I suspected that perhaps I need libtpetraclassic, but checking all libtpetra... files now with I was testing so far with Trilinos 14.4; I suppose I should try the latest version to see if the issue is still there. |
Does libtpetraclassic contain any symbols? I looked briefly at the files containing |
@brian-kelley There are some symbols in libtpetraclassic.a, but not many:
I tested with Trilinos 15.1.0 on another machine with a slightly different configuration (Debian stable, MPICH), and found the same issue there. I will also test with the latest GitHub version. |
@bartlettroscoe Do tests being disabled imply there is no support for that anymore? I did not find a note anywhere that this was being deprecated. For what I am doing (successfully, as of today), adding a custom Kokkos memory space, unfortunately it seems I need to disable ETI. |
@brian-kelley Following up, I still see the issue with yesterday's master branch from github (HEAD is at bf2943a), and my fix/workaround also still works. |
@sekoenig Could you share the way you're configuring Trilinos? The only thing I can think of is that these |
@brian-kelley Of course, here is how I call cmake:
I believe |
@sekoenig Is Tpetra even enabled in this build? (A list of enabled packages is printed during configure) It's possible that the libtpetraclassic you're linking is from a previous build with a different configuration. |
@brian-kelley Very good point, but it is being enabled (admittedly I don't know how exactly, in my previous build with ETI, I was setting Here's some relevant CMake output:
I was previously confused by those lines like |
@brian-kelley I forgot to add before: according to the file dates, the |
@sekoenig Your understanding of the I tried your exact config with master bf2943a but I still see the name() symbols:
Is there any possibility that the compiler is finding Kokkos headers from a different installation on your system? The only thing I can think of at this point is that KokkosCore_config.h (the file that defines macros like Adding this patch to Trilinos and rebuilding will trigger an error if
|
@brian-kelley Thanks for the patch! I applied it to my HEAD and get the following in the CMake output:
So that looks good, I think. I ran In case it matters:
The output from By the way, coming back to my original reason for turning off ETI, another question: if ETI is enabled, should it still work to have a project use template classes that were not explicitly instantiated at Trilinos build time? I tried that first, naturally, but got so many undefined references that I suspected I have to try without ETI. |
It depends on the package - KokkosKernels always lets you call functions with types that were not enabled in ETI (as long as the kernel supports the type combinations - for many things you can't just plug in any custom user-defined type). On the other hand, Tpetra and packages downstream of it generally don't allow it. They organize their internal headers differently (decl and def files) and the user-facing headers like Tpetra_CrsMatrix.hpp will only include the decl in an ETI build. This means the actual function definitions won't be available. |
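The decl/def organization described above can be sketched as follows. This is a minimal single-file illustration, not Tpetra's actual layout: the macro `DEMO_HAVE_EXPLICIT_INSTANTIATION`, the class `demo::Vector`, and the file names in the comments are all hypothetical. In a real ETI build the three sections would be separate files, and the user-facing header would include only the first one.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical macro standing in for a package's "ETI enabled" flag.
#define DEMO_HAVE_EXPLICIT_INSTANTIATION

namespace demo {

// ---- "Demo_Vector_decl.hpp": declarations only --------------------------
template <class Scalar>
class Vector {
public:
  explicit Vector(std::vector<Scalar> values);
  Scalar sum() const;

private:
  std::vector<Scalar> values_;
};

// ---- "Demo_Vector_def.hpp": member definitions ---------------------------
// In an ETI build, user code would NOT see this part; only the library's
// own .cpp files include it.
template <class Scalar>
Vector<Scalar>::Vector(std::vector<Scalar> values)
    : values_(std::move(values)) {}

template <class Scalar>
Scalar Vector<Scalar>::sum() const {
  Scalar total{};
  for (const Scalar& v : values_) total += v;
  return total;
}

#ifdef DEMO_HAVE_EXPLICIT_INSTANTIATION
// ---- "Demo_Vector.cpp": explicit instantiation for the enabled types -----
// Only these types get compiled into the library. With the decl-only header,
// Vector<int> would still compile in user code but fail at link time.
template class Vector<double>;
#endif

// Small self-check using the explicitly instantiated type.
inline void self_check() {
  assert(Vector<double>(std::vector<double>{1.0, 2.0}).sum() == 3.0);
}

}  // namespace demo
```

Because everything lives in one translation unit here, other scalar types would still work; the link failure only appears once the definitions are hidden behind the library boundary, which is the situation the comment above describes for Tpetra and its downstream packages.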
I see, thanks for that background info! Then indeed I will need to keep working without ETI (which I actually didn't find as terrible in terms of build times as I thought it might be). |
Okay, so looking at Clearly I need to make and compare more builds. It is possible I was updating both my |
@brian-kelley After more testing, I think everything is actually fine. Here is my best reconstruction of what happened:
Very sorry to have you chase this for nothing! At least for me, I found the discussion here very useful, though, as it helped me understand some Trilinos concepts better. Thanks a lot for that! |
@bartlettroscoe While I think the issue here per se can be closed, I would like to come back to your comment. If building without ETI is already not actually supported anymore, or being considered for deprecation, I would like to request this to be reconsidered. For that, it may be good to explain my use case a bit more: Trilinos is a relatively late addition to my project, which was not written with Kokkos or Tpetra in mind. When I needed an efficient distributed linear solver, I found Belos, and I found it relatively easy to integrate it into my code, inserting just some relevant traits into At some point I started porting more and more aspects of my code to run on GPUs via CUDA, and I particularly like CUDA unified memory. Then I noticed that parts using Belos stopped working when Belos iterations were called using such UVM memory, presumably because Belos internally allocates memory that then my operator is applied to as part of the iteration, assuming CUDA unified memory but not getting that from Belos. My initial workaround was to always perform extra copies at each step, but now I finally decided I don't want to take the performance hit from that anymore (primarily because for a new use case it was particularly significant). I found the existing CudaUVMSpace and got excited, but then quickly I noticed that in order to use that I have to enable full CUDA support (fine) and also funnel all my code through For this to actually work, I had to inject a bit of boilerplate code into namespace like I realize from the fact alone that I have to inject template specializations into those Or you might dismiss my use case as just an ugly manifestation of Hyrum's Law ;) |
@sekoenig, your use case is one of the reasons that Trilinos supported
@sekoenig, we would need to take this before the Trilinos leadership team. I am just relaying what I learned as part of #12932 (see #12932 (comment)). @rppawlo and @sebrowne, what is the official Trilinos position on support for implicit template instantiation going forward (now that it is not being tested anymore)? What is the expectation when someone posts an issue where they used a configuration of Trilinos with @sekoenig, as an alternative, would it work if Trilinos provided a mechanism at configure time to declare your additional explicit instantiation sets (and provide includes and include directories to support these)? I don't know how this would get done but that is one option that I can think of. (But it could be quite a bit of work to get that into all of the instantiations in Trilinos so this could be a non-trivial ask.) |
We spent some time discussing ETI, motivated by this ticket, at the leadership meeting yesterday. As it stands, we currently do not test ETI=OFF in any testing (PR or nightly) and consider it an unsupported feature. There was no decision yesterday on whether to support or remove the flag. Discussion will continue at the next leadership meeting, and we will probably bring it up at the next developer and sart meetings for input. If it is easy to add configure time support for added types, that would probably be best, but I'm not sure how well that would work in practice for every package. |
@bartlettroscoe, @rppawlo Thanks a lot for the discussion. It is reassuring to hear that my use case is in line with one of the motivations to support disabling ETI. I looked at the discussion #12932, which notes that for the particular use case mentioned there (arbitrary precision types), users eventually backed off from it due to problems. I hope that the details I provided above regarding my use case may provide additional perspective for the discussion. Regarding a mechanism to instantiate custom types at configure time, naively I suspect that would be very tedious and probably involve adding source files somehow when Overall, since the design of Kokkos, Tpetra, and downstream packages is so heavily based on templates, I would find it most natural to let users benefit from this design decision. The cost of testing with ETI disabled is, I admit, a valid concern, so I think the minimum I would ask for is that at least the option to disable ETI remains, even if unsupported. That of course then leads to the question of what to do when someone submits a PR that fixes some issue that only manifests when ETI is disabled. From what I read, I think this is part of the discussion within the leadership team, and I know too little about it to comment further. I understand that in particular in Tpetra there are some template parameters that are not actually meant to be changed from their defaults (I ran into this for example when I wanted to use long in local ordinals so that I can link against ILP64 MKL BLAS/LAPACK libs). I also saw somewhere a discussion to move Tpetra in the long run towards using fewer template parameters. If that happens, then perhaps for the remaining ones the burden of supporting non-ETI builds is reduced. From what I saw while implementing my custom memory space, though, my impression is that there are some opportunities to streamline the design to make it easier for users to instantiate their own extensions.
The Finally, if non-ETI support will be removed, I should think about alternatives for what I am doing now. I am going off topic here, but a very simple way to achieve what I want to do would be extending |
Question
@csiefer2, @bartlettroscoe, @srajama1:
In my code I am using Belos with Tpetra vectors and Teuchos helpers for solving large linear systems. To implement certain Kokkos extensions (I can elaborate if necessary), I would like to use Trilinos built with Trilinos_ENABLE_EXPLICIT_INSTANTIATION=OFF. Building itself works fine, and my code also compiles (slowly :), but then at linking I get an impressive host of "undefined reference" errors. Basically it looks like none of the required templates actually gets instantiated. Previously, with ETI enabled and linking to the relevant libraries, everything worked fine.
I assume there is something I am doing wrong. Since no shared libraries seem to be generated when ETI is disabled, I think there is nothing I need to link to. To me that makes sense because the templates should be instantiated when I compile my code, but am I correct? If so, is there anything specific I need to do in order to use Trilinos without ETI? I tried searching for guidance, but could not quite find anything useful. It would be great if someone here could help me!