To use the function timer, add the argument `--use_timer=True` to your terminal command, e.g. `python input.py --use_timer=True` or `python input.py --use_timer=True --mode=numba`.
The timing functionality uses wrappers, located in `decorators.py`, that place timing calls before and after a function call. For any numba-fied function the wrapper must enter `objmode()` to call `MPI.Wtime()`, which places significant overhead on small functions that are called many times.
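For reference, here is a minimal sketch of that pattern, not the actual code in `decorators.py`: the accumulator `timing_totals` and helper `record_time` are hypothetical names, and the pattern is inlined into a toy kernel rather than applied through a decorator, but it shows why nopython code has to drop into `objmode()` to reach `MPI.Wtime()`.

```python
# Hedged sketch of the objmode timing pattern; names are hypothetical,
# not the actual decorators.py implementation.
from numba import njit, objmode
from mpi4py import MPI

timing_totals = {}  # hypothetical per-rank accumulator: {label: seconds}


def record_time(label, elapsed):
    """Accumulate elapsed wall time under a label (runs in object mode)."""
    timing_totals[label] = timing_totals.get(label, 0.0) + elapsed


@njit
def kernel(x):
    # MPI.Wtime() is a Python-level call, so nopython code must switch to
    # objmode to reach it; this switch is the overhead mentioned above.
    with objmode(start="float64"):
        start = MPI.Wtime()

    total = 0.0
    for i in range(x.shape[0]):
        total += x[i] * x[i]

    with objmode():
        record_time("kernel", MPI.Wtime() - start)
    return total
```

For a kernel this small, the two objmode transitions can easily cost more than the loop body itself, which is the overhead concern above.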
As jpmorgan98 listed, there are two TODO items:
1. The timing calls are not configured for multiple processors, so each processor records its own times but there is no aggregation of the per-rank results at the end (see the sketch after this list).
2. There is a significant discrepancy between the decorator and native runtime reports in Numba mode. See the example outputs below.
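On the first item, here is a hedged sketch of what the missing cross-rank step could look like, assuming each rank holds a per-function dict like the hypothetical `timing_totals` above; `report_timings` is an illustrative name, not code from the repository.

```python
# Hedged sketch: gather the per-rank timing dicts to rank 0 and report
# max/mean per function. report_timings and timing_totals are hypothetical.
from mpi4py import MPI


def report_timings(timing_totals):
    """Collect per-rank {function: seconds} dicts on rank 0 and print a summary."""
    comm = MPI.COMM_WORLD
    gathered = comm.gather(timing_totals, root=0)
    if comm.Get_rank() == 0:
        names = sorted({name for d in gathered for name in d})
        for name in names:
            times = [d.get(name, 0.0) for d in gathered]
            print(f"{name:30s}  max {max(times):9.3f} s   "
                  f"mean {sum(times) / len(times):9.3f} s")
```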
Below are example outputs from the Slab Absorbium example with N=1e5. The first image is from a pure Python run; note that the discrepancy between the native runtime report and the decorator runtime report is 1.0 seconds for a 244-second run.
Here is an output of the same problem in Numba mode. In this case, the discrepancy is 92.8 seconds. The decorator seems to be ignoring compilation time, but I'm not sure how that's possible.
Adding some kind of timer for individual functions to more readily identify hotspots in both Numba and Python mode. Working fork and branch
Obj mode in Numba screws us over, and the timing doesn't seem to be correct when compared against the total wall-clock runtime.
Things to do: