-
Notifications
You must be signed in to change notification settings - Fork 586
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Working example of fwd_gradient #27
Comments
Yes, we have a pending PR for this. |
Thanks I'll take a look at it. Meanwhile do you have some high level docs or maybe can you explain in a few words which parts of the FD scheme are parallelized on the GPU, since it's mostly a sequential algorithm where you need to loop over each timestep and solve the tridiagonal system step by step as the solution of N+1 depends on the solution of N.Do you constraint the types of schemes to be used (maybe just implicit) or? I was reading through the code but it's not immediately apparent. Thanks! |
The primary benefit of the GPU probably comes from the tridiagonal solver. We wrote a GPU kernel for that which we upstreamed to TensorFlow core (https://www.tensorflow.org/api_docs/python/tf/linalg/tridiagonal_solve) The rest of the computation is pretty serial as you say. |
@saxena-ashish-g do you have more details on the implementation or the cuda kernel? I couldn't see it in the TF repository |
It is located under tensorflow/core/kernels/tridiagonal_solve_op.cu.cc but you can't find it on github because of the size of the kernels directory which causes truncation in the directory listing. If you clone the repo, you should be able to find it there. Where it ought to be: |
Hi Feras, GPU kernel for tridiagonal solver is imported from cuSparse : As I remember, the solver should be this one: As for the forward gradient, we had to have a crack on differentiating the while loop to make it performant (as I assume you are interested in Greeks for Monte Carlo). I'll try to send a hack colab version tomorrow but it will take some time before we check it in properly. |
Thanks Cyril, I took a look but the Nvidia docs don't suggest the algorithm used for the solver, so although the underlying library is clear im still not sure how they made an inherently iterative scheme parallelized I guess it's just parallel in the batch dimension, not the actual computation? My use case is for the FD scheme no monte carlo yet, I just want to compare the accuracy of derivatives computed off the grid directly vs ones computed numerically. I assume that when doing the fwd_gradient the gain is that we get delta, gamma,vega,theta,rho at once rather than having to do 5 numerical estimates, each requiring 2 evaluations (or this is not how autograd works). I'm also working on comparing that to 2 other probabilistic solutions -> a neural network from inputs -> price (works reasonably well for the BS case, not yet validated for Heston or fBM models) and a more of neural PDE solver approach, so using tf-quant mostly as a data generator to test convergence on training data generated by it and by quantlib for cross reference. |
Seems they use Cyclic reduction on a GPU since the documentation for As for the gradient, you can differentiate through the PDE solver but it is not cost free. Say, Please pay attention that the gradients need to be aggregated (see the pull request Ashish mentioned above) Here is an example of gradient computation: import tensorflow as tf
import tf_quant_finance as tff
dtype = tf.float64
d = tf.constant([1, 1, 1], dtype=dtype)
super_d = tf.constant([0.1, 0.1, 0.1], dtype=dtype)
sub_d = tf.constant([0.2, 0.2, 0.2], dtype=dtype)
rhs = tf.constant([2, 2, 2], dtype=dtype)
# This is the function you are going to differentiate.
def f(super_d):
return tf.linalg.tridiagonal_solve([super_d, d, sub_d], rhs, diagonals_format='sequence')
# Here is a gradient with aggregation df/d super_d[0] + df/d super_d[1] + df/d super_d[2]
tff.math.fwd_gradient(f, super_d)
# Outputs : <tf.Tensor: shape=(3,), dtype=float64, numpy=array([-1.31076389, -1.47569444, 0.29513889])>
#which equals to the limit of
e = 0.00001
(f(super_d + e) - f(super_d)) / e
# Outputs: <tf.Tensor: shape=(3,), dtype=float64, numpy=array([-1.31074852, -1.47570059, 0.29514012])>
# Here is a gradient without aggregation, simply df/d super_d[0]
tff.math.fwd_gradient(f, super_d, input_gradients=tf.constant([1, 0, 0], dtype=dtype))
# Outputs : <tf.Tensor: shape=(3,), dtype=float64, numpy=array([-1.48871528, 0.30381944, -0.06076389])>
#which equals to the limit of
e = 0.00001
e1 = tf.constant([e, 0, 0], dtype=dtype)
(f(super_d + e1) - f(super_d)) / e
# Outputs: <tf.Tensor: shape=(3,), dtype=float64, numpy=array([-1.48871838, 0.30382008, -0.06076402])> Hope this helps. Do not hesitate to reach out. |
Thanks @cyrilchim , your example is actually a lot more detailed than the PR, but since the solver returns some kind of tuple how do you differentiate without aggregation through it? I assume you return
Another side question that I have is that it seems the grid is fixed per batch, regardless of what vol, timestep and spot are present in the batch. What scheme do you use internally to guarantee convergence and avoid ill defined grids, or that's up to the caller to ensure ? (which would depend on the scheme you use) |
Sorry if this is a bit off-topic, but is there any reason that JAX has not been leveraged for these functionalities here? I have found it very straightforward to implement the things discussed here with JAX, and all of the option greeks with batching and GPU functionality. My understanding was that there was already quite a bit of cross-collaboration between JAX and TensorFlow as well. There is a brief example shown on this Towards Data Science post. |
Thank you, @Shellcat-Zero Indeed, JAX has native forward-mode differentiation. It is definitely possible to add this functionality to TensorFlow but it requires quite a bit of effort. For now you could use It is indeed true that JAX and TensorFlow have a lot of overlap (there is even JAX to TF converter) as they both can use XLA compiler. We stick to TensorFlow as it has a lot of production level infrastructure (e.g., TF serving), community support, as well as more core ops (for example, I am not sure if JAX has tridiagonal solver that is used to for numerical PDE methods) |
Hi @cyrilchim ! As the Monte Carlo algorithm mainly use |
Hi @luweizheng, Yes, you can do forward gradient through the while loop. You can do some experimenting in the colab example here. It can be more efficient/beneficial to use custom for_loop for which we have a better gradient implementation. Example of using the custom loop is in the aforementioned colab. |
Thank you so much for all these information. @cyrilchim |
That is about correct. Just a few notes:
Hope this helps |
Do you have a running example of tff.math.fwd_gradient either in the notebook or somewhere in the tests, only saw a very contrived one in https://github.com/google/tf-quant-finance/blob/39a1c03cc9938c6e08fceb63e405eff0fda835be/tf_quant_finance/math/integration/integration_test.py
Mainly for the purposes of calculating:
a) first order greeks
b) higher order greeks
The text was updated successfully, but these errors were encountered: