Native utilities for datasets / TensorBoard? #1171

Answered by jheek
bhchiang asked this question in Q&A
There are some issues with a "JAX native" data loading pipeline. First, at its core tf.data is like a scheduler with buffers and tasks that run in parallel (its map does not vectorize like jax.vmap; instead it parallelizes over CPU threads).
Secondly, JAX doesn't support dynamic shapes, and it isn't trivial to handle formats like JPEG, audio, and video.
TF has ops that support all these things natively.
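
To make the "scheduler with buffers" point concrete, here is a minimal sketch in plain Python, not how tf.data is actually implemented. It shows a map that runs on a pool of CPU threads, a bounded prefetch buffer filled in the background, and padding to a fixed length as one common workaround for the lack of dynamic shapes. The names `parallel_map`, `prefetch`, and `pad_to` are hypothetical helpers made up for this illustration.

```python
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

_DONE = object()  # sentinel marking the end of the stream


def parallel_map(fn, items, num_workers=4):
    """Apply fn to each item on a pool of CPU threads, preserving order."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # Executor.map submits every call up front and yields results in input order.
        yield from pool.map(fn, items)


def prefetch(iterator, buffer_size=2):
    """Fill a bounded buffer from a background thread while the consumer works."""
    buf = queue.Queue(maxsize=buffer_size)

    def producer():
        for item in iterator:
            buf.put(item)  # blocks while the buffer is full
        buf.put(_DONE)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = buf.get()
        if item is _DONE:
            return
        yield item


def pad_to(seq, length, fill=0):
    """Pad (or truncate) a variable-length sequence to a fixed length."""
    return list(seq[:length]) + [fill] * max(0, length - len(seq))


# Variable-length records become fixed-shape rows, processed in parallel.
records = [[1, 2, 3], [4], [5, 6]]
padded = list(prefetch(parallel_map(lambda r: pad_to(r, 4), records)))
```

Note that none of this touches JAX: the scheduling and buffering happen on the host, and only the fixed-shape result would be handed to a jitted function.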

PyTorch has the same issue. It provides a thin wrapper around multiprocessing, which is just another library for scheduling tasks into a pool of threads/processes, but PyTorch itself doesn't know how to parse a JPEG. The big difference is that TF embeds preprocessing into the TF graph so it's mo…
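
As a rough illustration of "scheduling tasks into a pool", the sketch below uses the standard library's `multiprocessing.pool.ThreadPool`. This is not PyTorch's actual loader code. The `decode` function is a hypothetical stand-in for real parsing work (for example a JPEG decoder from some other library), and `load` is a made-up helper name.

```python
from multiprocessing.pool import ThreadPool


def decode(record):
    """Hypothetical stand-in for real decoding work such as parsing a JPEG."""
    return record.decode("utf-8").upper()


def load(records, num_workers=2):
    """Schedule decode() over a worker pool and return results in input order."""
    with ThreadPool(processes=num_workers) as pool:
        # imap hands tasks to the workers and yields results in input order.
        return list(pool.imap(decode, records))


decoded = load([b"a.jpg", b"b.jpg", b"c.jpg"])
```

A process-based `multiprocessing.Pool` exposes the same `imap` interface; a thread pool is used here only to keep the sketch short. Either way, the pool only schedules work, and the decoding itself has to come from somewhere else.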

Replies: 2 comments 1 reply

Answer selected by bhchiang