Memory leaks, CUDA errors, and the state of this project #67

Open
a3lem opened this issue Feb 28, 2023 · 0 comments

Based on qualitative analysis, I concluded that Trankit's dependency parser is pretty amazing compared to alternatives (e.g. spaCy, Stanza, UDPipe), and I set about incorporating Trankit into a highly parallelized job to create a new treebank of Dutch. In the testing phase, everything worked smoothly, including on GPUs. As soon as I began to scale up to more data, async CUDA memory errors started to appear.

I've tried limiting the size of texts, calling torch.cuda.empty_cache(), and wrapping things in 'redo'-type logic with a backoff interval. Every time, CUDA memory eventually ran out. In the end I gave up on trying to get it to run reliably on GPU. My job has been running on CPU for a week now, albeit much more slowly, and I'm starting to notice that my available memory keeps shrinking. So here, too, there seems to be some sort of memory leak.
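
For anyone wanting to reproduce this, here is a minimal sketch of the kind of 'redo' logic with a backoff interval mentioned above. It is an illustration under assumptions, not the exact code from my job: `run_with_backoff`, `task`, `cleanup`, and the other parameter names are hypothetical, and it relies only on the fact that PyTorch reports CUDA out-of-memory conditions as a `RuntimeError` (or a subclass of it).

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_backoff(
    task: Callable[[], T],
    cleanup: Optional[Callable[[], None]] = None,
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call task(); on RuntimeError, run cleanup, wait, and try again.

    All names here are hypothetical. `task` stands in for whatever call
    invokes the parser on one chunk of text, and `cleanup` for a hook
    such as torch.cuda.empty_cache.
    """
    for attempt in range(max_attempts):
        try:
            return task()
        except RuntimeError:
            # Out of attempts: re-raise the original error.
            if attempt == max_attempts - 1:
                raise
            # Release whatever can be released before retrying.
            if cleanup is not None:
                cleanup()
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

With PyTorch installed, `cleanup=torch.cuda.empty_cache` would be the natural hook to pass. Note that a wrapper like this only postpones the failure if memory is genuinely being leaked, which matches what I observed.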

I'm noticing that most issues here have gone unanswered. My guess is that the clever person who originally worked on this has moved on to other things. I'd just like to express that I personally find it regrettable that a project with such great model accuracy has apparently fizzled out. (I realize this might sound somewhat entitled.)

I'll be analyzing the code to see if I can't locate the source of the memory issues. For now, I propose that the docs add a caveat about the real-life performance issues.
