Which corpora were used for training? #1

Open
Menschenkindlein opened this issue Apr 23, 2021 · 0 comments

Hi,
I'm trying to train a model with comparable performance. I'm using OntoNotes 5 and QuestionBank, converted to dependencies with clearnlp-3.1.2, but my model is significantly worse in many cases, most visibly on questions. As I understand it, this is because QuestionBank uses simpler constituency labels (e.g., just NP instead of NP-SBJ-2), which prevents high-quality conversion to dependencies. Did you use manually annotated or reviewed corpora? Is it possible to achieve the same level of quality using only open corpora?
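To make the label difference concrete, here is a minimal Python sketch that counts how many bracket labels in a Penn-Treebank-style tree carry function tags such as `-SBJ`. The two example trees are hypothetical illustrations written for this post, not sentences taken from either corpus, and the helper names (`split_label`, `function_tag_stats`) are my own. It only assumes standard bracketed notation and treats numeric suffixes as co-indexation rather than function tags.

```python
import re
from collections import Counter

# Captures the label that directly follows an opening parenthesis,
# e.g. "NP-SBJ-2" in "(NP-SBJ-2 ...". Covers phrase labels and POS tags.
LABEL_RE = re.compile(r"\(([^\s()]+)")


def split_label(label):
    """Split a PTB-style label into (base category, list of function tags)."""
    # Labels such as -NONE-, -LRB-, -RRB- start with '-' and are left intact.
    if label.startswith("-"):
        return label, []
    # Drop gapping indices written with '=', e.g. "NP=1".
    label = label.split("=")[0]
    parts = label.split("-")
    # Purely numeric suffixes are co-indexation, not function tags.
    tags = [p for p in parts[1:] if p and not p.isdigit()]
    return parts[0], tags


def function_tag_stats(tree):
    """Count all bracket labels and those that carry at least one function tag."""
    counts = Counter()
    for label in LABEL_RE.findall(tree):
        _, tags = split_label(label)
        counts["labels"] += 1
        if tags:
            counts["with_function_tags"] += 1
    return counts


# Hypothetical trees: one with function tags and traces, one with bare labels.
tree_full = "(SBARQ (WHNP-1 (WP Who)) (SQ (NP-SBJ-2 (-NONE- *T*-1)) (VP (VBD left))) (. ?))"
tree_plain = "(SBARQ (WHNP (WP Who)) (SQ (VP (VBD left))) (. ?))"

for name, tree in [("full", tree_full), ("plain", tree_plain)]:
    stats = function_tag_stats(tree)
    print(name, stats["labels"], stats["with_function_tags"])
```

Running something like this over each corpus before conversion should show whether the converter has any subject or trace information to work with on a given treebank.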
Thank you in advance.
