RULEDST evaluation #227
Comments
We did not evaluate the rule DST on its own because it needs dialog acts as input. If you want to compare the rule DST with other DST models, you can use the golden dialog acts as input, or use an NLU model such as BERTNLU to parse both user and system acts.
I would like to use the output of BERTNLU as the input for the DST; however, it is not clear to me how to pass the data from one module to the other, and so far I haven't found any code for that in ConvLab. Could you kindly link the ConvLab page where this is described, or give me more information about this process?
You can refer to the Colab tutorial or the interface classes for NLU and DST. See PipelineAgent for how to build an agent from modules. Example usage:
Thank you for the info. However, the Colab tutorial covers an overall evaluation (NLU + DST + NLG).
Sure. Just feed the output of NLU to DST; see ConvLab-2/convlab2/dialog_agent/agent.py, lines 122 to 132 at commit ad32b76.
From the code you posted, it doesn't seem that the module is evaluated with F1 scores or a similar measure. Perhaps I don't understand your point.
Sorry, I thought you needed instructions on how to pass the output of NLU to DST. If you want to evaluate NLU+DST, you can write a script to:
1) read the original data;
2) pass utterances to NLU to get the user dialog acts;
3) pass the user dialog acts to RuleDST to get the predicted state;
4) compare the predictions with the references.
Refer to https://github.com/thu-coai/ConvLab-2/blob/master/convlab2/dst/evaluate.py for the DST metrics.
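For readers who want to try this, the four steps could be sketched roughly as follows. This is only an illustration under stated assumptions: `ToyNLU` and `ToyRuleDST` are hypothetical stand-ins that loosely imitate a `predict(utterance, context)` / `init_session()` / `update(acts)` interface, the flat `(domain, slot) -> value` state and the toy data are invented for the example, and only user acts are tracked. The metrics shown (joint accuracy and slot-level precision/recall/F1) are common DST measures and are not necessarily the ones computed in `convlab2/dst/evaluate.py`.

```python
from typing import Dict, List, Optional, Tuple

DialogAct = List[str]                # [intent, domain, slot, value]
State = Dict[Tuple[str, str], str]   # (domain, slot) -> value (simplified, assumed format)


class ToyNLU:
    """Hypothetical stand-in for an NLU model such as BERTNLU (keyword matching only)."""

    def predict(self, utterance: str, context: Optional[List[str]] = None) -> List[DialogAct]:
        acts = []
        for token in utterance.lower().split():
            if token in ("cheap", "expensive"):
                acts.append(["Inform", "Hotel", "Price", token])
            elif token in ("north", "south"):
                acts.append(["Inform", "Hotel", "Area", token])
        return acts


class ToyRuleDST:
    """Hypothetical stand-in for a rule-based tracker: Inform acts overwrite slots."""

    def __init__(self) -> None:
        self.state: State = {}

    def init_session(self) -> None:
        self.state = {}

    def update(self, user_acts: List[DialogAct]) -> State:
        for intent, domain, slot, value in user_acts:
            if intent == "Inform":
                self.state[(domain.lower(), slot.lower())] = value
        return dict(self.state)  # copy, so later turns do not mutate earlier predictions


def evaluate(dialogs, nlu, dst) -> Dict[str, float]:
    """Run NLU -> DST turn by turn and compare predicted states with references."""
    tp = fp = fn = 0
    joint_correct = turns = 0
    for dialog in dialogs:                       # step 1: iterate over the data
        dst.init_session()
        context: List[str] = []
        for utterance, gold_state in dialog:
            acts = nlu.predict(utterance, context)   # step 2: utterance -> user acts
            pred_state = dst.update(acts)            # step 3: user acts -> state
            context.append(utterance)
            pred_items = set(pred_state.items())     # step 4: compare with reference
            gold_items = set(gold_state.items())
            tp += len(pred_items & gold_items)
            fp += len(pred_items - gold_items)
            fn += len(gold_items - pred_items)
            joint_correct += int(pred_items == gold_items)
            turns += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    joint_acc = joint_correct / turns if turns else 0.0
    return {"joint_acc": joint_acc, "precision": precision, "recall": recall, "f1": f1}


# Invented toy data: each dialog is a list of (user utterance, reference state) pairs.
TOY_DIALOGS = [
    [
        ("I want a cheap hotel", {("hotel", "price"): "cheap"}),
        ("in the north please", {("hotel", "price"): "cheap", ("hotel", "area"): "north"}),
    ],
    [
        ("something expensive", {("hotel", "price"): "expensive", ("hotel", "area"): "south"}),
    ],
]

if __name__ == "__main__":
    print(evaluate(TOY_DIALOGS, ToyNLU(), ToyRuleDST()))
```

To adapt the sketch, the stand-ins would be replaced by the real NLU and RuleDST objects, and the comparison would need to follow the actual belief-state format of the dataset rather than the flat dictionary assumed here.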
OK, thanks, I'll try it this way!
Hi, could you please provide more information on how the Rule DST module is evaluated?
Thanks