Reproduce evaluation result
#5, opened by r1ck
I ran the evaluation with the script you provided, but I got much higher scores than those reported in the paper. Here are my results:
doc2dial
top-1 recall score: 0.5011
top-5 recall score: 0.8385
top-20 recall score: 0.9533
quac
top-1 recall score: 0.6002
top-5 recall score: 0.8651
top-20 recall score: 0.9660
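
For reference when comparing numbers, top-k recall is commonly taken to be the fraction of queries whose gold passage appears among the top k retrieved passages. Below is a minimal sketch of that definition, not the evaluation script from the repo. The function name `top_k_recall`, the single-gold-passage-per-query assumption, and the toy data are all hypothetical. If the paper's metric is defined differently (for example, multiple gold passages or a different candidate pool), that alone could explain a gap.

```python
from typing import Sequence


def top_k_recall(
    ranked_ids: Sequence[Sequence[int]],
    gold_ids: Sequence[int],
    k: int,
) -> float:
    """Fraction of queries whose gold passage id is within the top-k ranked ids.

    Assumes exactly one gold passage per query (an assumption, not
    necessarily what the original evaluation script does).
    """
    if len(ranked_ids) != len(gold_ids):
        raise ValueError("ranked_ids and gold_ids must have the same length")
    if not gold_ids:
        return 0.0
    hits = sum(
        1 for ranked, gold in zip(ranked_ids, gold_ids) if gold in ranked[:k]
    )
    return hits / len(gold_ids)


# Toy example: 4 queries, each with a ranking over 3-4 passage ids.
ranked = [[3, 1, 2], [0, 2, 1], [2, 0, 1], [1, 2, 0]]
gold = [3, 1, 0, 0]

for k in (1, 2, 3):
    print(f"top-{k} recall score: {top_k_recall(ranked, gold, k):.4f}")
```

Running the same function over the retrieval output for doc2dial and quac would show whether the scores above follow this definition or a different one.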