Vilém Zouhar

zouharvi

https://vilda.net/

AI & ML interests

MT/NLG metrics, human evaluation, uncertainty

Recent Activity

updated a model about 1 month ago

zouharvi/COMET-partial

zouharvi's activity

upvoted 2 collections about 1 month ago

MT Sentinel Metrics

Machine Translation (MT) metrics designed explicitly to scrutinize the accuracy, robustness, and fairness of the MT meta-evaluation process. • 7 items • Updated Dec 4, 2024 • 7

✍️ QE4PE & GroTE

Materials for "QE4PE: Word-level Quality Estimation for Human Post-Editing" • 3 items • Updated Mar 6 • 1

upvoted a paper about 1 month ago

QE4PE: Word-level Quality Estimation for Human Post-Editing

Paper • 2503.03044 • Published Mar 4 • 6

upvoted a collection about 2 months ago

COMET-early-exit

Models introduced in the paper "Early-Exit and Instant Confidence Translation Quality Estimation". https://github.com/zouharvi/COMET-early-exit • 4 items • Updated Feb 21 • 2

upvoted 2 papers about 2 months ago

We Can't Understand AI Using our Existing Vocabulary

Paper • 2502.07586 • Published Feb 11 • 10

Early-Exit and Instant Confidence Translation Quality Estimation

Paper • 2502.14429 • Published Feb 20 • 4

upvoted a collection about 2 months ago

PreCOMET

COMET-like models for MT evaluation that predict some scores given only the source segment. https://github.com/zouharvi/subset2evaluate • 8 items • Updated Feb 25 • 2

upvoted a paper about 2 months ago

How to Select Datapoints for Efficient Human Evaluation of NLG Models?

Paper • 2501.18251 • Published Jan 30 • 2