ScandEval benchmarking error

by sarthu - opened Oct 3, 2024

Oct 3, 2024

I tried benchmarking the model on the Dutch-language benchmarks using the scandeval library, and I got the following error:
Benchmarking library: https://github.com/ScandEval/ScandEval

ferran-espuna

Language Technologies Unit @ Barcelona Supercomputing Center org Oct 14, 2024

edited Oct 14, 2024

Hello,

I'm trying to work out exactly what the problem is, so I need to reproduce the error. Which exact command are you running? I tried running

 scandeval -m path/to/salamandra-2b-instruct/

But I get the following error:

Traceback (most recent call last):
  File "/venv/bin/scandeval", line 8, in <module>
    sys.exit(benchmark())
  File "python3.11/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "python3.11/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "python3.11/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "python3.11/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "venv/lib/python3.11/site-packages/scandeval/cli.py", line 342, in benchmark
    benchmarker(model=models)
  File "venv/lib/python3.11/site-packages/scandeval/benchmarker.py", line 781, in __call__
    return self.benchmark(*args, **kwargs)
  File "venv/lib/python3.11/site-packages/scandeval/benchmarker.py", line 597, in benchmark
    benchmark_output = self._benchmark_single(
  File "venv/lib/python3.11/site-packages/scandeval/benchmarker.py", line 730, in _benchmark_single
    dataset = dataset_factory.build_dataset(dataset_config)
  File "venv/lib/python3.11/site-packages/scandeval/dataset_factory.py", line 57, in build_dataset
    raise ValueError(
ValueError: Could not find a benchmark class for any of the following potential names: swerec, sentiment-classification, sequence-classification.

sarthu

Oct 14, 2024

edited Oct 14, 2024

The command I used was

scandeval --model <model-id>  --language nl

Also, the error is raised inside the outlines library. The odd part is that all other models run fine; it only happens with this model.
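
For anyone trying to reproduce this, here is a minimal sketch of running the command above from Python and capturing the complete output, so the full traceback can be pasted into the thread. It only uses the `--model` and `--language` flags quoted above; the helper names `build_command` and `run_and_capture` are made up for illustration, and it assumes the `scandeval` executable is on the PATH of the active environment.

```python
import subprocess


def build_command(model_id: str, language: str = "nl") -> list[str]:
    """Build the CLI call quoted in the thread (only --model and --language)."""
    return ["scandeval", "--model", model_id, "--language", language]


def run_and_capture(command: list[str]) -> tuple[int, str]:
    """Run a command and return (exit code, combined stdout and stderr).

    Raises FileNotFoundError if the executable is not on the PATH.
    """
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so the traceback is kept
        text=True,
        check=False,  # a failing benchmark should not raise here
    )
    return result.returncode, result.stdout


# Hypothetical usage; replace the placeholder with the real model id:
#   code, log = run_and_capture(build_command("<model-id>"))
#   print(log)
```

Merging stderr into stdout keeps the log lines and the traceback in the order they were printed.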

ferran-espuna

Language Technologies Unit @ Barcelona Supercomputing Center org Oct 14, 2024

Hello again,

I tried this, but now I get a similar error in the same place:

ValueError: Could not find a benchmark class for any of the following potential names: dutch-social, sentiment-classification, sequence-classification.

Do you know what could be the problem?

Meanwhile, I found this discussion on a similar issue:
https://github.com/dottxt-ai/outlines/issues/820

Can you try doing what the solution there suggests on your end and see what happens?

sarthu

Oct 14, 2024

Which version of scandeval are you using?
Mine is 13.0.0

ferran-espuna

Language Technologies Unit @ Barcelona Supercomputing Center org Oct 14, 2024

I'm using 13.0.0 as well
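
When comparing setups like this, it can help to print the installed versions of both packages from the same environment that runs the benchmark. A small sketch using only the standard library follows; the distribution names `scandeval` and `outlines` are assumed to match what pip installed, and the helper names are illustrative.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def report(packages: list[str]) -> dict[str, Optional[str]]:
    """Map each distribution name to its installed version (or None)."""
    return {name: installed_version(name) for name in packages}


# Distribution names below are assumptions based on the thread.
for name, found in report(["scandeval", "outlines"]).items():
    print(f"{name}: {found or 'not installed'}")
```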

sarthu

Oct 14, 2024

Can you try this:
https://github.com/ScandEval/ScandEval/issues/408

ferran-espuna

Language Technologies Unit @ Barcelona Supercomputing Center org Oct 14, 2024

Hi, I seem to have fixed the import error manually. I'm running the tests now.

ferran-espuna

Language Technologies Unit @ Barcelona Supercomputing Center org Oct 14, 2024

Hello, sorry for the delay.

I have experienced the same error on the conll-nl benchmark.

Please try running

pip install outlines==0.0.36

before executing the evaluation. I have done so and the evaluation seems to be running fine.
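
If several Python environments are installed, a plain `pip` may install into a different one than the environment that runs scandeval. A hedged sketch of applying the same pin through the current interpreter is shown here; the function names are hypothetical, and the version 0.0.36 is simply the one suggested above.

```python
import subprocess
import sys


def pip_pin_command(package: str, pinned: str) -> list[str]:
    """Build a pip call that targets the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", f"{package}=={pinned}"]


def pin(package: str, pinned: str) -> int:
    """Install the pinned version and return pip's exit code."""
    return subprocess.run(pip_pin_command(package, pinned), check=False).returncode


# Hypothetical usage, run inside the environment that has scandeval:
#   pin("outlines", "0.0.36")
```

Using `sys.executable -m pip` ties the install to the interpreter that executes the script, which is the usual way to avoid installing into the wrong environment.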

sarthu

Oct 14, 2024

Thanks a lot! It will be great to put the results on the benchmark table. Nice work!

ferran-espuna

Language Technologies Unit @ Barcelona Supercomputing Center org Oct 14, 2024

You're welcome. I'm closing this discussion for now. Good luck!

ferran-espuna changed discussion status to closed Oct 14, 2024
