HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models • Paper • 2409.16191 • Published Sep 24 • 41
TableBench • Collection • TableBench: A Comprehensive and Complex Benchmark for Table Question Answering • 8 items • Updated 5 days ago • 2
FuzzCoder: Byte-level Fuzzing Test via Large Language Model • Paper • 2409.01944 • Published Sep 3 • 44