AI & ML interests

Benchmarking

We draw on https://arxiv.org/abs/2412.09385 to implement an LLM peer review process, with the aim of producing a new benchmark.

models 0

None public yet

datasets 0

None public yet