Sánchez García, Pablo (Universitat Politècnica de València, 2024-09-11)
[EN] AI systems are usually evaluated on a variety of benchmarks to determine their
performance on specific tasks, using a single metric that gives a simplistic picture
of their capabilities. However, this procedure ...