BSC AI Factory Talk: “Less is More: Accelerating AI with Advanced Data Strategies”

Artificial intelligence

Thursday, 17 July 2025, at 11:00 a.m.

In this talk, we will address three key questions:

How do we automatically determine what high-quality data across domains looks like?
How do we automatically and efficiently find or create the best high-quality data for training or improving behemoth models?
How can we lower overall compute and training costs by reducing massive datasets into smaller sets without losing, or even while gaining, performance?

The recent emergence of powerful foundation models has sparked a new wave of applications across multiple disciplines, from healthcare to biology. This has led to an insatiable demand for data across many disciplines. To keep up with the demand, data is aggressively consumed and labeled when abundant, and auto-labeled or generated by foundation models when scarce.

Typically, the focus remains on data with high-quality annotations or data from high-quality sensors or generation. However, this limited view of quality can introduce several biases, and it fails to reduce massive datasets to manageable sizes.

Speakers: Nadine Chang, senior research scientist in the Autonomous Vehicles Applied Research Group at NVIDIA, and Jose Alvarez, director of research at NVIDIA.
Host: Mariona Sanz, Head of Innovation and Business Development, BSC

FURTHER INFORMATION