Eva-Maria Hempe, NVIDIA, explains that accelerated computing, data governance, and “lab in the loop” are key to bio/pharma’s AI strategy for transforming drug discovery and R&D.
In Part 1 of this three-part interview on the presentation “The State of AI in Next-Generation R&D” at CPHI Europe 2025, held Oct. 28–30 in Frankfurt, Germany, Eva-Maria Hempe, Head of Healthcare & Life Sciences, NVIDIA, discusses the pivotal role of artificial intelligence (AI) and accelerated computing in transforming the bio/pharmaceutical ecosystem.
Both a core challenge and a competitive strength in this sector stem from the prevalence of siloed and proprietary data. Leading organizations are tackling this by adopting advanced AI platforms designed to integrate, process, validate, and harmonize these complex, disparate data sources, Hempe explains. She adds that specialized tools operate atop these platforms, including Parabricks for genome sequencing, MONAI for medical imaging, and BioNeMo for biological language models, all aimed at accelerating analysis and extracting rich insights from often unstructured data.
According to Hempe, success in AI hinges directly on data quality and integrity, adhering to the principle that "garbage in, garbage out is just a matter of fact." This necessity drives the implementation of robust data governance frameworks, such as ALCOA (attributable, legible, contemporaneous, original, accurate), alongside rigorous automated validation routines and training, she says. These AI platforms also support safe model development practices by allowing data provenance to be traced.
Hempe notes that the single greatest operational opportunity identified is what is termed the “lab in the loop”—the seamless integration of the wet lab (experimental work) and the dry lab (computational analysis). By automating wet lab processes, high-quality, repeatable data is generated, complete with necessary metadata. This reliable data then feeds computational models, which in turn predict the optimal next set of experiments to run, thereby accelerating the entire research workflow, she states. This systematic, data-driven exploration is essential given the immense size of the chemical space; speaking to this scale, Hempe notes, "This is more than sand grains in the universe." Furthermore, foundation models, such as Evo 2, are being deployed to deepen the understanding of genetic mutations at scale, predict their functional impact, enhance candidate selection, and encourage crucial cross-disciplinary collaboration.
Check out Part 2 and Part 3 of this interview and access all our CPHI Europe coverage!
*Editor’s Note: This transcript is a direct, unedited rendering of the original audio/video content. It may contain errors, informal language, or omissions as spoken in the original recording.
I'm Eva-Maria Hempe and I lead Healthcare and Life Sciences for NVIDIA in Europe, the Middle East, and Africa. What this means is we're working with the whole ecosystem, so both healthcare and life sciences. I sometimes joke I do everything from genomics through drug discovery and surgical robots, all the way to helping physicians and nurses do their jobs better. But that's a bit the span we're covering. We're working with companies of all different sizes. We're working with both established companies as well as startups and really helping them to use AI to the fullest, avoid reinventing the wheel, and really tap into the potential of accelerated computing.
One of the big strengths, or big competitive advantages actually, in this field is that there is, you can call it siloed, you can call it proprietary data, but of course it is something that has to be managed. And in general, the way we see it is that leading organizations have an advanced AI platform, which really allows them to integrate the data, to process the data, and to validate the data.
And then on top of such a platform, we're offering specialized tools like, for example, Parabricks for genome sequencing, or MONAI for medical imaging, or BioNeMo for biological language models, as well as some kits for agentic AI, that then enable accelerating data analysis and really reducing the manual input, and you really get these rich insights from these complex data sets. They're just a matter of fact; this is nothing we're going to be changing, in a way. That is the richness of this field: it is the unstructured data, it is the multitude of data. And so these platforms are then really supporting centralization, without really having to repackage everything, and the harmonization of those disparate data sources.
The question of data quality, and integrity, is actually really important. So, the old proverb of garbage in, garbage out is just a matter of fact. So, it's really about enforcing compliance across the data lifecycle and having a really robust data governance framework. So we're seeing concepts like ALCOA (attributable, legible, contemporaneous, original, accurate) or other frameworks being rolled out. Then it's really about having the right processes on audits, training, as well as automated validation routines. And we can help with that by having, again, a platform that allows us to trace data provenance and to really have safe model development practices that you can rely on as a basis for the more behavioral aspects, which I just outlined.
I think the greatest opportunity really is what we call the lab in the loop, or bridging the wet lab and the dry lab, and that has kind of AI challenges on both sides. In the wet lab, it's really about automation, so you want to have data which is, as you said before, high quality. So you have to have the right metadata, and you want to have a really repeatable loop, a really repeatable workflow here. So we are seeing companies working in that space who are automating the entire wet lab, like all the processes of transferring plates between different machines, and capturing all that data in a harmonized form. And that data then provides the basis for better predictions of what experiments we should be running, and then you're basically closing the loop. So you're having the wet lab, which is the validation, which gets you the real data in good quality thanks to automation, and that data quality then feeds models which can predict which experiments to do next. I think once you get this flywheel to go, it can really accelerate things, because the chemical space is just so huge. I mean, this is more than sand grains in the universe. And so far we've just been looking at where we felt comfortable, in a way, looking at where we've always been looking. And with this more data-driven approach, we can really explore that space in a much more structured and systematic way.
And I think there's another opportunity, and these are foundation models like Evo 2: to really dive deeper into genetic mutations at scale and understand and predict their functional impact. And then, again, this allows better candidate selection and, again, drives cross-disciplinary collaboration.