Q&A with Yves de Montcheuil of Talend

Published in: Pharmaceutical Technology, Volume 36, Issue 11 (11-02-2012)

A Q&A with Yves de Montcheuil, vice-president of marketing at Talend, a provider of open-source integration software.

Q. How can big data be used to increase the efficiency of drug development?

The amount of data in the pharmaceutical industry is exploding, and companies are experimenting with large data sets to ascertain their potential value in clinical trials and personalized medicine. The push for a complete change in the way drugs are developed has coincided with the ability to generate more and more data. The analysis of large data sets is now a must-have competitive weapon because it provides clearer, deeper visibility into the research and development phases, resulting in faster time to market and savings in resources and costs.

With the introduction of advanced technological tools and resources, information from around the globe can be aggregated and analyzed, making it more useful to researchers. A recent McKinsey report estimates that $300 billion in annual healthcare savings (8% of current US total expenditure) could be found by using big data more efficiently (1). While complex data have been present in clinical trials since their inception, the ability to aggregate that information is relatively new, and advancements in the area now allow research professionals to be more accurate and efficient.

Each phase of the pharmaceutical process can be broken down by understanding the patterns, commonalities, and correlations found among patients. Doctors and specialists are trained to diagnose, treat, research, and report diseases and ailments. To do this more effectively, they rely heavily on the work of data scientists. Whether they realize it or not, physicians and clinicians are accessing patient and study information through customized front-facing and back-facing portals. Using these portals, physicians collate detailed information from a variety of sources and mine the combination of personal electronic health records, study data, and specialized test results from X-rays, CAT scans, and blood work for patterns.

The University for Health Sciences, Medical Informatics and Technology (UMIT), based in Hall, Austria, demonstrated how big data can be used in the pharmaceutical industry. As a key participant in the IMGuS project, a life-science data warehouse system supporting systems biology in prostate cancer, UMIT uses big data to identify molecular signatures that indicate which patients are good candidates for prostate cancer treatment. In coordination with five other research groups located in Germany and Austria, UMIT manages the technical infrastructure and the data warehouse part of the project. Patient samples come from a bank at the University of Innsbruck's Clinic of Urology. The established technology platforms of the different partners are used to generate complementary genomic, proteomic, and metabolomic data using samples from healthy controls and from low-risk and high-risk prostate cancer patients. The results for these groups are analyzed using statistical and data mining methods to determine new therapy and prediction approaches. The data are then integrated and stored in a clinical data warehouse.

Administrative data are also loaded at this stage, including patient demographics, information about the biological source a given sample comes from (e.g., tissue or serum), and information on the data source where the information is stored. The frequent refresh of the data warehouse (performed nightly) ensures that researchers can use ad-hoc query and data mining tools and apply advanced statistical models to extract data relevant to their research. By looking at qualifiers such as gender, heredity, ethnicity, and age, clinicians were able to mine the data and identify key patterns among people who had similar disease states.

Big data is entering step one of the drug development process as research becomes increasingly computer-driven rather than test-tube-driven. The more data researchers are able to analyze, the better their chances of detecting patterns that can lead to fewer wasteful and often painful procedures and tests, and of finding new causes, treatments, and even cures for diseases.


1. J. Manyika et al., "Big Data: The Next Frontier for Innovation, Competition, and Productivity" (McKinsey & Co., May 2011).