The devil’s in the data – clinical data integration

8 July 2016



As clinical trials become more complex, there is a growing need to integrate data from different sources. Clinical Trials Insight speaks to Nicholas Pemble, science director and head of ancillary data handling at Janssen Pharmaceuticals, about the benefits and challenges of clinical data integration, and what it means for the future.


The pharmaceutical industry of the early 1990s was very different from what it is today. This is according to Nicholas Pemble, science director and head of ancillary data handling at Janssen Pharmaceuticals, who began his career in 1991. “I was very excited to have a 256 PC, and even more excited when it got upgraded to a Pentium,” he says. “We didn’t have next-generation sequencing, and Sanger sequencing was so expensive it wasn’t really done on clinical trials.”

Today, things in the wider pharmaceutical industry could not be more different. Clinical trials are run across the world, and new technologies from mobile apps to different kinds of monitoring and diagnostic devices offer unprecedented benefits to medical professionals and patients. For those like Pemble who work in clinical data management, these technological shifts have had a profound impact.

“It has become a lot more technical than it used to be at the beginning when I started,” he says. “With more research and science, there is a lot more valuable information for us to collect. In terms of data, we used to have some pharmacokinetics and some safety lab data, perhaps around two or three data streams. Now, on a recent trial, we had 12 different data streams, and one data-capture system. So, the data streams are increasing in volume and complexity.”

As a clinical data manager, Pemble has seen industry standards develop significantly as technological change has taken place. “When I was working for a contract research organisation, we would have a different set of requirements for how to store and record data for every sponsor that we were working for,” he says. “So, sometimes gender would be called gender, sometimes gen, sometimes sex, and the content of that variable was perhaps one or two, male or female, M or F. Now, we have a whole set of industry standards, which means I can take data from one company, pick it up, look at it and understand exactly what it is. That for me has been a massive benefit to clinical trials data management.”
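
The kind of harmonisation Pemble describes can be illustrated with a minimal Python sketch. It assumes a record arrives as a simple dictionary, and the alias table, the target name `SEX`, and the reading of 1 as male and 2 as female are all hypothetical choices made for illustration; a real sponsor's codelist could differ.

```python
# Hypothetical lookup tables: the names and codes below are illustrative
# assumptions, not taken from any sponsor's specification.
VARIABLE_ALIASES = {"gender": "SEX", "gen": "SEX", "sex": "SEX"}

# Assumes 1 = male and 2 = female; a real codelist could be the reverse.
VALUE_CODES = {
    "1": "M", "2": "F",
    "m": "M", "f": "F",
    "male": "M", "female": "F",
}


def standardise_record(record):
    """Return a copy of record with the sex/gender variable renamed and recoded."""
    out = {}
    for name, value in record.items():
        target = VARIABLE_ALIASES.get(name.strip().lower())
        if target is None:
            # Not a sex/gender variable: pass it through unchanged.
            out[name] = value
            continue
        code = VALUE_CODES.get(str(value).strip().lower())
        if code is None:
            # Fail loudly rather than guess at an unknown code.
            raise ValueError(f"unrecognised value {value!r} for {name!r}")
        out[target] = code
    return out
```

Raising an error on an unknown code, rather than passing it through, is a deliberate choice here: a silent guess would undermine the traceability that standards are meant to provide.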

The importance of integration

On top of better industry standards, Pemble emphasises the importance of integration in managing different streams of clinical data. “It’s about enhanced functionality,” he says. “If you integrate an interactive voice-response system with an electronic data-capture system, for example, you can benefit from systems integration. You can limit your data entry, have control over that data and push it between two systems to get enhanced functionality.”

The benefits of this for pharmaceutical companies and patients are obvious: with data consolidated into a single back end, medical monitors get a genuinely holistic view of the trial’s subject data. “That gives them a much better opportunity to pick up safety signals, which is important to the companies and the subjects themselves,” Pemble says. “It’s also important because our staff don’t like to see the data in bits and pieces and go to different systems for lab results, genotypes and those kind of things. They want to get an overall view of the subject. For integrated systems, data quality is also higher because you are using the single source, and you are able to work much more efficiently. You don’t have to enter which patients had which visits at which clinical sites into a separate database to make a payment. You can just pull that from one system and push it to another. Obviously, that’s a lot more cost effective and a lot more efficient.”

Using consolidated data can also help adaptive trial designs, an emerging trend in the pharmaceutical sector where clinical data is used to adjust different aspects of a trial, such as sample size, dosage and subject population. “We need all of this data consolidated to be able to make the best data-driven decisions for adapted trial design, which is in the interests of all parties,” Pemble says. “We have a lot of trials where we have a planned interim analysis; and the sooner we can get the data together and consolidated the sooner we can perform that analysis and the sooner we can make a decision.”

Risk-based monitoring is another evolving area in clinical trials that Pemble says could benefit from data integration. “Instead of going out and doing 100% source documentation verification at every single site, we have a monitoring approach, which is driven by data,” he says. “This means you can work out how far away from the norm a site is in terms of the number of adverse events and the queries being generated on unclear data. Using this, and many other factors, we are able to use consolidated data to drive a risk-based approach to clinical monitoring. The cost benefit of not having to do 100% source document verification at every site is significant.”
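
One simple way to express "how far away from the norm a site is" is a z-score across sites. The Python sketch below is only an illustration under stated assumptions: it uses a single metric (adverse events per subject), a hypothetical function name `flag_outlier_sites`, and an arbitrary threshold of two standard deviations. Pemble mentions many other factors, and nothing here reflects Janssen's actual method.

```python
from statistics import mean, pstdev


def flag_outlier_sites(rates, threshold=2.0):
    """Return {site: z_score} for sites far from the cross-site average.

    rates maps a site identifier to its adverse events per subject.
    The threshold of 2.0 is an illustrative assumption, not a standard.
    """
    values = list(rates.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation across sites
    if sigma == 0:
        # Every site reports the same rate, so none stands out.
        return {}
    flagged = {}
    for site, rate in rates.items():
        z = (rate - mu) / sigma
        if abs(z) >= threshold:
            flagged[site] = z
    return flagged
```

The test is two-sided on purpose: a site reporting unusually few adverse events may deserve a closer look just as much as one reporting unusually many.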

Of course, while the pharmaceutical industry clearly does see the value of data integration, challenges remain. When it comes to systems integration, getting things up and running can be complicated and costly, according to Pemble. “Individual systems need to have APIs (application programming interfaces) and the functionality to allow integration,” he says. “It needs to be set up, tested and monitored to make sure it works throughout the course of a trial.”

When it comes to data integration, the challenge for Pemble is getting the right data available and integrated in real time. “I think this is probably an area where the whole industry is struggling right now,” he says. “There is a lot of software out there that can work off of a consolidated database and give wonderful charts and facts and figures, which are useful in a clinical trials process, but a lot of companies are struggling to get that data from the disparate data sources in a standardised way into a consolidated standardised format.”

So, what can be done? An obvious improvement is having a dedicated team capable of understanding the scientific and data components of a clinical trial. “Within companies like ours, we have specialist scientists who are very aware of the science, but perhaps less aware of the data submission requirements,” Pemble says. “So, an additional thing to help overcome these challenges is to have a group of people with some specialist knowledge of both the data and the science. That way, they can help both parties understand what each other actually needs.”

Better technology

Technology is another area that needs improvement. Pemble says: “We need to work with the providers of these data streams to see what we can do on the technological side to get access to the data. Are we able to pull from their system? Are they able to push it on a pre-defined daily basis to ours? If you are working with a large central lab, which has a big employee base, an IT group and a large customer base, then it’s easier to get that kind of technological support. But when you are working with a specialised genetic lab in a university hospital, and they only have an Excel spreadsheet, the challenge is obviously quite different.”

Finally, Pemble points to the structure of the data itself. While the industry has developed better standards for storing and recording data over the past few years, he says more needs to be done. “It’s about agreeing on a standardised structure that we can work with for all of our data providers within these electronic data streams. That way, we know what to expect and how to turn data into a format that is suitable for submission, while keeping the traceability back from the original data. The industry has built standards around how we present data to FDA, but we don’t really have industry-wide standards on data transfer. We need transfer agreements and agreements with vendors on structure and format.”
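
An agreed transfer structure is something a receiving system can check automatically. The Python sketch below assumes the vendor sends a CSV file and that the transfer agreement lists the columns in `EXPECTED_COLUMNS`; those column names and the `check_transfer` function are hypothetical, invented for illustration rather than drawn from any industry standard.

```python
import csv
import io

# Hypothetical column list standing in for a real transfer agreement.
EXPECTED_COLUMNS = ["STUDYID", "SUBJID", "VISIT", "TEST", "RESULT", "UNIT"]
# Columns assumed to be mandatory on every row.
REQUIRED_VALUES = ("STUDYID", "SUBJID")


def check_transfer(text, expected=EXPECTED_COLUMNS):
    """Return a list of human-readable problems found in a CSV transfer."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    problems = []

    missing = [col for col in expected if col not in header]
    extra = [col for col in header if col not in expected]
    if missing:
        problems.append(f"missing columns: {', '.join(missing)}")
    if extra:
        problems.append(f"unexpected columns: {', '.join(extra)}")

    # Line numbers assume one record per line, with the header on line 1.
    for line_no, row in enumerate(reader, start=2):
        for col in REQUIRED_VALUES:
            if col in header and not (row[col] or "").strip():
                problems.append(f"line {line_no}: empty {col}")
    return problems
```

Collecting every problem instead of stopping at the first makes the result easier to send back to a vendor in one round, which matters most with the smaller providers Pemble mentions.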

If all of this sounds like hard work, the task is unfortunately unlikely to get any easier. Pemble describes increasingly diverse kinds of “wacky information” being collected, from patients’ sleeping habits to data on how trial medication can affect driving skills, and data types will continue to increase in volume and complexity.

“We’re already feeling the impact of multiple vendors and multiple data types,” Pemble says, “but this is an area that will impact us much more in the future. We need to get the focus on this now, and we need to make sure we keep bridging the gap between the needs of the science and the needs of the data.”

Nicholas Pemble is scientific director and head of ancillary data handling at Janssen Pharmaceuticals. He has worked in clinical data management since 1991 at a number of data management clinical research organisations and pharmaceutical companies.

