Making Use of Process Data

Equipment and Processing Report, 09-21-2016, Volume 9, Issue 9

Manufacturing data collected in the process historian can be analyzed to better understand the process and use past behavior to predict future results.

With the growing prevalence of “smart,” connected machines and automated data collection, there is no shortage of process data. Analyzing this “big data” and using it to make decisions, however, can be a challenge. Pharmaceutical Technology spoke with Bert Baeck, CEO at TrendMiner, which provides data mining for process industries, about some of the challenges and best practices for using the process data that is collected in the process historian.

PharmTech: What are some of the challenges with getting information from the process historian?

Baeck (TrendMiner): Pharma is similar to other industrial processes in that the information in the process historian includes time-series data (e.g., temperature, flow, pressure, concentration, calculated aggregates, or derivative data/profiles). Context information, such as product or recipe names, process phases, and batch identification, adds value for the process engineer, but it is not always stored in the historian. With the right software, you can find specific batches, filter on products or phases, or overlay profiles for good and bad batches (complete or phase by phase). Typically, information such as laboratory measurements (i.e., quality data) is still stored in an isolated laboratory information management system (LIMS); however, it is good practice to integrate these data.
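As an illustration of that integration, the sketch below aligns hypothetical LIMS quality results, keyed by sample time, with historian time-series readings using pandas' merge_asof. All column names and values are invented for the example; this is a minimal sketch, not any vendor's actual integration.

```python
import pandas as pd

# Hypothetical historian time-series: one temperature reading every 10 minutes
historian = pd.DataFrame({
    "time": pd.date_range("2016-09-21 08:00", periods=6, freq="10min"),
    "temperature_C": [70.1, 70.4, 71.0, 71.3, 70.9, 70.6],
})

# Hypothetical LIMS results, logged by *sample* time (not analysis time)
lims = pd.DataFrame({
    "sample_time": pd.to_datetime(["2016-09-21 08:12", "2016-09-21 08:47"]),
    "assay_pct": [98.7, 99.1],
})

# Align each lab result with the nearest preceding process reading
merged = pd.merge_asof(
    lims.sort_values("sample_time"),
    historian.sort_values("time"),
    left_on="sample_time",
    right_on="time",
    direction="backward",
)
print(merged[["sample_time", "temperature_C", "assay_pct"]])
```

Logging LIMS results against the sample time (as recommended below) is what makes this time-based join meaningful: the quality value lines up with the process conditions at the moment the sample was taken.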

A technical constraint is that historians are not ‘read’ optimized; they are ‘write’ optimized, designed to store and compress incoming data efficiently rather than to retrieve it quickly. Extracting data from historians is therefore generally slow. For example, searching through 10 years of historical data could overload the system or take hours or days to compute.

Process engineers and other subject-matter experts must be able to search time-series data over a specific timeline and visualize all related plant events quickly and efficiently. This includes the time-series data generated by process control systems, laboratory systems, and other plant systems, as well as the annotations and observations made by operators and engineers. If a search takes too long, users typically abandon it, which hampers problem diagnosis and, even more, the prevention of abnormal situations. The challenge with historian time-series data is the lack of an intelligent mechanism to search through the data to answer day-to-day business questions (e.g., Have I seen similar behavior before? Have I seen similar batch patterns?) or to make annotations (e.g., process context/metadata) effectively. By combining structured time-series process data from the historian with data captured by operators and engineers, users can predict more precisely what is occurring, or what will occur, in continuous and batch industrial processes.


TrendMiner’s approach uses an on-premises, packaged virtual-server deployment that integrates with the local copy of the plant historian database archives and uses ‘pattern search-based discovery and predictive-style process analytics.’ It is also cloud-ready, with a more scalable architecture, and it targets the average user, who needs neither a big-data or modeling background nor the help of a data scientist. Searches provide intelligent ways to navigate through the historical data, solving day-to-day operational or asset-related questions, as well as diagnostic capabilities for explanatory power. Because the past is the best predictor of the future, this also provides predictive capabilities. Pattern recognition and machine-learning technology turn human intelligence into machine intelligence. Saved or learned patterns (e.g., an abnormal-situation signature or a golden batch) can be used to alert that a certain event is likely to happen.

PharmTech: Can you share some best practices or guidelines for setting up process data collection systems to optimize the ability to get information back?

Baeck (TrendMiner): Best practices include:

  • Store data with the timestamp of interest. For instance, LIMS data should be logged with the sample time as the primary timestamp, not the analysis time.

  • Think about which context information is useful and save it as digital tags. Have a plan for how this information should be saved to keep it consistent throughout the plant.

  • Think carefully about data compression. Save data at a rate that preserves the most information for the user. For example, one point every 10 minutes will probably not be enough for a thorough analysis.

  • Think about asset structure. How will you break down your process data structure? Which system should be the master for this asset structure?

  • Make sure there is a way to save certain calculations or patterns of certain profiles (e.g., golden batch profiles, model deviations).

  • Use capture capabilities that support event frames or bookmarks, whether created manually by users or generated automatically by third-party applications. These annotations are visible within the context of specific trends.

  • Integrate monitoring capabilities with predictive analytics and early-warning detection to give operators a live view of whether recent process changes match expected process behavior, with the opportunity to adjust settings proactively when they do not.
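The compression point above can be made concrete with a small sketch: a short process excursion that is clearly visible at one-minute resolution disappears when the data are compressed to one averaged point every 10 minutes. The data below are invented for illustration.

```python
import pandas as pd

# Hypothetical 1-minute pressure readings containing a short spike
t = pd.date_range("2016-09-21 08:00", periods=30, freq="1min")
pressure = pd.Series([1.0] * 30, index=t)
pressure.iloc[14] = 5.0  # a one-minute excursion

# Stored at full resolution, the spike is preserved
print(pressure.max())  # 5.0

# Averaged into 10-minute points, the spike is smeared out
# and no longer stands out for analysis
coarse = pressure.resample("10min").mean()
print(coarse.max())  # 1.4
```

Choosing a storage rate (and compression deadband) that preserves such excursions is exactly the trade-off the best practice asks engineers to think through.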