Smart Data Visualization for High-Throughput Data
How to visualize Trending Data?
Assume that the reaction pressure is a critical process parameter that you want to monitor closely. A common approach would be to create 64 scatter plots (one per reactor) showing the trend of reaction pressure versus time (see Figure 1). Reviewing 64 individual figures is a lengthy process that quickly becomes confusing. Alternatively, you could combine the data of all 64 reactors in a single scatter plot and use different colors (or markers) to distinguish them. In that case, however, the data density is so high that one can barely tell one reactor from another, and the plot gives little insight into what is happening with this critical process parameter, as the figure shows.
Figure 1 : Reaction Pressure as a function of time
A smarter way to visualize trending data
A smart figure allows both operators and scientists to quickly identify whether a process variable is within specifications or whether action is required; for example, a pressure drift could indicate plugging or bypassing of the catalytic bed. The smart plot we have chosen for this case is called a Cell plot. At Avantium, we generate this type of figure automatically in JMP from the trending data of our Flowrence software. A typical cell plot representing the reaction pressure as a function of time, for 64 reactors working in parallel, is shown in Figure 2.
In this case, the color represents a continuous scale, with red and blue marking the upper and lower specification limits, respectively. Looking at the graph, you can easily see that the pressure in all 64 reactors is within the specification limits. In addition, because the color of each reactor varies only slightly over time, one can conclude that the pressure in each reactor is stable. One can also quickly see that R21-R23 show a higher pressure trend than the other reactors. This fast analysis can be followed by a more detailed review of the process conditions of the reactors showing irregular behavior, for example with Shewhart control charts. Instead of drawing 64 plots, however, we need to focus on only 3 of them, corresponding to reactors R21 to R23.
Figure 2 : Cell Plot of Reaction Pressure as a function of time
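For readers who do not use JMP, a cell plot of this kind can be approximated in Matplotlib. The sketch below is only an illustration and not the Avantium workflow: the pressure values are synthetic, and the specification limits (58 and 62), the unit (bar) and the helper names make_demo_trends, high_running_reactors and draw_cell_plot are all assumptions made for this example.

```python
import random
from statistics import mean, median

# Hypothetical lower/upper specification limits, assumed to be in bar.
LSL, USL = 58.0, 62.0


def make_demo_trends(n_reactors=64, n_points=48, seed=0):
    """Synthetic pressure trends: one list of readings per reactor (R1..R64)."""
    rng = random.Random(seed)
    trends = {}
    for i in range(1, n_reactors + 1):
        # Assumption for the demo: R21-R23 run slightly higher than the rest.
        offset = 0.8 if 21 <= i <= 23 else 0.0
        trends[f"R{i}"] = [60.0 + offset + rng.gauss(0.0, 0.1) for _ in range(n_points)]
    return trends


def high_running_reactors(trends, margin=0.5):
    """Reactors whose mean pressure exceeds the median reactor mean by more than `margin`."""
    means = {name: mean(values) for name, values in trends.items()}
    reference = median(means.values())
    return [name for name, m in means.items() if m - reference > margin]


def draw_cell_plot(trends, lsl=LSL, usl=USL):
    """One row per reactor, one column per time point, color = pressure."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    names = list(trends)
    fig, ax = plt.subplots(figsize=(8, 10))
    image = ax.imshow(
        [trends[name] for name in names],
        aspect="auto",
        cmap="coolwarm",  # blue towards the lower limit, red towards the upper limit
        vmin=lsl,
        vmax=usl,
        interpolation="nearest",
    )
    ax.set_yticks(range(len(names)))
    ax.set_yticklabels(names, fontsize=5)
    ax.set_xlabel("Time point")
    ax.set_ylabel("Reactor")
    fig.colorbar(image, ax=ax, label="Reaction pressure (bar)")
    return fig


trends = make_demo_trends()
flagged = high_running_reactors(trends)
in_spec = all(LSL <= p <= USL for values in trends.values() for p in values)
```

Calling draw_cell_plot(trends) returns a Matplotlib figure that can be shown or saved. Pinning vmin and vmax to the specification limits, rather than to the data range, keeps the colors comparable from one run to the next, so a reactor that drifts towards a limit stands out in the same way every time.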
Visualize Catalyst Performance Data
In high-throughput catalyst testing programs, the target is usually to evaluate different catalyst formulations and identify the most promising materials, considering key performance indicators, catalyst synthesis price and/or catalyst synthesis recipes. Within Avantium, we have specialized catalyst-testing services to evaluate a broad range of catalysts in terms of conversion and yield to the desired product while keeping an eye on the final price. In these large screening campaigns, the priority is to quickly identify the catalysts with the highest yield. However, we observe an increasing desire to cluster the data by composition, which helps the customer to better understand the results.
Figure 3 : Catalyst Yield vs. Catalyst Price
A disadvantage of a conventional bar chart such as Figure 3 is that the x-axis is saturated because of the number of catalysts screened in this test, which makes the analysis cumbersome. Such a chart is especially ineffective for presenting data or results at management-level meetings. A smart figure would allow clear and fast interpretation of the data.
Ideally, everyone should be able to look at the plot and judge in an instant which catalysts are performing better, and answer questions like “Did I select a reasonable number of materials from each class?” and “What is the cost level of the catalyst?”. For this case, we have selected a Treemap as a smart plot visualization.
Figure 4 : Treemap plot of Catalyst Yield vs. Catalyst Price
It can easily be observed that, within these tests, the number of catalysts containing USY is similar to the combined number containing SAPO-34 or BETA, and that approximately a third of the samples tested contained ZSM-5 as the active zeolite material. Within each of these areas, the best-performing catalyst appears in the upper left corner, and the color indicates the normalized cost.
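A treemap along these lines can also be sketched without JMP. The example below uses a simple slice-and-dice layout written for this illustration; it is not the algorithm JMP uses, and it rests on several assumptions: the catalyst names, yields and costs are invented, each zeolite class gets a strip whose width is proportional to the number of catalysts in it, and within a strip the cells are stacked from the top in order of decreasing yield. The function names slice_and_dice and draw_treemap are likewise hypothetical.

```python
from collections import defaultdict

# Invented screening results: (catalyst, zeolite class, yield in %, normalized cost 0-1).
CATALYSTS = [
    ("Z-1", "ZSM-5", 48.0, 0.40), ("Z-2", "ZSM-5", 35.0, 0.25),
    ("Z-3", "ZSM-5", 41.0, 0.55), ("Z-4", "ZSM-5", 29.0, 0.20),
    ("U-1", "USY", 52.0, 0.80), ("U-2", "USY", 44.0, 0.60),
    ("U-3", "USY", 38.0, 0.35), ("U-4", "USY", 31.0, 0.30),
    ("S-1", "SAPO-34", 27.0, 0.70), ("S-2", "SAPO-34", 22.0, 0.50),
    ("B-1", "BETA", 33.0, 0.45), ("B-2", "BETA", 25.0, 0.15),
]


def slice_and_dice(catalysts):
    """Return {catalyst: (x, y, width, height, cost)} laid out in a unit square."""
    groups = defaultdict(list)
    for name, zeolite, yield_pct, cost in catalysts:
        groups[zeolite].append((name, yield_pct, cost))

    total = len(catalysts)
    cells, x = {}, 0.0
    # Largest classes first, so they appear on the left.
    for zeolite, members in sorted(groups.items(), key=lambda kv: -len(kv[1])):
        width = len(members) / total           # strip width ~ number of catalysts
        members.sort(key=lambda m: -m[1])      # best yield first, i.e. at the top
        yield_sum = sum(m[1] for m in members)
        top = 1.0
        for name, yield_pct, cost in members:
            height = yield_pct / yield_sum     # cell height ~ yield within the class
            cells[name] = (x, top - height, width, height, cost)
            top -= height
        x += width
    return cells


def draw_treemap(cells):
    """Draw the cells as rectangles, colored by normalized cost."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing
    from matplotlib.patches import Rectangle

    cmap = plt.get_cmap("viridis")
    fig, ax = plt.subplots()
    for name, (x, y, w, h, cost) in cells.items():
        ax.add_patch(Rectangle((x, y), w, h, facecolor=cmap(cost), edgecolor="white"))
        ax.text(x + w / 2, y + h / 2, name, ha="center", va="center", fontsize=7)
    ax.set_xlim(0, 1)
    ax.set_ylim(0, 1)
    ax.set_axis_off()
    return fig


cells = slice_and_dice(CATALYSTS)
```

With this layout, the share of each zeolite class can be read from the strip widths and the best catalyst of each class sits at the top of its strip. A squarified layout, as offered by dedicated treemap tools, would give cells with better aspect ratios at the cost of a more involved algorithm.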
General guidelines for a clear visualization
No matter which smart visualization you choose, the message should be conveyed in a simple and concise form to facilitate communication. Always keep in mind that advanced statistical and modeling tools are needed to draw quantitative conclusions and identify optimal values for key parameters. Among the many guidelines for clear visualization, this newsletter emphasizes those that we consider sine qua non:
1. Keep it simple
Whenever possible, make simple figures and avoid saturation with unnecessary information.
2. Know your audience and define the key message
Determine your audience and tailor the graphics to it based on your key message. Consider three main types of audience: managerial, technical or academic, and non-technical. They all have different needs.
3. Use colors effectively
Minimize the use of color whenever possible. If you have trouble selecting colors, remember the color wheel.
4. Use the correct tool
Plots need to be informative and engaging. The correct tool, e.g. Matplotlib, R or JMP, can facilitate the preparation of your story and the visual context around it.