The effect of unmeasured root causes in problem-solving

“What if I am not measuring all the potential root causes?” is a question we frequently encounter from industry experts. While it’s important to have comprehensive data for problem-solving, capturing every root cause is infeasible. This article illustrates that robust root cause analysis algorithms can uncover significant production issues even in the presence of other, unexplained effects.

Why the real world differs from theory

Any root cause analysis starts with aggregating the right set of data. In an ideal world, we would measure all variables that influence an outcome variable of interest (e.g., quality). Such holistic data coverage would enable an AI-based analysis to identify all drivers of production issues. In reality, however, some root causes are not measurable or are simply not reflected in the data. For instance, consider a missing temperature sensor in an injection molding machine or unmeasured sources of particles in a semiconductor fabrication process. In such scenarios, even the most advanced AI algorithms cannot directly reveal these unmeasured root causes.

There is a common misconception that for AI-based root cause analysis to be effective, the data must be perfect. This is not the case. While it is true that unmeasured variables limit the ability to make process improvements, useful insights can still be gained from the data that most manufacturers collect today. The presence of unexplained variation does not preclude the value of such analyses. Imperfect models can still enhance process understanding. In this article, we will explore an example demonstrating how, despite the absence of some sensors, robust algorithms are capable of reliably identifying key root causes amidst unexplained variation.

Simulated production setup

We introduce a practical case for our root cause analysis by simulating data for five sensor measurements and a quality metric defined as yield. Our simulation aims to uncover the root causes of yield losses using data from these sensor measurements across 10,000 production batches. The relationship between the sensor measurements and the yield is captured by a ground-truth formula.

Here’s a breakdown of this formula (a simulation sketch consistent with it follows the list below):

  • The ideal value of Sensor 1 measurement is 100. Deviations from this value reduce the yield.
  • The ideal value of Sensor 2 measurement is 20. Deviations from this value reduce the yield.
  • The ideal value of Sensor 3 measurement is 50. Deviations from this value reduce the yield.
  • Sensor 4 measurement and Sensor 5 measurement have no impact on the yield.
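
The exact ground-truth formula is not reproduced here, but a minimal simulation sketch consistent with the breakdown above could look as follows. The quadratic deviation penalties, penalty weights, noise term, operating-point spreads, and column names are illustrative assumptions, not the precise function used in the article:

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(42)
    n = 10_000  # number of production batches

    # Simulate five sensor measurements around plausible operating points.
    df = pd.DataFrame({
        "sensor_1": rng.normal(100, 5, n),
        "sensor_2": rng.normal(20, 2, n),
        "sensor_3": rng.normal(50, 4, n),
        "sensor_4": rng.normal(10, 1, n),  # no effect on yield
        "sensor_5": rng.normal(75, 6, n),  # no effect on yield
    })

    # Yield drops with squared deviations from the ideal values 100, 20, and 50.
    df["yield"] = (
        100
        - 0.05 * (df["sensor_1"] - 100) ** 2
        - 0.30 * (df["sensor_2"] - 20) ** 2
        - 0.08 * (df["sensor_3"] - 50) ** 2
        + rng.normal(0, 1, n)
    ).clip(0, 100)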

The figure below displays the distributions of the five sensor measurements and the production yield. Our goal is to identify sensor measurements that cause yield variation using the EthonAI Analyst software. The EthonAI Analyst employs causal algorithms to pinpoint the root causes behind production issues. Importantly, we approach this analysis as if the ground-truth function linking sensor measurements to yield were unknown.

In the following, we will systematically omit sensor measurements from our dataset to observe any changes in root cause analysis outcomes. This approach tests the robustness of our analysis, ensuring it can still accurately identify all measured effects. Despite the inability to account for unmeasured sensors, we demonstrate that even models with incomplete data can significantly improve our understanding of the process.

Numerical experiments for unmeasured root causes

In the first scenario, we investigate the situation where all sensors are operational. Here, the EthonAI Analyst should be able to detect all root causes accurately, as all information is contained in the dataset. Upon analyzing the data, the EthonAI Analyst presents a ranking of sensor measurements based on their impact. The first three sensor measurements are correctly identified as root causes, whereas Sensor 4 measurement and Sensor 5 measurement receive no attribution in the analysis. The resulting root cause model is therefore an accurate approximation of the ground-truth relationships.

In the next scenario, we remove Sensor 1 from our dataset without changing the rest of the data. The effect of Sensor 1 measurement therefore shows up as unexplained variation in the yield. However, a good root cause analysis should still detect the measurements of Sensor 2 and Sensor 3 as root causes of yield losses. As the root cause ranking below shows, the EthonAI Analyst still gives the same weight to Sensor 2 measurement and Sensor 3 measurement. The magnitude of the effect is also close to the previous root cause model, where the entire variation in the yield could be explained.

In the final scenario, we remove both Sensor 1 and Sensor 2 from our dataset without changing the rest of the data. Now two out of the three root causes cannot be explained, which leaves a large portion of the variation unexplained. We analyze the data with the EthonAI Analyst and still obtain the expected results. In particular, Sensor 3 measurement is detected as a root cause, and its magnitude is comparable to that in the root cause model where the entire variation could be explained.
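
For readers who want to reproduce the spirit of these three scenarios without the EthonAI Analyst, the sketch below uses a generic gradient boosting model and permutation importance from scikit-learn as a stand-in attribution method. It reuses the simulated DataFrame df from the earlier sketch; the surrogate model and the importance measure are illustrative assumptions and not the causal algorithms of the EthonAI Analyst:

    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.inspection import permutation_importance

    def rank_root_causes(data, dropped=()):
        """Fit a surrogate model on the measured sensors and rank them by importance."""
        features = [c for c in data.columns if c.startswith("sensor_") and c not in dropped]
        X, y = data[features], data["yield"]
        model = GradientBoostingRegressor(random_state=0).fit(X, y)
        result = permutation_importance(model, X, y, n_repeats=5, random_state=0)
        return sorted(zip(features, result.importances_mean), key=lambda pair: -pair[1])

    print(rank_root_causes(df))                                    # all sensors measured
    print(rank_root_causes(df, dropped={"sensor_1"}))              # Sensor 1 unmeasured
    print(rank_root_causes(df, dropped={"sensor_1", "sensor_2"}))  # Sensors 1 and 2 unmeasured

In a setup like this, the importance attributed to the remaining sensors typically stays roughly stable when an independent sensor is dropped, mirroring the robustness discussed above.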

Conclusion

This article has demonstrated that comprehensive data collection is crucial for effective root cause analysis, but it’s not necessary to measure every variable to begin the process. Robust algorithms can uncover significant production issues even when faced with incomplete data. The real world often presents challenges where certain root causes remain unmeasurable or unaccounted for, such as missing sensors. Despite these limitations, AI-based analyses can still provide valuable insights, which enhances process understanding and facilitates KPI improvements.

Through numerical experiments, we have illustrated the effectiveness of the EthonAI Analyst software in identifying root causes, even when sensor data is systematically removed. Our simulations revealed that the EthonAI Analyst accurately identified key root causes in scenarios where all sensors were operational, as well as in scenarios where sensors were deliberately omitted. Importantly, the Analyst’s ability to maintain accurate root cause models, even with incomplete data, underscores its reliability in real-world production settings.

In our experience, a significant portion of problems can be addressed by the existing data manufacturers collect today. Initiating root cause analysis early not only aids in problem-solving but also guides decisions regarding sensor deployment. Often, unmeasured relationships can be approximated using proxies (e.g., machine IDs). For example, adding routing information (i.e., how individual units flow through production) can already point process experts to the sources of problems (e.g., towards suspicious machines). Our advice is clear: don’t wait for perfect data before embarking on data-driven analysis. Start with what you have, and progressively enhance data coverage and quality to drive continuous improvement in your manufacturing processes.

Deploying a Manufacturing Analytics System: On-premises vs. cloud-based solutions

A Manufacturing Analytics System (MAS) integrates across data sources and provides valuable insights into production processes. As companies evaluate their options, a key decision emerges: should they deploy the MAS onto their own premises, or opt for a cloud-based Software as a Service (SaaS) solution?

This article discusses the merits of each approach to help businesses make an informed decision. It focuses on five major discussion points: data security, scalability, maintenance, cost effectiveness, and support.

Data Security and Compliance

On-Premises: Enhanced Control and Security

The primary advantage of on-premises deployments lies in the enhanced control and security they offer. Companies with highly sensitive data often prefer on-premises solutions due to their stringent security requirements. It can be easier to conform to stringent or inflexible policies by hosting the MAS internally. This setup allows for a more hands-on approach to data management, ensuring compliance with standards like GDPR, HIPAA, NIST, or other industry-specific regulations.

Cloud-Based Solutions: Robust, Standardized Security

Cloud-based MAS solutions have often been perceived as less secure, and some companies generally distrust the cloud. However, especially in recent years, cloud offerings have evolved significantly. Reputable cloud providers employ robust security measures, including advanced encryption, regular security audits, and compliance with various international standards. They have the resources and expertise to implement and maintain higher levels of security than individual organizations can achieve on their own. For businesses without the capacity or desire to manage complex security infrastructure, a cloud-based MAS offers a secure, compliant, and hassle-free alternative.

Scalability on Demand

On-Premises: Tailored to Specific Needs

An on-premises MAS deployment allows for extensive customization. Businesses can tailor the system to their specific IT and OT landscape, including guaranteed real-time responses. This capability is particularly beneficial for companies requiring deep integration with legacy systems and factory equipment. On the other hand, scaling on-premises solutions typically requires significant investment in hardware and infrastructure, as well as the technical expertise to manage these expansions.

Cloud-Based Solutions: Easy Scalability and Flexibility

Cloud-based MAS platforms shine in scalability. They allow businesses to scale their operations up or down with ease, without the need to invest in physical infrastructure. This scalability makes cloud solutions ideal for businesses experiencing rapid growth or fluctuating demands. Furthermore, cloud platforms are continually updated with the latest features and capabilities, ensuring businesses always have access to the most advanced tools without additional investment or effort in upgrading systems. A potential downside is that ultimate control of the deployment lies with the cloud provider, which can be a hurdle for highly regulated industries.

Maintenance and Updates

On-Premises: Hands-On, Resource-Intensive Maintenance

Maintaining an on-premises MAS requires dedicated IT personnel to manage hardware, perform regular software updates, and troubleshoot issues. This hands-on approach offers complete control over the maintenance schedule and system changes, but it can be resource-intensive. Companies that already have specialized IT teams due to the nature of their operations may find this approach a natural fit.

Cloud-Based Solutions: Hassle-Free, Automatic Updates

Cloud-based solutions significantly reduce the burden of maintenance. The service provider typically manages all aspects of system maintenance, including regular updates, security patches, and technical support. Automatic updates ensure that the system is always running the latest software version, providing access to new features and improvements without additional effort or cost. This allows businesses to focus on their core operations, without the need to allocate and manage resources for system maintenance.

Cost Effectiveness

On-Premises: Higher Initial Investment but Predictable Long-Term Costs

Deploying any system on-premises typically involves a higher initial capital expenditure, including costs for hardware, software licensing, and installation. Over the long term, these costs can be more predictable, or at least there are no cloud-subscription fees to factor in. For organizations with the necessary infrastructure already in place, this model can be cost-effective, particularly when considering the longevity and stability of the investment.

Cloud-Based Solutions: Lower Upfront Costs with Ongoing Expenses

Cloud-based MAS solutions offer lower initial costs and much quicker setup compared to on-premises installations. Businesses can avoid significant expenses on hardware and infrastructure. This subscription model converts upfront investments into ongoing operational expenses. In addition to the ease of setup, this can be more cost-effective in the short term. However, for businesses with long-term, predictable usage patterns, it is important to consider the cumulative costs over an extended period.

Support

On-Premises: Customized and Direct Control

This model of deployment demands a significant commitment of internal resources for maintenance and troubleshooting, necessitating dedicated, skilled IT personnel. While on-prem provides an unmatched level of control and customization, as discussed earlier in this post, the reliance on in-house capabilities for supporting the MAS can be a considerable burden on manufacturing customers.

Cloud-Based Solutions: Broad, Expert Support with 24/7 Availability

Cloud-based MAS solutions boast a scalable, expert support structure, alleviating the need for an in-house IT team to manage the MAS deployment. This is particularly important for operations spread across multiple locations or time zones. Automatic updates and maintenance conducted by the provider ensure the system remains up-to-date without any additional effort from the customer side. Furthermore, troubleshooting is accelerated in a cloud-based system because the infrastructure is standardized and uniform. This consistency reduces complexity and variability, which significantly improves the efficiency and speed of support services.

Conclusion

The choice between deploying a MAS on-premises or in the cloud depends on various factors including data security needs, customization requirements, budget constraints, network reliability, and maintenance capabilities. Each option has its merits, and the decision should align with the specific operational, financial, and strategic objectives of the organization. At EthonAI, we offer both options to meet our customers’ needs effectively.

How modern data analytics enables better decision-making

This is a reproduction of the article “Please explain it to me! How modern data analytics enables better decisions,” which we wrote for the United Nations Conference on Trade and Development (UNCTAD).

Why AI’s potential is still underutilized in decision-making

British industrialist Charles Babbage (1791-1871) once said, “Errors using inadequate data are much less than those using no data at all.” Three industrial revolutions later, it’s surprising how often decisions are still made on gut feeling without data. But it doesn’t have to be like that. The distinguishing factor of the ongoing fourth industrial revolution is the unparalleled access to and connection of data. However, having data is one thing; another is to make good use of them. That’s where AI comes in.

Management and policymaking are about decision-making, which is best when grounded in facts. Facts, on the other hand, are verified truths derived from analyzing and interpreting data. The challenge then is to collect and present data in a form that can be turned into information and actionable knowledge. Luckily, rapid developments in computer science and information technologies give decision-makers more, faster, and better access to data. In addition, AI can help decision-makers establish the needed facts by generating data-driven insights that have so far been inaccessible to humans. But, as this article shows, it isn’t a silver bullet. A special type of AI is needed.

With the recent rise of AI models, there have been impressive developments in both creative content generation and automation. Yet, when it comes to decision-making, AI still faces two critical challenges. First, the complexity and opacity often found in AI models can deter trust and adoption. Many AI models operate in a “black-box” fashion, where users cannot comprehend how a suggestion has been made. When domain experts are unable to validate the AI’s outputs against their own knowledge, they tend to distrust the AI output. Second, AI systems are currently limited in their ability to perform causal reasoning, which is a critical element in any decision-making process. Knowing that an event relates to another event is interesting, but knowing what causes the events to happen is a game-changer.

Hands-on experience from the industry

The key to addressing AI’s two drawbacks is to design systems that are explainable and can be augmented with domain knowledge. To give a concrete example, consider the following. We conducted a field experiment with Siemens, where we observed factory workers engaged in a visual quality inspection task for electronic products. Participants were divided into two groups: one aided by a “black-box” AI and the other by an AI capable of providing explanations for its recommendations. The group using the explainable AI significantly outperformed the factory workers who got recommendations from the “black-box” AI. Even more interestingly, users of the explainable AI system knew better when to trust the AI and when to rely on their own domain expertise, thereby outperforming the AI system alone. Hence, when humans work together with AI, the results are superior to letting the AI make decisions alone!

Explainability is not only helpful for creating trust among decision-makers. It can also be leveraged to combine the complementary strengths of humans and AI. AI can sift through vast amounts of data, whereas humans can complement this with a physical understanding of the process to establish cause-and-effect relationships. Consider, for example, our research in a semiconductor fabrication facility, where we provided process experts with explainable AI tools to identify the root causes of quality issues. While the AI was able to reveal complex correlations between various production factors and the quality of the outcomes, it was the human experts who translated these insights into actionable improvements. By cross-referencing the AI’s explanations with their own domain knowledge, experts were able to design targeted experiments to confirm the underlying causes of quality losses. The result? Quality losses plummeted by over 50%. This case emphasizes the indispensable role of human expertise in interpreting data and applying it within the context of established cause-and-effect relationships.

Conclusion

The key message is that data is proliferating and AI is here to help, but without the human in the loop, don’t expect better decision-making. We need to build AI systems and tools that support the human decision-maker in getting to facts faster. For this, the recent developments in Explainable AI and Causal AI offer a promising path forward. Such tools allow users to reason about the inner workings of AI systems and incorporate their own domain knowledge when judging the AI output. They help explain the causal relations and patterns that the AI picks up from the data, ultimately enabling decision-makers to make better decisions.

Process Mining: A new take on material flow analysis in manufacturing

Traditional methods for throughput improvement provide only a static view of manufacturing processes. This article explores process mining as a dynamic alternative. We analyze its application in bottleneck analysis, inventory analysis, and process variance analysis within manufacturing.

Why new process mapping methods are needed

Increasing the rate at which parts flow through a factory is a key objective in manufacturing. Better material flow can be achieved by eliminating three key obstacles: bottlenecks, process variation, and non-value-adding activities. Identifying and eliminating these obstacles is a key task for manufacturers. However, in practice, the dynamic nature of manufacturing processes often makes this task challenging. Conventional problem-solving methods frequently fall short in capturing the subtleties necessary to detect inefficiencies in material flows.

To visualize and understand their material flows, manufacturers regularly turn to manual process mapping methods. A common process mapping method is Value Stream Mapping (VSM), which has become a standard for uncovering areas for improvement by illustrating both material and information flows within a factory. This provides a snapshot of the flows through a factory and helps manufacturers analyze inefficiencies for the period of observation. However, as current process mapping tools are static (i.e., they only show the state of operations for a particular observation period), they are less effective when the material flows change dynamically over time. Moreover, existing methods require high manual effort, which limits their applicability in factories with high complexity. This leads to undiscovered inefficiencies and suboptimal process improvements.

This article explores how process mining can address the limitations of traditional process mapping methods. Process mining is a recent innovation in information systems research that leverages event log data to dynamically analyze process flows. Despite notable success in other areas, such as service operations and supply chain processes, its potential for applications in manufacturing has not yet been fully recognized. Therefore, we will discuss three key use cases of process mining in manufacturing: bottleneck analysis, inventory analysis, and process variance analysis. Each use case will highlight how process mining offers significant advantages over traditional approaches for material flow analysis and helps manufacturers identify even the smallest inefficiencies.

How process mining enables better material flow analysis

Process mining is a technique for analyzing process flows based on event log data. An event log is a record of activities, capturing at minimum the sequence and timing of steps as they occur. For instance, in manufacturing, each step in a production process can generate an event log entry. Such entries include details like what happened, when it happened, and potential further information (e.g., on which machine, which operator, what costs). Process mining can use these event logs to understand how individual parts flow from one process step to another, thereby uncovering inefficiencies and throughput improvement opportunities.

So what data can be used for process mining in manufacturing? Key data includes a unique unit ID (typically the part or batch number), incoming and outgoing process timestamps (e.g., from 08:55 to 09:05), activity names (e.g., assembly and testing), and equipment names used to process a part. This information tracks each product unit’s journey through various stages of production.

The table below offers a glimpse into how this data might be structured in a manufacturing process involving assembly and testing steps, with each entry capturing a specific moment in the production flow. For most manufacturers, these event logs are readily available in their IT systems (even though they might not know it), most commonly in the format of XES or CSV files. 
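
To make this structure concrete, here is a small, hypothetical event log built with pandas. The column names, unit IDs, equipment names, and timestamps are illustrative assumptions; real logs will use whatever identifiers the plant’s IT systems provide:

    import pandas as pd

    event_log = pd.DataFrame({
        "unit_id":   ["A-001", "A-001", "A-002", "A-002"],
        "activity":  ["assembly", "testing", "assembly", "testing"],
        "equipment": ["ASM-01", "TST-01", "ASM-02", "TST-01"],
        "start": pd.to_datetime(["2024-01-08 08:55", "2024-01-08 09:20",
                                 "2024-01-08 09:00", "2024-01-08 09:40"]),
        "end":   pd.to_datetime(["2024-01-08 09:05", "2024-01-08 09:35",
                                 "2024-01-08 09:10", "2024-01-08 09:55"]),
    })
    print(event_log)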

Although process mining was originally not developed for manufacturing, its application in this field is very promising. Its ability to provide insights into a factory’s material flow by aggregating the insights from individual product units is highly valuable. Given these opportunities, it is surprising that, to date, the adoption of this technique remains limited. 

We identify three major reasons for the low adoption rate: (i) data acquisition, (ii) data processing, and (iii) value generation.

First, manufacturers often have a mix of newer and older machines that might not be digitized and do not provide the required data. However, in these situations, simple low-cost retrofit solutions can be applied to collect the required data. For instance, data matrix codes can be attached to part carriers, which are then read before and after a process with off-the-shelf scanners. In our experience, obtaining a first view of the data can be achieved in less than a day.

Second, manufacturing is often characterized by the convergence and combination of different parts. This has long been a challenge for process mining, as the connection to one specific unit ID made it difficult to visualize and explore cases in which IDs merge. However, due to recent developments in the field, object-centric process mining helps to overcome this hurdle.

Third, manufacturers lack an overview of concrete use cases that shows them how process mining can yield tangible improvement opportunities. This article sheds light on these opportunities by presenting three use cases of how process mining can be used in manufacturing.

Example 1: Bottleneck analysis

One of the key challenges in manufacturing is to identify and resolve bottlenecks that hinder efficient material flow. Process mining offers a unique solution to this challenge. By analyzing event log data, it provides a detailed “replay” of the material flow through each step of the manufacturing process. This approach enables manufacturers to observe the journey of each part in real-time and retrospectively.

Through process mining, manufacturers can accurately estimate how long parts spend at each stage of a factory (i.e., cycle time), including the time spent waiting before and after each process step (i.e., waiting time). This detailed analysis helps in pinpointing specific improvement areas, be it a machine or a process step, where delays frequently occur. By identifying these bottlenecks, manufacturers gain valuable insights into which processes are constraining the throughput of the entire factory.
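
As a rough illustration of this idea, the snippet below derives per-activity cycle times and the waiting time before each step from the hypothetical event_log introduced earlier. The simple groupby logic is an assumption for illustration; dedicated process mining tools implement far more sophisticated replay techniques:

    # Continues the hypothetical event_log DataFrame from the earlier sketch.
    log = event_log.sort_values(["unit_id", "start"]).copy()

    # Cycle time: how long each unit spends inside an activity.
    log["cycle_time"] = log["end"] - log["start"]

    # Waiting time: gap between leaving the previous step and entering this one.
    log["waiting_time"] = log["start"] - log.groupby("unit_id")["end"].shift()

    summary = log.groupby("activity")[["cycle_time", "waiting_time"]].mean()
    print(summary.sort_values("waiting_time", ascending=False))  # likely bottleneck on top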

The main advantage of process mining over traditional approaches lies in its dynamic capabilities, which allow for the detection of bottlenecks that may shift from one process step to another within specific time intervals. For example, in more complex production setups, where multiple product variants are produced on the same production line, each variant, each shift, and each sequence of machines might have a different bottleneck. Process mining is capable of identifying these specific bottlenecks independent of when they occur, thereby providing insights that are crucial for effectively managing the productivity of complex factories.

Example 2: Inventory analysis

Following the identification of bottlenecks, another critical application is the reduction of (buffer) inventory between machines. Just as process mining helps identify slow processes, it also illuminates areas where excessive inventory accumulates before and after an activity. This insight is invaluable for streamlining buffer inventories, ensuring that resources are not unnecessarily tied up in idle stock and, at the same time, that processes are not starved due to missing parts.

Process mining essentially enables a dynamic form of value stream mapping, which offers a real-time view of inventory levels throughout a factory. It provides insights into the inventory levels between each process step at any given point in time. By analyzing these data, manufacturers can allocate their resources more effectively, minimize inventory holding costs (especially in areas preceding or following bottlenecks), and reduce lead times.
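
The same hypothetical event_log can be replayed to estimate buffer inventory between two steps. The sketch below counts, over time, how many units have finished assembly but not yet started testing; the step names and the simple event-based counting are illustrative assumptions:

    import pandas as pd

    # Units enter the buffer when assembly ends and leave it when testing starts.
    buffer_in  = event_log.loc[event_log["activity"] == "assembly", ["unit_id", "end"]]
    buffer_out = event_log.loc[event_log["activity"] == "testing", ["unit_id", "start"]]

    events = pd.concat([
        pd.DataFrame({"time": buffer_in["end"], "delta": 1}),      # unit arrives in buffer
        pd.DataFrame({"time": buffer_out["start"], "delta": -1}),  # unit leaves buffer
    ]).sort_values("time")

    # Running sum gives the buffer inventory between assembly and testing over time.
    events["wip"] = events["delta"].cumsum()
    print(events[["time", "wip"]])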

Example 3: Process variance analysis

Another key objective is to detect deviations from the intended process flow. In process mining, the match between the actual process sequence and the intended process sequence is referred to as conformance. Process conformance checking aims at comparing the target process model with the actual process execution. This involves analyzing event logs to see whether the sequence in which a part travels through a factory adheres to the pre-defined flow.

Deviations from the intended process flow can have two reasons: (i) either the observed process shows unintended deviations, such as rework or unexpected loops, or (ii) the master data used to create the target process model is incorrect (e.g., master data not updated). Process mining allows for a detailed analysis of these process deviations. 

The conformance of process flows can be quantified by different measures (e.g., “fitness”), which show how well each actual process matches the intended process. This helps in identifying mismatches between the actual events and the as-designed process flow. These insights are crucial for aligning the actual manufacturing process with the intended design.
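
A full conformance check replays each case against a process model, but even a simple sequence comparison conveys the idea. The sketch below computes, for the hypothetical event_log introduced earlier, the share of units whose observed activity sequence matches an intended "assembly, then testing" flow; the intended sequence and the strict equality check are simplifying assumptions rather than a full fitness computation:

    intended = ["assembly", "testing"]  # assumed target process flow

    observed = (
        event_log.sort_values("start")
        .groupby("unit_id")["activity"]
        .apply(list)
    )

    conforming = observed.apply(lambda seq: seq == intended)
    print(f"Conformance rate: {conforming.mean():.0%}")
    print(observed[~conforming])  # units that deviate from the intended flow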

Unlike static process mapping methods, which only provide a limited snapshot of the material flow, process mining offers a dynamic, ongoing analysis. It captures a comprehensive view of all deviations, not just the ones visible at a particular moment. This dynamic capability is essential for ensuring that manufacturing processes consistently adhere to their intended designs and for continuously updating the target process model to maintain proper master data, both of which are crucial for improving material flow throughout factories.

Outlook

Process mining offers significant advancements in productivity improvement for manufacturing. By utilizing untapped event log data, which manufacturers often already have hidden in their operational history, it provides a dynamic and detailed perspective on inefficiencies in material flows. Unlike traditional process mapping, process mining offers deeper insights by dynamically analyzing the as-realized process flow throughout a factory. This enables manufacturers to identify bottlenecks, manage inventory effectively, and ensure process conformity.

Anticipated advancements in the field promise to expand the capabilities of process mining significantly. For instance, the emergence of AI-based chatbots could significantly enhance the scope of prescriptive process mining. Chatbots enable a direct interaction with a process model, thereby allowing even inexperienced analysts to ask specific questions on complex material flows (e.g., “What is the bottleneck of product A during Q1 2024 and what is the best way of removing it?”). 

With recent trends, such as mass customization, the analysis of material flows becomes increasingly complex. Process mining emerges as a powerful tool for informed decision-making and achieving operational excellence.

We thank Dr. Rafael Lorenz for contributing to this article. For a detailed exploration of how process mining can be used in manufacturing, we refer interested readers to our paper in the International Journal of Production Research.

AI is changing how expert knowledge is used in manufacturing

In manufacturing, troubleshooting production issues through changes to the product, processes, or equipment requires skilled engineers who know exactly what they are doing. Artificial intelligence (AI) is unlikely to fully replace them any time soon. However, the way in which these experts work is changing considerably already today.

Eliminating human bias

Today, the process of identifying the root cause of a production issue often starts with a brainstorming session among experts. Engineers would sit together and discuss the most likely failure mechanisms. Next, they would collect data to prove or reject their hypotheses. AI-based analytics flips this process around. Data are continuously collected, the AI identifies the most likely root causes, and THEN the engineers sit together, interpret the results and decide how to move on with trials or improvement actions.

This new workflow has one major advantage: it removes the engineer’s bias to a large extent. While humans intentionally or unintentionally confine their search for the root cause to a limited scope of relations that they consider probable, algorithms search the entire parameter space covered by the available data. Trickier issues with unexpected root causes will be found by the AI but might slip through the cracks in a traditional troubleshooting effort. This considerably shortens the time between the detection of a problem and its resolution.

In a recent collaboration with a semiconductor manufacturer, we were faced with a yield drop that caused considerable financial loss. Even after two weeks of production under these unfavorable conditions, the process experts were still in the dark about where the problem could come from. Their traditional methods for root cause analysis didn’t seem capable of tracking down the problem. Although they had semi-automated routines to browse through a large amount of data, they couldn’t find any convincing relationships between process data and quality.

When we stepped in with our AI-based approach, we soon found a relationship that had very high statistical significance. However, the relationship seemed physically implausible. It concerned a process step that was outside of the experts’ focus because, at that stage of manufacturing, the product was believed to be sufficiently protected against detrimental effects. Nevertheless, our feedback was taken into consideration, and trials were run that confirmed the problem arose exactly in the process we had pointed at. The engineers were able to fix the issue, although the physical failure mechanism was only understood much later. Using our algorithm, we were able to considerably shorten the time between detection and resolution of the issue. This shortened the period of low-quality production and saved enough money to justify the investment in AI with this single incident alone.

This example shows how AI can overcome a crucial limitation of traditional troubleshooting in manufacturing: humans are only looking where they expect to find something. Whenever something unexpected happens, this bias can lead to failure and long delays in the problem-solving process.

Workflow evolution: The role of expert knowledge in the AI era

The example also shows that there is a shift in how expert knowledge enters the game: instead of spending most of their time identifying the most probable root causes, extracting data from various systems, joining tables, and finding complex relationships, experts increasingly evaluate results from AI-based analytics tools and translate them into practicable improvement actions.

That is how it works in an ideal world …

In reality, things are a bit more complicated. In order to unfold their full potential in finding root causes, AI-based analytics tools need high-quality data. Collecting such data requires expertise as well. Every sensor that collects useful data for root cause finding must be installed by a human. Which kind of sensor do we put where? This kind of decision-making cannot be done by a machine in the foreseeable future. Most sensors that we currently find in process equipment were primarily installed to monitor the health of the equipment or the conditions of the process. This kind of data often relates to product quality in some way or another and can be useful in finding root causes. However, the more directly we target the measurements at product quality, the more powerful the data become for AI-assisted tools. A process expert who knows all past incidents and improvement actions can incorporate this knowledge into the data collection and thereby make sure that similar issues pop up early and get fixed before they even result in relevant scrap rates.

Conclusion

In summary, we observe that AI is changing the way in which experts in manufacturing use their knowledge. Instead of doing a “manual” analysis on data they collected based on what they know to be a probable root cause, they let AI-based analytics tools do the analysis for them and use their knowledge to (1) feed these tools with the most valuable data they can get and (2) translate the AI results into useful improvement actions.

Augmented Intelligence: How explainable AI is changing manufacturing jobs for the better

This is a shortened reproduction of the article “Augmented intelligence: How explainable AI is changing manufacturing jobs for the better,” which we recently published in the World Economic Forum’s Agenda.

The end of human work?

A common narrative suggests AI is poised to replace human work on a large scale. Our research in diverse manufacturing settings rejects that vision. Instead, AI holds the potential to augment human intelligence to solve work tasks more effectively while also enriching the work experience. The key to unlocking this potential lies in explainable algorithms, which contrast with the often-opaque decision-making processes in conventional AI systems.

Despite its promise, the adoption of AI in the workplace has been slower than one might expect, and this reluctance has been partially driven by two critical factors. The first is “algorithm aversion,” a reluctance among humans to trust AI systems that operate as “black boxes,” providing decisions without any clear rationale. The second challenge is that the opaque nature of many state-of-the-art algorithms prevents domain experts from benchmarking AI-generated recommendations against domain knowledge, making it difficult to identify and rectify errors. These issues not only erode trust in AI but also limit the scope for effective human-AI collaboration.

This is where explainable AI (XAI) offers a significant breakthrough. Our multi-year research journey in the manufacturing sector leads us to a unifying conclusion: explainability is the missing ingredient that catalyzes AI adoption in manufacturing. Why? Explainable AI works as an interpreter that bridges the gap between complex algorithmic processes and human understanding. Much like how an interpreter can make complex information accessible to a layperson, XAI demystifies the intricate logic of complex algorithms.

By transforming the AI’s ‘black box’ into recommendations with clear explanations, XAI fosters greater trust and enables more effective human-AI collaboration. Not only is XAI making AI more ethical and accountable, but our research shows that it also improves work experience and job performance. While AI has the capacity to sift through massive datasets and identify patterns far beyond human capability, it is the symbiosis of human expertise and AI recommendations that truly unlocks productivity gains.

Experts with XAI outperform AI

A compelling case study that elucidates the importance of XAI comes from our work with Siemens. We conducted an experiment where we compared the performance of two groups of factory workers in a visual quality inspection task of electronic products. The first group was assisted by conventional ‘black-box’ AI, while the second had the benefit of an AI that provided visual heatmaps to explain its predictions of potential quality issues.

The results were striking: expert workers without explanations were more than three times more likely to erroneously override the accurate recommendations given by the AI. In contrast, those augmented by XAI knew better when to trust the AI and when to depend on their own expertise, thereby outperforming the performance of the AI system alone. This shows that XAI is not just about smarter machine decisions; it’s a transformative approach that enables both humans and machines to perform at their best.

XAI is not just beneficial for providing decision support to operators on the shopfloor; it can also help understand complex production systems, providing critical insights to manufacturing experts. During our collaboration with a semiconductor factory, we equipped process experts with XAI tools to elucidate the root causes of quality issues. While the AI explained complex associations between production variables and quality outcomes, it took human expertise to turn these insights into effective improvement actions. 

By cross-referencing the AI’s explanations with their own domain knowledge, experts were able to design targeted experiments to confirm the underlying causes of quality losses. The result? Quality losses plummeted by over 50%, a testament to the efficacy of human-machine collaboration with AI-based explanations.

A compelling case for augmented intelligence

Our research strongly suggests that the future of manufacturing is not a battle of humans versus machines but rather a collaborative enterprise that leverages the unique strengths of both. Many work tasks will not and cannot be replaced or delegated to an AI.

However, AI can augment human work and make tasks more effective and efficient, especially when the AI’s decisions are explained. We thus call for a paradigm shift: rather than “replacing” humans with AI, it will be necessary to “augment” humans. This requires a completely different set of tools, and we deem one particularly relevant: XAI.

Introducing the Manufacturing Analytics System

Modern factories generate substantial amounts of data, but this data is frequently not utilized effectively. This article explores how a new software category — the Manufacturing Analytics System — helps manufacturers turn their data into valuable insights for productivity improvement.

The need for a new software category in the manufacturing industry

The manufacturing industry is currently undergoing a significant transformation. Growing process complexity, changing macroeconomic trends, and rapid technological advancement make it increasingly urgent and difficult to achieve operational excellence.

In response to these changes, manufacturers need to find new ways to improve their productivity. Recent technological advances are a great opportunity to support manufacturers in this endeavor. Over the past years, there has been considerable progress in manufacturing data collection, which has been spurred by the development of IIoT sensor technologies, standardized network protocols, and cloud-based storage and computation. However, the mere collection of data does not automatically lead to increased productivity in factories. Often, this data is not effectively utilized to drive improvements.

Despite significant digitization efforts, the effective use of analytical tools in today’s factories remains limited. For many manufacturers, the situation can best be described as being data-rich but information-poor. According to IBM, approximately 90% of all sensor data is never analyzed. This is disappointing, because the lack of operational insights is currently considered a key obstacle that manufacturers need to overcome.

What has led us to this situation? In essence, 21st century factories are still managed with 20th century tools. Data is stored in disjoint sources, software is fragmented and verticalized, and there is no standardization across sites and teams. Consequently, key employees such as operations managers, process experts, and data scientists are bogged down by the extensive effort required to aggregate and clean data. It is evident that we need to unify analytical standards and workflows to leverage existing data much more effectively.

The answer to these challenges is a new category of software: The Manufacturing Analytics System (MAS). A MAS creates a common context across disparate data sources, analyzes data with the latest AI techniques, and makes the results accessible in a suite of interoperating applications. The applications in a MAS are tailored for the different people involved in achieving operational excellence. A MAS serves as an intermediary between data and users, provides deeper insights faster, enables new types of automation, and streamlines the looping of decisions back to the factories. It makes employees considerably more effective and improves operational KPIs sustainably.

Under the hood of the Manufacturing Analytics System

Manufacturing Analytics Systems are designed to serve end-users across all levels of operational excellence, ranging from factory floor personnel to upper management. A MAS is structured around three layers that transform diverse manufacturing data into valuable insights for productivity improvement:

  • an Application Layer,
  • a Model Layer,
  • and a Context Layer.

The Application Layer contains user-facing software tools that enable productivity improvement. These tools are built on top of a Model Layer, which houses specialized AI models that are designed to generate insights or decisions from manufacturing data. The Context Layer is responsible for gathering data from a wide range of origins and channeling it into the Model Layer for processing. Each of the layers is described in more detail below.

Context Layer

The Context Layer serves as the foundation of a MAS. It prepares and organizes the data for its use in the Model Layer. This layer does not duplicate existing databases. Instead, it stores only relevant data, merged from different sources, in a unified and aggregated format. It provides the crucial link across disparate data sources. This is achieved by mapping or creating common identifiers like timestamps, part IDs, and batch IDs. The context and connections in this layer enable comprehensive analysis across different datasets.
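
In practice, the work of the Context Layer resembles joining disparate tables on shared identifiers. The toy example below merges a machine log with a quality record on a part ID; the table names, columns, and merge key are purely illustrative and not EthonAI’s actual data model:

    import pandas as pd

    machine_log = pd.DataFrame({
        "part_id": ["P-100", "P-101", "P-102"],
        "machine": ["M-01", "M-02", "M-01"],
        "temperature": [212.4, 208.9, 215.1],
    })
    quality_record = pd.DataFrame({
        "part_id": ["P-100", "P-101", "P-102"],
        "passed_inspection": [True, False, True],
    })

    # A unified, aggregated view that links process context to quality outcomes.
    context = machine_log.merge(quality_record, on="part_id", how="left")
    print(context)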

Model Layer

The Model Layer comprises advanced AI models for complex data analysis in manufacturing. Unlike common data platforms that use off-the-shelf algorithms (e.g., Random Forests, GBM, ResNets, etc.), this layer involves tailored models specifically designed for manufacturing tasks (e.g., root cause analysis, visual inspection, material flow analysis, etc.). Tailored models in the Model Layer enable a MAS to effectively address challenges where generic approaches fail.

Examples include EthonAI’s graph algorithms for root cause analysis, LLMs trained on manufacturing queries, and specialized computer vision models for quality inspection. 

Application Layer

The Application Layer contains the user-facing applications that leverage the data processed by the underlying layers. It is a no-code/low-code environment where users engage with the relevant information in tools that are custom-built for manufacturing workflows. This layer is designed to be intuitive for a large set of user personas and to directly integrate into their workflows.

Examples include EthonAI’s Analyst, Inspector, Observer, Tracker, and Miner software.

How will the Manufacturing Analytics System change the industry?

Over the past five years, industry leaders have significantly improved their data acquisition capabilities. Over the next five years, we expect the mid-market segment will catch up. This opens the door to unprecedented productivity levels in the industry. But to really capitalize on their data assets, manufacturers need stronger and widespread analytics. Manufacturing Analytics Systems will quickly become essential for this purpose.

A MAS offers manufacturers advanced analytical capabilities and user-centric applications to improve operational decision-making. It equips manufacturers with the right set of tools to operate as effectively as lighthouse factories. The MAS is not just a technological advancement; it represents a paradigm shift in manufacturing. It commoditizes access to advanced analytics for the wealth of manufacturing data and enables widespread improvement of operations.

EthonAI is building the applications and infrastructure to lead this transformation. Explore the Use Cases on our website to discover how EthonAI’s MAS is already used by industry leaders to cut production losses and achieve operational excellence today.