
A story of why causal AI is necessary for root cause analysis in manufacturing

Traditional machine learning is designed for prediction and often struggles with root cause analysis. The article presents a short story demonstrating how causal AI overcomes this problem.

Why causality is needed for decision-making

Data-driven decisions are paramount for staying competitive in today’s manufacturing industry. However, effective decisions require tools that transform data into actionable insights. Traditional machine learning tools, while great for prediction, fall short in decision-making because they cannot grasp cause-and-effect relationships: they do not understand how different decisions impact outcomes. To make truly informed decisions, understanding these cause-and-effect dynamics is crucial.

Causal AI provides manufacturers with entirely new insights by going beyond the prediction-focused scope of traditional machine learning. It seeks to uncover the causes behind outcomes, which enables us to assess and compare the consequences of different decisions. This offers crucial input for more informed root cause analysis. For manufacturers, this means not only predicting what will happen, but also knowing which decision taken now leads to a better outcome in the future.

What is causal AI?

Causal AI, at its core, is an advanced form of artificial intelligence that seeks to understand and quantify cause-and-effect relationships in data. In particular, causal AI aims to understand how one variable A influences another variable B. This matters for decision-making: if we want to change A with the goal of increasing B, we need to know how A influences B. Traditional machine learning only uses A to predict B, but cannot answer what happens to B if we change A, as we will see in an example below. Yet the answer to this question is essential for decision-making, in particular in the context of root cause analysis in manufacturing.

This article looks into the task of root cause analysis for quality improvement. The focus is to maximize “good” quality and minimize “bad” quality outcomes. Simply predicting when quality will drop is not enough in this setting. The objective is to identify and adjust specific production parameters (like adjusting a machine setpoint) when bad quality is observed, to restore good quality. Therefore, understanding the cause-and-effect relationships between these production parameters and the product quality is key. This knowledge allows us to pinpoint which parameters are causing quality issues and make necessary changes to achieve desired quality levels consistently. In the following, we tell a short story to demonstrate the capabilities of causal AI in this context.

Causal AI for root cause analysis

Let’s imagine a manufacturing company specializing in plastic Christmas trees, a seasonal product where quality and timeliness are key. The company faced a peculiar challenge: a noticeable drop in the quality of their plastic trees. Naturally, they turned to data for answers.

Their initial investigation was led by a skilled data scientist, who collected data about the production process. The production process consists of two steps: First, the plastic branches are sourced from a supplier. Second, the branches are put through a machine which attaches the branches to the trunk. There are two possible suppliers, A and B, and two possible machines, M1 and M2.

The data scientist used traditional machine learning techniques, which focused on predicting the quality based on the collected data. This led to an intriguing conclusion: the machine learning model suggested that machine M1 produced worse quality than M2. Based on this analysis, the data scientist recommended taking machine M1 out of service, which would have meant a substantial reduction in throughput and, hence, reduced production capacity. However, the story took a twist when the company decided to scrutinize both machines. To their astonishment, there was no recognizable difference between the machines or their settings. This puzzling situation called for a deeper analysis, beyond what traditional machine learning could offer.

Luckily, a friend of the company’s data scientist was a renowned causal AI expert. The expert developed a tailored causal AI algorithm for the production process, seeking not just good predictions, but an understanding of the underlying cause-and-effect relationships. The causal AI model revealed an unexpected insight: the root cause of the quality drop was not the machine, but the supplier. In fact, it revealed that Supplier A delivered branches of worse quality than Supplier B. After talking to the factory workers, the company found out that the workers always put the branches of Supplier A through machine M1 and the branches of Supplier B through machine M2, simply because those machines were closer to the boxes with the corresponding branches. Hence, all the low-quality branches of Supplier A ran through machine M1, which made machine M1 look like it was causing the drop in quality.

But why did the traditional machine learning model fail to identify the true root cause? The reason is that its objective is prediction, and for this, knowing which machine the branches went through was enough to predict the quality perfectly. Since the traditional machine learning model did not understand the underlying cause-and-effect relationships, it simply used all available parameters. In doing so, it also used the machine as a parameter, which, in this example, is a so-called mediator. By conditioning on this mediator, it “blocked” the indirect influence of the supplier via the machines, and the influence of the supplier was lost. The causal AI, in contrast, understood the underlying cause-and-effect relationships, in particular the relationship between supplier and machine, and could therefore correctly identify the true root cause.

Armed with this causal insight, the company informed Supplier A about the quality of their branches, which they ultimately were able to improve with new specifications. As such, leveraging causal AI averted a prolonged production stop of machine M1, which would have cost the company a lot of money. All of this just because the traditional machine learning model focuses on prediction, but not on understanding the underlying cause-and-effect relationships. Only a causal AI model could identify and rectify the true root cause of the quality issue.

In this simplified scenario, it would be easy to check all parameters and production steps manually. But imagine a real-world scenario with hundreds or even thousands of parameters across many process steps. In such a setting, the clear association between machine M1 and quality, identified by traditional methods, can easily be mistaken for a root cause, and manually checking for other influencing factors would be tedious, if not impossible. Here, causal AI can identify the root cause immediately, saving substantial time and cost.

Opportunities and challenges of causal AI in manufacturing

The opportunity of causal AI is clear: it offers new ways for manufacturing to identify the true root causes of problems. This depth of insight empowers manufacturers to make decisions that address core issues, leading to enhanced efficiency, quality, and competitive advantage. 

However, the adoption of causal AI is challenging. One significant hurdle is the absence of off-the-shelf software that can be used without data science expertise. Moreover, as the above example showed, even seasoned data scientists often lack experience with causal AI, mainly because it is a relatively young field. Despite these challenges, the potential gains in operational understanding and performance are substantial.

If you’re interested in finding out how causal AI can help your problem-solving efforts, we invite you to book a demo and experience the impact firsthand.

Process Mining: A new take on material flow analysis in manufacturing

Traditional methods for throughput improvement provide only a static view of manufacturing processes. This article explores process mining as a dynamic alternative. We analyze its application in bottleneck analysis, inventory analysis, and process variance analysis within manufacturing.

Why new process mapping methods are needed

Increasing the rate at which parts flow through a factory is a key objective in manufacturing. Better material flow can be achieved by eliminating three key obstacles: bottlenecks, process variation, and non-value-adding activities. Identifying and eliminating these obstacles is a key task for manufacturers. However, in practice, the dynamic nature of manufacturing processes often makes this task challenging. Conventional problem-solving methods frequently fall short in capturing the subtleties necessary to detect inefficiencies in material flows.

To visualize and understand their material flows, manufacturers regularly turn to manual process mapping methods. A common example is Value Stream Mapping (VSM), which has become a standard for uncovering areas of improvement by illustrating both material and information flows within a factory. This provides a snapshot of the flows through a factory and helps manufacturers analyze inefficiencies for the period of observation. However, as current process mapping tools are static (i.e., they only show the state of operations for a particular observation period), they are less effective when material flows change dynamically over time. Moreover, existing methods require high manual effort, which limits their applicability in factories with high complexity. This leads to undiscovered inefficiencies and suboptimal process improvements.

This article explores how process mining can address the limitations of traditional process mapping methods. Process mining is a recent innovation in information system research that leverages event log data to dynamically analyze process flows. Despite noticeable success in other areas, such as service operations and supply chain processes, the potential for applications in manufacturing has not yet been fully recognized. Therefore, we will discuss three key use cases of process mining in manufacturing: bottleneck analysis, inventory analysis, and process variance analysis. Each use case will highlight how process mining offers significant advantages over traditional approaches for material flow analysis and helps manufacturers identify even the smallest inefficiencies.

How process mining enables better material flow analysis

Process mining is a technique for analyzing process flows based on event log data. An event log is a record of activities, capturing at minimum the sequence and timing of steps as they occur. For instance, in manufacturing, each step in a production process can generate an event log entry. Such entries include details like what happened, when it happened, and potential further information (e.g., on which machine, which operator, what costs). Process mining can use these event logs to understand how individual parts flow from one process step to another, thereby uncovering inefficiencies and throughput improvement opportunities.

So what data can be used for process mining in manufacturing? Key data includes a unique unit ID (typically the part or batch number), incoming and outgoing process timestamps (e.g., from 08:55 to 09:05), activity names (e.g., assembly and testing), and equipment names used to process a part. This information tracks each product unit’s journey through various stages of production.

The table below offers a glimpse into how this data might be structured in a manufacturing process involving assembly and testing steps, with each entry capturing a specific moment in the production flow. For most manufacturers, these event logs are readily available in their IT systems (even though they might not know it), most commonly in the format of XES or CSV files. 
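To make the structure concrete, such an event log can be sketched in a few lines of Python (all unit IDs, activity names, equipment names, and timestamps below are invented for illustration):

```python
import pandas as pd

# Minimal event log: one row per activity a part passes through.
# All IDs, names, and timestamps are invented for illustration.
log = pd.DataFrame([
    {"unit_id": "P-001", "activity": "assembly", "equipment": "A1",
     "start": "2024-01-15 08:55", "end": "2024-01-15 09:05"},
    {"unit_id": "P-001", "activity": "testing",  "equipment": "T1",
     "start": "2024-01-15 09:20", "end": "2024-01-15 09:30"},
    {"unit_id": "P-002", "activity": "assembly", "equipment": "A1",
     "start": "2024-01-15 09:05", "end": "2024-01-15 09:16"},
    {"unit_id": "P-002", "activity": "testing",  "equipment": "T1",
     "start": "2024-01-15 09:32", "end": "2024-01-15 09:41"},
])
# Parse the timestamps so durations can be computed later
log[["start", "end"]] = log[["start", "end"]].apply(pd.to_datetime)
```

An export in this shape, for example as a CSV file, is already sufficient input for most process mining tools.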

Although process mining was originally not developed for manufacturing, its application in this field is very promising. Its ability to provide insights into a factory’s material flow by aggregating the insights from individual product units is highly valuable. Given these opportunities, it is surprising that, to date, the adoption of this technique remains limited. 

We identify three major reasons for the low adoption rate: (i) data acquisition, (ii) data processing, and (iii) value generation.

First, manufacturers often have a mix of newer and older machines, some of which might not be digitized and do not provide the required data. However, in these situations, simple low-cost retrofit solutions can be applied to collect the required data. For instance, data matrix codes can be attached to part carriers, which are then read before and after a process with off-the-shelf scanners. In our experience, obtaining a first view of the data can be achieved in less than a day.

Second, manufacturing is often characterized by the convergence and combination of different parts. This has long been a challenge for process mining, as the connection to one specific unit ID made it difficult to visualize and explore cases in which IDs merge. However, due to recent developments in the field, object-centric process mining helps to overcome this hurdle.

Third, manufacturers lack an overview of concrete use cases showing how process mining can yield tangible improvement opportunities. This article sheds light on these opportunities by presenting three use cases of how process mining can be used in manufacturing.

Example 1: Bottleneck analysis

One of the key challenges in manufacturing is to identify and resolve bottlenecks that hinder efficient material flow. Process mining offers a unique solution to this challenge. By analyzing event log data, it provides a detailed “replay” of the material flow through each step of the manufacturing process. This enables manufacturers to observe the journey of each part, both in real time and retrospectively.

Through process mining, manufacturers can accurately estimate how long parts spend at each stage of a factory (i.e., cycle time), including the time spent waiting before and after each process step (i.e., waiting time). This detailed analysis helps in pinpointing specific improvement areas, be it a machine or a process step, where delays frequently occur. By identifying these bottlenecks, manufacturers gain valuable insights into which processes are constraining the throughput of the entire factory.
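As a sketch of how cycle and waiting times can be derived from an event log (the log and timestamps below are invented; a real analysis would run over thousands of rows):

```python
import pandas as pd

# Minimal event log for two parts (all values invented for illustration)
log = pd.DataFrame({
    "unit_id": ["P-001", "P-001", "P-002", "P-002"],
    "activity": ["assembly", "testing", "assembly", "testing"],
    "start": pd.to_datetime(["2024-01-15 08:55", "2024-01-15 09:20",
                             "2024-01-15 09:05", "2024-01-15 09:32"]),
    "end":   pd.to_datetime(["2024-01-15 09:05", "2024-01-15 09:30",
                             "2024-01-15 09:16", "2024-01-15 09:41"]),
})

# Cycle time: how long a part spends inside an activity
log["cycle_min"] = (log["end"] - log["start"]).dt.total_seconds() / 60

# Waiting time: gap between leaving one activity and entering the next
log = log.sort_values(["unit_id", "start"])
log["wait_min"] = (log["start"] - log.groupby("unit_id")["end"].shift()).dt.total_seconds() / 60

# Average cycle and waiting time per activity points to bottleneck candidates
print(log.groupby("activity")[["cycle_min", "wait_min"]].mean())
```

High waiting times in front of a step, sustained over many parts, are a typical bottleneck signature.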

The main advantage of process mining over traditional approaches lies in its dynamic capabilities, which allow for the detection of bottlenecks that shift from one process step to another over time. For example, in more complex production setups, where multiple product variants are produced on the same production line, each variant, each shift, and each sequence of machines might have a different bottleneck. Process mining is capable of identifying these specific bottlenecks whenever they occur, thereby providing insights that are crucial for effectively managing the productivity of complex factories.

Example 2: Inventory analysis

Following the identification of bottlenecks, another critical application is the reduction of (buffer) inventory between machines. Just as process mining helps identify slow processes, it also illuminates areas where excessive inventory accumulates before and after an activity. This insight is invaluable for streamlining buffer inventories, ensuring that resources are not unnecessarily tied up in idle stock – and, at the same time, that processes are not starved due to missing parts.

Process mining essentially enables a dynamic form of value stream mapping, which offers a real-time view of inventory levels throughout a factory. It provides insights on the inventory levels between each process step at any given point of time. By analyzing these data, manufacturers can allocate their resources more effectively, minimize inventory holding costs (especially in areas preceding or following bottlenecks), and reduce lead times.

Example 3: Process variance analysis

Another key objective is to detect deviations from the intended process flow. In process mining, the match between the actual process sequence and the intended process sequence is referred to as conformance. Process conformance checking aims at comparing the target process model with the actual process execution. This involves analyzing event logs to see whether the sequence in which a part travels through a factory adheres to the pre-defined flow.

Deviations from the intended process flow can have two reasons: (i) either the observed process shows unintended deviations, such as rework or unexpected loops, or (ii) the master data used to create the target process model is incorrect (e.g., master data not updated). Process mining allows for a detailed analysis of these process deviations. 

The conformance of process flows can be quantified by different measures (e.g., “fitness”), which show how well each actual process matches the intended process. This helps in identifying mismatches between the actual events and the as-designed process flow. These insights are crucial for aligning the actual manufacturing process with the intended design.
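As a toy illustration of the idea, using exact sequence matching as a crude stand-in for the more nuanced fitness measures in process mining tools (all traces are invented):

```python
# Toy conformance check: a unit conforms if its actual activity sequence
# exactly matches the intended flow (all traces invented for illustration)
intended = ["assembly", "testing", "packaging"]

traces = {
    "P-001": ["assembly", "testing", "packaging"],            # conforming
    "P-002": ["assembly", "testing", "rework", "packaging"],  # unexpected rework loop
    "P-003": ["assembly", "packaging"],                       # skipped testing
}

conforming = {uid: trace == intended for uid, trace in traces.items()}
fitness = sum(conforming.values()) / len(conforming)
print(f"Share of conforming units: {fitness:.0%}")
```

Real fitness measures additionally give partial credit to traces that deviate only slightly from the intended flow, rather than scoring them as fully non-conforming.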

Unlike static process mapping methods, which only provide a limited snapshot of the material flow, process mining offers a dynamic, ongoing analysis. It captures a comprehensive view of all deviations, not just the ones visible at a particular moment. This dynamic capability makes it possible to ensure that manufacturing processes consistently adhere to their intended designs and to continuously update the target process model so that the master data stays correct, both of which are crucial for improving material flow throughout factories.


Process mining offers significant advancements in productivity improvement for manufacturing. By utilizing untapped event log data, which often lies hidden in manufacturers’ operational history, it provides a dynamic and detailed perspective on inefficiencies in material flows. Unlike traditional process mapping, process mining offers deeper insights by dynamically analyzing the as-realized process flow throughout a factory. This enables manufacturers to identify bottlenecks, manage inventory effectively, and ensure process conformity.

Anticipated advancements in the field promise to expand the capabilities of process mining significantly. For instance, the emergence of AI-based chatbots could significantly enhance the scope of prescriptive process mining. Chatbots enable a direct interaction with a process model, thereby allowing even inexperienced analysts to ask specific questions on complex material flows (e.g., “What is the bottleneck of product A during Q1 2024 and what is the best way of removing it?”). 

With recent trends, such as mass customization, the analysis of material flows becomes increasingly complex. Process mining emerges as a powerful tool for informed decision-making and achieving operational excellence.

We thank Dr. Rafael Lorenz for contributing to this article. For a detailed exploration of how process mining can be used in manufacturing, we refer interested readers to our paper in the International Journal of Production Research.

What are distributional shifts and why do they matter in industrial applications?

This article examines three types of distributional shifts: covariate shift, label shift, and concept shift, using a milling process simulation for downtime prediction. It emphasizes the need to monitor data distributions to maintain good model performance in dynamic environments like manufacturing.

What is distributional shift?

“Shift happens” – a simple yet profound realization in data analytics. Data distributions, the underlying fabric of our datasets, are subject to frequent and often unexpected changes. These changes, known as “distributional shifts,” can dramatically alter the performance of machine learning (ML) models.

A core assumption of ML models is that data distributions do not change between the time you train an ML model and the time you deploy it to make predictions. However, in real-world scenarios, this assumption often fails, as data can be interconnected and influenced by external factors. An example of such distributional shifts is how ML models went haywire when our shopping habits changed overnight during the pandemic.

There are three primary types of distributional shifts: covariate shift, label shift, and concept shift. Each represents a different way in which data can deviate from expected patterns. This article explores these forms of distributional shifts in a milling process, where we assess the performance of ML models in a downtime prediction task.

Simulation setup

In order to illustrate the impact of distributional shifts in industrial settings, we set up a simulation for predicting machine downtime in a milling process. Our goal is to determine whether a machine will malfunction during a production run, which helps anticipate capacity constraints proactively.

Our primary variable of interest is “Machine Health,” a binary indicator where:

  • Machine Health = 0 indicates a machine breakdown.
  • Machine Health = 1 indicates the machine is running flawlessly.

Several covariates are considered to be potential predictors of Machine Health. These are:

  • Operating Temperature (°C)
  • Lubricant Level (%)
  • Power Supply (V)
  • Vibration (mm/s)
  • Usage Time (hours)

The relationship between these covariates and the Machine Health is encapsulated in a ground-truth function, which dictates the conditions under which a machine fails. A machine breakdown occurs if any of the following conditions holds:

  • The Operating Temperature is below 50°C or above 60°C
  • The Lubricant Level is below 8% or above 10%
  • The Power Supply is below 210V or above 230V
  • The Vibration is above 3 mm/s
  • The Usage Time is above 20 hours

To simulate this process, we generate data for 1000 production runs based on the above ground-truth function. A Random Forest classifier is then trained on this data. Note that the above ground-truth function is assumed to be unknown when making predictions. The classifier’s task is to predict potential breakdowns in future production runs based on the observed covariates.
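The data generation and training step can be sketched as follows. The breakdown rules mirror the thresholds above, but the covariate distributions are assumptions made for this sketch (the article does not specify them), so a model trained on it will typically reach high, though not necessarily perfect, accuracy:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(42)

def machine_health(temp, lube, power, vib, usage):
    """Ground-truth rule: 1 (healthy) only if every covariate is in its safe window."""
    ok = (50 <= temp <= 60) and (8 <= lube <= 10) and \
         (210 <= power <= 230) and (vib <= 3) and (usage <= 20)
    return int(ok)

def simulate(n):
    # Covariate distributions are assumptions for this sketch
    X = np.column_stack([
        rng.normal(55, 4, n),      # Operating Temperature (°C)
        rng.normal(9, 0.8, n),     # Lubricant Level (%)
        rng.normal(220, 8, n),     # Power Supply (V)
        rng.exponential(1.5, n),   # Vibration (mm/s)
        rng.uniform(0, 25, n),     # Usage Time (hours)
    ])
    y = np.array([machine_health(*row) for row in X])
    return X, y

# Train on 1000 runs; the ground-truth rule is treated as unknown by the model
X_train, y_train = simulate(1000)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
```

Evaluating `clf` on freshly simulated runs drawn from the same distributions mirrors the no-shift scenario discussed next.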

No shift

Moving forward with our simulation, we now introduce a “no shift” scenario to evaluate the robustness of our trained Random Forest classifier under conditions where there is no distributional shift. We generate an additional 250 production runs, which serve as our test set during model deployment. These new runs are created under the same conditions and assumptions as the initial 1000 production runs used for training the model. It’s important to note that we are deliberately not introducing any changes to the underlying data distribution in this particular example. This allows us to evaluate how well the classifier performs when the test data closely mirrors the training data.

Below, we present a comparative visualization of the variables’ distributions between model training and model deployment (i.e., predicting the Machine Health for the test set). The figure offers a clear perspective on the consistency of data across both the training set and the test set. By keeping the distribution of Operating Temperature, Lubricant Level, Power Supply, Vibration, and Usage Time identical to the training phase, we aim to replicate the ideal conditions under which the ML model was originally trained.

In this no-shift scenario, our Random Forest classifier achieves an accuracy of 100%. This result is not surprising, as the test data was created with the same data generation function as the training data. The classifier effectively recognizes and applies the patterns it learned during training, which leads to flawless predictions. This no-shift scenario serves as a benchmark against which we will compare the model’s performance in subsequent scenarios involving different types of distributional shifts.

Covariate shift

We now explore the concept of covariate shift, a common challenge in the application of ML models in real-world settings. Covariate shift occurs when there is a change in the distribution of the covariates between model training and model deployment (Ptrain(x) ≠ Ptest(x)), while the way these covariates relate to the outcome remains the same (Ptrain(y|x) = Ptest(y|x)). 

To demonstrate the effect of covariate shift, we construct a scenario for our downtime prediction task. We assume that our factory has received a substantially large order, which requires us to extend the Usage Time on our milling machine. This increase in Usage Time naturally leads to higher Operating Temperatures and a decrease in Lubricant Levels, which alters the data distribution for these specific covariates.

We simulate an additional 250 production runs under these modified conditions. This new data serves as our test set, which now differs in the distribution of key covariates (Operating Temperature, Lubricant Level, and Usage Time) from the original training set. Below, we visualize the differences in distributions of these covariates between model training and model deployment.

When applying our previously trained Random Forest classifier to this new test set, we observe a significant drop in accuracy, with the model achieving 72% accuracy. This decrease from the 100% accuracy seen in the no-shift scenario clearly illustrates the challenges posed by covariate shift. The model, trained on data with different covariate distributions, struggles to adapt to the new conditions, which leads to a noticeable reduction in its predictive accuracy.

Our results demonstrate the importance of monitoring covariate shifts in dynamic environments. Detecting these shifts is crucial, but in high-dimensional scenarios where multiple covariates may shift together, tracking individual features becomes a challenging task. Compared to our simulated environment, real-world applications may involve hundreds of covariates. To address this high complexity, one can turn to dimensionality reduction techniques like t-SNE (t-distributed Stochastic Neighbor Embedding).

t-SNE is a nonlinear dimensionality reduction technique that can be used to visualize high-dimensional data in a low-dimensional space. Such visualizations provide a clear perspective on how the data is distributed across different dimensions. The t-SNE plot below demonstrates this for our milling process, where the five covariates are reduced to two dimensions, while aiming to retain most information. It can be observed that the training data (gray) and testing data (blue) form distinct clusters, which visually indicates a covariate shift. This separation highlights the need to reassess our model’s predictive reliability under these new conditions.
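A sketch of this diagnostic is shown below, with assumed pre- and post-shift distributions for the five covariates (the plotting itself is omitted):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)

# Assumed covariate distributions before (train) and after (test) the shift:
# higher temperature and usage time, lower lubricant level during deployment
train = np.column_stack([
    rng.normal(55, 2, 150),    # Operating Temperature (°C)
    rng.normal(9, 0.5, 150),   # Lubricant Level (%)
    rng.normal(220, 5, 150),   # Power Supply (V)
    rng.normal(2, 0.5, 150),   # Vibration (mm/s)
    rng.normal(12, 3, 150),    # Usage Time (hours)
])
test = np.column_stack([
    rng.normal(62, 2, 50),
    rng.normal(7, 0.5, 50),
    rng.normal(220, 5, 50),
    rng.normal(2, 0.5, 50),
    rng.normal(22, 3, 50),
])

# Embed all runs into two dimensions; shifted test data tends to form
# a cluster separate from the training data
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(
    np.vstack([train, test])
)
emb_train, emb_test = emb[:150], emb[150:]
```

Plotting `emb_train` in gray and `emb_test` in blue reproduces the kind of two-cluster picture described above.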

Label shift

Next, we delve into label shift, another form of distributional shift. Label shift occurs when the distribution of the labels changes between model training and model deployment (Ptrain(y) ≠ Ptest(y)), while the conditional distribution of the covariates given the label remains constant (Ptrain(x|y) = Ptest(x|y)).

To illustrate an example of label shift, we simulate a test set for another milling machine, which, compared to the first milling machine, is more prone to breakdowns. This change increases the likelihood of machine failures, thus altering the distribution of the Machine Health label. We generate data for 250 production runs under these new conditions, where the probability of breakdowns (Machine Health = 0) is higher than in our initial training dataset.

In the figure below, we visualize the distribution of the covariates and the Machine Health label across the training and testing data sets. The visual comparison clearly shows a shift in the label distribution, with a substantially higher frequency of breakdowns in the test data. Note that there is also a shift in the Vibration covariate. Here it is assumed that Vibration is a symptom of machine breakdowns. Following the definition of a label shift, the causal effect here is from the label (i.e., Machine Health) to the covariate (i.e., Vibration).

We now apply the Random Forest classifier to this new dataset and find that the model achieves an accuracy of 86%. This result marks a clear decrease from the perfect accuracy observed in the no-shift scenario. Trained on data where breakdowns were comparatively rare, the model now faces a scenario where they are far more common, which deteriorates its predictive accuracy.

This example highlights the need for models to be adaptable to changes in the label distribution, especially in dynamic environments where the frequency or nature of the target variable can vary over time. Detecting label shifts is more straightforward than identifying covariate or concept shifts. Regular examination of the class label distributions is a key approach to ensure they accurately represent the deployment environment.
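In the simplest case, this amounts to comparing the breakdown rate seen during training with the rate observed in deployment (the counts and the 10-percentage-point alert threshold below are arbitrary choices for illustration):

```python
import numpy as np

# Labels: 1 = healthy, 0 = breakdown
# (counts and the alert threshold are illustrative choices)
y_train = np.array([1] * 660 + [0] * 340)   # training: 34% breakdowns
y_deploy = np.array([1] * 130 + [0] * 120)  # deployment: 48% breakdowns

rate_train = 1 - y_train.mean()
rate_deploy = 1 - y_deploy.mean()

if abs(rate_deploy - rate_train) > 0.10:
    print(f"Possible label shift: breakdown rate {rate_train:.0%} -> {rate_deploy:.0%}")
```

In practice, a statistical test on the two label distributions can replace the fixed threshold.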

Concept shift

Concept shift, or concept drift, is the third form of distributional shifts that we will investigate. It is characterized by changes in the underlying relationship between covariates and labels over time (Ptrain(y|x) ≠ Ptest(y|x)). Such shifts mean that a model’s predictions may become less accurate as the learned relationships become outdated in the face of new data dynamics.

To illustrate concept shift, we introduce a new context to our simulation. Specifically, we assume that, following multiple machine breakdowns, we introduced a new maintenance routine for our milling machines. This new routine affects the relationship between our covariates and the Machine Health label: with better maintenance, the milling machines can operate for longer periods without breakdowns, thereby altering the way the Usage Time covariate relates to the Machine Health label. In effect, the new maintenance routine relaxes the Usage Time condition in our ground-truth function.

We simulate 250 production runs with the updated maintenance routine for the test set. The distributions of the covariates and the Machine Health label in the training and testing datasets are depicted below. This deployment setting reflects the new reality where the relationship between the Usage Time and Machine Health has shifted due to improved maintenance practices. Despite similar covariate distributions, it can be observed that the number of machine breakdowns has been significantly reduced. 

The Random Forest classifier, which was initially trained under different operational conditions, now encounters data where the ground-truth relationship between variables has fundamentally changed. When applying the classifier to this new data, we observe an accuracy of 84%. This decrease from the no-shift scenario demonstrates the impact of concept shift on the model’s predictive accuracy. 

Detecting concept drift is challenging because it can often appear gradually. A general strategy to detect this form of distributional shift is to systematically monitor the performance of ML models over time. This continuous assessment helps in pinpointing when the model’s predictions start deviating from expected outcomes, suggesting that the underlying relationships between covariates and labels might have changed.
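One simple way to operationalize this monitoring is a rolling-accuracy check over the stream of predictions (window size, alert threshold, and the simulated correctness stream below are all illustrative choices):

```python
import numpy as np

def rolling_accuracy(correct, window=50):
    """Mean accuracy over a sliding window of per-prediction correctness flags."""
    correct = np.asarray(correct, dtype=float)
    return np.convolve(correct, np.ones(window) / window, mode="valid")

# Illustrative stream: predictions are all correct, then concept drift sets in
# and the model is right only every third time
correct = np.array([True] * 300 + [True, False, False] * 100)

acc = rolling_accuracy(correct, window=50)
below = acc < 0.80  # alert threshold (a tuning choice)
drift_at = int(np.argmax(below)) if below.any() else None
print("First window below threshold starts at prediction:", drift_at)
```

An alert like this only says that performance degraded; distinguishing concept drift from covariate or label shift then requires inspecting the data distributions themselves.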


This article highlights a fundamental challenge in real-world applications of ML: data is constantly changing, and models must be adapted accordingly. We conducted simulations of machine downtime in a milling process to showcase the challenges posed by covariate, label, and concept shifts. In an ideal no-shift scenario, our model achieved perfect accuracy, but this quickly changed under real-world conditions of shifting data distributions.

Conventional ML models, like Random Forests, are increasingly used for industrial applications. While these methods have been made accessible to a wide audience through open source libraries (e.g., scikit-learn), they are often blindly deployed without fully assessing their performance in dynamic environments. We hope this article prompts practitioners to prioritize model monitoring and regular model retraining as key practices for preserving long-term model performance.

Readers interested in learning more about distributional shifts in manufacturing can find a case study from Aker Solutions in our recent paper, published in Production and Operations Management.

Industrial anomaly detection: Using only defect-free images to train your inspection model

This article explains why it is important to use an inspection approach that does not require images of defective products in its training set, and what kind of algorithm is suited in practice.

Requirements for visual quality inspection

Industrial anomaly detection in quality inspection tasks aims to use algorithms that automatically detect defective products. It helps manufacturers achieve high quality standards and reduce rework.

This article focuses on industrial anomaly detection with image data. Modern machine learning algorithms can process this data to decide whether a product in an image is defective. To train such algorithms, a dataset of exemplary images is needed.

An important feasibility criterion for manufacturers is the way these training datasets need to be compiled. For instance, some naive algorithms require large datasets to work reliably (around one thousand images or more for each product variant). This is expensive and often infeasible in practice. That’s why we only consider so-called few-shot algorithms that work reliably with a low number of examples, specifically far fewer than one hundred images.

Another aspect that distinguishes algorithms is whether examples of defective products are needed. Here, we can broadly distinguish two classes of algorithms: (1) “generative algorithms” that can learn from normal (or defect-free) images alone, and (2) “discriminative algorithms” that require both normal and anomalous (or defective) images.

This is an important distinction for two reasons. First, anomalies are often rare, and when the manufacturing of a new product variant starts up, no defective data is available for training. Second, by definition, “anomalous” is everything that is not normal, which makes it practically impossible to cover all possible anomalies with sufficient training data. The latter is the more important argument, so let’s look at it in more detail.

Figure 1: Example from PCB manufacturing. The green connectors on the bottom of the PCB need to be mounted correctly as shown in (a). Possible defects are misplaced, missing or incorrect connectors. Examples of missing connectors are shown in (b), (c), and (d).

Figure 1 illustrates this. The single example in (a) should already give you a good impression of the concept of “normal.” By contrast, the training images in (b) and (c) are by no means sufficient to define the concept of “anomalous” (e.g., other defect types such as discolorations or misplacements are not represented).

Choosing the right type of algorithm

To better understand how discriminative and generative models differ when applied to anomaly detection, we use the PCB example in Figure 1 to construct a hypothetical scenario. For the sake of simplicity, a discriminative algorithm can be thought of as a decision boundary in a high-dimensional feature space. Each image becomes a point in that space, and lies either on the “normal” or the “anomalous” side of the boundary. Figure 2 simplifies this even further, down to a two-dimensional feature space. Such algorithms look at the training data of the two classes (normal and anomalous) and try to extract discriminating features when constructing the decision boundary. As such, these algorithms are likely not robust to unseen and novel defect types.

Figure 2: The two subfigures show a simplified two-dimensional feature space of a discriminative model. The dashed line is the decision boundary of the model after training. The dots correspond to training and test images, where green means defect-free and red means defective. Dots with a black border were in the training set, the others were not. The letters refer to the images in Figure 1. (a) Only contains images that were used to construct the discriminative model (training images). (b) Contains both training and test images, highlighting the difficulties of a discriminative model to generalize to all types of anomalous images.

To see how a discriminative algorithm fails in practice, recall that anomalous is everything that is not normal, and consider that normal images tend to occupy only a small volume in the wide feature space. By contrast, the surrounding space of anomalous images is vast. It is thus very unlikely that sufficiently numerous and diverse examples of anomalous images can be gathered for training.

In the example of Figure 2, the training images (with a black outline) happen to cover just the lower part of the space, and the resulting decision boundary is good at making that distinction. But it does not encode the fact that defective products can also lie further above, to the right, or to the left – which is where the unseen example 1(d) happens to lie.

Figure 2 illustrates the problem with discriminative models when certain defect types are not part of the training set. The decision boundary may end up working well on the training data, but previously unseen defects can easily end up on the “normal” side of the boundary. Concretely, in this example, image 1(d) happens to be closer in feature space to the non-defective images than to the defective images 1(b) and 1(c).
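This failure mode can be reproduced with a toy discriminative model. In the following sketch (the 2D clusters and the logistic-regression classifier are illustrative assumptions, not the models behind Figure 2), the training anomalies lie only below the normal cluster, so an unseen defect type above it lands on the “normal” side of the learned boundary:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical 2D feature space, mirroring Figure 2:
# normal images cluster in the middle, training anomalies only below them
rng = np.random.default_rng(0)
normal = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(20, 2))
anomalous_train = rng.normal(loc=[0.0, -2.0], scale=0.3, size=(5, 2))

X = np.vstack([normal, anomalous_train])
y = np.array([0] * 20 + [1] * 5)            # 0 = normal, 1 = anomalous
clf = LogisticRegression().fit(X, y)

# An unseen defect type lying *above* the normal cluster (like image 1(d))
unseen_defect = np.array([[0.0, 2.0]])
print(clf.predict(unseen_defect))  # classified as "normal" (0)
```

The boundary cleanly separates the two training clusters, yet the unseen defect is assigned to the normal class, just as image 1(d) slips past the dashed boundary in Figure 2.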

For this reason, we strongly advocate using algorithms that instead focus on learning the concept of normality, and can thus be trained solely on normal images. Such algorithms can also benefit from defective images in their training set to improve robustness to specific types of defects, but crucially, they do not require them. In ML terminology, we seek industrial anomaly detection algorithms that explain how normal data is generated, as opposed to discriminating normal from anomalous images. Such models represent the generative process behind normal data, which can be used to judge whether or not an image could have been created via this generative process. If not, the image is anomalous.
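As a minimal sketch of this generative idea (a single Gaussian over hypothetical 2D features stands in for a real model of normal images), one can fit a model of normality to defect-free data only and flag anything that is unlikely under it:

```python
import numpy as np

# Fit a simple Gaussian to features of normal (defect-free) training images
rng = np.random.default_rng(0)
normal = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(20, 2))

mean = normal.mean(axis=0)
cov = np.cov(normal, rowvar=False)
cov_inv = np.linalg.inv(cov)

def mahalanobis(x):
    """Distance of a new sample from the learned model of normality."""
    d = x - mean
    return float(np.sqrt(d @ cov_inv @ d))

# Flag anything far outside the normal cluster as anomalous
threshold = 3.0
score_normal = mahalanobis(np.array([0.1, -0.1]))   # close to normal -> small
score_defect = mahalanobis(np.array([0.0, 2.0]))    # unseen defect -> large
print(score_normal < threshold, score_defect < threshold)
```

Note that no anomalous examples were used at all: the unseen defect is flagged simply because it is implausible under the model of normal data, regardless of its direction in feature space.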


The EthonAI Inspector provides manufacturers with a state-of-the-art solution to the problem of visual inspection. It performs anomaly detection with generative algorithms that can be trained on just a few defect-free images. This is a great advantage in manufacturing environments, where gathering images is expensive, especially if examples of defects must be included in the training data. In addition, the algorithms we deploy are robust to unseen defects, as outlined above. We repeatedly observe that customers uncover defect types in their manufacturing process that they were unaware of before. This significantly improves the quality assurance process as a whole.

Generative modeling (or generative AI) has seen tremendous successes in the past years. The use of such models is expected to continue to grow in manufacturing and to help set new quality standards. Most real-world scenarios require knowledge of how normal images are generated, including allowed variations such as lighting and position. EthonAI will continue to push the limits of such algorithms, and help you ensure that you don’t ship defective products to your customers.