Meet CIM’s Machine Learning Classification Model

Machine learning, artificial intelligence (AI) and data science are interrelated fields whose names are often used interchangeably, sometimes leading to confusion. However, a company’s effectiveness and, ultimately, its success may depend on how it implements them. Indeed, a 2020 Deloitte survey of 2,875 IT and business executives from various industries, including energy, found that 67% of companies are using machine learning, and 97% are using or planning to use it in the next year.

In this blog post, we’ll discuss how OneBridge Solutions (OBS) applies machine learning and data science to pipeline integrity challenges to predict and ultimately prevent pipeline failures.

Machine Learning

According to the MIT Sloan School of Management: “Machine learning is a subfield of artificial intelligence that gives computers the ability to learn without explicitly being programmed.” (To add to the confusion, AI is sometimes referred to as a subset of machine learning.) AI is a broad and evolving term used to describe a field of science that uses technology to create machines that can mimic human intelligence. This definition is deliberately vague, as AI is often used as a catchall for data science and machine learning as well as technology that is yet to be developed.

Machine learning (ML) is the science of designing software that can learn autonomously or in concert with other machines or humans. These programs, or models, learn from data, often applying statistical techniques to recognize patterns in the data and generalize what they have learned to unseen data, thus performing tasks without explicit instructions. This effectively allows computers to program themselves through “experience” or to be self-running.

Machine Learning Data

ML models learn from the data they are provided with and then use their learning to describe, prescribe or predict. Machine learning starts with data that has been gathered and prepared, often referred to as “training data.” The model’s performance can then be evaluated by applying the model to unseen data. As a rule of thumb, the more representative data a model is trained on, the better it works.

Data Science

Data science uses scientific methods, statistics and other tools to extract insights and create useful intelligence from data. Machine learning can be used in data science to automate the process of data analysis, and data science can in turn contribute to the growth of AI and machine learning. Because they are intertwined, the terms machine learning and data science are often used interchangeably, and drawing a distinction between the two is often unnecessary.

A more important dichotomy is machine learning/data science vs. traditional rule-based algorithms. Machine learning can perform tasks without explicit instructions whereas algorithms not based on ML need explicit instructions to work.

CIM’s Application of Machine Learning

Machine learning is utilized in the first Cognitive Integrity Management (CIM™) process, Assessment Planning. Here, the details of each integrity assessment are managed and, most importantly, the assessment results are uploaded into the software platform, where the data is “ingested,” analyzed and standardized by the classification model.

The Problem

More than 30 companies provide inline inspection (ILI) services. Not only does the report formatting vary with each company, but the way in which each pipeline anomaly is classified and described can also vary significantly from company to company and report to report.

Therefore, the exact same anomaly (e.g., “dent with metal loss”) could be identified in several different ways, depending on the ILI company or technology used:

  • Anomaly type = dent with metal loss
  • Anomaly type = dent; Comment = with metal loss
  • Anomaly type = dent; Metal Loss depth = 20% (indirectly indicating a dent with metal loss)
  • Anomaly type = dent

 

For various reasons, pipeline operators use different companies for inline inspections but require these inline inspection reports to be compared to understand the change in anomalies over time. This requires that the inline inspection data be classified and normalized in a consistent fashion.

The Solution

The OBS team developed a series of classification models trained on over 6,000 inline inspection reports with hundreds of different formats and over 50 million reported anomalies. This ensures that the ingestion process can accept a wide variety of different report structures while also accurately identifying and (re)classifying anomalies based on the different ways they’ve been described by the inline inspection provider. 

This results in machine learning models that can interpret the data in each ILI report, regardless of the format or vendor. 

To do this, the feature classifier first looks at all the available data in the pipe tally to determine what the ILI report is describing:

  • anomaly type
  • description of the anomaly
  • any additional comments
  • associated attributes of each anomaly

Then the model interprets or reclassifies each anomaly to a standardized name and format.
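As a rough illustration of this idea (not the OBS models themselves, whose design is not described here), the sketch below trains a tiny naive Bayes text classifier on a handful of made-up tally rows. The field names (`type`, `description`, `comment`, `metal_loss_depth_pct`), the standardized labels and the training rows are all assumptions chosen for the example.

```python
import math
from collections import Counter, defaultdict

def tokens(record):
    """Flatten one pipe-tally row into a bag of lowercase words."""
    text = " ".join(str(record.get(k) or "") for k in ("type", "description", "comment"))
    words = text.lower().split()
    # Represent a numeric attribute as a token so it counts as evidence too.
    if (record.get("metal_loss_depth_pct") or 0) > 0:
        words.append("has_metal_loss_depth")
    return words

class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def fit(self, examples):
        self.label_counts = Counter(label for _, label in examples)
        self.word_counts = defaultdict(Counter)
        for record, label in examples:
            self.word_counts[label].update(tokens(record))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, record):
        # Words never seen in training carry no evidence, so skip them.
        words = [w for w in tokens(record) if w in self.vocab]
        total = sum(self.label_counts.values())
        scores = {}
        for label, n in self.label_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            scores[label] = math.log(n / total) + sum(
                math.log((self.word_counts[label][w] + 1) / denom) for w in words
            )
        return max(scores, key=scores.get)

# Hypothetical training rows: the same anomaly described in different vendor styles.
TRAINING = [
    ({"type": "dent with metal loss"}, "DENT_WITH_METAL_LOSS"),
    ({"type": "dent", "comment": "with metal loss"}, "DENT_WITH_METAL_LOSS"),
    ({"type": "dent", "metal_loss_depth_pct": 20}, "DENT_WITH_METAL_LOSS"),
    ({"type": "dent", "metal_loss_depth_pct": 12}, "DENT_WITH_METAL_LOSS"),
    ({"type": "dent"}, "DENT"),
    ({"type": "DENT"}, "DENT"),
    ({"type": "plain dent"}, "DENT"),
    ({"type": "deformation", "description": "dent"}, "DENT"),
    ({"type": "metal loss"}, "METAL_LOSS"),
    ({"type": "corrosion", "description": "external metal loss"}, "METAL_LOSS"),
    ({"type": "metal loss", "metal_loss_depth_pct": 35}, "METAL_LOSS"),
]
model = NaiveBayes().fit(TRAINING)

for row in (
    {"type": "Dent", "comment": "metal loss noted"},
    {"type": "DENT", "metal_loss_depth_pct": 15},
):
    print(row, "->", model.predict(row))
```

Both example rows describe the anomaly differently, yet the toy model maps them to the same standardized label. A production classifier would presumably be trained on far more data and richer features than this sketch.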

In the “dent with metal loss” example above, the anomaly would be categorized in CIM in the same manner for each ILI, allowing the pipeline operator to track this anomaly from inspection to inspection and determine whether the dent and/or corrosion has changed and at what rate, specifically to calculate a corrosion growth rate.
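Once the same anomaly has been matched across two inspections, a growth rate follows from simple arithmetic. The sketch below assumes linear growth between the two inspection dates and ignores ILI sizing tolerance. The function name, its parameters and the sample values are hypothetical, not taken from CIM.

```python
from datetime import date

def corrosion_growth_rate(depth_pct_old, date_old, depth_pct_new, date_new,
                          wall_thickness_mm):
    """Linear growth rate in mm/year between two matched ILI measurements.

    Depths are given as a percentage of wall thickness, as ILI reports
    commonly state them; this is an assumption of the sketch.
    """
    years = (date_new - date_old).days / 365.25
    if years <= 0:
        raise ValueError("the newer inspection must come after the older one")
    depth_change_mm = (depth_pct_new - depth_pct_old) / 100.0 * wall_thickness_mm
    return depth_change_mm / years

# Hypothetical matched anomaly: 20% deep in 2015, 26% deep in 2020, 10 mm wall.
rate = corrosion_growth_rate(20, date(2015, 6, 1), 26, date(2020, 6, 1), 10.0)
print(f"{rate:.2f} mm/year")
```

With these sample values the metal loss deepened by 0.6 mm over roughly five years, i.e. about 0.12 mm per year.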

The machine learning model can also “pick out” data that is important to the anomaly analysis, such as gas class location, values for MOP/MAOP or ILI sizing tolerance.

Pipeline operators are therefore not limited to a single inline inspection company and can effectively match and compare anomalies across inline inspections throughout the entire history of the pipeline, allowing for a more effective and efficient pipeline integrity management program.

Additional ML and Data Science Applications

Many machine learning models are predictive in that they are utilized to predict or impute data. To that end, ML can be utilized to fill in data gaps in pipeline information, e.g., when gathering information for a risk assessment, an integral part of a pipeline integrity program. Running risk analyses on pipelines with missing data can yield nonsensical results. Therefore, filling in data gaps before conducting a risk assessment can yield more useful results. If a pipeline operator has a pipeline that is missing coating information for ½ mile of a 10-mile pipeline, a machine learning model can be used to predict the missing coating type. It can also be used to flag data discrepancies, e.g., does the coating type say coal tar (a pipeline coating whose usage significantly decreased after 1990) while the install date says 2020?
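Both ideas can be sketched in a few lines. The example below uses a nearest-neighbor vote over adjacent segments as a simple stand-in for a trained model, plus a plain consistency check for the coal tar case. The segment fields, the sample pipeline and the 1990 threshold (taken from the usage decline mentioned above) are assumptions for illustration only.

```python
from collections import Counter

def impute_coating(segments, k=3):
    """Fill missing coating types by majority vote of the k nearest known segments."""
    known = [s for s in segments if s["coating"] is not None]
    filled = []
    for seg in segments:
        if seg["coating"] is not None:
            filled.append(dict(seg))
            continue
        nearest = sorted(
            known, key=lambda s: abs(s["start_mile"] - seg["start_mile"])
        )[:k]
        vote = Counter(s["coating"] for s in nearest).most_common(1)[0][0]
        # Mark the value as imputed so it is never mistaken for a record.
        filled.append({**seg, "coating": vote, "coating_imputed": True})
    return filled

def flag_discrepancies(segments, decline_year=1990):
    """Return segments whose coating type looks inconsistent with the install year."""
    return [
        s for s in segments
        if s["coating"] == "coal tar" and s["install_year"] > decline_year
    ]

# Hypothetical 10-mile pipeline with a half-mile gap in the coating records.
SEGMENTS = [
    {"start_mile": 0.0, "coating": "FBE", "install_year": 2005},
    {"start_mile": 4.0, "coating": "FBE", "install_year": 2005},
    {"start_mile": 4.5, "coating": None, "install_year": 2005},
    {"start_mile": 5.0, "coating": "FBE", "install_year": 2005},
    {"start_mile": 9.0, "coating": "coal tar", "install_year": 2020},
]

filled = impute_coating(SEGMENTS)
print(filled[2])                   # the gap, now filled and marked as imputed
print(flag_discrepancies(filled))  # coal tar recorded with a 2020 install date
```

A flagged segment is a prompt for review rather than proof of an error, since coal tar usage declined after 1990 but did not necessarily stop.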

ML models can also be trained on the information on a set of pipelines and utilized to predict an event on a different but similar pipeline.    

Click here to learn more about how OneBridge has applied machine learning/data science to predict internal corrosion.

Want to learn more? Contact us!