Geophysical Insights hosting the 2018 Oil & Gas Machine Learning Symposium in Houston on September 27, 2018
Dr. Tom Smith presenting on Machine Learning at the 3D Seismic Symposium on March 6th in Denver
What is the "holy grail" of Machine Learning in seismic interpretation? by Dr. Tom Smith, GSH Luncheon 2018
Using Attributes to Interpret the Environment of Deposition - A Video Course. Taught by Kurt Marfurt, Rocky Roden, and ChingWen Chen
Dr. Kurt Marfurt and Dr. Tom Smith featured in the July edition of AOGR on Machine Learning and Multi-Attribute Analysis

Unsupervised vs. Supervised classifiers - Comparing classification results

By: Ivan Marroquin, Ph.D. - Senior Research Geophysicist

In machine learning, there is a very interesting challenge in comparing the quality of the classification result generated by either unsupervised or supervised classifiers. Most of the time, we opt for one technique over the other. Sometimes, we perform a comparison study and use a visual examination to decide which classifier produced the best outcome.

Can we do better than this? I believe so! Let’s assume that we have a dataset that consists of three well-defined groups of data points. Then, we use an unsupervised classifier to generate three clusters. The algorithm produces two outputs: (1) cluster centers and (2) the membership of each data point to its closest cluster center. As a consequence, we get the boundaries of the clusters (see figure A below). If we present the same data to a supervised classifier, assuming that the data points already have a class label assigned to them, the algorithm generates boundaries that separate the classes from one another (see figure B below). So far, you would think: I still cannot compare the classification outputs. However, there is a common trait between these results: the presence of boundaries. What if I told you that we can take advantage of the notion of boundaries in the context of supervised classifiers? In a way, the boundaries can help to derive cluster centers associated with each predicted class (see the cluster center symbols with red dashed patterns in figure B below).
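To make the idea concrete, here is a minimal sketch in plain Python. It is not the method used in this work: the synthetic data, the k-means clusterer, the nearest-centroid classifier, and the use of a per-class mean to derive centers are all assumptions chosen for illustration. It clusters three well-separated groups, classifies the same points with a supervised model, and then compares the cluster centers with the centers derived from the predicted classes.

```python
import random

rng = random.Random(0)

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def nearest(point, centers):
    """Index of the center closest to the point."""
    return min(range(len(centers)), key=lambda j: dist2(point, centers[j]))

# Synthetic data: three well-separated groups (positions are made up).
true_centers = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
points, labels = [], []
for label, (cx, cy) in enumerate(true_centers):
    for _ in range(30):
        points.append((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)))
        labels.append(label)

def kmeans(points, k, iterations=20):
    """Plain k-means; assumes no cluster ever becomes empty."""
    # Farthest-point seeding keeps the sketch deterministic.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iterations):
        members = [nearest(p, centers) for p in points]
        centers = [mean([p for p, m in zip(points, members) if m == j])
                   for j in range(k)]
    return centers, [nearest(p, centers) for p in points]

# Unsupervised result: cluster centers and membership of each point.
cluster_centers, membership = kmeans(points, 3)

# Supervised result: a nearest-centroid classifier trained on every
# other labelled point, then used to predict a class for all points.
train = list(zip(points, labels))[::2]
class_centroids = [mean([p for p, y in train if y == c]) for c in range(3)]
predicted = [nearest(p, class_centroids) for p in points]

# Derive one center per predicted class (simple stand-in: the class mean).
derived_centers = [mean([p for p, y in zip(points, predicted) if y == c])
                   for c in range(3)]

# Largest distance from a derived center to its closest cluster center.
max_gap = max(min(dist2(d, c) for c in cluster_centers) ** 0.5
              for d in derived_centers)
print("largest center mismatch:", max_gap)
```

Once both results are expressed as centers, the comparison reduces to a distance between two small sets of points instead of a visual inspection.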


Because there are so many different types of classification problems, I focused on the case of lithofacies classification from wireline well log data. I used this data to implement a machine learning pipeline to derive cluster centers. The pipeline consists of three steps (see diagram below): (1) generate a lithofacies classification, (2) derive cluster centers from the lithofacies classification result, and (3) validate the cluster centers. Each of these steps was addressed with a specific machine learning algorithm. For the first step, a multi-class feedforward neural network was used. In the second step, an evolutionary algorithm was used. And in the last step, I used a metric learning algorithm. To ensure that the best performing model in each step of the pipeline was obtained, the algorithms interacted with an automated machine learning method. New research efforts in machine learning have brought forward a concept known as “automated machine learning”. The objective of this new shift is to take us away from the manual adjustment of hyperparameters and toward using one machine learning algorithm to optimize another by finding its best hyperparameter configuration.
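The second step can be pictured with a small evolutionary search. The sketch below is only an illustration under stated assumptions: the two-attribute samples, the predicted classes (a stand-in for the neural network output), the fitness function, and all parameter values are hypothetical, and the actual evolutionary algorithm used in the pipeline is not described here. Each candidate is a set of three centers, and its fitness is the fraction of samples whose nearest center carries the sample's predicted class.

```python
import random

rng = random.Random(42)

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Hypothetical samples with classes already predicted by a classifier.
group_positions = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
samples, predicted = [], []
for cls, (cx, cy) in enumerate(group_positions):
    for _ in range(20):
        samples.append((rng.gauss(cx, 0.5), rng.gauss(cy, 0.5)))
        predicted.append(cls)

def agreement(centers):
    """Fraction of samples whose nearest center matches their predicted class."""
    hits = 0
    for p, y in zip(samples, predicted):
        j = min(range(len(centers)), key=lambda k: dist2(p, centers[k]))
        hits += (j == y)
    return hits / len(samples)

def mutate(centers, sigma=0.5):
    """Perturb every coordinate of every center with Gaussian noise."""
    return [tuple(x + rng.gauss(0.0, sigma) for x in c) for c in centers]

def evolve(pop_size=20, generations=60, n_classes=3):
    lo = [min(p[i] for p in samples) for i in range(2)]
    hi = [max(p[i] for p in samples) for i in range(2)]
    # Random initial candidates inside the bounding box of the data.
    population = [[tuple(rng.uniform(l, h) for l, h in zip(lo, hi))
                   for _ in range(n_classes)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=agreement, reverse=True)
        history.append(agreement(population[0]))
        parents = population[: pop_size // 2]          # the best half survives
        children = [mutate(rng.choice(parents))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    best = max(population, key=agreement)
    history.append(agreement(best))
    return best, history

best_centers, history = evolve()
print("agreement of first and last generation:", history[0], history[-1])
```

Because the best half of each generation survives unchanged, the best agreement can never decrease from one generation to the next. An automated machine learning layer would sit outside a loop like this one and tune settings such as `pop_size` and `sigma`.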


To demonstrate the effectiveness of the proposed machine learning pipeline and the quality of the obtained cluster centers, a lithofacies classification was produced from the derived cluster centers. In the next figure, from left to right, the first four panels show the wireline log data used to train the neural network. The following panel displays the neural network-based lithofacies classification. Note that three lithofacies classes were predicted: reservoir sand (bands in yellow), tight sand (bands in cyan), and floodplain rocks (bands in gray). The last panel displays the lithofacies classification from the derived cluster centers. There is a strong match between the two classifications, not only in the occurrence of reservoir sands but also in the lithofacies sequence and boundaries.
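A match of this kind can also be expressed in numbers. The sketch below uses two short, made-up facies logs (not the data shown in the figure) and three simple measures: the sample-by-sample match rate, the order of the facies, and the positions of the facies boundaries.

```python
# Hypothetical facies logs, one label per depth sample (top to bottom).
nn_facies = (["floodplain"] * 3 + ["tight sand"] * 2
             + ["reservoir sand"] * 4 + ["floodplain"] * 3)
center_facies = (["floodplain"] * 3 + ["tight sand"] * 3
                 + ["reservoir sand"] * 3 + ["floodplain"] * 3)

def match_rate(a, b):
    """Fraction of depth samples where both logs carry the same facies."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def boundaries(log):
    """Sample indices at which the facies changes."""
    return [i for i in range(1, len(log)) if log[i] != log[i - 1]]

def sequence(log):
    """Facies in order of occurrence, with repeated samples collapsed."""
    return [log[0]] + [log[i] for i in boundaries(log)]

rate = match_rate(nn_facies, center_facies)
print("match rate:", round(rate, 3))
print("same facies sequence:", sequence(nn_facies) == sequence(center_facies))
print("boundaries (network):", boundaries(nn_facies))
print("boundaries (centers):", boundaries(center_facies))
```

In this toy case the two logs share the same facies sequence and differ by a single sample at one boundary.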

            I am thankful to Geophysical Insights for granting permission to present this research work at the upcoming SEG-SBGf Workshop on Machine Learning.

            If you are interested in learning how we extract meaningful geological information from seismic with machine learning, and how our technology has helped geoscientists in finding hydrocarbons, please visit our website. Or, if you desire further information, feel free to contact us.



Seismic Interpretation of DHI Characteristics with Machine Learning

Direct Hydrocarbon Indicators (DHIs) are seismic anomalies due to the presence of hydrocarbons, caused by changes in rock physics properties, typically of the hydrocarbon-filled reservoir in relation to the encasing rock or the brine portion of the reservoir. The accurate interpretation of DHI characteristics has proven to significantly improve the success rates of drilling commercial wells.

Seismic Interpretation Below Tuning with Multi-Attribute Analysis

This international webinar describes how multi-attribute seismic analysis is applied using the Paradise® software to visualize thin beds and facies below classical seismic tuning thickness. The material is presented by Mr. Rocky Roden, an industry thought leader and Senior Consulting Geophysicist for Geophysical Insights.

Profile of the Future Interpreter: A movie by Dr. Kurt Marfurt of the AASPI Consortium, O.U.

Dr. Kurt Marfurt of the AASPI Consortium presents an example of how new interpreters are using machine learning to enhance seismic interpretation. Full Video Transcription: [Murphy] Hey Dr. M, I've got a problem. I'm trying to map sand prone areas in a deep-water turbidite and I don't know where to start.

Seismic Facies of the Eagle Ford Texas

A presentation by Patricia Santogrossi at the Houston Geological Society (HGS) North American dinner covering seismic facies in the Eagle Ford.

Attribute Essentials: History and Theory

The origins of seismic attributes and the theory behind how they are calculated. For more information, please visit our website

Attribute Essentials: Categories of Attributes

For more information, please visit our website

Eagle Ford Formation Case Study

Latest Technology for Seismic Interpretation: Direct detection & delineation of facies architecture in the Eagle Ford Group or How did the Eagle Ford GP get Made?

An Introduction to Paradise 3.0

Paradise is multi-attribute analysis software that uses machine learning processes to extract more information from the seismic response, even below seismic resolution. Using Paradise, interpreters are able to analyze multiple attributes simultaneously and calibrate results to wells quickly.


Comparison of Seismic Inversion and SOM Seismic Multi-Attribute Analysis


 Inversion versus SOM Seismic Multi-attribute analysis

Self-Organizing Maps (SOM) is a relatively new approach for seismic interpretation in our industry and should not be confused with seismic inversion or rock modeling.  The descriptions below differentiate SOM, which is a statistical classifier, from seismic inversion.  

Seismic Inversion

The purpose of seismic inversion is to transform seismic reflection data into rock and fluid properties. This is done by converting reflectivity data (interface properties) to layer properties. If elastic parameters are desired, then the inversion must be performed on reflectivity from AVO. The most basic inversion calculates the acoustic impedance (density × velocity) of layers, from which predictions about lithology and porosity can be made. The more advanced inversion methods attempt to discriminate specifically between lithology, porosity, and fluid effects. Inversions can be grouped into categories: pre-stack vs. post-stack, deterministic vs. geostatistical, or relative vs. absolute. Most inversions require an estimate of the wavelet and a calculation of the low-frequency trend obtained from well control and velocity information. Without an accurate calibration of these parameters, the inversion is non-unique. Inversion requires a stringent set of data conditions from the well logs and seismic. The accuracy of inversion results is directly related to the availability of significant good-quality well control, usually requiring numerous wells in the same stratigraphic interval for reasonable results.
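The relation between interface and layer properties can be shown with a very small example. The sketch below assumes normal incidence, a noise-free and wavelet-free reflectivity series, and made-up layer values; it is not a production inversion. Acoustic impedance is density × velocity, the reflection coefficient at an interface is (Z2 - Z1) / (Z2 + Z1), and the recursion Z2 = Z1 (1 + R) / (1 - R) turns reflectivity back into impedance once a starting impedance is supplied.

```python
# Hypothetical layer properties (density in g/cm^3, velocity in m/s).
density = [2.20, 2.35, 2.10, 2.45]
velocity = [2400.0, 2900.0, 2600.0, 3300.0]

# Layer property: acoustic impedance = density x velocity.
impedance = [rho * v for rho, v in zip(density, velocity)]

def reflectivity(z):
    """Interface property: normal-incidence reflection coefficients."""
    return [(z2 - z1) / (z2 + z1) for z1, z2 in zip(z, z[1:])]

def recursive_inversion(r, z0):
    """Rebuild layer impedances from reflectivity and a starting impedance."""
    z = [z0]
    for rc in r:
        z.append(z[-1] * (1 + rc) / (1 - rc))
    return z

r = reflectivity(impedance)

# With the correct starting impedance the layers are recovered.
recovered = recursive_inversion(r, impedance[0])

# With a starting impedance that is 10% too high, every layer is 10% too high.
biased = recursive_inversion(r, 1.1 * impedance[0])

print("reflectivity:", [round(x, 4) for x in r])
print("recovered   :", [round(x, 1) for x in recovered])
print("biased      :", [round(x, 1) for x in biased])
```

The biased run illustrates why calibration matters: the reflectivity alone fixes only impedance ratios, so the absolute level has to come from outside information such as well control.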


SOM Seismic Multi-Attribute Analysis   

Self-Organizing Maps (SOM) is a non-linear mathematical approach that classifies data into patterns or clusters. It is an artificial neural network that employs unsupervised learning. SOM requires no previous information for training, but evaluates the natural patterns and clusters present in the data. A seismic multi-attribute approach involves selecting several attributes that potentially reveal aspects of geology and evaluating how these data form natural organizational patterns with SOM. The results from a SOM analysis are revealed by a 2D color map that identifies the patterns present in the multi-attribute data set. The input data for SOM can be any type of seismic attribute, that is, any measurable property of the seismic. Any type of inversion is an attribute type that can be included in a SOM analysis. A SOM analysis will reveal geologic features in the data, and which features appear is dictated by the type of seismic attributes employed. The SOM classification patterns can relate to defining stratigraphy, seismic facies, direct hydrocarbon indicators, thin beds, and aspects of shale plays, such as fault/fracture trends and sweet spots. The primary considerations for SOM are the sample rate, the seismic attributes employed, and the seismic data quality. SOM addresses the issues of evaluating dozens of seismic attribute volumes (Big Data) and understanding how these numerous volumes are inter-related.
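For readers who have not seen a SOM before, the following is a minimal sketch of the training loop in plain Python. It is a generic textbook-style SOM, not the Paradise implementation, and the 2 x 2 map size, the learning-rate and neighborhood schedules, and the three-attribute samples are all assumptions made for illustration. No labels are used: each sample simply pulls its winning neuron, and that neuron's neighbors on the map, toward itself.

```python
import math
import random

rng = random.Random(1)

# Hypothetical multi-attribute samples: three attribute values per sample,
# drawn from two natural groups that the SOM is not told about.
samples = ([[rng.gauss(0.0, 0.1) for _ in range(3)] for _ in range(50)]
           + [[rng.gauss(1.0, 0.1) for _ in range(3)] for _ in range(50)])

ROWS, COLS = 2, 2
grid = [(r, c) for r in range(ROWS) for c in range(COLS)]   # neuron positions
weights = [list(rng.choice(samples)) for _ in grid]          # one vector per neuron

def bmu(sample):
    """Index of the best-matching (winning) neuron for a sample."""
    return min(range(len(weights)),
               key=lambda i: sum((w - s) ** 2 for w, s in zip(weights[i], sample)))

def train(epochs=20, lr0=0.5, sigma0=1.0):
    steps = epochs * len(samples)
    t = 0
    for _ in range(epochs):
        order = samples[:]
        rng.shuffle(order)
        for s in order:
            frac = t / steps
            lr = lr0 * (1 - frac)                 # learning rate decays to zero
            sigma = sigma0 * (1 - frac) + 0.1     # neighborhood radius shrinks
            win = grid[bmu(s)]
            for i, node in enumerate(grid):
                d2 = (node[0] - win[0]) ** 2 + (node[1] - win[1]) ** 2
                h = math.exp(-d2 / (2 * sigma ** 2))   # neighborhood weight
                weights[i] = [w + lr * h * (x - w) for w, x in zip(weights[i], s)]
            t += 1

train()

# Classification: every sample is assigned the index of its winning neuron.
classes = [bmu(s) for s in samples]
print("samples per neuron:", [classes.count(i) for i in range(len(grid))])
```

In a seismic setting each sample would hold the attribute values at one point in the volume, and the neuron index would be displayed through the 2D color map.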

 Seismic inversion attempts to invert the seismic data into rock and fluid properties by converting seismic data from interface properties into layer properties. Numerous wells and good-quality well information in the appropriate zone are necessary for successful inversion calculations; otherwise, solutions are non-unique. For successful inversions, wavelet effects must be removed and the low-frequency trend must be accurate.


SOM identifies the natural organizational patterns in a multi-attribute classification approach. Geologic features and geobodies exhibit natural patterns or clusters, which can be corroborated with well control if present, although well control is not necessary for the SOM analysis. For successful SOM analysis, the appropriate seismic attributes must be selected.


 Rocky Roden, Senior Geoscience Consultant, Geophysical Insights
Rocky R. Roden has extensive knowledge of modern geoscience technical approaches (past Chairman, The Leading Edge Editorial Board). As former Chief Geophysicist and Director of Applied Technology for Repsol-YPF, his role comprised advising corporate officers, geoscientists, and managers on interpretation, strategy and technical analysis for exploration and development in offices in the U.S., Argentina, Spain, Egypt, Bolivia, Ecuador, Peru, Brazil, Venezuela, Malaysia, and Indonesia. He has been involved in the technical and economic evaluation of Gulf of Mexico lease sales, farmouts worldwide, and bid rounds in South America, Europe, and the Far East. Previous work experience includes exploration and development at Maxus Energy, Pogo Producing, Decca Survey, and Texaco. He holds a B.S. in Oceanographic Technology-Geology from Lamar University and an M.S. in Geological and Geophysical Oceanography from Texas A&M University.


The Value of Instantaneous Attributes


 Wheeler Diagram - Instantaneous Attribute for Multi-attribute analysis

Our industry may have heard of instantaneous attributes for the first time in 1979 with the publication of Taner et al.’s paper on “Complex Trace Analysis”¹ in Geophysics. I first became aware of them in 1982, when a little blue booklet, published by the same authors in 1981 in association with Seiscom Delta, came to my attention. It seemed that the suggested use for these attributes was structural analysis. As an early seismic stratigrapher specializing in lateral prediction at Shell’s Bellaire Research Center, I acquired the programs to run as PAL jobs on our mainframe computer. Color plots of instantaneous phase, envelope, and instantaneous frequency with my own proprietary colorbars helped me look at the Atlantic Margin in a whole new way.
These attributes are called instantaneous because they have a value at every 4 or 2 ms sample; this enhances continuity, makes discontinuities more apparent, and need not include the “cloaking” and distortion of amplitude.
The culmination of my work at Shell came with its very effective and most confidential application, in the late ’80s and earliest ’90s, at Mars field to solve a drilling problem, reclassify a regional structure, and, most importantly, rewrite the book on the architecture and extents of multiple key reservoirs. Earlier work (’86-’89) on key applications in Brazil had literally caused a rewrite of the stratigraphic lexicon there and brought attention to the importance of their application.
I have continued to use the technology ever since on all sorts of platforms and, until 1999, when Clemenceau and Colbert of Amoco published one instantaneous phase section over Ram Powell field, had never seen another stratigraphic application.
A second go-round with multiple attributes, in the period from ’98 to 2002, was unsupervised but not based on a neural network. Rather, it was based on technology that was used to map the human genome and, as such, was limited to the use of only 4 attributes at a time.

Current Case studies
Up to 16 instantaneous attributes have been generated in a convenient interpretation software package; it is important to know that this package uses the Real Part amplitude data as the parent volume. A set of “instantaneous attributes” (approx. 16) can also be created in the Paradise software, but these are calculated on the conventional RFC amplitude data. This produces an inherent difference in some of the results, notably the Instantaneous Phase.
The intent of either approach is to deliver inherently smoother and higher frequency simultaneous multiple-attribute data to PCA and subsequently to and from the SOM classifier, a learning machine that organizes the samples into natural clusters that reveal geologic features and anomalies in the data. The use of machine learning tools represents a fundamental and dramatic step change in the science of seismic interpretation (Tom Smith, pers. comm., April 2016). Using these tools, a relatively small group of instantaneous attributes, regardless of depositional environment, seems to do an excellent job of resolving identifiable system and facies tracts in seismic data, here with a sample-based resolution of 10-12’ for 2 ms data.
From the PCA for the Eagle Ford, for example, nine attributes were run in the SOM, along with using the base survey data to “prune,” that is, to remove null values from the data volume so as not to assign a neuron wastefully. A brief description of the five most commonly occurring and most independent instantaneous attributes in neuron clusters in the Eagle Ford is as follows:

  • Instantaneous Phase (10.1%), which is useful for stratigraphic and structural continuity and discontinuity enhancement;
  • Normalized Amplitude (13.9%), also known as Cosine of Instantaneous Phase, which returns the energy distinctly from peaks versus troughs;
  • Relative Acoustic Impedance (14.8%), which helps to resolve geobodies;
  • Envelope, or total energy of the entire reflected waveform, which includes the Real Part (15.9%), equivalent to the base survey data and measurable, and the Imaginary Part (12.2%), which is not;
  • Trace Envelope (5%), which comprises a much smaller part of the data.
  • In addition to the above, Thin Bed Indicator (3.5%), Instantaneous Frequency (2.5%), and Envelope 2nd Derivative (?) rounded out the nine suggested by the PCA. However, these three were less evident in the area investigated or were possibly in the background.
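Several of the attributes listed above come directly from complex trace analysis. The sketch below is a generic illustration, not the calculation used by any particular software package: it builds the analytic signal of a synthetic trace with a plain discrete Fourier transform and derives the Real Part, Imaginary Part, Envelope, Instantaneous Phase, Normalized Amplitude (cosine of instantaneous phase), and Instantaneous Frequency. The trace, its amplitude, and the 2 ms sample interval are made-up values.

```python
import cmath
import math

def analytic_signal(trace):
    """Analytic signal via the DFT: keep DC and Nyquist, double the positive
    frequencies, zero the negative ones, then transform back."""
    n = len(trace)
    spectrum = [sum(trace[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)) for k in range(n)]
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0
        elif k < (n + 1) // 2:
            gain = 2.0
        else:
            gain = 0.0
        spectrum[k] *= gain
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n for t in range(n)]

# Synthetic trace: 64 samples at 2 ms, 4 full cycles, amplitude 2.
dt = 0.002
n = 64
trace = [2.0 * math.cos(2 * math.pi * 4 * t / n) for t in range(n)]

z = analytic_signal(trace)

real_part = [v.real for v in z]                # equals the input trace
imaginary_part = [v.imag for v in z]           # the quadrature trace
envelope = [abs(v) for v in z]                 # total energy of the waveform
inst_phase = [cmath.phase(v) for v in z]       # radians, wrapped to (-pi, pi]
normalized_amplitude = [math.cos(p) for p in inst_phase]

# Instantaneous frequency in Hz from the phase change between samples.
inst_frequency = [cmath.phase(z[t + 1] * z[t].conjugate()) / (2 * math.pi * dt)
                  for t in range(n - 1)]

print("envelope at sample 10:", round(envelope[10], 6))
print("instantaneous frequency at sample 10 (Hz):", round(inst_frequency[10], 3))
```

For this pure cosine the envelope is constant and the normalized amplitude is the trace with its amplitude removed, which is why the cosine of instantaneous phase emphasizes peak and trough position rather than reflection strength.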

A check of the total relative and total independence of these attributes in the neuron clusters that characterize the facies of the Eagle Ford Group clastics and carbonates showed the following:

Only 3 neuron clusters (facies) out of 26 have just 3 prominent attributes (IP, NA, Real Part). Two of these are the uppermost facies of a carbonate stack: #62 for the carbonate regressive margin and #6 at the top of the EF Marl. Both are high-resistivity facies. The lowermost part of Geobody 2, N51, has the same order as the upper marl and is not calibrated.

Only 1 neuron cluster or facies, N57, has 6 prominent attributes, and its samples comprise only 0.3 of 1% of the total samples in the 110 ms model. Its 2nd, 4th, and 6th attributes are those that each comprise 5% or less of the data. Of these, Trace Envelope is nearly completely restricted to three neuron clusters (N57, 58, 59) that form the mid to downdip central core of Geobody 1 and N52 in Geobody 2. The only other occurrence is in N24 at the base of the marl section updip, which appears to be 95% carbonate in one XRD sample. All calibrated instances are high resistivity.

1 Taner, M. T., F. Koehler, and R. E. Sheriff, 1979, Complex seismic trace analysis: Geophysics, v. 44, no. 6, p. 1041-1063.

Patricia Santogrossi is a geoscientist who has enjoyed 40 years in the oil business. She is currently a Consultant to Geophysical Insights, producer of the Paradise multi-attribute analysis software platform. Formerly, she was a Leading Reservoir Geoscientist and Non-operated Projects Manager with Statoil USA E & P. In this role Ms. Santogrossi was engaged for nearly nine years in Gulf of Mexico business development, corporate integration, prospect maturation, and multiple appraisal projects in the deep and ultra-deepwater Gulf of Mexico. Ms. Santogrossi has previously worked with domestic and international Shell Companies, Marathon Oil Company, and Arco/Vastar Resources in research, exploration, leasehold and field appraisal as well as staff development. She has also been Chief Geologist for Chroma Energy, who possessed proprietary 3D voxel multi-attribute visualization technology, and for Knowledge Reservoir, a reservoir characterization and simulation firm that specialized in Deepwater project evaluations. A longtime member of SEPM, AAPG, GCSSEPM, HGS and SEG, Ms. Santogrossi has held various elected and appointed positions in these industry organizations. She has recently begun her fourth three-year term as a representative to the AAPG House of Delegates from the Houston Geological Society (HGS). In addition, she has been invited to continue her role this fall on the University of Illinois’ Department of Geology Alumni Board. Ms. Santogrossi was born, raised, and educated in Illinois before she headed to Texas to work for Shell after she received her MS in Geology from the University of Illinois, Champaign-Urbana. Her other ‘foreign assignments’ have included New Orleans and London. She resides in Houston with her husband of twenty-four years, Joe Delasko.


Machine Learning - The Next Generation Seismic Interpretation


Most people associate neural networks, big data and big number crunching as parts of a single paradigm for access to web information.  Articulate a query and wait for an answer.  But in this particular field and at this particular time, we “must” place neural networks and big data tools in the hands of seismic interpreters.  They are accustomed to working interactively with their data.  We do this not because they are a narrow-minded bunch who are unwilling to work with new tools unless it’s interactive, but because none of us know enough about the multi-attribute properties of seismic data – we don’t know the semantics of the words in these data – to let neural networks fly through seismic data unattended.  In other words, it ain’t English.  We don’t know the language yet.  


Machine Learning and Truck Driving


Einstein stated eloquently, “We can't solve problems by using the same kind of thinking we used when we created them.” Unlike us, machines don’t have cognitive biases and “experience baggage” when reading through data. They make their assessments based on pattern recognition and algorithms....