Applications of Machine Learning for Geoscientists – Permian Basin

By Carrie Laudon
Published with permission: Permian Basin Geophysical Society 60th Annual Exploration Meeting
May 2019

Abstract

Over the last few years, because of the increase in low-cost computer power, individuals and companies have stepped up investigations into the use of machine learning in many areas of E&P. For the geosciences, the emphasis has been in reservoir characterization, seismic data processing, and to a lesser extent interpretation. The benefits of using machine learning (whether supervised or unsupervised) have been demonstrated throughout the literature, and yet the technology is still not a standard workflow for most seismic interpreters. This lack of uptake can be attributed to several factors, including a lack of software tools, clear and well-defined case histories, and training. Fortunately, all these factors are being mitigated as the technology matures. Rather than looking at machine learning as an adjunct to the traditional interpretation methodology, machine learning techniques should be considered the first step in the interpretation workflow.

By using statistical tools such as Principal Component Analysis (PCA) and Self-Organizing Maps (SOM), a multi-attribute 3D seismic volume can be “classified”. The PCA reduces a large set of seismic attributes, both instantaneous and geometric, to those that are most meaningful. The output of the PCA serves as the input to the SOM, a form of unsupervised neural network, which, when combined with a 2D color map, facilitates the identification of clustering within the data volume. When the correct “recipe” is selected, the clustered or classified volume allows the interpreter to view and separate geological and geophysical features that are not observable in traditional seismic amplitude volumes. Seismic facies, detailed stratigraphy, direct hydrocarbon indicators, faulting trends, and thin beds are all features that can be enhanced by using a classified volume.

The tuning-bed thickness, or vertical resolution, of seismic data is traditionally based on the frequency content of the data and the associated wavelet. Seismic interpretation of thin beds routinely involves estimation of tuning thickness and the subsequent scaling of amplitude or inversion information below tuning. These traditional below-tuning-thickness estimation approaches have limitations and require assumptions that limit accuracy. Below-tuning effects result from the interference of wavelets, which is a function of the geology as it changes vertically and laterally. Numerous instantaneous attributes exhibit effects at and below tuning, but these are seldom incorporated in thin-bed analyses. A seismic multi-attribute approach employs self-organizing maps to identify natural clusters from combinations of attributes that exhibit below-tuning effects. These results may exhibit changes as thin as a single sample interval in thickness. Self-organizing maps employed in this fashion analyze associated seismic attributes on a sample-by-sample basis and identify the natural patterns or clusters produced by thin beds. Examples of this approach to improve stratigraphic resolution in both the Eagle Ford play and the Niobrara reservoir of the Denver-Julesburg Basin will be used to illustrate the workflow.

Introduction

Seismic multi-attribute analysis has always held the promise of improving interpretations via the integration of attributes which respond to subsurface conditions such as stratigraphy, lithology, faulting, fracturing, fluids, pressure, etc. The benefits of using machine learning (whether supervised or unsupervised) have been demonstrated throughout the literature, and yet the technology is still not a standard workflow for most seismic interpreters. This lack of uptake can be attributed to several factors, including a lack of software tools, clear and well-defined case histories, and training. This paper focuses on an unsupervised machine learning workflow utilizing Self-Organizing Maps (Kohonen, 2001) in combination with Principal Component Analysis to produce classified seismic volumes from multiple instantaneous attribute volumes. The workflow addresses several significant issues in seismic interpretation: it analyzes large amounts of data simultaneously; it determines relationships between different types of data; it is sample based and produces high-resolution results; and it reveals geologic features that are difficult to see in conventional approaches.

Principal Component Analysis (PCA)

Multi-dimensional analysis and multi-attribute analysis go hand in hand. Because individuals are grounded in three-dimensional space, it is difficult to visualize what data in a higher-dimensional space looks like. Fortunately, mathematics doesn’t have this limitation, and the results can be easily understood with conventional 2D and 3D viewers.

Working with multiple instantaneous or geometric seismic attributes generates tremendous volumes of data. These volumes contain huge numbers of data points which may be highly continuous, greatly redundant, and/or noisy (Coleou et al., 2003). Principal Component Analysis (PCA) is a linear technique for data reduction which maintains the variation associated with the larger data sets (Guo et al., 2009; Haykin, 2009; Roden et al., 2015). PCA can separate attribute types by frequency, distribution, and even character. PCA technology is used to determine which attributes may be ignored due to their very low impact on neural network solutions and which attributes are most prominent in the data. Figure 1 illustrates the analysis of a data cluster in two directions, offset by 90 degrees. The first principal component (eigenvector 1) analyzes the data cluster along its longest axis. The second principal component (eigenvector 2) analyzes the data cluster variations perpendicular to the first principal component. As stated in the diagram, each eigenvector is associated with an eigenvalue which shows how much variance there is in the data along that direction.

Figure 1. Two attribute data set illustrating the concept of PCA
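
To make the concept concrete, the following is a minimal sketch of the Figure 1 idea in Python (NumPy) on a randomly generated two-attribute data set; the values and the attribute pairing are illustrative assumptions, not data from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two correlated attributes at 1,000 sample points, standing in for a
# pair such as envelope and sweetness.
attr1 = rng.normal(0.0, 1.0, 1000)
attr2 = 0.8 * attr1 + rng.normal(0.0, 0.4, 1000)
X = np.column_stack([attr1, attr2])

# Center the data, then eigen-decompose the covariance matrix.
Xc = X - X.mean(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xc, rowvar=False))

# Sort descending: eigenvector 1 lies along the longest axis of the
# cluster; its eigenvalue measures the variance along that direction.
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

print("fraction of variance per eigenvector:", eigvals / eigvals.sum())
```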

The next step in PCA analysis is to review the eigen spectrum to select the most prominent attributes in a data set. The following example is taken from a suite of instantaneous attributes over the Niobrara formation within the Denver-Julesburg Basin. Results for eigenvector 1 are shown, with three attributes (sweetness, envelope, and relative acoustic impedance) being the most prominent.

Figure 2. Results from PCA for first eigenvector in a seismic attribute data set

Utilizing a cutoff of 60% in this example, attributes were selected from PCA for input to the neural network classification. For the Niobrara, eight instantaneous attributes from four of the first six eigenvectors were chosen and are shown in Table 1. The PCA allowed identification of the most significant attributes from an initial group of 19 attributes.

Table 1: Results from PCA for Niobrara Interval shows which instantaneous attributes will be used in a Self-Organizing Map (SOM).
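
A hedged sketch of the eigen-spectrum review and cutoff step follows. The attribute names and the synthetic samples are illustrative stand-ins (the actual study started from 19 attributes), and the 60% cutoff here is applied to the loadings on the first eigenvector.

```python
import numpy as np

names = ["sweetness", "envelope", "rel_acoustic_impedance",
         "inst_frequency", "inst_phase", "thin_bed_indicator"]
rng = np.random.default_rng(1)
X = rng.normal(size=(5000, len(names)))       # stand-in attribute samples

# Standardize each attribute so no single attribute dominates.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(Xs, rowvar=False))
v1 = np.abs(eigvecs[:, np.argmax(eigvals)])   # loadings on eigenvector 1

# Keep attributes whose loading reaches 60% of the largest loading.
cutoff = 0.60 * v1.max()
selected = [n for n, w in zip(names, v1) if w >= cutoff]
print("attributes passing the 60% cutoff:", selected)
```

In practice this review is repeated for each prominent eigenvector, and the union of selected attributes becomes the SOM input list, as in Table 1.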

Self-Organizing Maps

Teuvo Kohonen, a Finnish mathematician, invented the concept of Self-Organizing Maps (SOM) in 1982 (Kohonen, 2001). Self-Organizing Maps employ unsupervised neural networks to reduce very high-dimensional data to a classification volume that can be easily visualized (Roden et al., 2015). Another important aspect of SOMs is that every seismic sample is used as input to the classification, as opposed to wavelet-based classification.

Figure 3 diagrams the SOM concept for 10 attributes derived from a 3D seismic amplitude volume. Within the 3D seismic survey, samples are first organized into attribute points with similar properties called natural clusters in attribute space. Within each cluster, new, empty, multi-attribute samples, named neurons, are introduced. The SOM neurons seek out natural clusters of like characteristics in the seismic data and produce a 2D mesh that can be illustrated with a two-dimensional color map. In other words, the neurons “learn” the characteristics of a data cluster through an iterative process (epochs) of competitive, then cooperative, training. When the learning is completed, each unique cluster is assigned to a neuron number and each seismic sample is classified (Smith, 2016).

Figure 3. Illustration of the concept of a Self-Organizing Map
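
For readers who want to see the mechanics, here is a minimal, self-contained SOM training loop in NumPy. It is a sketch of the competitive (winner selection) and cooperative (neighborhood update) learning described above, not the Paradise implementation; the grid size, decay schedules, and epoch count are illustrative assumptions. An 8x8 grid gives the 64 neurons used in the synthetic example below.

```python
import numpy as np

def train_som(X, rows=8, cols=8, epochs=20, lr0=0.5, sigma0=3.0, seed=0):
    """Train a SOM on X, an (n_samples, n_attributes) array."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    weights = rng.normal(size=(rows * cols, d))   # neurons in attribute space
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)])
    for epoch in range(epochs):
        lr = lr0 * np.exp(-epoch / epochs)        # decaying learning rate
        sigma = sigma0 * np.exp(-epoch / epochs)  # shrinking neighborhood
        for x in X[rng.permutation(n)]:
            # Competitive step: find the winning (closest) neuron.
            winner = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Cooperative step: neighbors of the winner move toward x too.
            dist2 = np.sum((grid - grid[winner]) ** 2, axis=1)
            h = np.exp(-dist2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
    return weights

def classify(X, weights):
    # Each sample is assigned the number of its winning neuron.
    return np.argmin(np.linalg.norm(X[:, None, :] - weights[None], axis=2),
                     axis=1)
```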

Figures 4 and 5 show a simple example using two attributes, amplitude and Hilbert transform, on a synthetic example. Synthetic reflection coefficients are convolved with a simple wavelet, 100 traces are created, and noise is added. When the attributes are cross plotted, clusters of points can be seen. The colored cross plot shows the attributes after SOM classification into 64 neurons with random colors assigned. In Figure 5, the individual clusters are identified and mapped back to the events on the synthetic. The SOM has correctly distinguished each event in the synthetic.

Figure 4. Two attribute synthetic example of a Self-Organizing Map. The amplitude and Hilbert transform are cross plotted. The colored cross plot shows the attributes after classification into 64 neurons by SOM.

Figure 5. Synthetic SOM example with neurons identified by number and mapped back to the original synthetic data
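
A sketch of how such a two-attribute synthetic can be built follows, assuming SciPy for the Hilbert transform; the reflector positions, wavelet frequency, and noise level are arbitrary choices, not those used for Figures 4 and 5.

```python
import numpy as np
from scipy.signal import hilbert

def ricker(f, t):
    # Zero-phase Ricker wavelet of peak frequency f (Hz).
    return (1.0 - 2.0 * (np.pi * f * t) ** 2) * np.exp(-((np.pi * f * t) ** 2))

rng = np.random.default_rng(0)
n_samples, n_traces, dt = 256, 100, 0.002

# Sparse reflection coefficients convolved with a simple wavelet.
rc = np.zeros(n_samples)
rc[[40, 90, 150, 200]] = [0.8, -0.6, 0.5, -0.4]
w = ricker(30.0, np.arange(-0.064, 0.064, dt))
traces = np.array([np.convolve(rc, w, mode="same")
                   + rng.normal(0, 0.02, n_samples)
                   for _ in range(n_traces)])

# Pair amplitude with its Hilbert transform, sample by sample: each
# pair is one point in the cross plot that the SOM then classifies.
amplitude = traces.ravel()
quadrature = np.imag(hilbert(traces, axis=1)).ravel()
X = np.column_stack([amplitude, quadrature])
print(X.shape)
```

Feeding X to the train_som/classify sketch above with an 8x8 grid reproduces the flavor of the 64-neuron classification in Figure 4.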

Results for Niobrara and Eagle Ford

In 2018, Geophysical Insights conducted a proof of concept on 100 square miles of multi-client 3D data jointly owned by Geophysical Pursuit, Inc. (GPI) and Fairfield Geotechnologies (FFG) in the Denver-Julesburg (DJ) Basin. The purpose of the study was to evaluate the effectiveness of a machine learning workflow in improving resolution within the reservoir intervals of the Niobrara and Codell formations, the primary targets for development in this portion of the basin. An amplitude volume was resampled from 2 ms to 1 ms and, along with horizons, loaded into the Paradise® machine learning application, where attributes were generated. PCA was used to identify which attributes were most significant in the data, and these were used in a SOM to evaluate the interval from Top Niobrara to Greenhorn (Laudon et al., 2019).
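
As an aside, the 2 ms to 1 ms resampling step can be sketched with SciPy's band-limited resampler; this stands in for, but is not, the resampling used in the actual workflow.

```python
import numpy as np
from scipy.signal import resample

trace_2ms = np.random.default_rng(0).normal(size=1001)   # 2 s at a 2 ms interval
trace_1ms = resample(trace_2ms, 2 * len(trace_2ms) - 1)  # same 2 s at 1 ms
print(len(trace_2ms), "->", len(trace_1ms))
```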

Figure 6 shows the results of an 8X8 SOM classification of eight instantaneous attributes over the Niobrara interval, along with the original amplitude data. Figure 7 shows the same result with a well log composite focused on the B chalk, the best section of the reservoir, which is difficult to resolve with individual seismic attributes. The SOM classification has resolved the chalk bench as well as other stratigraphic features within the interval.

Figure 6. North-South Inline showing the original amplitude data (upper) and the 8X8 SOM result (lower) from Top Niobrara through Greenhorn horizons. Seismic data is shown courtesy of GPI and FFG.

Figure 7. 8X8 Instantaneous SOM through Rotharmel 11-33 with well log composite. The B bench, highlighted in green on the wellbore, ties the yellow-red-yellow sequence of neurons. Seismic data is shown courtesy of GPI and FFG.

Figure 8. 8X8 SOM results through the Eagle Ford. The primary target, the Lower Eagle Ford shale, had 16 neuron classes over 14-29 milliseconds of data. Seismic data shown courtesy of Seitel.

The results shown in Figure 9 reveal non-layer-cake facies bands that include details in the Eagle Ford’s basal clay-rich shale, high-resistivity and low-resistivity Eagle Ford shale objectives, the Eagle Ford ash, and the upper Eagle Ford marl, which are overlain disconformably by the Austin Chalk.

Figure 9. Eagle Ford SOM classification shown with well results. The SOM resolves a high-resistivity interval, overlain by a thin ash layer and finally a low-resistivity layer. The SOM also resolves complex 3-dimensional relationships between these facies.

Convolutional Neural Networks (CNN)

A promising development in machine learning is supervised classification via the application of convolutional neural networks (CNNs). Supervised methods have, in the past, not been efficient due to the laborious task of training the neural network. CNN is a deep learning approach to seismic classification. We apply CNN to fault detection on seismic data. The examples that follow show CNN fault detection results which did not require any interpreter-picked faults for training; rather, the network was trained using synthetic data. Two results are shown, one from the North Sea (Figure 10) and one from the Great South Basin, New Zealand (Figure 11).
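
As an illustration only, a fault detection CNN of this general kind can be sketched in a few lines of PyTorch. The toy architecture below maps an amplitude patch to a per-pixel fault probability; it is an assumption for illustration, not the network used to produce Figures 10 and 11.

```python
import torch
import torch.nn as nn

class FaultNet(nn.Module):
    """Toy CNN mapping a 2D amplitude patch to per-pixel fault probability."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),               # per-pixel fault logit
        )

    def forward(self, x):                      # x: (batch, 1, H, W)
        return torch.sigmoid(self.net(x))

# Training would pair synthetic faulted patches with their known fault
# masks (binary cross-entropy loss); no interpreter picks are needed.
net = FaultNet()
fault_prob = net(torch.randn(1, 1, 128, 128))  # values in [0, 1]
print(fault_prob.shape)
```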

Figure 10. Side by side comparison of coherence attribute to CNN fault probability attribute, North Sea

Figure 11. Comparison of coherence attribute to CNN fault probability attribute, Great South Basin, New Zealand

Conclusions

Advances in compute power and algorithms are making machine learning available on the desktop, where seismic interpreters can use it to augment their interpretation workflows. Taking advantage of today’s computing technology, visualization techniques, and an understanding of machine learning as applied to seismic data, PCA combined with SOMs efficiently distills multiple seismic attributes into classification volumes. When applied on a multi-attribute seismic sample basis, SOM is a powerful nonlinear cluster analysis and pattern recognition machine learning approach that helps interpreters identify geologic patterns in the data; it has been able to reveal stratigraphy well below conventional tuning thickness.

In the fault interpretation domain, recent development of a Convolutional Neural Network that works directly on amplitude data shows promise to efficiently create fault probability volumes without the requirement of a labor-intensive training effort.

References

Coleou, T., M. Poupon, and A. Kostia, 2003, Unsupervised seismic facies classification: A review and comparison of techniques and implementation: The Leading Edge, 22, 942–953, doi: 10.1190/1.1623635.

Guo, H., K. J. Marfurt, and J. Liu, 2009, Principal component spectral analysis: Geophysics, 74, no. 4, 35–43.

Haykin, S., 2009, Neural networks and learning machines, 3rd ed.: Pearson.

Kohonen, T., 2001, Self-organizing maps, 3rd extended ed.: Springer, Springer Series in Information Sciences, Vol. 30.

Laudon, C., S. Stanley, and P. Santogrossi, 2019, Machine learning applied to 3D seismic data from the Denver-Julesburg Basin improves stratigraphic resolution in the Niobrara: URTeC 337, in press.

Roden, R., and P. Santogrossi, 2017, Significant advancements in seismic reservoir characterization with machine learning: The First, v. 3, p. 14–19.

Roden, R., T. Smith, and D. Sacrey, 2015, Geologic pattern recognition from seismic attributes: Principal component analysis and self-organizing maps: Interpretation, 3, no. 4, SAE59–SAE83.

Santogrossi, P., 2017, Classification/corroboration of facies architecture in the Eagle Ford Group: A case study in thin bed resolution: URTeC 2696775, doi: 10.15530/urtec-2017-2696775.

Applications of Convolutional Neural Networks (CNN) to Seismic Interpretation

As part of our quarterly series on machine learning, we were delighted to have had Dr. Tao Zhao present applications of Convolutional Neural Networks (CNN) in a worldwide webinar on 20 March 2019 that was attended by participants on every continent. Dr. Zhao highlighted applications in seismic facies classification, fault detection, and extracting large-scale channels using CNN technology. If you missed the webinar, no problem! A video of the webinar can be streamed via the video player below. The abstract for Dr. Zhao’s talk follows:

We welcome your comments and questions and look forward to discussions on this timely topic.

Abstract:  Leveraging Deep Learning in Extracting Features of Interest from Seismic Data

Mapping and extracting features of interest is one of the most important objectives in seismic data interpretation. Due to the complexity of seismic data, geologic features identified by interpreters on seismic data using visualization techniques are often challenging to extract. With the rapid development in GPU computing power and the success obtained in computer vision, deep learning techniques, represented by convolutional neural networks (CNN), have begun to entice seismic interpreters in various applications. The main advantages of CNN over other supervised machine learning methods are its spatial awareness and automatic attribute extraction. The high flexibility in CNN architecture enables researchers to design different CNN models to identify different features of interest. In this webinar, using several seismic surveys acquired from different regions, I will discuss three CNN applications in seismic interpretation: seismic facies classification, fault detection, and channel extraction. Seismic facies classification aims at classifying seismic data into several user-defined, distinct facies of interest. Conventional machine learning methods often produce a highly fragmented facies classification result, which requires a considerable amount of post-editing before it can be used as geobodies. In the first application, I will demonstrate that a properly built CNN model can generate seismic facies with higher purity and continuity. In the second application, I deploy a CNN model built for fault detection which, compared with traditional seismic attributes, provides smooth fault images and robust suppression of noise. The third application demonstrates the effectiveness of extracting large-scale channels using CNN. These examples demonstrate that CNN models are capable of capturing the complex reflection patterns in seismic data, providing clean images of geologic features of interest, while also carrying a low computational cost.

Geophysical Insights Announces Call for Abstracts – University Challenge

Geophysical Insights – University Challenge Topics

Call for Abstracts

The following “Challenge Topics” are offered to universities that are part of the Paradise University Program. Those universities are encouraged to consider pursuing one or more of the topics below in their research work with Paradise® and related interpretation technologies. Students interested in researching and publishing on one or more of these topics are welcome to submit an abstract to Geophysical Insights, including an explanation of their interest in the topic. The management of Geophysical Insights will select the best abstract per Challenge Topic and provide a grant of $1,000 to each student upon the completion of the research work. Students who undertake the research may count on additional forms of support from Geophysical Insights, including:

    • Potential job interview after graduation
    • Special recognition at the Geophysical Insights booth at a future SEG
    • Occasional collaboration via web meeting, email, or phone with a senior geoscientist
    • Inclusion in invitations to webinars hosted by Geophysical Insights on geoscience topics

Challenge Research Topics

Develop a geophysical basis for the identification of thin beds below classic seismic tuning

The research on this topic will investigate applications of new levels of seismic resolution afforded by multi-attribute Self-Organizing Maps (SOM), the unsupervised machine learning process in the Paradise software. The mathematical basis of detecting events below classical seismic tuning through simultaneous multi-attribute analysis – using machine learning – has been reported by Smith (2017) in an abstract submitted to SEG 2018. (Subsequently, the abstract has been placed online as a white paper resource.) Examples of thin-bed resolution have been documented in a Frio onshore Texas reservoir and in the Texas Eagle Ford Shale by Roden et al. (2017). Therefore, the researcher is challenged to develop a better understanding of the physical basis for the resolution of events below seismic tuning vs. results from wavelet-based methods. Additional empirical results of the detection of thin beds are also welcomed. This approach has wide potential for both exploration and development in the interpretation of facies and stratigraphy and impact on reserve/resource calculations. For unconventional plays, thin-bed delineation will have a significant influence on directional drilling programs.
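
As a starting point for this topic, the classical wedge model that defines tuning can be sketched as follows; the wavelet frequency and thickness range are arbitrary choices. Below the tuning thickness, the composite peak amplitude, rather than the event separation, carries the thickness information, which is the wavelet-based limitation the research would probe.

```python
import numpy as np

def ricker(f, t):
    # Zero-phase Ricker wavelet of peak frequency f (Hz).
    return (1.0 - 2.0 * (np.pi * f * t) ** 2) * np.exp(-((np.pi * f * t) ** 2))

dt, f = 0.001, 30.0                          # 1 ms sampling, 30 Hz wavelet
w = ricker(f, np.arange(-0.064, 0.064, dt))

# A thinning bed: equal and opposite reflection coefficients.
for thickness_samples in (2, 4, 8, 16, 32):  # bed thickness in 1 ms samples
    rc = np.zeros(256)
    rc[100], rc[100 + thickness_samples] = 1.0, -1.0
    trace = np.convolve(rc, w, mode="same")
    print(f"{thickness_samples:2d} ms bed -> peak amplitude {trace.max():.3f}")
```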

Determine the effectiveness of ‘machine learning’ determined geobodies in estimating reserves/resources and reservoir properties

The Paradise software has the capability of isolating and quantifying geobodies that result from a SOM machine learning process. Initial studies conducted with the technology suggest that the estimated reservoir volume is approximately what is realized through the life of the field. This Challenge is to apply the geobody tool in Paradise, along with other reservoir modeling techniques and field data, to determine the effectiveness of geobodies in estimating reserves. If this proves to be correct, the estimation of reserves from geobodies could be done early in the lifecycle of the field, saving engineering time while reducing risk.
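
A minimal sketch of the underlying volumetrics idea follows; the bin size, sample rate, interval velocity, and neuron numbers are all illustrative assumptions, and net-to-gross, porosity, and saturation terms are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
classified = rng.integers(0, 64, size=(200, 200, 150))  # stand-in SOM class volume
geobody_classes = [41, 42]                              # neurons forming one geobody

# Count geobody voxels and convert to gross rock volume.
voxels = np.isin(classified, geobody_classes).sum()
bin_area_m2 = 25.0 * 25.0                  # 25 m x 25 m bins
dz_m = 3000.0 * (0.001 / 2.0)              # 1 ms two-way time at 3000 m/s
print(f"gross rock volume: {voxels * bin_area_m2 * dz_m:.3e} m^3")
```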

Corroborate SOM classification results to well logs or lithofacies

A challenge to cluster-based classification techniques is corroborating well log curves to lithofacies. Up to this point, such corroboration has been an iterative process of running different neural configurations and visually comparing each classification result to “ground truth”. Some geoscientists (results yet to be published) have used bivariate statistical analysis from petrophysical well logs in combination with the SOM classification results to develop a representation of the static reservoir properties, including reservoir distribution and storage capacity. The challenge is to develop a methodology incorporating SOM seismic results with lithofacies determination from well logs.
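
One simple, hedged starting point is a contingency table of SOM class versus logged lithofacies at well locations, mapping each neuron to its dominant facies; the class numbers and facies names below are invented for illustration.

```python
import numpy as np
from collections import Counter

som_class = np.array([3, 3, 7, 7, 7, 12, 12, 3])     # SOM class along a wellbore
lithofacies = ["chalk", "chalk", "marl", "marl",
               "ash", "shale", "shale", "chalk"]     # from the well logs

mapping = {}
for c in np.unique(som_class):
    counts = Counter(f for s, f in zip(som_class, lithofacies) if s == c)
    mapping[int(c)] = counts.most_common(1)[0][0]    # dominant facies per neuron

print(mapping)   # {3: 'chalk', 7: 'marl', 12: 'shale'}
```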

Explore the significance of SOM low-probability anomalies (DHIs, anomalous features, etc.)

In addition to a standard classification volume resulting from a SOM analysis, Paradise also produces a “Probability” volume that is composed of a probability value at each voxel for a given neural class (neuron). This measure gauges the consistency of a feature with the surrounding region. Direct Hydrocarbon Indicators (DHIs) tend to be identified in the Paradise software as “low probability” or “anomalous” events because their properties are often inconsistent with the region. These SOM low-probability features have been documented by Roden et al. (2015) and Roden and Chen (2017). However, the Probability volume changes with the size of the region analyzed, and with it the expression of DHIs and anomalous features. This Challenge is to determine the effectiveness of using the probability measure from a SOM result as a valid gauge of DHIs and to set out the relationships among the optimum neural configuration, the size of the region, and the extent of the DHIs.
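
The flavor of such a probability measure can be sketched as follows: compute each sample's distance to its winning neuron and convert it to a pseudo-probability, so that poorly fitting (anomalous) samples score low. This mirrors, but does not reproduce, the Paradise Probability volume.

```python
import numpy as np

def som_fit_probability(X, weights):
    # Distance from each multi-attribute sample to every neuron;
    # weights is the (neurons x attributes) array from a trained SOM,
    # such as the train_som sketch earlier in this document.
    d = np.linalg.norm(X[:, None, :] - weights[None], axis=2)
    dmin = d.min(axis=1)
    # Map distance to (0, 1]: samples far from all neurons (for example,
    # DHIs inconsistent with the surrounding region) receive low values.
    return np.exp(-dmin / dmin.mean())

# Usage sketch:
#   weights = train_som(X)
#   anomalous = som_fit_probability(X, weights) < 0.1
```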

Map detailed facies distribution from SOM results

SOM results have proven to provide detailed information in the delineation and distribution of facies in essentially any geologic setting (Roden et al., 2015; Roden and Santogrossi, 2017; Santogrossi, 2017). Due to the high-resolution output of appropriate SOM analysis, individual facies units can often be defined in much more detail than conventional interpretation approaches. Research topics should be related to determining facies distribution in different geological environments utilizing the SOM process, available well log curves, and regional knowledge of stratigraphy.

For more information on Paradise or the University Challenge Program, please contact:

Hal Green
Email: [email protected]
Mobile:  713.480.2260

Future of Seismic Interpretation with Machine Learning and Deep Learning

By: Iván Marroquín, Ph.D. – Senior Research Geophysicist

I am very excited to participate as a speaker in the workshop on Big Data and Machine Learning organized by the European Association of Geoscientists & Engineers. My presentation is about using machine learning and deep learning to advance the seismic interpretation process for the benefit of hydrocarbon exploration and production.

Companies in the oil and gas industry invest millions of dollars in an effort to improve their understanding of their reservoir characteristics and predict future reservoir behavior. An integral part of this effort consists of using traditional workflows for interpreting large volumes of seismic data. Geoscientists are required to manually define relationships between geological features and seismic patterns. As a result, the task of finding significant seismic responses to recognize reservoir characteristics can be overwhelming.

In this era of big data revolution, we are at the beginning of the next fundamental shift in seismic interpretation. Knowledge discovery, based on machine learning and deep learning, supports geoscientists in two ways. First, it interrogates volumes of seismic data without preconceptions. The objective is to automatically find key insights, hidden patterns, and correlations. Geoscientists then gain visibility into complex relationships between geologic features and seismic data. To illustrate this point, Figure 1a shows a thin-bed reservoir scenario from Texas (USA). In terms of seismic data, it is difficult to discern the presence of the seismic event associated with the producing zone at the well location. The use of machine learning to derive a seismic classification output (Figure 1b) brought forward much richer stratigraphic information. Closer examination using time slice views (Figure 1c) indicates that the reservoir is an offshore bar. Note how well oil production matches the extent of the reservoir body.

Figure 1. Seismic classification result using machine learning (result provided by Deborah Sacrey, senior geologist with Geophysical Insights).  

Another way knowledge discovery can help geoscientists is to automate elements of the seismic interpretation process. Because machine learning and deep learning can consume large amounts of seismic data rapidly, it becomes possible to constantly review, modify, and take appropriate actions at the right time. With these possibilities, geoscientists are free to focus on other, more valuable tasks. The following example demonstrates that a deep learning model can be trained on seismic data or derived attributes (e.g., seismic classification, instantaneous, geometric, etc.) to identify desired outcomes, such as fault locations. In this case, a seismic classification volume (Figure 2a) was generated from seismic amplitude data (Taranaki Basin, west coast of New Zealand). Figure 2b shows the predicted faults displayed against the classification volume. To corroborate the quality of the prediction, the faults are also displayed against the seismic amplitude data (Figure 2c). It is important to note that the seismic classification volume provides an additional benefit to the process of seismic interpretation: it has the potential to expose stratigraphic information not readily apparent in seismic amplitude data.

Figure 2. Fault location predictions using deep learning (result provided by Dr. Tao Zhao, research geophysicist with Geophysical Insights).

Machine Learning Essentials for Seismic Interpretation: an e-Course by Dr. Tom Smith

Machine learning is foundational to the digital transformation of the oil & gas industry and will have a dramatic impact on the exploration and production of hydrocarbons.  Dr. Tom Smith, the founder and CEO of Geophysical Insights, conducts a comprehensive survey of machine learning technology and its applications in this 24-part series.  The course will benefit geoscientists, engineers, and data analysts at all experience levels, from data analysts who want to better understand applications of machine learning to geoscience, to senior geophysicists with deep experience in the field.

Aspects of supervised learning, unsupervised learning, classification, and reclassification are introduced to illustrate how they work on seismic data. Machine learning is presented, not as an end-all-be-all, but as a new set of tools which enables interpretation of seismic data at a new, higher level of abstraction that promises to reduce risks and identify features that might otherwise be missed.

The following major topics are covered:

  • Operation – supervised and unsupervised learning; buzzwords; examples
  • Foundation – seismic processing for ML; attribute selection list objectives; principal component analysis
  • Practice – geobodies; below-tuning; fluid contacts; making predictions
  • Prediction – the best well; the best seismic processing; over-fitting; cross-validation; who makes the best predictions?

This course can be taken for certification, or for informational purposes only (without certification). 

Enroll today for this valuable e-course from Geophysical Insights!