Applications of Machine Learning for Geoscientists – Permian Basin

By Carrie Laudon
Published with permission: Permian Basin Geophysical Society 60th Annual Exploration Meeting
May 2019

Abstract

Over the last few years, because of the increase in low-cost computer power, individuals and companies have stepped up investigations into the use of machine learning in many areas of E&P. For the geosciences, the emphasis has been in reservoir characterization, seismic data processing, and to a lesser extent interpretation. The benefits of using machine learning (whether supervised or unsupervised) have been demonstrated throughout the literature, and yet the technology is still not a standard workflow for most seismic interpreters. This lack of uptake can be attributed to several factors, including a lack of software tools, clear and well-defined case histories and training. Fortunately, all these factors are being mitigated as the technology matures. Rather than looking at machine learning as an adjunct to the traditional interpretation methodology, machine learning techniques should be considered the first step in the interpretation workflow.

By using statistical tools such as Principal Component Analysis (PCA) and Self-Organizing Maps (SOM), a multi-attribute 3D seismic volume can be “classified”. The PCA reduces a large set of seismic attributes, both instantaneous and geometric, to those that are the most meaningful. The output of the PCA serves as the input to the SOM, a form of unsupervised neural network, which, when combined with a 2D color map, facilitates the identification of clustering within the data volume. When the correct “recipe” is selected, the clustered or classified volume allows the interpreter to view and separate geological and geophysical features that are not observable in traditional seismic amplitude volumes. Seismic facies, detailed stratigraphy, direct hydrocarbon indicators, faulting trends, and thin beds are all features that can be enhanced by using a classified volume.

The tuning-bed thickness or vertical resolution of seismic data traditionally is based on the frequency content of the data and the associated wavelet. Seismic interpretation of thin beds routinely involves estimation of tuning thickness and the subsequent scaling of amplitude or inversion information below tuning. These traditional below-tuning-thickness estimation approaches have limitations and require assumptions that limit accuracy. The below-tuning effects are a result of the interference of wavelets, which are a function of the geology as it changes vertically and laterally. Numerous instantaneous attributes exhibit effects at and below tuning, but these are seldom incorporated in thin-bed analyses. A seismic multi-attribute approach employs self-organizing maps to identify natural clusters from combinations of attributes that exhibit below-tuning effects. These results may exhibit changes as thin as a single sample interval in thickness. Self-organizing maps employed in this fashion analyze associated seismic attributes on a sample-by-sample basis and identify the natural patterns or clusters produced by thin beds. Examples of this approach to improve stratigraphic resolution in both the Eagle Ford play and the Niobrara reservoir of the Denver-Julesburg Basin will be used to illustrate the workflow.

Introduction

Seismic multi-attribute analysis has always held the promise of improving interpretations via the integration of attributes which respond to subsurface conditions such as stratigraphy, lithology, faulting, fracturing, fluids, pressure, etc. The benefits of using machine learning (whether supervised or unsupervised) have been demonstrated throughout the literature, and yet the technology is still not a standard workflow for most seismic interpreters. This lack of uptake can be attributed to several factors, including a lack of software tools, clear and well-defined case histories, and training. This paper focuses on an unsupervised machine learning workflow utilizing Self-Organizing Maps (Kohonen, 2001) in combination with Principal Component Analysis to produce classified seismic volumes from multiple instantaneous attribute volumes. The workflow addresses several significant issues in seismic interpretation: it analyzes large amounts of data simultaneously; it determines relationships between different types of data; it is sample based and produces high-resolution results; and it reveals geologic features that are difficult to see in conventional approaches.

Principal Component Analysis (PCA)

Multi-dimensional analysis and multi-attribute analysis go hand in hand. Because individuals are grounded in three-dimensional space, it is difficult to visualize what data in a higher-dimensional space looks like. Fortunately, mathematics doesn’t have this limitation, and the results can be easily understood with conventional 2D and 3D viewers.

Working with multiple instantaneous or geometric seismic attributes generates tremendous volumes of data. These volumes contain huge numbers of data points which may be highly continuous, greatly redundant, and/or noisy (Coleou et al., 2003). Principal Component Analysis (PCA) is a linear technique for data reduction which maintains the variation associated with the larger data sets (Guo and others, 2009; Haykin, 2009; Roden and others, 2015). PCA can separate attribute types by frequency, distribution, and even character. PCA technology is used to determine which attributes may be ignored due to their very low impact on neural network solutions and which attributes are most prominent in the data. Figure 1 illustrates the analysis of a data cluster in two directions, offset by 90 degrees. The first principal component (eigenvector 1) analyses the data cluster along the longest axis. The second principal component (eigenvector 2) analyses the data cluster variations perpendicular to the first principal component. As stated in the diagram, each eigenvector is associated with an eigenvalue which shows how much variance there is in the data along that direction.

Figure 1. Two attribute data set illustrating the concept of PCA
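The two-attribute concept in Figure 1 can be reproduced numerically. The following is a minimal sketch, assuming NumPy and a hypothetical pair of correlated attributes (`attribute_a`, `attribute_b`); it is not the implementation used in the study, only an illustration of how eigenvectors and eigenvalues of the attribute covariance matrix yield the principal components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-attribute data set: attribute_b is correlated with attribute_a.
n_samples = 1000
attribute_a = rng.normal(size=n_samples)
attribute_b = 0.8 * attribute_a + 0.3 * rng.normal(size=n_samples)
data = np.column_stack([attribute_a, attribute_b])

# Standardize each attribute so that differing units do not dominate the analysis.
standardized = (data - data.mean(axis=0)) / data.std(axis=0)

# Eigen-decomposition of the attribute covariance matrix.
covariance = np.cov(standardized, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(covariance)

# eigh returns eigenvalues in ascending order; put the largest first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# Fraction of the total variance carried by each principal component.
explained_variance = eigenvalues / eigenvalues.sum()

# Coordinates of every sample along the principal components.
scores = standardized @ eigenvectors
```

The first column of `eigenvectors` points along the longest axis of the data cloud, and the second is perpendicular to it; `explained_variance` indicates how much of the variation each direction carries.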

The next step in PCA analysis is to review the eigen spectrum to select the most prominent attributes in a data set. The following example is taken from a suite of instantaneous attributes over the Niobrara formation within the Denver-Julesburg Basin. Results for eigenvector 1 are shown, with three attributes (sweetness, envelope, and relative acoustic impedance) being the most prominent.

Figure 2. Results from PCA for first eigenvector in a seismic attribute data set

Utilizing a cutoff of 60% in this example, attributes were selected from PCA for input to the neural network classification. For the Niobrara, eight instantaneous attributes from four of the first six eigenvectors were chosen and are shown in Table 1. The PCA allowed identification of the most significant attributes from an initial group of 19 attributes.

Table 1: Results from PCA for Niobrara Interval shows which instantaneous attributes will be used in a Self-Organizing Map (SOM).
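The text does not spell out how the 60% cutoff is applied. One plausible reading, shown here purely as an assumption, is to keep the attributes whose absolute contribution to an eigenvector is at least 60% of the largest contribution in that eigenvector. The function name `select_attributes` and the loading values below are hypothetical and are not taken from the study.

```python
import numpy as np

def select_attributes(eigenvector, names, cutoff=0.60):
    """Keep names whose |loading| is at least `cutoff` times the largest |loading|."""
    loadings = np.abs(np.asarray(eigenvector, dtype=float))
    relative = loadings / loadings.max()
    return [name for name, value in zip(names, relative) if value >= cutoff]

# Illustrative loadings for one eigenvector (made-up numbers, not study results).
names = [
    "sweetness",
    "envelope",
    "relative acoustic impedance",
    "instantaneous phase",
    "instantaneous frequency",
]
eigenvector_1 = [0.55, 0.52, 0.48, 0.10, -0.05]

selected = select_attributes(eigenvector_1, names)
```

Repeating the selection over several eigenvectors and taking the union of the results would give a reduced attribute list of the kind shown in Table 1.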

Self-Organizing Maps

Teuvo Kohonen, a Finnish mathematician, invented the concept of Self-Organizing Maps (SOM) in 1982 (Kohonen, 2001). Self-Organizing Maps use unsupervised neural networks to reduce very high dimensions of data to a classification volume that can be easily visualized (Roden and others, 2015). Another important aspect of SOMs is that every seismic sample is used as input to classification, as opposed to wavelet-based classification.

Figure 3 diagrams the SOM concept for 10 attributes derived from a 3D seismic amplitude volume. Within the 3D seismic survey, samples are first organized into attribute points with similar properties, called natural clusters, in attribute space. Within each cluster, new, empty multi-attribute samples, named neurons, are introduced. The SOM neurons will seek out natural clusters of like characteristics in the seismic data and produce a 2D mesh that can be illustrated with a two-dimensional color map. In other words, the neurons “learn” the characteristics of a data cluster through an iterative process (epochs) of cooperative then competitive training. When the learning is completed, each unique cluster is assigned to a neuron number and each seismic sample is now classified (Smith, 2016).

Figure 3. Illustration of the concept of a Self-Organizing Map
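The training loop described above can be sketched in a few lines. This is a minimal, generic SOM written with NumPy, assuming a Gaussian neighborhood and linearly decaying learning rate and radius; these choices, the function names `train_som` and `classify`, and the synthetic 10-attribute samples are illustrative assumptions and not the algorithm settings of any particular software.

```python
import numpy as np

def train_som(samples, grid_shape=(8, 8), epochs=10, learning_rate=0.5, seed=0):
    """Train a small SOM and return neuron weights of shape (n_neurons, n_attributes)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid_shape
    # Fixed neuron positions on the 2D map.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Start each neuron at a randomly chosen sample.
    weights = samples[rng.choice(len(samples), rows * cols, replace=True)].astype(float)
    initial_radius = max(rows, cols) / 2.0
    for epoch in range(epochs):
        fraction = epoch / epochs
        radius = initial_radius + (0.5 - initial_radius) * fraction
        rate = learning_rate * (1.0 - fraction)
        for sample in samples[rng.permutation(len(samples))]:
            # Competition: the neuron closest to the sample in attribute space wins.
            winner = np.argmin(np.sum((weights - sample) ** 2, axis=1))
            # Cooperation: neighbors of the winner on the 2D map are updated too.
            grid_distance_sq = np.sum((grid - grid[winner]) ** 2, axis=1)
            influence = np.exp(-grid_distance_sq / (2.0 * radius ** 2))
            weights += rate * influence[:, None] * (sample - weights)
    return weights

def classify(samples, weights):
    """Assign each sample the number of its nearest neuron."""
    distances = ((samples[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return distances.argmin(axis=1)

# Hypothetical samples: three natural clusters in a 10-attribute space.
rng = np.random.default_rng(1)
centers = rng.normal(scale=5.0, size=(3, 10))
samples = np.vstack([c + rng.normal(scale=0.5, size=(100, 10)) for c in centers])

weights = train_som(samples)
neuron_numbers = classify(samples, weights)
```

Each entry of `neuron_numbers` is an index into the 8×8 map, which is what a 2D color map would then display.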

Figures 4 and 5 show a simple example using two attributes, amplitude and Hilbert transform, on a synthetic example. Synthetic reflection coefficients are convolved with a simple wavelet, 100 traces are created, and noise is added. When the attributes are cross plotted, clusters of points can be seen in the cross plot. The colored cross plot shows the attributes after SOM classification into 64 neurons with random colors assigned. In Figure 5, the individual clusters are identified and mapped back to the events on the synthetic. The SOM has correctly distinguished each event in the synthetic.

Figure 4. Two attribute synthetic example of a Self-Organizing Map. The amplitude and Hilbert transform are cross plotted. The colored cross plot shows the attributes after classification into 64 neurons by SOM.

Figure 5. Synthetic SOM example with neurons identified by number and mapped back to the original synthetic data
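A synthetic of the kind used for Figures 4 and 5 can be built as follows. This is a sketch under stated assumptions: a Ricker wavelet stands in for the "simple wavelet", the reflector positions, amplitudes, and noise level are hypothetical, and the Hilbert transform is computed with a standard FFT-based analytic signal rather than any specific package.

```python
import numpy as np

def ricker(peak_frequency, dt, half_length=32):
    """Zero-phase Ricker wavelet sampled at dt, with 2 * half_length + 1 samples."""
    t = np.arange(-half_length, half_length + 1) * dt
    arg = (np.pi * peak_frequency * t) ** 2
    return (1.0 - 2.0 * arg) * np.exp(-arg)

def hilbert_transform(trace):
    """Quadrature trace: imaginary part of the FFT-based analytic signal."""
    n = len(trace)
    multiplier = np.zeros(n)
    multiplier[0] = 1.0
    if n % 2 == 0:
        multiplier[n // 2] = 1.0
        multiplier[1:n // 2] = 2.0
    else:
        multiplier[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(trace) * multiplier).imag

rng = np.random.default_rng(0)
n_samples, n_traces, dt = 256, 100, 0.002

# Hypothetical reflection coefficients: three isolated events.
reflectivity = np.zeros(n_samples)
reflectivity[[60, 120, 180]] = [0.8, -0.5, 0.6]

# Convolve with the wavelet, replicate to 100 traces, and add noise.
clean_trace = np.convolve(reflectivity, ricker(30.0, dt), mode="same")
traces = clean_trace[:, None] + 0.05 * rng.normal(size=(n_samples, n_traces))

# Second attribute: Hilbert transform of every trace.
quadrature = np.apply_along_axis(hilbert_transform, 0, traces)

# One (amplitude, Hilbert transform) point per seismic sample, ready to cross plot.
attribute_points = np.column_stack([traces.ravel(), quadrature.ravel()])
```

Cross plotting the two columns of `attribute_points` gives the kind of point cloud on which a SOM can then be trained.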

Results for Niobrara and Eagle Ford

In 2018, Geophysical Insights conducted a proof of concept on 100 square miles of multi-client 3D data jointly owned by Geophysical Pursuit, Inc. (GPI) and Fairfield Geotechnologies (FFG) in the Denver-Julesburg Basin (DJ). The purpose of the study was to evaluate the effectiveness of a machine learning workflow to improve resolution within the reservoir intervals of the Niobrara and Codell formations, the primary targets for development in this portion of the basin. An amplitude volume was resampled from 2 ms to 1 ms and, along with horizons, loaded into the Paradise® machine learning application, where attributes were generated. PCA was used to identify which attributes were most significant in the data, and these were used in a SOM to evaluate the interval Top Niobrara to Greenhorn (Laudon and others, 2019).

Figure 6 shows results of an 8X8 SOM classification of 8 instantaneous attributes over the Niobrara interval along with the original amplitude data. Figure 7 shows the same results with a well composite focused on the B chalk, the best section of the reservoir, which is difficult to resolve with individual seismic attributes. The SOM classification has resolved the chalk bench as well as other stratigraphic features within the interval.

Figure 6. North-South Inline showing the original amplitude data (upper) and the 8X8 SOM result (lower) from Top Niobrara through Greenhorn horizons. Seismic data is shown courtesy of GPI and FFG.

Figure 7. 8X8 Instantaneous SOM through Rotharmel 11-33 with well log composite. The B bench, highlighted in green on the wellbore, ties the yellow-red-yellow sequence of neurons. Seismic data is shown courtesy of GPI and FFG.

Figure 8. 8X8 SOM results through the Eagle Ford. The primary target, the Lower Eagle Ford shale had 16 neuron classes over 14-29 milliseconds of data. Seismic data shown courtesy of Seitel.

The results shown in Figure 9 reveal non-layer cake facies bands that include details in the Eagle Ford's basal clay-rich shale, high resistivity and low resistivity Eagle Ford shale objectives, the Eagle Ford ash, and the upper Eagle Ford marl, which are overlain disconformably by the Austin Chalk.

Figure 9. Eagle Ford SOM classification shown with well results. The SOM resolves a high resistivity interval, overlain by a thin ash layer and finally a low resistivity layer. The SOM also resolves complex 3-dimensional relationships between these facies

Convolutional Neural Networks (CNN)

A promising development in machine learning is supervised classification via the application of convolutional neural networks (CNNs). Supervised methods have, in the past, not been efficient due to the laborious task of training the neural network. CNN is a deep learning technique for seismic classification. We apply CNN to fault detection on seismic data. The examples that follow show CNN fault detection results which did not require any interpreter-picked faults for training; rather, the network was trained using synthetic data. Two results are shown, one from the North Sea (Figure 10) and one from the Great South Basin, New Zealand (Figure 11).

Figure 10. Side by side comparison of coherence attribute to CNN fault probability attribute, North Sea

Figure 11. Comparison of Coherence to CNN fault probability attribute, New Zealand

Conclusions

Advances in compute power and algorithms are making the use of machine learning available on the desktop to seismic interpreters to augment their interpretation workflow. Taking advantage of today’s computing technology, visualization techniques, and an understanding of machine learning as applied to seismic data, PCA combined with SOMs efficiently distill multiple seismic attributes into classification volumes. When applied on a multi-attribute seismic sample basis, SOM is a powerful nonlinear cluster analysis and pattern recognition machine learning approach that helps interpreters identify geologic patterns in the data and has been able to reveal stratigraphy well below conventional tuning thickness.

In the fault interpretation domain, recent development of a Convolutional Neural Network that works directly on amplitude data shows promise to efficiently create fault probability volumes without the requirement of a labor-intensive training effort.

References

Coleou, T., M. Poupon, and A. Kostia, 2003, Unsupervised seismic facies classification: A review and comparison of techniques and implementation: The Leading Edge, 22, 942–953, doi: 10.1190/1.1623635.

Guo, H., K. J. Marfurt, and J. Liu, 2009, Principal component spectral analysis: Geophysics, 74, no. 4, 35–43.

Haykin, S., 2009, Neural networks and learning machines, 3rd ed.: Pearson.

Kohonen, T., 2001, Self-organizing maps, third extended edition: Springer Series in Information Sciences, Vol. 30.

Laudon, C., Stanley, S., and Santogrossi, P., 2019, Machine Learning Applied to 3D Seismic Data from the Denver-Julesburg Basin Improves Stratigraphic Resolution in the Niobrara, URTeC 337, in press.

Roden, R., and Santogrossi, P., 2017, Significant Advancements in Seismic Reservoir Characterization with Machine Learning, The First, v. 3, p. 14-19

Roden, R., Smith, T., and Sacrey, D., 2015, Geologic pattern recognition from seismic attributes: Principal component analysis and self-organizing maps, Interpretation, Vol. 3, No. 4, p. SAE59-SAE83.

Santogrossi, P., 2017, Classification/Corroboration of Facies Architecture in the Eagle Ford Group: A Case Study in Thin Bed Resolution, URTeC 2696775, doi: 10.15530/urtec-2017-2696775.

Seismic Facies Classification Using Deep Convolutional Neural Networks

By Tao Zhao
Published with permission: SEG International Exposition and 88th Annual Meeting
October 2018

Summary

Convolutional neural networks (CNNs) are a type of supervised learning technique that can be directly applied to amplitude data for seismic data classification. The high flexibility in CNN architecture enables researchers to design different models for specific problems. In this study, I introduce an encoder-decoder CNN model for seismic facies classification, which classifies all samples in a seismic line simultaneously and provides superior seismic facies quality compared to the traditional patch-based CNN methods. I compare the encoder-decoder model with a traditional patch-based model to assess the usability of both CNN architectures.

Introduction

With the rapid development in GPU computing and the success obtained in the computer vision domain, deep learning techniques, represented by convolutional neural networks (CNNs), have started to entice seismic interpreters in the application of supervised seismic facies classification. A comprehensive review of deep learning techniques is provided in LeCun et al. (2015). Although still in its infancy, CNN-based seismic classification has been successfully applied to both prestack (Araya-Polo et al., 2017) and poststack (Waldeland and Solberg, 2017; Huang et al., 2017; Lewis and Vigh, 2017) data for fault and salt interpretation, identifying different wave characteristics (Serfaty et al., 2017), as well as estimating velocity models (Araya-Polo et al., 2018).

The main advantages of CNN over other supervised classification methods are its spatial awareness and automatic feature extraction. For image classification problems, rather than using the intensity values at each pixel individually, CNN analyzes the patterns among pixels in an image and automatically generates features (in seismic data, attributes) suitable for classification. Because seismic data are 3D tomographic images, we would expect CNN to be naturally adaptable to seismic data classification. However, there are some distinct characteristics in seismic classification that make it more challenging than other image classification problems. Firstly, classical image classification aims at distinguishing different images, while seismic classification aims at distinguishing different geological objects within the same image. Therefore, from an image processing point of view, seismic classification is in fact a segmentation problem (partitioning an image into blocky pixel shapes with a coarser set of colors) rather than a classification problem. Secondly, training data for seismic classification are much sparser compared to classical image classification problems, for which massive data are publicly available. Thirdly, in seismic data, all features are represented by different patterns of reflectors, and the boundaries between different features are rarely explicitly defined. In contrast, features in an image from computer artwork or photography are usually well-defined. Finally, because of the uncertainty in seismic data and the nature of manual interpretation, the training data in seismic classification is always contaminated by noise.

To address the first challenge, most, if not all, studies on CNN-based seismic facies classification published to date perform classification on small patches of data to infer the class label of the seismic sample at the patch center. In this fashion, seismic facies classification is done by traversing through patches centered at every sample in a seismic volume. An alternative approach, although less discussed, is to use CNN models designed for image segmentation tasks (Long et al., 2015; Badrinarayanan et al., 2017; Chen et al., 2018) to obtain sample-level labels in a 2D profile (e.g. an inline) simultaneously, and then traverse through all 2D profiles in a volume.

In this study, I use an encoder-decoder CNN model as an implementation of the aforementioned second approach. I apply both the encoder-decoder model and patch-based model to seismic facies classification using data from the North Sea, with the objective of demonstrating the strengths and weaknesses of the two CNN models. I conclude that the encoder-decoder model provides much better classification quality, whereas the patch-based model is more flexible on training data, possibly making it easier to use in production.

The Two Convolutional Neural Networks (CNN) Models

Patch-based model

A basic patch-based model consists of several convolutional layers, pooling (downsampling) layers, and fully-connected layers. For an input image (for seismic data, amplitudes in a small 3D window), a CNN model first automatically extracts several high-level abstractions of the image (similar to seismic attributes) using the convolutional and pooling layers, then classifies the extracted attributes using the fully-connected layers, which are similar to traditional multilayer perceptron networks. The output from the network is a single value representing the facies label of the seismic sample at the center of the input patch. An example of patch-based model architecture is provided in Figure 1a. In this example, the network is employed to classify salt versus non-salt from seismic amplitude in the SEAM synthetic data (Fehler and Larner, 2008). One input instance is a small patch of data bounded by the red box, and the corresponding output is a class label for this whole patch, which is then assigned to the sample at the patch center. The sample marked as the red dot is classified as non-salt.

Figure 1. Sketches for CNN architecture of a) 2D patch-based model and b) encoder-decoder model. In the 2D patch-based model, each input data instance is a small 2D patch of seismic amplitude centered at the sample to be classified. The corresponding output is then a class label for the whole 2D patch (in this case, non-salt), which is usually assigned to the sample at the center. In the encoder-decoder model, each input data instance is a whole inline (or crossline/time slice) of seismic amplitude. The corresponding output is a whole line of class labels, so that each sample is assigned a label (in this case, some samples are salt and others are non-salt). Different types of layers are denoted in different colors, with layer types marked at their first appearance in the network. The size of the cuboids approximately represents the output size of each layer.
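The patch traversal that feeds a patch-based model can be illustrated without any network at all. The sketch below, assuming NumPy, a 2D section, an odd patch size, and reflective edge padding (all illustrative choices, with `extract_patches` a hypothetical helper), builds one patch per sample so that each patch is centered on the sample it will label.

```python
import numpy as np

def extract_patches(section, patch_size):
    """Return one patch per sample of a 2D section, centered on that sample.

    patch_size is assumed to be odd so that a unique center sample exists.
    """
    half = patch_size // 2
    # Pad the edges so that samples near the border also get full-size patches.
    padded = np.pad(section, half, mode="reflect")
    n_rows, n_cols = section.shape
    patches = np.empty((n_rows, n_cols, patch_size, patch_size), dtype=section.dtype)
    for row in range(n_rows):
        for col in range(n_cols):
            patches[row, col] = padded[row:row + patch_size, col:col + patch_size]
    return patches

# Hypothetical 2D section of seismic amplitude.
section = np.random.default_rng(0).normal(size=(20, 30))
patches = extract_patches(section, patch_size=5)
```

A classifier would then map each `patches[row, col]` to a single label for `section[row, col]`. Note that the patch array holds `patch_size ** 2` times as many values as the section itself, and `patch_size ** 3` times as many in the 3D case.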

Encoder-decoder model

Encoder-decoder is a popular network structure for tackling image segmentation tasks. Encoder-decoder models share a similar idea, which is to first extract high-level abstractions of input images using convolutional layers, then recover sample-level class labels by “deconvolution” operations. Chen et al. (2018) introduce a current state-of-the-art encoder-decoder model while concisely reviewing some popular predecessors. An example of encoder-decoder model architecture is provided in Figure 1b. Similar to the patch-based example, this encoder-decoder network is employed to classify salt versus non-salt from seismic amplitude in the SEAM synthetic data. Unlike the patch-based network, in the encoder-decoder network, one input instance is a whole line of seismic amplitude, and the corresponding output is a whole line of class labels, which has the same dimension as the input data. In this case, all samples in the middle of the line are classified as salt (marked in red), and other samples are classified as non-salt (marked in white), with minimum error.
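The way an encoder-decoder returns an output with the same dimension as its input can be illustrated with the downsampling and upsampling steps alone. The sketch below uses NumPy, 2×2 max pooling, and nearest-neighbor upsampling as stand-ins; it contains no trainable convolution or "deconvolution" layers and is not the network used in this study, only the shape bookkeeping. Even input dimensions are assumed.

```python
import numpy as np

def max_pool_2x2(image):
    """Halve each dimension by keeping the maximum of every 2x2 block."""
    rows, cols = image.shape
    return image.reshape(rows // 2, 2, cols // 2, 2).max(axis=(1, 3))

def upsample_2x2(image):
    """Double each dimension by repeating every value in a 2x2 block."""
    return image.repeat(2, axis=0).repeat(2, axis=1)

# Hypothetical seismic line: 64 time samples by 128 traces.
line = np.random.default_rng(0).normal(size=(64, 128))

# Encoder side: two pooling stages shrink the line to a coarse representation.
encoded = max_pool_2x2(max_pool_2x2(line))

# Decoder side: two upsampling stages restore the original line dimensions.
decoded = upsample_2x2(upsample_2x2(encoded))
```

In a real model, learned convolutions sit between these stages, and the final layer emits one class label per sample of `line`.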

Application of the Two CNN Models

For demonstration purposes, I use the F3 seismic survey acquired in the North Sea, offshore Netherlands, which is freely accessible by the geoscience research community. In this study, I am interested in automatically extracting seismic facies that have specific seismic amplitude patterns. To remove the potential disagreement on the geological meaning of the facies to extract, I name the facies purely based on their reflection characteristics. Table 1 provides a list of extracted facies. There are eight seismic facies with distinct amplitude patterns; another facies (“everything else”) is used for samples not belonging to the eight target facies.

Table 1. Extracted seismic facies.

Facies number    Facies name
1                Varies amplitude steeply dipping
2                Random
3                Low coherence
4                Low amplitude deformed
5                Low amplitude dipping
6                High amplitude deformed
7                Moderate amplitude continuous
8                Chaotic
0                Everything else

To generate training data for the seismic facies listed above, different picking scenarios are employed to compensate for the different input data formats required by the two CNN models (small 3D patches versus whole 2D lines). For the patch-based model, 3D patches of seismic amplitude data are extracted around seed points within some user-defined polygons. There are approximately 400,000 3D patches of size 65×65×65 generated for the patch-based model, which is a reasonable amount for seismic data of this size. Figure 2a shows an example line on which seed point locations are defined in the co-rendered polygons.

The encoder-decoder model requires much more effort for generating labeled data. I manually interpret the target facies on 40 inlines across the seismic survey and use these for building the network. Although the total number of seismic samples in 40 lines is enormous, the encoder-decoder model only considers them as 40 input instances, which in fact is a very small training set for a CNN. Figure 2b shows an interpreted line which is used in training the network.

In both tests, I randomly use 90% of the generated training data to train the network and use the remaining 10% for testing. On an Nvidia Quadro M5000 GPU with 8GB memory, the patch-based model takes about 30 minutes to converge, whereas the encoder-decoder model needs about 500 minutes. Besides the faster training, the patch-based model also has a higher test accuracy at almost 100% (99.9988%, to be exact) versus 94.1% from the encoder-decoder model. However, this accuracy measurement is sometimes a bit misleading. For a patch-based model, when picking the training and testing data, interpreters usually pick the most representative samples of each facies for which they have the most confidence, resulting in high quality training (and testing) data that are less noisy, and most of the ambiguous samples which are challenging for the classifier are excluded from testing. In contrast, to use an encoder-decoder model, interpreters have to interpret all the target facies in a training line. For example, if the target is faults, one needs to pick all faults in a training line, otherwise unlabeled faults will be considered as “non-fault” and confuse the classifier. Therefore, interpreters have to make some not-so-confident interpretations when generating training and testing data. Figures 2c and 2d show seismic facies predicted from the two CNN models on the same line shown in Figures 2a and 2b. We observe better defined facies from the encoder-decoder model compared to the patch-based model.
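The random 90%/10% split and the test accuracy quoted above can be sketched as follows. This is a generic illustration using NumPy; the helper names `split_train_test` and `accuracy` are hypothetical, and splitting the 40 interpreted lines per line is an assumption about how the split would apply to the encoder-decoder case.

```python
import numpy as np

def split_train_test(n_instances, test_fraction=0.10, seed=0):
    """Randomly split instance indices into training and testing sets."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_instances)
    n_test = int(round(n_instances * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predicted, actual):
    """Fraction of samples whose predicted label equals the interpreted label."""
    return float(np.mean(np.asarray(predicted) == np.asarray(actual)))

# With 40 interpreted lines, a 90/10 split leaves 36 for training and 4 for testing.
train_idx, test_idx = split_train_test(40)
```

Because accuracy is measured only against the labels that were picked, it reflects how ambiguous the picked samples are as much as how good the classifier is, which is the caveat raised in the text.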

Figure 3 shows prediction results from the two networks on a line away from the training lines, and Figure 4 shows prediction results from the two networks on a crossline. Similar to the prediction results on the training line, compared to the patch-based model, the encoder-decoder model provides facies as cleaner geobodies that require much less post-editing for regional stratigraphic classification (Figure 5). This can be attributed to the encoder-decoder model's ability to capture the large-scale spatial arrangement of facies, whereas the patch-based model only senses patterns in small 3D windows. To form such windows, the patch-based model also needs to pad or simply skip samples close to the edge of a 3D seismic volume. Moreover, although the training is much faster in a patch-based model, the prediction stage is very computationally intensive, because it processes a data size N×N×N times that of the original seismic volume (N is the patch size along each dimension). In this study, the patch-based method takes about 400 seconds to predict a line, compared to less than 1 second required by the encoder-decoder model.
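The N×N×N factor is easy to underestimate, so a short worked calculation may help. The sketch below only uses numbers stated in the text (patch size 65, about 400 seconds versus less than 1 second per line); the function name `patch_workload_factor` is hypothetical.

```python
def patch_workload_factor(patch_size):
    """Number of input samples read per output sample for a cubic 3D patch."""
    return patch_size ** 3

# Patch size used in this study.
factor = patch_workload_factor(65)

# Reported prediction times per line, in seconds.
patch_based_seconds = 400
encoder_decoder_seconds = 1  # stated as "less than 1 second", so this is an upper bound

# Lower bound on the speed-up of the encoder-decoder model at prediction time.
minimum_speedup = patch_based_seconds / encoder_decoder_seconds
```

With a patch size of 65, each output sample requires reading 274,625 input samples, and the reported timings imply that the encoder-decoder model predicts a line at least 400 times faster.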

Conclusion

In this study, I compared two types of CNN models in the application of seismic facies classification. The more commonly used patch-based model requires much less effort in generating labeled data, but the classification result is suboptimal compared to the encoder-decoder model, and the prediction stage can be very time consuming. The encoder-decoder model generates superior classification results at near real-time speed, at the expense of more tedious labeled data picking and longer training time.

Acknowledgements

The author thanks Geophysical Insights for the permission to publish this work. Thanks also go to dGB Earth Sciences for providing the F3 North Sea seismic data to the public, and to ConocoPhillips for sharing the MalenoV project for public use, which was referenced when generating the training data. The CNN models discussed in this study are implemented in TensorFlow, an open source library from Google.

Figure 2. Example of seismic amplitude co-rendered with training data picked on inline 340 used for a) patch-based model and b) encoder-decoder model. The prediction result from c) patch-based model, and d) from the encoder-decoder model. Target facies are colored in colder to warmer colors in the order shown in Table 1. Compare Facies 5, 6 and 8.

Figure 3. Prediction results from the two networks on a line away from the training lines. a) Predicted facies from the patch-based model. b) Predicted facies from the encoder-decoder based model. Target facies are colored in colder to warmer colors in the order shown in Table 1. The yellow dotted line marks the location of the crossline shown in Figure 4. Compare Facies 1, 5 and 8.

Figure 4. Prediction results from the two networks on a crossline. a) Predicted facies from the patch-based model. b) Predicted facies from the encoder-decoder model. Target facies are colored in colder to warmer colors in the order shown in Table 1. The yellow dotted lines mark the location of the inlines shown in Figures 2 and 3. Compare Facies 5 and 8.

Figure 5. Volumetric display of the predicted facies from the encoder-decoder model. The facies volume is visually cropped for display purpose. An inline and a crossline of seismic amplitude co-rendered with predicted facies are also displayed to show a broader distribution of the facies. Target facies are colored in colder to warmer colors in the order shown in Table 1.

References

Araya-Polo, M., T. Dahlke, C. Frogner, C. Zhang, T. Poggio, and D. Hohl, 2017, Automated fault detection without seismic processing: The Leading Edge, 36, 208–214.

Araya-Polo, M., J. Jennings, A. Adler, and T. Dahlke, 2018, Deep-learning tomography: The Leading Edge, 37, 58–66.

Badrinarayanan, V., A. Kendall, and R. Cipolla, 2017, SegNet: A deep convolutional encoder-decoder architecture for image segmentation: IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 2481–2495.

Chen, L. C., G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, 2018, DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs: IEEE Transactions on Pattern Analysis and Machine Intelligence, 40, 834–848.

Chen, L. C., Y. Zhu, G. Papandreou, F. Schroff, and H. Adam, 2018, Encoder-decoder with atrous separable convolution for semantic image segmentation: arXiv preprint, arXiv:1802.02611v2.

Fehler, M., and K. Larner, 2008, SEG advanced modeling (SEAM): Phase I first year update: The Leading Edge, 27, 1006–1007.

Huang, L., X. Dong, and T. E. Clee, 2017, A scalable deep learning platform for identifying geologic features from seismic attributes: The Leading Edge, 36, 249–256.

LeCun, Y., Y. Bengio, and G. Hinton, 2015, Deep learning: Nature, 521, 436–444.

Lewis, W., and D. Vigh, 2017, Deep learning prior models from seismic images for full-waveform inversion: 87th Annual International Meeting, SEG, Expanded Abstracts, 1512–1517.

Long, J., E. Shelhamer, and T. Darrell, 2015, Fully convolutional networks for semantic segmentation: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3431–3440.

Serfaty, Y., L. Itan, D. Chase, and Z. Koren, 2017, Wavefield separation via principle component analysis and deep learning in the local angle domain: 87th Annual International Meeting, SEG, Expanded Abstracts, 991–995.

Waldeland, A. U., and A. H. S. S. Solberg, 2017, Salt classification using deep learning: 79th Annual International Conference and Exhibition, EAGE, Extended Abstracts, Tu-B4-12.

A Fault Detection Workflow Using Deep Learning and Image Processing

By Tao Zhao
Published with permission: SEG International Exposition and 88th Annual Meeting
October 2018

Summary

Within the last couple of years, deep learning techniques, represented by convolutional neural networks (CNNs), have been applied to fault detection problems on seismic data with impressive results. As is true for all supervised learning techniques, the performance of a CNN fault detector depends heavily on the training data, and post-classification regularization may greatly improve the result. Sometimes, a pure CNN-based fault detector that works perfectly on synthetic data may not perform well on field data. In this study, we investigate a fault detection workflow using both CNN and directional smoothing/sharpening. Applying both to a realistic synthetic fault model based on the SEAM (SEG Advanced Modeling) model and to field data from the Great South Basin, offshore New Zealand, we demonstrate that the proposed fault detection workflow can perform well on challenging synthetic and field data.

Introduction

Benefiting from their high flexibility in network architecture, convolutional neural networks (CNNs) are a supervised learning technique that can be designed to solve many challenging problems in exploration geophysics. Among these problems, detection of particular seismic facies of interest might be the most straightforward application of CNNs. The first published study applying a CNN to seismic data might be Waldeland and Solberg (2017), in which the authors used a CNN model to classify salt versus non-salt features in a seismic volume. At about the same time, Araya-Polo et al. (2017) and Huang et al. (2017) reported success in fault detection using CNN models.

From a computer vision perspective, faults in seismic data are a special group of edges. CNNs have been applied to more general edge detection problems with great success (El-Sayed et al., 2013; Xie and Tu, 2015). However, faults in seismic data are fundamentally different from edges in the images used in the computer vision domain. The regions separated by edges in a traditional computer vision image are relatively homogeneous, whereas in seismic data such regions are defined by patterns of reflectors. Moreover, not all edges in seismic data are faults. In practice, although providing excellent fault images, traditional edge detection attributes such as coherence (Marfurt et al., 1999) are also sensitive to stratigraphic edges such as unconformities, channel banks, and karst collapses. Wu and Hale (2016) proposed a brilliant workflow for automatically extracting fault surfaces, in which a crucial step is computing the fault likelihood. CNN-based fault detection methods can be used as an alternative approach to generate such fault likelihood volumes, and the fault strike and dip can then be computed from the fault likelihood.

One drawback of supervised machine learning-based fault detection is its brute-force nature, meaning that instead of detecting faults following geological/geophysical principles, the detection depends purely on the training data. In reality, we will never have training data that cover all possible appearances of faults in seismic data, nor are our data noise-free. Therefore, although the raw output from the CNN classifier may adequately represent faults in synthetic data of simple structure and low noise, some post-processing steps are needed for the result to be useful on field data. Based on the traditional coherence attribute, Qi et al. (2017) introduced an image processing-based workflow to skeletonize faults. In this study, we regularize the raw output from a CNN fault detector with an image processing workflow built on Qi et al. (2017) to improve the fault images.

We use both realistic synthetic data and field data to investigate the effectiveness of the proposed workflow. The synthetic data should ideally be a good approximation of field data and provide full control over the parameter set. We build our synthetic data based on the SEAM model (Fehler and Larner, 2008) by taking sub-volumes from the impedance model and inserting faults. After verifying the performance on the synthetic data, we then move on to field data acquired from the Great South Basin, offshore New Zealand, where extensive faulting occurs. Results on both synthetic and field data show great potential for the proposed fault detection workflow, which provides very clean fault images.

Proposed Workflow

The proposed workflow starts with a CNN classifier which is used to produce a raw image of faults. In this study, we adopt a 3D patch-based CNN model that classifies each seismic sample using samples within a 3D window. An example of the CNN architecture used in this study is provided in Figure 1. A basic patch-based CNN model consists of several convolutional layers, pooling (downsampling) layers, and fully connected layers. Given a 3D patch of seismic amplitudes, a CNN model first automatically extracts several high-level abstractions of the image (similar to seismic attributes) using the convolutional and pooling layers, then classifies the extracted attributes using the fully connected layers, which behave similarly to a traditional multilayer perceptron network. The output from the network is then a single value representing the facies label of the seismic sample at the center of the 3D patch. In this study, the label is binary, representing “fault” or “non-fault”.

Figure 1. Sketches of a 2D patch-based CNN architecture. In this demo case, each input data instance is a small 2D patch of seismic amplitude centered at the sample to be classified. The corresponding output is a class label representing the patch (in this case, fault), which is usually assigned to the center sample. Different types of layers are denoted in different colors, with layer types marked at their first appearance in the network. The size of the cuboids approximately represents the output size of each layer.
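The patch-based input described above can be illustrated with a short sketch. This is not the implementation used in the study; it only shows, under stated assumptions, how a cubic window centered on a sample might be cut from an amplitude volume. The function name extract_patch, the edge padding at the volume boundary, and the 7-sample patch size are all hypothetical choices made for illustration.

```python
import numpy as np

def extract_patch(volume, center, half_size):
    """Return the cubic patch of edge 2*half_size + 1 centered at `center`.

    The volume is edge-padded (an assumption) so that samples near the
    boundary still receive a full-sized patch.
    """
    padded = np.pad(volume, half_size, mode="edge")
    # Shift the center indices into the coordinates of the padded volume.
    i, j, k = (c + half_size for c in center)
    return padded[i - half_size:i + half_size + 1,
                  j - half_size:j + half_size + 1,
                  k - half_size:k + half_size + 1]

rng = np.random.default_rng(0)
amplitude = rng.standard_normal((20, 20, 30))   # stand-in for a seismic volume
patch = extract_patch(amplitude, center=(0, 10, 29), half_size=3)
```

Each such patch would be one input instance to the classifier, and the predicted label would be assigned to the center sample. In practice the volume would be padded once rather than on every call.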

We then use a suite of image processing techniques to improve the quality of the fault images. First, we use a directional Laplacian of Gaussian (LoG) filter (Machado et al., 2016) to enhance lineaments that are at a high angle to the reflectors and suppress anomalies close to reflector dip, while calculating the dip, azimuth, and dip magnitude of the faults. Using these estimates, we then apply a skeletonization step, redistributing the fault anomalies within a fault damage zone to the most likely fault plane. We then apply a threshold to generate a binary image of the faults. Optionally, if the result is still noisy, we can continue with a median filter to reduce the random noise and iteratively perform the directional LoG and skeletonization to achieve a desirable result. Figure 2 summarizes the regularization workflow.
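Two of the simpler steps named above, thresholding and median filtering, can be sketched in a few lines. The sketch below is only an illustration on a 2D slice; it does not include the directional LoG filter or the skeletonization, and the cutoff of 0.5, the 3x3 window, and the function names are assumptions rather than values from the study.

```python
import numpy as np

def threshold(fault_likelihood, cutoff=0.5):
    """Label samples at or above `cutoff` as fault (1) and all others as 0."""
    return (fault_likelihood >= cutoff).astype(np.uint8)

def median_filter_3x3(image):
    """Apply a 3x3 median filter to a 2D image, edge-padding the borders."""
    padded = np.pad(image, 1, mode="edge")
    rows, cols = image.shape
    # Stack the nine shifted copies of the image and take the median across them.
    windows = np.stack([padded[r:r + rows, c:c + cols]
                        for r in range(3) for c in range(3)])
    return np.median(windows, axis=0).astype(image.dtype)

likelihood = np.zeros((7, 7))
likelihood[:, 2:5] = 0.9    # a three-sample-wide vertical fault zone
likelihood[1, 0] = 0.8      # an isolated false positive
binary = threshold(likelihood)
cleaned = median_filter_3x3(binary)
```

In this toy slice the isolated false positive is removed while the fault zone is preserved. Note that a median filter of this size would also erase a feature only one sample wide, which is one reason the order and window size of such steps need care.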

Synthetic Test

We first test the proposed workflow on synthetic data built on the SEAM model. To make the model a good approximation of real field data, we select a portion of the SEAM model where stacked channels and turbidites exist. We then randomly insert faults in the impedance model and convolve with a 40 Hz Ricker wavelet to generate seismic volumes. The parameters used in random generation of five reverse faults in the 3D volume are provided in Table 1. Figure 3a shows one line from the generated synthetic data with faults highlighted in red. In this model, we observe strong layer deformation with amplitude change along reflectors due to the turbidites in the model. Therefore, such synthetic data are in fact quite challenging for a fault detection algorithm, because of the existence of other types of discontinuities.
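The convolution step can be sketched for a single trace as follows. This is a minimal illustration, not the modeling code used in the study: it assumes that 40 Hz is the peak frequency of the wavelet, that the sample interval is 2 ms, and that reflectivity is computed at normal incidence from impedance contrasts. The two-layer impedance model and the function names are hypothetical.

```python
import numpy as np

def ricker(peak_freq_hz, dt_s, half_length=32):
    """Zero-phase Ricker wavelet sampled at dt_s with 2*half_length + 1 samples."""
    t = np.arange(-half_length, half_length + 1) * dt_s
    a = (np.pi * peak_freq_hz * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def synthetic_trace(impedance, wavelet):
    """Convolve normal-incidence reflectivity with a wavelet."""
    # Reflection coefficient at each interface between consecutive samples.
    reflectivity = np.diff(impedance) / (impedance[1:] + impedance[:-1])
    return np.convolve(reflectivity, wavelet, mode="same")

dt = 0.002                              # assumed 2 ms sample interval
impedance = np.full(200, 6.0e6)         # hypothetical two-layer impedance model
impedance[100:] = 7.5e6
trace = synthetic_trace(impedance, ricker(40.0, dt))
```

Applying the same operation to every trace of a faulted impedance sub-volume would give a synthetic seismic volume of the kind described above.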

We randomly select 20% of the samples on the fault planes and approximately the same number of non-fault samples to train the CNN model. The total number of training samples is about 350,000, which represents <1% of the total samples in the seismic volume. Figure 3b shows the raw output from the CNN fault detector on the same line shown in Figure 3a. We observe that instead of sticks, faults appear as small zones. Also, as expected, there are some misclassifications where the data are quite challenging. We then perform the regularization steps, excluding the optional steps in Figure 2. Figure 3c shows the result after the directional LoG filter and skeletonization. Notice that these two steps have cleaned up much of the noise, and the faults are now thinner and more continuous. Finally, we apply a threshold to generate a fault map in which faults are labeled “1” and everything else is labeled “0” (Figure 3d). Figure 4 shows the fault detection result on a less challenging line. We observe that the result on such a line is nearly perfect.
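The balanced sampling described above might look like the following sketch. It assumes a boolean fault mask with the same shape as the seismic volume; the function name, the fixed random seed, and the toy planar fault are illustrative only.

```python
import numpy as np

def sample_training_indices(fault_mask, fault_fraction=0.2, seed=0):
    """Pick a random fraction of fault samples and an equal number of non-fault samples.

    Returns two arrays of flat (raveled) indices into the volume.
    """
    rng = np.random.default_rng(seed)
    flat = fault_mask.ravel().astype(bool)
    fault_idx = np.flatnonzero(flat)
    non_fault_idx = np.flatnonzero(~flat)
    n = int(round(fault_fraction * fault_idx.size))
    picked_fault = rng.choice(fault_idx, size=n, replace=False)
    picked_non_fault = rng.choice(non_fault_idx, size=n, replace=False)
    return picked_fault, picked_non_fault

mask = np.zeros((50, 50, 40), dtype=bool)
mask[:, 25, :] = True                      # toy planar fault with 2,000 samples
fault_idx, non_fault_idx = sample_training_indices(mask)
```

Keeping the two classes at roughly equal size avoids a training set dominated by non-fault samples, which make up almost all of the volume.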

Figure 2. The regularization workflow used to improve the fault images after CNN fault detection.

Fault attribute          Value range
Dip angle (degree)       -15 to 15
Strike angle (degree)    -25 to 25
Displacement (m)         25 to 75

Table 1. Parameter ranges used in generating faults in the synthetic model.
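As an illustration of how the ranges in Table 1 could drive random fault generation, the sketch below draws parameters for five faults. The source does not state the sampling distribution, so the uniform draws here are an assumption, and the dictionary keys and function name are hypothetical.

```python
import numpy as np

# Ranges copied from Table 1. Uniform sampling within each range is an assumption.
PARAMETER_RANGES = {
    "dip_deg": (-15.0, 15.0),
    "strike_deg": (-25.0, 25.0),
    "displacement_m": (25.0, 75.0),
}

def draw_faults(n_faults, seed=0):
    """Draw one parameter set per fault, uniformly within the Table 1 ranges."""
    rng = np.random.default_rng(seed)
    return [{name: float(rng.uniform(low, high))
             for name, (low, high) in PARAMETER_RANGES.items()}
            for _ in range(n_faults)]

faults = draw_faults(5)
```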

Field Data Test

We further verify the proposed workflow on field data from the Great South Basin, offshore New Zealand. The seismic data contain extensive faulting with complex geometry, as well as other types of coherence anomalies, as shown in Figure 5. In this case, we manually picked training data on five seismic lines for regions representing fault and non-fault. An example line is given in Figure 6. As one may notice, although the training data cover only a very limited part of the whole volume, we try to include the most representative samples for the two classes. On the field data, we use the whole regularization workflow, including the optional steps. Figure 7 gives the final output from the proposed workflow, and the result from using coherence in lieu of raw CNN output in the workflow. We observe that the result from CNN plus regularization gives clean fault planes with very limited noise from other types of discontinuities.

Conclusion

In this study, we introduce a fault detection workflow using both CNN-based classification and image processing regularization. We are able to train a CNN classifier to be sensitive to only faults, which greatly reduces the mixing between faults and other discontinuities in the resulting fault images. To improve the resolution and further suppress non-fault features in the raw fault images, we then use an image processing-based regularization workflow to enhance the fault planes. The proposed workflow shows great potential on both challenging synthetic data and field data.

Acknowledgements

The authors thank Geophysical Insights for the permission to publish this work. We thank New Zealand Petroleum and Minerals for providing the Great South Basin seismic data to the public. The CNN fault detector used in this study is implemented in TensorFlow, an open source library from Google. The authors also thank Gary Jones at Geophysical Insights for valuable discussions on the SEAM model.

Figure 3. Line A from the synthetic data showing seismic amplitude with a) artificially created faults highlighted in red; b) raw output from CNN fault detector; c) CNN detected faults after directional LoG and skeletonization; and d) final fault map after thresholding.

Figure 4. Line B from the synthetic data showing seismic amplitude co-rendered with a) randomly created faults highlighted in red and b) final result from the fault detection workflow, in which predicted faults are marked in red.

Figure 5. Coherence attribute along t = 1.492 s. Coherence shows discontinuities not limited to faults, posing challenges for obtaining fault-only images.

Figure 6. A vertical slice from the field seismic amplitude data with manually picked regions for training the CNN fault detector. Green regions represent fault and red regions represent non-fault.

References

Araya-Polo, M., T. Dahlke, C. Frogner, C. Zhang, T. Poggio, and D. Hohl, 2017, Automated fault detection without seismic processing: The Leading Edge, 36, 208–214.

El-Sayed, M. A., Y. A. Estaitia, and M. A. Khafagy, 2013, Automated edge detection using convolutional neural network: International Journal of Advanced Computer Science and Applications, 4, 11–17.

Fehler, M., and K. Larner, 2008, SEG Advanced Modeling (SEAM): Phase I first year update: The Leading Edge, 27, 1006–1007.

Huang, L., X. Dong, and T. E. Clee, 2017, A scalable deep learning platform for identifying geologic features from seismic attributes: The Leading Edge, 36, 249–256.

Machado, G., A. Alali, B. Hutchinson, O. Olorunsola, and K. J. Marfurt, 2016, Display and enhancement of volumetric fault images: Interpretation, 4, no. 1, SB51–SB61.

Marfurt, K. J., V. Sudhaker, A. Gersztenkorn, K. D. Crawford, and S. E. Nissen, 1999, Coherency calculations in the presence of structural dip: Geophysics, 64, 104–111.

Qi, J., G. Machado, and K. Marfurt, 2017, A workflow to skeletonize faults and stratigraphic features: Geophysics, 82, no. 4, O57–O70.

Waldeland, A. U., and A. H. S. S. Solberg, 2017, Salt classification using deep learning: 79th Annual International Conference and Exhibition, EAGE, Extended Abstracts, Tu-B4-12.

Wu, X., and D. Hale, 2016, 3D seismic image processing for faults: Geophysics, 81, no. 2, IM1–IM11.

Xie, S., and Z. Tu, 2015, Holistically-nested edge detection: Proceedings of the IEEE International Conference on Computer Vision, 1395–1403.

Machine Learning Terms

The 2D Colormap is a graphical representation of neuron classifications of attributes. Each hexagonal object (neuron) in the interactive 2D Colormap represents a class of data in the region and a corresponding geologic condition. The 2D Colormap is used interactively with the Paradise Universal Viewer to select and isolate specific neurons which have a classified set of seismic attributes according to where the data is concentrated.

While a SOM or PCA are examples of batching applications, other features within Paradise are interactive, such as the 2D Colormap and Universal Viewer; therefore, Paradise has both batching and interactive capabilities.

A tool in Paradise that quantifies the relative contribution of attributes within a neuron or set of neurons. The 2D Colormap reveals the specific attributes comprising a selected classification result as represented by each neuron on the 2D Colormap. The interpreter can then run a SOM across the refined area to expose geobodies with similar properties. Note: This feature will be available in Paradise 3.0.

Machine learning is a process by which computer algorithms learn iteratively from the data and adapt independently to produce reliable, repeatable results. Machine learning addresses two significant issues:

1. The Big Data problem of trying to interpret dozens, if not hundreds, of volumes of data
2. The fact that humans cannot understand the relationship of several types of data all at once

Multi-attribute analysis is a beneficial technique when single attributes are indistinct. The natural patterns or clusters it reveals represent geologic information embedded in the data, and can help identify geologic features, geobodies, and aspects of geology that often cannot be interpreted by any other means.

Neural networks are a class of machine learning techniques. While there are several forms of neural networks, Paradise uses the Self-Organizing Map (SOM) process, which is sometimes referred to as a Kohonen map after Professor Teuvo Kohonen of the Helsinki University of Technology in Finland.

Attributes differ in their relative contribution to information in a given volume. PCA is a linear process that helps to determine those attributes that have the greatest contribution to the data and quantifies the relative contribution of each attribute based on its variance.

The results of the PCA are given in two bar charts – Eigenvalues and Eigenvectors. Together, they indicate the direction and magnitude of the greatest variance among the set of attributes.

Eigenvalues – A bar chart that presents the extent of variance among a set of attributes in the PCA. Each eigenvalue can be selected to reveal its corresponding eigenvector.

Eigenvectors – A bar chart and associated table that list the relative contribution, on a percentage basis, of each attribute in a set.
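The relationship between eigenvalues, eigenvectors, and attribute contribution can be sketched with a small example. This is a generic PCA on synthetic data, not the Paradise implementation. In particular, expressing each attribute's contribution as its share of the absolute loadings in the first eigenvector is an assumption made here for illustration, and the three stand-in attributes are invented.

```python
import numpy as np

def pca(attributes):
    """PCA of an (n_samples, n_attributes) array.

    Returns eigenvalues in descending order and the matching eigenvectors
    as columns.
    """
    # Standardize so attributes with different units are comparable.
    standardized = (attributes - attributes.mean(axis=0)) / attributes.std(axis=0)
    covariance = np.cov(standardized, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]
    return eigenvalues[order], eigenvectors[:, order]

rng = np.random.default_rng(0)
base = rng.standard_normal(5000)
attributes = np.column_stack([
    base + 0.1 * rng.standard_normal(5000),   # two strongly correlated attributes
    base + 0.1 * rng.standard_normal(5000),
    rng.standard_normal(5000),                # one independent attribute
])

eigenvalues, eigenvectors = pca(attributes)
# Share of total variance explained by each principal component, in percent.
variance_share = 100.0 * eigenvalues / eigenvalues.sum()
# Assumed contribution measure: share of absolute loadings in the first eigenvector.
first = np.abs(eigenvectors[:, 0])
contribution = 100.0 * first / first.sum()
```

Here the first principal component is dominated by the two correlated attributes, so they receive the largest contribution percentages and the independent attribute contributes little to it.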

A seismic attribute is any measurable property of seismic data which aids interpreters in identifying geologic features that are not understood clearly in the original data.

A Self-Organizing Map (SOM) is a neural network-based machine learning process that is applied to multiple attribute volumes simultaneously. A SOM analysis enables interpreters to identify the natural organizational patterns in the data from multiple seismic attributes. Applied at single sample seismic resolution in Paradise, the SOM produces a non-linear classification of the data in a region designated by the interpreter. Regions can be constrained by time, between horizons, or above and below a given horizon. SOM evaluations have proven to be beneficial in essentially all geologic settings, including unconventional resource plays, moderately compacted onshore regions, and offshore unconsolidated sediments.
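The idea of a SOM classifying multi-attribute samples can be sketched with a minimal textbook implementation. It is not the Paradise algorithm: the 4x4 grid, the linear decay of learning rate and neighborhood width, and the synthetic two-cluster data are all assumptions chosen to keep the example short.

```python
import numpy as np

def train_som(data, rows=4, cols=4, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small SOM and return the neuron weights, shape (rows*cols, n_features)."""
    rng = np.random.default_rng(seed)
    n_samples = data.shape[0]
    # Initialise neuron weights from randomly chosen samples.
    weights = data[rng.choice(n_samples, rows * cols, replace=False)].astype(float)
    # Position of each neuron on the 2D grid.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    n_steps = epochs * n_samples
    step = 0
    for _ in range(epochs):
        for i in rng.permutation(n_samples):
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                      # learning rate decays to 0
            sigma = sigma0 * (1.0 - frac) + 0.5 * frac   # neighborhood shrinks to 0.5
            # Winning neuron: the one whose weights are closest to the sample.
            winner = np.argmin(((weights - data[i]) ** 2).sum(axis=1))
            # Neurons near the winner on the grid are pulled toward the sample too.
            grid_dist2 = ((grid - grid[winner]) ** 2).sum(axis=1)
            influence = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            weights += lr * influence[:, None] * (data[i] - weights)
            step += 1
    return weights

def classify(data, weights):
    """Assign each sample the index of its winning neuron."""
    dist2 = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return dist2.argmin(axis=1)

rng = np.random.default_rng(1)
cluster_a = rng.normal(0.0, 0.5, size=(200, 2))    # two synthetic attribute clusters
cluster_b = rng.normal(10.0, 0.5, size=(200, 2))
samples = np.vstack([cluster_a, cluster_b])
som_weights = train_som(samples)
labels = classify(samples, som_weights)
```

Each neuron index plays the role of one class in the classified volume, so a 4x4 grid yields up to 16 classes. Samples from the two well-separated clusters end up on different neurons without any labels being supplied, which is the sense in which the classification is unsupervised.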

The Universal Viewer presents SOM process results through classification and probability volumes. It displays 2D and 3D views of the data while using the 2D Colormap to gain understanding of the classification results.

  • At the intersection of machine learning and multi-attribute seismic analysis

  • The next generation of statistical classification tools

  • An evolving workbench that enables interpreters to extract greater insights from seismic data

  • Small producers – attract investment capital through reduced risk and faster interpretation

  • Large producers – reduce risk and the cost/bbl for field development

  • Machine learning technology that analyzes data at single sample resolution

  • Easy to use, left-to-right guided thought-flows can be applied by all interpreters

  • Interactive, 2D Colormap representing classification results

  • Integrated PCA to SOM (see terms above) thought-flow that identifies the best attributes and refines an interpretation

Thin Beds and Anomaly Resolution in the Niobrara

By Rocky Roden
March 2018

Identifying thin beds using machine learning

In March of 2018, a study was conducted of the Niobrara using machine learning for multi-attribute analysis. Geophysical Insights obtained 100 square miles of seismic data covering the Niobrara in northeast Colorado from the Geophysical Pursuit and Geokinetics multi-client seismic library. Using Paradise®, a multi-attribute learning application, geoscientists generated and classified seismic volumes. The results revealed thin beds below seismic tuning and a level of anomaly resolution not practical with traditional interpretation methods. Overall, the results of that classification demonstrated dramatically improved stratigraphic resolution and anomaly isolation within the Niobrara Formation and associated reservoir and source rock units.

 

Figure 1: Outline of Niobrara Phase 5. 100 square miles within this survey were selected for analysis.

Seismic Volume Classification Results

The result set, or classification in multi-colors, shows the continuity of facies and the continuity/discontinuity of anomalies in the greater Niobrara section. Each voxel represents 1 millisecond in the volume or approximately 15 feet. The white dashed correlation line shows anomalies on the B bench and the brackets indicate anomalies at the Codell level. Two horizontal boreholes are shown; however, neither appears to be an optimal penetration. The results demonstrate that machine learning in Paradise enables a sample-based thin bed analysis.

Classification - Instantaneous Attributes

Figure 2a: Classification based on eight instantaneous attributes.

Figure 2b: The original amplitude data.

Figure 3: Self-Organizing Map (SOM) classification with low probability (<10%) anomalies (white) in the greater Niobrara section. The results may indicate concentrations of hydrocarbons in organic-rich shales. Anomalies also may show migration up section along faults from the Niobrara into the Sharon Springs. Niobrara B anomalies are best developed in elevated areas. Additional well data can corroborate anomaly interpretations.

Interactive 2D Map

Figure 4: Interactive 2D Colormaps use transparency (left and see inset) to highlight and extract anomalous behavior or particular facies. Note that the wellbore intersects a fault (white arrow) and misses the bracketed anomaly.

Principal Component Analysis (PCA) was used to identify and quantify the key attributes in the seismic volumes. The SOM process was then applied on these attributes to learn and classify the data. The neural topology, shown in the Paradise 2D Colormap, establishes the number of classes in the resulting seismic volume. The classification volumes show geologic features and anomalies that can aid in well location and development planning.

 

Top Niobrara Map - Structures and Associated Fractures

Figure 5: Map of structural features identified in this Niobrara volume. Near-vertical graben-style faulting predominates in the upper Niobrara.

What is Big Data?

Let’s talk for a minute about the concepts of Big Data.

Remember a few years ago, if you wanted to survive in the oil and gas business, saving the whales was all the rage? We searched for some way to incorporate protecting the whales into our exploration geophysics, and that would affect operations. Well, we have another big thing today – Big Data. We’re always looking for ways to tie in what we’re doing to Big Data. The bosses up at board level – they’re all talking about Big Data. What is it?

Big Data is access to large volumes of disparate kinds of oil and gas data, which we then feed to machine learning algorithms to discover unknown relationships. These are relationships in the data that we’ve never spotted before. A key to that definition is “disparate kinds”. So, if you say “I’m doing big data with my seismic data” – that’s not really an appropriate choice of terms. If you say “I’m going to throw in all my seismic data, along with associated wells, and my production data” – NOW you are starting to talk about real Big Data operations.

A couple more key terms to keep in mind:

Data Mining is evaluating Big Data with deep learning.

And finally, the Internet of Things (IoT).

This may actually have a bigger impact on our industry than even machine learning. The IoT refers to all the pieces of equipment and hardware in our lives being hooked up to the internet. The IoT is walking up to your web-enabled refrigerator that recognizes your face and tracks what you add to and remove from its contents. In our business, we’re looking at the GPS of the boat, the geophones – everything is a web-aware device that can both send and receive. In fact, when the geophones get planted, their GPS is still communicating. We know when they are in the ground, and when they get pulled up, thrown in the back of a truck, and driven somewhere.

With the trifecta of those things – Big Data, IoT, and Data Mining – we are approaching a new age in the oil and gas industry, one in which we can know and understand things in ways we never have before.

At Geophysical Insights, we believe you should be able to query your seismic data with learning machines just as effortlessly and with as much reliability as you query the web for the nearest gas station.