Net Reservoir Discrimination through Multi-Attribute Analysis at Single Sample Scale

By Jonathan Leal, Rafael Jerónimo, Fabian Rada, Reinaldo Viloria and Rocky Roden
Published with permission: First Break
Volume 37, September 2019

Abstract

A new approach has been applied to discriminate Net Reservoir using multi-attribute seismic analysis at single sample resolution, complemented by bivariate statistical analysis from petrophysical well logs. The combination of these techniques was used to calibrate the multi-attribute analysis to ground truth, thereby ensuring an accurate representation of the reservoir static properties and reducing the uncertainty related to reservoir distribution and storage capacity. Geographically, the study area is located in the south of Mexico. The reservoir rock consists of sandstones from the Upper Miocene age in a slope fan environment.

The first method in the process was the application of Principal Component Analysis (PCA), which was employed to identify the most prominent attributes for detecting lithological changes that might be associated with the Net Reservoir. The second method was the application of the Kohonen Self-Organizing Map (SOM) Neural Network Classification at voxel scale (i.e., sample rate and bin size dimensions from seismic data), instead of using waveform shape classification. The sample-level analysis revealed significant new information from different seismic attributes, providing greater insights into the characteristics of the reservoir distribution in a shaly sandstone. The third method was a data analysis technique based on contingency tables and Chi-Square test, which revealed relationships between two categorical variables (SOM volume neurons and Net Reservoir). Finally, a comparison between a SOM of simultaneous seismic inversion attributes and traditional attributes classification was made corroborating the delineated prospective areas. The authors found the SOM classification results are beneficial to the refinement of the sedimentary model in a way that more accurately identified the lateral and vertical distribution of the facies of economic interest, enabling decisions for new well locations and reducing the uncertainty associated with field exploitation. However, the Lithological Contrast SOM results from traditional attributes showed a better level of detail compared with seismic inversion SOM.

Introduction

Self-Organizing Maps (SOM) is an unsupervised neural network method – a form of machine learning – that has been used in multi-attribute seismic analysis to extract more information from the seismic response than would be practical using only single attributes. The most common use is in automated facies mapping. It is expected that every neuron or group of neurons can be associated with a single depositional environment, the reservoir's lateral and vertical extension, porosity changes or fluid content (Marroquín et al., 2009). Of course, the SOM results must be calibrated with available well logs. In this paper, the authors generated petrophysical labels to apply statistical validation techniques between well logs and SOM results. Based on the application of PCA to a larger set of attributes, a smaller, distilled set of attributes was classified using the SOM process to identify lithological changes in the reservoir (Roden et al., 2015).

A bivariate statistical approach was then conducted to reveal the relationship between two categorical variables: the individual neurons comprising the SOM classification volume and Net Reservoir determined from petrophysical properties (percentage of occurrence of each neuron versus Net Reservoir).

The Chi-Square test compares the behavior of the observed frequencies (Agresti, 2002) for each SOM neuron lithological contrast against the Net Reservoir variable (grouped in “Net Reservoir” and “no reservoir” categories). Additional data analysis was conducted to determine which neurons responded to the presence of hydrocarbons using box plots showing Water Saturation, Clay Volume, and Effective Porosity as Net Pay indicators. The combination of these methods demonstrated an effective means of identifying the approximate region of the reservoir.

About the Study Area

The reservoir rock consists of sandstones from the Upper Miocene age in a slope fan environment. These sandstones correspond to channel facies, and slope lobes constituted mainly of quartz and potassium feldspars cemented in calcareous material of medium maturity. The submarine slope fans were deposited at the beginning of the deceleration of the relative sea-level fall, and consist of complex deposits associated with gravitational mass movements.

Stratigraphy and Sedimentology

The stratigraphic chart comprises Tertiary terrigenous rocks from Upper Miocene to Holocene. The litho-stratigraphic units are described in Table 1.

Table 1: Stratigraphic Epoch Chart of Study Area

 

Figure 1. Left: Regional depositional facies. Right: Electrofacies and theoretical model, Muti (1978).

Figure 1 (left) shows the facies distribution map of the sequence, corresponding to the first platform-basin system established in the region. The two dashed lines – one red and one dark brown – represent the platform edge at different times according to several regional integrated studies in the area. The predominant direction of contribution for studied Field W is south-north, which is consistent with the current regional sedimentary model. The field covers an area of approximately 46 km² and is located in facies of distributary channels northeast of the main channel. The reservoir is also well-classified and consolidated in clay matrix, and it is thought that this texture corresponds to the middle portion of the turbidite system. The electrofacies observed in the reservoir, derived from gamma ray logs, are box-shaped in wells W-2, W-4, W-5, and W-6 and are associated with facies of distributary channels that exhibit the highest average porosity. In contrast, wells W-3 and W-1 are different – associated with lobular facies – according to gamma ray logs. Figure 1 (right) shows a sedimentary scheme of submarine fans proposed by Muti (1978).

Petrophysics

The Stieber model was used to estimate Clay Volume (VCL). Effective Porosity (PIGN) was obtained using the Neutron-Density model, and non-clay water intergranular Water Saturation (SUWI) was determined using the Simandoux model with a formation water salinity of 45,000 ppm. The petrophysical cut-off values used to distinguish Net Reservoir and Net Pay were 0.45 for Clay Volume, 0.10 for Effective Porosity and 0.65 for Water Saturation.
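To make the cut-offs concrete, the following is a minimal sketch of how discrete Net Reservoir and Net Pay flags could be derived from the three logs. The direction of each inequality (VCL at or below its cut-off, PIGN at or above, SUWI at or below) follows common petrophysical practice and is an assumption here, as are the function name and the sample values.

```python
# Cut-off values quoted in the text.
VCL_CUTOFF = 0.45   # Clay Volume
PIGN_CUTOFF = 0.10  # Effective Porosity
SUWI_CUTOFF = 0.65  # Water Saturation


def net_flags(vcl, pign, suwi):
    """Return (net_reservoir, net_pay) booleans for one log sample.

    Assumed convention: Net Reservoir is rock that is clean and porous
    enough; Net Pay is Net Reservoir with low enough water saturation.
    """
    net_reservoir = vcl <= VCL_CUTOFF and pign >= PIGN_CUTOFF
    net_pay = net_reservoir and suwi <= SUWI_CUTOFF
    return net_reservoir, net_pay


# Hypothetical log samples as (VCL, PIGN, SUWI); not data from the study.
samples = [(0.20, 0.22, 0.25), (0.30, 0.15, 0.80), (0.60, 0.05, 0.90)]
flags = [net_flags(*s) for s in samples]
```

Under these assumptions the first sample is both Net Reservoir and Net Pay, the second is Net Reservoir only, and the third is neither.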

Reservoir Information

The reservoir rock corresponds to sands with Net Pay thickness ranging from 9-12 m, porosity between 18-25%, average permeability of 8-15 mD, and Water Saturation of approximately 25%. The initial pressure was 790 kg/cm² and the current pressure is 516 kg/cm². The main problem affecting productivity in this volumetric reservoir is pressure drop, since the displacement mechanisms are rock-fluid expansion and gas in solution. Additionally, there are sanding problems and asphaltene precipitation.

Methodology

Multidisciplinary information was collected and validated to carry out seismic multi-attribute analysis. Static and dynamic characterization studies were conducted in the study area, revealing the most relevant reservoir characteristics and yielding a better sense of the proposed drilling locations. At present, six wells have been drilled.

The original available seismic volume and associated gathers employed in the generation of multiple attributes and for simultaneous inversion were determined to be of adequate quality. At target depth, the dominant frequency approaches 14 Hz, and the interval velocity is close to 3,300 m/s. Therefore, the vertical seismic resolution, taken as one quarter of the dominant wavelength, is approximately 58 m. The production sand has an average thickness of 13 m, so it cannot be resolved with conventional seismic amplitude data.
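The resolution figure can be checked with a short calculation. This sketch assumes the usual quarter-wavelength criterion, which the text does not name explicitly but which matches the quoted value; the function name is illustrative.

```python
def vertical_resolution(velocity_m_s, dominant_freq_hz):
    """Quarter-wavelength estimate of vertical seismic resolution, in metres."""
    wavelength = velocity_m_s / dominant_freq_hz  # dominant wavelength
    return wavelength / 4.0


# Values quoted in the text for the target depth.
resolution_m = vertical_resolution(3300.0, 14.0)

# The producing sand averages 13 m, well below the resolution limit.
sand_thickness_m = 13.0
is_resolved = sand_thickness_m >= resolution_m
```

With these inputs the dominant wavelength is about 236 m and the resolution limit just under 59 m, more than four times the average sand thickness.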

Principal Component Analysis (PCA)

Principal Component Analysis (PCA) is one of the most common descriptive statistics procedures used to synthesize the information contained in a set of variables (volumes of seismic attributes) and to reduce the dimensionality of a problem. Applied to a collection of seismic attributes, PCA can be used to identify the seismic attributes that have the greatest “contribution,” based on the extent of their relative variance to a region of interest. Attributes identified through the use of PCA are responsive to specific geological features, e.g., lithological contrast, fracture zones, among others. The output of PCA is an Eigen spectrum that quantifies the relative contribution or energy of each seismic attribute to the studied characteristic.
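As an illustration of the idea, the sketch below computes an eigen spectrum and per-attribute contributions with NumPy on synthetic data. It is not the software used in the study. The contribution measure (absolute loading scaled to the largest loading in each component) is one plausible reading of "percentage contribution" and is an assumption, as are the function name and the synthetic attributes.

```python
import numpy as np


def pca_eigen_spectrum(samples):
    """samples: (n_samples, n_attributes) array of attribute values.

    Returns the fraction of variance explained by each principal component
    and an (n_attributes, n_components) table of percentage contributions.
    """
    # Standardize so attributes with different units are comparable.
    z = (samples - samples.mean(axis=0)) / samples.std(axis=0)
    cov = np.cov(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]        # sort components by variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    explained = eigvals / eigvals.sum()
    # Assumed measure: loading magnitude relative to the largest loading
    # within the same principal component, as a percentage.
    loadings = np.abs(eigvecs)
    contribution = 100.0 * loadings / loadings.max(axis=0)
    return explained, contribution


# Synthetic stand-in for three attributes: two share a common signal,
# the third is unrelated noise.
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 1))
attrs = np.hstack([
    base + 0.1 * rng.normal(size=(500, 1)),
    base + 0.1 * rng.normal(size=(500, 1)),
    rng.normal(size=(500, 1)),
])
explained, contribution = pca_eigen_spectrum(attrs)
selected = contribution[:, 0] >= 80.0   # attributes kept from the first component
```

Here the two correlated attributes dominate the first principal component and pass an 80% threshold, while the noise attribute does not.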

PCA Applied for Lithological Contrast Detection

The PCA process was applied to the following attributes to identify those most significant for detecting lithological contrasts at the depth of interest: Thin Bed Indicator, Envelope, Instantaneous Frequency, Imaginary Part, Relative Acoustic Impedance, Sweetness, Amplitude, and Real Part. Of the entire seismic volume, only the voxels (seismic samples) in a time window delimited by the horizon of interest were analyzed, specifically 56 milliseconds above and 32 milliseconds below the horizon. The results are shown for each principal component. In this case, the selection criterion was to retain the attributes whose maximum percentage contribution to the principal component was greater than or equal to 80%. Using this selection technique, the first five principal components were reviewed in the Eigen spectrum. In the end, six (6) attributes of the first two principal components were selected (Figure 2).

Figure 2. PCA results for lithological contrast detection.

Simultaneous Classification of Seismic Attributes Using a Self-Organizing Maps (SOM) Neural Network (Voxel Scale)

The SOM method is an unsupervised classification process in that the network is trained from the input data alone. A SOM consists of components (vectors) called neurons or classes, and input vectors that have a position on the map. Through training (machine learning) and mapping, the neurons detect groupings in the input data. The SOM process non-linearly maps the neurons to a two-dimensional hexagonal or rectangular grid; in effect, a SOM describes a mapping of a larger space to a smaller one. To locate a vector from the data space on the map, the procedure finds the neuron whose weight vector is closest (smallest metric distance) to that data vector. In this analysis, the data vectors were the seismic samples located within the time window covering several samples above and below the target horizon throughout the study area. It is important to classify together attributes that have a common interpretive use, such as lithological indicators or fault delineation. The SOM revealed patterns and identified natural organizational structures present in the data that are difficult to detect in any other way (Roden et al., 2015). Because the SOM classification used in this study is applied to individual samples (using sample rate and bin size from seismic data, Figure 2, lower right box), it detects features below conventional seismic resolution, in contrast with traditional wavelet-based classification methods.
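The mechanics described above can be sketched in a few lines of NumPy. This is a generic, minimal Kohonen SOM on a rectangular grid, not the implementation used in the study. The learning-rate and neighborhood schedules, the row-wise neuron numbering N1 to N25, and the random input data are all assumptions made for illustration.

```python
import numpy as np


def train_som(data, rows=5, cols=5, epochs=20, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small rectangular SOM; returns (rows*cols, n_attributes) weights."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(rows * cols, data.shape[1]))
    # Position of each neuron on the 2D map, numbered row by row.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    n_steps = epochs * len(data)
    step = 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 0.5      # shrinking neighborhood
            # Winning neuron: weight vector closest to the sample.
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))
            # Neighbors of the winner on the map are pulled along with it,
            # which is what keeps adjacent neurons similar.
            grid_dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)
            h = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            weights += lr * h[:, None] * (x - weights)
            step += 1
    return weights


def classify(data, weights):
    """Assign every sample to its winning neuron, numbered from 1."""
    dist = np.linalg.norm(data[:, None, :] - weights[None, :, :], axis=2)
    return np.argmin(dist, axis=1) + 1


# Hypothetical input: 300 samples, each a vector of six standardized attributes.
rng = np.random.default_rng(1)
samples = rng.normal(size=(300, 6))
som_weights = train_som(samples)
labels = classify(samples, som_weights)
```

Each seismic sample is treated as one input vector, so the classification is made per sample rather than per waveform.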

SOM Classification for Lithological Contrast Detection

The following six attributes were input to the SOM process with 25 classes (5 X 5) stipulated as the desired output: Envelope, Hilbert, Relative Acoustic Impedance, Sweetness, Amplitude, and Real Part.

As in the PCA analysis, the SOM was delimited to seismic samples (voxels) in a time window following the horizon of interest, specifically 56 milliseconds above to 32 milliseconds below. The resulting SOM classification volume was examined with several visualization and statistical analysis techniques to associate SOM classification patterns with reservoir rock.

3D and Plan Views

One way of identifying patterns or trends coherent with the sedimentary model of the area is to visualize all samples grouped by each neuron in 3D and plan views, using a stratal-slicing technique throughout the reservoir. The Kohonen SOM and the 2D Colormap in Figure 3 (lower right) ensure that the characteristics of neighboring neurons are similar. The upper part of Figure 3 shows groupings classified by all 5x5 (25) neurons comprising the neural network, while in the lower part there are groupings interpreted to be associated with the reservoir classified by a few neurons that are consistent with the regional sedimentary model, i.e., neurons N12, N13, N16, N17, N22, and N23.

Figure 3. Plan view of preliminary geobodies with geological significance from the Lithological Contrast SOM. Below: only neurons associated with the reservoir are shown.

Vertical Seismic Section Showing Lithological Contrast SOM

The observed lithology in the reservoir sand is predominantly made up of clay sandstone. A discrete log for Net Reservoir was generated to calibrate the results of the Lithological Contrast SOM, using cut-off values according to Clay Volume and Effective Porosity. Figure 4 shows the SOM classification of Lithological Contrast with available well data and plan view. The samples grouped by neurons N17, N21, and N22 match with Net Reservoir discrete logs. It is notable that only the well W-3 (minor producer) intersected the samples grouped by the neuron N17 (light blue). The rest of the wells only intersected neurons N21 and N22. It is important to note that these features are not observed on the conventional seismic amplitude data (wiggle traces).

Figure 4. Vertical section composed of the SOM of Lithological Contrast, Amplitude attribute (wiggle), and Net Reservoir discrete property along wells.

Stratigraphic Well Section

A cross-section containing the wells (Figure 5) shows logs of Gamma Ray, Clay Volume, perforations, resistivity, Effective Porosity, Net Reservoir with lithological contrast SOM classification, and Net Pay.
The results of SOM were compared by observation with discrete well log data, relating specific neurons to the reservoir. At target zone depth, only the neurons N16, N17, N21, and N22 are present. It is noteworthy that only well W-3 (minor producer) intersects clusters formed by neuron N17 (light blue). The rest of the wells intersect neurons N16, N21, N22, and N23.

Statistical Analysis Vertical Proportion Curve (VPC)

Traditionally, Vertical Proportion Curves (VPC) are qualitative and quantitative tools used by some sedimentologists to define succession, division, and variability of sedimentary sequences from well data, since logs describe vertical and lateral evolution of facies (Viloria et al., 2002). A VPC can be modeled as an accumulative histogram where the bars represent the facies proportion present at a given level in a stratigraphic unit. As part of the quality control and revision of the SOM classification volume for Lithological Contrasts, this statistical technique was used to identify whether in the stratigraphic unit or in the window of interest, a certain degree of succession and vertical distribution of specific neurons observed could be related to the reservoir.

The main objective of this statistical method is to identify how specific neurons are vertically concentrated along one or more logs. As an illustration of the technique, a diagram of the stratigraphic grid is shown in Figure 6. The VPC was extracted from the whole 3D grid of the SOM classification volume for Lithological Contrast by counting the occurrences of each of the 25 neurons or classes in each stratigraphic layer. The VPC of SOM neurons exhibits remarkably slowly varying characteristics indicative of geologic depositional patterns. The reservoir top corresponds to stratigraphic layer No. 16. In the VPC on the right, only neurons N16, N17, N21, and N22 are present. These neurons have a higher percentage occurrence relative to all 25 classes from the top of the target sand downwards. Corroborating the statistics, these same neural classes appear in the map view in Figure 3 and the vertical section shown in Figure 4. The stratigraphic well section in Figure 5 also supports the statistical results. It is important to note that these neurons also detected seismic samples above the top of the sand, although in a lesser proportion. This effect is consistent with the existence of layers with similar lithological characteristics, which can be seen from the well logs.
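The counting step behind a VPC is simple enough to sketch. The example below assumes the classification volume has already been resampled onto a stratigraphic grid and is available as one list of neuron numbers per layer. The function name and the tiny grid are hypothetical.

```python
from collections import Counter


def vertical_proportion_curve(grid_layers, n_classes=25):
    """Percentage of cells classified by each neuron, layer by layer.

    grid_layers: list of layers ordered top to bottom; each layer is a list
    with the neuron number (1..n_classes) of every cell in that layer.
    """
    vpc = []
    for cells in grid_layers:
        counts = Counter(cells)
        total = len(cells)
        vpc.append({n: 100.0 * counts.get(n, 0) / total
                    for n in range(1, n_classes + 1)})
    return vpc


# Hypothetical three-layer grid with four cells per layer.
layers = [
    [3, 3, 8, 22],
    [21, 21, 22, 17],
    [21, 21, 21, 22],
]
vpc = vertical_proportion_curve(layers)
```

Plotting each layer's percentages as a stacked horizontal bar, with layers ordered by depth, gives the kind of display described for Figure 6.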

Figure 6. Vertical Proportion Curve to identify neurons related to reservoir rock.

Bivariate Statistical Analysis Cross Tabs

The first step in this methodology is a bivariate analysis through cross-tabs (contingency table) to determine whether two categorical variables are related, based on observing the extent to which the occurrence of one variable is repeated in the categories of the second. Given that one variable is analyzed in terms of another, a distinction must be made between dependent and independent variables. In addition to the frequency analysis of each variable separately, cross-tab analysis extends to the joint frequencies, in which the unit of analysis is defined by the combination of the two variables.

The result was obtained by extracting the SOM classification volume along well paths and constructing a discrete well log with two categories: “Net Reservoir” and “not reservoir.” The distinction between “Net Reservoir” and “not reservoir” simply means that the dependent variable might have a hydrocarbon storage capacity or not. In this case, the dependent variable corresponds to neurons of the SOM classification volume for Lithological Contrast. It is of ordinal type, since it has an established internal order, and the change from one category to another is not the same. The neurons go from N1 to N25, organized in rows. The independent variable is Net Reservoir, which is also an ordinal type variable. In this table, the values organized in rows correspond to neurons from the SOM classification volume for Lithological Contrast, and in the columns are discrete states of the “Net Reservoir” and “not reservoir” count for each neuron. Table 2 shows that the highest Net Reservoir counts are associated with neurons N21 and N22 at 47.0% and 28.2% respectively. Conversely, lower counts of Net Reservoir are associated with neurons N17 (8.9%), N16 (7.8%) and N23 (8.0%).
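A cross-tab of this kind can be built by pairing, sample by sample, the neuron extracted along the well path with the discrete Net Reservoir flag. The sketch below uses only the standard library. The function names and the ten samples are hypothetical, and the percentage is read as each neuron's share of all Net Reservoir samples, which is an assumption about how the Table 2 percentages were computed.

```python
from collections import defaultdict


def cross_tab(neurons, net_reservoir):
    """Count Net Reservoir / not reservoir samples for every neuron."""
    table = defaultdict(lambda: {"Net Reservoir": 0, "not reservoir": 0})
    for neuron, is_net in zip(neurons, net_reservoir):
        key = "Net Reservoir" if is_net else "not reservoir"
        table[neuron][key] += 1
    return dict(table)


def net_reservoir_share(table):
    """Percentage of all Net Reservoir samples that fall in each neuron."""
    total = sum(row["Net Reservoir"] for row in table.values())
    return {n: 100.0 * row["Net Reservoir"] / total for n, row in table.items()}


# Hypothetical samples along a well path: neuron number and Net Reservoir flag.
neurons = [21, 21, 21, 22, 22, 17, 16, 23, 23, 21]
is_net = [True, True, True, True, False, True, False, False, False, False]
table = cross_tab(neurons, is_net)
share = net_reservoir_share(table)
```

In this toy example neuron 21 holds three of the five Net Reservoir samples, so its share is 60%.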

Table 2. Cross Tab for Lithological Contrast SOM versus Net reservoir.

Neuron N21 was detected at reservoir depth in wells W-2 (producer), W-4 (abandoned for technical reasons during drilling), W-5 (producer) and W-6 (producer). N21 showed higher percentages of occurrence in Net Reservoir, so this neuron could be identified as indicating the highest storage capacity. N22 was present in wells W-1 and W-6 at target sand depth but also detected in wells W-2, W-4 and W-5 in clay-sandy bodies overlying the highest quality zone in the reservoir. N22 was also detected in the upper section of target sand horizontally navigated by the W-6 well, which has no petrophysical evaluation. N17 was only detected in well W-3, a minor producer of oil, which was sedimentologically cataloged as lobular facies and had the lowest reservoir rock quality. N16 was detected in a very small proportion in wells W-4 (abandoned for technical reasons during drilling) and W-5 (producer). Finally, N23 was only detected towards the top of the sand in well W-6, and in clayey layers overlying it in the other wells. This is consistent with the observed percentage of 8% Net Reservoir, as shown in Table 2.

Chi-Square Independence Hypothesis Testing

After applying the cross-tab evaluation, this classified information was the basis of a Chi-Square test of independence to determine whether there is an association between two categorical variables: Net Reservoir and SOM neurons. That is, the test starts from the assumption that there is no relationship between the variables. The Chi-Square test compared the observed frequencies for each Lithological Contrast neuron with respect to the Net Reservoir variable (grouped in “Net Reservoir” and “no reservoir”) with the theoretically expected frequency distribution under the null hypothesis.

As a starting point, the null hypothesis was formulated as follows: the Lithological Contrast SOM neuron occurrences are independent of the presence of Net Reservoir. If the calculated Chi-Square value is equal to or greater than a critical theoretical value, the null hypothesis must be rejected and, consequently, the alternative hypothesis accepted. Table 3 shows that the calculated Chi-Square is greater than the theoretical critical value (296 ≥ 9.4, with four degrees of freedom and a 5% significance level), so the null hypothesis of the independence of Net Reservoir and SOM neurons is rejected, indicating a relationship between the Net Reservoir and Lithological Contrast SOM variables.
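The test statistic can be computed directly from a contingency table. The sketch below is a plain-Python version of the standard Pearson Chi-Square calculation. The counts are hypothetical (five neurons by two categories, giving four degrees of freedom as in the text) and are not the values of Table 2. The critical value 9.488 is the tabulated Chi-Square quantile for four degrees of freedom at the 5% level, quoted as 9.4 in the text.

```python
def chi_square(observed):
    """Pearson Chi-Square statistic and degrees of freedom.

    observed: list of rows, each a list of counts (contingency table).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            stat += (obs - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, dof


# Hypothetical counts: one row per neuron, columns are
# [Net Reservoir, not reservoir].
observed = [[5, 40], [10, 30], [60, 20], [35, 25], [8, 42]]
stat, dof = chi_square(observed)

CRITICAL_5PCT_DOF4 = 9.488  # tabulated value for 4 degrees of freedom
reject_independence = stat >= CRITICAL_5PCT_DOF4
```

A table whose rows are exactly proportional, such as `[[10, 20], [20, 40]]`, gives a statistic of zero, which is the behavior expected when the two variables are unrelated.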

The test does not report a goodness of fit magnitude (substantial, moderate or poor), however. To measure the degree of correlation between both variables, Pearson’s Phi (φ) and Cramer’s V (ν) measures were computed. Pearson’s φ coefficient was estimated from Eq. 1.1.

$$\varphi = \sqrt{\frac{\chi^2}{n}} \tag{1.1}$$

where χ²: Chi-Square and n: number of cases

Additionally, Cramer’s V was estimated using Eq. 1.2.

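$$\nu = \sqrt{\frac{\chi^2}{n \, \min(r - 1,\; c - 1)}} \tag{1.2}$$

where χ² and n are as in Eq. 1.1, and r and c are the numbers of rows and columns of the contingency table.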

In both cases, values near zero indicate a poor or weak relationship while values close to one indicate a strong relationship. The authors obtained a value of 0.559 for both φ and Cramer's ν (Table 3). Based on this result, a moderate relationship between both variables can be interpreted.
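Both measures follow from the Chi-Square value and the number of cases. The sketch below uses the standard definitions, φ = sqrt(χ²/n) and ν = sqrt(χ²/(n·min(r−1, c−1))). The case count is not stated in the text, so the value of n used here is an assumption chosen only so that the example lands near the reported 0.559.

```python
import math


def phi_coefficient(chi2, n):
    """Pearson's phi from a Chi-Square value and the number of cases."""
    return math.sqrt(chi2 / n)


def cramers_v(chi2, n, n_rows, n_cols):
    """Cramer's V for an n_rows x n_cols contingency table."""
    k = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi2 / (n * k))


chi2 = 296.0      # calculated Chi-Square quoted in the text
n_cases = 947     # assumed: the number of cases is not given in the text

phi = phi_coefficient(chi2, n_cases)
v = cramers_v(chi2, n_cases, n_rows=5, n_cols=2)
```

Because the table has only two columns, min(r−1, c−1) equals 1 and the two measures coincide, which is consistent with the single value reported for both.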

Table 3. Calculated and theoretical Chi-Square values and its correlation measures.

Box-and-Whisker Plots

Box-and-whisker plots were constructed to compare and understand the behavior of petrophysical properties for the range that each neuron intersects the well paths in the SOM volume. Also, these quantify which neurons of interest respond to Net Reservoir and Net Pay properties (Figure 7). Five descriptive measures are shown for a box-and-whisker plot of each property:

• Median (thick black horizontal line)
• First quartile (lower limit of the box)
• Third quartile (upper limit of the box)
• Maximum value (upper end of the whisker)
• Minimum value (lower end of the whisker)

The graphs provide information about data dispersion (the longer the box and whiskers, the greater the dispersion) and about data symmetry. If the median is relatively centered in the box, the distribution is symmetrical. If, on the contrary, it approaches the first or third quartile, the distribution could be skewed toward that quartile. Finally, these graphs identify outlier observations that depart from the rest of the data in an unusual way (these are represented by dots and asterisks as they lie less or more distant from the data center). The horizontal dashed green line is the cut-off value for Effective Porosity (PIGN >0.10), the dashed blue line represents the cut-off value for Clay Volume (VCL>0.45), and the dashed beige line is the cut-off value for Water Saturation (SUWI<0.65).
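The five descriptive measures and the symmetry reading can be reproduced with the standard library. This sketch assumes the "inclusive" quartile method, since the text does not say how quartiles were computed, and the porosity values are hypothetical.

```python
import math
import statistics


def five_number_summary(values):
    """Minimum, first quartile, median, third quartile and maximum."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}


def symmetry_hint(summary):
    """Describe where the median sits inside the box."""
    lower = summary["median"] - summary["q1"]
    upper = summary["q3"] - summary["median"]
    if math.isclose(lower, upper):
        return "roughly symmetric"
    if lower < upper:
        return "median nearer first quartile"
    return "median nearer third quartile"


# Hypothetical Effective Porosity samples intersected by one neuron.
pign_values = [0.08, 0.12, 0.15, 0.18, 0.25]
summary = five_number_summary(pign_values)
hint = symmetry_hint(summary)
```

Comparing `summary["median"]` with a cut-off such as 0.10 for PIGN gives a quick per-neuron check of the kind the plots support visually.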

Based on these data and the resulting analysis, it can be inferred that neurons N16, N17, N21, N22, and N23 respond positively to Net Reservoir. Of these neurons, the most valuable predictors are N21 and N22, since they present lower clay content than neurons N16 and N23 and higher Effective Porosity than neurons N16, N17, and N23 (Figure 7a). Neurons N21 and N22 are ascertained to represent the best reservoir rock quality. Finally, neuron N23 (Figure 7b) can be associated with rock that has storage capacity but is clayey and has high Water Saturation, which allows discarding it as a significant neuron. It is important to note that this analysis was conducted by accounting for the simultaneous occurrence of the petrophysical values (VCL, PIGN, and SUWI), first on the neurons initially intersected (Figure 7a), then on the portion of the neurons that pass Net Reservoir cut-off values (Figure 7b), and finally on the portion of the neurons that pass Net Pay cut-off values (Figure 7c). For all these petrophysical reasons, the neurons to be considered as a reference to estimate the lateral and vertical distribution of Net Reservoir associated with the target sand are, in order of importance, N21, N22, N16, and N17.

Figure 7. Comparison between neurons according to petrophysical properties: VCL (Clay Volume), PIGN (Effective Porosity) and SUWI (Water Saturation). a) SOM neurons for lithological contrast detection, b) Those that pass Net Reservoir cut-off and c) Those that pass Net Pay cut-off.

Simultaneous Seismic Inversion

During this study, a simultaneous prestack inversion was performed using 3D seismic data and sonic logs in order to estimate seismic petrophysical attributes such as Acoustic Impedance (Zp), Shear Impedance (Zs), Density (Rho), and P- and S-wave velocities, among others. These attributes are commonly used as indicators of lithology, possible fluids, and geomechanical properties. Figure 8a shows a scatter plot from well data of the Lambda Rho/Mu Rho ratio versus Clay Volume (VCL), with the Vp/Vs ratio as discriminator. The target sand corresponds to low Vp/Vs and Lambda/Mu values (circled in the figure). Another discriminator in the reservoir was S-wave impedance (Zs) (Figure 8b). From this, seismic inversion attributes were selected for classification by SOM neural network analysis. These attributes were Vp/Vs ratio, Lambda Rho/Mu Rho ratio, and Zs.
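The three attributes fed to the second SOM can all be derived from the inverted impedances. The sketch below uses the standard relations Lambda Rho = Zp² − 2·Zs² and Mu Rho = Zs², together with the fact that density cancels in Zp/Zs so that the ratio equals Vp/Vs. The function name and impedance values are hypothetical.

```python
def inversion_attributes(zp, zs):
    """Derive SOM input attributes from P- and S-impedance.

    zp, zs: acoustic and shear impedance in the same units.
    """
    lambda_rho = zp ** 2 - 2.0 * zs ** 2   # incompressibility times density
    mu_rho = zs ** 2                       # rigidity times density
    return {
        "vp_vs": zp / zs,                  # density cancels in the ratio
        "lambda_mu": lambda_rho / mu_rho,  # Lambda Rho / Mu Rho ratio
        "zs": zs,
    }


# Hypothetical impedances for one sample, in (m/s)(g/cc).
attrs = inversion_attributes(zp=8000.0, zs=4000.0)
```

The two ratios are linked by Lambda/Mu = (Vp/Vs)² − 2, so low values of one imply low values of the other, in line with the target sand plotting low on both.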

Figure 8. Scatter plots: a) Lambda Rho and Mu Rho ratio versus VCL and Vp/Vs and b) Zs versus VCL and Vp/Vs.

Self-Organizing Map (SOM) Comparison

Figure 9 is a plan view of neuron-extracted geobodies associated with the sand reservoir. The upper part shows a SOM classification for Lithological Contrast detection obtained from six traditional seismic attributes; the lower part shows a different SOM classification for Lithological Contrast detection obtained from three attributes of simultaneous inversion. Both results are very similar. The selection of SOM classification neurons from inversion attributes was done through spatial pattern recognition, i.e., identifying the geometry/shape of the clusters related to each of the 25 neurons congruent with the sedimentary model, and by using a stratigraphic well section that includes both SOM classification tracks.

Figure 9. Plan view of neurons with geological meaning. Top: SOM Classification from traditional attributes. Bottom: SOM Classification from simultaneous inversion attributes.

Figure 10 shows a well section that includes a track for Net Reservoir and Net Pay classification along with SOM classifications from traditional attributes and a second SOM from simultaneous inversion attributes, defined from the intersection of the SOM volumes and well paths. Only the neuron numbers with geological meaning are shown.

Figure 10. Well section showing the target zone with tracks for discrete logs from Net Reservoir, Net Pay and both SOM classifications.

Discussion and Conclusions

Principal Component Analysis (PCA) identified the most significant seismic attributes to be classified by a Self-Organizing Maps (SOM) neural network on a single-sample basis to detect features associated with lithological contrast and recognize lateral and vertical extension in the reservoir. The interpretation of SOM classification volumes was supported by multidisciplinary sources (geological, petrophysical, and dynamic data). In this way, the clusters detected by certain neurons became the inputs for geobody interpretation. The statistical analysis and visualization techniques enabled the estimation of Net Reservoir for each neuron. Finally, the extension of reservoir rock geobodies derived from SOM classification of traditional attributes was corroborated by the SOM acting on simultaneous inversion attributes. Multi-attribute machine learning analysis of both traditional attributes and seismic inversion attributes enables refinement of the sedimentary model to reveal more precisely the lateral and vertical distribution of facies. However, the Lithological Contrast SOM results from traditional attributes showed a better level of detail compared with seismic inversion SOM.

Collectively, the workflow may reduce uncertainty in proposing new drilling locations. Additionally, this methodology might be applied using specific attributes to identify faults and fracture zones, identify absorption phenomena, porosity changes, and direct hydrocarbon indicator features, and determine reservoir characteristics.

Acknowledgments

The authors thank Pemex and Oil and Gas Optimization for providing software and technical resources. Thanks also are extended to Geophysical Insights for the research and development of the Paradise® AI workbench and the machine learning applications used in this paper. Finally, the authors thank Reinaldo Michelena, María Jerónimo, Tom Smith, and Hal Green for review of the manuscript.

References

Agresti, A., 2002, Categorical Data Analysis: John Wiley & Sons.

Marroquín, I., J.J. Brault and B. Hart, 2009, A visual data mining methodology to conduct seismic facies analysis: Part 2 – Application to 3D seismic data: Geophysics, 1, 13-23.

Roden, R., T. Smith and D. Sacrey, 2015, Geologic pattern recognition from seismic attributes: Principal component analysis and self-organizing maps: Interpretation, 4, 59-83.

Viloria, R. and M. Taheri, 2002, Metodología para la Integración de la Interpretación Sedimentológica en el Modelaje Estocástico de Facies Sedimentarias, (INT-ID-9973, 2002). Technical Report INTEVEP-PDVSA.

Solving Interpretation Problems using Machine Learning on Multi-Attribute, Sample-Based Seismic Data

Presented by Deborah Sacrey, Owner of Auburn Energy
Challenges addressed in this webinar include:

  • Reducing risk in drilling marginal or dry holes
  • Interpretation of thin bedded reservoirs far below conventional seismic tuning
  • How to better understand reservoir characteristics
  • Interpretation of reservoirs in deep, pressured environments
  • Using the classification process to help with correlations in difficult stratigraphic or structural environments

The webinar is open to those interested in learning more about how the application of machine learning is key to seismic interpretation.

 
Deborah Sacrey

Owner

Auburn Energy

Deborah Sacrey is a geologist/geophysicist with 41 years of oil and gas exploration experience in the Texas, Louisiana Gulf Coast, and Mid-Continent areas of the US. Deborah specializes in 2D and 3D interpretation for clients in the US and internationally.

She received her degree in Geology from the University of Oklahoma in 1976 and began her career with Gulf Oil in Oklahoma City. She started Auburn Energy in 1990 and built her first geophysical workstation using the Kingdom software in 1996. Deborah then worked closely with SMT (now part of IHS) for 18 years developing and testing Kingdom. For the past eight years, she has been part of a team to study and bring the power of multi-attribute neural analysis of seismic data to the geoscience community, guided by Dr. Tom Smith, founder of SMT. Deborah has become an expert in the use of the Paradise® software and has over five discoveries for clients using the technology.

Deborah is very active in the geological community. She is past national President of SIPES (Society of Independent Professional Earth Scientists), past President of the Division of Professional Affairs of AAPG (American Association of Petroleum Geologists), Past Treasurer of AAPG and Past President of the Houston Geological Society. She is currently the incoming President of the Gulf Coast Association of Geological Societies (GCAGS) and is a member of the GCAGS representation on the AAPG Advisory Council. Deborah is also a DPA Certified Petroleum Geologist #4014 and DPA Certified Petroleum Geophysicist #2. She is active in the Houston Geological Society, South Texas Geological Society and the Oklahoma City Geological Society (OCGS).

Machine Learning Revolutionizing Seismic Interpretation

By Thomas A. Smith and Kurt J. Marfurt
Published with permission: The American Oil & Gas Reporter
July 2017

The science of petroleum geophysics is changing, driven by the nature of the technical and business demands facing geoscientists as oil and gas activity pivots toward a new phase of unconventional reservoir development in an economic environment that rewards efficiency and risk mitigation. At the same time, fast-evolving technologies such as machine learning and multiattribute data analysis are introducing powerful new capabilities in investigating and interpreting the seismic record.

Through it all, however, the core mission of the interpreter remains the same as ever: extracting insights from seismic data to describe the subsurface and predict geology between existing well locations, whether they are separated by tens of feet on the same horizontal well pad or tens of miles in adjacent deepwater blocks. Distilled to its fundamental level, the job of the data interpreter is to determine where (and where not) to drill and complete a well. Getting the answer correct to that million-dollar question gives oil and gas companies a competitive edge. The ability to arrive at the right answers in the timeliest manner possible is invariably the force that pushes technological boundaries in seismic imaging and interpretation.

The state of the art in seismic interpretation is being redefined partly by the volume and richness of high-density, full-azimuth 3-D surveying methods and processing techniques such as reverse time migration and anisotropic tomography. Combined, these solutions bring new resolution and clarity to processed subsurface images that simply are unachievable using conventional imaging methods. In data interpretation, analytical tools such as machine learning, pattern recognition, multiattribute analysis and self-organizing maps are enhancing the interpreter’s ability to classify, model and manipulate data in multidimensional space.

As crucial as the technological advancements are, however, it is clear that the future of petroleum geophysics is being shaped largely by the demands of North American unconventional resource plays. Optimizing the economic performance of tight oil and shale gas projects is not only impacting the development of geophysical technology, but also dictating the skill sets that the next generation of successful interpreters must possess. Resource plays shift the focus of geophysics to reservoir development, challenging the relevance of seismic-based methods in an engineering-dominated business environment. Engineering holds the purse strings in resource plays, and the problems geoscientists are asked to solve with 3-D seismic are very different than in conventional exploration geophysics. Identifying shallow drilling hazards overlying a targeted source rock, mapping the orientation of natural fractures or faults, and characterizing changes in stress profiles or rock properties are related as much to engineering as to geophysics.

Given the requirements in unconventional plays, there are four practical steps to creating value with seismic analysis methods. The first and obvious step is for oil and gas companies to acquire 3-D seismic and incorporate the data into their digital databases. Some operators active in unconventional plays fully embrace 3-D technology, while others only apply it selectively. If interpreters do not have access to high-quality data and the tools to evaluate that information, they cannot possibly add value to the company’s bottom line.

The second step is to break the conventional resolution barrier on the seismic reflection wavelet, the so-called quarter-wavelength limit. This barrier is based on the overlapping reflections of seismic energy from the top and bottom of a layer, and depends on layer velocity, thickness, and wavelet frequencies. Below the quarter-wavelength, the wavelets start to overlap in time and interfere with one another, making it impossible by conventional means to resolve separate events.

The third step is correlating seismic reflection data (including compressional wave energy, shear wave energy and density) to quantitative rock property and geomechanical information from geology and petrophysics. Connecting seismic data to the variety of very detailed information available at the borehole lowers risk and provides a clearer picture of the subsurface between wells, which is fundamentally the purpose of acquiring a 3-D survey.

The final step is conducting a broad, multiscaled analysis that fully integrates all available data into a single rock volume encompassing geophysical, geologic and petrophysical features. Whether an unconventional shale or a conventional carbonate, bringing all the data together in a unified rock volume resolves issues in subsurface modeling and enables more realistic interpretations of geological characteristics.

The Role of Technology

Every company faces pressures to economize, and the pressures to run an efficient business only ratchet up at lower commodity prices. The business challenges also relate to the personnel side of the equation, and that should never be dismissed. Companies are trying to bridge the gap between older geoscientists who seemingly know everything and the ones entering the business who have little experience but benefit from mentoring, education and training. One potential solution is using information technology to capture best practices across a business unit, and then keeping a scorecard of those practices in a database that can offer expert recommendations based on past experience. Keylogger applications can help by tracking how experienced geoscientists use data and tools in their day-to-day workflows. However, there is no good substitute for a seasoned interpreter. Technologies such as machine learning and pattern recognition have game-changing possibilities in statistical analysis, but as petroleum geologist Wallace Pratt pointed out in the 1950s, oil is first found in the human mind. The role of computing technology is to augment, not replace, the interpreter’s creativity and intuitive reasoning (i.e., the “geopsychology” of interpretation).

Delivering Value

A self-organizing map (SOM) is a neural network-based, machine learning process that is simultaneously applied to multiple seismic attribute volumes. This example shows a class II amplitude-variation-with-offset response from the top of gas sands, representing the specific conventional geological settings where most direct hydrocarbon indicator characteristics are found. From the top of the producing reservoir, the top image shows a contoured time structure map overlain by amplitudes in color. The bottom image is a SOM classification with low probability (less than 1 percent) denoted by white areas. The yellow line is the downdip edge of the high-amplitude zone designated in the top image.

Consequently, seismic data interpreters need to make the estimates they derive from geophysical data more quantitative and more relatable for the petroleum engineer. Whether it is impedance inversion or anisotropic velocity modeling, the predicted results must add some measure of accuracy and risk estimation. It is not enough to simply predict a higher porosity at a certain reservoir depth. To be of consequence to engineering workflows, porosity predictions must be reliably delivered within a range of a few percentage points at depths estimated on a scale of plus or minus a specific number of feet.

Class II amplitude-variation-with-offset response from the top of gas sand.

Machine learning techniques apply statistics-based algorithms that learn iteratively from the data and adapt independently to produce repeatable results. The goal is to address the big data problem of interpreting massive volumes of data while helping the interpreter better understand the interrelationships of different types of attributes contained within 3-D data. The technology classifies attributes by breaking data into what computer scientists call “objects” to accelerate the evaluation of large datasets and allow the interpreter to reach conclusions much faster.

Some computer scientists believe “deep learning” concepts can be applied directly to 3-D prestack seismic data volumes, with an algorithm figuring out the relations between seismic amplitude data patterns and the desired property of interest. While Amazon, Alphabet and others are successfully using deep learning in marketing and other functions, those applications have access to millions of data interactions a day. Given the significantly smaller number of seismic interpreters in the world, and the much greater sensitivity of 3-D data volumes, there may never be sufficient access to training data to develop deep learning algorithms for 3-D interpretation. The concept of “shallow learning” mitigates this problem.
 
Stratigraphy above the Buda

Conventional amplitude seismic display from a northwest-to-southeast seismic section across a well location is contrasted with SOM results using multiple instantaneous attributes.

First, 3-D seismic data volumes are converted to well-established relations that represent waveform shape, continuity, orientation and response with offsets and azimuths that have proven relations (“attributes”) to porosity, thickness, brittleness, fractures and/or the presence of hydrocarbons. This greatly simplifies the problem, with the machine learning algorithms only needing to find simpler (i.e., shallower) relations between the attributes and properties of interest.

In resource plays, seismic data interpretations increasingly are based on statistical rather than deterministic predictions. In development projects with hundreds of wells within a 3-D seismic survey area, operators rely on the interpreter to identify where to drill and predict how a well will complete and produce. Given the many known and unknown variables that can impact drilling, completion and production performance, the challenge lies with figuring out how to use statistical tools to apply data measurements from the previous wells to estimate the performance of the next well drilled within the 3-D survey area. Therein lies the value proposition of any kind of science, geophysics notwithstanding.

The value of applying machine learning-based interpretation boils down to one word: prediction. The goal is not to score 100 percent accuracy, but to enhance the predictions made from seismic analysis to avoid drilling uneconomic or underproductive wells. Avoiding investments in only a couple of bad wells can pay for all the geophysics needed to make those predictions. And because the statistical models are updated with new data as each well is drilled and completed, the results continually become more quantitative for improved prediction accuracy over time.

New Functionalities

In terms of particular interpretation functionalities, three specific concepts are being developed around machine learning capabilities:

  • Evaluating multiple seismic attributes simultaneously using self-organizing maps (multiattribute analysis);
  • Relating in multidimensional space natural clusters or groupings of attributes that represent geologic information embedded in the data; and
  • Graphically representing the clustered information as geobodies to quantify the relative contributions of each attribute in a given seismic volume in a form that is intrinsic to geoscientific workflows.

A 3-D seismic volume contains numerous attributes, expressed as a mathematical construct representing a class of data from simultaneous analysis. An individual class of data can be any measurable property that is used to identify geologic features, such as rock brittleness, total organic carbon or formation layering. Supported by machine learning and neural networks, multiattribute technology enhances the geoscientist’s ability to quickly investigate large data volumes and delineate anomalies for further analysis, locate fracture trends and sweet spots in shale plays, identify geologic and stratigraphic features, map subtle changes in facies at or even below conventional seismic resolution, and more. The key breakthrough is that the new technology works on machine learning analysis of multiattribute seismic samples.

While applied exclusively to seismic data at present, there are many types of attributes contained within geologic, petrophysical and engineering datasets. In fact, any type of data that can be put into rows and columns on a spreadsheet is applicable to multiattribute analysis. Eventually, multiattribute analysis will incorporate information from different disciplines and allow all of it to be investigated within the same multidimensional space.

That leads to the second concept: using machine learning to organize and evaluate natural clusters of attribute classes. If an interpreter is analyzing eight attributes in an eight-dimensional space, the attributes can be grouped into natural clusters that populate that space. The third component is delivering the information found in the clusters in high-dimensionality space in a form that quantifies the relative contribution of the attributes to the class of data, such as simple geobodies displayed with a 2-D color index map.
This approach allows multiple attributes to be mapped over large areas to obtain a much more complete picture of the subsurface, and has demonstrated the ability to achieve resolution below conventional seismic tuning thickness. For example, in an application in the Eagle Ford Shale in South Texas, multiattribute analysis was able to match 24 classes of attributes within a 150-foot vertical section across 200 square miles of a 3-D survey. Using these results, a stratigraphic diagram of the seismic facies has been developed over the entire survey area to improve geologic predictions between boreholes, and ultimately, correlate seismic facies with rock properties measured at the boreholes. Importantly, the mathematical foundation now exists to demonstrate the relationships of the different attributes and how they tie with pixel components in geobody form using machine learning. Understanding how the attribute data mathematically relate to one another and to geological properties gives geoscientists confidence in the interpretation results.

Leveraging Integration

The term “exploration geophysics” is becoming almost a misnomer in North America, given the focus on unconventional reservoirs, and how seismic methods are being used in these plays to develop rather than find reservoirs. With seismic reflection data being applied across the board in a variety of ways and at different resolutions in unconventional development programs, operators are combining 3-D seismic with data from other disciplines into a single integrated subsurface model. Fully leveraging the new sets of statistical and analytical tools to make better predictions from integrated multidisciplinary datasets is crucial to reducing drilling and completion risk and improving operational decision making. Multidimensional classifiers and attribute selection lists using principal component analysis and independent component analysis can be used with geophysical, geological, engineering, petrophysical and other attributes to create general-purpose multidisciplinary tools of benefit to all oil and gas company departments and disciplines. As noted, the integrated models used in resource plays increasingly are based on statistics, so any evaluation to develop the models also needs to be statistical. In the future, a basic part of conducting a successful analysis will be the ability to understand statistical data and how the data can be organized to build more tightly integrated models. And if oil and gas companies require more integrated interpretations, it follows that interpreters will have to possess more integrated skills and knowledge. The geoscientist of tomorrow may need to be more of a multidisciplinary professional with the blended capabilities of a geologist, geophysicist, engineer and applied statistician. But whether a geoscientist is exploring, appraising or developing reservoirs, he or she only can be as good as the prediction of the final model. 
By applying technologies such as machine learning and multiattribute analysis during the workup, interpreters can use their creative energies to extract more knowledge from their data and make more knowledgeable predictions about undrilled locations.

THOMAS A. SMITH is president and chief executive officer of Geophysical Insights, which he founded in 2008 to develop machine learning processes for multiattribute seismic analysis. Smith founded Seismic Micro-Technology in 1984, focused on personal computer-based seismic interpretation. He began his career in 1971 as a processing geophysicist at Chevron Geophysical. Smith is a recipient of the Society of Exploration Geophysicists’ Enterprise Award, Iowa State University’s Distinguished Alumni Award and the University of Houston’s Distinguished Alumni Award for Natural Sciences and Mathematics. He holds a B.S. and an M.S. in geology from Iowa State, and a Ph.D. in geophysics from the University of Houston.
KURT J. MARFURT is the Frank and Henrietta Schultz Chair and Professor of Geophysics in the ConocoPhillips School of Geology & Geophysics at the University of Oklahoma. He has devoted his career to seismic processing, seismic interpretation and reservoir characterization, including attribute analysis, multicomponent 3-D, coherence and spectral decomposition. Marfurt began his career at Amoco in 1981. After 18 years of service in geophysical research, he became director of the University of Houston’s Center for Applied Geosciences & Energy. He joined the University of Oklahoma in 2007. Marfurt holds an M.S. and a Ph.D. in applied geophysics from Columbia University.

Geobody Interpretation Through Multi-Attribute Surveys, Natural Clusters and Machine Learning

By Thomas A. Smith 
June 2017

Summary

Multi-attribute seismic samples (even as entire attribute surveys), Principal Component Analysis (PCA), attribute selection lists, and natural clusters in attribute space are candidate inputs to machine learning engines that can operate on these data to train neural network topologies and generate autopicked geobodies. This paper sets out a unified mathematical framework for the process from seismic samples to geobodies.  SOM is discussed in the context of inversion as a dimensionality-reducing classifier to deliver a winning neuron set.  PCA is a means to more clearly illuminate features of a particular class of geologic geobodies.  These principles are demonstrated with geobody autopicking below conventional thin bed resolution on a standard wedge model.

Introduction

Seismic attributes are now an integral component of nearly every 3D seismic interpretation.  Early development in seismic attributes is traced to Taner and Sheriff (1977).  Attributes have a variety of purposes for both general exploration and reservoir characterization, as laid out clearly by Chopra and Marfurt (2007).  Taner (2003) summarizes attribute mathematics with a discussion of usage.

Self-Organizing Maps (SOM) are a type of unsupervised neural network that self-trains in the sense that it obtains information directly from the data. The SOM neural network is completely self-taught, in contrast to the perceptron and its various cousins, which undergo supervised training. The winning neuron set that results from training then classifies the training samples to test itself by finding the nearest neuron to each training sample (the winning neuron). In addition, other data may be classified as well. First discovered by Kohonen (1984), then advanced and expanded by its success in a number of areas (Kohonen, 2001; Laaksonen, 2011), SOM has become a part of several established neural network textbooks, namely Haykin (2009) and Duda, Hart and Stork (2001). Although the style of SOM discussed here has been used commercially for several years, only recently have results on conventional DHI plays been published (Roden, Smith and Sacrey, 2015).

Three Spaces

The concept of framing seismic attributes as multi-attribute seismic samples for SOM training and classification was presented by Taner, Treitel, and Smith (2009) in an SEG Workshop.  In that presentation, survey data and their computed attributes reside in survey space.  The neural network resides in neuron topology space.  These two meet in attribute space where neurons hunt for natural clusters and learn their characteristics.

Results were shown for 3D surveys over the venerable Stratton Field and a Gulf of Mexico salt dome.  The Stratton Field SOM results clearly demonstrated that there are continuous geobody events in the weak reflectivity zone between C38 and F11 events, some of which are well below seismic tuning thickness, that could be tied to conventional reflections and which correlated with wireline logs at the wells.  Studies of SOM machine learning of seismic models were presented by Smith and Taner (2010).  They showed how winning neurons distribute themselves in attribute space in proportion to the density of multi-attribute samples.  Finally, interpretation of SOM salt dome results found a low probability zone where multi-attribute samples of poor fit correlated with an apparent salt seal and DHI down-dip conformance (Smith and Treitel, 2010).

Survey Space to Attribute Space:

Ordinary seismic samples of amplitude traces in a 3D survey may be described as an ordered set with three subscripts (sample, trace and line). A multi-attribute survey is a “Super 3D Survey” constructed by combining a number of attribute surveys with the amplitude survey. This adds another dimension to the set and another subscript, so the new set of samples including the additional attributes carries four subscripts. These data may be thought of as separate surveys or equivalently separate samples within one survey. Within a single survey, each sample is a multi-attribute vector. This reduces the subscript count by one, so the set of multi-attribute vectors again carries three subscripts.

Next, a two-way mapping function may be defined that references the location of any sample in the 3D survey by single and triplet indices, i ↔ (c,d,e). Now the three survey coordinates may be gathered into a single index so the multi-attribute vector samples are also an unordered set in attribute space, indexed by i. The index map is a way to find a sample in attribute space from survey space and vice versa.
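The two-way index map can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, indices are zero-based (the text uses ranges starting at 1), and the sample index c is assumed to vary fastest, which the paper does not specify.

```python
def to_single_index(c, d, e, C, D, E):
    """Map a (sample, trace, line) triplet to a single index i.

    Assumes zero-based indices with the sample index c varying fastest.
    """
    if not (0 <= c < C and 0 <= d < D and 0 <= e < E):
        raise IndexError("triplet lies outside the survey")
    return (e * D + d) * C + c


def to_triplet(i, C, D, E):
    """Inverse map: recover (c, d, e) in survey space from the single index i."""
    if not 0 <= i < C * D * E:
        raise IndexError("index lies outside the survey")
    c = i % C
    d = (i // C) % D
    e = i // (C * D)
    return c, d, e
```

Any one-to-one ordering would serve equally well; what matters is that a sample found in attribute space can always be traced back to its position in the survey.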

Multi-attribute sample and set in attribute space: 

A multi-attribute seismic sample is a column vector in an ordered set of three subscripts c,d,e representing sample index, trace index, and line index. Survey bins refer to indices d and e. These samples may also be organized into an unordered set with subscript i. They are members of an F-dimensional real space. The attribute data are normalized so in fact multi-attribute samples reside in scaled attribute space.
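The text does not say which normalization is used. As one common possibility, the sketch below scales each attribute to zero mean and unit standard deviation (a z-score); both the choice of z-score and the function name are assumptions made for illustration.

```python
from statistics import mean, pstdev


def scale_attributes(samples):
    """Scale each attribute (column) to zero mean and unit standard deviation.

    `samples` is a list of multi-attribute samples, one list of values per sample.
    Z-score scaling is an assumed choice; the source only says "normalized".
    """
    columns = list(zip(*samples))
    means = [mean(col) for col in columns]
    # A constant attribute has zero spread; fall back to 1.0 to avoid dividing by zero.
    spreads = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, spreads)]
        for row in samples
    ]
```

Scaling of this kind keeps an attribute with large numerical values from dominating the Euclidean distances used later in training.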

Natural clusters in attribute space: 

Just as there are reflecting horizons in survey space, there must be clusters of coherent energy in attribute space. Random samples, which carry no information, are uniformly distributed in attribute space just as in survey space. The set {C_m, m ∈ [1, M]} of natural clusters in attribute space is unordered and contains M members. Here, the brackets [1, M] indicate an index range. The natural clusters may reside anywhere in attribute space, but attribute space is filled with multi-attribute samples, only some of which are meaningful natural clusters. Natural clusters may be big or small, tightly packed or diffuse. The rest of the samples are scattered throughout F-space. Natural clusters are discovered in attribute space with learning machines imbued with simple training rules and aided by properties of their neural networks.

A single natural cluster: 

A natural cluster C_m may have N elements in it. Every natural cluster is expected to have a different number of multi-attribute samples associated with it. Each element is taken from the pool of the set of all multi-attribute samples. Every natural cluster may have a different number of multi-attribute samples associated with it, so for any natural cluster C_m, N = N(m). Every natural cluster has its own unique properties described by the subset of samples that are associated with it. Some sample subsets associated with a winning neuron are small (“not so popular”) and some subsets are large (“very popular”). The distribution of Euclidean distances may be tight (“packed”) or loose (“diffuse”).
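Popularity and tightness can be made concrete with a small sketch. It assumes a set of trained neurons is already available and reports, for each winning neuron, the number of samples it attracts and their mean Euclidean distance. The function name and the use of the mean as a tightness measure are illustrative assumptions.

```python
from collections import defaultdict
from math import dist


def cluster_statistics(neurons, samples):
    """Return {neuron index: (sample count, mean distance to the neuron)}.

    The count measures how "popular" a winning neuron is; the mean distance
    indicates whether its samples are packed or diffuse.
    """
    distances = defaultdict(list)
    for x in samples:
        # The winning neuron is the one nearest the sample.
        winner = min(range(len(neurons)), key=lambda j: dist(neurons[j], x))
        distances[winner].append(dist(neurons[winner], x))
    return {j: (len(ds), sum(ds) / len(ds)) for j, ds in distances.items()}
```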

Geobody sample and geobody set in survey space: 

For this presentation, a geobody G_b is defined as a contiguous region in survey space composed of elements which are identified by members g. The members of a geobody are an ordered set which registers with those coordinates of members of the multi-attribute seismic survey.

A geobody member is just an identification number (id), an integer .  Although the 3D seismic survey is a fully populated “brick” with members ,  the geobody members  register at certain contiguous locations, but not all of them.  The geobody  is an amorphous, but contiguous, “blob” within the “brick” of the 3D survey.  The coordinates of the geobody blob in the earth are  where  By this, all the multi-attribute samples in the geobody may be found, given the id and three survey coordinates of a seed point.
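Finding a geobody from a seed point can be sketched as a flood fill over the classified survey. The sketch assumes the classification is held as a dictionary from (c, d, e) coordinates to an integer id, and that "contiguous" means sharing a face (six neighbors); both are assumptions for illustration, and the function name is hypothetical.

```python
from collections import deque


def pick_geobody(ids, seed):
    """Collect every coordinate connected to `seed` that carries the same id.

    `ids` maps (c, d, e) survey coordinates to an integer id.
    Connectivity is assumed to be face-sharing (six neighbors).
    """
    target = ids[seed]
    body = {seed}
    queue = deque([seed])
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        c, d, e = queue.popleft()
        for dc, dd, de in steps:
            neighbor = (c + dc, d + dd, e + de)
            # Coordinates outside the survey are absent from `ids` and never match.
            if neighbor not in body and ids.get(neighbor) == target:
                body.add(neighbor)
                queue.append(neighbor)
    return body
```

Samples with the same id that are not connected to the seed are left out, which is what separates one "blob" from another that happens to share its classification.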

A single geobody in survey space

Each geobody G_b is a set of N geobody members with the same id. That is, there are N members in G_b, so N = N(b). The geobody members for this geobody are taken from the pool of all geobody samples. Some geobodies are small and others large. Some are tabular, some lenticular, some channels, faults, columns, etc. So how are geobodies and natural clusters related?

A geobody is not a natural cluster

{G_b, b ∈ [1, B]} ≠ {C_m, m ∈ [1, M]}

This expression is short but sweet. It says a lot. On the left is the set of all B geobodies. On the right is the set of M natural clusters. The expression says that these two sets aren’t the same. On the left, the geobody members are id numbers. These are in survey space. On the right are the natural clusters. These are in attribute space. What this means is that geobodies are not directly revealed by natural clusters. So, what is missing?

Interpretation is conducted in survey space.  Machine learning is conducted in attribute space.  Someone has to pick the list of attributes.  The attributes must be tailored to the geological question at hand.  And a good geological question is always the best starting point for any interpretation.

A natural cluster is an imaged geobody

Here, a natural cluster C_m is defined as an unorganized set of two kinds of objects: a function I of a set of geobodies G_i and random noise N. The number of geobodies is I and unspecified. The function I is an illumination function which places the geobodies in F-space. The illumination function is defined by the choice of attributes. This is the attribute selection list. The number of geobodies in a natural cluster C_m is zero or more, 0<i<I. The geobodies are distributed throughout the 3D survey.

The natural cluster concentrates geobodies of similar illumination properties. If there are no geobodies or there is no illumination with a particular attribute selection list, C_m = {N}, so the set is only noise. The attribute selection list is a critically important part of multi-attribute seismic interpretation. The wrong attribute list may not illuminate any geobodies at all.

Geobody inversion from a math perspective

Multi-attribute seismic interpretation proceeds from the preceding equation in three parts. First, as part of an inversion process, a natural cluster C_m is statistically estimated by a machine learning classifier such as SOM with a neural network topology. See Chopra, Castagna and Portniaguine (2006) for a contrasting inversion methodology. Secondly, SOM employs a simple training rule that a neuron nearest a selected training sample is declared the winner and the winning neuron advances toward the sample a small amount. Neurons are trained by attraction to samples. One complete pass through the training samples is called an epoch. Other machine learning algorithms have other training rules to adapt to data. Finally, SOM has a dimensionality reducing feature because information contained in natural clusters is transferred (imperfectly) to the winning neuron set in the finalized neural network topology through cooperative learning. Neurons in the winning neuron’s topological neighborhood move along with the winning neuron in attribute space. SOM training is also dynamic in that the size of the neighborhood decreases with each training time step, so that eventually the neighborhood shrinks to the point where all subsequent training steps are competitive.
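The training rule described above can be sketched as follows. This is a minimal illustration, not the commercial implementation: it assumes a one-dimensional neuron topology, a fixed learning rate, neighbors that move less the farther they sit from the winner, and a neighborhood radius that shrinks once per epoch rather than at every training step. All names and parameter values are hypothetical.

```python
import random


def winning_neuron(neurons, x):
    """Index of the neuron nearest to sample x (squared Euclidean distance)."""
    return min(
        range(len(neurons)),
        key=lambda j: sum((w - xi) ** 2 for w, xi in zip(neurons[j], x)),
    )


def train_som(samples, n_neurons=4, epochs=20, rate=0.3, seed=0):
    """Train a 1-D chain of neurons by attraction to the samples."""
    rng = random.Random(seed)
    # Start the neurons on evenly spaced training samples.
    neurons = [list(samples[j * len(samples) // n_neurons]) for j in range(n_neurons)]
    for epoch in range(epochs):
        # Cooperative at first; radius 0 in the last epochs is purely competitive.
        radius = round((n_neurons // 2) * (1 - epoch / max(1, epochs - 1)))
        order = list(samples)
        rng.shuffle(order)
        for x in order:  # one full pass through the samples is an epoch
            win = winning_neuron(neurons, x)
            for j in range(max(0, win - radius), min(n_neurons, win + radius + 1)):
                step = rate / (1 + abs(j - win))  # neighbors advance less than the winner
                neurons[j] = [w + step * (xi - w) for w, xi in zip(neurons[j], x)]
    return neurons
```

After training, `winning_neuron` classifies any sample, whether or not it was part of the training set, which corresponds to the classification step described in the Introduction.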

Because  is a statistical estimate, let it be called the statistical estimate of the “signal” part of .  The true geobody is independent of an illumination function.  The dimensionality reduction   associated with multi-attribute interpretation has a purpose of geobody recognition through identification, dimensionality reduction and classification.  In fact, in the chain of steps there is a mapping and un-mapping process with no guarantee that the geobody will be recovered: 

However, the image function may be inappropriate to illuminate the geobody in F-space because of a poor choice of attributes. So at best, the geobody is illuminated by an imperfect set of attributes and detected by a classifier that is primitive. The results often must be combined, edited and packaged into useful, interpreted geobody units, ready to be incorporated into an evolving geomodel on which the interpretation will rest.

Attribute Space Illumination

One fundamental aspect of machine learning is dimensionality reduction from attribute space because its dimensions are usually beyond our grasp.  The approach taken here is from the perspective of manifolds which are defined as spaces with the property of “mapability” where Euclidean coordinates may be safely employed within any local neighborhood (Haykin, 2009, p.437-442).

The manifold assumption is important because SOM learning is routinely conducted on multi-attribute samples in attribute space using Euclidean distances to move neurons during training. One of the first concerns of dimensionality reduction is the potential to lose details in natural clusters. In practice, it has been found that halving the original amplitude sample interval is advantageous, but further resampling to finer intervals has not proven to be beneficial. Infilling a natural cluster allows neurons during competitive training to adapt to subtle details that might be missed in the original data.

Curse of Dimensionality

The Curse of Dimensionality (Haykin, 2009) is, in fact, many curses. One problem is that uniformly sampled space increases dramatically with increasing dimensionality. This has implications when gathering training samples for a neural network. For example, cutting a unit length bar (1-D) with a sample interval of 0.01 results in 100 samples. Dividing a unit length hypercube in 10-D with a similar sample interval results in 10^20 samples (10^(10 x 2)). If the nature of attribute space requires uniform sampling across a broad numerical range, then a large number of attributes may not be practical. However, uniform sampling is not an issue here because the objective is to locate and detail features of natural clusters.
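The arithmetic is easy to check. The short sketch below (function name hypothetical) counts the samples needed to cover a unit hypercube uniformly at a given interval.

```python
def uniform_sample_count(interval, dimensions):
    """Samples needed to cover a unit hypercube uniformly at the given interval."""
    per_axis = round(1 / interval)  # 100 samples per axis for an interval of 0.01
    return per_axis ** dimensions


bar_samples = uniform_sample_count(0.01, 1)         # 100
hypercube_samples = uniform_sample_count(0.01, 10)  # 100**10, i.e. 10**20
```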

Also, not all attributes are important. In the hunt for natural clusters, PCA (Haykin, 2009) is often a valuable tool to assess the relative merits of each attribute in a SOM attribute selection list. Depending on geologic objectives, several dominant attributes may be picked from the first, second or even third principal eigenvectors, or all attributes may be picked from one principal eigenvector.
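One way to see how PCA can rank attributes is to compute the first principal eigenvector of the attribute covariance matrix and sort the attributes by the size of their loadings. The sketch below uses plain power iteration to stay dependency-free; the function names are hypothetical, and ranking by absolute loading on a single eigenvector is a simplification of what an interpreter would actually weigh.

```python
def covariance(samples):
    """Sample covariance matrix of the attributes (columns) over all samples (rows)."""
    n, dim = len(samples), len(samples[0])
    means = [sum(row[k] for row in samples) / n for k in range(dim)]
    centered = [[row[k] - means[k] for k in range(dim)] for row in samples]
    return [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(dim)]
        for a in range(dim)
    ]


def first_eigenvector(matrix, iterations=200):
    """Dominant eigenvector of a symmetric matrix by power iteration."""
    dim = len(matrix)
    v = [1.0 / dim ** 0.5] * dim
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(dim)) for a in range(dim)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def rank_attributes(names, samples):
    """Attribute names sorted by absolute loading on the first principal eigenvector."""
    loadings = first_eigenvector(covariance(samples))
    return sorted(zip(names, (abs(x) for x in loadings)), key=lambda p: p[1], reverse=True)
```

Repeating the ranking on the second or third eigenvector would require deflating the covariance matrix or using a full eigen-decomposition from a numerical library.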

Geobody inversion from an interpretation perspective

Multi-attribute seismic interpretation is finding geobodies in survey space aided by machine learning tools that hunt for natural clusters in attribute space.  The interpreter’s critical role in this process is the following:

  • Choose questions that carry exploration toward meaningful conclusions.
  • Be creative with seismic attributes so as to effectively address illumination of geologic geobodies.
  • Pick attribute selection lists with the assistance of PCA.
  • Review the results of machine learning which may identify interesting geobodies in natural clusters autopicked by SOM.
  • Look through the noise to edit and build geobodies with a workbench of visualization displays and a variety of statistical decision-making tools.
  • Construct geomodels by combining autopicked geobodies which in turn allow predictions on where to make better drilling decisions.

The Geomodel

After classification, picking geobodies from their winning neurons starts by filling an empty geomodel.  Natural clusters are consolidators of geobodies with common properties in attribute space, so M < B.  In fact, it is often found that M << B.  That is, geobodies “stack” in attribute space.  Seismic data is noisy, so natural clusters are consequently statistical.  Although SOM classifies every sample, not every sample g classified by a winning neuron is important; samples that are a poor fit are probably noise.  Construction of a sensible geomodel depends on posing well-thought-out geological questions, which are phrased through the choice of appropriate attribute selection lists.
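The idea that poorly fitting samples are probably noise can be sketched as follows. The fit measure (Euclidean distance to the winning neuron), the cutoff `max_distance` and the function `classify_with_fit` are assumptions for illustration, not the statistical tools described in the source.

```python
import numpy as np

def classify_with_fit(samples, neurons):
    """Return (winners, fit): the winning neuron index and its distance for each sample."""
    # Pairwise Euclidean distances, shape (n_samples, n_neurons).
    d = np.linalg.norm(samples[:, None, :] - neurons[None, :, :], axis=2)
    return d.argmin(axis=1), d.min(axis=1)

# Two hypothetical trained neurons and four samples in a 2-attribute space.
neurons = np.array([[0.0, 0.0], [5.0, 5.0]])
samples = np.array([[0.1, -0.1], [4.9, 5.2], [2.5, 2.4], [0.0, 0.2]])
winners, fit = classify_with_fit(samples, neurons)

# Every sample gets a winning neuron, but only well-fitting samples enter the geomodel.
max_distance = 1.0  # hypothetical cutoff; in practice an interpreter's decision
keep = fit <= max_distance
```

The third sample lies far from both neurons, so it is still classified but would be excluded when building geobodies under this assumed cutoff.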

Working below classic seismic tuning thickness

Classical seismic tuning thickness is λ/4.  Combining the vertical-incidence layer thickness with λ = V/f leads to a critical layer thickness of λ/4 = V/(4f).  Resolution below classical seismic tuning thickness has been demonstrated with multi-attribute seismic samples and a machine learning classifier operating on those samples in scaled attribute space (Roden et al., 2015).  High-quality natural clusters in attribute space imply tight, dense balls (low entropy, high density).  SOM training and classification of a classical wedge model at three noise levels are shown in Figures 1 and 2, which show tracking well below tuning thickness.
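As a worked example, the classical λ/4 criterion together with λ = V/f gives a tuning thickness of V/(4f). The velocity and dominant frequency below are hypothetical values chosen only to make the arithmetic concrete; they are not taken from the study.

```python
def tuning_thickness(velocity_m_s: float, dominant_frequency_hz: float) -> float:
    """Classical quarter-wavelength tuning thickness in meters."""
    wavelength = velocity_m_s / dominant_frequency_hz  # lambda = V / f
    return wavelength / 4.0

# Hypothetical interval velocity of 3000 m/s and dominant frequency of 30 Hz.
thickness = tuning_thickness(3000.0, 30.0)
```

With these assumed values the wavelength is 100 m and the tuning thickness is 25 m, so "below tuning" would refer to beds thinner than about 25 m.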

  • Seismic Processing: Processing the survey at a fine sample interval is preferred over resampling the final survey to a fine sample interval. The highest S/N ratio is always preferred.
  • Preprocessing: To raise the density of natural clusters, resample the base survey to a fine sample interval and then compute attributes; do not compute attributes and then resample, because some attributes are not continuous functions. Derive all attributes from a single base survey in order to avoid misties.
  • Attribute Selection List: Prefer attributes that address the specific properties of an intended geobody. When working below tuning, prefer instantaneous attributes over attributes requiring spatial sampling.

Thin-bed results on 3D surveys in the Eagle Ford Shale facies of South Texas and in the Alibel horizon of the Middle Frio Group, onshore Texas, were corroborated with extensive well control, verifying consistent results for more accurate mapping of facies below tuning without the usual frequency assumptions (Roden, Smith, Santogrossi and Sacrey, personal communication, 2017).
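The preprocessing guideline above (resample first, then compute attributes) can be illustrated with a toy case. Instantaneous phase is used here as an assumed example of an attribute that is not a continuous function, because it wraps at ±π; the two phase values and the simple midpoint interpolation are hypothetical.

```python
import numpy as np

# Two adjacent samples of instantaneous phase on either side of the +/-pi wrap.
phase_a = np.pi - 0.1
phase_b = -np.pi + 0.1

# Wrong order: interpolate the already-computed attribute.
# The midpoint of the wrapped values lands near 0, far from both samples.
naive_mid = 0.5 * (phase_a + phase_b)

# Right order: interpolate the continuous quantities (real and quadrature parts
# of a unit-amplitude analytic signal), then compute the attribute.
real_mid = 0.5 * (np.cos(phase_a) + np.cos(phase_b))
imag_mid = 0.5 * (np.sin(phase_a) + np.sin(phase_b))
correct_mid = np.arctan2(imag_mid, real_mid)
```

The naive midpoint is about 0 radians, whereas the phase computed after interpolation stays at the wrap, close to ±π, which is consistent with the neighboring samples.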

Conclusion

There is a firm mathematical basis for a unified treatment of multi-attribute seismic samples, natural clusters, geobodies and machine learning classifiers such as SOM.  Interpretation of multi-attribute seismic data is showing great promise, having demonstrated resolution well below conventional seismic thin bed resolution due to high-quality natural clusters in attribute space which have been detected by a robust classifier such as SOM.

Acknowledgments

I am thankful to have worked with two great geoscientists, Tury Taner and Sven Treitel during the genesis of these ideas.  I am also grateful to work with an inspired and inspiring team of coworkers who are equally committed to excellence.  In particular, Rocky Roden and Deborah Sacrey are longstanding associates with a shared curiosity to understand things and colleagues of a hunter’s spirit.

Figure 1: Wedge models for three noise levels trained and classified by SOM with attribute list of amplitude and Hilbert transform (not shown) on 8 x 8 hexagonal neuron topology. Upper displays are amplitude. Middle displays are SOM classifications with a smooth color map. Lower displays are SOM classifications with a random color map. The rightmost vertical column is an enlargement of wedge model tips at highest noise level.  Multi-attribute classification samples are clearly tracking well below tuning thickness which is left of the center in the right column displays.

Figure 2: Attribute space for three wedge models with horizontal axis of amplitude and vertical axis of Hilbert transform. Upper displays are multi-attribute samples before SOM training and lower displays after training and samples classified by winning neurons in lower left with smooth color map.  Upper right is an enlargement of tip of third noise level wedge model from Figure 1 where below-tuning bed thickness is right of the thick vertical black line.

References

Chopra, S., J. Castagna and O. Portniaguine, 2006, Thin-bed reflectivity inversion, Extended abstracts, SEG Annual Meeting, New Orleans.

Chopra, S. and K.J. Marfurt, 2007, Seismic attributes for prospect identification and reservoir characterization, Geophysical Developments No. 11, SEG.

Duda, R.O., P.E. Hart and D.G. Stork, 2001, Pattern Classification, 2nd ed.: Wiley.

Haykin, S., 2009, Neural networks and learning machines, 3rd ed.: Pearson.

Kohonen, T., 1984, Self-organization and associative memory, pp 125-245. Springer-Verlag. Berlin.

Kohonen, T., 2001, Self-organizing maps: Third extended edition, Springer Series in Information Sciences.

Laaksonen, J. and T. Honkela, 2011, Advances in self-organizing maps, 8th International Workshop, WSOM 2011 Espoo, Finland, Springer.

Ma, Y. and Y. Fu, 2012, Manifold Learning Theory and Applications, CRC Press, Boca Raton.

Roden, R., T. Smith and D. Sacrey, 2015, Geologic pattern recognition from seismic attributes, principal component analysis and self-organizing maps, Interpretation, SEG, November 2015, SAE59-83.

Smith, T., and M.T. Taner, 2010, Natural clusters in multi-attribute seismics found with self-organizing maps: Source and signal processing section paper 5: Presented at Robinson-Treitel Spring Symposium by GSH/SEG, Extended Abstracts.

Smith, T. and S. Treitel, 2010, Self-organizing artificial neural nets for automatic anomaly identification, Expanded abstracts, SEG Annual Convention, Denver.

Taner, M.T., 2003, Attributes revisited, http://www.rocksolidimages.com/attributes-revisited/, accessed 22 March 2017.

Taner, M.T., and R.E. Sheriff, 1977, Application of amplitude, frequency, and other attributes, to stratigraphic and hydrocarbon determination, in C.E. Payton, ed., Applications to hydrocarbon exploration: AAPG Memoir 26, 301–327.

Taner, M.T., S. Treitel, and T. Smith, 2009, Self-organizing maps of multi-attribute 3D seismic reflection surveys, Presented at the 79th International SEG Convention, SEG 2009 Workshop on “What’s New in Seismic Interpretation,” Paper no. 6.

THOMAS A. SMITH is president and chief executive officer of Geophysical Insights, which he founded in 2008 to develop machine learning processes for multiattribute seismic analysis. Smith founded Seismic Micro-Technology in 1984, focused on personal computer-based seismic interpretation. He began his career in 1971 as a processing geophysicist at Chevron Geophysical. Smith is a recipient of the Society of Exploration Geophysicists’ Enterprise Award, Iowa State University’s Distinguished Alumni Award and the University of Houston’s Distinguished Alumni Award for Natural Sciences and Mathematics. He holds a B.S. and an M.S. in geology from Iowa State, and a Ph.D. in geophysics from the University of Houston.