Machine Learning Essentials for Seismic Interpretation – an e-Course by Dr. Tom Smith

Machine learning is foundational to the digital transformation of the oil & gas industry and will have a dramatic impact on the exploration and production of hydrocarbons.  Dr. Tom Smith, the founder and CEO of Geophysical Insights, conducts a comprehensive survey of machine learning technology and its applications in this 24-part series.  The course will benefit geoscientists, engineers, and data analysts at all experience levels, from data analysts who want to better understand applications of machine learning to geoscience, to senior geophysicists with deep experience in the field.

Aspects of supervised learning, unsupervised learning, classification and reclassification are introduced to illustrate how they work on seismic data.  Machine learning is presented not as an end-all-be-all, but as a new set of tools that enables interpretation of seismic data at a new, higher level of abstraction, one that promises to reduce risk and identify features that might otherwise be missed.

The following major topics are covered:

  • Operation  – supervised and unsupervised learning; buzzwords; examples
  • Foundation  – seismic processing for ML; attribute selection list objectives; principal component analysis
  • Practice  – geobodies; below-tuning; fluid contacts; making predictions
  • Prediction – the best well; the best seismic processing; over-fitting; cross-validation; who makes the best predictions?

This course can be taken for certification, or for informational purposes only (without certification). 

Enroll today for this valuable e-course from Geophysical Insights!

The Holy Grail of Machine Learning in Seismic Interpretation

A few years ago, we had geophysics and geology – two distinct disciplines that were well defined. Then, geoscience came along, and it was an amalgam of geology and geophysics.  Many people started calling themselves geoscientists as opposed to “geologist” or “geophysicist”. But the changes weren’t quite finished. Along came a qualifying adjective, “unconventional”, as in unconventional resource development or unconventional exploration. We understand how to do exploration; unconventional has to do with understanding shale and finding sweet spots, but it is still a type of exploration.  By joining unconventional and resource development, we broaden what we do as professionals.  However, the mindset of unconventional geophysics is really closer to mining geophysics than it is to conventional exploration.

So, today’s topic has to do with the “holy grail” of machine learning in seismic interpretation.  We’re trying to tie this to seismic interpretation only.  Even if that’s a pretty big topic, we’re going to focus on a few highlights.  I can’t even summarize machine learning for seismic interpretation.  It’s already too big!  Nearly every company is investigating or applying machine learning these days.  So, for this talk I’m just going to have to focus on this narrow topic of machine learning in seismic interpretation and hit a few highlights.

Let’s start at 50,000 feet – way up at the top.  If you’ve been intimidated by this machine learning stuff, let’s define terms.  Machine learning is an engine.  It’s an algorithm that learns without explicit programming. That’s really fundamental. What does that mean? It means an algorithm that learns from the data: given one set of data, it will come up with an answer, and given a different set of data, it will come up with a different answer.  The whole field of artificial intelligence is broken up into strong AI and narrow AI.  Strong AI is coming up with a robot that looks and behaves like a person. Narrow AI attempts to duplicate the brain’s neurological processes that have been perfected over millions of years of biological development. A self-organizing map, or SOM, is a type of neural network that adjusts to training data.  However, it makes no assumptions about the characteristics of the data.

So, if you look at the whole field of artificial intelligence, and then we look at machine learning as a subset of that, there are two parts: unsupervised neural networks and supervised neural networks.  Unsupervised is where you feed it the data and say “you go figure it out.”  In supervised neural networks, you give it both the data and the right answer. Some examples of supervised neural networks would be convolutional neural networks and deep learning algorithms.  Convolutional is a more classical type of a supervised neural network, where for every data sample, we know the answer.  So, a data sample might be ‘we have x, y, and z properties, and by the way, we know what the classification is a priori.’

A classical example of a supervised neural network would be this: Your uncle just passed away and gave you the canning operations in Cordova, Alaska.  You go to the plant to see what you’ve inherited. Let’s say you’ve got all these people standing at a beltline manually sorting fish, and they’ve got buckets for eels, and buckets for flounder, etc. Being a great geoscientist, you recognize this as an opportunity to apply machine learning and possibly re-assign those people to more productive tasks. As the fish come along, you weigh them, you take a picture of them, you see what the scales are, the general texture, and you get some idea about their general shape.  What I’ve described are three properties, or attributes. Perhaps you add more attributes and are up to four or five. Now, we have 5 attributes that define each type of fish, so in mathematical terms, we’re now dealing with a five-dimensional problem. We call this ‘Attribute Space’. Pretty soon, you run through all the eels and you get measurements for each eel.  So, you get the neural network trained on eels. And then you run through all the flounder. And guess what – there’s going to be variations, of course, but those four or five measurements that we made for each type of fish are going to wind up in a different cluster in Attribute Space. And that’s how we tell the difference between eels and flounder. Or whatever else you’ve got.  And everything else that you can’t classify very well goes into a bucket that is labeled ‘unclassified’. (More on this later in the presentation.) And, you put that into your algorithm.

So that’s basically the difference between supervised neural networks and unsupervised neural networks. Deep learning is a category of neural networks that can operate in both supervised and unsupervised discovery.
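
To make the fish-sorting idea concrete, here is a minimal sketch of supervised classification in attribute space. It is not the method used in the talk: it swaps the neural network for a simple nearest-centroid rule, and the three attributes, the measurement values, and the `max_distance` cutoff for the ‘unclassified’ bucket are all hypothetical.

```python
import math

def distance(u, v):
    """Euclidean distance between two attribute vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def centroid(vectors):
    """Mean attribute vector of one labelled class."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def train(labelled):
    """labelled: dict of class name -> list of attribute vectors with known answers."""
    return {name: centroid(vectors) for name, vectors in labelled.items()}

def classify(model, sample, max_distance):
    """Nearest class centre, or 'unclassified' when nothing is close enough."""
    best_name, best_dist = None, math.inf
    for name, centre in model.items():
        d = distance(sample, centre)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name if best_dist <= max_distance else "unclassified"

# Hypothetical training measurements: (weight in kg, length in m, width/length ratio)
training = {
    "eel": [(1.0, 0.80, 0.05), (1.2, 0.90, 0.06), (0.9, 0.75, 0.05)],
    "flounder": [(2.0, 0.40, 0.60), (2.3, 0.45, 0.65), (1.8, 0.38, 0.58)],
}
model = train(training)

for fish in [(1.1, 0.85, 0.05), (2.1, 0.42, 0.62), (9.0, 2.00, 0.30)]:
    print(fish, "->", classify(model, fish, max_distance=0.5))
```

Each class ends up as a cluster around its own centre in attribute space, and a sample that lands far from every centre goes into the ‘unclassified’ bucket. In practice the attributes would usually be rescaled first so that no single one dominates the distance.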

Now, before we get deeper into our subject today, I’d like to draw your attention to a couple of terms, starting with the concept of Big Data.  If you remember, a few years ago, if you wanted to survive in the oil and gas business, finding large fields was the objective. Well, we have another big thing today – Big Data. Our industry is looking at ways to apply the concepts of Big Data analytics. We hear senior management of E&P companies talking about Big Data and launching Data Analytics teams. So, what is Big Data or Data Analytics? It’s access to large volumes of disparate kinds of oil and gas data that are analyzed by machine learning algorithms to discover relationships that were not identified previously. The key point about Big Data is that the data are of disparate kinds. So if you say “I’m doing Big Data analytics with my seismic data” – that’s not really an appropriate choice of terms. If you say “I’m going to throw in all my seismic data, along with associated wells, and my production data” – now you’re starting to talk about real Big Data operations.  And the opportunities are huge. Finally, there’s IoT – the Internet of Things – which you’ve probably heard or read about.  I predict that IoT will have a larger impact on our industry than machine learning, although the two are related.  And why is that?  Almost EVERYTHING we use can be wired to the internet. In seismic acquisition, for instance, we’re looking at smart geophones being hooked up that sense the direction of the boat and can send and receive data. In fact, when the geophones get planted, they have a GPS in each one of those things so that when it’s pulled up and thrown in the back of a pickup truck, the geophones can report their location in real time.  There are countless other examples of how IoT will change our industry.

Let’s consider wirelines as a starting point of interpretation and figuring out the depositional environment using wireline classifications. If we pick a horizon, then based on that auto-picked horizon, we have a wavelet at every bin. We pull that wavelet out. In this auto-picked horizon, we may have a million samples and we have a million wavelets because we have a wavelet for each sample. (Some early neural learning tools were based on this concept of classifying wavelets.) Machine learning analyzes and trains on those million wavelets, finding, say, the seven that are most significantly different. And then we go back and classify all of them using these different classes. And so we have this cut shown here, across the channel, and the wavelet closest to the center is discovered to be tied to that channel. So there’s the channel wavelet, and now we have overbank wavelets, some splay wavelets – several different wavelets. And from this, a nice colormap can be produced indicating the type of wavelet.

Horizon attributes look at the properties of the wavelet in the vicinity of the horizon, at, say, frequencies of 25 to 80 hertz, with attributes like instantaneous phase. So we now have a collection of information about that pick using horizon attributes. Using volume attributes, we’ll look at a pair of horizons and integrate the seismic attributes between the horizons. This will result in a number, such as the average amplitude or average envelope value, that represents a sum of seismic samples in a time or depth interval. However, when considering machine learning, the method of analysis is fundamentally different. We have one seismic sample, and associated with that sample we have multiple seismic attributes. This produces a multi-attribute sample vector that is the subject of the machine learning process.
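
As an illustration of what a multi-attribute sample vector looks like in practice, the sketch below stacks several attribute volumes into one vector per seismic sample. The function name, the attribute names, and the toy values are hypothetical, and the volumes are reduced to a 2D section of nested lists for brevity.

```python
def sample_vectors(attribute_volumes):
    """Turn M same-shaped attribute sections into one M-vector per seismic sample.

    attribute_volumes: dict of attribute name -> nested list indexed [trace][time_sample].
    Returns (names, vectors); vectors[i] holds every attribute value of sample i.
    """
    names = list(attribute_volumes)
    volumes = [attribute_volumes[name] for name in names]
    n_traces = len(volumes[0])
    n_times = len(volumes[0][0])
    vectors = []
    for trace in range(n_traces):
        for t in range(n_times):
            vectors.append(tuple(volume[trace][t] for volume in volumes))
    return names, vectors

# Hypothetical toy data: 2 traces x 3 time samples, 3 attributes
volumes = {
    "amplitude": [[0.1, -0.4, 0.3], [0.2, -0.5, 0.1]],
    "envelope": [[0.2, 0.5, 0.4], [0.3, 0.6, 0.2]],
    "inst_phase": [[10.0, -120.0, 45.0], [15.0, -110.0, 60.0]],
}
names, vectors = sample_vectors(volumes)
print(names)
print(len(vectors), "samples; second sample:", vectors[1])
```

Every seismic sample becomes one point in a space with as many dimensions as there are attributes, which is the input the classifier works on.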

Ok, so let’s take a look at some of the results. This is a self-organizing map analysis of a wedge using only 2 attributes. We’ve got three cases – low, medium, and high levels of noise, and in the box over here you can see tuning thickness is right here, and everything to the right of that arrow is below tuning. Now, the SOM works on multi-attribute samples. And in this case, we are keeping things very simple since we only have two attributes. If you have only two attributes, you can plot them on a piece of paper – x axis, y axis. However, the classification process works just fine for two dimensions or twenty dimensions.  It’s a machine learning algorithm. In two dimensions, we can look at it and decide “did it do a good job or did it not?” For this example, we’ve used the amplitude and the Hilbert Transform because we know they’re orthogonal to each other. We can plot those as individual points on paper. Every sample is a point on that scatter plot. However, if we put it through a SOM analysis, the first stage is SOM training, which tries to locate natural clusters in attribute space, and the second stage, once those neurons have gone through the training process, is to take the results and classify ALL the samples. So, we have here the results – every single sample is classified. Low noise, medium noise, high noise, and here are the results.  If you go to tuning thickness, we are tracking events way below tuning thickness with SOM analysis.  And there’s the top of the wedge … this one right here is where things get below tuning thickness. Eventually tip the corresponding trace right over there.

Now, there’s a certain bias.  For this analysis we are using a two-dimensional topology, and the connectivity between these neurons is hexagonal, which is made use of during the training process.  And there’s a certain bias here because this is a smooth colormap.  By the way, these are colormaps as opposed to colorbars.  Right? Colormaps, not colorbars. In terms of colormaps, you can have four points of connectivity, and then it’s just like a grid.  Or 6 points of connectivity, and then it’s hexagonal.  That helps us understand the training that was used. Well, there’s a certain bias about having smooth colors. In this process there are 8 rows and 8 columns, and every single one of those neurons has gone looking for a natural cluster in attribute space.  Although it’s only two dimensions, there is still a hunting process. Each of these 64 neurons, after the training process, is trying to zero in on a natural cluster. And there’s a certain bias here in using smooth colors, because neighboring neurons get similar colors, like yellows and greens here, and blues and reds there. Here’s a random colormap – and you can see the results.  But even if we use random colors, we are still tracking events way below tuning thickness using the SOM classification.

We are demonstrating the resolution well below tuning.  There’s no magic.  We use only two attributes – the real part and the imaginary part, which is the Hilbert Transform, and we are demonstrating the SOM characteristics of training using only two attributes.
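
For readers who want to check the statement that a trace and its Hilbert transform are orthogonal, here is a small sketch. It builds the analytic (complex) trace with a naive discrete Fourier transform, which is only suitable for short demo traces; the synthetic trace and all function names are hypothetical.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform; fine for a short demo trace."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def analytic_signal(trace):
    """Complex trace: real part is the input, imaginary part is its Hilbert transform."""
    n = len(trace)
    spectrum = dft(trace)
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            weight = 1.0   # keep DC (and Nyquist when n is even)
        elif k < (n + 1) // 2:
            weight = 2.0   # double the positive frequencies
        else:
            weight = 0.0   # remove the negative frequencies
        spectrum[k] *= weight
    return dft(spectrum, inverse=True)

# Hypothetical synthetic trace: a 5-cycle cosine under a Gaussian envelope
n = 64
trace = [math.exp(-((t - 32) / 10.0) ** 2) * math.cos(2 * math.pi * 5 * t / n)
         for t in range(n)]

z = analytic_signal(trace)
real_part = [v.real for v in z]      # attribute 1: the amplitude
hilbert_part = [v.imag for v in z]   # attribute 2: the Hilbert transform
dot = sum(a * b for a, b in zip(real_part, hilbert_part))
print("inner product is numerically zero:", abs(dot) < 1e-9)
```

The pairs `(real_part[t], hilbert_part[t])` are the two-attribute samples that would be plotted on the x and y axes of the scatter plot.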

The self-organizing map (SOM) training algorithm is modeled on the discovery of natural clusters in attribute space, using training rules based upon the human visual cortex.  Conceptually, this is a simple but powerful idea. We can see examples in nature of simple rules that lead to profound results.

So, the whole idea behind self-organizing assemblages is the following:  Snow geese and fish are both examples of self-organizing assemblages. Individuals follow a simple rule.  The individual goose is just basically following a very simple rule: Follow the goose in front of me, just a few feet behind and either left or right. It’s as simple as that.  That’s an example of a self-organizing assemblage, and yet some of its properties are pretty profound, because once the geese get up to altitude, they can go for a long time and long distances using the slipstream properties of that “v” formation.  The basic rule for a schooling fish is ‘swim close to your buddies.  Not so close that you’ll bump into them, and not so far away that it doesn’t get represented as a school of fish.’ When the shark swims by, the school needs to look like one big fish. If those individual fish were too far apart, the shark would see the smaller isolated fish as easy prey. So, there’s even a simple rule here of an optimum distance from one to the other. These are just two examples of where simple rules produce complex results when applied at scale.

Unsupervised neural networks, which classify the data, also work on simple rules, but operate on large volumes of seismic samples in attribute space.
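
The simple rules of a SOM can be sketched in a few lines. The version below is a generic textbook-style SOM, not the implementation used for the results in this talk: it assumes a rectangular grid rather than hexagonal connectivity, a Gaussian neighborhood, and arbitrary decay schedules, and it runs on hypothetical two-attribute samples.

```python
import math
import random

def winner(weights, sample):
    """Index of the neuron whose weight vector is closest to the sample."""
    return min(range(len(weights)),
               key=lambda i: sum((w - x) ** 2 for w, x in zip(weights[i], sample)))

def train_som(samples, rows, cols, epochs=30, seed=0):
    """Stage 1: let rows*cols neurons hunt for natural clusters in attribute space."""
    rng = random.Random(seed)
    grid = [(r, c) for r in range(rows) for c in range(cols)]
    # Start every neuron on a randomly chosen data sample (an illustrative choice).
    weights = [list(s) for s in rng.sample(samples, rows * cols)]
    order = list(range(len(samples)))
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for idx in order:
            frac = step / total_steps
            lr = 0.5 * (0.01 / 0.5) ** frac      # learning rate decays 0.5 -> 0.01
            sigma = 1.5 * (0.3 / 1.5) ** frac    # neighborhood radius decays 1.5 -> 0.3
            x = samples[idx]
            wr, wc = grid[winner(weights, x)]
            for i, (r, c) in enumerate(grid):
                d2 = (r - wr) ** 2 + (c - wc) ** 2
                h = math.exp(-d2 / (2 * sigma ** 2))
                # Rule: the winner and its grid neighbors step toward the sample.
                weights[i] = [w + lr * h * (xv - w) for w, xv in zip(weights[i], x)]
            step += 1
    return weights

def classify(weights, samples):
    """Stage 2: label every sample with its winning neuron."""
    return [winner(weights, s) for s in samples]

# Hypothetical two-attribute data with two natural clusters
rng = random.Random(1)
cluster_a = [(rng.gauss(0.0, 0.3), rng.gauss(0.0, 0.3)) for _ in range(100)]
cluster_b = [(rng.gauss(5.0, 0.3), rng.gauss(5.0, 0.3)) for _ in range(100)]
samples = cluster_a + cluster_b

weights = train_som(samples, rows=3, cols=3)
classes = classify(weights, samples)
print("winning neurons at the two cluster centres:",
      winner(weights, (0.0, 0.0)), winner(weights, (5.0, 5.0)))
```

The data samples never move; only the neuron weights do. After training, classification simply records which neuron each sample is closest to.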

The first example is the Eagle Ford case study. Patricia Santagrossi published these results last year.  This is a 3D survey of a little over 200 square miles. The SOM analysis was run between the Buda and the Austin Chalk, and the Eagle Ford is right above the Buda in this little region right there.  The Eagle Ford shale layer was 108′ thick, which is only 14 ms.  Now, both the Buda and the Austin Chalk are known, strong peak events. So, if you count how many cycles we go through here: peak, trough, kind of a doublet, trough, peak. The good stuff here is basically all the bed from one peak to one trough. This is conventional seismic data. Here’s the Eagle Ford shale as measured right at the Buda break well there.  We have both a horizontal and a vertical well right here. And that trough is associated with the Eagle Ford Shale.  That trough and that peak. So, this is the SOM result with an 8x8 set of neurons that are used for the training. Look at the visible amount of detail here. Not just between the Buda and the Austin Chalk, but actually you can see how things are changing, even along the formation here, within the run of the horizontal well, because every change in color here corresponds to a change in neuron.

These results were computed by machine learning using seismic attributes alone. We did not tie the results to any of the wells. The SOM analysis was run on seismic samples with multiple attribute values. The key idea here is simultaneous multi-attribute analysis using machine learning. Now, let’s look further at this Eagle Ford case study.

These are results computed by machine learning using seismic attributes.  We did not skew the results and tie them to any of the wells.  They were not forced to fit the wells or anything else. The SOM analysis was run strictly on seismic data and the multi-attribute seismic samples.  Again, the right term is simultaneous multi-attribute analysis. Multi-attribute, meaning each sample is a vector. In our analysis every single sample is being used simultaneously to classify the data – a solution.  So although this area is 200 square miles in areal extent, between the Buda and the Austin Chalk we’re looking at every single sample – not just wavelets. By simple inspection, we can see that the machine learning results are corroborated by the well logs, but there has been no force fitting of the data. These arrows are referring to the SOM winning neurons. If we look in detail, here is Well #8, a vertical well in the Eagle Ford shale. The high resistivity zone is right in here. That could be tied into the red stuff. So, here again we’re dealing with seismic data on a sample-by-sample basis.

The SOM winning neurons identified 24 geobodies, autopicked in 150 feet of vertical section at this well on #8 in the Eagle Ford borehole. Some of the geobodies – not all of them – some of them track the underwells and went over the entire 200 sq. mile 3D survey.

This is to zero in a little bit more, so I can give you some association here. The high resistivity zone correlates with winning neurons 54, 60, and 53 in this zone right in here. There’s the Eagle Ford Ash that is identified with neurons 63 and 64. And Patricia even found a tie with this marker right here – this is neuron 55.

And this well, by the way, well #8, was 372 Mboe. SOM classification neurons are associated with specific wireline lithofacies units. That’s really hard to argue against.  We have evidence, in this case up here for example, of an unconformity where we lost a neuron right through here and then we picked it up again over there.  And, there is evidence in the Marl of slumping of some kind.  So, we’re starting to understand what’s happening geologically using machine learning. We’re seeing finer detail – more than we would have using conventional seismic data and single attributes.

Tricia found a generalized cross-section of Cretaceous in Texas, northwest / southeast towards the gulf. Eagle Ford shale fits in here below the Marl and there’s an unconformity between those two – she was able to see some evidence of that.

The well that we just looked at was well #8, and it ties in with the winning neuron.  Let’s take a look at another well, say for example, well #3, a vertical well with some x-ray diffraction associated with it. We can truly nail this stuff with the real lithology, so not only do we have a wireline result, but we also have X-ray diffraction results to corroborate the classification results.

So, of the 64 neurons, over 41,000 samples were classified as “the good stuff.” That’s on a sample basis, so you can integrate that – you can tally all that stuff up and start to come up with estimates.

So, specific geobodies relate to winning neuron that we’re tracking – #12 – that’s the bottom line. And from that we were able to develop a whole Wheeler diagram for the Eagle Ford group for the survey.  And the good stuff are the winning neurons 58 and 57. They end up on the neuron topology map here, so those two were associated with the wireline lithofacies footstep – the high resistivity part of the Eagle Ford shale. But she was able to work out additional things, such as more clastics and carbonates in the west and clastics in the southeast. And, she was able to work out not only Debris Apron, but the ashy beds and how they tie in.  Altogether, these were the neurons associated with the Eagle Ford shale. These were the neurons – 1, 9, and 10, that’s the basal clay shale.  And the Marls were associated with these neurons.

So, on the basis of the autopicked geobodies across the survey, we’re developing the depositional environment of the Eagle Ford, and it compares favorably with the well logs, using seismic data alone. One of our associates received feedback to the effect that “seismic is only good in conventionals, just for the big structural picture.” Man, what a sad conclusion that is.  There’s a heck of a lot more to get out of it: this high resistivity pay zone was associated with two specific neurons, demonstrating that this machine learning technology is equally applicable to unconventionals.

The second case study here is the Gulf of Mexico, by my distinguished associate, Mr. Rocky Roden. This is not deepwater – only approximately 300 feet. Here’s a north fault amplitude buildup. Here, these are time contours and the amplitude conformance to structure is pretty good. In this crossline – 3183 – going from west to east is the distribution of the histogram of the values. You can see here in the dotted portion, this is just the amplitude display, and the box right here is a blowup of the edge right there of that reservoir. What you can see here is the SOM classification using colors.  Red is associated with the gas-over-oil contact and oil-over-water contact. A single sample.  So here we have the use of machine learning to help us find fluid contacts, which are very difficult to see.  This is all without having bandwidth, frequency range, point source, point receivers – it isn’t a case of everything dialed in just the right way. The rest of the story is just the use of machine learning. However, it’s machine learning not on samples of single numbers, but on each sample as a combination of attributes; as a vector. Using that choice of attributes, we’re able to identify fluid contacts. For easier viewing, we make all the others transparent and only show those that the classifier has associated with the fluid contacts and also the hills.  In addition, look at the edges. The ability to define the edges of the reservoirs and come up with volumetrics is clearly superior. Over here on the left, Rocky’s taken the “goodness of fit”, which is an estimate of the probability of how well each of these samples fits the winning neuron. By lowering the probability limit, and saying “I just want to look at the anomalies”, that edge of the amplitude conformance to structure, I think, is clearly better than what you would have using amplitude alone.

So, new machine learning technology using simultaneous multi-attribute analysis is resolving much finer reservoir detail than we’ve had in the past, and the geobodies that fit the reservoirs reveal details that, frankly, were previously not available.

In general, this is what our “Earth to Earth” model looks like.  We start here with the 3D survey, and then from the 3D survey, we decide on a set of attributes.  We take all our samples, which are vectors because of our choice of attributes, and then literally plot them in attribute space. If you have 5 attributes, it’s 5-dimensional space.  If you have 8 attributes, it’s 8-dimensional space. And your choice of attributes is going to illuminate different properties of the reservoir. So, the choice of attributes that Rocky used to zero in on those fluid contacts would not be the ones he would use to illuminate the volume properties or the absorption properties, for example.  Once the attribute volumes are in attribute space, we use a machine learning classifier to analyze and look for natural clusters of information in attribute space. Once those are classified in attribute space, the results are then presented back in a virtual model, if you will, of the earth itself. So, our job here is picking geobodies, some of which have geologic significance and some of which don’t.   The real power is in the natural clusters of information in attribute space.  If you have a channel and you’ve got the attributes selected to illuminate channel properties, then every single point that is associated with the channel, no matter where it is, is going to concentrate in the same place in attribute space.  Natural clusters of information in attribute space are all stacking.  The neurons are hunting, looking for natural clusters, or higher density, in attribute space.  They do this using very simple rules.  The mathematics behind this process were published by us in the November 2015 edition of the Interpretation journal, so if you would like to dig into the details, I invite you to read that paper, which is available on our website.

Two keys are: 1. Attribute selection list. Think about your choice of attributes as an illumination function. What you are trying to do with your choice of attributes is illuminate the real geobodies in the earth so that they end up as natural clusters in attribute space. And that’s the key.  2. Neurons search for clusters of information in attribute space. Remember the movie, The Matrix? The humans had to be still and hide from the machines that went crazy and hunted them. That’s not too unlike what’s going on in attribute space. It’s like The Matrix because the data samples themselves don’t move. They’re just waiting there. It’s the neurons that are running around in attribute space, looking for clusters of information. The natural cluster is an image of one or more geobodies in the earth, but it’s been illuminated in attribute space, totally depending on the illumination list.  It stacks in a common place in attribute space – that’s the key.

Seismic stratigraphy is broken up into two levels here: first is seismic sequence analysis, where you look at your seismic data and you organize it and break it up into packets of concordant reflections. It’s pretty straightforward stuff – chaotic depositional patterns.  And then after you have developed a sequence analysis, you can categorize the different sequences. You have a facies analysis trying to infer the depositional setting. Is the sea level rising? Is it falling? Is it stationary? All this naturally falls in because the seismic reflections are revealing geology on a very broad basis.

Well, the attribute – it’s hunting geobodies as well. Multi-attribute geobodies are also components of seismic stratigraphy. We define it this way: a simple geobody has been auto-picked by machine learning in attribute space. That’s all it is – we’re defining a simple geobody. We all know how to run an auto-picker. In 15 minutes, you can be taught how to run an auto-picker in attribute space. Complex geobodies are interpreted by you and me. We look at the simple geobodies and we composite those just the way we saw in that Wheeler diagram. We combine those to make complex geobodies.  We give it a name, some kind of texture, some kind of surface – all those things are interpreted geobodies, and the construction of these complex geobodies can be sped up by some geologic rule-making.
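
One way to picture auto-picking simple geobodies is as grouping connected samples that share the same winning neuron. The sketch below does that on a small 2D classified section with a breadth-first flood fill. This is an assumption made for illustration: the talk does not spell out how the auto-picker works, and the function name and the toy neuron numbers are hypothetical.

```python
from collections import deque

def pick_geobodies(classified):
    """Label connected regions of equal winning neuron in a 2D classified section.

    classified: list of rows, each a list of winning-neuron numbers.
    Returns (labels, bodies): a same-shaped grid of geobody ids starting at 1,
    and a dict of geobody id -> (neuron number, sample count).
    """
    n_rows, n_cols = len(classified), len(classified[0])
    labels = [[0] * n_cols for _ in range(n_rows)]
    bodies = {}
    next_id = 1
    for r0 in range(n_rows):
        for c0 in range(n_cols):
            if labels[r0][c0]:
                continue  # already part of a picked geobody
            neuron = classified[r0][c0]
            labels[r0][c0] = next_id
            queue = deque([(r0, c0)])
            count = 0
            while queue:
                r, c = queue.popleft()
                count += 1
                # Grow into the four face-connected neighbors with the same neuron.
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < n_rows and 0 <= cc < n_cols
                            and not labels[rr][cc]
                            and classified[rr][cc] == neuron):
                        labels[rr][cc] = next_id
                        queue.append((rr, cc))
            bodies[next_id] = (neuron, count)
            next_id += 1
    return labels, bodies

# Hypothetical classified section (winning-neuron numbers per sample)
section = [
    [5, 5, 7, 7],
    [5, 7, 7, 2],
    [2, 2, 7, 2],
]
labels, bodies = pick_geobodies(section)
for body_id, (neuron, count) in bodies.items():
    print(f"geobody {body_id}: neuron {neuron}, {count} samples")
```

Note that the two regions classified as neuron 2 are not connected, so they come out as separate simple geobodies. Counting samples per geobody is also the starting point for tallying up volumes.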

Now the mathematical foundation we published in 2015 ties this all together pretty nicely. You see, machine learning isn’t magic.  It depends on the noise level of the seismic data. Random noise broadens natural clusters in attribute space. What that means, then, is that attenuating noise through optimum acquisition and data processing delivers natural clusters with the greatest separation. In other words, nice, tight clusters in attribute space, with clean separation between them, will be much easier for the machine learning algorithm to identify. So, acquisition and data processing matter.
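
The effect of random noise on natural clusters is easy to demonstrate with synthetic numbers. The sketch below is only an illustration: the two cluster centres, the noise levels, and the separation measure (centre distance divided by average cluster spread) are all hypothetical choices.

```python
import math
import random

def cluster_spread(points):
    """Root-mean-square distance of the points from their own centroid."""
    n = len(points)
    centre = [sum(column) / n for column in zip(*points)]
    total = sum(sum((p - c) ** 2 for p, c in zip(point, centre)) for point in points)
    return math.sqrt(total / n)

def noisy_cluster(centre, noise_std, n, rng):
    """Samples of one geobody in attribute space, blurred by random noise."""
    return [tuple(c + rng.gauss(0.0, noise_std) for c in centre) for _ in range(n)]

rng = random.Random(0)
centre_a, centre_b = (1.0, 0.0), (2.0, 0.0)   # two natural clusters, one unit apart
centre_distance = 1.0

spreads, separations = [], []
for noise_std in (0.05, 0.2, 0.8):            # low, medium, high noise
    a = noisy_cluster(centre_a, noise_std, 2000, rng)
    b = noisy_cluster(centre_b, noise_std, 2000, rng)
    spread = 0.5 * (cluster_spread(a) + cluster_spread(b))
    spreads.append(spread)
    separations.append(centre_distance / spread)
    print(f"noise {noise_std:4.2f}: spread {spread:5.2f}, "
          f"separation/spread {centre_distance / spread:5.2f}")
```

As the noise level goes up, each cluster broadens and the two clusters start to overlap, which is exactly the situation in which any classifier has a harder time telling them apart.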

However, this isn’t talking about coherent noise. Coherent noise is something else. With coherent noise, you may have an acquisition footprint, and that forms a cluster in attribute space. One of those neurons is going to go after that just as well, because it’s an increase in information density in attribute space, and voila – you have a handful of neurons that are associated with an acquisition footprint. Coherent noise can also be detected by the classification process where the processor has merged two surveys.

Second thing: Better wavelet processing leads to narrower, more compact natural clusters, and more compact natural clusters lead to better geobody resolution, because geobodies are derived from natural clusters.

Last but not least, larger neural networks produce greater geobody detail. If you run 6x6, 8x8 and 10x10 2D colormaps, you eventually get to the point where you’re just swamped with details and you just can’t figure this thing out. We see that again and again.  So, it’s better to look at the situation from 40,000 feet, and then 20,000, and then 10,000. Usually, we just go ahead and run all three SOM runs at once to get them all done and then examine them in increasing levels of detail.

I’d like to now switch gears to something entirely different.  Put the SOM box here aside for a minute, and let’s revisit the work Rocky Roden did in the Gulf of Mexico. Rocky came up with an important way of thinking about the application of this new tool.

In terms of using multi-attribute seismic interpretation – think of it as a process, and what’s really important is starting with the geologic question that you want to answer. For example: we’re trying to illuminate channels. Ok, so there is a certain set of attributes that would be good for that.  So, what we have here is: ask the question first. Have that firmly in your mind for this multi-attribute seismic interpretation process.

There’s a certain set of attributes for the geologic question, and the terminology for that set is the “attribute selection list”. When you do an interpretation like this, you really need to be aware of the current attributes being used when looking at the data. Depending on the question, we then take the discipline and we say “well, if this is the question you’re asking”, this attribute selection list is appropriate. Remember, the attribute selection list is an illumination function.

Once you have the geologic process, the next step is the attribute selection list, and then classify simple geobodies, which is auto-picking your data in attribute space and looking at the results.

Now, this doesn’t just happen in the background and it doesn’t just happen all at once – it’s an iterative process. So, interpreting complex geobodies basically takes more than one SOM run, and more than one geologic question. And interpreting these results at different levels – how many neurons, that sort of thing – is a whole seismic interpretation process. Interpreting these complex geobodies is the next step.

We’re looking at results and constructing geologic models. Decide which is the final geologic model, and then our last step is making property predictions.

So, in the world of multiple geologic models, or multiple statistical models, it really doesn’t make any difference. We select the model, we test the model, we select a bunch of models, we test those models, and we choose one! Why? Because we want to make some predictions.  There’s got to be one final model that we decide on as professionals, saying that this is the most reliable and we’re going to use it.  Whether it’s exploration, exploitation, or even appraisal, same methodology – it’s all the same for geologic models and statistical models.

The point here boils down to something pretty fundamental.  As exploration geophysicists, we’re in the business of prediction. That’s our business. The boss wants to know “where do you want to drill, and how deep? And what should we expect on the way down? Do we have any surprises here?” They want answers! And we’re in the business of prediction.

So how good you are as a geoscientist depends, fundamentally, on how good the predictions of your final model are. That’s what we do. Whether you want to think about it like that or not, that’s really the bottom line.

So this is really about model building for multi-attribute interpretation – that’s the first step. Then we’re going to test the model and choose the model. Ok, so, should that model-building be shipped out as a data processing project? Or through our geo-processing people?  Or is that really something that should be part of interpretation? Do you really trust that the right models have been built from geoprocessing? Maybe. Maybe not.  If it takes 3 months, you sure hope you have the right model from a data processing company. And foolish, foolish, foolish if you think there’s only one run.  That’s really dangerous.  That’s a kiss and a prayer, and oh, after three months, this is what you’re going to build your model on. 

So, as an aside, if you decide that building models is a data processing job, where’s the spontaneity? And I ask you – where’s the sense of adventure? Where’s the sense of the hunt? That’s what interpretation is all about – the hunt. Do you trust that the right questions have been asked before the models are built?  And my final point here is that there are hundreds of reasons just to follow procedure.  Stay on the path and follow procedure. Unfortunately, nobody wants to argue. The truth here is what we’re looking for. And truth, invariably – that path has twists and turns. That’s exploration. That’s what we’re doing here.  That’s fun stuff. That’s what keeps our juices going… finding those little twists and turns and zeroing in on finding truth.

Now model testing and final selection have begun when models are built and you decide which is the right one. For example, you generate 3 SOMs – an 8x8, 12x12, 4x4, and you look at results and the boss says “ok, you’ve been monkeying around long enough, what’s the answer? Give me the answer”… “Well…hmm…” you respond. “I like this one. I think 8x8 is the right one.”  Now, you could do that, but you might not want to admit it to the boss! One quantitative way of comparing models would be to look at your residual errors.  The only trouble with that is it’s not very robust. However, a quantitative assessment – comparing models – is a good way to go. 

So, there is a better methodology – better than just comparing residual errors – this is a whole field of cross-validation methodologies. Not going to go into that stuff right here, but some cross-validation tools: bootstrapping, bagging, and even Bayesian statistics are helpful tools in helping us prepare models and helping us figure out the model that is robust and in the face of new data is going to give us a good strong answer – NOT the answer that fits the data the best. 

Think about the old problem of fitting a least squares line through some data. You write your algorithm in Python or whatever tool, and it kind of fits through the data, and the boss goes “I don’t know why you’re monkeying around with lines. I think this is an exponential curve because this is production data.” So, you make an exponential curve.  Now, this business of cross-validation, think about this: fitting a polynomial to the data: two terms, a line; three terms, a parabola; four terms … until n. We could make n equal 15 and by golly there’s no possibility of error – we crank that thing down. The trouble is, we have over-fit the data. It fits this data perfectly, but some new data comes in and it’s a terrible model because the errors are going to be really high. It’s not robust. So, this whole business of cross-validation methodology is really very important. The future here is, “who’s going to be making the prediction – you, or the machine?” I maintain that to make good decisions, it’s going to be us! We’re the ones that will be making the right characterizations – because we’ll leverage machine learning.
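To make the over-fitting point concrete, here is a minimal sketch in Python. It is not from the talk: the data are synthetic (a made-up exponential decline plus noise), and the function name `holdout_errors` and the simple train/held-out split are illustrative assumptions. It fits polynomials of increasing order on half the points and scores them on the half the fit never saw.

```python
import numpy as np

def holdout_errors(x, y, degrees, n_train, seed=0):
    """Fit polynomials on a random training subset and score them on the rest.

    Returns {degree: (train_rmse, heldout_rmse)}.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    train, test = order[:n_train], order[n_train:]
    results = {}
    for deg in degrees:
        # Least-squares polynomial fit using only the training points.
        poly = np.polynomial.Polynomial.fit(x[train], y[train], deg)
        train_rmse = np.sqrt(np.mean((poly(x[train]) - y[train]) ** 2))
        test_rmse = np.sqrt(np.mean((poly(x[test]) - y[test]) ** 2))
        results[deg] = (train_rmse, test_rmse)
    return results

# Synthetic "production" data: exponential decline plus noise (assumed values).
rng = np.random.default_rng(42)
x = np.linspace(0.0, 4.0, 40)
y = 100.0 * np.exp(-0.6 * x) + rng.normal(0.0, 3.0, x.size)

errors = holdout_errors(x, y, degrees=[1, 2, 3, 15], n_train=20)
for deg, (train_rmse, test_rmse) in errors.items():
    print(f"degree {deg:2d}: train RMSE {train_rmse:8.2f}, held-out RMSE {test_rmse:10.2f}")
```

The error on the training points can only go down as terms are added, which is why it is a poor guide to model choice. The held-out error is what tells you how the model behaves when new data comes in. A full cross-validation would repeat the split several times and average the held-out errors.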

Let’s take a look at Machine Learning. Our company vision is the following: 

“There’s no reason why we cannot expect to query our seismic data for information with learning machines, just as effortlessly and with as much reliability as we query the web for the nearest gas station.” 

Now this statement of where our company is going is not a statement of “get rid of the interpreters”. It’s a statement, in my way of thinking, and for all of us at our operations, of a way forward. Because truly, this use of machine learning is a whole new way of doing seismic interpretation. It’s using it as a tool – it’s not replacing anybody.  Deep learning, which is important for seismic evaluation, might be a holy grail, but its roots are in image processing, not in the physics of wave motion. Be very careful with that.

Image processing is very good at telling the difference between Glen and me from pictures of us. Or if you have kitties and you have little doggies, image processing can classify those, even right down to those where you’re not real certain whether it’s a dog or a cat.  So, deep learning is focused on image processing and also on the subtle distinctions between what is the essence of a dog and what is the essence of a cat, irrespective of whether the cat is lying there or standing there or climbing up a tree.  That’s the real power of this sort of thing.

Here’s a comparison of SOM and Deep Learning in terms of all of its properties, and there’s good and bad things about each one of these.  There’s no magic about any one of these. Not to say one’s better than the other.

I would like to point out that unsupervised machine learning trains by discovering natural clusters in attribute space. Once those natural clusters have been identified, attribute space is carved up: we say any samples in this region right in here in attribute space correspond to this winning neuron, and over here to that winning neuron.  Your data is auto-picked and put back in 3-dimensional space in a virtual 3D survey. That’s the essence of what’s available today.
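Here is a minimal sketch of that carving-up step in Python, assuming the neurons have already been trained. The array layout (sample, trace, line, attribute), the function name `classify_volume` and the random numbers standing in for attributes and neurons are all illustrative assumptions, not the commercial implementation.

```python
import numpy as np

def classify_volume(attributes, neurons):
    """Label every sample of a multi-attribute survey with its winning neuron.

    attributes: array of shape (n_samples, n_traces, n_lines, n_attributes)
    neurons:    array of shape (n_neurons, n_attributes), already trained
    Returns an integer volume of shape (n_samples, n_traces, n_lines).
    """
    survey_shape = attributes.shape[:3]
    # Flatten the survey so each row is one point in attribute space.
    points = attributes.reshape(-1, attributes.shape[-1])
    # Squared distance from every point to every neuron.
    dist2 = ((points[:, None, :] - neurons[None, :, :]) ** 2).sum(axis=2)
    # The winning neuron is the nearest one.
    winners = dist2.argmin(axis=1)
    # Put the classification back into survey space: a "virtual 3D survey".
    return winners.reshape(survey_shape)

# Stand-in data: a tiny survey with 3 attributes and a 2x2 map (4 neurons).
rng = np.random.default_rng(0)
attributes = rng.normal(size=(50, 6, 4, 3))
neurons = rng.normal(size=(4, 3))
classified = classify_volume(attributes, neurons)
print(classified.shape, np.unique(classified))
```

Each region of attribute space closest to one neuron becomes one class, and the reshape at the end is what returns the auto-picked result to the survey geometry for display.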

Supervised machine learning trains on patterns that it discovers on amplitude data alone. Now there are two deep learning algorithms that are popular today. One’s called Convolutional Neural Network, which learns by visual patterns, faces, sometimes called eigenfaces, uses PCA. And then there are fully convolutional networks, which are using sample size patches and full connections between the network layers.

Here’s a little cartoon showing you this business about layers.  This is the picture, and in trying to identify the little features of this, you can’t say that this is a robot, as opposed to a cat or a dog, until it goes through this analysis. Using patching and feature maps, using different features for different things, it goes from one patch to the next to the next, until finally – your outputs here – well, it must be robot, dog, or kitty. It’s a classifier using the properties it has discovered in a single image. The algorithm has discovered its own attributes. You might say “that’s pretty cool”. And indeed it is, but it’s only using the information seen in that picture. So, it’s association – it’s the texture features of that image.

Here’s an example from one of our associates – Tao Zhao – he’s been working in the area of fully convolutional networks. This example is where he’s done some training – training lines A – clinoforms here, chaotic deposition here, maybe some salt down there, and then some concordant reflections up top. Here’s an example of the results of the FCN. And then here is the classification of salt down here. So, the displays here are examples of fully convolutional networks.

One final point and then I’ll sit down: Data is more important than the algorithms. The training rules are very simple. Remember the snow geese? Remember the fish? If you were a fish or if you were a snow goose, the rules are pretty simple. There’s a fanny – I’m gonna be about 3 feet behind it, and I’m not gonna be right behind the snow goose ahead of me – I want to be either to the left or the right. Simple rule. You’re a fish, you want to have another fish around you of a certain distance. Simple rules. What’s important here is data is more important than the algorithms.

Here is an example taken from E&P Magazine this month (January). For several years this company called Solution Seekers has been training on production data using a variety of different data and looking for patterns to develop best practice drilling recommendations. Kind of a cool big-picture kind of a concept.

So machine learning training rules are simple – the real value is the classification of results; it’s the data that builds the complexity. My question to you is: Does this really address the right questions? If it does, it’s extremely valuable stuff. If it misses the direction of where we’re going – the geologic question – it’s not that useful.

Machine Learning Revolutionizing Seismic Interpretation

By Thomas A. Smith and Kurt J. Marfurt
Published with permission: The American Oil & Gas Reporter
July 2017

The science of petroleum geophysics is changing, driven by the nature of the technical and business demands facing geoscientists as oil and gas activity pivots toward a new phase of unconventional reservoir development in an economic environment that rewards efficiency and risk mitigation. At the same time, fast-evolving technologies such as machine learning and multiattribute data analysis are introducing powerful new capabilities in investigating and interpreting the seismic record.

Through it all, however, the core mission of the interpreter remains the same as ever: extracting insights from seismic data to describe the subsurface and predict geology between existing well locations–whether they are separated by tens of feet on the same horizontal well pad or tens of miles in adjacent deepwater blocks. Distilled to its fundamental level, the job of the data interpreter is to determine where (and where not) to drill and complete a well. Getting the answer correct to that million-dollar question gives oil and gas companies a competitive edge. The ability to arrive at the right answers in the timeliest manner possible is invariably the force that pushes technological boundaries in seismic imaging and interpretation.

The state of the art in seismic interpretation is being redefined partly by the volume and richness of high-density, full-azimuth 3-D surveying methods and processing techniques such as reverse time migration and anisotropic tomography. Combined, these solutions bring new resolution and clarity to processed subsurface images that simply are unachievable using conventional imaging methods. In data interpretation, analytical tools such as machine learning, pattern recognition, multiattribute analysis and self-organizing maps are enhancing the interpreter’s ability to classify, model and manipulate data in multidimensional space.

As crucial as the technological advancements are, however, it is clear that the future of petroleum geophysics is being shaped largely by the demands of North American unconventional resource plays. Optimizing the economic performance of tight oil and shale gas projects is not only impacting the development of geophysical technology, but also dictating the skill sets that the next generation of successful interpreters must possess. Resource plays shift the focus of geophysics to reservoir development, challenging the relevance of seismic-based methods in an engineering-dominated business environment.

Engineering holds the purse strings in resource plays, and the problems geoscientists are asked to solve with 3-D seismic are very different than in conventional exploration geophysics. Identifying shallow drilling hazards overlying a targeted source rock, mapping the orientation of natural fractures or faults, and characterizing changes in stress profiles or rock properties is related as much to engineering as to geophysics.

Given the requirements in unconventional plays, there are four practical steps to creating value with seismic analysis methods. The first and obvious step is for oil and gas companies to acquire 3-D seismic and incorporate the data into their digital databases.  Some operators active in unconventional plays fully embrace 3-D technology, while others only apply it selectively. If interpreters do not have access to high-quality data and the tools to evaluate that information, they cannot possibly add value to a company’s bottom line.

The second step is to break the conventional resolution barrier on the seismic reflection wavelet, the so-called quarter-wavelength limit. This barrier is based on the overlapping reflections of seismic energy from the top and bottom of a layer, and depends on layer velocity, thickness, and wavelet frequencies. Below the quarter-wavelength, the wavelets start to overlap in time and interfere with one another, making it impossible by conventional means to resolve separate events.

The third step is correlating seismic reflection data–including compressional wave energy, shear wave energy and density–to quantitative rock property and geomechanical information from geology and petrophysics. Connecting seismic data to the variety of very detailed information available at the borehole lowers risk and provides a clearer picture of the subsurface between wells, which is fundamentally the purpose of acquiring a 3-D survey.

The final step is conducting a broad, multiscaled analysis that fully integrates all available data into a single rock volume encompassing geophysical, geologic and petrophysical features. Whether an unconventional shale or a conventional carbonate, bringing all the data together in a unified rock volume resolves issues in subsurface modeling and enables more realistic interpretations of geological characteristics.
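The quarter-wavelength limit described in the second step can be put in numbers with a short calculation. The sketch below is illustrative only: the function name `tuning_thickness` and the example velocity and frequency are assumptions, and it uses the standard relation wavelength = velocity / frequency.

```python
def tuning_thickness(velocity, dominant_frequency):
    """Quarter-wavelength thickness for a layer.

    velocity: layer interval velocity (e.g. m/s)
    dominant_frequency: dominant wavelet frequency (Hz)
    Returns thickness in the length unit of the velocity.
    """
    if velocity <= 0 or dominant_frequency <= 0:
        raise ValueError("velocity and dominant_frequency must be positive")
    wavelength = velocity / dominant_frequency
    return wavelength / 4.0

# Hypothetical layer: 3000 m/s interval velocity, 30 Hz dominant frequency.
limit_m = tuning_thickness(3000.0, 30.0)
print(f"quarter-wavelength limit: {limit_m:.1f} m")  # prints 25.0 m
```

For these assumed values, layers thinner than about 25 m fall below the conventional limit. A faster layer or a lower dominant frequency pushes that limit thicker.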

The Role of Technology

Every company faces pressures to economize, and the pressures to run an efficient business only ratchet up at lower commodity prices. The business challenges also relate to the personnel side of the equation, and that should never be dismissed. Companies are trying to bridge the gap between older geoscientists who seemingly know everything and the ones entering the business who have little experience but benefit from mentoring, education and training. One potential solution is using information technology to capture best practices across a business unit, and then keeping a scorecard of those practices in a database that can offer expert recommendations based on past experience. Keylogger applications can help by tracking how experienced geoscientists use data and tools in their day-to-day workflows. However, there is no good substitute for a seasoned interpreter. Technologies such as machine learning and pattern recognition have game-changing possibilities in statistical analysis, but as petroleum geologist Wallace Pratt pointed out in the 1950s, oil is first found in the human mind. The role of computing technology is to augment, not replace, the interpreter’s creativity and intuitive reasoning (i.e., the “geopsychology” of interpretation).

Delivering Value

A self-organizing map (SOM) is a neural network-based, machine learning process that is simultaneously applied to multiple seismic attribute volumes. This example shows a class II amplitude-variation-with-offset response from the top of gas sands, representing the specific conventional geological settings where most direct hydrocarbon indicator characteristics are found. From the top of the producing reservoir, the top image shows a contoured time structure map overlain by amplitudes in color. The bottom image is a SOM classification with low probability (less than 1 percent) denoted by white areas. The yellow line is the downdip edge of the high-amplitude zone designated in the top image.

Consequently, seismic data interpreters need to make the estimates they derive from geophysical data more quantitative and more relatable for the petroleum engineer. Whether it is impedance inversion or anisotropic velocity modeling, the predicted results must add some measure of accuracy and risk estimation. It is not enough to simply predict a higher porosity at a certain reservoir depth. To be of consequence to engineering workflows, porosity predictions must be reliably delivered within a range of a few percentage points at depths estimated on a scale of plus or minus a specific number of feet.

Class II amplitude-variation-with-offset response from the top of gas sand.

Machine learning techniques apply statistics-based algorithms that learn iteratively from the data and adapt independently to produce repeatable results. The goal is to address the big data problem of interpreting massive volumes of data while helping the interpreter better understand the interrelated relationships of different types of attributes contained within 3-D data. The technology classifies attributes by breaking data into what computer scientists call “objects” to accelerate the evaluation of large datasets and allow the interpreter to reach conclusions much faster. Some computer scientists believe “deep learning” concepts can be applied directly to 3-D prestack seismic data volumes, with an algorithm figuring out the relations between seismic amplitude data patterns and the desired property of interest.  While Amazon, Alphabet and others are successfully using deep learning in marketing and other functions, those applications have access to millions of data interactions a day. Given the significantly smaller number of seismic interpreters in the world, and the much greater sensitivity of 3-D data volumes, there may never be sufficient access to training data to develop deep learning algorithms for 3-D interpretation. The concept of “shallow learning” mitigates this problem.

Stratigraphy above the Buda

Conventional amplitude seismic display from a northwest-to-southeast seismic section across a well location is contrasted with SOM results using multiple instantaneous attributes.

First, 3-D seismic data volumes are converted to well-established relations that represent waveform shape, continuity, orientation and response with offsets and azimuths that have proven relations (“attributes”) to porosity, thickness, brittleness, fractures and/or the presence of hydrocarbons. This greatly simplifies the problem, with the machine learning algorithms only needing to find simpler (i.e., shallower) relations between the attributes and properties of interest.

In resource plays, seismic data interpretations increasingly are based on statistical rather than deterministic predictions. In development projects with hundreds of wells within a 3-D seismic survey area, operators rely on the interpreter to identify where to drill and predict how a well will complete and produce. Given the many known and unknown variables that can impact drilling, completion and production performance, the challenge lies with figuring out how to use statistical tools to apply data measurements from the previous wells to estimate the performance of the next well drilled within the 3-D survey area. Therein lies the value proposition of any kind of science, geophysics notwithstanding. The value of applying machine learning-based interpretation boils down to one word: prediction. The goal is not to score 100 percent accuracy, but to enhance the predictions made from seismic analysis to avoid drilling uneconomic or underproductive wells. Avoiding investments in only a couple bad wells can pay for all the geophysics needed to make those predictions. And because the statistical models are updated with new data as each well is drilled and completed, the results continually become more quantitative for improved prediction accuracy over time.

New Functionalities

In terms of particular interpretation functionalities, three specific concepts are being developed around machine learning capabilities:

  • Evaluating multiple seismic attributes simultaneously using self-organizing maps (multiattribute analysis);
  • Relating in multidimensional space natural clusters or groupings of attributes that represent geologic information embedded in the data; and
  • Graphically representing the clustered information as geobodies to quantify the relative contributions of each attribute in a given seismic volume in a form that is intrinsic to geoscientific workflows.

A 3-D seismic volume contains numerous attributes, expressed as a mathematical construct representing a class of data from simultaneous analysis. An individual class of data can be any measurable property that is used to identify geologic features, such as rock brittleness, total organic carbon or formation layering. Supported by machine learning and neural networks, multiattribute technology enhances the geoscientist’s ability to quickly investigate large data volumes and delineate anomalies for further analysis, locate fracture trends and sweet spots in shale plays, identify geologic and stratigraphic features, map subtle changes in facies at or even below conventional seismic resolution, and more. The key breakthrough is that the new technology works on machine learning analysis of multiattribute seismic samples.

While applied exclusively to seismic data at present, there are many types of attributes contained within geologic, petrophysical and engineering datasets. In fact, literally, any type of data that can be put into rows and columns on a spreadsheet is applicable to multiattribute analysis. Eventually, multiattribute analysis will incorporate information from different disciplines and allow all of it to be investigated within the same multidimensional space.

That leads to the second concept: Using machine learning to organize and evaluate natural clusters of attribute classes. If an interpreter is analyzing eight attributes in an eight-dimensional space, the attributes can be grouped into natural clusters that populate that space.

The third component is delivering the information found in the clusters in high-dimensionality space in a form that quantifies the relative contribution of the attributes to the class of data, such as simple geobodies displayed with a 2-D color index map.

This approach allows multiple attributes to be mapped over large areas to obtain a much more complete picture of the subsurface, and has demonstrated the ability to achieve resolution below conventional seismic tuning thickness. For example, in an application in the Eagle Ford Shale in South Texas, multiattribute analysis was able to match 24 classes of attributes within a 150-foot vertical section across 200 square miles of a 3-D survey. Using these results, a stratigraphic diagram of the seismic facies has been developed over the entire survey area to improve geologic predictions between boreholes, and ultimately, correlate seismic facies with rock properties measured at the boreholes.

Importantly, the mathematical foundation now exists to demonstrate the relationships of the different attributes and how they tie with pixel components in geobody form using machine learning. Understanding how the attribute data mathematically relate to one another and to geological properties gives geoscientists confidence in the interpretation results.


Leveraging Integration


The term “exploration geophysics” is becoming almost a misnomer in North America, given the focus on unconventional reservoirs, and how seismic methods are being used in these plays to develop rather than find reservoirs. With seismic reflection data being applied across the board in a variety of ways and at different resolutions in unconventional development programs, operators are combining 3-D seismic with data from other disciplines into a single integrated subsurface model. Fully leveraging the new sets of statistical and analytical tools to make better predictions from integrated multidisciplinary datasets is crucial to reducing drilling and completion risk and improving operational decision making. Multidimensional classifiers and attribute selection lists using principal component analysis and independent component analysis can be used with geophysical, geological, engineering, petrophysical and other attributes to create general-purpose multidisciplinary tools of benefit to all oil and gas company departments and disciplines. As noted, the integrated models used in resource plays increasingly are based on statistics, so any evaluation to develop the models also needs to be statistical. In the future, a basic part of conducting a successful analysis will be the ability to understand statistical data and how the data can be organized to build more tightly integrated models. And if oil and gas companies require more integrated interpretations, it follows that interpreters will have to possess more integrated skills and knowledge. The geoscientist of tomorrow may need to be more of a multidisciplinary professional with the blended capabilities of a geologist, geophysicist, engineer and applied statistician. But whether a geoscientist is exploring, appraising or developing reservoirs, he or she only can be as good as the prediction of the final model. 
By applying technologies such as machine learning and multiattribute analysis during the workup, interpreters can use their creative energies to extract more knowledge from their data and make more knowledgeable predictions about undrilled locations.

THOMAS A. SMITH is president and chief executive officer of Geophysical Insights, which he founded in 2008 to develop machine learning processes for multiattribute seismic analysis. Smith founded Seismic Micro-Technology in 1984, focused on personal computer-based seismic interpretation. He began his career in 1971 as a processing geophysicist at Chevron Geophysical. Smith is a recipient of the Society of Exploration Geophysicists’ Enterprise Award, Iowa State University’s Distinguished Alumni Award and the University of Houston’s Distinguished Alumni Award for Natural Sciences and Mathematics. He holds a B.S. and an M.S. in geology from Iowa State, and a Ph.D. in geophysics from the University of Houston.
KURT J. MARFURT is the Frank and Henrietta Schultz Chair and Professor of Geophysics in the ConocoPhillips School of Geology & Geophysics at the University of Oklahoma. He has devoted his career to seismic processing, seismic interpretation and reservoir characterization, including attribute analysis, multicomponent 3-D, coherence and spectral decomposition. Marfurt began his career at Amoco in 1981. After 18 years of service in geophysical research, he became director of the University of Houston’s Center for Applied Geosciences & Energy. He joined the University of Oklahoma in 2007. Marfurt holds an M.S. and a Ph.D. in applied geophysics from Columbia University.

Geobody Interpretation Through Multi-Attribute Surveys, Natural Clusters and Machine Learning

By Thomas A. Smith 
June 2017


Multi-attribute seismic samples (even as entire attribute surveys), Principal Component Analysis (PCA), attribute selection lists, and natural clusters in attribute space are candidate inputs to machine learning engines that can operate on these data to train neural network topologies and generate autopicked geobodies. This paper sets out a unified mathematical framework for the process from seismic samples to geobodies.  SOM is discussed in the context of inversion as a dimensionality-reducing classifier to deliver a winning neuron set.  PCA is a means to more clearly illuminate features of a particular class of geologic geobodies.  These principles are demonstrated with geobody autopicking below conventional thin bed resolution on a standard wedge model.


Seismic attributes are now an integral component of nearly every 3D seismic interpretation.  Early development in seismic attributes is traced to Taner and Sheriff (1977).  Attributes have a variety of purposes for both general exploration and reservoir characterization, as laid out clearly by Chopra and Marfurt (2007).  Taner (2003) summarizes attribute mathematics with a discussion of usage.

Self-Organizing Maps (SOM) are a type of unsupervised neural network that self-trains in the sense that it obtains information directly from the data.  The SOM neural network is completely self-taught, in contrast to the perceptron and its various cousins, which undergo supervised training.  The winning neuron set that results from training then classifies the training samples to test itself by finding the nearest neuron to each training sample (the winning neuron).  In addition, other data may be classified as well.  First discovered by Kohonen (1984), then advanced and expanded by its success in a number of areas (Kohonen, 2001; Laaksonen, 2011), SOM has become a part of several established neural network textbooks, namely Haykin (2009) and Duda, Hart and Stork (2001).  Although the style of SOM discussed here has been used commercially for several years, only recently have results on conventional DHI plays been published (Roden, Smith and Sacrey, 2015).
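The self-training and self-testing described above can be sketched in a few lines of Python. This is a generic textbook-style Kohonen update, not the commercial implementation referred to in the text; the function names, the linear decay schedules for learning rate and neighborhood radius, and the two synthetic clusters are all assumptions made for illustration.

```python
import numpy as np

def train_som(samples, rows, cols, epochs=20, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols self-organizing map on samples of shape (n, dim)."""
    rng = np.random.default_rng(seed)
    n, _ = samples.shape
    # Fixed positions of the neurons in 2D neuron topology space.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Start each neuron at a randomly chosen sample in attribute space.
    neurons = samples[rng.integers(0, n, rows * cols)].astype(float)
    sigma0 = sigma0 or max(rows, cols) / 2.0
    total_steps = epochs * n
    step = 0
    for _ in range(epochs):
        for idx in rng.permutation(n):
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)                  # learning rate decays to zero
            sigma = sigma0 * (1.0 - frac) + 1e-3     # neighborhood radius shrinks
            x = samples[idx]
            # Winning neuron: nearest neuron to this sample in attribute space.
            winner = np.argmin(((neurons - x) ** 2).sum(axis=1))
            # Neighbors of the winner on the grid learn too, with less weight.
            grid_dist2 = ((grid - grid[winner]) ** 2).sum(axis=1)
            influence = np.exp(-grid_dist2 / (2.0 * sigma ** 2))
            neurons += lr * influence[:, None] * (x - neurons)
            step += 1
    return neurons

def classify(samples, neurons):
    """Return the winning-neuron index per sample and the mean distance to it."""
    dist2 = ((samples[:, None, :] - neurons[None, :, :]) ** 2).sum(axis=2)
    winners = dist2.argmin(axis=1)
    mean_error = float(np.sqrt(dist2.min(axis=1)).mean())
    return winners, mean_error

# Two synthetic natural clusters in a 2-attribute space (made-up values).
rng = np.random.default_rng(1)
samples = np.vstack([rng.normal(0.0, 0.5, size=(100, 2)),
                     rng.normal(10.0, 0.5, size=(100, 2))])
neurons = train_som(samples, rows=2, cols=2)
winners, mean_error = classify(samples, neurons)
print("neurons:\n", neurons)
print("mean distance to winning neuron:", mean_error)
```

No labels are supplied at any point; the neurons learn only from the samples. The final `classify` call is the self-test: it assigns every training sample to its nearest neuron and reports how well the neuron set fits.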

Three Spaces

The concept of framing seismic attributes as multi-attribute seismic samples for SOM training and classification was presented by Taner, Treitel, and Smith (2009) in an SEG Workshop.  In that presentation, survey data and their computed attributes reside in survey space.  The neural network resides in neuron topology space.  These two meet in attribute space where neurons hunt for natural clusters and learn their characteristics.

Results were shown for 3D surveys over the venerable Stratton Field and a Gulf of Mexico salt dome.  The Stratton Field SOM results clearly demonstrated that there are continuous geobody events in the weak reflectivity zone between C38 and F11 events, some of which are well below seismic tuning thickness, that could be tied to conventional reflections and which correlated with wireline logs at the wells.  Studies of SOM machine learning of seismic models were presented by Smith and Taner (2010).  They showed how winning neurons distribute themselves in attribute space in proportion to the density of multi-attribute samples.  Finally, interpretation of SOM salt dome results found a low probability zone where multi-attribute samples of poor fit correlated with an apparent salt seal and DHI down-dip conformance (Smith and Treitel, 2010).

Survey Space to Attribute Space:

Ordinary seismic samples of amplitude traces in a 3D survey may be described as an ordered set {x_{c,d,e}}, with subscripts c, d and e for sample, trace and line. A multi-attribute survey is a “Super 3D Survey” constructed by combining a number of attribute surveys with the amplitude survey. This adds another dimension to the set and another subscript, so the new set of samples including the additional attributes is {x_{c,d,e,f}}. These data may be thought of as separate surveys or, equivalently, separate samples within one survey. Within a single survey, each sample is a multi-attribute vector. This reduces the subscript count by one, so the set of multi-attribute vectors is {X_{c,d,e}}, where each X_{c,d,e} collects the F attribute values at one survey location.

Next, a two-way mapping function may be defined that references the location of any sample in the 3D survey by single and triplet indices, i ↔ (c,d,e). Now the three survey coordinates may be gathered into a single index, so the multi-attribute vector samples are also an unordered set in attribute space, {X_i}. The index map is a way to find a sample in attribute space from survey space and vice versa.
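The two-way index map can be sketched in a few lines. This is only an illustration: the function name make_index_map and the convention that the sample index varies fastest are assumptions, not something specified in the text.

```python
def make_index_map(n_samples, n_traces, n_lines):
    """Build a two-way map between triplet (c, d, e) and single index i.

    Assumed ordering (hypothetical): sample index c varies fastest,
    then trace index d, then line index e.
    """
    def to_single(c, d, e):
        return c + n_samples * (d + n_traces * e)

    def to_triplet(i):
        c = i % n_samples
        d = (i // n_samples) % n_traces
        e = i // (n_samples * n_traces)
        return c, d, e

    return to_single, to_triplet


# A tiny survey: 4 samples per trace, 3 traces per line, 2 lines.
to_single, to_triplet = make_index_map(n_samples=4, n_traces=3, n_lines=2)
i = to_single(2, 1, 1)   # survey space -> attribute space index
cde = to_triplet(i)      # attribute space index -> survey space
```

Any one-to-one ordering would serve equally well; what matters is that a sample classified in attribute space can always be placed back at its survey location.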

Multi-attribute sample and set in attribute space: 

A multi-attribute seismic sample is a column vector in an ordered set of three subscripts c,d,e representing sample index, trace index, and line index. Survey bins refer to indices d and e. These samples may also be organized into an unordered set with subscript i. They are members of an F-dimensional real space, where F is the number of attributes. The attribute data are normalized, so in fact multi-attribute samples reside in scaled attribute space.
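The text does not say which normalization is used. As one common possibility, the sketch below scales each attribute to zero mean and unit standard deviation (a z-score); the function name scale_attributes and the sample values are illustrative assumptions.

```python
from statistics import mean, pstdev


def scale_attributes(samples):
    """Z-score each attribute (column) of a list of multi-attribute samples.

    Assumption: z-score scaling; the source only says the data are normalized.
    """
    columns = list(zip(*samples))
    means = [mean(col) for col in columns]
    # Guard against constant attributes, whose standard deviation is zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in samples
    ]


# Two attributes with very different numerical ranges (made-up values).
samples = [[1200.0, 0.10], [-800.0, 0.30], [400.0, 0.20], [-800.0, 0.40]]
scaled = scale_attributes(samples)
```

Without some scaling of this kind, Euclidean distances in attribute space would be dominated by whichever attribute happens to have the largest numerical range.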

Natural clusters in attribute space: 

Just as there are reflecting horizons in survey space, there must be clusters of coherent energy in attribute space. Random samples, which carry no information, are uniformly distributed in attribute space just as in survey space. The set {C_m | m ∈ [1, M]} of natural clusters in attribute space is unordered and contains M members. Here, the brackets [1, M] indicate an index range. The natural clusters may reside anywhere in attribute space, but attribute space is filled with multi-attribute samples, only some of which belong to meaningful natural clusters. Natural clusters may be big or small, tightly packed or diffuse. The rest of the samples are scattered throughout F-space. Natural clusters are discovered in attribute space with learning machines imbued with simple training rules and aided by properties of their neural networks.

A single natural cluster: 

A natural cluster C_m may have N elements in it. Each element is taken from the pool of all multi-attribute samples {X_i}. Every natural cluster may have a different number of multi-attribute samples associated with it, so for any natural cluster C_m, N = N(m). Every natural cluster has its own unique properties described by the subset of samples that are associated with it. Some sample subsets associated with a winning neuron are small (“not so popular”) and some subsets are large (“very popular”). The distribution of Euclidean distances may be tight (“packed”) or loose (“diffuse”).

Geobody sample and geobody set in survey space: 

For this presentation, a geobody G_b is defined as a contiguous region in survey space composed of elements which are identified by members g. The members of a geobody are an ordered set {g_{c,d,e}} which registers with the coordinates of members of the multi-attribute seismic survey {X_{c,d,e}}.

A geobody member is just an identification number (id), an integer b ∈ [1, B]. Although the 3D seismic survey is a fully populated “brick” with members X_{c,d,e}, the geobody members g_{c,d,e} register at certain contiguous locations, but not all of them. The geobody G_b is an amorphous, but contiguous, “blob” within the “brick” of the 3D survey. The coordinates of the geobody blob in the earth are the survey triplets (c,d,e) at which its members register. By this, all the multi-attribute samples in the geobody may be found, given the id and three survey coordinates of a seed point.
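Finding every member of a contiguous blob from a seed point can be sketched as a flood fill. The sketch below assumes the ids are stored in a nested list indexed [c][d][e] and that "contiguous" means face-connected (six neighbors); both are assumptions for illustration, and the name grow_geobody is hypothetical.

```python
from collections import deque


def grow_geobody(id_volume, seed):
    """Collect the face-connected samples that share the seed's id.

    id_volume: nested list indexed [c][d][e] (assumed layout).
    seed: (c, d, e) survey coordinates of a seed point.
    """
    n_c, n_d, n_e = len(id_volume), len(id_volume[0]), len(id_volume[0][0])
    target = id_volume[seed[0]][seed[1]][seed[2]]
    members = {seed}
    queue = deque([seed])
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    while queue:
        c, d, e = queue.popleft()
        for dc, dd, de in steps:
            nc, nd, ne = c + dc, d + dd, e + de
            inside = 0 <= nc < n_c and 0 <= nd < n_d and 0 <= ne < n_e
            if inside and (nc, nd, ne) not in members and id_volume[nc][nd][ne] == target:
                members.add((nc, nd, ne))
                queue.append((nc, nd, ne))
    return members


# Small "brick" with id 7 forming one connected blob plus one isolated sample.
id_volume = [
    [[7, 7, 0], [0, 7, 0], [0, 0, 7]],
    [[0, 0, 0], [0, 7, 0], [0, 0, 0]],
]
blob = grow_geobody(id_volume, seed=(0, 0, 0))
```

The isolated sample at (0, 2, 2) carries the same id but is not reached from the seed, so under this definition of contiguity it is not part of the blob.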

A single geobody in survey space

Each geobody G_b is a set of N geobody members with the same id. That is, there are N members in G_b, so N = N(b). The geobody members for this geobody are taken from the pool of all geobody samples, the set {g_{c,d,e}}. Some geobodies are small and others large. Some are tabular, some lenticular, some channels, faults, columns, etc. So how are geobodies and natural clusters related?

A geobody is not a natural cluster

{G_b | b ∈ [1, B]} ≠ {C_m | m ∈ [1, M]}

This expression is short but sweet. It says a lot. On the left is the set of all B geobodies. On the right is the set of M natural clusters. The expression says that these two sets aren’t the same. On the left, the geobody members are id numbers. These are in survey space. On the right, the natural clusters are sets of multi-attribute samples. These are in attribute space. What this means is that geobodies are not directly revealed by natural clusters. So, what is missing?

Interpretation is conducted in survey space.  Machine learning is conducted in attribute space.  Someone has to pick the list of attributes.  The attributes must be tailored to the geological question at hand.  And a good geological question is always the best starting point for any interpretation.

A natural cluster is an imaged geobody

C_m = {I(G_i), N}

Here, a natural cluster C_m is defined as an unorganized set of two kinds of objects: a function I of a set of geobodies G_i, and random noise N. The number of geobodies is I and is unspecified. The function I is an illumination function which places the geobodies in attribute space. The illumination function is defined by the choice of attributes. This is the attribute selection list. The number of geobodies in a natural cluster C_m is zero or more, 0 ≤ i ≤ I. The geobodies are distributed throughout the 3D survey.

The natural cluster concentrates geobodies of similar illumination properties. If there are no geobodies, or there is no illumination with a particular attribute selection list, the natural cluster contains only N, so the set is only noise. The attribute selection list is a critically important part of multi-attribute seismic interpretation. The wrong attribute list may not illuminate any geobodies at all.

Geobody inversion from a math perspective

Multi-attribute seismic interpretation proceeds from the preceding equation in three parts. First, as part of an inversion process, a natural cluster C_m is statistically estimated by a machine learning classifier such as SOM with a neural network topology. See Chopra, Castagna and Portniaguine (2006) for a contrasting inversion methodology. Secondly, SOM employs a simple training rule: the neuron nearest a selected training sample is declared the winner, and the winning neuron advances toward the sample by a small amount. Neurons are trained by attraction to samples. One complete pass through the training samples is called an epoch. Other machine learning algorithms have other training rules to adapt to data. Finally, SOM has a dimensionality-reducing feature because information contained in natural clusters is transferred (imperfectly) to the winning neuron set in the finalized neural network topology through cooperative learning. Neurons in the winning neuron's topological neighborhood move along with the winning neuron in attribute space. SOM training is also dynamic in that the size of the neighborhood decreases with each training time step, so that eventually the neighborhood shrinks to the winning neuron alone and all subsequent training steps are competitive.
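The training rule described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the commercial implementation: it uses a rectangular neuron grid, initializes neurons at randomly chosen samples, and decays both the learning rate and the neighborhood radius linearly. The names train_som and find_winner and all parameter values are hypothetical.

```python
import math
import random


def find_winner(neurons, x):
    """Return the grid position of the neuron nearest to sample x."""
    return min(neurons, key=lambda pos: math.dist(neurons[pos], x))


def train_som(samples, rows, cols, epochs=20, lr0=0.5, radius0=None, seed=0):
    """Train a small SOM; returns {grid position: weight vector}."""
    rng = random.Random(seed)
    dim = len(samples[0])
    if radius0 is None:
        radius0 = max(rows, cols) / 2.0
    # Assumption: neurons start at randomly chosen training samples.
    neurons = {(r, c): list(rng.choice(samples))
               for r in range(rows) for c in range(cols)}
    total_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):            # one epoch = one pass through the samples
        order = samples[:]
        rng.shuffle(order)
        for x in order:
            frac = step / total_steps
            lr = lr0 * (1.0 - frac)            # step size shrinks over time
            radius = radius0 * (1.0 - frac)    # neighborhood shrinks over time
            win = find_winner(neurons, x)
            for pos, w in neurons.items():
                # Cooperative learning: grid neighbors of the winner also move.
                # Once radius < 1 only the winner moves (competitive learning).
                if math.dist(pos, win) <= radius:
                    for k in range(dim):
                        w[k] += lr * (x[k] - w[k])
            step += 1
    return neurons


# Two synthetic natural clusters in a 2-attribute space.
rng = random.Random(1)
cluster_a = [[rng.gauss(-2.0, 0.2), rng.gauss(-2.0, 0.2)] for _ in range(50)]
cluster_b = [[rng.gauss(2.0, 0.2), rng.gauss(2.0, 0.2)] for _ in range(50)]
neurons = train_som(cluster_a + cluster_b, rows=3, cols=3)
winner_a = find_winner(neurons, [-2.0, -2.0])
winner_b = find_winner(neurons, [2.0, 2.0])
```

After training, classification is simply a call to find_winner for every sample, whether or not the sample took part in training.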

Because the natural cluster found by the classifier is a statistical estimate, let it be called the statistical estimate of the “signal” part of C_m. The true geobody is independent of an illumination function. The dimensionality reduction associated with multi-attribute interpretation has the purpose of geobody recognition through identification, dimensionality reduction and classification. In fact, in the chain of steps there is a mapping and un-mapping process with no guarantee that the geobody will be recovered.

However, the illumination function I may be inappropriate to illuminate the geobody in F-space because of a poor choice of attributes. So at best, the geobody is illuminated by an imperfect set of attributes and detected by a classifier that is primitive. The results often must be combined, edited and packaged into useful, interpreted geobody units, ready to be incorporated into an evolving geomodel on which the interpretation will rest.

Attribute Space Illumination

One fundamental aspect of machine learning is dimensionality reduction from attribute space because its dimensions are usually beyond our grasp.  The approach taken here is from the perspective of manifolds which are defined as spaces with the property of “mapability” where Euclidean coordinates may be safely employed within any local neighborhood (Haykin, 2009, p.437-442).

The manifold assumption is important because SOM learning is routinely conducted on multi-attribute samples in attribute space using Euclidean distances to move neurons during training. One of the first concerns of dimensionality reduction is the potential to lose details in natural clusters. In practice, it has been found that halving the original amplitude sample interval is advantageous, but resampling to still finer intervals has not proven to be beneficial. Infilling a natural cluster allows neurons during competitive training to adapt to subtle details that might be missed in the original data.

Curse of Dimensionality

The Curse of Dimensionality (Haykin, 2009) is, in fact, many curses. One problem is that the number of samples needed to sample a space uniformly increases dramatically with increasing dimensionality. This has implications when gathering training samples for a neural network. For example, cutting a unit length bar (1-D) with a sample interval of 0.01 results in 100 samples. Dividing a unit length hypercube in 10-D with a similar sample interval results in 10^20 samples (100^10 = 10^(2 x 10)). If the nature of attribute space requires uniform sampling across a broad numerical range, then a large number of attributes may not be practical. However, uniform sampling is not an issue here because the objective is to locate and detail features of natural clusters.
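The arithmetic is easy to check. The helper name uniform_sample_count below is only illustrative.

```python
def uniform_sample_count(n_dims, interval):
    """Number of cells when a unit hypercube is cut at a fixed interval."""
    per_axis = round(1.0 / interval)
    return per_axis ** n_dims


count_1d = uniform_sample_count(1, 0.01)    # unit bar
count_10d = uniform_sample_count(10, 0.01)  # unit hypercube in 10-D
```

Each added attribute multiplies the count by another factor of 100 at this interval, which is why uniform coverage quickly becomes impractical.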

Also, not all attributes are important. In the hunt for natural clusters, PCA (Haykin, 2009) is often a valuable tool to assess the relative merits of each attribute in a SOM attribute selection list. Depending on geologic objectives, several dominant attributes may be picked from the first, second or even third principal eigenvectors, or all attributes may be picked from one principal eigenvector.
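As a rough illustration of how PCA can rank attributes, the sketch below estimates the first principal eigenvector of the attribute covariance matrix by power iteration and sorts attributes by the size of their loadings. The attribute names, the synthetic data and the use of power iteration are assumptions made for this example; a production workflow would more likely use a full eigendecomposition from a numerical library.

```python
import math
import random


def covariance_matrix(samples):
    """Sample covariance of a list of multi-attribute samples."""
    n, dim = len(samples), len(samples[0])
    means = [sum(row[k] for row in samples) / n for k in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for row in samples:
        centered = [row[k] - means[k] for k in range(dim)]
        for a in range(dim):
            for b in range(dim):
                cov[a][b] += centered[a] * centered[b] / (n - 1)
    return cov


def first_eigenvector(matrix, iterations=500):
    """Dominant eigenvalue and eigenvector by power iteration.

    The start vector is arbitrary; it must not be orthogonal to the
    dominant eigenvector for the iteration to converge to it.
    """
    dim = len(matrix)
    vec = [1.0 / (k + 1) for k in range(dim)]
    for _ in range(iterations):
        nxt = [sum(matrix[a][b] * vec[b] for b in range(dim)) for a in range(dim)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    product = [sum(matrix[a][b] * vec[b] for b in range(dim)) for a in range(dim)]
    eigenvalue = sum(v * p for v, p in zip(vec, product))
    return eigenvalue, vec


# Hypothetical attributes: two share a common signal, the third is pure noise.
rng = random.Random(0)
names = ["attr_a", "attr_b", "attr_noise"]
samples = []
for _ in range(200):
    signal = rng.gauss(0.0, 1.0)
    samples.append([signal + rng.gauss(0.0, 0.1),
                    signal + rng.gauss(0.0, 0.1),
                    rng.gauss(0.0, 1.0)])

eigenvalue, loadings = first_eigenvector(covariance_matrix(samples))
ranked = sorted(zip(names, loadings), key=lambda pair: -abs(pair[1]))
```

Attributes with the largest absolute loadings on a principal eigenvector are the candidates for the attribute selection list; repeating the exercise on the second and third eigenvectors would require deflation or a full eigendecomposition.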

Geobody inversion from an interpretation perspective

Multi-attribute seismic interpretation is finding geobodies in survey space aided by machine learning tools that hunt for natural clusters in attribute space.  The interpreter’s critical role in this process is the following:

  • Choose questions that carry exploration toward meaningful conclusions.
  • Be creative with seismic attributes so as to effectively address illumination of geologic geobodies.
  • Pick attribute selection lists with the assistance of PCA.
  • Review the results of machine learning which may identify interesting geobodies in natural clusters autopicked by SOM.
  • Look through the noise to edit and build geobodies with a workbench of visualization displays and a variety of statistical decision-making tools.
  • Construct geomodels by combining autopicked geobodies which in turn allow predictions on where to make better drilling decisions.

The Geomodel

After classification, picking geobodies from their winning neurons starts by filling an empty geomodel. Natural clusters are consolidators of geobodies with common properties in attribute space, so M < B. In fact, it is often found that M << B. That is, geobodies “stack” in attribute space. Seismic data is noisy. Natural clusters are consequently statistical. Not every sample g classified by a winning neuron is important, although SOM classifies every sample. Samples that are a poor fit are probably noise. Construction of a sensible geomodel depends on asking well-thought-out geological questions, phrased through the choice of appropriate attribute selection lists.
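One simple way to separate poor-fit samples from the rest is to classify each sample by its nearest neuron and flag it when the distance is large. The fixed distance threshold used below is an assumption for illustration; the text does not specify how fit is measured. The names classify and flag_poor_fit and the neuron positions are hypothetical.

```python
import math


def classify(samples, neurons):
    """Return (winning neuron index, distance to it) for every sample."""
    results = []
    for x in samples:
        dists = [math.dist(x, w) for w in neurons]
        win = min(range(len(neurons)), key=dists.__getitem__)
        results.append((win, dists[win]))
    return results


def flag_poor_fit(classified, max_distance):
    """True where a sample lies farther than max_distance from its winner."""
    return [dist > max_distance for _, dist in classified]


# Two trained neurons and four scaled samples (made-up values).
neurons = [[0.0, 0.0], [3.0, 3.0]]
samples = [[0.1, -0.1], [2.9, 3.2], [1.5, 1.4], [0.0, 0.2]]
classified = classify(samples, neurons)
winners = [win for win, _ in classified]
poor = flag_poor_fit(classified, max_distance=0.5)
```

Every sample still receives a winning neuron, but the third sample sits between the two neurons and would be treated as probable noise when geobodies are picked.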

Working below classic seismic tuning thickness

Classical seismic tuning thickness is λ/4. Combining vertical incidence layer thickness with λ = V/f leads to a critical layer thickness. Resolution below classical seismic tuning thickness has been demonstrated with multi-attribute seismic samples and a machine learning classifier operating on those samples in scaled attribute space (Roden et al., 2015). High-quality natural clusters in attribute space imply tight, dense balls (low entropy, high density). SOM training and classification of a classical wedge model at three noise levels are shown in Figures 1 and 2, which show tracking well below tuning thickness.
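The step from λ/4 to a critical layer thickness can be written out as follows. This is a sketch under the assumption that layer thickness is measured in two-way travel time; the symbols Δz (thickness in depth), Δt (thickness in two-way time) and the subscript "tuning" are introduced here for illustration, with V the interval velocity and f the dominant frequency.

```latex
\Delta z = \frac{V\,\Delta t}{2}, \qquad \lambda = \frac{V}{f}
\quad\Longrightarrow\quad
\Delta z_{\text{tuning}} = \frac{\lambda}{4} = \frac{V}{4f}
\quad\Longrightarrow\quad
\Delta t_{\text{tuning}} = \frac{2\,\Delta z_{\text{tuning}}}{V} = \frac{1}{2f}
```

For example, with f = 30 Hz and V = 3000 m/s, the tuning thickness is 25 m in depth, or about 16.7 ms in two-way time.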

  • Seismic Processing: Processing the survey at a fine sample interval is preferred over resampling the final survey to a fine sample interval. The highest S/N ratio is always preferred.
  • Preprocessing: Resample the base survey to a fine sample interval first, which raises the density of natural clusters, and then compute attributes. Do not compute attributes and then resample, because some attributes are not continuous functions. Derive all attributes from a single base survey in order to avoid misties.
  • Attribute Selection List: Prefer attributes that address the specific properties of an intended geologic geobody. Working below tuning, prefer instantaneous attributes over attributes requiring spatial sampling.

Thin bed results on 3D surveys in the Eagle Ford Shale Facies of South Texas and in the Alibel horizon of the Middle Frio Group, onshore Texas, have been corroborated with extensive well control, verifying consistent results for more accurate mapping of facies below tuning without the usual frequency assumptions (Roden, Smith, Santogrossi and Sacrey, personal communication, 2017).


There is a firm mathematical basis for a unified treatment of multi-attribute seismic samples, natural clusters, geobodies and machine learning classifiers such as SOM.  Interpretation of multi-attribute seismic data is showing great promise, having demonstrated resolution well below conventional seismic thin bed resolution due to high-quality natural clusters in attribute space which have been detected by a robust classifier such as SOM.


I am thankful to have worked with two great geoscientists, Tury Taner and Sven Treitel during the genesis of these ideas.  I am also grateful to work with an inspired and inspiring team of coworkers who are equally committed to excellence.  In particular, Rocky Roden and Deborah Sacrey are longstanding associates with a shared curiosity to understand things and colleagues of a hunter’s spirit.

Figure 1: Wedge models for three noise levels trained and classified by SOM with attribute list of amplitude and Hilbert transform (not shown) on 8 x 8 hexagonal neuron topology. Upper displays are amplitude. Middle displays are SOM classifications with a smooth color map. Lower displays are SOM classifications with a random color map. The rightmost vertical column is an enlargement of wedge model tips at highest noise level.  Multi-attribute classification samples are clearly tracking well below tuning thickness which is left of the center in the right column displays.

Figure 2: Attribute space for three wedge models with horizontal axis of amplitude and vertical axis of Hilbert transform. Upper displays are multi-attribute samples before SOM training and lower displays after training and samples classified by winning neurons in lower left with smooth color map.  Upper right is an enlargement of tip of third noise level wedge model from Figure 1 where below-tuning bed thickness is right of the thick vertical black line.


Chopra, S., J. Castagna and O. Portniaguine, 2006, Thin-bed reflectivity inversion, Extended abstracts, SEG Annual Meeting, New Orleans.

Chopra, S. and K.J. Marfurt, 2007, Seismic attributes for prospect identification and reservoir characterization, Geophysical Developments No. 11, SEG.

Duda, R.O., P.E. Hart and D.G. Stork, 2001, Pattern Classification, 2nd ed.: Wiley.

Haykin, S., 2009, Neural networks and learning machines, 3rd ed.: Pearson.

Kohonen, T., 1984, Self-organization and associative memory, pp. 125-245, Springer-Verlag, Berlin.

Kohonen, T., 2001, Self-organizing maps: Third extended edition, Springer Series in Information Sciences.

Laaksonen, J. and T. Honkela, 2011, Advances in self-organizing maps, 8th International Workshop, WSOM 2011, Espoo, Finland, Springer.

Ma, Y. and Y. Fu, 2012, Manifold Learning Theory and Applications, CRC Press, Boca Raton.

Roden, R., T. Smith and D. Sacrey, 2015, Geologic pattern recognition from seismic attributes, principal component analysis and self-organizing maps, Interpretation, SEG, November 2015, SAE59-83.

Smith, T. and M.T. Taner, 2010, Natural clusters in multi-attribute seismics found with self-organizing maps: Source and signal processing section paper 5: Presented at Robinson-Treitel Spring Symposium by GSH/SEG, Extended Abstracts.

Smith, T. and S. Treitel, 2010, Self-organizing artificial neural nets for automatic anomaly identification, Expanded abstracts, SEG Annual Convention, Denver.

Taner, M.T., 2003, Attributes revisited, accessed 22 March 2017.

Taner, M.T. and R.E. Sheriff, 1977, Application of amplitude, frequency, and other attributes, to stratigraphic and hydrocarbon determination, in C.E. Payton, ed., Applications to hydrocarbon exploration: AAPG Memoir 26, 301–327.

Taner, M.T., S. Treitel and T. Smith, 2009, Self-organizing maps of multi-attribute 3D seismic reflection surveys, Presented at the 79th International SEG Convention, SEG 2009 Workshop on “What’s New in Seismic Interpretation,” Paper no. 6.

THOMAS A. SMITH is president and chief executive officer of Geophysical Insights, which he founded in 2008 to develop machine learning processes for multiattribute seismic analysis. Smith founded Seismic Micro-Technology in 1984, focused on personal computer-based seismic interpretation. He began his career in 1971 as a processing geophysicist at Chevron Geophysical. Smith is a recipient of the Society of Exploration Geophysicists’ Enterprise Award, Iowa State University’s Distinguished Alumni Award and the University of Houston’s Distinguished Alumni Award for Natural Sciences and Mathematics. He holds a B.S. and an M.S. in geology from Iowa State, and a Ph.D. in geophysics from the University of Houston.