Sources of Value in Seismic Imaging
The Earth's Interior can be decomposed everywhere into two elements: first, the geometry of all rock layers beneath the surface, called the Structure; and second, the velocity distribution of those rocks. These velocities refer to the speed of sound transmitted through a particular rock across the structure. Velocity is an intrinsic property of the nature of each rock, and together these velocities constitute the Velocity Field. Structure and Velocity Field are independent variables of the Earth's Interior that, when combined, give rise to the Velocity Model of a given place on the Earth.
Subsurface Seismic Imaging consists in simultaneously determining the Structure and the Velocity Field of a region of the Earth from a single seismic experiment.
Since there are two unknown variables, Structure and Velocity Field, and a single experiment, the solution to the problem is indeterminate. The only way to solve it is by iterating from an initial Velocity Model that somehow has to be guessed before the process starts.
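The iterative scheme just described can be sketched as follows. This is a deliberately toy illustration, not the actual workflow: `migrate`, `update_model` and the "focusing" criterion are hypothetical stand-ins invented for the example.

```python
# Toy sketch of iterative Velocity Model building. All function names
# and the update rule are illustrative, not real migration code.

def migrate(data, velocity_model):
    """Stand-in for a migration: maps recorded data to an image using
    the current velocity model (here a trivial per-sample operator)."""
    return [d / v for d, v in zip(data, velocity_model)]

def update_model(model, image):
    """Stand-in for velocity analysis: nudge each velocity toward the
    value that 'focuses' the image (image amplitude -> 1.0)."""
    return [v * pixel ** 0.5 for v, pixel in zip(model, image)]

def build_velocity_model(data, initial_model, iterations=20):
    """The loop described in the text: start from a guessed model,
    migrate, analyse the image, update the model, and repeat."""
    model = list(initial_model)
    for _ in range(iterations):
        image = migrate(data, model)
        model = update_model(model, image)
    return model

# A guessed starting model converges toward velocities that focus
# the toy data.
final = build_velocity_model(data=[2.0, 3.0, 4.0],
                             initial_model=[1.0, 1.0, 1.0])
print([round(v, 2) for v in final])   # -> [2.0, 3.0, 4.0]
```

The point of the sketch is only the structure: an outer loop whose every pass performs one full (and, in production, very expensive) migration.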
There are three Sources of Value in Seismic Imaging: the Velocity Model, the Algorithms used in the iterations, and the Capacity of the computers used to iterate.
Focusing seismic energy in Seismic Imaging is exactly the same process as focusing light in Optics: the Velocity Model plays the same role as a lens. Repsol-YPF strongly believes that the greatest value added during the Seismic Imaging process comes from the time spent "crafting the lens", i.e. building the Velocity Model. Moreover, even as new algorithms provide clearer subsalt images, producing spatially accurate images remains a problem, resolvable only with a detailed Velocity Model.
To maximize the time the data spend in the hands of the interpreters building the Velocity Model, the computation time has to be minimized, which requires a very fast computational solution.
Within Kaleidoscope, the Velocity Model Building process is improved by developing tools that reduce the need for subjective and sometimes inconsistent human interaction while, at the same time, improving turnaround for large, dense 3D seismic datasets.
Algorithms are crucial for the quality of the final image. Seismic Imaging algorithms have been known since the sixties, but they could not be implemented due to the lack of adequate or affordable hardware. The affordability of the hardware is growing faster than the capacity to progress in algorithm implementation and coding.
Because of the lack of coordination between algorithm builders and hardware manufacturers, there is a real race to provide newer and faster algorithms and codes by taking shortcuts and making compromises with respect to the original algorithms. This competition is leading to a situation that has been described as "algorithm pollution", referring to the number of different flavors in which Seismic Imaging technologies are implemented.
The Kaleidoscope Project tackles this problem from a different perspective: collaboration between algorithm coders (3DGeo, BSC) and hardware manufacturers (IBM). In doing so, research in hardware and software proceeds in parallel, so imaging quality is not compromised by trade-offs between algorithm accuracy and algorithm speed. The speed will come from the Cell/B.E. (*) processor, the I/O improvements and the tailoring of codes to the Cell/B.E. (*) processor.
The imaging algorithms being developed are grouped in three categories:
- Anisotropic One-pass and Two-pass Shot Profile Migration
- Plane-Wave Migration & Migration in Tilted Coordinates (PWSPM)
- Anisotropic Reverse Time Migration (RTM)
Every iteration of an average Seismic Imaging production processing project requires on the order of 10^20 floating point operations. That means that every iteration on an average 10 Tflops machine requires about four months. On the other hand, such an iteration would require only about a day and a half on a peta-scale machine.
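The stated runtimes are mutually consistent with an operation count of about 10^20 flops per iteration, as a quick back-of-the-envelope check shows:

```python
# Consistency check of the figures in the text: ~1e20 floating point
# operations per iteration, run on a 10 Tflop/s and a 1 Pflop/s machine.
FLOP_PER_ITERATION = 1e20   # order of magnitude implied by the runtimes

tera = 10e12    # 10 Tflop/s
peta = 1e15     # peta-scale: 1 Pflop/s

seconds_tera = FLOP_PER_ITERATION / tera   # 1e7 seconds
seconds_peta = FLOP_PER_ITERATION / peta   # 1e5 seconds

print(seconds_tera / 86400 / 30)   # ~3.9 (months) -- "four months"
print(seconds_peta / 86400)        # ~1.16 (days) -- "a day and a half"
```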
The need for Capacity is obvious in Seismic Imaging. The amazing price-performance ratio of Linux PC-Clusters made Seismic Imaging technology a reality. Wave Equation algorithms only became widely available to the industry around 2003. The evolution of the algorithms, and their application to exploration in increasingly geologically complex areas, is a consequence of the ever increasing performance of PC-Clusters.
During the last five years, the computing power needed for seismic imaging in the oil industry has increased by two orders of magnitude, and the storage and I/O needs by three orders of magnitude. At present, and considering the new algorithms to come (Waveform Inversion, Plane Wave Reverse Time Migration, ...), there is no indication that this rate will decrease.
Capacity evolution is predictable. By plotting the Top500 computers by year on a log-log scale, it can be predicted that in three years peta-scale capacity will be widely available.
The problem is not when peta-scale capacity will become widely available, but how. It is clear that widely available peta-scale capacity will require:
- A multielement (multicore) processor, given that CPU frequency has reached a limit. Any multielement processor will be difficult to program.
- This multielement processor must have very low power consumption to make peta-scale capacity economically viable and technically feasible.
- Widely available peta-scale capacity requires a very cheap processor. That implies mass production in volumes larger than the current production of processors for the PC industry.
The Cell/B.E. processor meets these three requirements and, in addition, testing shows that the present Cell/B.E. outperforms present superscalar processors in FFTs (up to 40 times) and stencil computations (more than 15 times), with a 10-times reduction in power consumption.
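To make concrete what a stencil computation is in this context, here is a minimal second-order finite-difference time step for the 2D acoustic wave equation, the kind of kernel at the heart of migration algorithms such as RTM. This is a pure-Python sketch for clarity only; production kernels are vectorized and tuned for the target processor.

```python
# Minimal 5-point stencil: one leapfrog time step of the 2D acoustic
# wave equation. Illustrative sketch, not production code.

def wave_step(prev, cur, vel2_dt2, h2):
    """next = 2*cur - prev + (v*dt)^2 * laplacian(cur)."""
    n = len(cur)
    nxt = [row[:] for row in cur]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            lap = (cur[i-1][j] + cur[i+1][j] + cur[i][j-1] + cur[i][j+1]
                   - 4.0 * cur[i][j]) / h2
            nxt[i][j] = 2.0 * cur[i][j] - prev[i][j] + vel2_dt2[i][j] * lap
    return nxt

# Propagate a point source on a small homogeneous grid.
n = 11
prev = [[0.0] * n for _ in range(n)]
cur = [[0.0] * n for _ in range(n)]
cur[5][5] = 1.0                               # initial impulse
v2dt2 = [[0.1] * n for _ in range(n)]         # (v*dt)^2, CFL-stable
for _ in range(5):
    prev, cur = cur, wave_step(prev, cur, v2dt2, h2=1.0)
print(cur[5][5], cur[5][4])                   # energy has spread outward
```

Each output point reads a small fixed neighborhood of the input grid, which is why such kernels are memory-bandwidth bound and benefit so strongly from the Cell/B.E.'s local stores and from blocking.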
Migration software must simultaneously deliver maximum performance and maximum flexibility to simulate different scenarios. It is critical to exploit the different levels of parallelism with maximum efficiency. In Kaleidoscope we manage the following levels of parallelism:
- Grid level: All the shots in a migration algorithm can be processed in parallel; the different computational nodes of a cluster can run different shots simultaneously. This is called Grid or Application level parallelism, and we use the Grid-SuperScalar programming model to exploit it. This level requires managing workflows with data dependencies defined by input/output files. Moreover, a fault tolerance mechanism is critical at this level because of the duration of a complete migration execution. The parallel efficiency of this level is 100%. The key issue at this level is the simplicity of expressing the workflows.
- Process level: Each individual shot requires certain hardware resources. If these exceed the resources available in a single computational node of the cluster, the shot execution must be split across several computational nodes using domain decomposition techniques. We use the MPI programming model to exploit this level. The scalability of this level is limited by the use of Finite Differences as the discretization technique. However, this is not a problem because the number of domains needed to gather enough hardware resources is quite small. We always work with a parallel efficiency greater than 90% at this level. Moreover, at this level we must manage the I/O needed by RTM; using asynchronous I/O and a checkpointing capability, we are able to minimize the I/O time in an RTM execution.
- Thread level: In present supercomputers, a single computational node is usually a shared memory multiprocessor. In order to use all the processors in a computational node efficiently, we use the OpenMP programming model. This allows us to use all the memory in a single node for a single shot, or for a domain of a single shot. The key issue is to manage memory access properly in order to achieve good thread load balancing and minimal thread memory interference. We have obtained 94% parallel efficiency at this level using IBM JS21 blades as computational nodes.
- Processor level: Because migration algorithms tend to be limited by memory bandwidth, it is critical to minimize the cache miss ratio of the computational kernel. This is accomplished using blocking algorithms. Another important point at this level is to exploit the vector capabilities of the processor.
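The domain decomposition used at the process level can be sketched in one dimension: the grid is split into subdomains, each padded with a one-cell halo copied from its neighbor, so that a local stencil reproduces the global result. The halo copy below is a stand-in for the MPI exchange; all names are illustrative, not Kaleidoscope's actual code.

```python
# 1-D domain decomposition with 1-cell halos. The halo copy stands in
# for an MPI halo exchange between neighboring ranks.

def split_with_halo(grid, ndomains):
    """Split `grid` into equal chunks, each padded with a 1-cell halo."""
    size = len(grid) // ndomains
    parts = []
    for d in range(ndomains):
        lo, hi = d * size, (d + 1) * size
        left = grid[lo - 1] if lo > 0 else 0.0     # halo from left neighbor
        right = grid[hi] if hi < len(grid) else 0.0  # halo from right neighbor
        parts.append([left] + grid[lo:hi] + [right])
    return parts

def stencil(part):
    """3-point stencil applied to the interior of one padded array."""
    return [part[i-1] + part[i] + part[i+1] for i in range(1, len(part) - 1)]

grid = [float(i) for i in range(8)]
pieces = split_with_halo(grid, ndomains=2)
decomposed = [x for p in pieces for x in stencil(p)]
whole = stencil([0.0] + grid + [0.0])     # single-domain reference
print(decomposed == whole)                # True
```

Because only the halos travel between nodes while the bulk of each subdomain stays local, a small number of domains suffices, which is why the process level sustains parallel efficiencies above 90%.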
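The blocking technique mentioned at the processor level amounts to loop tiling: visiting the grid tile by tile so that the working set of the stencil stays in cache. The Python sketch below (illustrative names, not the actual kernel) only demonstrates that the tiled traversal computes exactly what the straightforward one does; the cache benefit materializes in compiled code.

```python
# Loop tiling (blocking) of a 5-point smoothing kernel. The tiled
# version visits the grid in bs-by-bs blocks but produces identical
# results to the naive row-by-row sweep.

def smooth_naive(a):
    n = len(a)
    out = [row[:] for row in a]
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            out[i][j] = 0.25 * (a[i-1][j] + a[i+1][j]
                                + a[i][j-1] + a[i][j+1])
    return out

def smooth_blocked(a, bs=4):
    n = len(a)
    out = [row[:] for row in a]
    for ii in range(1, n - 1, bs):          # tile origin, rows
        for jj in range(1, n - 1, bs):      # tile origin, columns
            for i in range(ii, min(ii + bs, n - 1)):
                for j in range(jj, min(jj + bs, n - 1)):
                    out[i][j] = 0.25 * (a[i-1][j] + a[i+1][j]
                                        + a[i][j-1] + a[i][j+1])
    return out

a = [[float(i * 17 + j) for j in range(10)] for i in range(10)]
print(smooth_naive(a) == smooth_blocked(a))   # True
```

On the Cell/B.E. the same idea is taken further: tiles are sized to fit each SPE's local store and streamed in and out by DMA.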