Double-precision floating point (FP64) has been the de facto standard for scientific simulation for several decades. Most numerical methods used in engineering and scientific applications require the extra precision to compute correct answers, or even to reach an answer at all. However, FP64 also requires more computing resources and runtime to deliver that increased precision. Problem complexity and the sheer magnitude of data coming from various instruments and sensors motivate researchers to mix and match approaches to optimize compute resources, including different levels of floating-point precision. Researchers have experimented with single-precision (FP32) in the fields of life science and seismic processing for several years. In recent years, the big bang of machine learning and deep learning has focused significant attention on half-precision (FP16).

Using reduced precision can accelerate data transfer rates, increase application performance, and reduce power consumption, especially on GPUs with Tensor Core support for mixed-precision. Figure 1 shows the IEEE 754 standard floating point formats for the FP64, FP32, and FP16 precision levels.

Figure 1: IEEE 754 standard floating point formats

NVIDIA Tesla V100 includes both CUDA Cores and Tensor Cores, allowing computational scientists to dramatically accelerate their applications using mixed-precision. Using FP16 with Tensor Cores in V100 is just part of the picture: Tensor Cores provide fast matrix multiply-add with FP16 inputs and FP32 compute capabilities, and the Volta V100 and Turing architectures enable this fast FP16 matrix math with FP32 accumulation, as figure 2 shows. Accumulating in FP32 sets the Tesla V100 and Turing architectures apart from other architectures that simply support lower precision levels. Tensor Cores provide up to 125 TFLOPS of FP16 performance in the Tesla V100, and this 16x multiple versus FP64 within the same power budget has prompted researchers to explore techniques for leveraging Tensor Cores in their scientific applications. Let's look at a few examples discussed at SC18 of how researchers used Tensor Cores and mixed-precision for scientific computing.

Using Mixed-Precision for Earthquake Simulation

One of the Gordon Bell finalists simulates an earthquake using AI and transprecision computing (transprecision is synonymous with mixed-precision). Current seismic simulations can compute the properties of hard soil shaking deep underground. Scientists also want to include the shaking of soft soil near the surface, as well as the building structures below and above ground level, shown in figure 3.

Figure 3: Earthquake simulation model used by the Supercomputing 2018 (SC'18) Gordon Bell finalist paper for Tokyo, including layers of soil, underground and above-ground structures, and the coupling between them.

Researchers from the University of Tokyo, Oak Ridge National Laboratory (ORNL), and the Swiss National Supercomputing Centre collaborated on this new solver, called MOTHRA (iMplicit sOlver wiTH artificial intelligence and tRAnsprecision computing). Running on the Summit supercomputer with a combination of AI and mixed-precision, MOTHRA achieved a 25x speed-up compared to the standard solver.

The scientists simulate a seismic wave spreading through the city. The simulation starts with 3D data of the various buildings in the city of Tokyo. Nonlinear dynamic equations simulate the movement or displacement of the various buildings as a result of the seismic wave. The simulation includes hard soil, soft soil, underground malls, and subway systems in addition to the complicated buildings above ground. The size of the domain and the physics involved require a system of equations with 302 billion unknowns that must be solved iteratively until the solution converges.

Fast solution of such a large system of equations requires a good preconditioner to reduce the computational cost. Researchers have been cautious about using lower precision in the past because the solutions take longer to converge. The researchers turned to AI to improve the effectiveness of their preconditioner, training a neural network on smaller models to help identify regions of slow convergence in their solver. This efficiently identified where to apply lower or higher precision to reduce the overall time to solution, a unique use of AI. The researchers then use a combination of FP64, FP32, FP21, and FP16 to further reduce the computational and communication costs.
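The FP16-input, FP32-accumulate multiply-add behavior described above can be emulated on the CPU to see why the accumulation precision matters. The sketch below is purely illustrative NumPy (no Tensor Cores involved): it compares summing FP16 products in FP32, as Tensor Cores do, against summing them in FP16.

```python
import numpy as np

def matmul_fp16_in_fp32_accum(a, b):
    """FP16 inputs, partial products summed in FP32 (Tensor Core style)."""
    a16 = a.astype(np.float16)
    b16 = b.astype(np.float16)
    # Inputs are rounded to FP16, but the dot-product accumulation runs in FP32.
    return a16.astype(np.float32) @ b16.astype(np.float32)

def matmul_fp16_accum(a, b):
    """FP16 inputs AND FP16 accumulation: rounding error grows with every add."""
    a16 = a.astype(np.float16)
    b16 = b.astype(np.float16)
    out = np.zeros((a16.shape[0], b16.shape[1]), dtype=np.float16)
    for k in range(a16.shape[1]):  # running FP16 sum, rounded after each add
        out += a16[:, k:k + 1] * b16[k:k + 1, :]
    return out

rng = np.random.default_rng(0)
a = rng.random((64, 512)).astype(np.float32)
b = rng.random((512, 64)).astype(np.float32)
ref = a.astype(np.float64) @ b.astype(np.float64)  # FP64 reference

err32 = np.max(np.abs(matmul_fp16_in_fp32_accum(a, b) - ref))
err16 = np.max(np.abs(matmul_fp16_accum(a, b) - ref))
print(err32 < err16)  # FP32 accumulation stays far closer to the FP64 reference
```

With 512-term dot products, the FP16-accumulated result drifts by whole units while the FP32-accumulated one stays within the rounding of its FP16 inputs, which is why accumulating in FP32 is the distinguishing feature called out above.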
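The caution about lower precision slowing convergence has a classic workaround: mixed-precision iterative refinement, where the expensive solve runs in low precision and FP64 is used only to compute residual corrections. The sketch below is a generic textbook version of that idea, not the MOTHRA algorithm itself (whose preconditioner and AI-derived precision map are far more elaborate); the matrix and tolerances are made-up test values.

```python
import numpy as np

def mixed_precision_solve(A, b, iters=10):
    """Solve Ax = b: factor/solve in FP32, refine with FP64 residuals."""
    A32 = A.astype(np.float32)
    # Initial solve entirely in FP32 (the "cheap" low-precision work).
    x = np.linalg.solve(A32, b.astype(np.float32)).astype(np.float64)
    for _ in range(iters):
        r = b - A @ x  # residual computed in full FP64
        # Correction solved in FP32 again; only the residual needs FP64.
        dx = np.linalg.solve(A32, r.astype(np.float32))
        x = x + dx.astype(np.float64)
    return x

rng = np.random.default_rng(1)
n = 200
A = rng.random((n, n)) + n * np.eye(n)  # well-conditioned test matrix
b = rng.random(n)

x = mixed_precision_solve(A, b)
print(np.allclose(A @ x, b, atol=1e-10))  # refined down to near-FP64 accuracy
```

A single FP32 solve leaves a residual around 1e-5; each FP64-residual correction shrinks it further until it bottoms out near FP64 roundoff, so most of the arithmetic stays in the cheap precision while the answer keeps double-precision quality.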