On Performance Optimizations of Lattice-Boltzmann codes Operating on Sparse-Matrix Data Representations

Stefan Donath

Stefan Donath did his honours project at the Regional Copmputing Centre Erlangen (RRZE).

The mesoscopic Lattice-Boltzmann method (LBM), in recent years, has evolved to a competitive simulation approach in particular for flows through porous media, flows with complex structures (direct numerical simulation of turbulent flows, e.g.), or multi-phase flows. In its usually applied form, it is an explicit approach which algorithmically resembles a Jakobi method; however, with a 19-point stencil in three dimensions. Therefore, quite a lot of data has to be transferred between main memory and processor in each time step. The performance of LBM codes therefore suffers from the so-called “memory gap”, i.e. the speed difference between current CPUs and memory hardware.

Pull Stream Pull stream directions in a LBM cell.

This memory gap is usually levelled by introducing hierarchies of faster, but smaller, cache memories. Thus, to obtain reasonable performance on such cache-based processors it is vital to reach a very high cache-hit ratio, which can only be achieved by ensuring a local access to data. The layout of LBM methods has been shown to have a tremendous effect on the cache use and thus the sustained performance.

In the case of more complex geometric structures, fluid and solid cells are often discriminated via a “full-matrix” representation. Solid cells are allocated in memory, as well, but masked to avoid calculations on them. In particular in the case of porous media with a low fluid-to-solid ratio, this can not only be a waste of memory but additionally results in poor performance owing to the significantly reduced cache-hit ratio.

In this project, “sparse-matrix”-based representations of data were studied as an alternative to the “full matrix” representation. To optimize the locality of the data access, a clever ordering of the elements based on the use of space-filling curves was used. First results of this approach were presented at the ASIM conference in September 2005.

Construction of the Hilbert space-filling curve.