Vectorized arithmetic code
The Mali Utgard and Midgard GPU architectures implement Single Instruction Multiple Data (SIMD) maths units, exposing vector instructions to each thread of execution. The Mali Bifrost GPU architecture switches to scalar arithmetic instructions, but still implements vector access to memory.
You must understand the following concepts:
- Vector and scalar arithmetic instructions.
- Vector processing units.
How to optimize the use of vectorized arithmetic code on Mali GPUs
Try using the following optimization steps:
- Write vector arithmetic code in your shaders. Doing so is less critical on the Bifrost architecture, but many devices still use the Utgard and Midgard architectures.
- Write compute shaders so that work items contain enough work to fill the vector processing units.
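As a minimal sketch of both steps, the compute shader below has each work item process one vec4 (four floats) per invocation rather than a single float. The choice of GLSL ES 3.10, the buffer names `src` and `dst`, the uniforms `scale`, `bias`, and `count`, and the workgroup size of 64 are illustrative assumptions, not values taken from this guide.

```glsl
#version 310 es
precision highp float;
precision highp int;

// Workgroup size is an illustrative assumption; tune it for the target device.
layout(local_size_x = 64) in;

// Hypothetical buffers. The data is packed as vec4 so that each load, store,
// and arithmetic operation covers four floats.
layout(std430, binding = 0) readonly buffer Src { vec4 src[]; };
layout(std430, binding = 1) writeonly buffer Dst { vec4 dst[]; };

uniform float scale;  // hypothetical parameters
uniform float bias;
uniform uint count;   // number of vec4 elements in each buffer

void main() {
    uint i = gl_GlobalInvocationID.x;
    if (i >= count) {
        return;  // guard the final, partially filled workgroup
    }
    // One vector multiply and add handles four values per work item.
    dst[i] = src[i] * scale + vec4(bias);
}
```

With this layout the host would dispatch enough workgroups to cover `count` vec4 elements, that is, a quarter of the number of floats. Whether four floats per work item is the right amount depends on the target GPU and data type, so treat the width as a starting point to measure rather than a fixed rule.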
Something to avoid when optimizing your use of vectorized arithmetic code on Mali GPUs
Do not write scalar code and hope that the compiler optimizes it. The compiler can vectorize scalar code, but the result is more reliable when the input code is already in vector form.
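To illustrate the difference, the fragment shader sketch below writes the same blend twice: once component by component, and once as whole-vector arithmetic. It assumes GLSL ES 3.00, and the names `vColor`, `uTint`, `uMix`, `blendScalar`, and `blendVector` are hypothetical.

```glsl
#version 300 es
precision mediump float;

in vec4 vColor;       // hypothetical interpolated input
uniform vec4 uTint;   // hypothetical blend target
uniform float uMix;   // hypothetical blend factor in [0, 1]
out vec4 fragColor;

// Scalar form: four independent statements that the compiler has to
// recognize and merge before it can emit vector instructions.
vec4 blendScalar(vec4 colorA, vec4 colorB, float t) {
    float r = colorA.r + (colorB.r - colorA.r) * t;
    float g = colorA.g + (colorB.g - colorA.g) * t;
    float b = colorA.b + (colorB.b - colorA.b) * t;
    float a = colorA.a + (colorB.a - colorA.a) * t;
    return vec4(r, g, b, a);
}

// Vector form: the same arithmetic written as whole-vector operations,
// so the vector structure is explicit in the source.
vec4 blendVector(vec4 colorA, vec4 colorB, float t) {
    return colorA + (colorB - colorA) * t;
}

void main() {
    // Prefer the vector form; blendScalar is kept only for comparison.
    fragColor = blendVector(vColor, uTint, uMix);
}
```

Both functions compute the same result. The vector form simply does not rely on the compiler rediscovering that the four statements belong together.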