Multisampling for Vulkan
In most multisampling use cases, all of the data for the additional samples is kept in tile memory, which is inside the GPU. This data is resolved to a single pixel color value as part of tile writeback. This is efficient because the data for the additional samples is never written to external main memory, so it consumes no external memory bandwidth.
You must understand the following concepts:
- Vulkan APIs.
Optimal multisampling performance for Vulkan
Multisampling is fully integrated with Vulkan render passes, which allows the multisample resolve to be explicitly specified at the end of the subpass.
How to optimize the use of MSAA with Vulkan
Try using the following optimization steps:
- Use 4x MSAA as it is not expensive and provides good image quality improvements.
- Use loadOp = LOAD_OP_CLEAR or loadOp = LOAD_OP_DONT_CARE for the multisampled image.
- Use pResolveAttachments in a subpass to automatically resolve a multisampled color buffer into a single-sampled color buffer.
- Use storeOp = STORE_OP_DONT_CARE for the multisampled image.
- Use LAZILY_ALLOCATED memory to back the allocated multisampled images. No physical backing storage is required as they do not need to be stored in the main memory.
Vulkan MSAA steps to avoid
Arm recommends that you:
- Do not use vkCmdResolveImage(). It negatively impacts bandwidth and performance.
- Do not use storeOp = STORE_OP_STORE for multisampled image attachments.
- Do not use loadOp = LOAD_OP_LOAD for multisampled image attachments.
- Do not use more than 4x MSAA without checking performance.
The negative impact of implementing MSAA with Vulkan incorrectly
Failing to resolve multisampling inline results in higher memory bandwidth and reduced performance. For example, manually writing and resolving a 4x MSAA 1080p surface at 60 FPS requires 3.9 GB/s of memory bandwidth, compared with 500 MB/s when using an inline resolve.
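The text does not state how these figures were derived. The following C sketch is a back-of-envelope estimate under two assumptions that are not taken from the source: a 32-bit (4 bytes per pixel) color format, and a manual path that writes all four samples to main memory and then reads them back for the resolve. The function names are hypothetical.

```c
#include <stdint.h>

/* Bytes per second needed to transfer one surface once per frame. */
uint64_t surface_bytes_per_second(uint32_t width, uint32_t height,
                                  uint32_t bytes_per_pixel,
                                  uint32_t samples, uint32_t fps)
{
    return (uint64_t)width * height * bytes_per_pixel * samples * fps;
}

/* Inline resolve: only the resolved single-sampled surface is written. */
uint64_t inline_resolve_bandwidth(uint32_t width, uint32_t height,
                                  uint32_t bytes_per_pixel, uint32_t fps)
{
    return surface_bytes_per_second(width, height, bytes_per_pixel, 1, fps);
}

/* Manual resolve (assumed model): every sample is written to main memory,
   then read back again by the resolve operation. */
uint64_t manual_resolve_bandwidth(uint32_t width, uint32_t height,
                                  uint32_t bytes_per_pixel,
                                  uint32_t samples, uint32_t fps)
{
    uint64_t msaa = surface_bytes_per_second(width, height, bytes_per_pixel,
                                             samples, fps);
    return 2 * msaa; /* one write plus one read of the multisampled data */
}
```

For a 1920x1080 surface at 60 FPS, inline_resolve_bandwidth(1920, 1080, 4, 60) gives 497,664,000 bytes per second, which is about 500 MB/s. Under the same assumptions, manual_resolve_bandwidth(1920, 1080, 4, 4, 60) gives 3,981,312,000 bytes per second, which is close to the quoted manual figure. Writing out the resolved surface in the manual path would add roughly another 500 MB/s on top of this estimate.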