2.9.16
Coherent GT
A modern user interface library for games
Migration guide for rendering between GT versions

Transitioning from GT 2.6 to GT 2.7

This section outlines the changes you need to apply in order to upgrade from GT 2.6 and prior to GT 2.7 and later.

In GT 2.7, the internal graphics library, Renoir, has been updated with optimizations and capabilities that balance the load between the CPU and the GPU. This balancing gives the best results on mobile architectures.

Several changes in the graphics backend mean that a backend written for a previous version must be adapted before it works with the new version.

The backend now provides a new capability - PreferCPUWorkload, which executes more work on the CPU instead of the GPU. This is beneficial on systems with fast CPUs, or if the application is already GPU-bound. The current recommendation is to set this option to false on PC/Consoles and true on mobile devices.

The CPU workload preference allows graphics commands of different types (e.g. draw image/draw text/etc.) to be batched together, whereas otherwise only commands of the same type can be grouped. This requires some new shaders that are labeled as ST_BatchedXXX.

An orthogonal optimization valid for both preferences (CPU/GPU workload) is splitting the standard shader (which included code handling many different types of graphics commands) into 2 parts - frequently used (e.g. images, rectangles, text) and rarely used (e.g. ellipses, YUV to RGB conversion, filter transform).

Here’s an overview of the changes needed to make your backend work:

  • The StandardVertex structure now uses a float4 instead of float3 for the Position attribute
  • The Standard shader is split into Frequent/Rare parts
  • This also changes the expected slots for some bound textures
  • If you intend to use the PreferCPUWorkload option in the backend, you’ll need to implement the batched versions of the shaders as well
  • Text rendering uses different texture slots, so the text shader implementation should be updated
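
To make the first item concrete, here is a minimal sketch of how the vertex layout change affects stride and attribute offsets. The Float3/Float4 types and the Color member name are stand-ins assumed for illustration (only Position and Additional are named in this guide), so check the actual SDK headers before relying on them.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the SDK's vector types.
struct Float3 { float x, y, z; };
struct Float4 { float x, y, z, w; };

// Layout used up to GT 2.6: Float3, Float4, Float4.
struct StandardVertexOld
{
    Float3 Position;
    Float4 Color;      // assumed member name
    Float4 Additional;
};

// Layout expected from GT 2.7 on: Float4, Float4, Float4.
struct StandardVertexNew
{
    Float4 Position;
    Float4 Color;      // assumed member name
    Float4 Additional;
};

// The stride grows by one float and every attribute after Position
// moves by 4 bytes, so the input layout offsets must be updated too.
static_assert(sizeof(StandardVertexOld) == 44, "old stride: 12 + 16 + 16 bytes");
static_assert(sizeof(StandardVertexNew) == 48, "new stride: 16 + 16 + 16 bytes");
static_assert(offsetof(StandardVertexOld, Additional) == 28, "old offset of Additional");
static_assert(offsetof(StandardVertexNew, Additional) == 32, "new offset of Additional");

// Returns the vertex stride to pass to the graphics API for the chosen layout.
std::size_t StandardVertexStride(bool gt27OrLater)
{
    return gt27OrLater ? sizeof(StandardVertexNew) : sizeof(StandardVertexOld);
}
```

If the backend hardcodes byte offsets in its input layout description, both the stride and the offsets of the attributes after Position need to change.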

Here are the steps for migrating:

  • Change the input layout of the StandardVertex
    • Previously, it was Float3, Float4, Float4
    • Now, it is Float4, Float4, Float4
  • The Backend::FillCaps function must set the PreferCPUWorkload option
    • Set to false for PC/Consoles
    • Set to true for mobile
  • The Backend::FillCaps function must map the shaders according to the new scheme
    • ST_Standard, ST_StandardTexture, ST_Text, ST_TextSDF are still mapped to ST_Standard
    • ALL other shader types, previously mapped to ST_Standard, are now mapped to ST_StandardRare
    • ST_Stencil, ST_StencilTexture are still mapped to ST_Stencil
    • ST_StencilRRect, ST_StencilCircle are now mapped to ST_StencilRare
    • You must map the new ST_StandardRare and ST_StencilRare to themselves
  • Create the new shaders when initializing your backend
    • ST_BatchedStandard (new VS + PS)
    • ST_StandardRare (can use the ST_Standard VS, new PS)
    • ST_StencilRare (can use the ST_Standard VS, new PS)
  • Take care of the new shader types when receiving a RendererBackend::CreatePipelineState call
    • Add handlers for ST_BatchedStandard, ST_StencilRare
    • Remove the handler for ST_Text; it is no longer used (it’s now part of ST_Standard)
  • Change the shaders accordingly
    • The VS input structure must now use float4 instead of float3 for the Position attribute
    • The ST_Standard shader is now used for ShaderTypes 0 (draw rect), 3 (draw texture), 17 (raster text), 18 (SDF text)
      • Previously it was used for several other types
      • It was not used for text, now it is
    • The ST_StandardRare handles the rest of the ShaderTypes that were previously handled by the ST_Standard one.
    • The ST_BatchedXXX pixel shaders are almost the same as the ST_XXX variants, except that the ShaderType is no longer a uniform parameter, but a varying one instead (coming from the PS input data)
    • The ST_BatchedXXX vertex shader must set the varying param for the pixel shader that contains the ShaderType. It originally comes from the vertex shader input, from the Additional.w attribute
    • The ST_StencilXXX shaders are split similarly to ST_Standard: one handles the 0, 3, 17 and 18 types, and the other handles all the rest
    • The text shader previously used texture slot 0 for rendering, but now it uses texture slot 1 for raster text, and texture slot 2 for SDF text. Change the text shader to use the new texture slots depending on the type of text.
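
The FillCaps steps above can be sketched as follows. This is only an illustration under stated assumptions: the enum is a reduced, hypothetical mirror that lists just the shader types named in this guide, ST_ExampleRareType is a made-up placeholder for "all other types previously mapped to ST_Standard", and CapsSketch stands in for the real caps structure.

```cpp
#include <array>
#include <cassert>

// Reduced, hypothetical mirror of the shader type enum. Only names that
// appear in this guide are listed; ST_ExampleRareType is a placeholder.
enum ShaderType
{
    ST_Standard,
    ST_StandardTexture,
    ST_Text,
    ST_TextSDF,
    ST_ExampleRareType,
    ST_StandardRare,
    ST_Stencil,
    ST_StencilTexture,
    ST_StencilRRect,
    ST_StencilCircle,
    ST_StencilRare,
    ST_Count
};

// Hypothetical stand-in for the caps structure filled by Backend::FillCaps.
struct CapsSketch
{
    bool PreferCPUWorkload = false;
    std::array<ShaderType, ST_Count> ShaderMapping{};
};

void FillCapsSketch(CapsSketch& outCaps, bool isMobile)
{
    // Recommended: false on PC/consoles, true on mobile.
    outCaps.PreferCPUWorkload = isMobile;

    // Frequently used types stay on ST_Standard.
    outCaps.ShaderMapping[ST_Standard] = ST_Standard;
    outCaps.ShaderMapping[ST_StandardTexture] = ST_Standard;
    outCaps.ShaderMapping[ST_Text] = ST_Standard;
    outCaps.ShaderMapping[ST_TextSDF] = ST_Standard;

    // Everything else that used to map to ST_Standard moves to ST_StandardRare.
    outCaps.ShaderMapping[ST_ExampleRareType] = ST_StandardRare;
    outCaps.ShaderMapping[ST_StandardRare] = ST_StandardRare;

    // The stencil shaders are split the same way.
    outCaps.ShaderMapping[ST_Stencil] = ST_Stencil;
    outCaps.ShaderMapping[ST_StencilTexture] = ST_Stencil;
    outCaps.ShaderMapping[ST_StencilRRect] = ST_StencilRare;
    outCaps.ShaderMapping[ST_StencilCircle] = ST_StencilRare;
    outCaps.ShaderMapping[ST_StencilRare] = ST_StencilRare;
}
```

In a real backend the same assignments would go into the existing FillCaps implementation, with one line for every remaining shader type of the actual enum.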

Transitioning from GT 2.8.1.0 to 2.8.2.0

This section outlines the changes you need to apply in order to upgrade from GT versions 2.8.1.0 and prior to 2.8.2.0.

The current backend API was designed around DirectX 11 and therefore implementing a DirectX 11 backend with the current API is simple and intuitive. However, with the advent of modern low-level APIs (e.g. Dx12, Metal and Vulkan) and concepts such as render passes, enhancements of the backend API were needed, so that the backend implementation for those graphics APIs is easier and more efficient.

You can read more about the motivation for the backend API changes here.

In GT 2.8.2.0, Renoir's backend API has been modified significantly with the introduction of 2 new commands, 4 new Renoir Core capabilities and one new shader type. As a result, a backend written for a previous version must be adapted before it works with the new version.

If you prefer to make minimal changes to your backend and not use the new capabilities, here are the steps for migrating:

  • Add the following lines to the FillCaps method
    outCaps.ShouldUseRenderPasses = false;
    outCaps.ShouldClearRTWithClearQuad = false;
    outCaps.ConstantBufferBlocksCount = 1;
    outCaps.ConstantBufferRingSize = 1;
    ...
    outCaps.ShaderMapping[ST_ClearQuad] = ST_ClearQuad;
  • Add unsigned size as the last argument of the CreateConstantBuffer method and use it as the constant buffer allocation size. The CreateConstantBuffer declaration in the backend header should look like this:
    virtual bool CreateConstantBuffer(CBType type, ConstantBufferObject object, unsigned size) override;
    Example of using the size parameter when creating a constant buffer in the DirectX 11 backend:
    bool Dx11Backend::CreateConstantBuffer(CBType, ConstantBufferObject object, unsigned size)
    {
        D3D11_BUFFER_DESC bufferDesc;
        bufferDesc.ByteWidth = size;
        ...
        // Create the constant buffer with the filled buffer description
        m_Device->CreateBuffer(&bufferDesc, ...);
    }
  • In SetRenderTarget, stop using the EnableColorWrites flag, as it is no longer present. Instead, handle the PipelineState's ColorMask field in CreatePipelineState. This field currently supports only the values ColorWriteMask::CWM_None and ColorWriteMask::CWM_All, which correspond to the previous false and true values of EnableColorWrites. Set the graphics API's render target color write mask to the matching value. E.g. in DirectX 11 the write mask is part of the blend state description:
    bool Dx11Backend::CreatePipelineState(const PipelineState& state, PipelineStateObject object)
    {
        ...
        D3D11_BLEND_DESC desc;
        desc.RenderTarget[0].RenderTargetWriteMask = UINT8(state.ColorMask);
        ...
        // Create the blend state with the filled description
        m_Device->CreateBlendState(&desc, ...);
    }
  • In ExecuteRendering, add empty cases that contain only a break for BC_BeginRenderPass and BC_EndRenderPass:
    case BC_BeginRenderPass:
    {
        break;
    }
    case BC_EndRenderPass:
    {
        break;
    }

The MSAASamples field was added to the PipelineState structure, so you may start using it in CreatePipelineState.

Below we describe each new capability, how it can affect your backend and what changes you need to make to use it.

When the ShouldUseRenderPasses capability is enabled, Renoir starts enqueuing the BeginRenderPass and EndRenderPass commands and stops issuing the SetRenderTarget and ResolveRenderTarget commands. The BeginRenderPass command provides all the information needed to start a render pass in modern graphics APIs like Metal and Vulkan. This information includes the render targets, whether they should be cleared on render pass load and whether they should be resolved on store. Here are the additional steps you need to take to start using this capability:

  • Set ShouldUseRenderPasses to true in the FillCaps method
  • You can remove the implementation of the SetRenderTarget and ResolveRenderTarget methods and add an assert that they are never called
  • Implement a BeginRenderPass method, which handles the corresponding command by using the information it provides to begin a render pass in the graphics API
  • Implement an EndRenderPass method, which handles the corresponding command by ending the current render pass and possibly also resetting any state kept for that render pass. E.g. in our Metal backend the implementation of the EndRenderPass method is the following:
    [m_State->CurrentCmdEncoder endEncoding];
    m_State->CurrentCmdEncoder = nil;
    m_State->BoundGPUState = GPUState();
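
The steps above can be sketched as a small, API-agnostic model of the command handling. Everything here is hypothetical scaffolding for illustration: the class, the BackendCommand struct and the BC_ExampleDraw command are not part of the SDK, and the real BeginRenderPass command carries render target, clear and resolve information that is omitted.

```cpp
#include <cassert>
#include <vector>

// Reduced, hypothetical command set; BC_ExampleDraw is a placeholder.
enum BackendCommandType { BC_BeginRenderPass, BC_EndRenderPass, BC_ExampleDraw };

struct BackendCommand { BackendCommandType Type; };

class RenderPassBackendSketch
{
public:
    // With ShouldUseRenderPasses enabled this should never be called.
    void SetRenderTarget() { assert(false && "not used with render passes"); }

    void BeginRenderPass()
    {
        assert(!m_InsidePass); // passes must not nest
        m_InsidePass = true;
    }

    void EndRenderPass()
    {
        assert(m_InsidePass);
        m_InsidePass = false;
        m_BoundPipeline = -1; // reset state cached for the finished pass
        ++m_CompletedPasses;
    }

    void ExecuteRendering(const std::vector<BackendCommand>& commands)
    {
        for (const BackendCommand& cmd : commands)
        {
            switch (cmd.Type)
            {
            case BC_BeginRenderPass: BeginRenderPass(); break;
            case BC_EndRenderPass: EndRenderPass(); break;
            case BC_ExampleDraw:
                assert(m_InsidePass); // draws are only valid inside a pass
                ++m_DrawCount;
                break;
            }
        }
    }

    bool InsidePass() const { return m_InsidePass; }
    int CompletedPasses() const { return m_CompletedPasses; }
    int DrawCount() const { return m_DrawCount; }

private:
    bool m_InsidePass = false;
    int m_BoundPipeline = -1;
    int m_CompletedPasses = 0;
    int m_DrawCount = 0;
};
```

Resetting the cached state in EndRenderPass mirrors what the Metal snippet does with BoundGPUState: state bound on one encoder does not carry over to the next one.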

Enabling the ShouldClearRTWithClearQuad capability makes Renoir issue a fullscreen clear quad instead of calling ClearRenderTarget. The clear quad is drawn with a new vertex and pixel shader. The capability was added so that we don't need to create a new render pass to clear a render target in graphics APIs like Metal, which do not provide an easier way to do it. Here are the additional steps you need to take to start using this capability:

  • Set ShouldClearRTWithClearQuad to true in the FillCaps method
  • You can remove the implementation of the ClearRenderTarget method and add an assert that it is never called
  • You need to create a new ST_ClearQuad vertex and pixel shader, compile them if necessary and start using them. You can check out the example ST_ClearQuad HLSL shaders provided with the DirectX11 backend. The Metal backend is using the clear quad capability, so you can check out how to use the new shaders in its implementation.

The ConstantBufferRingSize capability allows you to set the size of the internal ring buffer that is used to manage Renoir's constant buffers. We recommend setting this size to 4 for low-level graphics APIs like Dx12, Metal and Vulkan. The motivation for this particular size is that the maximum count of buffered frames in a standard pipeline is three, so a circular buffer of size 4 is needed to reliably avoid overlap of constant buffers. If your pipeline has a higher maximum count of buffered frames, change this value accordingly. For most high-level graphics APIs the ring buffer size should be set to 1, because their drivers handle constant buffer overlap internally and a greater value is therefore unnecessary.

The only steps you need to make to start using this capability are:

  • Set ConstantBufferRingSize to the appropriate value in the FillCaps method
  • If you have a ring buffer for the constant buffers in your backend, then you can remove it, because Renoir will do it automatically for you
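
The reasoning behind the recommended size can be illustrated with a short sketch. It assumes that ring slots are handed out round-robin per frame, which is an assumption made for illustration and not a statement about how Renoir indexes its ring internally; all function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// One slot per frame that may still be in flight, plus one for the
// frame that is currently being recorded.
unsigned RecommendedRingSize(unsigned maxBufferedFrames)
{
    return maxBufferedFrames + 1;
}

// Slot used by a frame, assuming slots are handed out round-robin.
unsigned RingSlotForFrame(std::uint64_t frameIndex, unsigned ringSize)
{
    assert(ringSize > 0);
    return static_cast<unsigned>(frameIndex % ringSize);
}

// True if the slot of frame `frameIndex` is not shared with any of the
// previous `maxBufferedFrames` frames, which may still be in flight.
bool SlotIsFreeOfInFlightFrames(std::uint64_t frameIndex, unsigned ringSize,
                                unsigned maxBufferedFrames)
{
    const unsigned slot = RingSlotForFrame(frameIndex, ringSize);
    for (unsigned back = 1; back <= maxBufferedFrames && back <= frameIndex; ++back)
    {
        if (RingSlotForFrame(frameIndex - back, ringSize) == slot)
        {
            return false;
        }
    }
    return true;
}
```

With three buffered frames, a ring of size 3 makes frame 10 reuse the slot of frame 7, which may still be in flight, while a ring of size 4 does not.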

The ConstantBufferBlocksCount capability allows you to set the count of aligned constant buffer blocks for each constant buffer type. Renoir will issue a CreateConstantBuffer call with size equal to (constant buffer blocks count) * (aligned specific constant buffer size) for each constant buffer type. If the blocks count is greater than 1 and the regular constant buffer becomes full, Renoir will make sure that a new auxiliary constant buffer is allocated. If the blocks count is equal to 1, Renoir won't create any auxiliary constant buffers. Auxiliary constant buffers are allocated per frame: they are allocated before ExecuteRendering is called and deallocated immediately after it. Setting both the constant buffer ring size and the blocks count to values greater than 1 usually goes hand in hand, because both provide functionality that would otherwise have to be implemented explicitly in the backend for low-level graphics APIs like Dx12, Metal and Vulkan. For APIs that don't support constant buffers but use uniform slots (e.g. OpenGL), both capabilities should be set to 1 in order to avoid unnecessary constant buffer creation.

The only steps you need to make to start using this capability are:

  • Set ConstantBufferBlocksCount to the appropriate value in the FillCaps method
  • Remove all the logic in your backend, which manually creates auxiliary buffers once the regular ones are full. Renoir will create and manage them automatically
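
As a worked example of the size formula above, the following sketch computes the allocation size that a CreateConstantBuffer call would receive. The function names are hypothetical, and the 256-byte alignment is only an example value (it is the constant buffer placement alignment in Direct3D 12); the alignment Renoir actually applies is not stated in this guide.

```cpp
#include <cassert>
#include <cstddef>

// Rounds size up to the next multiple of alignment.
std::size_t AlignUp(std::size_t size, std::size_t alignment)
{
    assert(alignment > 0);
    return (size + alignment - 1) / alignment * alignment;
}

// (constant buffer blocks count) * (aligned specific constant buffer size)
std::size_t ConstantBufferAllocationSize(std::size_t structSize,
                                         std::size_t alignment,
                                         unsigned blocksCount)
{
    assert(blocksCount >= 1);
    return AlignUp(structSize, alignment) * blocksCount;
}
```

For a 100-byte constant buffer structure, a 256-byte alignment and a blocks count of 8, the requested size is 8 * 256 = 2048 bytes; with a blocks count of 1 it is 256 bytes.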

Transitioning from GT 2.8.9.0 to 2.8.10.0

GT version 2.8.10.0 introduces support for using images that do not have their alpha channel premultiplied into the other color channels. Prior to this version, all images in the SDK were treated as if they were using premultiplied alpha, disregarding any image metadata that might indicate otherwise.

User images (images that are preloaded by the engine, instead of decoded internally by GT) can now specify whether their alpha channel is premultiplied via the new Coherent::UIGT::ResourceResponseUIGT::UserImageData::AlphaPremultiplication property. You can set the property to the correct value in the UserImageData object that is passed to the Coherent::UIGT::ResourceResponseUIGT::ReceiveUserImage API in the Coherent::UIGT::ResourceHandler::OnResourceRead callback of your resource handler. This allows you to re-use the same image in both your engine and UI even if the engine uses a non-premultiplied alpha pipeline.

Following is a table that describes the differences and solutions for various image formats:

Format | 2.8.9.0 and prior | 2.8.10.0
PNG, JPG, other RGB(A) formats | Automatically premultiplied after decode | No change: automatically premultiplied after decode
DDS | Assumed to have premultiplied alpha | May need to re-save: GT attempts to determine from metadata whether alpha is premultiplied
KTX, ASTC, PKM | Assumed to have premultiplied alpha | Must re-save: assumed NOT to have premultiplied alpha
User images | Assumed to have premultiplied alpha | User controlled via Coherent::UIGT::ResourceResponseUIGT::UserImageData::AlphaPremultiplication

Note that the output of all operations in GT is still an alpha-premultiplied texture, so blending of the resulting UI texture is not affected.
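
For readers unfamiliar with the term, the sketch below shows what premultiplying alpha means for 8-bit RGBA data: each color channel is scaled by alpha / 255. It is a generic illustration with a hypothetical function name, not the conversion code used by GT, and it ignores color space considerations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Premultiplies tightly packed 8-bit RGBA pixels in place.
void PremultiplyAlphaRGBA8(std::vector<std::uint8_t>& pixels)
{
    assert(pixels.size() % 4 == 0);
    for (std::size_t i = 0; i + 3 < pixels.size(); i += 4)
    {
        const unsigned alpha = pixels[i + 3];
        for (std::size_t c = 0; c < 3; ++c)
        {
            // Scale by alpha / 255 with rounding to the nearest integer.
            pixels[i + c] = static_cast<std::uint8_t>((pixels[i + c] * alpha + 127) / 255);
        }
    }
}
```

An image processed this way no longer matches its original non-premultiplied description, so it should be flagged (or saved) as premultiplied afterwards.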

Transitioning from GT 2.9.4.0 to 2.9.5.0

New fields in the renoir::RendererCaps structure.

There are 2 new fields in the renoir::RendererCaps structure: renoir::RendererCaps::MaxTextureWidth and renoir::RendererCaps::MaxTextureHeight. The default versions of all backends are updated to fill in these values, which represent the maximum possible width and height of a 2D texture for the device, respectively. These values are used to limit the size of temporary textures that the Renoir library creates. A good default is 8192x8192px, which is what was used internally before exposing these options.

Warning
Filling in correct values in the renoir::RendererCaps::MaxTextureWidth and renoir::RendererCaps::MaxTextureHeight fields is required, since otherwise these limits will end up with uninitialized values and lead to undefined behavior.
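
A minimal sketch of filling the two fields is shown below. TextureLimitsSketch is a hypothetical stand-in for the two new renoir::RendererCaps members, the device limits are assumed to come from a query of your graphics API (with 0 meaning "unknown"), and the clamp function only illustrates how such limits constrain a texture size; it is not Renoir's internal logic.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the two new caps fields.
struct TextureLimitsSketch
{
    unsigned MaxTextureWidth;
    unsigned MaxTextureHeight;
};

// The value that was used internally before the options were exposed.
const unsigned kFallbackMaxTextureSize = 8192;

// Always writes both fields so that neither is left uninitialized.
void FillTextureLimits(TextureLimitsSketch& outCaps,
                       unsigned deviceMaxWidth, unsigned deviceMaxHeight)
{
    outCaps.MaxTextureWidth = deviceMaxWidth != 0 ? deviceMaxWidth : kFallbackMaxTextureSize;
    outCaps.MaxTextureHeight = deviceMaxHeight != 0 ? deviceMaxHeight : kFallbackMaxTextureSize;
}

// Illustration only: limits a requested texture size to the caps.
void ClampTextureSize(const TextureLimitsSketch& caps, unsigned& width, unsigned& height)
{
    width = std::min(width, caps.MaxTextureWidth);
    height = std::min(height, caps.MaxTextureHeight);
}
```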

renoir::DrawCustomEffectCmd has been changed to provide more information.

Previously, the structure had the renoir::float2 fields TextureCoordinates and TextureDimensions, which represented the placement of the drawn effect in the currently set render target. These values are now passed through the renoir::float4 field renoir::DrawCustomEffectCmd::TargetGeometryPositionSize.
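
If your custom effect code still expects the two separate values, they can be unpacked from the new field. The sketch below assumes that the position is stored in the x and y components and the size in z and w; this ordering is inferred from the field name only and should be verified against the SDK headers. The types in the sketch namespace are hypothetical stand-ins for renoir::float2 and renoir::float4.

```cpp
#include <cassert>

namespace sketch
{
// Hypothetical stand-ins for renoir::float2 and renoir::float4.
struct float2 { float x, y; };
struct float4 { float x, y, z, w; };

struct TargetGeometry
{
    float2 Position; // role of the former TextureCoordinates field
    float2 Size;     // role of the former TextureDimensions field
};

// ASSUMPTION: xy holds the position and zw holds the size.
TargetGeometry UnpackTargetGeometry(const float4& positionSize)
{
    TargetGeometry result;
    result.Position = float2{positionSize.x, positionSize.y};
    result.Size = float2{positionSize.z, positionSize.w};
    return result;
}
}
```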