Game Engine RHI System Analysis Series 1: Instance (2024.11.25)

Overview

In order to use a graphics API, one has to initialize something called an instance. In Direct3D, the Microsoft DirectX Graphics Infrastructure (DXGI) plays the role of the instance, and in Vulkan, VkInstance is the instance.

The primary goal of DXGI is to manage low-level tasks that can be independent of the DirectX graphics runtime. DXGI provides a common framework for future graphics components. DXGI’s purpose is to communicate with the kernel mode driver and the system hardware, as shown in the following diagram.

dxgi-dll

There is no global state in Vulkan and all per-application state is stored in a VkInstance object. Creating a VkInstance object initializes the Vulkan library and allows the application to pass information about itself to the implementation.

Anki

Anki does not keep the IDXGIFactory in memory. It is created via the CreateDXGIFactory API call whenever it is needed.

When initializing its graphics manager, Anki obtains an IDXGIFactory6 by first creating an IDXGIFactory2 with the CreateDXGIFactory2(UINT, REFIID, void**) API call and then querying IDXGIFactory6 from the created IDXGIFactory2 instance.

If GPU validation is enabled, the DXGI_CREATE_FACTORY_DEBUG flag is set when creating the factory.

Anki then queries the physical devices (IDXGIAdapter instances).

When creating a swap chain, an IDXGIFactory2 is created by the CreateDXGIFactory2(UINT, REFIID, void**) API call. The created factory is used to create the swap chain with IDXGIFactory2::CreateSwapChainForHwnd(IUnknown*, HWND, const DXGI_SWAP_CHAIN_DESC1*, const DXGI_SWAP_CHAIN_FULLSCREEN_DESC*, IDXGIOutput*, IDXGISwapChain1**). Anki does not support fullscreen transitions, so the same factory is also used to call IDXGIFactory::MakeWindowAssociation(HWND, UINT).

All of these initializations happen when the graphics manager is initialized.

InstanceAnkiD3D12

Just like in the D3D12 backend, the VkInstance is initialized in the graphics manager. Unlike the IDXGIFactory, the VkInstance is kept by the manager for future use.

Instance creation in Vulkan is more involved than in DirectX 12. To initialize its instance, Anki provides application information (VkApplicationInfo), an array of instance layers to enable, validation features (VkValidationFeaturesEXT), and an array of extensions to enable.

Anki uses the VK_LAYER_KHRONOS_validation instance layer if GPU validation is enabled. Users can also provide instance layers via a command-line argument.

Anki adds VK_VALIDATION_FEATURE_ENABLE_DEBUG_PRINTF_EXT when debug printf is enabled. If GPU validation is enabled, VK_VALIDATION_FEATURE_ENABLE_GPU_ASSISTED_EXT is added to the enabled validation features.

If the user requests a headless surface, the VK_EXT_headless_surface instance extension is used. On Linux, the VK_KHR_wayland_surface extension is used; on Windows, VK_KHR_win32_surface; on Android, VK_KHR_android_surface. To support swap chains, the VK_KHR_surface extension is used. If GPU validation is enabled, the VK_EXT_debug_utils extension is used.
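The selection rules described above can be sketched as a small function that turns a few settings into the layer, validation feature, and extension lists. This is only an illustration: `Platform`, `InstanceSettings`, `InstanceLists`, and `buildInstanceLists` are hypothetical names rather than Anki's own, validation features are kept as strings instead of enum values, and the sketch assumes a headless surface replaces the platform surface extension.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical platform tag and settings; the names are illustrative, not Anki's.
enum class Platform { Linux, Windows, Android };

struct InstanceSettings {
    Platform platform = Platform::Linux;
    bool headless = false;       // render without a window
    bool gpuValidation = false;  // GPU validation requested
    bool debugPrintf = false;    // shader debug printf requested
    std::vector<std::string> extraLayers;  // layers given on the command line
};

struct InstanceLists {
    std::vector<std::string> layers;
    std::vector<std::string> validationFeatures;  // enum names kept as strings
    std::vector<std::string> extensions;
};

InstanceLists buildInstanceLists(const InstanceSettings& s) {
    InstanceLists out;

    // GPU validation pulls in the layer, the GPU-assisted feature and debug utils.
    if (s.gpuValidation) {
        out.layers.push_back("VK_LAYER_KHRONOS_validation");
        out.validationFeatures.push_back("VK_VALIDATION_FEATURE_ENABLE_GPU_ASSISTED_EXT");
        out.extensions.push_back("VK_EXT_debug_utils");
    }
    if (s.debugPrintf) {
        out.validationFeatures.push_back("VK_VALIDATION_FEATURE_ENABLE_DEBUG_PRINTF_EXT");
    }

    // Append user-provided layers, skipping ones that are already present.
    for (const std::string& layer : s.extraLayers) {
        if (std::find(out.layers.begin(), out.layers.end(), layer) == out.layers.end()) {
            out.layers.push_back(layer);
        }
    }

    // VK_KHR_surface is needed for swap chains.
    out.extensions.push_back("VK_KHR_surface");

    // Assumption: headless takes priority over the platform surface extension.
    if (s.headless) {
        out.extensions.push_back("VK_EXT_headless_surface");
    } else {
        switch (s.platform) {
        case Platform::Linux:   out.extensions.push_back("VK_KHR_wayland_surface"); break;
        case Platform::Windows: out.extensions.push_back("VK_KHR_win32_surface"); break;
        case Platform::Android: out.extensions.push_back("VK_KHR_android_surface"); break;
        }
    }
    return out;
}
```

In a real backend, these lists would be converted to `const char*` arrays and passed to vkCreateInstance through VkInstanceCreateInfo, with the validation features chained in through VkValidationFeaturesEXT.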

After creating the instance, Anki uses Volk to load all required Vulkan entry points with volkLoadInstance(VkInstance), sets up debug callbacks with vkCreateDebugUtilsMessengerEXT(VkInstance, const VkDebugUtilsMessengerCreateInfoEXT*, const VkAllocationCallbacks*, VkDebugUtilsMessengerEXT*), and enumerates the physical devices with vkEnumeratePhysicalDevices(VkInstance, uint32_t*, VkPhysicalDevice*).

Once instance initialization is complete, the instance is used to create the surface. Anki supports surface creation via SDL, on Android, and for the headless case.

The instance is used again later, when DLSS needs to be initialized. This happens when the renderer initializes its renderer objects: one of them, TemporalUpscaler, uses a GrUpscaler, which can use DLSS.

InstanceAnkiVK

Both the graphics manager (GrManager) and the renderer (Renderer) are initialized when the application (App) is initialized.

BGFX

InstanceBgfx

BGFX has a Dxgi struct that manages DXGI objects such as the IDXGIFactory. When the application runs, the engine initializes the instance, when available, during the main loop (Context::renderFrame). Context has a renderer context, which holds the instance.
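To make the "initialized during the main loop" idea concrete, here is a minimal sketch of deferred initialization. It reuses the names Context, RendererContextI, RendererContextD3D12, renderFrame, and rendererExecCommands from the text, but the bodies are simplified stand-ins and not BGFX's actual code. The `Command` enum, the `pending` queue, and the `factoryCreated` flag are assumptions made for illustration.

```cpp
#include <memory>
#include <queue>
#include <utility>

// Simplified stand-ins for the types named in the text; not BGFX's real code.
struct RendererContextI {
    virtual ~RendererContextI() = default;
    virtual bool hasInstance() const = 0;
};

struct RendererContextD3D12 : RendererContextI {
    bool factoryCreated = false;  // hypothetical stand-in for Dxgi::m_factory
    bool init() { factoryCreated = true; return true; }
    bool hasInstance() const override { return factoryCreated; }
};

// Hypothetical command set; only what the sketch needs.
enum class Command { RendererInit, RendererShutdown };

struct Context {
    std::unique_ptr<RendererContextI> renderCtx;  // stays null until an init command runs
    std::queue<Command> pending;

    // Drains queued commands; the renderer context (and with it the instance)
    // is created here, not when Context itself is constructed.
    void rendererExecCommands() {
        while (!pending.empty()) {
            const Command cmd = pending.front();
            pending.pop();
            if (cmd == Command::RendererInit && !renderCtx) {
                std::unique_ptr<RendererContextD3D12> ctx(new RendererContextD3D12());
                if (ctx->init()) {
                    renderCtx = std::move(ctx);
                }
            } else if (cmd == Command::RendererShutdown) {
                renderCtx.reset();
            }
        }
    }

    // Returns true when a renderer context exists after processing this frame.
    bool renderFrame() {
        rendererExecCommands();
        return renderCtx != nullptr;
    }
};
```

The point of the pattern is that the instance lives and dies with the renderer context, so code that only talks to Context never touches DXGI or Vulkan objects directly.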

Enabled layers:

Enabled extensions:

@startuml Anki DXGI
class MakeSingletonPtr

class GrManager {
    +init(GrManagerInitInfo&): Error
}

MakeSingletonPtr <|-- GrManager

class GrManagerImpl {
    -m_crntSwapchain: MicroSwapchainPtr
    +initInternal(const GrManagerInitInfo&): Error
}

GrManager <|-- GrManagerImpl

class MicroSwapchain {
    +MicroSwapchain()
    -initInternal(): Error
}

GrManagerImpl *-- MicroSwapchain

class App {
    +init(): Error
    -initInternal(): Error
}
@enduml

@startuml Anki VkInstance
class MakeSingletonPtr

class GrManager {
    +init(GrManagerInitInfo&): Error
    +newGrUpscaler(const GrUpscalerInitInfo&): GrUpscalerPtr
}

MakeSingletonPtr <|-- GrManager

class GrManagerImpl {
    -m_instance: VkInstance
    +getInstance(): VkInstance
    +initInternal(const GrManagerInitInfo&): Error
    +initInstance(): Error
    +initSurface(): Error
}

GrManager <|-- GrManagerImpl

class GrObject

class GrUpscaler

GrObject <|-- GrUpscaler

class GrUpscalerImpl {
    +initInternal(const GrUpscalerInitInfo&): Error
    -{static} newInstance(const GrUpscalerInitInfo&): GrUpscaler*
    -initDlss(const GrUpscalerInitInfo&): Error
}

GrUpscaler <|-- GrUpscalerImpl

class RendererObject

class TemporalUpscaler {
    -m_grUpscaler: GrUpscalerPtr
    +init(): Error
}

RendererObject <|-- TemporalUpscaler
TemporalUpscaler *-- GrUpscaler

class Renderer {
    -m_temporalUpscaler: TemporalUpscaler
    +init(const RendererInitInfo&): Error
    -initInternal(const RendererInitInfo&): Error
}

class MakeSingleton

MakeSingleton <|-- Renderer
Renderer *-- TemporalUpscaler

class App {
    +init(): Error
    -initInternal(): Error
}
@enduml

@startuml BGFX RHI Instance

struct RendererContextI

struct Dxgi {
    +m_factory: FactoryI*
    +init(_caps: Caps&): bool
}

struct RendererContextD3D12 {
    +m_dxgi: Dxgi
    +init(_init: const Init&)
}

RendererContextI <|-- RendererContextD3D12
RendererContextD3D12 *-- Dxgi

struct RendererContextVK {
    +m_instance: VkInstance
    +init(_init: const Init&)
}

RendererContextI <|-- RendererContextVK

struct Context {
    +m_renderCtx: RendererContextI*
    +renderFrame(int32_t): RenderFrame::Enum
    +rendererExecCommands(CommandBuffer&)
}

Context *-- RendererContextI

@enduml

Diligent Engine

Diligent Engine does not keep track of the DXGI adapter or factory. The adapter can be retrieved from the D3D12 device by its LUID, and the factory can be created at any time.
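The LUID lookup can be pictured as a simple match between the device's adapter LUID and the LUIDs of the enumerated adapters. The sketch below models only that matching step in portable C++. `Luid`, `AdapterRecord`, and `findAdapterForDevice` are hypothetical names. In real D3D12 code the LUID would come from ID3D12Device::GetAdapterLuid and the lookup could be done with IDXGIFactory4::EnumAdapterByLuid, though the text does not say which calls Diligent Engine uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Same two fields as the Windows LUID struct (DWORD LowPart, LONG HighPart).
struct Luid {
    std::uint32_t lowPart;
    std::int32_t highPart;
};

bool sameLuid(const Luid& a, const Luid& b) {
    return a.lowPart == b.lowPart && a.highPart == b.highPart;
}

// Hypothetical record for one enumerated adapter.
struct AdapterRecord {
    Luid luid;
    std::string description;
};

// Returns the index of the adapter the device was created on, or -1 if none matches.
int findAdapterForDevice(const std::vector<AdapterRecord>& adapters, const Luid& deviceLuid) {
    for (std::size_t i = 0; i < adapters.size(); ++i) {
        if (sameLuid(adapters[i].luid, deviceLuid)) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

Because the LUID identifies the adapter for as long as the system is running, nothing besides the device itself has to be stored to find the adapter again.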

Direct3D 12:

  1. Load d3d12.dll
  2. Find Adapters
    1. Create a DXGI Factory CreateDXGIFactory1
    2. Enumerate adapters and check whether each adapter can create a D3D12 device at the minimum feature level (IDXGIFactory::EnumAdapters)
    3. For each enumerated adapter,
      1. Create the D3D12 Device that supports the highest feature level
      2. Check supported features
      3. If the tiled resource tier is greater than or equal to 1, load NVAPI (if NVAPI is enabled)
      4. Check outputs
    4. For each enumerated adapter,
      1. Get the best adapter (discrete > integrated, more memory)
    5. Get display modes (IDXGIOutput::GetDisplayModeList)
    6. Create debug layer
    7. Create DXGI factory and get the predetermined adapter, and create a D3D12 device that supports the highest feature level
    8. Create the info queue
    9. Create a direct command queue as the default immediate context, and its fence
    10. Create Diligent Engine’s D3D12 render device
      1. Create query managers
      2. Create shader compilation thread pool
    11. For each immediate context,
      1. Create Diligent Engine’s D3D12 immediate context
    12. Create a swap chain
    13. Create a FrameLatencyWaitableObject from the swap chain IDXGISwapChain2::GetFrameLatencyWaitableObject
    14. Create a texture for each back buffer, and create their RTVs
    15. Create a depth buffer texture, and its DSV
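The "best adapter" rule in the list above (discrete before integrated, then more memory) can be sketched as a lexicographic comparison. `AdapterType`, `AdapterInfo`, and `pickBestAdapter` are hypothetical names used for illustration and are not Diligent Engine's actual types. The sketch also assumes that ties keep the adapter that was enumerated first.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical adapter classification, declared in ascending order of preference.
enum class AdapterType { Unknown = 0, Integrated = 1, Discrete = 2 };

struct AdapterInfo {
    std::string name;
    AdapterType type;
    std::uint64_t memoryBytes;  // local video memory
};

// Returns the index of the preferred adapter, or -1 when the list is empty.
// The adapter type is compared first; memory size only breaks ties within a type.
int pickBestAdapter(const std::vector<AdapterInfo>& adapters) {
    int best = -1;
    for (std::size_t i = 0; i < adapters.size(); ++i) {
        const AdapterInfo& candidate = adapters[i];
        if (best < 0 ||
            std::tie(candidate.type, candidate.memoryBytes) >
                std::tie(adapters[best].type, adapters[best].memoryBytes)) {
            best = static_cast<int>(i);
        }
    }
    return best;
}
```

With this ordering, an integrated adapter that reports more memory still loses to any discrete adapter.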

Vulkan:

  1. Find Adapters
    1. Create Vulkan instance
      1. Initialize Volk
        1. Load vulkan-1.dll
        2. Use volk to load Vulkan functions
      2. Add instance extensions
        1. VK_KHR_surface
        2. VK_KHR_win32_surface / VK_KHR_android_surface / VK_KHR_wayland_surface / VK_KHR_xlib_surface / VK_KHR_xcb_surface / VK_EXT_metal_surface
        3. VK_KHR_get_physical_device_properties2
      3. Create Vulkan instance
      4. Load instance-related function using Volk
      5. Set up debug layer
      6. Enumerate physical devices
      7. For each device,
        1. Get properties, features, memory properties, queue family properties
        2. Check supported extensions and add features to query accordingly
          1. VK_KHR_shader_float16_int8
          2. VK_KHR_storage_buffer_storage_class
            1. VK_KHR_16bit_storage
            2. VK_KHR_8bit_storage
          3. VK_EXT_mesh_shader
          4. VK_KHR_acceleration_structure
          5. VK_KHR_ray_tracing_pipeline
          6. VK_KHR_ray_query
          7. VK_KHR_buffer_device_address
          8. VK_EXT_descriptor_indexing
          9. VK_KHR_spirv_1_4
          10. VK_KHR_portability_subset
          11. VK_EXT_vertex_attribute_divisor
          12. VK_KHR_timeline_semaphore
          13. VK_KHR_multiview
          14. VK_KHR_create_renderpass2
          15. VK_KHR_fragment_shading_rate
          16. VK_EXT_fragment_density_map
          17. VK_EXT_host_query_reset
          18. VK_KHR_draw_indirect_count
          19. VK_KHR_maintenance3
          20. VK_EXT_multi_draw
        3. Check physical device info
      8. For each enumerated adapter,
        1. Get the best adapter (discrete > integrated, more memory)
  2. Create device and contexts
    1. Use device extensions
      1. VK_KHR_swapchain
      2. VK_KHR_maintenance1
      3. VK_EXT_mesh_shader
      4. VK_KHR_shader_float16_int8
      5. VK_KHR_storage_buffer_storage_class
        1. VK_KHR_16bit_storage
        2. VK_KHR_8bit_storage
      6. VK_KHR_acceleration_structure
      7. VK_KHR_ray_tracing_pipeline
      8. VK_KHR_ray_query
      9. VK_KHR_buffer_device_address
      10. VK_EXT_descriptor_indexing
      11. VK_KHR_spirv_1_4
      12. VK_KHR_portability_subset
      13. VK_EXT_vertex_attribute_divisor
      14. VK_KHR_timeline_semaphore
      15. VK_KHR_multiview
      16. VK_KHR_create_renderpass2
      17. VK_KHR_fragment_shading_rate
      18. VK_EXT_fragment_density_map
      19. VK_EXT_host_query_reset
      20. VK_KHR_draw_indirect_count
      21. VK_KHR_maintenance3
      22. VK_KHR_maintenance2
      23. VK_EXT_multi_draw
    2. Enable device features
    3. Find a queue family that supports both the graphics and compute queue flags
    4. Create a logical device
    5. Use volk to load related functions
    6. Enable features
    7. Create a graphics context queue (from the Vulkan queue we enumerated)
    8. Create a Vulkan render device
      1. Create transient command pool managers and query manager per command queues
      2. Create shader compilation thread pool
    9. Create a Vulkan device context
      1. Allocate a command buffer and begin
      2. Reset stale queries
      3. Create a dummy vertex buffer
      4. Create an acceleration structure (AS) compacted-size query pool
    10. Set created device context as the immediate context of the render device
  3. Create swap chain
    1. Create a surface
    2. Check if the current queue can present to the given surface
    3. Check whether the surface's supported formats include the requested color format
    4. Check surface capabilities and present modes
    5. Set the present mode based on the VSync setting
      1. If VSync is enabled
        1. FIFO relaxed
        2. FIFO
      2. If VSync is disabled
        1. Mailbox
        2. Immediate
        3. FIFO
    6. Set the number of back buffers based on the hardware limits and the desired back buffer count
    7. Create Vulkan swap chain
    8. For each back buffer,
      1. Create semaphores for
        1. Image acquisition
        2. Draw completion
      2. Create an image acquisition fence
    9. For each back buffer,
      1. Create a texture
      2. Create RTV
    10. Create a depth buffer texture
    11. Create default DSV
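Two of the swap chain steps above are small decision procedures: picking a present mode from a preference list and clamping the back buffer count to the surface limits. The sketch below models both without the Vulkan headers. `PresentMode`, `choosePresentMode`, and `chooseBackBufferCount` are hypothetical names, and the preference order is taken from the list above. The FIFO fallback relies on Vulkan requiring FIFO support, and a maximum of 0 is treated as "no upper limit", as with VkSurfaceCapabilitiesKHR::maxImageCount.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for VkPresentModeKHR.
enum class PresentMode { Immediate, Mailbox, Fifo, FifoRelaxed };

// Walks the preference list for the given VSync setting and returns the first
// mode the surface supports. Falls back to FIFO, which Vulkan requires every
// presentable surface to support.
PresentMode choosePresentMode(bool vsync, const std::vector<PresentMode>& supported) {
    const std::vector<PresentMode> vsyncOrder = {PresentMode::FifoRelaxed, PresentMode::Fifo};
    const std::vector<PresentMode> noVsyncOrder = {PresentMode::Mailbox, PresentMode::Immediate,
                                                   PresentMode::Fifo};
    const std::vector<PresentMode>& order = vsync ? vsyncOrder : noVsyncOrder;
    for (PresentMode mode : order) {
        if (std::find(supported.begin(), supported.end(), mode) != supported.end()) {
            return mode;
        }
    }
    return PresentMode::Fifo;
}

// Clamps the desired back buffer count to the surface limits.
// maxImages == 0 means the surface reports no upper limit.
std::uint32_t chooseBackBufferCount(std::uint32_t desired, std::uint32_t minImages,
                                    std::uint32_t maxImages) {
    std::uint32_t count = std::max(desired, minImages);
    if (maxImages != 0) {
        count = std::min(count, maxImages);
    }
    return count;
}
```

For example, with VSync disabled on a surface that supports only FIFO and Immediate, the sketch skips Mailbox and selects Immediate.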

Filament

  1. Create instance
    1. Add VK_LAYER_KHRONOS_validation if validation is enabled
    2. Create Vulkan instance (vkCreateInstance)
    3. Load functions (using BlueVK)
  2. Select physical device
    1. Enumerate devices (vkEnumeratePhysicalDevices)
    2. For each device,
      1. Get their properties and check versions, graphics bit support (vkGetPhysicalDeviceProperties, vkGetPhysicalDeviceQueueFamilyProperties)
      2. Enumerate extension properties (vkEnumerateDeviceExtensionProperties)
      3. Check for swap chain extension support
    3. Sort devices by device type (discrete/integrated) and pick the best one (discrete > integrated > cpu > virtual gpu > other)
  3. Print device info (vkGetPhysicalDeviceProperties2, vkGetPhysicalDeviceProperties)
  4. Check physical device properties, features, and memory properties (vkGetPhysicalDeviceFeatures2, vkGetPhysicalDeviceMemoryProperties)
  5. Select appropriate queues
  6. Create logical device (vkCreateDevice)
  7. Get queue (vkGetDeviceQueue)
  8. Create Vulkan driver
    1. Create commands
      1. Create command pool (vkCreateCommandPool)
        1. Create command buffers, and for each command buffer,
          1. Allocate command buffer from the pool (vkAllocateCommandBuffers)
          2. Create submission semaphore (vkCreateSemaphore)
          3. Create fence (vkCreateFence)
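The physical device selection in step 2 above amounts to a filter followed by a sort. The sketch below shows that shape under stated assumptions: `DeviceType`, `Candidate`, `typeRank`, and `rankDevices` are hypothetical names rather than Filament's own, the enumerators follow the order of VkPhysicalDeviceType, and the version check mentioned in the list is omitted.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in; enumerators follow the order of VkPhysicalDeviceType.
enum class DeviceType { Other, IntegratedGpu, DiscreteGpu, VirtualGpu, Cpu };

struct Candidate {
    std::string name;
    DeviceType type;
    bool hasGraphicsQueue;  // some queue family reports the graphics bit
    bool hasSwapchainExt;   // the device exposes the swap chain extension
};

// Lower rank is better: discrete > integrated > cpu > virtual gpu > other.
int typeRank(DeviceType type) {
    switch (type) {
    case DeviceType::DiscreteGpu:   return 0;
    case DeviceType::IntegratedGpu: return 1;
    case DeviceType::Cpu:           return 2;
    case DeviceType::VirtualGpu:    return 3;
    case DeviceType::Other:         return 4;
    }
    return 4;
}

// Drops devices that cannot render or present, then orders the rest by type.
// The first element of the result, if any, is the device to use.
std::vector<Candidate> rankDevices(std::vector<Candidate> devices) {
    devices.erase(std::remove_if(devices.begin(), devices.end(),
                                 [](const Candidate& c) {
                                     return !(c.hasGraphicsQueue && c.hasSwapchainExt);
                                 }),
                  devices.end());
    std::stable_sort(devices.begin(), devices.end(),
                     [](const Candidate& a, const Candidate& b) {
                         return typeRank(a.type) < typeRank(b.type);
                     });
    return devices;
}
```

Using a stable sort keeps the enumeration order among devices of the same type, so the outcome is deterministic when, for instance, two discrete GPUs are present.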