The only reference solution calls remaining to eliminate are those that are responsible for Vulkan initialization and the main loop that runs our code.
This code is boilerplate; it's going to appear (in approximately this form) in any real-time graphics application built on Vulkan.
Swapchain Management
We just dealt with the application side of the swapchain in the last step.
So let's work through the code on the harness side.
The relevant data is stored in the RTG structure:
The surface (VkSurfaceKHR surface) is where the swapchain (VkSwapchainKHR swapchain) images will be presented (shown),
and our code that recreates the swapchain will be called whenever the current swapchain is no longer compatible with the surface (because, e.g., the surface changed size).
The swapchain itself consists of a list of images, all of the same size (VkExtent2D swapchain_extent), which our code will fetch handles to into swapchain_images, and make views of in swapchain_image_views.
Making the Swapchain
Taking a look at the signature of the refsol:: code, you might get the feeling that we've got a few steps to get through here; and we do:
Our plan for this function is to discard the old swapchain (if it exists), create the new swapchain, and then extract handles for the swapchain images and make image views for them.
Destroy the Old Swapchain
Let's start by destroying the old swapchain (if needed):
A note on clean-up: you can more gracefully handle changing a swapchain by using the .oldSwapchain parameter of the swapchain creation function;
for the purposes of this tutorial we're just going to do a slightly less efficient but nonetheless correct thing and destroy the swapchain and rebuild it from scratch.
Determine Swapchain Size (etc.)
Speaking of rebuilding the swapchain, we need to know some things about the surface we're going to be building the swapchain for before we can build it.
So we'll add code to query the VkSurfaceCapabilitiesKHR of the surface and extract the data we need:
We're determining three things here.
The swapchain_extent is straightforward -- we just use the current size of the surface.
Similarly, the transform alluded to in the comment is capabilities.currentTransform, which we just pass on to the swapchain creation function.
We set the number of images we'll ask for in the swapchain (requested_count) to one more than the minimum supported, but clamp it to the maximum supported count (which will only be non-zero if there is a defined maximum).
Our intent here is that the minimum probably reflects the number of images that will be simultaneously in use by the presentation system plus one (since if all the images in the swapchain are in use by the presentation system, no images will be available to render into).
We add one more to allow some amount of parallelism in rendering.
Create the Swapchain
With our parameters computed, we now can create the swapchain itself:
The only odd bit of this creation procedure is the code around queue indices.
If the queue that images are presented on is different from the queue we're running graphics commands on, we need to set some extra parameters so the swapchain creation function knows to create images that are shared between queues.
(We could, alternatively, do a queue family ownership transfer on the image after rendering and before presenting;
by making the image shared here we save having to write that code.)
Get Image Handles
Creating the swapchain created a list of images, but we can't do anything with those images without VkImage handles.
So our next chunk of code fetches those handles:
This "two-calls" pattern is common in the Vulkan API.
Queries that return a variable-length array will return just the length if the data parameter is nullptr;
so the first call is getting the array size and the second is actually fetching the contents.
Note, also, that these image handles are all owned by the swapchain; so we don't need to individually destroy them later.
Make Image Views
Vulkan code that accesses images generally does so through an image view (handle type: VkImageView).
So we might as well create image views for all our swapchain images here.
At this point the code should compile and run, and your renderer should proceed as usual.
Extra: Print Debugging Information
We might as well report a little bit about the swapchain when it is created:
If you were curious, this might also be a good place to print out information about the min and max image counts supported by the surface, the current present mode, or the current transformation of the surface.
Cleaning up the Swapchain
To destroy the swapchain, we run the creation steps in reverse:
There are two subtleties to this code.
The first is that we call vkDeviceWaitIdle to ensure that nothing is actively rendering to a swapchain image while we are freeing it.
The second is that we don't need to call vkDestroyImage on the swapchain image handles, since these are owned by the swapchain itself.
(We do need to destroy the image views, since we created those ourselves.)
After this replacement, the code should again compile and run properly.
When testing, make sure to resize the window a few times so this code gets called repeatedly.
Running the Application
"But wait," you exclaim, "I've been running bin/main this whole time."
True enough.
What the title of this section references is RTG::run, the "harness" that connects an RTG::Application (like Tutorial) to the windowing system and GPU.
To understand the job of the run function, take a look at the functions it will need to call on an application structure:
So RTG::run will need to capture events (provided by the GLFW window system interface layer);
notify the application of swapchain changes;
tell the application about elapsed time;
and let the application know when it is time to render.
Let's sketch out the framework for these functions:
Notice that the core of our run function is a while loop that will run until GLFW lets us know the window should be closed via the glfwWindowShouldClose call.
Time Handling
The simplest thing that our run loop must do is let the application know about elapsed time.
We can do this by using the functions provided by the standard library in the std::chrono namespace.
To start with, we establish a time point outside the loop:
Now, every trip through the loop, our code gets the current time, reports the difference to the application, and updates the value of before:
Notice the line dt = std::min(dt, 0.1f); -- this is here to make sure that frame times don't fall into a black hole if the application's update function starts doing updates slower than real-time.
(To see why this might be a problem, consider what would happen without it if application.update(dt) started taking \( 2 dt \) seconds to return -- exponential update-time growth!)
Note that at this point the code will compile and run, but you won't be able to close the window (at least on some platforms) because we're not letting GLFW check for events.
Event Handling
GLFW already converts events into platform-neutral callback parameters;
our code just needs to capture these and translate them into InputEvents for consumption by the application code.
First, a quick look at the InputEvent union:
The crucial thing to notice is that this is a union, not a struct -- this means that the members are all overlapped in memory.
We use the Type type value (which is a member of every branch of the union) to determine which branch of the union is valid to access.
A more C++-y way to do the same thing would be to use a std::variant.
Regardless, let's get our event handling set up by making a vector in which to store events and installing event handlers to receive the events from GLFW:
Notice, also, the use of GLFW's window user pointer to pass a pointer to our event queue into the handlers.
Of course, we should definitely unset this user pointer and the event handlers at the end of the loop:
Then, during the loop itself we can tell GLFW to read events from the windowing system (which will call our callbacks), then drain our event queue into the application's event handling function:
Which reminds me: we should probably write those event handlers.
I put these in RTG.cpp just above the run function, and declare them static so the symbol doesn't get exported where it might collide with functions in other files.
The cursor position callback generates a mouse motion event.
It needs to do a bit of extra work to generate the bitmask of currently-pressed buttons.
The memset here is to make sure that any parts of the union we don't write are in a known (and boring) state.
The mouse button callback is similar, with some added code to convert the press and release actions to appropriate Type values:
The scroll callback is probably the most straightforward of the bunch -- it just copies parameters into the event structure:
And the keyboard callback does a bit of type-determining but otherwise just copies values:
I chose to have the code ignore key repeats because they are almost never what you want to use in an interactive application.
(Key repeats behave very differently across computers and OSs, and users almost never know how to change the behavior.)
With these event callbacks defined, the code will compile and run (and you'll be able to close the window!); but it still won't show anything.
Render and Swapchain Handling
We are going to tackle the final two functions of the application together, for reasons that will soon become evident.
First, we need to call Application::on_swapchain before the loop starts to give the application the chance to create framebuffers for the current state of the swapchain.
We'll wrap this call in a lambda because we'll need it again in a few more places later in this function.
(But be sure to also call it here!)
On to the code in the loop.
To support rendering, we need to do four things: get a workspace that isn't being used, acquire an image to render into, let the application use the workspace to render into the image, and (finally) hand that image to Vulkan for presentation.
This is complicated by the fact that both acquiring and presenting an image may trigger swapchain recreation.
Let's sketch the code out:
Acquiring the Workspace
Acquiring a workspace means getting a set of buffers that aren't being used in a current rendering operation.
How do we know a workspace isn't being used?
Each workspace has an associated workspace_available fence, which is signaled when the rendering work on this workspace is done.
So our workspace acquisition code just pulls out the next workspace in the list and then waits until any rendering work that is using it finishes:
By the way, this workspace_available fence is exactly the one our render function passes to vkQueueSubmit to signal when the rendering work is done.
Acquiring an Image
Since images are owned by the swapchain (and, generally, the window system interface layer), we acquire them through a call -- vkAcquireNextImageKHR -- to that layer.
Unfortunately, the call can fail in two ways we need to handle gracefully, so we can't use the VK macro here.
Notice that we also pass a VkSemaphore to the vkAcquireNextImageKHR function.
It will queue work to signal this semaphore when the swapchain image with the returned index is ready to be rendered to.
Returning early like this allows render preparation code to proceed in parallel with the image finishing presentation.
(Recall, also, how we added code to wait on this semaphore to our queue submit code in Tutorial::render!)
Also, yes, I snuck a goto into the code.
If you are foundationally opposed, feel free to try another structure.
I've tried a few and I think this particular one is no worse.
Calling the Render Function
With workspace and image in hand, our code now assembles a RenderParams parameter structure and hands it to the application:
Presenting the Image
The present function needs a semaphore to know when the rendering work is done.
Thankfully, we passed one (image_done) to the rendering function to signal in just this circumstance.
Also note that we -- again -- need to do some careful return value handling.
And with that we should have the code compiling and drawing things again!
Let's celebrate by removing that warning comment from the input event header:
Admittedly, it's still a true comment; we just aren't using RTG_run in our code any longer.
Workspace Wrangling
Since we just got done with the run-loop code that uses the synchronization primitives in RTG::PerWorkspace we might as well write the code to create and destroy those primitives.
A quick reminder of the per-workspace data we need to create and destroy:
We create the fence and semaphores with -- as you would expect -- fence and semaphore creation functions.
The only subtlety here is that we pass the VK_FENCE_CREATE_SIGNALED_BIT as a flag when creating the fence, since our run-loop code waits on the fence to make sure the workspace is available.
If the fences didn't start signalled, the run-loop code would wait forever when trying to acquire the next workspace.
If you want to spice things up a bit, consider restructuring the loop so you can use the same two create_info structures for all workspaces.
Workspace Destruction
For every Create there must be a Destroy.
The code should, again, compile and work without complaint.
Two more refsol:: functions done.
Initializing Vulkan
Finally, we come to the end of our tutorial.
And what better way to end than at the outermost resource scope -- the code that actually sets up and tears down Vulkan.
This code is primarily concerned with five items:
the Vulkan instance (VkInstance instance) -- the handle to the library;
the physical device (VkPhysicalDevice physical_device) -- the handle to the GPU;
the logical device (VkDevice device) -- the handle to our code's view of the GPU;
the window (GLFWwindow *window) -- the handle to the window our code is showing output in (managed by GLFW);
and the surface (VkSurfaceKHR surface) -- Vulkan's view of the part of the window that shows our graphics.
Tearing down Vulkan
We'll start by writing the code that tears down each of these items:
The debug_messenger holds information about the callback function that we've been using to get information from the validation layer.
We'll create this structure along with the Vulkan instance.
Creating a Vulkan Instance
The first thing any Vulkan code needs to do is create a Vulkan instance.
This is the handle that you use to access the rest of the library.
When creating an instance, you also tell Vulkan about the extensions and layers you want to use.
Extensions are extra functionality that your Vulkan driver adds to support certain use-cases or extra hardware features; the Window System Interface that we've been using to create swapchains and present images consists of several(!) extensions.
Layers are extra functionality provided by libraries that sit between your code and the driver's Vulkan implementation; the validation layer that provides so much nice debugging information is one we've already been using.
Let's start by sketching out the creation function.
As you can see, we've got a bunch of extensions and layers to add into the respective lists.
We request extensions and layers by adding a pointer to a string containing the name of the extension to the lists that get passed to the creation function.
For extensions, there are helpful #defines for the names (so you can avoid typos); for layers you will need to type carefully.
If we request an extension or layer that is not available, instance creation will fail.
First, the portability layer extensions (which are only needed on macOS).
These allow our app to work on macOS through the MoltenVK translation layer between Vulkan and Metal.
If you aren't on an Apple system, add them anyway since you may want to run your code on a mac at some point.
(Yes, I'm skipping the debug extensions for the moment, we'll handle those below.)
Next, we'll handle the extensions needed for GLFW (the library we're using to get a window).
GLFW provides two relevant function calls for this process -- glfwVulkanSupported, which tells us if the GLFW version we're using can actually do things with Vulkan;
and glfwGetRequiredInstanceExtensions, which returns an array of extensions that GLFW wants.
The Debug Messenger
For debugging, we're using both the debug utils extension (which allows us to get debug messages delivered to a callback of our choosing), and the validation layer (which will check that our Vulkan usage comports with the specification).
The debug utils extension allows you to have debugging messages delivered to a custom logging function.
Let's write one (above RTG::RTG):
Note that the \x1b[... parts of the strings are ANSI escape codes;
these escape codes will ensure that compliant terminals print our error logging messages in color.
(If your error messages aren't in color, you are probably running on Windows, either on an old build that doesn't support ANSI color in all terminals or on a codepage that causes the escape sequences to be interpreted as something else.
You can fix the codepage thing by including a manifest; see this example build command and manifest.)
To install a debug message handler you call -- unsurprisingly -- a create function.
But the create function requires an instance.
So how do you get debug messages back from the instance create function?
The debug utils extension provides an interesting hack to work around this problem:
you can pass the create info you would use to install the debug handler as part of the pNext chain of your instance create info structure.
So let's define the debug messenger create info structure now:
And we can pass it to instance creation if debugging is enabled:
And, once the instance is created, we can call the debug messenger creation function:
Notice that, because vkCreateDebugUtilsMessengerEXT is an extension function, we need to get its address dynamically using vkGetInstanceProcAddr before we can call it.
This is likely to be a familiar bit of code if you have ever used OpenGL extensions.
At this point, your code should compile and run as before (with exactly the same number of debug complaints).
Creating a Surface
The process of creating a window and surface differs between windowing systems.
GLFW abstracts the process to a pair of calls:
Selecting a Physical Device
To select a physical device, our code will look through all of the available devices and pick the best one (more on that in a moment):
To pick the best physical device, our code will either (a) look for a name matching the configuration, if one was specified or (b) look for a device with a high "score" for a simple scoring function:
As you can see, the scoring function just looks for any discrete GPU.
You might -- at some point -- want to refine this to look for specific features of interest.
Finally, we add code to print a nice error message if no GPUs are found:
Selecting a Surface Format and Presentation Mode
Now that we have a physical device and surface, our code can determine what surface formats (i.e., storage format and color space) and present modes are supported by this combination.
This is also a good place to write some debug code to list all the supported surface formats and present modes.
Creating the Logical Device
Finally, we can create the logical device -- the root of all our application-specific Vulkan resources.
Part of creating the logical device is also getting handles to the queues we will be submitting work on.
So let's sketch that process out:
Queues come from various families.
We need to find a queue family (index) that supports graphics, and one that we can use to present on the supplied surface.
So we write code to list all of the queue families and check for the desired properties on each:
Note that the queue family variables have type std::optional< uint32_t > which is why we can both check them as bools (testing if they contain a value) and set them to indices.
Device and instance extensions are separate. We only need one device extension: the one that lets us create swapchains. (Well, and one for portability on macOS.)
There are also parameters in device creation for device layers, but these are deprecated because it wasn't clear what they were useful for.
Now that we know the indices of the queues we want, and the device extensions to enable, we can construct the appropriate create info structure and actually make the device:
Notice that we also take the time to get handles to the queue(s) after creating the device.
(If the queue indices are the same, this is just getting two handles to the same queue, but that's okay.)
Congratulations
You've now removed all of the refsol:: code from RTG.cpp and the build as a whole.
Nicely done.
Celebrate by removing the refsol.hpp include:
And then go even further by patching the refsol.o out of the build altogether:
The code should continue to build and run without problems.
Afterword
Wow! You made it!
You now know everything you need to know to make your own pipelines, shaders, render passes, and more.
Nice job.
If you have any comments on this tutorial, or even if you just want to show off some of the cool creative exercises you've embarked upon, feel free to file an issue against the nakluV repository.