Babylon Native in a Headless Environment
When we say rendering, we are often talking about rendering at 60fps for an application, whether it be a game or another application that utilizes the GPU. There are, however, other scenarios where we might want to use the GPU to run processes that never display anything at all, such as processing a video, manipulating images, or rendering 3D assets, perhaps all running on a server. In this article, I will describe how such scenarios can be achieved with Babylon Native. Specifically, I will show an example of how we can use Babylon Native to capture screenshots of 3D assets using DirectX 11 on Windows.
If you haven’t already read some of our other stories regarding Babylon Native, especially if you haven’t heard of Babylon Native before, it might be worth some time to read them to gain some context before continuing.
Disclaimer: The API contracts used in this example are subject to change as the core team is still working on the correct API contract shape.
ConsoleApp
The example repo is located here. It uses CMake to generate Visual Studio projects targeting Windows. Babylon Native and DirectXTK dependencies are included via submodules and consumed in CMake. The DirectXTK dependency is used just to save a DirectX texture to a PNG file. The meat of the application is a file called App.cpp plus the JavaScript counterpart in index.js. Let’s dive into some details starting with the native side.
The Graphics Device
First, we need to create a standalone DirectX device.
This code is nothing unusual, but it can be tweaked to use WARP, for example, if the environment doesn’t have a GPU.
Next, we will create a Babylon Native graphics device using this DirectX device.
We must specify the width and height explicitly (1024x1024 in this example) because, unlike the typical case, the Babylon Native device is not associated with a window or view.
The JavaScript Host
And, of course, we must also create the JavaScript host environment, in this case using Chakra (the default for Windows), to load the Babylon.js core and loaders modules as well as the index.js mentioned earlier where the JavaScript logic sits. Afterwards, we also start rendering a frame, which unblocks the JavaScript so that it can queue graphics commands.
Using Chakra is convenient with Visual Studio since we can add a debugger; statement anywhere in the JavaScript code and the Visual Studio Just-In-Time Debugger will prompt with a dialog to debug the JavaScript. Note that the application must be running in the Debug configuration for this to work.
The Output Texture
We must also create an output render target texture for the outputRenderTarget of the Babylon.js camera. First, we create a DirectX render target texture.
Then, we expose the native texture to JavaScript via a Babylon Native plugin called ExternalTexture.
Note that because this is not a normal rendering application, we are explicitly rendering individual frames and thus we also need synchronization constructs (std::promise in this case) to ensure the correct order. As noted in the documentation for ExternalTexture, the ExternalTexture::AddToContextAsync function requires that the graphics device renders one frame before it will complete. The addToContext future will wait until AddToContextAsync is called, and FinishRenderingCurrentFrame will render a frame to allow AddToContextAsync to finish.
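To make the ordering concrete, here is a minimal sketch of the same handshake in plain C++. It does not use Babylon Native at all: FakeDevice, RequestAfterNextFrame, WaitForFrame, FinishFrame, and RunHandshake are hypothetical names invented for this illustration, and a plain std::thread stands in for the JavaScript thread. The only point is to show one side signalling "the async operation has been requested" through a std::promise, and the other side finishing a frame only after that signal arrives.

```cpp
#include <cassert>
#include <condition_variable>
#include <future>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for a graphics device that renders explicit frames.
class FakeDevice {
public:
    // Returns the number of the frame that must finish before a request
    // made right now is allowed to complete.
    int RequestAfterNextFrame() {
        std::lock_guard<std::mutex> lock(mutex_);
        return finishedFrames_ + 1;
    }

    // Blocks until the given frame has finished.
    void WaitForFrame(int frame) {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [&] { return finishedFrames_ >= frame; });
    }

    // Marks the current frame as finished and wakes up any waiters.
    void FinishFrame() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++finishedFrames_;
        }
        cv_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int finishedFrames_ = 0;
};

// Runs the handshake once and returns the steps in the order they happened.
std::vector<std::string> RunHandshake() {
    FakeDevice device;
    std::vector<std::string> events;
    std::mutex eventsMutex;
    auto log = [&](const std::string& step) {
        std::lock_guard<std::mutex> lock(eventsMutex);
        events.push_back(step);
    };

    // Fulfilled once the async operation has been requested.
    std::promise<void> requested;

    // Stand-in for the script thread that issues the async operation.
    std::thread script([&] {
        const int frame = device.RequestAfterNextFrame();
        log("requested");
        requested.set_value();
        device.WaitForFrame(frame); // cannot complete until a frame finishes
        log("completed");
    });

    // Native side: wait for the request, then finish a frame to unblock it.
    requested.get_future().wait();
    log("finishing frame");
    device.FinishFrame();

    script.join();
    assert(events.size() == 3);
    return events;
}
```

Calling RunHandshake() yields the steps "requested", "finishing frame", "completed" in that order. If the native side finished the frame before waiting on the future, the request could miss that frame and never complete, which is why the wait comes first.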
The JavaScript (Part 1)
Next, we will review the first part (startup) of the JavaScript side. Ignoring the typical Babylon.js engine and scene setup, this function takes an argument called nativeTexture, which is the texture from the result of AddToContextAsync. This argument is then wrapped using wrapNativeTexture and added as the color attachment of a Babylon.js render target texture. We will see how this is used shortly.
The glTF Assets
Back to the native side, we are now ready to load the glTF assets and capture screenshots.
This might look long, but it is not too complicated. We are looping through each asset and calling the JavaScript function loadAndRenderAssetAsync, waiting for it to complete, and saving a PNG to disk.
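The shape of that loop can be sketched without any Babylon Native or DirectX code. In the sketch below, OutputName and CaptureAll are hypothetical helpers written for this illustration, a std::thread stands in for the JavaScript side, and the "save" step only records the file name that would be written. It is meant to show the per-asset wait, not the actual App.cpp code.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <string>
#include <thread>
#include <vector>

// Hypothetical helper: derives a PNG file name from an asset path or URL,
// e.g. "models/Box.glb" becomes "Box.png".
std::string OutputName(const std::string& asset) {
    const std::size_t slash = asset.find_last_of("/\\");
    const std::string file =
        (slash == std::string::npos) ? asset : asset.substr(slash + 1);
    const std::size_t dot = file.find_last_of('.');
    return file.substr(0, dot) + ".png"; // npos keeps the whole name
}

// Processes the assets one at a time: start the asynchronous render,
// block until it signals completion, then "save" the result.
std::vector<std::string> CaptureAll(const std::vector<std::string>& assets) {
    std::vector<std::string> saved;
    for (const std::string& asset : assets) {
        std::promise<void> done; // fulfilled when rendering has finished
        bool rendered = false;

        // Stand-in for the script side loading and rendering the asset.
        std::thread script([&] {
            rendered = true; // real code would load and render here
            done.set_value();
        });

        // The native side must not read the texture before this returns.
        done.get_future().wait();
        assert(rendered);

        // Stand-in for saving the render target texture to disk.
        saved.push_back(OutputName(asset));
        script.join();
    }
    return saved;
}
```

For example, CaptureAll({"models/Box.glb", "Duck.gltf"}) returns the names Box.png and Duck.png. Waiting inside the loop keeps exactly one asset in flight at a time, which matters when every asset renders into the same output texture.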
The JavaScript (Part 2)
The loadAndRenderAssetAsync function, on the JavaScript side, imports the glTF asset, sets up a camera, waits for the scene to be ready, and renders a single frame. This should look similar to what would happen for a web application using Babylon.js!
The output render target of the camera is assigned the output render target texture from earlier so that the scene will render to this output texture instead of the default back buffer which, of course, doesn’t exist in this context. This, in turn, will render directly to the native DirectX render target texture we set up earlier.
The Result
Building and running the ConsoleApp example looks like this.
Along with three PNG files.
The RenderDoc
There is one more thing! Notice the helper function calls to RenderDoc::StartFrameCapture and RenderDoc::StopFrameCapture? These tell RenderDoc when to start and stop capturing a frame, because RenderDoc cannot know where a frame begins or ends when we are not in the typical rendering case. We can turn on RenderDoc capture by uncommenting one line in RenderDoc.h. Using RenderDoc is incredibly useful for debugging issues with the GPU.
Conclusion
I hope this gives you an idea of how Babylon Native can be used in a headless environment. It is not the typical scenario, but it is a scenario that is more difficult or more expensive to achieve using other technologies. We will continue to strive to make Babylon Native useful in as many scenarios as possible.
Gary Hsu — Babylon Native Team Lead