Mixing Neural Style Transfer Post-Processing Effects with Babylon Native Rendering
Babylon Native is a technology that allows native applications (usually developed in C++) to perform rendering using Babylon.js scripts. This not only gives native applications access to the powerful rendering techniques available on the web, but also enables some really cool interactions between Babylon.js and native APIs.
Since our application is native, we have access to all sorts of platform-specific functionality, and we have a lot of freedom in how we integrate it with the Babylon Native rendering.
To illustrate that interaction, I’ve prepared a small sample project that uses the Windows Machine Learning APIs to apply an AI style transfer to a scene rendered by Babylon Native.
Neural Style Transfers
Neural Style Transfers (NST) are AI-based algorithms that apply a particular artistic style to an image. They require training a neural network model, but once the model is trained it can be applied to any image.
More information about NST is available at the following link:
The goal of this project was to apply this type of effect on top of a scene rendered by Babylon Native. To do that, we render the scene into a render target, feed that render target image into a Windows Machine Learning style transfer model, and render the result to the screen in real time.
Windows Machine Learning API
There are many APIs for consuming neural network models. Some of them are cross-platform, such as TensorFlow; others are part of the platform-specific APIs provided by the operating system.
To make the integration with Babylon Native easier, I have chosen to use the Windows Machine Learning (WinML) API, which integrates easily with DirectX 11.
Windows Machine Learning is capable of consuming ONNX models and can run its neural networks on either the CPU or the GPU.
Introduction to Windows Machine Learning | Microsoft Learn
Since the focus of this article is on the integration between WinML and Babylon Native, we will be using the models that are already trained and provided as part of the WinML samples GitHub repo:
Integrating Babylon Native and WinML
When doing Babylon Native examples for Windows, we usually create our own D3D11 Device and then provide that device to Babylon Native. However, since we want to be able to share resources between WinML and Babylon Native, this time we will allow WinML to create the device for us:
We will then initialize the required Babylon Native D3D11 graphics objects using this device:
We also need to initialize the models from the ONNX files we have in our solution.
After that we need to bind an output image to each model session.
Finally, we can run our render loop. On each frame we copy the result from the Babylon Native render target into a VideoFrame, use that VideoFrame as input for the WinML LearningModelBinding, and copy the resulting output from the LearningModelBinding into our output window.
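The per-frame flow described above can be sketched in portable C++. This is only a minimal illustration, not the code from the sample: `Frame`, `StyleSession`, `RunFrame` and `InvertColors` are hypothetical names standing in for the Babylon Native render target, the WinML VideoFrame and model session, and a trained style model. CPU byte buffers are used here so the sketch runs anywhere; in the real application these would presumably be D3D11 resources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for an image resource (render target, VideoFrame, window).
struct Frame {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> rgba; // width * height * 4 bytes
};

// Creates a frame where every byte (including alpha) is set to `value`.
Frame MakeFrame(int width, int height, std::uint8_t value) {
    assert(width > 0 && height > 0);
    Frame frame;
    frame.width = width;
    frame.height = height;
    frame.rgba.assign(static_cast<std::size_t>(width) * height * 4, value);
    return frame;
}

// Hypothetical stand-in for a model session whose input and output
// images were bound once, before the render loop started.
struct StyleSession {
    std::function<void(const Frame&, Frame&)> evaluate;
    Frame input;
    Frame output;
};

// One iteration of the loop:
// render target -> bound input -> model evaluation -> bound output -> window.
void RunFrame(const Frame& renderTarget, StyleSession& session, Frame& window) {
    assert(renderTarget.rgba.size() == session.input.rgba.size());
    session.input.rgba = renderTarget.rgba;          // copy the rendered scene in
    session.evaluate(session.input, session.output); // run the "model"
    window = session.output;                         // copy the styled image out
}

// Toy "style" used in place of a trained network: inverts RGB, keeps alpha.
void InvertColors(const Frame& in, Frame& out) {
    out.width = in.width;
    out.height = in.height;
    out.rgba.resize(in.rgba.size());
    for (std::size_t i = 0; i < in.rgba.size(); ++i) {
        const bool isAlpha = (i % 4 == 3);
        out.rgba[i] = isAlpha ? in.rgba[i]
                              : static_cast<std::uint8_t>(255 - in.rgba[i]);
    }
}
```

The point of the sketch is that the input and output images live in the session and are reused every frame, so the loop itself only performs two copies and one evaluation.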
We can now run our application and see the style transfer effects being applied to the scene in real time!
Not only that, but we can also switch between effects instantaneously, since each one is just a post-processing effect. The images below show all the effects being applied to the model from the same view angle.
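One way to picture the instant switching is a list of effects that are all prepared up front, with the render loop simply reading whichever one is currently active. The sketch below is an assumption about how such a switcher could look, not the sample's implementation; `EffectSwitcher` and the effect names are hypothetical, and each string stands in for an already-initialized model session.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: holds every preloaded effect and tracks the active one.
// Each name stands in for a model session that was created at startup.
class EffectSwitcher {
public:
    explicit EffectSwitcher(std::vector<std::string> names)
        : m_names(std::move(names)) {
        assert(!m_names.empty());
    }

    // The effect the render loop should use for the next frame.
    const std::string& Active() const { return m_names[m_active]; }

    // Cycles to the next effect, wrapping around at the end of the list.
    void Next() { m_active = (m_active + 1) % m_names.size(); }

    // Jumps directly to a specific effect.
    void Select(std::size_t index) {
        assert(index < m_names.size());
        m_active = index;
    }

private:
    std::vector<std::string> m_names;
    std::size_t m_active = 0;
};
```

Because nothing is loaded or rebuilt when the active index changes, a switch takes effect on the very next frame.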
Conclusion
This blog post was intended to show how flexible Babylon Native is and how easily it can be integrated with multiple pipelines and workflows. Since we have full access to the native graphics API objects and our application controls how Babylon Native interacts with them, we can integrate it with a wide range of technologies.
I hope people find it interesting and have as much fun playing around with it as I had building it.
Please feel free to leave any comments or suggestions. The code used in this example can be found in the Babylon Native Examples repo:
Sergio Ricardo Zerbetto Masson — Babylon.js Team
Sergio Ricardo Zerbetto Masson (@ZerbettoMasson) / Twitter