Technical Basis of XR Virtual Production

2023-05-12 14:10:23 Ari

LED video walls, camera tracking, and real-time graphics rendering are the three technical foundations of XR virtual production. In recent years all three have developed rapidly: the technology has matured and the barrier to entry has fallen, which has accelerated the growth of XR virtual production applications and business.

1. LED video wall 

High-density, fine-pitch LED panels can be tiled into a seamless display with natural, realistic colors, providing an excellent shooting environment for XR virtual production.

In XR virtual production, LED video walls replace traditional green screens and display the virtual scene content in real time, which makes shooting faster and raises its quality. Through the LED video walls, actors, hosts and other performers can see the CG (computer graphics) content of the virtual scene in real time and interact with it. Because the virtual scene is rendered in real time, the producer can also make creative visual decisions on the spot.

At present, the more commonly used XR virtual production LED video walls generally consist of two adjacent vertical screens and one ground screen, as shown in the following figure.

[Figure: SEEDER camera support equipment for the broadcast television industry]

Because XR virtual production composites virtual and real imagery, it places higher demands on LED display performance. In actual shooting, the result can be improved in the following six respects.

(1) Reduce reflection

An anti-reflection treatment on the screen surface keeps the LED screen from reflecting light under strong lighting. It also makes the LED surface appear blacker, which raises display contrast and gives more realistic colors on camera.

(2) Reduce moire

For XR shooting, LED manufacturers often supply specially treated display screens that reduce the visible moire in the captured picture, so that the finished product presents an immersive virtual studio effect.

(3) Use low latency mode + Genlock function 

Low latency mode keeps the signals synchronized during virtual shooting. The Genlock function locks the LED processor to the camera shutter, so that the camera does not capture the black interval between frames while the display refreshes. Genlock can also synchronize multiple processors to avoid image tearing and misalignment. Current LED screens can achieve a delay as low as 1 frame, and with Genlock enabled the displayed motion is smoother and more consistent.
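To put the "1 frame of delay" figure in perspective, the short sketch below converts a delay in frames into milliseconds. The function name and the frame rates are illustrative assumptions, not values from any particular LED processor.

```python
def frames_to_ms(frames: float, fps: float) -> float:
    """Convert a delay expressed in frames into milliseconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frames * 1000.0 / fps


if __name__ == "__main__":
    # Illustrative frame rates only; use the rate of the actual production.
    for fps in (25, 50, 60):
        print(f"1 frame at {fps} fps = {frames_to_ms(1, fps):.2f} ms")
```

The same 1-frame delay therefore costs less time at higher frame rates, which is one reason the delay is usually quoted in frames rather than milliseconds.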

(4) Use HDR 

HDR display makes the final images from virtual shooting finer and more realistic, with more faithful color and brightness. Current LED screens already support 10bit/12bit color depth image input and 16bit image output, which better reproduces the visual effect of a real environment.

(5) Use 16bit LED video walls

If the display bit depth is too low, colors are displayed unevenly, and large areas of detail are lost in low light. XR shooting uses LED displays with up to 16 bits of depth, which present image gradation and detail better and improve both low-light and high-gray performance.
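The effect of bit depth on low-light detail can be estimated with simple arithmetic. The sketch below counts how many distinct code values fall into the darkest 1% of the range at each bit depth. It assumes a linear code scale and ignores gamma and the panel's actual driving scheme, so it is only a rough illustration; the function names are hypothetical.

```python
def gray_levels(bits: int) -> int:
    """Total number of code values available at a given bit depth."""
    return 2 ** bits


def codes_in_darkest(bits: int, percent: int) -> int:
    """Count code values (including 0) within the darkest `percent` of a linear scale."""
    max_code = 2 ** bits - 1
    return (max_code * percent) // 100 + 1


if __name__ == "__main__":
    for bits in (8, 10, 12, 16):
        print(bits, "bit:", gray_levels(bits), "levels,",
              codes_in_darkest(bits, 1), "codes in the darkest 1%")
```

Under these assumptions an 8-bit scale has only a handful of steps in the darkest 1%, while a 16-bit scale has several hundred, which is why shadows band less at higher bit depths.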

(6) Use ultra-high refresh rate 

If the refresh rate of the display is too low, scan lines appear in the picture captured by the camera and the image cannot be shown normally. Current LED screens already support ultra-high refresh rates of up to 7680Hz, so the screen performs better on camera and the picture is reproduced accurately.
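One rough way to reason about scan lines is to count how many LED refresh cycles fit inside a single camera exposure: the fewer the cycles, the more a partially captured cycle shows up as a band. The sketch below does that arithmetic. The 1920Hz and 3840Hz rates and the shutter speeds are illustrative comparison values, not figures from the text.

```python
def refresh_cycles_per_exposure(refresh_hz: float, shutter_denominator: float) -> float:
    """Number of LED refresh cycles within an exposure of 1/shutter_denominator seconds."""
    if shutter_denominator <= 0:
        raise ValueError("shutter_denominator must be positive")
    return refresh_hz / shutter_denominator


if __name__ == "__main__":
    for refresh_hz in (1920, 3840, 7680):      # illustrative refresh rates
        for shutter in (50, 1000):             # 1/50 s and 1/1000 s exposures
            cycles = refresh_cycles_per_exposure(refresh_hz, shutter)
            print(f"{refresh_hz} Hz at 1/{shutter} s: {cycles:.2f} cycles")
```

At a fast shutter the exposure covers only a few cycles even at a high refresh rate, so refresh rate and shutter speed need to be considered together.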

2. Camera Tracking System

The camera tracking system is an important part of XR virtual production. It supplies the virtual camera in the rendering engine with the real camera's position and orientation in three-dimensional space (six degrees of freedom, 6DOF), together with lens parameters such as focal length and aperture, so that the virtual camera moves in sync with the real one. The real-time rendering engine renders the virtual scene from this tracking data and outputs it to the LED video wall. The displayed content therefore changes with the camera movement and keeps the parallax consistent, so the pictures captured by the camera are more realistic and approach the effect of shooting live scenes.
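As a minimal sketch of what one tracking sample might contain and how a virtual camera could use it, the code below stores a 6DOF pose plus lens parameters and derives the camera's viewing direction from pan and tilt. The class, its field names, and the axis convention (Z up, Y forward at pan 0, positive pan turning toward +X) are assumptions made for illustration; real tracking systems and engines each define their own conventions.

```python
import math
from dataclasses import dataclass


@dataclass
class TrackingSample:
    """One hypothetical tracking sample: 6DOF pose plus lens parameters."""
    x: float               # position in metres
    y: float
    z: float
    pan_deg: float         # rotation about the vertical axis
    tilt_deg: float        # elevation above the horizontal plane
    roll_deg: float        # rotation about the viewing axis
    focal_length_mm: float
    aperture_f: float


def forward_vector(sample: TrackingSample) -> tuple:
    """Unit viewing direction under the assumed convention (roll does not change it)."""
    pan = math.radians(sample.pan_deg)
    tilt = math.radians(sample.tilt_deg)
    return (
        math.sin(pan) * math.cos(tilt),
        math.cos(pan) * math.cos(tilt),
        math.sin(tilt),
    )


if __name__ == "__main__":
    sample = TrackingSample(0.0, 0.0, 1.5, 90.0, 0.0, 0.0, 35.0, 2.8)
    print(forward_vector(sample))
```

Applying the same position and direction to the virtual camera on every frame is what keeps it locked to the real camera.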

With the continuous development of domestic digital imaging technology, domestic camera tracking system products have made major breakthroughs. The related technologies have matured and come into wider use, which has greatly reduced the cost of camera tracking systems and accelerated their adoption in virtual production. The main technical characteristics of the camera tracking system are as follows.

(1) Camera lens and pan&tilt head control

The optimized circuit design of the virtual camera crane lets a single controller work with three types of equipment: traditional semi-servo control lenses, mainstream full-servo control lenses, and lightweight DV cameras. This improves the interoperability of a single device with lenses of different brands.

(2) Pan&tilt head power circuit design 

A commonly used 12V DC motor has an idle starting voltage of more than 1V, which gives the pan&tilt head a high starting speed and poor low-speed motion. The virtual camera crane uses high-quality motors that can start at an ultra-low voltage of 0.01V and highly efficient MOS bridge drive circuits that need no heat dissipation structure, giving the system a high-quality power source from the lowest level up.

(3) Digital control

The virtual camera crane can use highly digital circuit control. Hardware circuit design and program optimization ensure stability while leaving room for scalability and compatibility. The handle sampling circuit performs analog-to-digital conversion and brings together the head's horizontal (pan), pitch (tilt) and Z-axis control values, the speed, damping and direction settings, and the focus servo motor control values required for semi-servo lenses. Because this data is shared within the system, intelligent functions such as cable connection prompts and handle insertion detection become possible.

(4) Pan&tilt head automation control

The camera crane provides pan&tilt head automation so that the head can automatically track marked features. Building on mature control circuit design, domestic camera tracking systems now optimize cable layout, use highly reliable encoder connections, and offer simple operator controls and human-computer interfaces. As a result, the VR tracking crane can be transported and assembled as easily as an ordinary camera crane, and it suits not only studio recording but also stage recording that calls for a longer jib arm and more mobility.

3. Real-time Graphics Rendering Engine

The real-time graphics rendering engine renders and outputs content in real time according to the camera tracking data. As the camera moves, the content shown on the LED video walls moves with it, creating an illusion of depth on the screen. The engine also extends the virtual scene beyond the physical area of the LED screen and combines that extension with the camera picture to produce the final output.

Real-time graphics rendering engines have been used in the broadcast television program production industry for many years. Companies such as Vizrt, Orad, Brainstorm, and SEEDER provide products and solutions in the field of virtual studios and live graphics. The above companies all have their own unique real-time graphics rendering engines.  

At present, the solutions used in XR virtual production are mostly built around Unreal Engine. XR virtual production involves several key technical points, as follows:

(1) Accurately synchronize the projection of virtual scene content onto the LED video wall 

The scenes rendered in real time by the XR system are purely three-dimensional virtual scenes. Every model in the scene has its own X, Y, Z and P, T, Z values, that is, a defined position in three-dimensional space. In actual studio operation, the system sets up a virtual camera in the virtual scene. By applying the data transmitted in real time by the camera tracking system, the virtual camera can be made to coincide with the real camera.

When there are multiple LED screens, the XR system opens a corresponding window in the virtual scene for each one, with matching size and rotation. These windows are unified in spatial positioning in the same way as the virtual and real cameras described above, so that each virtual LED screen coincides with its real LED video wall. The sizes of the virtual LED video walls are taken directly from the real LED video walls and entered into the system as numerical values, in the same units as the real wall parameters.

After these steps, the virtual scene contains the following digital assets: the virtual scene (model), the virtual camera, and the virtual LED video walls. All of them are driven by the same camera tracking signal, so they share a unified spatial logic during shooting.
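The reason the wall content must follow the camera can be shown with a small geometric sketch. A virtual object that sits behind the wall has to be drawn where the line from the camera to the object crosses the wall, and that crossing point moves when the camera moves. The code below assumes a flat vertical wall on the plane y = wall_y with all units in metres; the function name and the coordinates are hypothetical.

```python
def project_to_wall(camera: tuple, point: tuple, wall_y: float) -> tuple:
    """Where on the wall plane y = wall_y a virtual point must be drawn for this camera."""
    cx, cy, cz = camera
    px, py, pz = point
    if py == cy:
        raise ValueError("the camera-to-point ray is parallel to the wall plane")
    # Fraction of the way from the camera to the point at which the ray meets the wall.
    t = (wall_y - cy) / (py - cy)
    return (cx + t * (px - cx), wall_y, cz + t * (pz - cz))


if __name__ == "__main__":
    tree = (0.0, 8.0, 1.5)  # virtual object 4 m behind a wall placed at y = 4 m
    print(project_to_wall((0.0, 0.0, 1.5), tree, 4.0))  # camera centred
    print(project_to_wall((1.0, 0.0, 1.5), tree, 4.0))  # camera moved 1 m sideways
```

In this example a 1 m sideways move of the camera shifts the object's image on the wall by 0.5 m. A real system performs the equivalent calculation for every pixel of every screen window, using the tracked camera pose.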

(2) Color difference correction between inside and outside the screen 

Because of factors such as the LED screen controller and on-site lighting, the picture sent to the LED screen through the SDI output of the XR server will differ in color from the final composite picture sent directly to PGM through the SDI output of the AR server. The combined signals inside and outside the screen therefore need a unified color difference analysis and adjustment.

The XR system has a built-in Color Correction node that can make very fine color adjustments to the output image. The adjustment reaches down to individual color parameters of the output image, such as the values of the R/G/B channels, brightness, gray level and contrast, each exposed as a node operation.

To get a near-perfect color match in the final AR composite output by the XR system, output five colors (red, green, blue, white and black) on every output signal, including each XR screen and the AR part. When the camera shoots these colors, the final AR composite shows the resulting color of each screen and of the AR area outside the screens. Color analysis software can then read the numerical color values of each screen area and of the AR area in the final image, which makes the differences between them easy to measure. Finally, the correction software is used to bring the color values of each LED screen in the studio into line with those of the AR area.

In the virtual system, a Color Correction node is added before the final output of each LED screen. By adjusting the values of each node, the final output of each LED screen can be brought arbitrarily close to the AR composite image, giving a near-perfect final composite.
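The matching step can be sketched as a per-channel linear fit. The code below uses only the black and white patches to compute a gain and an offset for each of the R, G and B channels so that the screen's measured values land on the AR area's values. This is a deliberately simplified assumption: it ignores the red, green and blue patches (which could feed a fuller matrix fit), and it is not the behavior of any specific Color Correction node. All sample values are made up.

```python
def fit_gain_offset(measured_black, measured_white, target_black, target_white):
    """Gain and offset that map the measured black/white onto the target black/white."""
    span = measured_white - measured_black
    if span == 0:
        raise ValueError("measured black and white are identical")
    gain = (target_white - target_black) / span
    offset = target_black - gain * measured_black
    return gain, offset


def fit_rgb(measured: dict, target: dict) -> list:
    """One (gain, offset) pair per R, G, B channel from 'black' and 'white' patches."""
    return [
        fit_gain_offset(measured["black"][c], measured["white"][c],
                        target["black"][c], target["white"][c])
        for c in range(3)
    ]


def apply_rgb(pixel: tuple, corrections: list) -> tuple:
    """Apply the per-channel corrections to one RGB value."""
    return tuple(gain * value + offset for value, (gain, offset) in zip(pixel, corrections))


if __name__ == "__main__":
    screen = {"black": (10, 12, 8), "white": (200, 210, 190)}    # made-up readings
    ar_area = {"black": (16, 16, 16), "white": (235, 235, 235)}  # made-up targets
    corrections = fit_rgb(screen, ar_area)
    print(apply_rgb(screen["white"], corrections))
```

In practice the readings would come from the color analysis software, and the resulting values would be entered into the correction node of each screen.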

(3) Limitations and challenges of the tracking system for XR shooting  

Once the system output can be projected onto the corresponding LED video walls and the final composite image has been color-corrected screen by screen, one major challenge remains in XR shooting: the need for an accurate tracking system.

XR shooting differs from AR and VR shooting. Because several virtual LED screen windows are set up in the scene, the LED video wall output adjusts in sync whenever the camera moves, whether along the X, Y, Z axes or in pan, tilt and zoom (P, T, Z). When the tracking data is inaccurate, the images inside and outside the screen no longer line up. A high-precision tracking system is therefore critical in the XR shooting environment.

A tracking system that uses the standard FreeD protocol cannot send the complete lens data file to the virtual studio system along with its tracking data. As the camera zooms, dollies and pans, the images inside and outside the LED video wall fail to line up and appear distorted. This is a common problem in XR shooting environments.
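The limitation is visible in the message itself. The sketch below parses a FreeD type D1 message using the layout as it is commonly documented: 29 bytes, angles in units of 1/32768 degree, positions in units of 1/64 mm, and a checksum equal to 0x40 minus the sum of the preceding bytes, modulo 256. Treat these details as assumptions and verify them against the tracker's own documentation. Note that zoom and focus arrive only as raw encoder values, with no field of view or distortion coefficients.

```python
def freed_checksum(body: bytes) -> int:
    """Checksum over the first 28 bytes (assumed rule: 0x40 minus byte sum, mod 256)."""
    return (0x40 - sum(body)) % 256


def _signed24(raw: bytes) -> int:
    """Decode a big-endian signed 24-bit integer."""
    return int.from_bytes(raw, "big", signed=True)


def parse_freed_d1(packet: bytes) -> dict:
    """Parse one 29-byte FreeD D1 message into engineering units."""
    if len(packet) != 29 or packet[0] != 0xD1:
        raise ValueError("not a 29-byte FreeD D1 message")
    if freed_checksum(packet[:28]) != packet[28]:
        raise ValueError("checksum mismatch")
    return {
        "camera_id": packet[1],
        "pan_deg": _signed24(packet[2:5]) / 32768,
        "tilt_deg": _signed24(packet[5:8]) / 32768,
        "roll_deg": _signed24(packet[8:11]) / 32768,
        "x_mm": _signed24(packet[11:14]) / 64,
        "y_mm": _signed24(packet[14:17]) / 64,
        "z_mm": _signed24(packet[17:20]) / 64,
        # Raw lens encoder counts: no FOV, K1 or K2 is carried in the message.
        "zoom_raw": int.from_bytes(packet[20:23], "big"),
        "focus_raw": int.from_bytes(packet[23:26], "big"),
    }
```

Turning `zoom_raw` and `focus_raw` into a field of view and distortion values requires a separate lens file, which is the gap the text describes.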

To address this, the XR system offers two solutions. First, the system includes a built-in lens file calibration tool, with which the system engineer can calibrate a separate lens file for each lens actually used for shooting. Second, choose a tracking system that can transmit lens file data directly, such as SEEDER, Egripment, Stype or NCam. Like a FreeD protocol tracking system, this type of system transmits the standard X, Y, Z and P, T, Z data, but it also sends key lens data such as FOV, K1 and K2 to the XR system. With this data, the complex lens file calibration work is unnecessary and an accurate tracking result can be obtained directly.
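To illustrate how such lens data might be used, the sketch below assumes that K1 and K2 are the first two radial distortion coefficients of the widely used polynomial lens model, and that FOV is the horizontal field of view. The text does not define these terms, so this interpretation, the function names, and the sample numbers are all assumptions.

```python
import math


def focal_length_from_fov(fov_deg: float, sensor_width_mm: float) -> float:
    """Focal length in mm that gives the stated horizontal field of view."""
    return (sensor_width_mm / 2) / math.tan(math.radians(fov_deg) / 2)


def distort(x: float, y: float, k1: float, k2: float) -> tuple:
    """Apply radial distortion to normalized image coordinates (0, 0 is the image centre)."""
    r2 = x * x + y * y
    scale = 1 + k1 * r2 + k2 * r2 * r2
    return (x * scale, y * scale)


if __name__ == "__main__":
    print(focal_length_from_fov(90.0, 36.0))  # sample numbers only
    print(distort(1.0, 0.0, 0.1, 0.01))
```

Applying the same distortion to the rendered image that the real lens applies to the captured image is what lets the two line up at the edge of the LED wall as the lens zooms.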