EVERYTHING TECHNICAL ABOUT METRO EXODUS UPGRADE FOR PLAYSTATION 5 AND XBOX SERIES X|S

Metro Exodus - Xbox Series X|S & PS5 Launch Trailer

IT'S OUT!

We're thrilled to let you know that the Metro Exodus next gen upgrade for PlayStation 5 and Xbox Series X|S is OUT NOW. This is a perfect chance to dive into Metro Exodus, since this upgrade includes the generational advancements you will read about further in this blog. On top of that, it's a free upgrade for current owners.

Moreover, we're happy to share that Metro Exodus Complete Edition is also out now. This is the definitive physical version for Xbox One|Xbox Series X and PlayStation 5, and includes the Metro Exodus base game as well as the two story expansions – The Two Colonels and Sam's Story. Please check with your local retailer for availability.

To continue the discussion of the technological improvements we presented in our previous blog on Metro Exodus PC Enhanced Edition, this blog will dive into our adventures bringing the Enhanced Edition features of Metro Exodus to ninth generation consoles.

The original Metro Exodus (captured on PlayStation 4)

Metro Exodus Next Gen (captured on PlayStation 5)

IMPLEMENTING RAY TRACING ON GEN 9 CONSOLES


When we found out that the Gen 9 consoles would support Ray Tracing, we were very excited. This level of technical capability opens up many options for us regarding how we advance our Ray Tracing technology and bring that technology to as many fans of our games as possible. It has also established new approaches for how we can build our future titles. For a while, there was a real concern that the consoles, however amazing they sounded, wouldn't be able to provide high-quality Ray Tracing at a decent price point, so there were fears that there would be a need to support two completely different classes of technology in tandem.

The big unknown was exactly how much of a performance cost Ray Tracing on consoles would incur. If we compare, for example, the ninth generation Xbox Series X to the eighth generation Xbox One X, Xbox Series X has a Graphics Processing Unit (GPU) that is capable of performing at approximately twice the speed of that of Xbox One X. The metric we measure this by is known as the teraflop (trillion floating point operations per second), essentially the number of simple arithmetic operations that the processing unit can perform in a second. For instance, Xbox Series X is capable of performing approximately 12 teraflops, and Xbox One X operates at about 6 teraflops. It looks like a generational leap in performance until you realize that while you are doubling the speed of the processor you are also asking it to do its job in half the time, since you are now targeting 60FPS instead of the 30FPS that we were targeting previously.
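The arithmetic behind that observation is worth spelling out. A minimal sketch (using the approximate peak figures quoted above, and ignoring real-world bottlenecks):

```python
# Rough per-frame compute budget: doubling the GPU's peak throughput
# while also doubling the target framerate roughly cancels out.
def flops_per_frame(teraflops: float, fps: int) -> float:
    """Ideal peak floating point operations available per rendered frame."""
    return teraflops * 1e12 / fps

one_x = flops_per_frame(6.0, 30)      # Xbox One X at 30FPS
series_x = flops_per_frame(12.0, 60)  # Xbox Series X at 60FPS

print(f"Xbox One X budget per frame:    {one_x:.1e} FLOPs")
print(f"Xbox Series X budget per frame: {series_x:.1e} FLOPs")
# Both come out to the same 2e11 FLOPs per frame: the raw teraflop
# numbers suggest 2x headroom, but at 60FPS the per-frame budget is ~1x.
```

So at the level of raw peak numbers, the per-frame budget barely moves, and everything extra (such as Ray Tracing) has to come from efficiency gains elsewhere.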

The teraflops metric is a good proxy for GPU performance on a compute-heavy game such as Metro Exodus, but it is not everything. Different issues can still be more or less pronounced on different devices. Bottlenecks, elements of a given feature which rely heavily on one part of the hardware and significantly underutilize others, can occur in different places and for different reasons. On the surface though, the default, non-raytraced port of Metro Exodus seemed quite doable. It would have been a reasonably like-for-like comparison: double the power, half the time, no major concerns there. Adding Ray Tracing into the mix, however, was still questionable at the time.

That was the riskiest part of the decision, but we decided to pursue the Ray-Tracing-only road until we could test the real performance on real devkits. In any case, it was clear that we had to make everything run faster and be more scalable at the same time, no matter what course we ended up taking.

PLATFORM-SPECIFIC TECHNOLOGICAL IMPROVEMENTS

Lighting Artist demonstrating various lighting setups

DDGI

DDGI (as detailed in our previous blog post), the system that now underpins our recursive bounce functionality, was originally implemented with the main motivation that it was a much cheaper feature computationally, and in principle it could have stood in as a console-based replacement for our precise per-pixel Ray Traced Global Illumination (RTGI) if we fell short of our performance targets. Given our concerns about the capabilities of hardware which none of the platform holders had yet released to developers, and which as such was still completely untested, it was definitely the sensible decision to err on the side of caution. Ultimately though, actual testing told a different story, and we were able to set these reservations aside: all Gen 9 platforms were quite capable of running the full implementation of per-pixel RTGI. See our previous article for more detail on our new cross-platform RTGI implementation.

Denoiser

As one of our earliest enhancements, we overhauled our denoiser. The original raytraced Metro Exodus' denoiser was of high quality but was quite heavy on the GPU, and its cost was far from constant. While this improvement was initially a benefit to the PC version, it was nonetheless a key part of our ability to hit our console targets. That said, there were actually a number of general Ray Tracing performance optimizations developed specifically for the consoles which later managed to find their way back to the PC.

Ray-binning

How we organized and dispatched our Ray Tracing tasks benefited from an improvement known as ray-binning. This system collects together groups of rays which are cast (travelling, if you prefer) in roughly similar directions to one another and processes them together in batches. We refer to such batches as thread groups: a group of processes operating in parallel on the GPU. Grouping rays according to their direction like this decreases the probability that the directions of the rays will become spatially divergent within any given thread group. Divergent rays, as such cases are known, tend to be terrible for performance because quite literally each goes off and does its own separate thing. They can't share any data they find about the scene with other rays, and effectively perform all of their calculations independently.

The key benefit we receive from grouping our rays according to their direction, then, is that it makes it more likely that many of the rays being processed in a single batch will be evaluated against the same pieces of scene geometry. Simply put, if they are pointing in the same direction then it is more likely they will hit the same objects, and so can share their findings within their group. They work faster if they cooperate. If we have a scenario where one or more rays land upon the same object in the scene (sample the same piece of geometry), then that group will ultimately have to load in less data from memory, which can potentially reduce memory bandwidth pressure.
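The idea can be sketched on the CPU in a few lines. This is an illustrative simplification, not 4A's GPU implementation: it quantizes each ray's direction into a coarse angular bin (via spherical coordinates) and groups rays by bin, so each batch holds coherent, similarly-pointing rays. The function names and bin counts here are assumptions chosen for the sketch.

```python
import math
from collections import defaultdict

def direction_bin(d, bins_per_axis=8):
    """Quantize a unit direction vector into a coarse angular bin.
    A simplified stand-in for GPU-side ray-binning."""
    x, y, z = d
    theta = math.acos(max(-1.0, min(1.0, z)))     # polar angle in [0, pi]
    phi = math.atan2(y, x) % (2.0 * math.pi)      # azimuth in [0, 2pi)
    ti = min(int(theta / math.pi * bins_per_axis), bins_per_axis - 1)
    pj = min(int(phi / (2.0 * math.pi) * bins_per_axis), bins_per_axis - 1)
    return ti * bins_per_axis + pj

def bin_rays(rays):
    """Group (origin, direction) rays by direction bin, so each batch
    (thread group) processes coherent rays together."""
    batches = defaultdict(list)
    for origin, direction in rays:
        batches[direction_bin(direction)].append((origin, direction))
    return batches

# Two rays pointing the same way share a batch; an opposite ray does not.
rays = [((0, 0, 0), (0, 0, 1)), ((1, 0, 0), (0, 0, 1)), ((0, 0, 0), (0, 0, -1))]
print(len(bin_rays(rays)))  # prints 2
```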

The original Metro Exodus (captured on Xbox One)

Metro Exodus Next Gen (captured on Xbox Series X)

Different LoDs (Levels-of-Detail) of a mesh

Bounding Volume Hierarchy (BVH)

Objects stored in our scenes' Bounding Volume Hierarchy (BVH) now support Levels-of-Detail (LoDs) to ease scene reconstruction and to reduce the amount of geometry that rays are ultimately tested against.

How you organize your scene's geometry is critical to real-time Ray Tracing. If you have a warehouse full of items, it's no good simply tipping them in by the truckload and then having to sift through the mess every time you want to take something out. You need some kind of a system to organize everything. In a game, you don't want to test against every single polygon (there are literally billions of those). So, you subdivide your scene into partitions called Bounding Volumes (literally a little rectangular box into which you put a handful of polygons). You can then check to see if a ray even hits one of these Bounding Volumes (boxes) before you check each polygon that it contains. If you don't hit the box, you certainly won't hit its contents. That's not enough though. If you can gather together a handful of polygons, you can gather together a handful of the Bounding Volumes into larger groups (put them in a bigger box) and test that group first. The higher up in this nested hierarchy you go, the more of the scene you can disregard from your calculations with a single test against one of the larger partitions: by subdividing your scene this way, you can ask if a ray even approaches whole regions of the game world before you go in and check against the finer details. To wrap up the warehouse analogy, put an item in a box, put the box on a shelf, locate the shelf in an aisle, inside a building with an address. Then you can tell where the item you are looking for is, among the millions inside, just from the sign on the door.

This process of dividing the scene into different regions, with different pieces of geometry in each, exponentially reduces the time taken for any given trace, but it does mean that much of the cost of Ray Tracing is now incurred by evaluating the potential intersections between rays and the boundaries of the scene partitions (the Bounding Volumes). We call those particular tests Bounding Box tests (the process really is very analogous to boxing up bigger and bigger parts of the scene in memory in order to make them easier to organize, address, find, and sort later).
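A toy version of those Bounding Box tests makes the structure concrete. This is a deliberately simplified sketch (the node layout and names are invented for illustration, not our engine's format): a standard "slab" ray-vs-AABB test, plus a depth-first walk that skips any subtree whose box the ray misses.

```python
def ray_hits_box(origin, inv_dir, box_min, box_max):
    """Slab test: does the ray enter the axis-aligned bounding box?
    inv_dir is 1/direction per axis (precomputed; assumes no exactly
    zero components, for brevity)."""
    tmin, tmax = 0.0, float("inf")
    for o, inv, lo, hi in zip(origin, inv_dir, box_min, box_max):
        t1, t2 = (lo - o) * inv, (hi - o) * inv
        tmin = max(tmin, min(t1, t2))
        tmax = min(tmax, max(t1, t2))
    return tmin <= tmax

def traverse(node, origin, inv_dir, hits):
    """Depth-first BVH walk: one box test can discard a whole subtree;
    leaf contents are only gathered when their box is actually hit."""
    if not ray_hits_box(origin, inv_dir, node["min"], node["max"]):
        return
    if "children" in node:
        for child in node["children"]:
            traverse(child, origin, inv_dir, hits)
    else:
        hits.extend(node["triangles"])  # candidates for precise triangle tests

# A two-leaf hierarchy: a ray skimming along x at y=0.5 reaches leaf A
# but never has to inspect leaf B's triangles.
leaf_a = {"min": (0, 0, 0), "max": (1, 1, 1), "triangles": ["A"]}
leaf_b = {"min": (5, 2, 0), "max": (6, 3, 1), "triangles": ["B"]}
root = {"min": (0, 0, 0), "max": (10, 3, 1), "children": [leaf_a, leaf_b]}
hits = []
traverse(root, (-1, 0.5, 0.5), (1.0, 1000.0, 1000.0), hits)
print(hits)  # prints ['A']
```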

Reducing the geometric complexity of individual objects can then potentially reduce the number of levels of the hierarchy required to optimally partition that object (divide it up so that you can sort through geometry as fast as possible), as well as potentially make the reconstruction of that hierarchy faster when objects in the scene move. This organizational structure underpins everything in Ray Tracing, so it is vital that we do all we can to make sure that it is built well, as quickly as possible, and as early in the frame as possible. Employing aggressive Level-of-Detail optimizations also helped our rasterization performance, so this was a double win.

FP16

Half-precision floating point arithmetic (or FP16) refers to arithmetic operations which are performed using half of the standard number of bits. Reducing the number of bits used to store individual values reduces the precision and accuracy of those values, but also greatly reduces the physical complexity of the hardware needed to perform operations on them. Sometimes that extra precision is absolutely essential to the thing you are trying to calculate; sometimes it is not, and there are serious performance gains to be made if you can identify what level of precision you actually need.

For example, I don't need to know how far my commute to work is down to the nearest millimeter: kilometers will usually do just fine in such a case. In a game setting, if a big object is out of place by the width of a human hair, it doesn't usually show any more than if it was out of place by just a few atoms. No-one looks that closely – no-one even can. A 12 teraflop GPU using full precision arithmetic is effectively a 24 teraflop GPU when doing exclusively FP16 math, although that is under ideal circumstances, of course. So, we almost always see performance gains whenever and wherever we find an opportunity to reduce the precision of our arithmetic.
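You can see exactly what FP16 gives up by round-tripping values through IEEE-754 binary16, which Python's standard `struct` module supports via the `"e"` format (a small self-contained demonstration of the precision trade-off, not engine code):

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float through IEEE-754 half precision (binary16)."""
    return struct.unpack("e", struct.pack("e", x))[0]

# Coarse values survive untouched: 12.5 is exactly representable...
print(to_fp16(12.5))        # 12.5
# ...but FP16 only carries about 3 decimal digits of precision:
print(to_fp16(3.14159265))  # 3.140625
# and at large magnitudes even whole numbers get rounded away
# (above 2048 the spacing between representable values exceeds 1):
print(to_fp16(4097.0))      # 4096.0
```

For lighting math where a hair's width of error is invisible, that loss is free performance; for world-space positions it usually is not, which is why the precision choice has to be made per calculation.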

Temporal Reconstruction and Up-sampling

Temporal Reconstruction and Up-sampling was a component that was critical to get right, as it was quite clear from the start that the performance (with Ray Tracing) would be much more variable than before. Temporal Reconstruction is the process of taking information from final images produced by previous frames to help you identify how features of the current image should look. Up-sampling is the process of taking a smaller image and blowing it up to a larger size while attempting to keep the same level of quality (not blurring or distorting it in any way). Our console versions have always targeted a rock-solid stable, V-Sync-locked framerate, so that means we need a way to stabilize frame render time even as the computational load varies considerably. Rendering at lower internal resolutions and upscaling provides us with a solution to this problem, but reconstructing a higher resolution image from the data available in a lower resolution image requires a lot of work in order to make sure that the image quality remains as high as possible. A lot of work has gone into balancing this technique both for the final quality (which is to say the quality of the entire output image) and the speed at which this process operates.
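The "stabilize frame time by varying internal resolution" half of that loop can be sketched as a simple feedback controller. To be clear, this is an illustrative sketch under assumed parameters (the function, its gains, and its clamps are all invented for this example), not 4A's actual dynamic-resolution logic:

```python
def adjust_resolution(scale, gpu_ms, budget_ms=16.6, lo=0.5, hi=1.0, gain=0.5):
    """Nudge the internal render scale so measured GPU frame time
    converges on the V-Sync budget (16.6ms for 60FPS).

    GPU cost scales roughly with pixel count, i.e. with scale**2, so we
    correct the *area* by the time ratio and damp it by `gain` to avoid
    oscillating between extremes on noisy frame times.
    """
    target_area = scale * scale * (budget_ms / gpu_ms)
    new_scale = (gain * target_area + (1.0 - gain) * scale * scale) ** 0.5
    return min(hi, max(lo, new_scale))

# A heavy 22ms frame pushes resolution down; a light frame lets it recover.
s = adjust_resolution(1.0, gpu_ms=22.0)
print(round(s, 3))  # below 1.0
s = adjust_resolution(s, gpu_ms=10.0)
print(s)            # back at the 1.0 cap
```

The temporal reconstruction pass then has the job of hiding these scale changes, rebuilding a full-resolution image from the lower-resolution input plus history.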

The original Metro Exodus (captured on Xbox One)

Metro Exodus Next Gen (captured on Xbox Series X)

Other noteworthy improvements

A fair amount of the work of bringing a game to console isn't in the implementation of large features, though. It is in the fine tuning to account for the specific nature of the hardware you are working on. An example would be little tweaks to make pieces of code more friendly towards the machine's memory configuration. The denoiser's recurrent blur (the gradual refinement of lighting information from frame to frame) and our pre-trace pass (a fast but less accurate, screen-space equivalent of Ray Tracing that we use to capture the results of very short rays) were configured to take account of the exact cache sizes of the device at hand.

Some enhancements became available to us as optional platform features on the Gen 9 GPUs, such as Draw Stream Binning Rasterization (DSBR), which provides additional means of culling hidden geometry and reduces draw-primitive pressure and memory bandwidth. Delta Color Compression (DCC), which is a lossless form of color compression, can be enabled on a per-render target basis, again to alleviate memory bandwidth pressure. DCC did actually feature on Xbox One X, but in a more limited form. The updated version has a lower performance impact, making it a more viable option. Variable Rate Shading (VRS) is another feature that has only become available this generation, but which can, with relatively little effort, almost always provide some degree of a performance increase.

The original Metro Exodus (captured on Xbox One)

Metro Exodus Next Gen (captured on Xbox Series X)

WORKING WITH THE HARDWARE

All of this was originally developed on Nvidia's RTX 20-series GPUs, with fore-knowledge of the features that would be available to us on the upcoming consoles, while we were waiting for devkits to be manufactured and distributed. Once we had access to the physical units, we were able to begin porting this work over.

The first frames running on actual devkits revealed that we had been somewhat overly optimistic in our initial estimations of the expected Ray Tracing performance. That was not really an issue though, as the denoiser's frontend (its temporal accumulation component) had already been made scalable, allowing for any resolution input while still running at full resolution itself. We halved the per-pixel ray-count, and it soon became apparent to us that all of the major features we were hoping to implement could actually be viable!

So there were several months of continuous optimizations, async compute rebalancing, and hunting the bugs that always crawl out when new hardware and new software development kits (SDKs) are introduced. The SDKs were constantly changing too, bringing in enhancements, some of which were specifically requested by us for the platform holders to implement. On the fun side, due to the specific way we use Ray Tracing on consoles, we are tied to exact formats for the acceleration structures in our BVH: formats that were changing with almost every SDK release. Constant reverse-engineering became an inherent part of the work-cycle. :)

Speaking of async compute, our typical 16ms frame has about 12ms of work on the async queue. All of the workloads were bottleneck-balanced for Gen 9 GPUs, according to what exactly was running on the graphics queue. That made a huge difference performance-wise on the Gen 9 consoles (on the order of being 30% faster). This is evident in the PC Enhanced version as well, with a typical 18% performance gain on Nvidia hardware compared to async-off.

As for our own technological progress, the first frames rendered in a relatively simple level (the game's menu screen) were running at an internal resolution of less than 2 mega-pixels at 60FPS. This gradually climbed up and up with each week of work to the current 4K at 60FPS output resolution, with typical internal dynamic resolution hovering around 5 mega-pixels on our heaviest scenes (on Xbox Series X and PlayStation 5).
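To put those mega-pixel figures in perspective, a quick back-of-the-envelope calculation (our interpretation of the numbers quoted above, not official figures):

```python
# Native 4K output vs the ~5 megapixel internal resolution quoted
# for the heaviest scenes.
output_mp = 3840 * 2160 / 1e6              # ~8.29 MP at native 4K
internal_mp = 5.0                          # typical worst-case internal
axis_scale = (internal_mp / output_mp) ** 0.5

print(f"4K output: {output_mp:.2f} MP")
print(f"Per-axis render scale at 5 MP internal: {axis_scale:.0%}")
# Roughly 78% per axis (about 2980x1680), leaving the temporal
# up-sampler to reconstruct the remaining detail up to 4K.
```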

The original Metro Exodus (captured on PlayStation 4)

Metro Exodus Next Gen (captured on PlayStation 5)

FUTURE CONSOLE DEVELOPMENT

We saw many strides in performance during this phase of console optimization, many of which gave us cause to rethink our approaches to certain solutions. We've remained very conscious of the fact that we were aiming to have the consoles provide a consistent 60FPS experience for this title, and, with that in the back of our minds, the gradual performance improvements allowed us to also include more and more features. Despite superficial differences in the actual layout and approach of the platform-specific Application Programming Interfaces (APIs), all platforms are now running remarkably similar CPU and GPU code, and have managed to maintain a very consistent feature set.

The groundwork has been laid, and we have successfully brought a product to the 9th generation consoles complete with essentially our entire Ray Tracing feature set. This sets a baseline for this generation's future projects. We mentioned that we had initially thought of some features as potential fallback solutions to be maintained alongside superior and steadily evolving equivalents on PC. This wasn't the case, but it could have been if the consoles weren't as good as they are. If it had been the case, then many members of the team would have been hit by the massive increase in workload that comes with working with two separate and distinct systems in parallel. Instead, we now have a new set of standards to base our work on that are consistent across all target platforms.

As it stands then, we can say for sure that projects of this generation, across all targeted platforms, will be based off of this raytraced feature set. That is great news for the end result: it is allowing us to produce scenes with the highest level of graphical fidelity we have ever achieved, and that is what the public gets to see, though these features are just as important behind the scenes.

The original Metro Exodus (captured on PlayStation 4)

Metro Exodus Next Gen (captured on PlayStation 5)

There is a reason why we have always been so vocally critical of the idea of baking assets (pre-generating the results of things like lighting calculations) and shipping them as immutable monoliths of data in the game's package files, rather than generating as much as possible on the fly: everything that you pre-calculate is something that you are stuck with. Not "stuck with" in the sense that if it is wrong it can't be fixed (everyone loves a 50GB patch after all), but "stuck" in a much more limiting sense – any part of your game, any object in the scene that relies on baked assets will be static and unchanging. You won't be able to change the way it is lit, so you have to be overly cautious with decisions about how the player can affect dynamic lights; you won't be able to move it, so you disable physics on as much as possible; and the player can't interact with it in interesting ways, so you pass that problem onto UX design.

The more you have to rely on baked assets for your scenes, the more you restrict your game design options, and the more you take the chance that your environments will feel rigid and lifeless. Perhaps the biggest advantage that Ray Tracing brings is that it gives game developers a huge boost in the direction of worlds that are truly, fully dynamic, with no dependencies on pre-computed assets whatsoever. There are still similar examples where such problems need to be solved, but lighting was one of the biggest and most all-encompassing examples of the lot, and Ray Tracing solves it.

Lighting Artist hand-placing lights

Lighting Creative person enabling RT

It doesn't just come down to what you end up shipping either. Game development is by its very nature an iterative process. You need to have some plan for where you want to take a project from the start, of course, but once you begin working on assets and developing features you always test them as part of the larger context of the game, and more often than not this leads to tweaks, refinements, and balancing. Even that might not be the end of it; other features can come along, changing the feel in myriad different ways and leading to still more alterations and adjustments. For this process to work, developers need an environment to work in that is intuitive, responsive, and as close a representation of the main game as possible. Our editor has always run basically the same simulation as the final game, but the technological advancements seen in this generation streamlined the process significantly. Testing environments are quicker to set up, with fewer dependencies on assets or on the work of other departments. Changes in visual design are visible (in their final form) immediately. The physical simulation on the whole feels more like a sandbox in which ideas can be tested and iterated upon. This makes for a more comfortable and fluent development experience, more conducive to creativity and experimentation.

The true effects of all this will take longer to be realized. The boundaries of what you can (and can't) do in a video game have been shifted, and design practices will take a while to feel them out and to fill them. But ultimately, what we see is the promise of more dynamic and engaging game worlds, with fewer limitations, which can be developed in faster and more intuitive ways. On the player side, all that hopefully translates to more content without the usual associated spiraling development cost, and to richer, more believable game experiences.

Scene lit using analytical DI in the original The Two Colonels (captured on Xbox One)

Scene lit using RTGI in The Two Colonels Next Gen (captured on Xbox Series X)

We're looking forward to WORKING FURTHER WITH all these exciting improvements, and developing them in our future projects!

- The 4A Games Team