Left to right: AMD’s chip leaders Joe Macri, Andy Pomianowski, Sam Naffziger and Mike Mantor.
Image Credit: GamesBeat/Dean Takahashi
Advanced Micro Devices starts shipping its AMD Radeon RX 7900 XTX and RX 7900 XT graphics cards built on its RDNA 3 graphics architecture on December 13. The launch comes a month after AMD launched its Ryzen 7000 processors based on the Zen 4 architecture for central processing units. And along with last week’s AMD 4th Gen Epyc processor launch, it marks a time of renewal for AMD.
Often an also-ran in the past in both processor and graphics chips, AMD has been executing well and putting pressure on rivals Intel and Nvidia. I recently attended a technical briefing with AMD in Las Vegas and listened to talks about the approach for the new graphics chips, which will sell for $999 (XTX) and $899 (XT).
Those prices are well below that of Nvidia’s high-end graphics card, the GeForce RTX 4090, which was unveiled in September at $1,600. The new cards aren’t as fast as Nvidia’s high-end card, but AMD said its designs will pay off in cost and power efficiency benefits over time. Lisa Su, CEO of AMD, announced the new cards and those benefits a couple of weeks ago.
AMD assembled a panel on RDNA 3 that included Joe Macri, senior vice president and corporate fellow; Sam Naffziger, senior vice president and corporate fellow; Mike Mantor, corporate fellow; and Andy Pomianowski, corporate vice president of silicon design and engineering. Macri said that thousands of people worked on the RDNA 3 architecture and accompanying chip designs over years.
Su said AMD’s goal is to achieve cross-platform gaming leadership, with design wins not only on the PC but with gaming-specific hardware such as Valve’s Steam Deck, the Microsoft Xbox Series X/S, and the Sony PlayStation 5. And it was the job of the chip designers to come up with a flexible and efficient architecture to deliver across a wide number of platforms.
Naffziger led a team that developed the use of chiplets in both central processing units (CPUs) and graphics processing units (GPUs). Chiplets are a way to keep advancing chip design in light of the slowdown of Moore’s Law, formulated in 1965 by Intel chairman emeritus Gordon Moore. He predicted technology would progress so that chip makers would be able to double the number of components on the same-size chip every couple of years. And that would improve performance (as dense chips shorten the distance that electrons have to travel) and reduce costs as well.
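As a rough illustration of the doubling Moore predicted, here is a minimal Python sketch. The two-year doubling period follows the "every couple of years" figure above; the starting count of 1,000 components and the function name are hypothetical, chosen only for illustration.

```python
def projected_components(start_count: float, years: float,
                         doubling_period_years: float = 2.0) -> float:
    """Project a component count, assuming it doubles every doubling_period_years."""
    return start_count * 2 ** (years / doubling_period_years)


# Hypothetical starting point: 1,000 components on a chip.
for years in (0, 2, 10, 20):
    print(years, projected_components(1_000, years))
```

The exponential form is why even a modest stretch in the doubling period compounds into a large gap over a decade.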
But Moore’s Law has either halted (as suggested by Nvidia CEO Jensen Huang) or just slowed down (as noted by Intel CEO Pat Gelsinger). Part of the reason is that the circuitry just can’t be miniaturized so easily now. The layers for the physical structures are just a few atoms thick now, and the width between circuits is around 5 nanometers, or five billionths of a meter.
Naffziger said the whole chip industry is likely moving to chiplets because the manufacturing improvements that have driven the chip industry for decades are diminishing.
“The technologies are slowing down at different rates,” Naffziger said.
AMD foresaw some of these challenges coming. It tried out chiplets first with its Ryzen processors and now it is adding them to Radeon graphics. With the latest Radeon chips, it targeted the 5-nanometer manufacturing process for the graphics chip and 6-nanometer production for six accompanying chiplets that provide fast-access cache memory.
All of those chips are packaged together in the same module so that it is easier to create fast connections between the processing and memory components. Su said the design helped achieve 54% higher performance per watt, 18% higher frequency, 2.7 times peak bandwidth at 61 teraflops, and two times instructions per clock compared to the prior generation.
The graphics compute die (GCD) is built with 5nm, while the memory cache dies (MCD, a total of six of the chiplets) are built with 6nm manufacturing. That enables it to get a bandwidth of 5.3 terabytes per second and 24GB of GDDR6 graphics memory.
All told, the RDNA 3 graphics chips will have as many as 58 billion transistors, Su said.
“That gives us an incredible amount of gaming performance,” Su said.
The cards are optimized for high-resolution gaming with unified compute units with 165% more transistors per millimeter squared, said Naffziger.
“Chiplets are the right process technology for the right job,” he said.
With chiplets, the main challenge is where to partition the chips into separate pieces, or die. That’s because electrical signals can usually move quicker on the same chip. When you go off the chip to fetch some data out of memory, it can cause a long delay. But that’s why the chiplets are placed close to the GPU and the pathway connecting them is as fast as possible.
“Our packaging team worked on this with our circuit design team,” Naffziger said. “We had to address latency,” or the delays that often hobble interactions in games.
The chiplets enable a second-generation Infinity Cache in the MCDs, which means the graphics chip uses less power and less time by getting data from the cache rather than going off the chip to DRAM memory, he said. Overall, the features enable the processing of two times more instructions per clock cycle.
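The benefit of serving data from cache rather than DRAM can be sketched with a simple weighted-average model. This is only an illustration: the hit rates, the latency values, and the function name below are hypothetical placeholders, not AMD figures.

```python
def average_access_time(hit_rate: float, cache_latency_ns: float,
                        dram_latency_ns: float) -> float:
    """Average latency when a fraction hit_rate of requests is served
    from cache and the remainder goes out to DRAM."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_latency_ns + (1.0 - hit_rate) * dram_latency_ns


# Hypothetical latencies: 100 ns for a cache hit, 300 ns for a trip to DRAM.
for hit_rate in (0.0, 0.5, 0.9):
    print(hit_rate, average_access_time(hit_rate, 100.0, 300.0))
```

In this model, every point of additional hit rate moves the average closer to the cache latency, which is the intuition behind placing a large cache next to the GPU.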
The new dual media engine supports two simultaneous encode and decode streams, along with AI-enhanced video encode. It runs at 1.8 times the engine frequency, cutting video export times in half. The chip has decoupled clocks in different parts of the design, with the front end running at 2.5GHz and the shaders running at 2.3GHz. That yields a 15% frequency improvement and 25% power savings, Naffziger said.
The card is up to 1.7 times faster on Cyberpunk 2077 and 1.5 times faster on Call of Duty: Modern Warfare II. It has 96 compute units, a 2.3GHz clock, and 24GB of GDDR6 memory on a 384-bit interface, and it supports DisplayPort 2.1 and AV1 encode and decode. It consumes 355 watts of power. It has the same footprint as the RX 6950 XT card, so you can pull out the old card and put in the new one.
At 1440p, the card can run Apex Legends at 300 frames per second. You can play Assassin’s Creed Valhalla: Dawn of Ragnarok at 96 frames per second using DisplayPort 1.4. DirectX ray tracing will come to Halo Infinite on AMD hardware.
Nvidia’s 4090 comes in at around 440 watts, while the (unlaunched) 4080 comes in at 320 watts and $1,200.
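To put the quoted prices and power figures side by side, here is a small Python sketch. It uses only numbers cited in this article and assumes the 355-watt figure refers to the RX 7900 XTX; the dictionary layout and the compare function are hypothetical, and the result says nothing about relative performance.

```python
# Figures as quoted in the article. Assumption: the 355 W figure
# refers to the Radeon RX 7900 XTX.
cards = {
    "Radeon RX 7900 XTX": {"price_usd": 999, "power_w": 355},
    "GeForce RTX 4090": {"price_usd": 1600, "power_w": 440},
    "GeForce RTX 4080": {"price_usd": 1200, "power_w": 320},
}


def compare(baseline: str, other: str) -> dict:
    """Return how much more the other card costs and draws than the baseline."""
    b, o = cards[baseline], cards[other]
    return {
        "extra_price_usd": o["price_usd"] - b["price_usd"],
        "price_ratio": o["price_usd"] / b["price_usd"],
        "extra_power_w": o["power_w"] - b["power_w"],
    }


for rival in ("GeForce RTX 4090", "GeForce RTX 4080"):
    print(rival, compare("Radeon RX 7900 XTX", rival))
```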
Herkelman said in a Q&A that AMD’s graphics chips are competing in the segment for gaming graphics chips below $1,000, meaning Nvidia’s 4070 and 4080 graphics chips are the targets, as opposed to Nvidia’s GeForce RTX 4090 flagship unveiled in September.
Frank Azor, chief architect of gaming solutions and marketing, said AMD is updating its Adrenalin software for running the games with a unified user interface and an unlocked experience with no registration required. AMD has also partnered with Open Broadcaster Software to enable better video recording and streaming. AMD SmartAccess Video distributes encode and decode workloads across both the CPU and the graphics chip to get a 30% uplift in video processing.
Azor said esports players will feel confident enough to go beyond 1080p and still get high enough frame rates, with graphics up to 4K gaming at the highest speeds. Herkelman said he is hopeful that gamers will be able to get their hands on the cards in December and more afterward, despite the topsy-turvy supply chain picture of recent weeks.
AMD also announced several updates to its software suite, including the next iteration of the popular FidelityFX Super Resolution (FSR) temporal upscaling technology, FSR 2.2, expected to be available in Forza Horizon 5 on November 8, 2022. AMD also announced that it plans to release FSR 3 featuring AMD Fluid Motion Frames technology in 2023.
AMD also announced AMD Advantage for Desktop PCs, which brings the AMD Advantage framework to desktops, fusing together the top-of-the-line AMD Ryzen 7950X processors and AMD Radeon RX 7900 XTX graphics cards with AMD Software: Adrenalin Edition and AMD smart technologies to deliver the ultimate platform for gamers and creators.
How it started
Mantor was one of the first to get started as chief architect on RDNA 3.
“I had pretty grandiose ideas about what we should do and the process really starts with us having some meetings and brainstorming and how we can substantially change our position in the market,” Mantor said. “We looked at the people we have, how we can move the technology, and then we spend a good deal of time with our business unit partners in understanding the cost structures and how they want to create products so that we can be successful in the market.”
Pomianowski and Naffziger were also very involved in the product planning, as were hardware and software partners. AMD’s designers wanted to see where the shortcomings were in the RDNA 2 architecture and where they could make improvements in the next generation, Mantor said. The team did competitive analysis of rivals to see where they would land in the future.
“We’ve looked at all the technologies inside our company and outside of other companies, and how we can really change the game this time,” Mantor said.
Naffziger was very involved in bringing chiplets to the Ryzen CPUs, and he initiated some “massive thinking about chiplets,” Mantor said.
“Chiplets became an important ingredient,” Mantor said. “We look at the trends in silicon and that was going to be there. Another thing that was really strong in our minds was how AI technology intersects in the gaming consumer market. That was a big part of what we were thinking.”
And of course, AMD focused on raytracing, where rays of light are traced and used to construct what is visible to the user in a given 3D scene. Nvidia had focused heavily on the tech. The Unreal Engine technology was also leaning on micro polygons. And all of that informed the design team, Mantor said.
With RDNA, AMD made big leaps in frequency and performance per watt. Those remained big pillars for RDNA 3 and the company also wanted to reduce the area of the chip to lower costs and improve power efficiency.
“The goal was to make every transistor count,” said Mantor. “How do we utilize the transistors in the chip and make it better?”
The teams across architecture, design, and manufacturing try to improve on all fronts, but the big decisions have to focus on where the technologies will be at a given time when they debut in the market.
“As engineers, we like to have a clear answer and one right thing and move towards it,” Pomianowski said.
This isn’t easy because of the “sheer complexity of workloads” across different kinds of games and different players, Pomianowski said.
“There’s no one right size so we’re trying to identify things and optimize for the things that people value,” Pomianowski said. “How do we give gamers the best value for their money?”
Naffziger said AMD is lucky to have good competitors because it drives deep innovation.
“It really makes you wake up in the morning and drive fast into work,” Macri said. “It gives you those light bulb moments.”
Naffziger said he knew the team could not meet its aggressive targets without doing something different, and that was where the chiplet ideas arose.
“I started scratching out a little, you know, hotel pad there” at a retreat, Naffziger said. “And on a napkin.” And he passed it over to Pomianowski.
“It’s amazing how much architecture is done on napkins,” Macri said.
What followed were a million person-hours of work to get the product into the market, Macri said. They had to figure out how to tackle challenges such as latency. With chiplets, once you break the processing into different chips, the interfaces between chips have to be fast enough to not add to the latency, Macri said.
“We knew that was going to be one of the challenges with chipletizing a GPU because the bandwidth requirements are dramatically higher and yet the architecture is very sensitive to latency,” Naffziger said. “So we pulled teams together. We walked through every cycle of the hundreds of cycles from GPU and back to the Infinity Cache. At the end of the day, we lowered latency.”
No transistor left behind
Pomianowski said the designers focused on a unified compute engine. That was the way to make sure that transistors are being used all the time and aren’t sitting idle, even with the variety of gaming workloads. He said the team thought of this as “no transistor left behind.”
The team also had to consider that it wanted developers to have a simpler time writing code to run on the machines. The team added some logic to make its multipliers and adders function in a more general-purpose way, Mantor said. But they didn’t want to add too many transistors that would slow down the machine. The team also had to figure out whether to add dedicated AI processing features.
They also decided that raytracing is here to stay and they had to improve it and do it right, Mantor said.
Ultimately, the team had to make compromises and push some goals out into the future and figure out exactly what it could deliver in 2022.
“We have to deliver this product,” Macri said. “We say it and then we do it. That’s a critical part of the culture. You make the decision and move forward. I call it engineering courage. It’s the courage to make a decision that you’re going to have to live with for years.”
AMD paid my way to Las Vegas. Our coverage remains objective.