AMD Zen: Core Complexes and Configurations (with a Q&A)

This article is going to be somewhat shorter and more image-rich than the previous two. Today, I'll be discussing possible core configurations for AMD Zen-based products. I've also had a Q&A planned for a few weeks now, so that forms the final part of this article. The Q&A section will be updated over time, so if you have a question relating to Zen, feel free to leave a comment.

Again, I want to stress that the configurations shown on this page are largely predictions on my part (although some have since been confirmed).

Zen Core Complexes

Zen has previously been described as bringing the best of both of AMD's previous architectures into one solid product. That is to say that Zen incorporates design choices from both Excavator and Puma. One of those choices is that cores arrive in pairs. For all Bulldozer-derived ("construction") architectures, these pairs were referred to as modules, while the cat-based architectures had no such distinction. The underlying design actually differs between the two families (construction architectures share resources while cat architectures don't), and Zen is much closer in design to the cat-based architectures than it is to any of the construction architectures.

Zen also takes the grouping of cores to the next level: Zen's cores will come in groups of four, under the new name "core complex," abbreviated as CCX. For this reason, it's very unlikely that you'll see any dedicated 2-core or 6-core dies being produced, but that doesn't rule out such models being cut down from 4-core and 8-core dies, respectively.
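
To illustrate how cut-down parts could come from CCX-based dies, here is a minimal Python sketch that enumerates the ways a given number of active cores can be spread across the core complexes of one die. The function name `ccx_splits` and the two-CCX default are my own assumptions for illustration; the sketch only lists what is arithmetically possible, not which arrangements AMD will actually ship.

```python
from itertools import product

CORES_PER_CCX = 4  # from the article: Zen cores come in groups of four


def ccx_splits(active_cores, ccx_count=2):
    """List the distinct ways to spread active cores over the CCXs of a die.

    The order of the CCXs is ignored, so (4, 0) and (0, 4) count once.
    """
    splits = set()
    for combo in product(range(CORES_PER_CCX + 1), repeat=ccx_count):
        if sum(combo) == active_cores:
            splits.add(tuple(sorted(combo, reverse=True)))
    return sorted(splits, reverse=True)


if __name__ == "__main__":
    for cores in (4, 6, 8):
        print(f"{cores} active cores on an 8-core die: {ccx_splits(cores)}")
```

For a quad-core part cut from an 8-core die, this lists one fully enabled CCX, a 3+1 split, or two half-enabled CCXs; a hexa-core part can be 4+2 or 3+3.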

The Zen architecture can clock down and power down each core individually, and features much improved clock gating, first introduced with Excavator in 2015. This theoretically means that 6-core and 8-core parts can clock as high as 4-core chips when fewer cores are active. It will also make XFR very interesting indeed.

The diagram below demonstrates the differences between three quad-core designs: a Zen core complex, two Excavator modules and a quad-core Skylake chip. You can really see the similarities with Skylake here, and that's good news for Zen and its successors, Zen+ and Zen++. You can also apply the Skylake layout to Nehalem, Sandy Bridge, Ivy Bridge, Haswell, Broadwell and Kaby Lake, as they are essentially identical in this respect.

'Zen' Core Complex (CCX) vs. 'Excavator' Modules vs. 'Skylake' Quad-Core

Zen Against Its Predecessors and Opposition

As previously mentioned, Zen returns to a more traditional processor core design, akin to K10 and the cat-based architectures like Jaguar and Puma. By this, I mean that each Zen core has its own dedicated integer and floating-point units, its own 64 KiB level 1 instruction cache and 32 KiB level 1 data cache, and a dedicated 512 KiB level 2 cache. What this means is that for the same number of cores as a related Piledriver, Steamroller or Excavator processor, Zen will have the same number of integer units, but twice as many floating-point units. The construction architectures were notable for lacking in floating-point performance because of their shared floating-point units, and it's good to see that Zen returns to a more conventional design, something we were familiar with from AMD before 2011. Additionally, the improved cache subsystem translates into up to five times as much bandwidth from all three levels of cache.

Each floating-point unit is 128 bits wide, and two units can fuse together to tackle 256-bit instructions such as those in AVX and AVX2. Unfortunately, this does put Zen behind Intel's Haswell, Broadwell and Skylake architectures, which contain dedicated 256-bit-wide floating-point units; a single dedicated unit will generally be more efficient than two units fusing together to accomplish the same goal. As such, in workloads that utilize 256-bit instructions, IPC will be somewhat closer to Ivy Bridge (although still likely ahead of it in terms of performance). With that said, workloads of this type are very limited in scope, and for the most part, Zen will be very competitive with the aforementioned architectures. Intel has also hindered AVX2 adoption by refusing to incorporate the instructions into its lower-end products, which will work in AMD's favor for Zen.
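
As a rough illustration of the arithmetic, the following Python sketch counts how many unit-width operations are needed to cover one vector instruction. The function names are hypothetical, and the model deliberately ignores scheduling, execution ports and latency; it only shows why a 256-bit instruction costs two 128-bit operations on a 128-bit-wide unit.

```python
import math


def passes_needed(instruction_bits, unit_bits):
    """Number of unit-width operations needed to cover one vector instruction."""
    return math.ceil(instruction_bits / unit_bits)


def lanes(width_bits, element_bits=32):
    """How many elements (single-precision floats by default) fit in a vector."""
    return width_bits // element_bits


if __name__ == "__main__":
    for unit_bits in (128, 256):
        print(
            f"{unit_bits}-bit unit: {lanes(unit_bits)} float lanes, "
            f"{passes_needed(256, unit_bits)} operation(s) per 256-bit instruction"
        )
```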

Zen's successor, known as Zen+, will bring dedicated 256-bit-wide floating-point units, as well as higher clock frequencies and other currently unknown features. However, Zen in 2017 needs to lay the foundation on which its successors can build. This is predominantly the reasoning behind Zen's narrower compute units: they will be cheap to produce, and use less power for similar instructions. While Zen will struggle in the very specific workloads mentioned above, it should serve most consumers very well in most scenarios, where its performance will be on par with, if not better than, Intel's offerings.

Configurations: Summit Ridge

'Summit Ridge' is the codename for high-performance graphics-less processors intended to replace the current Piledriver-based FX line-up, released in 2012 (with refreshes in 2013 and 2014). Initially, Summit Ridge will come in 8-core form, with and without SMT enabled. I suspect that this is because yields are greater than originally anticipated, and therefore AMD is stockpiling defective silicon for a later release; those later chips will introduce 4-core and 6-core variants, again likely with SMT enabled and disabled, to properly attack each market segment and price range. I don't expect any dual-core models of Summit Ridge to be released.

95-watt chips are to be released initially, but the AM4 platform supports up to 140 watts. While that doesn't necessarily mean we'll see Summit Ridge parts with these TDPs, the possibilities are there for any future releases should AMD want to target quad-core-like clock frequencies on 8-core parts. The additional power support should also help toward overclocking the 95-watt models.

It's currently unknown whether 4-core and 6-core parts will retain the full 16 MiB of level 3 cache. In the previous generation of high-performance processors, the FX-4350 retains the full 8 MiB from the octa-core FX line-up, and all hexa-core models retain the full 8 MiB as well. If the same holds for Summit Ridge, there will be a noticeable difference between the 4-core Summit Ridge processors and the 4-core graphics-less Raven Ridge chips.

Update: 6-core Ryzen processors do indeed retain the full 16 MiB of cache, but 4-core models don't. This means that Raven Ridge could potentially aid with Summit Ridge's yields if necessary.

Core Configuration  L1 Instruction Cache  L1 Data Cache  L2 Cache  L3 Cache  SMT  Graphics  TDP
4C/4T               256 KiB               128 KiB        2 MiB     8 MiB     No   No        65 W, 95 W, 125 W
4C/8T               256 KiB               128 KiB        2 MiB     8 MiB     Yes  No        65 W, 95 W, 125 W
6C/12T              384 KiB               192 KiB        3 MiB     16 MiB    Yes  No        65 W, 95 W, 125 W
8C/16T              512 KiB               256 KiB        4 MiB     16 MiB    Yes  No        65 W, 95 W, 125 W, 140 W
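
The cache totals in this table follow directly from the per-core sizes quoted in this article (64 KiB L1 instruction, 32 KiB L1 data and 512 KiB L2 per core, plus 8 MiB of L3 per core complex). A minimal Python sketch of that arithmetic is below; the function name and the `l3_blocks` parameter (the number of 8 MiB L3 blocks left enabled) are my own naming, not anything official.

```python
L1I_PER_CORE_KIB = 64   # level 1 instruction cache per core
L1D_PER_CORE_KIB = 32   # level 1 data cache per core
L2_PER_CORE_KIB = 512   # level 2 cache per core
L3_PER_CCX_MIB = 8      # level 3 cache per core complex


def cache_totals(cores, l3_blocks):
    """Total cache sizes for a chip with the given active cores and L3 blocks."""
    return {
        "l1i_kib": cores * L1I_PER_CORE_KIB,
        "l1d_kib": cores * L1D_PER_CORE_KIB,
        "l2_mib": cores * L2_PER_CORE_KIB / 1024,
        "l3_mib": l3_blocks * L3_PER_CCX_MIB,
    }


if __name__ == "__main__":
    # (active cores, enabled 8 MiB L3 blocks) as listed in the table
    for cores, l3_blocks in ((4, 1), (6, 2), (8, 2)):
        print(cores, "cores:", cache_totals(cores, l3_blocks))
```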

For each of the following diagrams, light colors represent active parts of the core, while dark colors show disabled parts.

  • Slate — processor die
  • Red — processor core with dedicated level 1 instruction and data caches; 64 KiB and 32 KiB, per core, respectively
  • Blue — level 2 cache; 512 KiB per core
  • Green — level 3 cache; 8 MiB per cluster

'Summit Ridge' Quad-Core #1
'Summit Ridge' Quad-Core #2
'Summit Ridge' Hexa-Core
'Summit Ridge' Octa-Core

Configurations: Raven Ridge

'Raven Ridge' is the codename for APUs based on the Zen architecture. They will replace current APUs using the Steamroller and Excavator architectures, released in 2014 and 2015, respectively. There were refreshes for Steamroller in 2015, and for Excavator in 2016. Less is known about Raven Ridge as it's a little further from release, but that's relative to 'Summit Ridge' which is launching within the next few months.

Raven Ridge APUs are expected to debut a larger integrated graphics processor, based on the latest iteration of GCN, Vega. While Vega-based graphics cards will come equipped with HBM2, I actually expect the processor graphics to rely entirely on very fast DDR4-3200 system memory in the dual-channel configuration provided by the AM4 platform. This will provide these APUs with up to 51.2 GB/s of memory bandwidth, which represents a 33% increase over Excavator-based Bristol Ridge APUs, and a 50% increase over Steamroller-based Godavari APUs. At the same time, Raven Ridge's integrated graphics will also come with 50% more unified shaders than both previous generations (768 versus 512), so the additional bandwidth will be a necessity. A new unified level 3 cache between the processor cores and the graphics cores will also help somewhat with that.
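
The bandwidth figures above can be reproduced with a short Python sketch. It computes theoretical peak bandwidth in decimal gigabytes per second for a 64-bit (8-byte) memory channel. Note the assumptions: the comparison speeds of DDR4-2400 for Bristol Ridge and DDR3-2133 for Godavari are my own inputs for this example, and the function names are hypothetical.

```python
def peak_bandwidth_gb_s(transfers_mt_s, channels=2, bus_bytes=8):
    """Theoretical peak memory bandwidth in GB/s (decimal gigabytes)."""
    return transfers_mt_s * bus_bytes * channels / 1000


def percent_increase(new, old):
    """Relative increase of new over old, in percent."""
    return (new / old - 1) * 100


if __name__ == "__main__":
    raven = peak_bandwidth_gb_s(3200)     # DDR4-3200, dual channel
    bristol = peak_bandwidth_gb_s(2400)   # assumed: DDR4-2400, dual channel
    godavari = peak_bandwidth_gb_s(2133)  # assumed: DDR3-2133, dual channel

    print(f"Raven Ridge: {raven:.1f} GB/s")
    print(f"vs. Bristol Ridge: +{percent_increase(raven, bristol):.0f}%")
    print(f"vs. Godavari: +{percent_increase(raven, godavari):.0f}%")
    print(f"Shaders, 768 vs. 512: +{percent_increase(768, 512):.0f}%")
```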

I should point out that the graphics configurations in the table below are entirely based on what we know about Raven Ridge, and on what we currently have with Bristol Ridge. For this reason, I don't think we'll see any dual-cores with fewer than 4 compute units (256 unified shaders), or quad-cores with fewer than 6 compute units (384 unified shaders). It's also entirely plausible that graphics-less versions of Raven Ridge will be available, and they may even make up for a lack of quad-core Summit Ridge chips if yields of 6-core and 8-core silicon are very good. (This also assumes that Summit Ridge quad-core processors don't retain the full size of level 3 cache from the octa-core models.)

Core Configuration  L1 Instruction Cache  L1 Data Cache  L2 Cache  L3 Cache  SMT  Graphics                    TDP
2C/2T               128 KiB               64 KiB         1 MiB     4 MiB     No   None, or 4, 6 or 8 CU       35 W, 65 W
2C/4T               128 KiB               64 KiB         1 MiB     4 MiB     Yes  None, or 4, 6 or 8 CU       35 W, 65 W
4C/4T               256 KiB               128 KiB        2 MiB     8 MiB     No   None, or 6, 8, 10 or 12 CU  35 W, 65 W, 95 W
4C/8T               256 KiB               128 KiB        2 MiB     8 MiB     Yes  None, or 6, 8, 10 or 12 CU  35 W, 65 W, 95 W

Graphics options in compute units (CU) and unified shaders (USPs): 4 CU (256 USPs), 6 CU (384 USPs), 8 CU (512 USPs), 10 CU (640 USPs), 12 CU (768 USPs).
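
The unified shader counts follow from the compute-unit counts, assuming 64 unified shaders per GCN compute unit, which is consistent with the 12 CU and 768 shader figure quoted for Raven Ridge. A minimal Python sketch (the names are my own):

```python
SHADERS_PER_CU = 64  # assumption: 64 unified shaders per GCN compute unit


def unified_shaders(compute_units):
    """Unified shader count for a given number of compute units."""
    return compute_units * SHADERS_PER_CU


if __name__ == "__main__":
    for cu in (4, 6, 8, 10, 12):
        print(f"{cu} CU -> {unified_shaders(cu)} unified shaders")
```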

For each of the following diagrams, light colors represent active parts of the core, while dark colors show disabled parts.

  • Slate — microprocessor die
  • Red — microprocessor core with dedicated level 1 instruction and data caches; 64 KiB and 32 KiB, per core, respectively
  • Blue — level 2 cache; 512 KiB per core
  • Green — level 3 cache; 8 MiB
  • Orange — integrated graphics

'Raven Ridge' Dual-Core without Graphics
'Raven Ridge' Dual-Core with 4-CU 'Vega' Graphics
'Raven Ridge' Dual-Core with 6-CU 'Vega' Graphics
'Raven Ridge' Dual-Core with 8-CU 'Vega' Graphics
'Raven Ridge' Quad-Core without Graphics
'Raven Ridge' Quad-Core with 6-CU 'Vega' Graphics
'Raven Ridge' Quad-Core with 8-CU 'Vega' Graphics
'Raven Ridge' Quad-Core with 10-CU 'Vega' Graphics
'Raven Ridge' Quad-Core with 12-CU 'Vega' Graphics

Questions & Answers: AMD Zen

What is XFR?

Extended Frequency Range (XFR) is a feature of Zen processors that allows them to automatically and dynamically adjust their maximum turbo frequencies, based on your chosen processor cooling method. For this reason, liquid-cooled machines will be able to achieve higher clock frequencies than air-cooled systems. This feature works alongside manual overclocking, which remains available.

Will Zen be competitive with Intel's latest offerings?

In most scenarios, yes.

Is Zen something you can comfortably recommend?

You should wait for reviews before making your final decision, but from what I know already, Zen is looking like a product I can get behind.

Will there be any overclock-capable AMD Ryzen processors?

Yes, all of them!

I intend to overclock. What should I look out for?

You'll definitely want a B350 or X370 chipset. For small-form-factor builds, you'll need the X300 chipset. You can pair these chipsets with any Ryzen processor you like.

Will the AM4 platform support multi-GPU setups?

Yes, but only in conjunction with X300 or X370 chipsets.

Will AM4 support both CrossFireX and SLI?

Yes.

Will Raven Ridge use Polaris or Vega for its integrated graphics?

Raven Ridge will be equipped with Vega-based graphics, with up to 12 compute units for a total of 768 unified shaders.

Any HBM for Raven Ridge?

Not yet.

What is the fastest type of system memory that AM4 supports?

Officially, that is DDR4-2667 in a dual-channel configuration, providing roughly 42.7 GB/s of bandwidth. However, Ryzen's integrated memory controller fully supports Extreme Memory Profile (XMP), which allows the processor to accept memory speeds up to DDR4-4000, providing 64 GB/s of bandwidth.

AMD touted an instructions-per-clock (IPC) gain of 40%. What is the final figure?

AMD's Lisa Su announced during the Ryzen press conference on February 22, 2017 that the final IPC increase over Excavator is, on average, 52%. However, in the benchmarking booth, a Cinebench R15 single-threaded score of 165 cb was recorded at 4.00 GHz, which means that in this particular (floating-point) benchmark, the IPC increase is a colossal 71.9% over the Excavator-based A12-9800 at 4.00 GHz, which scores 96 cb.
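
The 71.9% figure can be checked with a few lines of Python. This sketch treats benchmark score per GHz as a stand-in for IPC, which is a simplifying assumption that only holds for a single-threaded run at a fixed clock; the function name is my own.

```python
def ipc_gain_percent(new_score, old_score, new_ghz, old_ghz):
    """Percentage gain in score per GHz, used here as a proxy for IPC."""
    new_per_ghz = new_score / new_ghz
    old_per_ghz = old_score / old_ghz
    return (new_per_ghz / old_per_ghz - 1) * 100


if __name__ == "__main__":
    # Cinebench R15 single-threaded scores quoted above, both at 4.00 GHz
    gain = ipc_gain_percent(165, 96, 4.00, 4.00)
    print(f"IPC gain: {gain:.1f}%")
```

Because both chips run at the same clock frequency, the gain reduces to the ratio of the two scores: 165 / 96 = 1.71875.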