Author: Donald Telian, SI Guys - Guest Blogger
Memory. Every system has it. Short-term, long-term, 2nd level, static, dynamic, EPROM, graphic, internal, cache – but, by far, from an SI perspective the most common memory interface is DDRx. Double-Data Rate (DDR) memory arrived in the late 1990s, and has remained the dominant interface for decades. It has proven to be extensible and, now in its 5th generation, DDRx has achieved a 20x increase in bandwidth.
DDR memory occupies an important place in SI history. It was the force that drove the transition to “source-synchronous” switching, in which both clock and data originate at the “source”. This brought higher-frequency data transfers, replacing a central clock that synchronized data at both the source and destination. Now that the clock travelled with the data, the stage was set for serial links to later hide the clock inside the data.
DDR’s new-found precision allowed us to latch data on both the rising and falling edges of the clock, hence the term “double-data rate”. As that reduces the clock’s frequency, it’s said that it also “reduces the signal integrity requirements on the circuit board”. But is that really true?
DDR, Signal Integrity, and PCBs
With dozens of drive strength and slew rate choices, internal and external terminations of various values turning on and off, and locally-sourced voltage switching references, one would think DDR is every SI Engineer’s dream. …or nightmare. Indeed, DDR’s complexity has provided job security for countless SI engineers. And I have a ton of respect for the brilliant SI minds who have engineered us into each new generation of DDR.
While SI engineers have worked diligently to keep DDR functioning correctly, Signal Integrity is not the driving force behind DDR. Instead, the best way to understand DDRx is to realize it’s all about memory density and price per bit. Viewed from that perspective, DDRx makes more sense. Complexity is handled by the controller where gates are readily available, and memory chips preserve every gate for storage. While this concession increases memory density, it presents a myriad of challenges for SI and layout engineers. By far, from an SI perspective, the messiest technology I’ve worked with has been DDR memory. Sorry, but it’s true.
DDR’s signaling requirements have had a dramatic effect on PCB layout. Glance at any modern layout and it’s easy to find the memory interface: tons of extra metal seemingly going nowhere. Source-synchronous switching forced length matching, requiring serpentines on shorter nets. The end result is a dense layout that resembles chow mein.
For non-memory interfacing, PCB layout began simplifying in 2002 when Intel declared: “We’re now at a point where it’s getting cheaper to put more gates behind a fast serial line than to lay down copper traces.” This statement was made in reference to PCI Express, which sparked the serial revolution. Now only two signals would be required on the PCB, while the ICs would require a significant amount of silicon to code, serialize, align, equalize, recover the clock, and more. And it worked. At this point, serial links have taken over every interface except memory, even though numerous serial memory standards have been proposed. Technology-wise, I assumed serial would win memory too, but I was wrong (slide 13). Maybe someday.
Facing an Interface
If Signal Integrity (SI) is even remotely associated with your job description, you will eventually be asked to design or verify a DDRx interface. When that happens, the first question to ask is: “Should we simulate?” If the answer is “no, not this time”, feel free to jump to the section “Making it Simpler” below.
If you are going to simulate your DDRx interface, here are a few things you’ll need to resolve first:
* Models. It’s never too early to start the hunt for models. In addition to buffer models, accurate package models are critical for DDRx simulation because package lengths can represent a significant portion of the “matching”. Resolve model acquisition and verification in parallel with the other items below.
* Up or Down. Are the DDRx components soldered (“down”) onto the same board as the controller? …or are they on a plug-in DIMM (“up”)? If DIMMs, how many? …and are they driven by the same or separate interfaces in the controller?
* Frequency. While the systems engineers are thinking about “Mega-Transfers per Second, or MT/s”, as an SI engineer you’ll need to know the operating frequencies. Since data transfers on both clock edges, the frequency of the Clock, Strobes and Data is typically half the transfer rate. And since Address and Command signals are latched only on the rising edge of the Clock, they typically operate at half the frequency of the data.
* Width and Matching. How many bytes wide is the interface, and which strobes go with which data bits? Is the matching done at the nibble, byte, or word level?
* Waveform Quality and/or Timing. Is simulation aimed at resolving waveform quality, timing, or both? Waveform quality looks at items such as overshoot, ringback, monotonicity and slew – for which numerous specs, boundaries, and violations are defined (slide 5). Timing analysis derives picosecond margins for items such as skews, setups and holds. While these tasks are complex they can be automated, so be sure to understand to what extent your tools include rigorous waveform processing and timing calculation. Be aware that both tasks are affected by your buffer drive strength/slew, termination, and ODT choices, so it’s a good idea to resolve those items first.
* Tools. If you are doing timing analysis, does your simulation environment attempt to simulate and measure the source-synchronous skews directly by simulating everything at once? Or, does it simulate the signals standalone and then mathematically derive skews? Both methods are complicated, yet you can expect the documentation that came with your simulator to help you understand which methodology is in use. If you do not already know this answer, plan some time (and a couple cups of coffee) to figure it out.
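To put numbers on the Frequency item above, here is a minimal Python sketch of the arithmetic. The function name `ddr_frequencies_mhz` and the example data rates are my own choices for illustration, and the sketch assumes Address/Command are latched only on the rising clock edge, as described above.

```python
def ddr_frequencies_mhz(transfer_rate_mts: float) -> dict:
    """Convert a DDR transfer rate (MT/s) into the frequencies an SI engineer cares about.

    Assumes data transfers on both clock edges and Address/Command
    only on the rising edge (hypothetical helper, not from any tool).
    """
    clock_mhz = transfer_rate_mts / 2.0   # two transfers per clock cycle
    addr_cmd_mhz = clock_mhz / 2.0        # one transfer per clock cycle
    return {
        "clock_strobe_data_mhz": clock_mhz,
        "addr_cmd_mhz": addr_cmd_mhz,
    }


if __name__ == "__main__":
    # Example rates chosen only for illustration.
    for rate in (1600, 3200, 6400):
        f = ddr_frequencies_mhz(rate)
        print(f"{rate} MT/s: CLK/DQS/DQ {f['clock_strobe_data_mhz']:.0f} MHz, "
              f"Addr/Cmd {f['addr_cmd_mhz']:.0f} MHz")
```

These are the fundamental frequencies of the fastest toggling pattern on each signal group; edge rates, and therefore the bandwidth your models and simulations must cover, are considerably higher.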
Making it Simpler
Some memory designs elect to resolve DDRx SI by simply matching signal lengths on the PCB. Matching rules might be inherited from others within your company, or they might come from your component vendors. Generic DDRx routing guidelines, derived from numerous sources, are provided in Table 1. To aid the layout task, Table 1 specifies matching precision in terms of length rather than time.
Using Table 1 values, the route process proceeds as follows:
1. Determine the natural length range of your DDRx signals. This might be derived from a list of un-routed “Manhattan lengths” provided by your layout tool.
2. Select a length for CLK within that range. All other signals will match this length to within 0.5 inch, so be sure it is long enough for your longest natural routes and short enough that the shortest nets can be lengthened to meet it.
3. Choose lengths for each of your DQS signals within the stated variation of CLK length, and set that as the target length for its associated DQ bits.
4. Route the Address/Command/Control signals to match CLK target length within the range stated, on the same layer as CLK.
5. Route each DQ group to match its associated DQS target length within the range stated, on the same layer as its DQS.
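The matching in steps 2 through 5 is easy to sanity-check with a script once routed lengths are exported from the layout tool. The sketch below is one way to do it, not a feature of any particular tool: the function name, net names, and lengths are hypothetical, and every tolerance except the 0.5 inch CLK window from step 2 is a placeholder to be replaced with your own guideline values. It checks lengths only and does not verify the same-layer requirement.

```python
# All lengths in inches. Tolerances are PLACEHOLDERS except dqs_to_clk,
# which uses the 0.5 inch window mentioned in step 2.
TOLERANCES_IN = {
    "dqs_to_clk": 0.5,
    "addr_to_clk": 0.2,   # hypothetical value
    "dq_to_dqs": 0.05,    # hypothetical value
}


def find_violations(clk_len, addr_cmd, byte_lanes, tol=TOLERANCES_IN):
    """Return (net, reference) pairs for nets outside their matching window."""
    violations = []
    # Step 4: Address/Command/Control match the CLK target length.
    for net, length in addr_cmd.items():
        if abs(length - clk_len) > tol["addr_to_clk"]:
            violations.append((net, "CLK"))
    for dqs_net, lane in byte_lanes.items():
        # Step 3: each DQS sits within the allowed variation of CLK.
        if abs(lane["dqs"] - clk_len) > tol["dqs_to_clk"]:
            violations.append((dqs_net, "CLK"))
        # Step 5: each DQ matches its own DQS.
        for net, length in lane["dq"].items():
            if abs(length - lane["dqs"]) > tol["dq_to_dqs"]:
                violations.append((net, dqs_net))
    return violations


# Hypothetical routed lengths for illustration only.
CLK_LEN = 2.5
ADDR_CMD = {"A0": 2.6, "A1": 2.9}
BYTE_LANES = {"DQS0": {"dqs": 2.4, "dq": {"DQ0": 2.41, "DQ1": 2.6}}}

if __name__ == "__main__":
    for net, ref in find_violations(CLK_LEN, ADDR_CMD, BYTE_LANES):
        print(f"{net} is outside its matching window relative to {ref}")
```

With the example numbers, A1 is too far from CLK and DQ1 is too far from DQS0, so those two nets are reported.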
Table 1: Generic DDRx Layout Guidelines
When working with the values in Table 1, or any other set of guidelines, be sure to understand the package delay assumptions of your components. Some components carefully match package delays within their device, allowing the layout engineer to simply match the routes on the PCB. Other components publish a long list of “pin delays” that must be factored into the PCB layout matching. For these situations the PCB route lengths vary, and the “match” is realized at the system level – or PCB+Package length.
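When a component publishes pin delays, the PCB target for each net is simply the system-level target minus that pin’s package contribution. Here is a minimal sketch of that bookkeeping. The helper names are hypothetical, and the 170 ps/inch conversion factor is an assumption (roughly typical of stripline in FR-4); use the propagation delay from your own stackup, and skip the conversion if your vendor already lists package lengths rather than delays.

```python
PS_PER_INCH = 170.0  # ASSUMED propagation delay; replace with your stackup's value


def package_length_in(pkg_delay_ps, ps_per_inch=PS_PER_INCH):
    """Express a published pin delay (ps) as an equivalent PCB trace length (inches)."""
    return pkg_delay_ps / ps_per_inch


def pcb_target_in(system_target_in, pkg_delay_ps, ps_per_inch=PS_PER_INCH):
    """PCB route length that makes PCB + package hit the system-level target."""
    return system_target_in - package_length_in(pkg_delay_ps, ps_per_inch)


def system_length_in(pcb_len_in, pkg_delay_ps, ps_per_inch=PS_PER_INCH):
    """Total matched length seen at the system level (PCB + package)."""
    return pcb_len_in + package_length_in(pkg_delay_ps, ps_per_inch)


if __name__ == "__main__":
    # Hypothetical pin delays (ps) for two data bits sharing a 3.0 inch system target.
    pin_delays_ps = {"DQ0": 85.0, "DQ1": 34.0}
    for net, delay in pin_delays_ps.items():
        print(f"{net}: route {pcb_target_in(3.0, delay):.3f} in on the PCB")
```

Note that the PCB lengths come out different for each net even though the nets are matched, which is exactly the situation described above.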
Where are the Failures?
DDRx interfaces have proven to be robust in the high-volume/low-cost world of consumer electronics, but design-time failures can surface in a few areas. Inadequate length matching is problematic, causing unexpected and difficult-to-isolate timing violations and failures. From my viewpoint, SSO (Simultaneous Switching Outputs) has been the leading cause of failure, with noise on Vref in second place. And with its dense serpentines and same-layer routing, crosstalk is always a concern. The interface can also be power hungry, causing the failure mode to jump into the Power Integrity space. For help in isolating failures, see the flowchart in Figure 1 in this post.
The Future is Now
DDRx technology continues to advance by exploiting nearly every SI trick in the book. DDR4 and DDR5 now borrow equalization techniques from serial link SerDes, such as Tx FFE and Rx DFE and CTLE (slides 6 and 7). This introduces hybrid simulation approaches that examine both setup/hold timings and eye mask/BER margins. This, in turn, introduces the use of IBIS-AMI models and analysis techniques to the world of DDRx; a transition that presents a myriad of new challenges (slide 13) and debate about whether and how IBIS-AMI can be used effectively (slides 14-19, and 42-43).
With advanced equalization available for DDR5+, SiSoft engineers did an interesting study to quantify eye margin versus data rate versus number of DFE taps. Figure 1 shows their plot of eye height (Y axis) versus data rate (X axis), overlaying colors to represent the number of DFE taps (red=0, blue=2, green=4, gold=8). As shown, DFE appears to be required at the higher data rates (red=no DFE). Adding only 2 taps (blue) marginally solves the problem, suggesting additional taps might be required (green/gold). For more details on this analysis see SiSoft’s DesignCon 2018 paper, specifically pages 16-20.
Figure 1: DDR5 Eye Margin versus Data Rate and Number of DFE Taps
After 20+ years, DDRx technology continues to challenge SI engineers, tools, and models. When the industry becomes proficient in designing DDRx, along comes DDRx+1 with tighter margins, more layout restrictions, and a couple of new techniques to understand and apply. While serial memory interface standards have been proposed, the incrementing of DDRx seems like it won’t stop anytime soon. As such, it stands as the exception among high-speed interfaces, requiring significant complexity on the PCB, increasing complexity inside ICs, and lots of SI due diligence to make sure everything works right.
For more assistance with your DDRx design, consult SiSoft’s extensive published works on the subject and their collection of simulation-ready QSI Design Kits.