Simulating a Serial Link

Monday, April 15, 2019

And now one of my favorite topics: simulating a serial link. Good technology, good models, good margins, endless options. Yet how do I construct an accurate simulation? And what do I measure? Part of what makes this topic interesting is the variety of answers out there. And the almost infinite number of dials makes the task an engineer’s dream. In a world like that, how do I get started? …and how do I know when I’m done? How good is good enough?


The Big Picture

There are two ways to examine and hence simulate your serial link: passive and active. And there are two times to simulate it: pre-layout and post-layout. I suppose there’s a third time too: after it has already been built. If you’re doing that, you’re probably in trouble. So let’s not think about that here.


The good news is that the simulation process itself is serial, because you cannot simulate the active until you have first built and verified the passive. Some engineers, tools and specifications assert you are “done” once the passives are simulated and “compliant”. But since the beginning of serial links this has not been true. As the PCI Express (PCIe) 1.0 Specification stated: “Although loss vs. frequency is useful in understanding how to design effective interconnect, the timing and voltage margin measured in the TX and RX eye diagrams end up being the ultimate constraints of insertion loss.” In other words, you will learn a lot simulating your passive channel, but you’re not really “done” until you have attached the active devices and measured performance from the Rx’s perspective. It’s the Rx that ultimately determines if a “good” channel is “good enough”. And an Rx’s ability to extract a signal out of a lossy channel has changed radically over the years, and continues to do so.


So I like to work on the passive channel to make it “good” and then simulate the active channel to confirm it’s “good enough”. Truth be told, I then like to optimize the channel’s firmware settings to make it “extremely good”. But, in practice, not everyone wants to take that step (see Step 7). Schedule-wise, the greatest benefits of serial link simulation are realized pre-layout. So let’s start there.


Building and Simulating the Passive Channel

To construct the passive model of your serial link, form a picture of the signal’s path from the Tx to Rx. Since the passives in the Tx and Rx devices themselves are unlikely to be under your control, follow the path from the Tx device’s solder pad to the Rx device’s solder pad. With a rough sense of the passive PCB elements (traces, vias, connectors, etc.), the next thing you’ll need is the data rate. Use this to determine the feature sizes you care about, and hence need to model. In today’s world you will likely care about every via, so the question tends to be whether you should bother with smaller ~20 mil structures such as solder pads. Not sure which structures matter? When in doubt, put them in. Remember, you are building a model which can easily be simulated with and without any structure. To learn which elements matter, add the small items in and measure if/how performance changes.
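
One rough way to turn a data rate into a feature-size threshold is sketched below. It is only a sketch: it assumes NRZ signaling (Nyquist frequency equals half the data rate), an illustrative Dk of 3.7, and a "one tenth of a wavelength at three times Nyquist" rule of thumb. None of those numbers come from a specification, and the function names are made up for this example.

```python
import math

C_M_PER_S = 299_792_458.0        # speed of light in vacuum
MILS_PER_M = 1000.0 / 0.0254     # 1 inch = 1000 mils = 0.0254 m

def nyquist_hz(data_rate_gbps):
    """Nyquist frequency for NRZ signaling: half the bit rate."""
    return data_rate_gbps * 1e9 / 2.0

def wavelength_mils(freq_hz, dk):
    """Wavelength inside a dielectric with relative permittivity dk."""
    return C_M_PER_S / (freq_hz * math.sqrt(dk)) * MILS_PER_M

def feature_threshold_mils(data_rate_gbps, dk=3.7, harmonic=3, fraction=0.1):
    """Assumed rule of thumb: structures longer than `fraction` of a
    wavelength at `harmonic` x Nyquist are worth modeling."""
    freq = harmonic * nyquist_hz(data_rate_gbps)
    return fraction * wavelength_mils(freq, dk)

for rate in (8, 16, 32):
    size = feature_threshold_mils(rate)
    print(f"{rate} Gbps: consider modeling features longer than ~{size:.0f} mils")
```

Treat the result as a starting point only. The with/without simulation described above remains the real test of whether a structure matters.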


These days the tools guide you through the physical construction of your trace and via models. Behind the scenes, internal field solvers create subcircuit models the simulator will use. When entering Dk/Df (sometimes called Er/Lt), use values for the frequency point(s) of interest. Because these parameters are of increasing importance, we’ve seen good improvement in their availability and accuracy. While trace impedance is important, loss gets most of the attention and must be managed appropriately (Step 2). Thankfully, there are common materials that have both accurate and low Df/Lt values.
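
To get a feel for how much Df matters before opening the tool, a back-of-the-envelope dielectric loss estimate can help. The sketch below uses the standard low-loss TEM approximation (attenuation = pi * f * sqrt(Dk) * Df / c) and ignores conductor loss and copper roughness entirely, so it understates total loss. The two material entries are illustrative assumptions, not vendor data.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in vacuum

def dielectric_loss_db_per_inch(freq_ghz, dk, df):
    """Dielectric attenuation of a TEM line, low-loss approximation."""
    alpha_np_per_m = math.pi * freq_ghz * 1e9 * math.sqrt(dk) * df / C_M_PER_S
    db_per_m = alpha_np_per_m * 20.0 / math.log(10.0)   # nepers -> dB
    return db_per_m * 0.0254                             # per metre -> per inch

# Illustrative laminate values only; use your vendor's data at the
# frequency of interest.
materials = {
    "standard-loss (assumed)": (4.2, 0.020),
    "low-loss (assumed)": (3.6, 0.004),
}
for name, (dk, df) in materials.items():
    per_inch = dielectric_loss_db_per_inch(8.0, dk, df)
    print(f"{name}: {per_inch:.2f} dB/inch, "
          f"{10 * per_inch:.1f} dB over 10 inches at 8 GHz")
```

The field solver in your tool does this properly, frequency point by frequency point. The estimate is only useful for anticipating the order of magnitude you expect to see.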


Thinking about loss, in my opinion the science that hasn’t quite settled down is copper roughness (problem #2). It can be challenging to wade through the various algorithms and variables, and they differ from tool to tool. As the factors can influence loss on a scale similar to Df, I expect roughness (or is it really lamination chemistry?) breakthroughs and standardization in the next 10 years. So watch this space.


The next step is to connect all the relevant elements together. As you will want to build and verify the channel iteratively, don’t worry if some elements are missing. Just put a short or a very small resistor in place of the missing element and add the real model later. TDR and TDT simulations are a great way to verify that individual elements and the larger channel are modeled correctly. As you add and test more elements, continue to practice Bogatin’s Rule #9: “Never perform a measurement or simulation without first anticipating the results you expect to see.” If an element is short, it should have a short TDT delay. If you expect an element to have a low impedance, TDR will verify that. And so on.
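
Anticipating a TDR/TDT result can be as simple as two textbook formulas: delay = length * sqrt(Dk_eff) / c, and reflection coefficient = (Z - Z0) / (Z + Z0). A minimal sketch follows. The element values are hypothetical, and the effective Dk is an assumption you would take from your stackup.

```python
import math

C_IN_PER_NS = 11.803  # speed of light, inches per nanosecond

def expected_tdt_delay_ps(length_in, dk_eff):
    """One-way propagation delay of a line of the given length."""
    return length_in * math.sqrt(dk_eff) / C_IN_PER_NS * 1000.0

def reflection_coefficient(z_element, z_ref=50.0):
    """Fraction of the incident step reflected at a step from z_ref to z_element."""
    return (z_element - z_ref) / (z_element + z_ref)

# Hypothetical elements: a 5 inch trace and a low-impedance via,
# both checked against a 100 ohm differential reference.
print(f"5 in trace, Dk_eff 3.7: expect ~{expected_tdt_delay_ps(5.0, 3.7):.0f} ps of TDT delay")
print(f"85 ohm via in a 100 ohm system: expect rho ~ {reflection_coefficient(85.0, 100.0):+.3f}")
```

A negative reflection coefficient should show up as a dip in the TDR trace. If the simulated delay or dip differs badly from the estimate, suspect the model before you suspect the physics.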


While most structures can be modeled within your pre-layout canvas, some – such as connectors – will arrive as S-Parameters from the appropriate vendor. Even though today’s quality is much better than the early days of link simulation, it is still a good idea to verify S-Parameter models on their own before placing them in your channel. It only takes one bad S-Parameter model to completely destroy your simulation.
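
Two of the simplest sanity checks on an S-Parameter model are passivity (no gain at any frequency) and reciprocity (S12 equals S21). The sketch below applies both to 2-port data that has already been loaded into Python. Reading the Touchstone file is left to your tool, and the sample values are invented for illustration. Real connector models usually have more ports, which needs a general singular-value routine rather than this closed-form 2x2 version.

```python
import math

def max_singular_value(s):
    """Largest singular value of a 2x2 matrix ((s11, s12), (s21, s22))."""
    (s11, s12), (s21, s22) = s
    a = abs(s11) ** 2 + abs(s21) ** 2                     # (S^H S)[0][0]
    d = abs(s12) ** 2 + abs(s22) ** 2                     # (S^H S)[1][1]
    b = s11.conjugate() * s12 + s21.conjugate() * s22     # (S^H S)[0][1]
    lam_max = (a + d) / 2 + math.sqrt(((a - d) / 2) ** 2 + abs(b) ** 2)
    return math.sqrt(lam_max)

def check_two_port(points, tol=1e-6, recip_tol=1e-3):
    """points: list of (freq_hz, S). Returns a list of readable issues."""
    issues = []
    for freq, s in points:
        if max_singular_value(s) > 1.0 + tol:
            issues.append(f"{freq / 1e9:.2f} GHz: not passive")
        if abs(s[0][1] - s[1][0]) > recip_tol:
            issues.append(f"{freq / 1e9:.2f} GHz: not reciprocal")
    return issues

# Invented sample data: the second point has |S21| > 1, the third has S12 != S21.
samples = [
    (1e9, ((0.1 + 0j, 0.8 + 0j), (0.8 + 0j, 0.1 + 0j))),
    (2e9, ((0.1 + 0j, 1.1 + 0j), (1.1 + 0j, 0.1 + 0j))),
    (3e9, ((0.1 + 0j, 0.8 + 0j), (0.7 + 0j, 0.1 + 0j))),
]
for issue in check_two_port(samples):
    print(issue)
```

Checks like these catch gross problems only. A TDR/TDT run on the model by itself is still worth doing before it goes into the channel.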


Once you’ve assembled and verified the relevant pieces, your channel will look something like Figure 1. Figure 1 shows a 3-PCB system implemented with a minimum set of elements. Minimizing elements means we allow vias to get us to a single route layer and connectors to get us to the next PCB, and no more (Step 1). This is the passive channel. What looks like Tx/Rx devices on the ends are really “ideal” elements that tell the simulator where the signal goes in and where it comes out. SiSoft offers Design Kits for the various serial standards to help you quickly get a typical channel implementation placed on the canvas, complete with automated passive and active performance measurements.

Figure 1: Example 3-PCB Passive Channel Model, Pre-layout

If you’re studying channel implementation trade-offs and options, you will have a few channel options drawn and/or various elements to sweep. Sweeping allows you to test design options as well as manufacturing tolerances. Throughout this process, work diligently on the passive channel to minimize discontinuities and manage loss to within acceptable boundaries (Steps 1 and 2). Your company, Serial Specification, and/or Compliance Kit can provide metrics on what is acceptable. At this stage examine both TDR and loss plots to work on discontinuities and loss, respectively. Even though the active device models haven’t been added yet, you can add a lot of value by tuning the passive elements to resolve appropriate performance, materials, and other parameters for the layout process. As you’re working in a simulator rather than hardware, this is the best time to adjust parameters, examine options, and make trade-offs.
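
Your simulator will have its own way of defining sweeps, but the idea is a full-factorial expansion of design options and tolerance corners. A minimal sketch, with hypothetical parameter names and values:

```python
import itertools

# Hypothetical sweep axes: design options plus manufacturing tolerances.
sweep = {
    "length_in": [6.0, 10.0, 14.0],
    "trace_z_ohm": [90.0, 100.0, 110.0],   # nominal +/- 10 %
    "df": [0.004, 0.008],
}

def expand(sweep_axes):
    """Yield one dict per combination of the swept values (full factorial)."""
    names = list(sweep_axes)
    for values in itertools.product(*(sweep_axes[n] for n in names)):
        yield dict(zip(names, values))

cases = list(expand(sweep))
print(len(cases), "cases to simulate")
print("first case:", cases[0])
```

The case count multiplies quickly as axes are added, which is one reason to narrow the passive design space before the active models go in.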


As you focus on the sub-elements in the channel, determine which elements are the most critical and hence present the greatest risk. Regarding trace and via impedance and loss, one of the tensions you’re managing is that traces are easy to simulate correctly yet hard to manufacture precisely while vias are the opposite (i.e., harder to simulate yet easy to manufacture correctly, see page 16). Now is the time to envision a measurement-correlation strategy and plan to verify these items after the PCB is built. Measurements are affordable and you will learn something important. As much as I love simulators, their roots are in measurement. Just do it.


With the passive characteristics of your channel modeled, understood and optimized, it’s time to simulate the active channel.


Simulating the Active Channel

Simulating the active channel requires the addition of SerDes models at each end of your passive model. These models add active Tx/Rx behaviors and equalization, and are provided by your component vendor in IBIS-AMI format. In (increasingly rare) cases when an AMI model is not available, you can build one from the component’s datasheet using your simulation tool’s AMI template model. If you’re designing a system that connects to a wide range of devices compliant with a serial interface standard, you might want to use an AMI that models the specification’s limits. This type of model is found in that standard’s Compliance Kit. Whatever the case, the excuse of “we couldn’t get a model” is no longer valid. Indeed, there are almost too many options.


There are two types of active simulation: Statistical and Time Domain. These are sometimes referred to as “Init” and “Getwave” respectively, after the names of their associated AMI algorithms. There are a variety of opinions regarding which option is best or most important (slides 23-33). Generally accepted practice is to deploy (the faster) Statistical analysis to explore and narrow the design space, and then switch to (the more time-consuming) Time Domain analysis to examine the link with finer detail. In practice, many models support only one option so you may not have a choice. See slides 9 through 21 to better understand the model types, how they interact with each other, and what becomes inaccurate as you combine them in different ways.
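
To illustrate why Statistical analysis is fast: it works from the channel’s pulse response rather than from a long bit-by-bit waveform. The simplest textbook version of that idea is peak distortion analysis, sketched below. This is a simplified illustration and not the algorithm inside any particular AMI model or simulator, which typically build full probability distributions. The pulse samples are invented, one per unit interval, and represent the response to an isolated 1 bit among 0s.

```python
def worst_case_eye_height(pulse_samples, cursor_index):
    """Peak distortion analysis: worst-case eye height at the sampling
    instant is the cursor minus the sum of all ISI magnitudes.
    A result <= 0 means some bit pattern closes the eye."""
    cursor = pulse_samples[cursor_index]
    isi = sum(abs(v) for i, v in enumerate(pulse_samples) if i != cursor_index)
    return cursor - isi

# Invented baud-spaced pulse samples in volts:
# one pre-cursor, the cursor, then three post-cursors.
pulse = [0.02, 0.45, 0.20, 0.10, 0.05]
print(f"worst-case eye height: {worst_case_eye_height(pulse, 1) * 1000:.0f} mV")
```

Because the answer comes from a handful of samples instead of millions of simulated bits, this style of analysis suits exploring the design space, with Time Domain runs reserved for the finalists.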


Applying equalization in your Tx and/or Rx will radically change your channel’s performance. As such, start out with equalization disabled so you can learn how the analog portions of the SerDes affect the channel. When you enable equalization, you will soon discover too many options – many of which are redundant – so start with each component’s default settings. You will then learn the default settings are far from optimal, so don’t be afraid to experiment with changing the SerDes Equalization Settings (SES). Looking at the un-equalized and equalized pulse responses can be a good place to start (slide 24, and this paper). The good news is that, assuming you did a decent job optimizing your passive channel, you can continue to improve the equalization settings and hence performance after the layout is done. Indeed, even after the hardware is built.
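
Comparing un-equalized and equalized pulse responses can be sketched in a few lines. Because the channel is treated as linear, a Tx feed-forward equalizer (FFE) can be modeled by convolving baud-spaced pulse samples with the tap weights. The pulse samples and the two-tap setting below are invented for illustration. Real SerDes expose their own tap ranges and encodings.

```python
def apply_ffe(pulse_samples, taps):
    """Convolve baud-spaced pulse samples with FFE tap weights."""
    out = [0.0] * (len(pulse_samples) + len(taps) - 1)
    for i, p in enumerate(pulse_samples):
        for j, t in enumerate(taps):
            out[i + j] += p * t
    return out

def isi_ratio(samples):
    """Sum of |ISI| divided by the cursor (largest sample); lower is better."""
    cursor_index = max(range(len(samples)), key=lambda i: abs(samples[i]))
    cursor = abs(samples[cursor_index])
    isi = sum(abs(v) for i, v in enumerate(samples) if i != cursor_index)
    return isi / cursor

raw = [0.02, 0.45, 0.20, 0.10, 0.05]   # invented un-equalized pulse samples (V)
taps = [0.8, -0.2]                     # assumed main tap and one post-cursor tap
equalized = apply_ffe(raw, taps)

print(f"ISI/cursor without EQ: {isi_ratio(raw):.2f}")
print(f"ISI/cursor with EQ:    {isi_ratio(equalized):.2f}")
```

Note the trade the equalizer makes in this sketch: the cursor gets smaller, but the post-cursor ISI shrinks faster. That is the kind of effect worth looking for in the pulse response before sweeping settings blindly.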


When performing active simulation, your performance target will typically be an eye opening. An open eye tells us the signal is tall enough and wide enough for the Rx to observe the correct logic level. Acceptable height and width values are specified by either the serial standard or the Rx device itself. They might be expressed as a keep-out “mask” your eye should never touch at an acceptable probability. As there is often little or no eye to measure at the input to your Rx, increasingly eye shape is examined at the output of the Rx’s equalization. While that location is deep in the bowels of the IC, it is easily accessed by the model and the simulator. Indeed, it is probably the measurement point assumed.


While I spent my first 10 years of serial simulation looking at eye diagrams, I rarely see them anymore – except to cross-check that simulation is working correctly. Instead, eye height and width are extracted by the simulator and placed in plots similar to those shown in Figure 2. As you simulate the myriad of options such as length, route layers, materials, and equalization it becomes impossible to examine eye diagrams one at a time, so using plots like these becomes imperative. In the plots in Figure 2, each dot represents one simulated eye; height on the Y axis versus width on the X axis. The eyes in the lower left quadrant are bad or “closed” while eyes in the upper right quadrant are good or “open”. As such, trends revealing what is working and what isn’t are easily filtered and identified.


Figure 2: Eye Height vs Width Plots, with BER


While minimum eye heights and widths have served as a decent performance measure, I suspect we may move away from them over time. Thinking about the way an Rx works, it actually responds (i.e., “switches”) based on energy. In other words, if you apply a significant voltage quickly it will switch. Conversely, it will also respond to a lower voltage if you leave it there long enough. In a DDRx context, this effect is captured by the so-called “slew-rate derating tables”. In practice, necessary eye height and width are not static quantities but can be traded; meaning more voltage allows you less width and vice versa. Figure 2 provides an example of how this works.


The black box in the left-most plot in Figure 2 delineates the keep-out zone to achieve a minimum eye height of 160 mV and width of 60 ps. Must all my channels satisfy both these values to work correctly? For the system shown in Figure 2 the answer was “no” because the figure of merit was Bit Error Ratio (BER), delineated by the black diagonal line that slices off part of the box. Solving to a BER allows you to trade height and width, and is a better indicator of system performance. I believe the black diagonal trade-off exists in most serial links, but unfortunately not all models and simulators provide reliable BER simulation.
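
The difference between the two pass/fail criteria can be sketched as follows. The box limits reuse the 160 and 60 figures from the text. The diagonal line’s two endpoints are invented for illustration, since the real line comes from BER simulation of a specific link, and width is expressed in whatever time unit the plot uses.

```python
def passes_box(height_mv, width, min_height_mv=160.0, min_width=60.0):
    """Static criterion: both minimums must be met independently."""
    return height_mv >= min_height_mv and width >= min_width

def passes_tradeoff_line(height_mv, width, p1=(40.0, 200.0), p2=(80.0, 120.0)):
    """Trade-off criterion: the (width, height) point must lie on or above
    the line through p1 and p2, each given as (width, height)."""
    (w1, h1), (w2, h2) = p1, p2
    slope = (h2 - h1) / (w2 - w1)
    return height_mv >= h1 + slope * (width - w1)

# Hypothetical simulated eyes as (height_mv, width) pairs.
for height, width in [(200.0, 70.0), (190.0, 50.0), (150.0, 70.0), (170.0, 45.0)]:
    print(f"height {height:.0f} mV, width {width:.0f}: "
          f"box={'pass' if passes_box(height, width) else 'fail'}, "
          f"line={'pass' if passes_tradeoff_line(height, width) else 'fail'}")
```

In this sketch the tall-but-narrow and wide-but-short eyes fail the box yet pass the line, which is the trade the diagonal in Figure 2 represents.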


The plots in Figure 2 show a two-step process of improving BERs using design and equalization corrections further detailed in this paper. The important point is that designing to an eye height/width “box” would have been difficult to achieve and would have amounted to over-design. In practice, sometimes the same technology is used for higher performance through only a change in methodology. And that is the case here. For many serial links, this trade-off is waiting to be more fully exploited.


Active simulation is the best way to understand and tune your link, bringing you one step closer to how it will operate after hardware is built.


In Conclusion

Serial links are now all over your PCBs and systems, and as such have become the primary context for PCB simulation. In discussing the simulation process, I’ve chosen to highlight the items most relevant given the current state of technology. Additional information is available in the links provided.

High-speed serial transmission is a technology so robust it has already scaled three orders of magnitude. Serial link simulation is now so accessible that I believe more engineers and projects should try it out. In doing so, you will find methods to add margin and/or performance to your system – many of which can only be discovered using simulation. Why not start today?

Donald Telian, SiGuys - Guest Blogger 4/15/2019
