Author: Donald Telian, SI Guys - Guest Blogger
This is the final installment in our “7 Steps” series. If you’re just now tuning in, Part 1 detailed the primary problems we need to solve when implementing serial links: loss and discontinuities. In Part 2 we examined the pragmatic side of how those problems are handled during the layout process. Now in Part 3 we focus on issues that arise in manufacturing as well as opportunities related to software configuration – both important topics to address before we wrap up this series on the 7 Steps.
To review, here are the 7 Steps to Successful Serial Link Layout:
- Minimize Discontinuities
- Manage Loss
- Route Using Best Practices
- Route Using Double-digit Data Rate Best Practices
- Remove Unacceptable Stubs
- Prevent Problems Related to Fabrication
- Engage Firmware Team to Configure SerDes
Ideally, these steps would be accompanied by detailed simulation done in parallel, both pre-layout and post-layout. And yet, in practice, this is not always what happens. As such, these steps are designed to guide any hardware team successfully through link implementation – with or without simulation. And yes, the steps you’re about to examine were largely “learned the hard way”. So fasten your seat belt, because the ride gets wilder as we proceed through the remaining 2 steps.
Step 6: Prevent Problems Related to Fabrication
PCB fabrication and manufacturing bring a myriad of issues that – by definition – you will never catch during design and analysis of your link implementation. These items must be learned by experience. As such, this step may be the most valuable.
Fiberglass weave effects of various dielectric cloths are problematic and get most of the attention in discussions about fabrication issues. Thankfully, this issue has been decently investigated, understood and documented. Basically, some of the dielectric (non-conducting) layers of your PCB might have a loose weave causing the medium surrounding your signal to be non-homogeneous (Figure 1, at left). In practice, this may cause one side of your diff-pair to see a different dielectric constant than the other side and hence different propagation times (i.e., “p/n skew” in time but not length). This is problematic, and sometimes called “mode conversion” because a portion of the intended “differential” (opposite) nature of your signal is being converted to a “common” (similar) signal. Here’s an example (page 13) of a diff-pair length-matched to within 1 mil that fabricated as though it were mismatched by 100 mils. Why? One signal fabricated over a strand in the fiberglass cloth that had a different dielectric constant. I’ve seen 200 mils of skew make a 10 Gbps channel fail, while 100 mils makes it marginal. So pay attention to weave-related issues.
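To put those skew numbers in perspective, here is a rough sketch that converts an equivalent length mismatch into time and into a fraction of a unit interval (UI). The effective dielectric constant of 4.0 is an assumption chosen for illustration (your stackup will differ), and the function names are hypothetical.

```python
import math

# Speed of light in vacuum: ~11.8 inches/ns, which is ~11.8 mils/ps.
C_MILS_PER_PS = 11.803

def skew_ps(equiv_length_mils, dk_eff):
    """Time skew (ps) for an equivalent length mismatch in a medium with effective Dk."""
    return equiv_length_mils * math.sqrt(dk_eff) / C_MILS_PER_PS

def skew_fraction_of_ui(equiv_length_mils, dk_eff, data_rate_gbps):
    """Skew expressed as a fraction of one unit interval at the given data rate."""
    ui_ps = 1000.0 / data_rate_gbps  # 10 Gbps -> 100 ps per bit
    return skew_ps(equiv_length_mils, dk_eff) / ui_ps

DK_EFF = 4.0  # assumed effective dielectric constant (illustrative only)

for mils in (100, 200):
    t = skew_ps(mils, DK_EFF)
    frac = skew_fraction_of_ui(mils, DK_EFF, 10.0)
    print(f"{mils} mils -> {t:.1f} ps, {frac:.0%} of a 10 Gbps UI")
```

Under that assumption, 100 mils works out to roughly 17 ps and 200 mils to roughly 34 ps, or about one-sixth and one-third of a 100 ps UI. That is in line with “marginal” versus “fail”.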
Weave problems are handled in two ways: diagonal routing or dielectric selection (in other words: time or materials). To succeed with the “diagonal routing” solution, you must understand how your PCB will be oriented on its fabrication panel in relation to the weave orientation and then route diagonally so that your signals will never run parallel with the weave. Sounds complicated? It can be, so many prefer the “dielectric selection” option. The good news here is that newer “spread glass” and/or tighter-weave materials can be chosen to minimize the problem. Ask your fab vendor for details. Others endeavor to avoid the problem by ensuring their diff-pair spacing differs from the weave spacing. But that approach can backfire because of the fabrication issue explained next.
I still remember the day we sliced open a PCB to measure fabricated signal dimensions in an effort to understand some characterization data we couldn’t explain. We had already re-measured the signals electrically, thinking that we were seeing problems with our measurement equipment or process. We were wrong. Oddly enough, photomicrography of the PCB’s cross-section revealed different fabricated trace dimensions than those we had placed into the layout files. How did that happen? Looking into the issue further, we discovered that specifying “controlled impedance” PCBs often causes your fabrication vendor to change (re-image) your trace widths according to mapping tables tuned for their materials and process.
PCB vendor re-imaging can introduce loss and discontinuities you didn’t expect. For example, in layout you may adjust your uncoupled breakout trace widths to compensate for the impedance change seen when the traces come together (couple). While this is reasonable, re-imaging might make the discontinuity at the gather point worse instead of better as shown in this example (page 23). Re-imaging can also affect loss because trace width is directly related to conductor loss, as illustrated in Step 2. So understand that what you see in your layout system is not always what you get in fabrication, and discuss re-imaging with your fab vendor to comprehend how it might affect your design.
One of the largest PCB fabrication tolerances is laminated pre-preg thickness relative to its design spec. If you look at the dielectric layers in most PCB stackups you will notice they alternate between “core” and “pre-preg”. Core layers are solid, while pre-preg layers are “pre-impregnated” (hence the name) with a resin bonding agent which might change in thickness when the laminated layers are glued together. As such, the manufactured thickness of pre-preg varies much more than that of core. Figure 1 shows 24 pre-preg thicknesses across different PCBs (green, purple) versus their design spec (red), measured precisely using cross-sectioning and photomicrography.
Figure 1: Pre-preg Thickness Variation, measured using photomicrography
The data in Figure 1 shows an average pre-preg variation of 0.5 mils with a standard deviation of 0.5 mils, hence a 1+ mil variation in pre-preg thickness is often observed. This variation may cause a 20% variation in manufactured microstrip trace impedance – particularly when 3 to 4 mils of pre-preg is used to separate traces from a reference plane. This presents a substantial discontinuity, so microstrip traces should be avoided or used very carefully. Refer to Table 3 on page 16 for measured data on these and other manufactured PCB variables and tolerances you should be aware of.
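To see how a 1 mil change in pre-preg thickness moves microstrip impedance, here is a back-of-the-envelope sketch. It uses the widely quoted IPC-2141 closed-form approximation for surface microstrip, not a field solver and not the method behind the measured data above. The trace width, copper thickness and dielectric constant are hypothetical values picked for illustration.

```python
import math

def microstrip_z0(h_mils, w_mils, t_mils, er):
    """Approximate surface-microstrip impedance (ohms), IPC-2141 closed form.

    h = dielectric height to the reference plane, w = trace width,
    t = trace thickness. Reasonable only for roughly 0.1 < w/h < 2.0.
    """
    return 87.0 / math.sqrt(er + 1.41) * math.log(5.98 * h_mils / (0.8 * w_mils + t_mils))

# Hypothetical geometry (assumptions, not values from the article)
W_MILS, T_MILS, ER = 5.0, 1.4, 4.0

z_nom = microstrip_z0(3.0, W_MILS, T_MILS, ER)    # pre-preg at its 3 mil spec
z_thick = microstrip_z0(4.0, W_MILS, T_MILS, ER)  # pre-preg fabricated 1 mil thick
change = (z_thick - z_nom) / z_nom

print(f"3 mil pre-preg: {z_nom:.1f} ohms")
print(f"4 mil pre-preg: {z_thick:.1f} ohms")
print(f"change: {change:.0%}")
```

With these assumed dimensions, the approximation gives an impedance shift on the order of 20%, the same ballpark as the variation described above. A stripline, referenced to planes on both sides, is typically less sensitive to a single dielectric layer.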
Perhaps you’ve encountered some surprises of your own in the process of manufacturing PCBs? If so, please add them in the Comments section below so we all can learn.
Step 7: Engage Firmware Team to Configure SerDes
At this point, if you’ve done everything in Steps 1 to 6 perfectly you have substantially handled the hardware side of your serial link implementation. Perhaps the most surprising thing about serial links is that, because their performance is increasingly dictated by equalization, their software configuration might have a larger impact on signal quality than their hardware implementation. Don’t believe me? Have a look at my previous post on Fixing Signal Integrity Issues in Software.
SerDes Equalization Settings (SES) in both your Tx and Rx must be configured correctly for your serial link to perform well. As such, spend time investigating how the SES are being set and to what values. Sometimes the system derives them at boot-up during link training, or perhaps they might be hardcoded by the IP core instantiated in your FPGA. More often than not, they are simply the “default” values selected in a hard-to-find software routine that has no idea of your hardware implementation. This typically means the SES are programmed to handle a pathological worst-case hardware implementation, causing your system to be (often severely) over-equalized. I use the word “severely” because too much of a good thing (equalization) can make a short link fail. As such, I’ve fixed a number of failing links by simply backing off the equalization in software. For this reason, using SES defaults should be a deliberate choice you make after determining what they are and confirming they make sense in your system design.
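To illustrate why a worst-case default can hurt a short link, here is a toy sketch. It applies a 2-tap Tx FIR (main cursor plus one post-cursor tap) to a pulse response sampled once per UI, then estimates the worst-case inner eye as the main cursor minus the sum of all remaining ISI (a simple peak-distortion estimate). The pulse responses and tap values are made up for illustration; they do not come from any real SerDes or channel.

```python
def convolve(a, b):
    """Plain discrete convolution of two short sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def eye_height(pulse, post_tap):
    """Worst-case inner eye after a 2-tap Tx FIR, normalized to Tx swing.

    The taps are [1 - post_tap, -post_tap], so total Tx swing stays fixed
    and any de-emphasis comes out of the main cursor.
    """
    taps = [1.0 - post_tap, -post_tap]
    eq = convolve(pulse, taps)
    main = max(abs(v) for v in eq)          # main cursor
    isi = sum(abs(v) for v in eq) - main    # everything else is ISI
    return main - isi

# Hypothetical pulse responses, one sample per UI
SHORT_PULSE = [0.80, 0.10, 0.03]        # short, low-loss channel
LONG_PULSE = [0.40, 0.20, 0.10, 0.05]   # long, lossy channel

for name, pulse in (("short", SHORT_PULSE), ("long", LONG_PULSE)):
    for post in (0.0, 0.1, 0.3):
        print(f"{name} channel, post tap {post:.1f}: eye = {eye_height(pulse, post):.3f}")
```

In this toy model the heavy setting (0.3) opens the eye on the long channel but closes it considerably on the short one, where a light setting does better. Real links add Rx equalization, jitter and crosstalk, so treat this only as intuition for why the defaults deserve a second look.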
Unfortunately, accessing the SES can be difficult. If you’re working at a medium- to large-size company, the engineer with access to those registers might be in another building, in another state, or even on the other side of the planet. Likely your Project Manager can connect you with them, but don’t be surprised if they’re busy with other things. They have hundreds of other bits to set and routines to juggle. In their world the SES registers are among the most obscure and mysterious, so it seems best to leave them at the default values chosen by the component vendor. The hardware/software divide is very real – a fact I’ve never felt more acutely than when attempting to configure the SES. So plan extra time for the first part of this Step: “Engage the Firmware Team…”. If you’re working at a smaller company, the good news is this task is likely simpler. In fact, you may even be the person responsible for both the hardware and the software.
Interested in learning more about the power available in your SES? A couple of years back I collaborated with SiSoft engineers on a methodology for co-optimizing serial link hardware and software. Our findings were captured in a DesignCon presentation and a Webinar that explain how to optimally balance EQ at both ends of a link (i.e., in the Tx and Rx). For example, a Tx injects amplitude into the system and most EQ removes amplitude, so it may be better to leave most of the signal recovery to the Rx. This principle is often ignored, and it’s not uncommon for an Rx to have to undo excessive equalization applied at the Tx – thus wasting power at both ends. A poll during the Webinar revealed only 28% of the engineers attending think about minimizing Tx equalization to preserve amplitude. This is not surprising because not too long ago equalization was only in the Tx, while today the majority of EQ capability exists in the Rx.
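The amplitude cost of Tx equalization is easy to estimate. In a 3-tap Tx FIR whose tap magnitudes sum to 1 (a common normalization when total swing is fixed), the level reached during a long run of identical bits is simply the sum of the signed taps. The sketch below assumes that normalization; the tap values are hypothetical and not taken from any specific device or preset.

```python
import math

def low_freq_amplitude(taps):
    """Steady-state (long run of identical bits) level as a fraction of full swing.

    taps = [pre, main, post] as signed values; pre and post are usually negative.
    """
    full_swing = sum(abs(c) for c in taps)
    return sum(taps) / full_swing

def deemphasis_db(taps):
    """Low-frequency level relative to full swing, in dB."""
    return 20.0 * math.log10(low_freq_amplitude(taps))

NO_EQ = [0.0, 1.0, 0.0]       # all swing in the main cursor
HEAVY_EQ = [-0.1, 0.7, -0.2]  # hypothetical heavy pre/post-cursor setting

for name, taps in (("no EQ", NO_EQ), ("heavy EQ", HEAVY_EQ)):
    print(f"{name}: {low_freq_amplitude(taps):.0%} of full swing, "
          f"{deemphasis_db(taps):.1f} dB")
```

With the hypothetical heavy setting, the low-frequency content leaves the Tx at about 40% of full swing, roughly 8 dB down, before the channel has attenuated anything. That is amplitude the Rx never gets to work with.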
Those who engage in optimizing Tx and Rx equalization will discover opportunities for significant performance improvement. For example, we demonstrated an average 188% improvement in eye heights (slides 11-14) in channels where performance is dictated more by discontinuities than loss. As serial links have become prevalent all over your PCB, channels have become shorter, making discontinuities rather than loss their dominant characteristic. Figure 2 shows that optimizing the SES (green) on these shorter channels (black, to Channel #24) achieves 134% improvement in eye height (slide 18) when compared with PCIe SES preset P7 (blue). Applying the techniques to a system design, we found we could improve performance by more than 60% amidst links required to become 25% longer (pages 3-5). So don’t assume something can’t be done until you’ve examined your SES options.
Figure 2: Optimizing SerDes Equalization Settings (SES) for Eye Height Improvement
Optimizing Tx and Rx equalization is so powerful that my intuition tells me it could be the backbone of a future 2x-4x data rate increase. Note that the technique is just as impactful for newer signaling techniques such as PAM4, where it shows a 3x to 5x improvement in eye heights (slide 19). Indeed, optimizing the SES might be the solution you’re looking for – and you can even use the SES to fix a failing link.
These final 2 Steps examined how manufacturing and software can affect the performance of your serial link implementation, forcing us to journey into territory that is not always intuitive. While the manufacturing issues might be referred to as “gotchas”, perhaps we can view the software opportunities as “getchas” – as they can likely getcha a significantly better eye opening. That journey may lead you back into the realm of simulation, though many succeed by tuning link configuration in the lab after hardware is built.
So thanks for walking with us through the 7 Steps. The goal has been to provide a process you can use that helps you succeed with your serial link implementations. As you move through the process on your designs, don’t forget to share your questions and findings in the Comments section below.