# A 36Gb/s ACCI Multi-Channel Bus using a Fully Differential Pulse Receiver

Lei Luo<sup>\*</sup>, John Wilson, Stephen Mick, Jian Xu, Liang Zhang, Evan Erickson and Paul Franzon

Department of Electrical and Computer Engineering, North Carolina State University, Box 7914, Raleigh, NC 27695 USA

\*Currently with Rambus Inc., Chapel Hill, NC 27514

E-mail: lluo@rambus.com, {jmwilson, semick, jxu6, lzhang3, elericks, paulf}@ncsu.edu

Abstract-A new differential pulse receiver is demonstrated for AC Coupled Interconnect (ACCI), which enables the highest data rate, at 6Gb/s/channel (36Gb/s aggregate), for capacitively coupled systems using pulse signaling. The system works across FR4 printed circuit board (PCB) interconnect lengths of up to 30cm with coupling capacitors from 95fF to 165fF, while dissipating only 1.97mW/Gbps for the entire differential transceiver (0.83pJ/bit for the transmitter & 1.23pJ/bit for the receiver).

## I. INTRODUCTION

The increasing need for high-bandwidth, low-power chip-to-chip communication demands that alternative signaling methods be investigated. Capacitively coupled I/O with pulse signaling was proposed as one solution to this problem and has been investigated by numerous groups [1-6]. The coupling capacitor can be an on-chip metal-insulator-metal (MIM) capacitor, as reported in this paper, or an inter-chip/package capacitor, as reported in [7]. Capacitive coupling has also been used in 3D-IC stacks for inter-chip communication [2-3]. The signaling methods are identical whenever a small series capacitor is used; however, on-chip MIM capacitors avoid the challenge of creating a capacitor between chip and package.

Board-level capacitively coupled interconnect [1,5,6,8] and stacked ICs [2-3] have been reported as alternatives to physical pins and solder bumps for high-density, low-power chip-to-chip communication. Although transceiver designs for capacitive coupling have been reported extensively, the data rate of capacitively coupled links is still lower than that of traditional high-speed serial links, and the potential crosstalk associated with return-to-zero (RZ) pulse signaling has not been reported. An evaluation of the crosstalk for a system bus using pulse signaling shows that performance comparable to NRZ signaling can be obtained with proper inter-differential-pair spacing and the use of a fully differential receiver. There are two significant contributions reported in this work. First, we present a fully differential pulse receiver, which has better common-mode noise margin and reduced power dissipation. Second, a 6-bit wide differential ACCI bus achieves an aggregate bandwidth of 36Gb/s for chip-to-chip communication, while subject to crosstalk and switching noise from the simultaneous operation of six channels, across a range of transmission line lengths and coupling capacitor sizes.

Fig. 1 shows the schematic view of an ACCI link and the waveforms at the TX output, T-line input, T-line output, RX input and RX output. An ACCI link has a band-pass response and converts NRZ data into RZ pulses. Smaller coupling capacitors (CC) are preferred, because they give higher density AC I/O and improved immunity to ESD (electro-static discharge) events. High data rates and longer T-lines are preferred to accommodate more applications. The key challenge is to design a high-speed pulse receiver that recovers NRZ data from the low swing RZ pulses in a noisy bus environment. The results presented in this paper used single-side termination on the RX side only, as shown in Fig. 1. Depending on the channel characteristics (e.g. discontinuities, stubs, connectors, vias), it may be necessary to use termination at both ends.



Fig. 1: (a) ACCI circuit view and (b) Simulated waveforms
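The NRZ-to-RZ conversion described above can be illustrated with a minimal behavioral sketch. It assumes a first-order model in which the coupling capacitor drives a purely resistive 50Ω load, a 150fF capacitor, a 1.8V transmitter swing, and a 50ps edge; the edge time, the bit pattern, and the function names are illustrative assumptions, and T-line loss and pad parasitics are ignored.

```python
def nrz_waveform(bits, bit_time, rise_time, dt, swing=1.8):
    """Slew-limited NRZ waveform: full-swing levels with a finite edge rate."""
    samples_per_bit = round(bit_time / dt)
    max_step = swing * dt / rise_time          # largest voltage change per sample
    level = swing if bits[0] else 0.0
    wave = []
    for bit in bits:
        target = swing if bit else 0.0
        for _ in range(samples_per_bit):
            step = max(-max_step, min(max_step, target - level))
            level += step
            wave.append(level)
    return wave


def series_cap_high_pass(x, tau, dt):
    """Discrete first-order RC high-pass filter with time constant tau."""
    alpha = tau / (tau + dt)
    y = [0.0]
    for n in range(1, len(x)):
        y.append(alpha * (y[-1] + x[n] - x[n - 1]))
    return y


BIT_TIME = 1 / 6e9            # 6Gb/s per channel
DT = 1e-12                    # 1ps simulation step
TAU = 50.0 * 150e-15          # assumed R * CC = 7.5ps

bits = [0, 1, 1, 0, 1, 0, 0, 1]                      # illustrative pattern
nrz = nrz_waveform(bits, BIT_TIME, rise_time=50e-12, dt=DT)
pulses = series_cap_high_pass(nrz, TAU, DT)
samples_per_bit = round(BIT_TIME / DT)

print("largest positive pulse (V):", max(pulses))
print("largest negative pulse (V):", min(pulses))
```

In this model a rising NRZ edge produces a positive pulse, a falling edge produces a negative pulse, and the output returns to zero within the bit period when no transition occurs, which is the bipolar RZ behavior the link relies on.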

## II. TRADITIONAL CAPACITIVE COUPLING VS. ACCI

AC coupling is used in many high-speed I/O applications. However, the coupling capacitor in those applications is significantly larger than that used in ACCI systems (>1000x). This large capacitance, when combined with an encoded data stream (e.g. 8B/10B), is used to create signal behavior that is equivalent to DC coupling. The main issue is to minimize the DC wander that arises from the high-pass filtering of the series capacitor. In ACCI, a small series capacitor is used to convert a full swing NRZ data stream into a bipolar pulse data stream [5-8]. This is accomplished by signaling through a series capacitor that is small enough to create a pulse with a duration that is less than or equal to the period of the maximum signaling rate. The edge rate of the transmitter's output stage, combined with the value of the series coupling capacitors and the impedance of the transmission line (T-line), determines the amplitude and duration of the transmitted pulse [5]. Noise immunity and receiver sensitivity, in turn, impose a minimum on the amplitude and duration of the received pulse, which defines the noise margins for pulse signaling [8]. These criteria set a lower bound on the coupling capacitor size and transmitter drive strength for a range of transmission line lengths [5,7].
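As a rough check of the pulse-duration condition, the sketch below estimates the time constant and high-pass corner frequency of the coupling capacitor, assuming a first-order RC model with a 50Ω resistive load. The load value and the function names are assumptions for illustration; the real pulse shape also depends on the transmitter edge rate and on parasitics.

```python
import math


def coupling_time_constant(c_farads, r_ohms=50.0):
    """First-order time constant tau = R * CC of the series capacitor and its load."""
    return r_ohms * c_farads


def corner_frequency(c_farads, r_ohms=50.0):
    """High-pass corner frequency f_c = 1 / (2 * pi * R * CC)."""
    return 1.0 / (2.0 * math.pi * r_ohms * c_farads)


BIT_PERIOD = 1 / 6e9  # bit period at 6Gb/s per channel

for cc in (95e-15, 165e-15):  # capacitor range reported for the working link
    tau = coupling_time_constant(cc)
    fc = corner_frequency(cc)
    print(f"CC = {cc * 1e15:.0f} fF: tau = {tau * 1e12:.2f} ps, "
          f"f_c = {fc / 1e9:.1f} GHz, tau / bit period = {tau / BIT_PERIOD:.3f}")
```

Under these assumptions the time constant is only a few percent of the 6Gb/s bit period, so the pulse width is set mainly by the transmitter edge rate rather than by the RC decay.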

Traditional high-speed I/Os communicate across lossy, low impedance transmission lines on printed circuit boards. Typically, these systems consume significant power and require complex active equalization schemes to compensate for the frequency dependent losses of the transmission line. In ACCI, the small series capacitor provides DC isolation and presents a high impedance that isolates the low impedance transmission line from the transmitter's output stage. This enables the use of a simple transmitter circuit (e.g. cascaded inverters) that does not consume static power and does not have to be matched to the low impedance of the transmission line. This isolation has a high-pass frequency response that also helps to significantly reduce the impact of ESD events, because the majority of the energy in the discharge event is below the corner frequency presented by the series coupling capacitor. The use of a simple transmitter circuit and the elimination of ESD protection circuitry both greatly decrease the die area required for ACCI systems, thereby reducing cost.

The small series capacitors also provide another significant benefit. In multi-Gb/s I/O systems, the transmission lines have a low-pass frequency response due to skin effect and dielectric loss. The resulting dispersion "smears" the signals in neighboring bit periods together, which is referred to as intersymbol interference (ISI). Typically, active circuitry is used to equalize for these effects, but these circuits consume significant power and die area. ACCI mitigates ISI by using a combination of passive filtering and a pulse receiver. The passive filtering comes from the high-pass frequency response of the small series coupling capacitors, and the pulse receiver restores the received data to full swing NRZ levels. For a detailed discussion of these trade-offs, refer to [5,7].

## III. DIFFERENTIAL CURRENT MODE PULSE RECEIVER

A 120mV<sub>ppd</sub> low swing pulse receiver was demonstrated in [5]. However, that pulse receiver used a complementary input stage, which rejects common-mode noise poorly. The focus of this work is on the operation of an ACCI bus, which needs to work in a more realistic (i.e. noisy) environment. As shown in Fig. 2, the differential pulse receiver presented here includes three stages. The first is a bias generator that sets the RX input DC level, which is otherwise not available because the coupling capacitors block any DC level from the T-line side. The second is a simple source coupled inverter that amplifies the pulse signal. The third is a source coupled latch that recovers the NRZ data from the pulse signal. The receiver used in this demonstration requires no clocking to recover data.
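The clockless recovery performed by the latch stage can be illustrated with a purely behavioral sketch: the output holds its state until a differential pulse of sufficient amplitude sets or resets it. This is not the transistor-level circuit of Fig. 2; the function name, the 60mV threshold, and the sample values are hypothetical and chosen only for illustration.

```python
def recover_nrz(diff_pulses, threshold, initial=0):
    """Behavioral latch: a positive pulse sets the output, a negative pulse resets it.

    diff_pulses: differential input samples in volts
    threshold:   hypothetical switching threshold in volts
    """
    state = initial
    recovered = []
    for v in diff_pulses:
        if v > threshold:
            state = 1          # positive pulse marks a rising NRZ edge
        elif v < -threshold:
            state = 0          # negative pulse marks a falling NRZ edge
        # otherwise the latch holds its previous value (no clock is needed)
        recovered.append(state)
    return recovered


# Illustrative samples: pulses of +/-150mV plus small residual noise.
samples = [0.0, 0.15, 0.02, 0.0, -0.15, -0.01, 0.0, 0.03, 0.15, 0.0]
print(recover_nrz(samples, threshold=0.06))
```

Inputs smaller than the threshold, such as the 30mV and 20mV samples here, leave the latch state unchanged, which is how the hold behavior tolerates noise between pulses in this simplified model.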





Simulated common-mode noise margins for this fully differential pulse receiver and for the complementary pulse receiver reported in [5] are shown in Fig. 3. The worst-case common-mode noise margin is 140mV for the fully differential pulse receiver, compared with 60mV for the complementary pulse receiver [5]. The improved common-mode noise rejection makes this differential pulse receiver more suitable for realistic environments.



Fig. 3: Common-mode noise margin comparison: complementary pulse receiver versus differential pulse receiver

## IV. CROSSTALK IMPACT OF PULSE SIGNALING

RZ pulses have higher bandwidth components than NRZ signals, as shown in Fig. 4(a), so their crosstalk behavior can differ from that of NRZ. To analyze this, a group of coupled microstrip lines is modeled using a 2D field-solver to extract the line-to-line coupling capacitance and inductance. The line width W, the spacing between the two coupled lines of a pair, and the dielectric thickness H are all set to 5mil. The neighboring line spacing, S2, is set to 1\*H, 2\*H and 4\*H (H=5mils), as shown in Fig. 4(b). Spice simulations are performed to compare the crosstalk effect of RZ pulses to that of NRZ signals, as shown in Fig. 4(c). The top two curves show the common-mode (CM) crosstalk noise generated by RZ pulses and by traditional NRZ signals. The bottom two curves show the differential-mode (DM) crosstalk noise generated. From the simulations, the crosstalk noise generated by RZ-pulse signals and by traditional NRZ signals is similar and acceptable as long as the neighboring line spacing is greater than 2\*H (10mils). The receiver can reject most of the common-mode noise, but the differential noise is more critical because it directly reduces the noise margin at the input of the receiver. Fig. 4(c) shows that as long as the neighboring T-lines are spaced more than 2\*H (10mils) apart, the differential crosstalk noise is held to within 3% of the aggressor's swing. To achieve high isolation from both the CM and DM crosstalk generated by RZ-pulse signals, 4\*H (20mils) spacing is needed.



Fig. 4: (a) Comparison of power spectral density for NRZ & RZ pulses, (b) cross-sectional view of coupled microstrip lines, (c) comparison of crosstalk for buses using NRZ & RZ-pulse signaling

## V. MULTI-CHANNEL BUS TEST SETUP

A test chip with twelve TX and RX circuits was fabricated in the TSMC 0.18µm CMOS process. The TX and RX layouts occupy areas of 40µm×20µm and 60µm×60µm, respectively. All the TX, RX, coupling capacitor landing pads, and power supply pads are located around the periphery of the chip. The coupling capacitors vary from 85fF to 175fF to determine the range of valid operation. The inner chip area is for flip chip testing of capacitive coupling links (not reported in this paper) on both an MCM-D and a laminated organic package.

For testing, two test chips were wire-bonded to a four layer PCB. The ACCI link includes a metal-insulator-metal (MIM) capacitor on each chip and the coupled microstrip T-line on an FR4 PCB. Fig. 5 shows one test case with six-bit wide, 30cm long 50Ω T-lines with a pair-to-pair spacing of 20mil (4\*H). Measurements show all six channels operating simultaneously at an aggregate 36Gb/s, while subject to the crosstalk noise and switching noise generated by the RZ signaling.



Fig. 5: Test PCB with six-channel bus & 0.18µm CMOS die photo

## VI. MEASUREMENT RESULTS

Measurements were performed using a PRBS-127 input and demonstrated functionality at speeds of up to 6Gb/s/channel for six channels operating simultaneously, while subject to the switching and crosstalk noise of the adjacent channels. Fig. 5 shows the measured RZ pulse signals (C1, C2) just before the RX side coupling capacitor. It also shows the differential mode (M1) and common mode (M2) of the two pulse signals. The total noise on the signal lines can be seen in Fig. 5(b). Both common-mode components and distinct single ended components are visible. The receiver can reject significant common-mode noise; however, the differential mode noise must be controlled by using adequate line spacing, as discussed in the previous section. Fig. 6(a,b) shows measurements at the receiver output of the recovered NRZ data: transient waveforms and eye diagrams for the differential and single ended signals. The shmoo plot in Fig. 6(c) indicates, with grey and white blocks, the simulated pass and fail areas for a range of CC and T-line values. The blocks marked with "P/F" characters show the actual measured pass/fail values. The measurement results agree with the simulation results, showing two possible failure regimes. One is the ISI limited area, which is reached with large coupling capacitors or a longer T-line. The other is the swing limited area, which is reached with small coupling capacitors or a longer T-line. For a detailed discussion of these trade-offs, refer to [5,7].



Fig. 5: Measured 6Gbps/Channel received pulse data subject to crosstalk and noise at receiver input: (a) system measurement point, (b) measured differential, common-mode, and single ended signals



Fig. 6: Measured 6Gbps/Channel recovered NRZ data at receiver output:
(a) system measurement point, (b) transient waveforms & eye diagrams for PRBS-127 data, (c) simulated & measured shmoo plot

Table I: Performance summary

| Parameter              | Value                   |
|------------------------|-------------------------|
| Supply Voltage         | 1.8V                    |
| Technology             | 0.18µm CMOS Process     |
| Data Rate              | 36 Gb/s (6Gb/s/channel) |
| Interconnect Length    | Up to 30cm on a FR4 PCB |
| Power Dissipation (TX) | 0.83mW/Gb/s/channel     |
| Power Dissipation (RX) | 1.13mW/Gb/s/channel     |
| Active Area (TX & RX)  | 40µm×20µm & 60µm×60µm   |

## VII. BENEFITS OF TECHNOLOGY SCALING

Simulations using a 90nm CMOS technology show operation up to 12.5Gb/s for T-line lengths of 10cm to 20cm using 100fF to 200fF coupling capacitors. The 90nm technology reduces power consumption to 0.56mW/Gb/s for the entire transceiver, which is 28% of the power consumption in the 0.18µm CMOS technology.
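The power figures quoted in this paper can be cross-checked with simple arithmetic, noting that 1mW/Gb/s is the same quantity as 1pJ/bit. The sketch below uses only the totals stated in the text (1.97mW/Gb/s at 6Gb/s in 0.18µm, 0.56mW/Gb/s in 90nm); the variable and function names are illustrative.

```python
def power_mw(energy_mw_per_gbps, rate_gbps):
    """Power in mW from energy efficiency (mW/Gb/s, i.e. pJ/bit) and data rate."""
    return energy_mw_per_gbps * rate_gbps


CHANNELS = 6
RATE_GBPS = 6.0            # per-channel data rate
E_180NM = 1.97             # mW/Gb/s, entire differential transceiver (0.18um)
E_90NM = 0.56              # mW/Gb/s, simulated transceiver (90nm)

aggregate_gbps = CHANNELS * RATE_GBPS          # aggregate bus data rate
per_io_mw = power_mw(E_180NM, RATE_GBPS)       # transceiver power per I/O
scaling_ratio = E_90NM / E_180NM               # 90nm relative to 0.18um

print(f"aggregate data rate: {aggregate_gbps:.0f} Gb/s")
print(f"power per I/O at 6Gb/s: {per_io_mw:.2f} mW")
print(f"90nm / 0.18um energy ratio: {scaling_ratio:.1%}")
```

The per-I/O result rounds to the approximately 12mW per I/O quoted in the conclusion, and the ratio reproduces the 28% figure above.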

## VIII. CONCLUSION

A multi-channel ACCI bus providing chip-to-chip communication at 36Gb/s is demonstrated over a six-bit wide bus, with coupling capacitors as small as 95fF, using a 30cm microstrip line on FR4. The fully differential receiver enables the system to operate while subject to crosstalk and switching noise from the adjacent channels. The total transceiver power dissipation is only 12mW per I/O at 6Gb/s/channel (PRBS-127), or 1.97mW/Gb/s for the differential transceiver (0.83pJ/bit for the transmitter and 1.23pJ/bit for the receiver). This work shows that capacitively coupled links can achieve the same data rates as conductive links with lower power dissipation. It also shows that, with proper line spacing, the crosstalk and switching noise due to RZ signaling are comparable to those of traditional NRZ signaling. Table I summarizes the performance details of the demonstration.

## ACKNOWLEDGEMENT

This work was supported by AFRL contract F29601-03-3-0135, SRC task 1094, and NSF under grant CCR-0219567. The authors would like to thank Steve Lipa for his valuable discussions on this work. We would also like to thank Jay Diepenbrock of IBM's Integrated Supply Chain for the use of IBM's test equipment.

## REFERENCES

- [1] T. Gabara and W. Fischer, "Capacitive Coupling and Quantized Feedback Applied to Conventional CMOS Technology," *IEEE J. Solid-State Circuits*, pp. 419-427, March 1997.
- [2] S. Kühn, et al., "Vertical signal transmission in three-dimensional integrated circuits by capacitive coupling," *ISCAS* '95, pp 37-40, 1995.
- [3] K. Kanda et al., "1.27Gb/s/pin 3mW/pin Wireless Superconnect (WSC) Interface Scheme," *International Solid-State Circuits Conference*, pp. 186-187, February 2003.
- [4] R. Drost, R. Hopkins and I. Sutherland, "Proximity Communication," *IEEE Custom Integrated Circuits Conference*, pp. 469-472, September 2003.
- [5] L. Luo, et al., "3Gb/s AC coupled chip-to-chip Communication using a low swing pulse receiver," *IEEE J. Solid-State Circuits*, pp. 287-296, January 2006.
- [6] J. Kim, J. Choi, C. Kim, F. Chang and I. Verbauwhede, "A Low Power Capacitive Coupled Bus Interface Based on Pulsed Signaling," *IEEE Custom Integrated Circuits Conference*, pp 35-38, October 2004.
- [7] J. Wilson, et al., "Fully Integrated AC Coupled Interconnect using Buried Bumps," *IEEE Electrical Performance of Electronic Packaging*, pp 7-10, Oct. 2005.
- [8] J. Kim, I. Verbauwhede and F. Chang, "A 5.6-mW 1-Gb/s/pair pulsed signaling transceiver for a fully AC coupled bus," *IEEE J. Solid-State Circuits*, Vol. 40, pp. 1331–1340, June 2005.