# SCIENCE CHINA

# Information Sciences

#### • RESEARCH PAPERS •

June 2011 Vol. 54 No. 6: 1293–1299 doi: 10.1007/s11432-011-4218-7

# A low-jitter low-power monolithically integrated optical receiver for SDH STM-16

CHEN YingMei\*, WANG ZhiGong & ZHANG Li

Institute of RF- & OE-ICs, Southeast University, Nanjing 210096, China

Received January 29, 2010; accepted June 9, 2010; published online March 24, 2011

Abstract A highly integrated optical receiver including a preamplifier, a limiting amplifier, a clock and data recovery (CDR) block, and a 1:4 demultiplexer (DEMUX) has been realized in a 0.25  $\mu$ m CMOS technology. Using a loop parameter optimization method and low-jitter circuit design techniques, the rms and peak-to-peak jitter of the recovered 625 MHz clock are 9.4 and 46.3 ps, respectively, which meet the jitter specifications stipulated in ITU-T recommendation G.958. In response to 2.5 Gb/s PRBS input data ( $2^{23}-1$ ), the recovered and frequency-divided 625 MHz clock has a phase noise of -83.8 dBc/Hz at 20 kHz offset, and the 2.5 Gb/s PRBS data is demultiplexed into four 625 Mb/s data streams. The power dissipation is only 0.3 W under a single 3.3 V supply (excluding output buffers).

Keywords optical receiver, jitter, preamplifier, limiting amplifier, clock and data recovery, demultiplexer

Citation Chen Y M, Wang Z G, Zhang L. A low-jitter low-power monolithically integrated optical receiver for SDH STM-16. Sci China Inf Sci, 2011, 54: 1293–1299, doi: 10.1007/s11432-011-4218-7

# 1 Introduction

Optical communication systems, which are used in backbone networks or wide area networks (WANs), are expected to play an important role in realizing the future multimedia society. These systems must be compact, economical to produce, and efficient in terms of power consumption. Given these requirements, researchers have been developing small-size and low-power optical receiver (OR) modules for the regenerators and line terminals of systems such as SDH or SONET.

The CDR is one of the key components of an OR, which must perform retiming, reshaping and regeneration (3R) [1, 2]. A phase-locked loop (PLL) is the preferred approach to implementing a CDR, because the phase of the clock in the decision circuit is automatically synchronized to the center of each bit. Also, the PLL can be integrated into a single integrated circuit (IC), greatly reducing temperature drift and phase relationship problems.

A number of optical receiver ICs integrated with preamplifier, limiter, clock and data recovery (CDR), and demultiplexer (DEMUX) have been realized [3–5]. However, they generally require large chip area and provide poor jitter characteristics due to the integration of different function blocks in the IC.

This paper presents a method of designing a charge-pump phase-locked loop (CPPLL) in which the phase detector both generates the early-late phase logic and demultiplexes the data, acting as a 1:2 demultiplexer. The three functions of clock recovery, data decision and 1:4 demultiplexing are therefore integrated monolithically.

<sup>\*</sup>Corresponding author (email: njcym@seu.edu.cn)

# 2 Low-jitter design technique

The jitter transfer function of the CDR, which shows the input jitter suppression characteristics, can be expressed by substituting the phase-transfer function for the jitter transfer function:

$$H(j\omega) = \frac{\omega_n^2 + j\left(2\xi\omega_n - \frac{\omega_n^2}{K}\right)\omega}{\omega_n^2 - \omega^2 + j2\xi\omega_n\omega}. \tag{1}$$

When the loop filter is of the lag-lead type (the series and shunt resistors are  $R_1$  and  $R_2$ , respectively, and the shunt capacitance is  $C$ ), the natural angular frequency  $\omega_n$  and the damping factor  $\xi$  are expressed as

$$\omega_n = \sqrt{\frac{K}{\tau_1 + \tau_2}}, \quad \xi = \frac{1}{2}\omega_n \left(\tau_2 + \frac{1}{K}\right),\tag{2}$$

where  $\tau_1 = C(R_1 + R_2)$  and  $\tau_2 = CR_2$ .  $\xi \omega_n$  can be expressed as a constant  $Z_W$ . To meet the jitter transfer specification for STM-16 stipulated in ITU-T recommendation G.958,  $Z_W$  has to be lower than 6 MHz.
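As an illustrative sketch (not part of the original design flow), eq. (1) can be evaluated numerically to inspect the jitter peaking and the high-frequency roll-off. The function names and the numeric values of $\omega_n$, $\xi$ and $K$ below are assumptions chosen only for demonstration; they are not the values of the fabricated chip.

```python
import math

def jitter_transfer_db(omega, omega_n, xi, k):
    """Magnitude of eq. (1) in dB at angular frequency omega (rad/s)."""
    num = complex(omega_n**2, (2 * xi * omega_n - omega_n**2 / k) * omega)
    den = complex(omega_n**2 - omega**2, 2 * xi * omega_n * omega)
    return 20 * math.log10(abs(num / den))

def jitter_peaking_db(omega_n, xi, k, points=2000):
    """Largest |H| in dB over a log sweep from 0.01*omega_n to 100*omega_n."""
    best = -math.inf
    for i in range(points + 1):
        omega = omega_n * 10 ** (-2 + 4 * i / points)
        best = max(best, jitter_transfer_db(omega, omega_n, xi, k))
    return best

# Hypothetical loop values, for illustration only.
OMEGA_N = 2 * math.pi * 1.0e6   # natural angular frequency, rad/s
XI = 4.0                        # damping factor
K = 1.0e9                       # open loop gain, 1/s

peak_db = jitter_peaking_db(OMEGA_N, XI, K)
```

Sweeping `xi` in such a script shows how the peaking shrinks as the damping factor grows, which is the trade-off the text uses to bound $\xi$ from below.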

Once  $Z_W$  and the lower limit of  $\xi$  are fixed and  $\tau_2 \gg 1/K$ , the lower limit of  $\tau_2$ , which is one of the loop filter time constants, is also fixed by (3):

$$\tau_2 = 2\xi^2 / Z_W. \tag{3}$$
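For completeness, eq. (3) follows from eq. (2) in a few steps. The short derivation below is a sketch under the stated assumption $\tau_2 \gg 1/K$ and uses only the definition $Z_W = \xi\omega_n$:

$$\xi = \tfrac{1}{2}\,\omega_n\left(\tau_2 + \tfrac{1}{K}\right) \approx \tfrac{1}{2}\,\omega_n \tau_2, \qquad \omega_n = \frac{Z_W}{\xi} \;\Rightarrow\; \xi \approx \frac{Z_W\,\tau_2}{2\xi} \;\Rightarrow\; \tau_2 \approx \frac{2\xi^2}{Z_W}.$$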

Also the pull-in range can be expressed as (see [6])

$$\Omega_p = 2Z_W \sqrt{2\left(\frac{\tau_1}{\tau_2} + 1\right)}. \tag{4}$$

It is recommended that the jitter peaking be lower than 0.1 dB, which sets the lower limit of  $\xi$  at  $\xi > 4$ . The open loop gain  $K$ , which includes the product of the phase detector sensitivity  $K_{\rm PD}$ , the VCO modulation factor  $K_{\rm VCO}$  and so on, is limited by the circuit design. Thus, once  $K$ ,  $\omega_n$  and  $\tau_2$  are known,  $\tau_1$  can be obtained from eq. (2).
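The procedure described by eqs. (2)–(4) can be sketched as a small calculation. This is only an illustration: the function name is hypothetical, the numeric inputs are invented, and $Z_W$ is assumed here to be expressed in rad/s, which the text does not state explicitly.

```python
import math

def design_loop_filter(z_w, xi, k):
    """Return (omega_n, tau1, tau2, pull_in) from eqs. (2)-(4).

    z_w : product xi * omega_n (assumed rad/s)
    xi  : damping factor
    k   : open loop gain (1/s)
    Assumes tau2 >> 1/k, as eq. (3) does.
    """
    omega_n = z_w / xi                 # from Z_W = xi * omega_n
    tau2 = 2 * xi**2 / z_w             # eq. (3)
    tau1 = k / omega_n**2 - tau2       # eq. (2) solved for tau1
    if tau1 <= tau2:
        # tau1 = C(R1 + R2) must exceed tau2 = C*R2 for a positive R1
        raise ValueError("open loop gain too small for the requested Z_W and xi")
    pull_in = 2 * z_w * math.sqrt(2 * (tau1 / tau2 + 1))   # eq. (4)
    return omega_n, tau1, tau2, pull_in

# Hypothetical inputs, for illustration only.
Z_W = 2 * math.pi * 5.0e6
XI = 4.0
K = 1.0e9

omega_n, tau1, tau2, pull_in = design_loop_filter(Z_W, XI, K)
```

Because eq. (3) drops the $1/K$ term, the damping factor recomputed from eq. (2) differs slightly from the requested one; the difference is small as long as $\tau_2 \gg 1/K$ holds.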

# 3 Receiver architecture

The architecture of the optical receiver is shown in Figure 1. The incoming optical signal is first converted into electrical pulses by the photodiode. A preamplifier and a limiting amplifier are implemented to improve the input sensitivity, and the 2.5 Gb/s input data is then fed to four parallel latch chains that generate six samples. The sampling circuit acts as a phase detector (PD) and a 1:2 demultiplexer simultaneously. This half-bit-rate architecture of the CDR/DEMUX IC is based on the concept reported in [7] and has already been realized in our previous work [8]. The combined PD and 1:2 DEMUX uses a 1.25 GHz half-bit-rate clock provided by the VCO.

Additionally, the early-late phase logic circuit consists of four XORs and two selectors. It detects whether the clock is early or late with respect to the incoming data transition. It has simpler logic and fewer transistors than other work [2]. This early-late phase logic circuit provides the Early and Late control signals for the charge pump.
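The early-late decision can be illustrated with a generic bang-bang (Alexander-style) rule on three consecutive samples. This is a textbook sketch and an assumption on our part; it is not the four-XOR, two-selector half-rate circuit of this receiver, and the function and argument names are hypothetical.

```python
def early_late(prev_bit, edge_sample, next_bit):
    """Generic bang-bang decision from three consecutive samples (0 or 1).

    prev_bit    : data sample before the clock edge under test
    edge_sample : sample taken at the nominal data transition
    next_bit    : data sample after the clock edge under test
    Returns (early, late) flags.
    """
    # Edge sample still differs from the following bit:
    # it was taken before the transition, so the clock is early.
    early = edge_sample ^ next_bit
    # Edge sample already differs from the preceding bit:
    # it was taken after the transition, so the clock is late.
    late = prev_bit ^ edge_sample
    return early, late
```

When there is no data transition, both flags are 0 and the charge pump is left undisturbed, which matches the role of the Early and Late signals described above.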

To make the two 1:2 demultiplexed 1.25 Gb/s signals a2 and c2 compatible with commercially available 625 Mb/s ICs, an additional 2:4 demultiplexer is included, which produces four 625 Mb/s output signals D0, D1, D2 and D3, and the 625 MHz output clock Clk.
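The two-step 1:2 then 2:4 demultiplexing can be modeled behaviorally on a bit list. The sketch below is only an illustration: the function names are hypothetical, and the mapping of serial bit positions to the outputs D0 to D3 is an assumption, since the text does not specify the bit order.

```python
def demux_1_to_2(bits):
    """Split a serial stream into its even-indexed and odd-indexed bits."""
    return bits[0::2], bits[1::2]

def demux_1_to_4(bits):
    """Cascade a 1:2 stage and a 2:4 stage (assumed ordering D0..D3)."""
    a2, c2 = demux_1_to_2(bits)      # two half-rate streams (1.25 Gb/s)
    d0, d2 = demux_1_to_2(a2)        # quarter-rate streams from a2
    d1, d3 = demux_1_to_2(c2)        # quarter-rate streams from c2
    return d0, d1, d2, d3

LINE_RATE = 2.5e9                    # b/s, STM-16 rate used in the text
OUTPUT_RATE = LINE_RATE / 4          # rate of each parallel output

streams = demux_1_to_4(list(range(8)))
```

With this ordering, output `Dk` carries every fourth bit of the serial stream starting at position `k`.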

# 4 Building blocks

## 4.1 Preamplifier and limiting amplifier

A preamplifier and a limiting amplifier constitute the front-end circuit of the optical receiver. The preamplifier converts the small current pulses from the photodiode into voltage pulses, and it must have low noise because the input signal is weak. The preamplifier must provide enough gain to achieve high sensitivity and a low bit error rate (BER). On the other hand, its bandwidth should be wide enough for the receiver to operate at the given data rate, yet narrow enough to keep the noise small. In engineering practice, the bandwidth is chosen as 0.8 times the data rate. The basic transimpedance preamplifier structure is shown in Figure 2. The feedback resistor  $R_{\rm f}$  provides voltage-shunt negative feedback and converts the current into a voltage.

Figure 1 Receiver architecture.

The preamplifier cannot attain sufficient gain under the constraints of low noise and appropriate bandwidth, so the signal must be amplified further to reach the required amplitude. The limiting amplifier converts the preamplifier output into a sufficiently large, constant-amplitude voltage to drive the subsequent clock recovery and data decision circuits. The limiting amplifier contains five broadband amplifier stages to obtain a sufficiently large dynamic range. An effective way to increase the bandwidth is to introduce an inductive component into the load of each gain stage. The limiting amplifier cell with active inductor loads is illustrated in Figure 3. According to the small-signal model of an active inductor, its equivalent input impedance can be written as

$$Z_{\rm in} = \frac{1 + R_g s C_{gs}}{g_m + s C_{gs}}. \tag{5}$$

Thus, the equivalent inductance  $L$  and series resistance  $R$  can be obtained from eq. (5). With a maximally flat gain response, the bandwidth can be about 1.72 times that of the same stage without inductive compensation.
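One way to make this step explicit is a first-order expansion of eq. (5) for frequencies well below  $g_m/C_{gs}$ . This is a standard approximation that the original text does not spell out, so the expressions for  $R$  and  $L$  below should be read as a sketch:

$$Z_{\rm in} = \frac{1}{g_m}\cdot\frac{1 + sR_gC_{gs}}{1 + sC_{gs}/g_m} \approx \frac{1}{g_m} + s\,\frac{C_{gs}}{g_m}\left(R_g - \frac{1}{g_m}\right), \qquad R \approx \frac{1}{g_m}, \quad L \approx \frac{C_{gs}}{g_m}\left(R_g - \frac{1}{g_m}\right).$$

Under this approximation the load behaves inductively only when  $R_g > 1/g_m$ .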

## 4.2 VCO and charge pump

The VCO adopts a six-stage ring oscillator and produces two clock signals, C0 and C90, with a 90 degree phase shift, as shown in Figure 1. The control of the VCO is split into a coarse input and a fine input. The fine control is driven by the phase detector, and the coarse control compensates for variations in temperature and technology. Figure 4 shows the delay cell of the VCO, which has a differential structure in order to reduce the power-supply-injected phase noise.
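A rough feel for the delay cell requirement can be obtained from the ideal ring oscillator relation $f = 1/(2 N t_d)$, where $N$ is the number of stages and $t_d$ the delay per stage. This relation and the function names below are assumptions used for illustration; the source does not give the stage delay. The 1.25 GHz value is the half-bit-rate clock frequency from Section 3.

```python
def ring_stage_delay(f_osc, n_stages):
    """Per-stage delay (s) of an ideal ring oscillator: f = 1 / (2 * N * t_d)."""
    return 1.0 / (2 * n_stages * f_osc)

def stages_for_phase(phase_deg, n_stages):
    """Number of stages between two taps separated by phase_deg.

    Each stage of an N-stage ring contributes 180 / N degrees of phase.
    """
    return phase_deg / (180.0 / n_stages)

F_VCO = 1.25e9     # Hz, half-bit-rate clock
N_STAGES = 6

t_d = ring_stage_delay(F_VCO, N_STAGES)         # delay each cell must provide
tap_spacing = stages_for_phase(90, N_STAGES)    # taps for C0 and C90
```

Under these assumptions, C0 and C90 would be taken from taps three stages apart, which is one reason an even stage count is convenient for quadrature clock generation.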

Figure 5 shows the schematic of the differential charge pump, where differential pairs M1-M2 and M3-M4 are driven by the early-late phase logic circuit. The Early and Late signals activate only pull-down currents, and the pull-up currents are passive. Thus, when both Early and Late are low, a common-mode feedback circuit must counteract the pull-up currents to maintain a proper level. The feedback network consisting of M5-M9 sets the output common-mode level at  $V_{GS5}+V_{GS9}$ . An additional advantage is that the differential implementation alleviates the mismatch and charge-sharing problems [9].

Figure 3 Broad-band amplifier with active inductor loads.

Figure 5 Schematic of differential charge pump.

## 4.3 Low-jitter design procedure

A loop parameter optimization scheme is applied to achieve low jitter, and the parameters of the loop filter are chosen in accordance with eqs. (2)–(4). Furthermore, by reducing the open loop gain  $K$ , the capacitor  $C$  of the loop filter can be made small enough to be integrated on chip, as follows from eq. (2). The whole receiver chip therefore needs no external components and is highly integrated. Following this flow, the loop parameter optimization both suppresses jitter and yields a compact architecture.

In addition to achieving jitter reduction through the loop parameter optimization, noise generation in the noise sources and noise leakage between circuits in the IC must be reduced as much as possible. Differential circuits are adopted to reduce common-mode noise disturbance and the jitter generation of the IC. To reduce the noise leakage, the functional areas of the chip are separated as far as possible from each other, and substrate contacts are inserted between them [10].



Figure 6 Chip micrograph of the receiver.





Figure 7 The jitter of the recovered 625 MHz clock.



Figure 9 The waveforms with one  $2.5~\mathrm{Gb/s}$  input and four  $625~\mathrm{Mb/s}$  output data.

# 5 Experimental results

The 2.5 Gb/s optical receiver IC is realized in TSMC standard 0.25  $\mu$ m single-poly 5-metal (5M1P) CMOS technology via the multi-project wafer (MPW) service of our institute. The cutoff frequency  $f_{\rm T}$  of this process is about 18 GHz. A micrograph of the fabricated chip is shown in Figure 6. The chip size, including the bonding pads, is only 0.97 mm  $\times$  0.97 mm, and its power dissipation is 650 mW with a single 3.3 V power supply.

The performance of the fabricated receiver IC was measured on-wafer using a pulse pattern generator (Advantest D3186), a wide-bandwidth oscilloscope (Agilent DCA 86100A), a cascade probe station and 40 GHz microwave probes. With a 9 mV, 2.5 Gb/s  $2^{23}-1$  pseudorandom bit sequence (PRBS) input, the spectrum of the recovered 625 MHz clock exhibits a phase noise of -83.8 dBc/Hz at a 20 kHz offset from the center frequency. Figure 7 shows that the measured rms and peak-to-peak jitter of the 625 MHz clock are 9.4 and 46.3 ps, corresponding to 0.0059 and 0.029 UI, respectively. These values meet the jitter specifications of ITU-T recommendation G.958, which stipulates that the rms and peak-to-peak jitter be lower than 0.01 and 0.1 UI, respectively.
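The conversion from picoseconds to unit intervals can be checked with a few lines. The sketch assumes that one UI equals one period of the recovered 625 MHz clock (1.6 ns), an assumption that reproduces the quoted values; the function name is hypothetical.

```python
def ps_to_ui(jitter_ps, clock_hz):
    """Express a jitter value in ps as a fraction of one clock period (UI)."""
    return jitter_ps * 1e-12 * clock_hz

CLOCK_HZ = 625e6                     # recovered clock frequency
RMS_LIMIT_UI = 0.01                  # rms limit quoted in the text
PP_LIMIT_UI = 0.1                    # peak-to-peak limit quoted in the text

rms_ui = ps_to_ui(9.4, CLOCK_HZ)     # measured rms jitter
pp_ui = ps_to_ui(46.3, CLOCK_HZ)     # measured peak-to-peak jitter
meets_spec = rms_ui < RMS_LIMIT_UI and pp_ui < PP_LIMIT_UI
```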

An eye diagram of the regenerated and demultiplexed 625 Mb/s data at the outputs D0–D3 is shown in Figure 8. Compared against the STM-4/OC-12 mask, the eye opening of the output data is sufficiently wide, which indicates that the chip functions without error.

Figure 9 shows the waveforms of the 2.5 Gb/s input data and the four 625 Mb/s output data streams, where Di is the input data and D0, D1, D2, and D3 are the output data. The serial data stream is correctly demultiplexed into four parallel lower-speed streams.

The amplitude of the input data Di in Figure 9 is 8 mV; a smaller amplitude could not be applied because the minimum amplitude is limited by the 2 mV noise floor of the source generator (Advantest D3186). An input voltage of 8 mV corresponds to an input current of 20  $\mu$ A. The actual input sensitivity of the receiver should therefore be below 20  $\mu$ A, since the measured value is limited by the SNR of the input signal.

|                         | This work                                      | Ref. [11]                                        | Ref. [12]                                        | Ref. [13]                                    |
|-------------------------|------------------------------------------------|--------------------------------------------------|--------------------------------------------------|----------------------------------------------|
| Process                 | 0.25 μm CMOS ( $f_T = 18.6~\mathrm{GHz}$ )     | 0.5 μm Si-bipolar ( $f_T = 40~\mathrm{GHz}$ )    | 0.7 μm Si-bipolar ( $f_T = 27~\mathrm{GHz}$ )    | 0.35 μm CMOS ( $f_T = 14~\mathrm{GHz}$ )     |
| Total power dissipation | 650 mW                                         | 1.1 W                                            | 500 mW                                           | 100 mW                                       |
| Core power dissipation  | 300 mW                                         | 680 mW                                           | 350 mW (no DEMUX)                                | 100 mW                                       |
| Supply voltage          | +3.3 V                                         | +3.3 V                                           | +3.3 V                                           | +3 V                                         |
| Area                    | 0.97 mm × 0.97 mm                              | 3 mm × 5 mm                                      | 2 mm × 2 mm                                      | 1.65 mm × 1.5 mm                             |
| Gate count              | 800                                            | 400                                              | 350                                              | 150                                          |
| Functions               | Preamp + limiting amp + CDR + 1:4 DEMUX        | Limiting amp + CDR + 1:8 DEMUX                   | Limiting amp + CDR                               | Preamp + limiting amp                        |
| Clock rms jitter        | 0.0059 UI                                      | 0.008 UI                                         | 0.025 UI                                         | 0.12 UI                                      |

Table 1 Comparison of performance with other work

The receiver exhibits a pull-in range of 80 MHz, and the tuning range of the VCO coarse control is 400 MHz, covering the frequency deviation caused by fluctuations in technology, temperature and voltage. In addition to a low jitter, this receiver achieves a small chip area and power that is substantially lower than that of other designs. The comparison list in Table 1 shows that the primary performance of this work is superior to previous research [11–14].

# 6 Conclusions

A complex ( $\sim 800$  transistors, about 1 mm $^2$ ) mixed-signal receiver chip with on-chip preamplifier, limiting amplifier, clock and data recovery, and 1:4 demultiplexer was successfully realized in a standard 0.25 µm CMOS technology. The half-rate phase detector provides a bang-bang characteristic while retiming and demultiplexing the data with no systematic phase offset. The receiver converts a 2.5 Gb/s  $2^{23}-1$  PRBS NRZ signal into four 625 Mb/s signals. In addition to low jitter, this receiver achieves a small chip area and a power dissipation that is substantially lower than that of similar circuits. The high integration, low jitter and low power dissipation of this receiver IC, together with the low cost of the CMOS process, make the chip promising for use in optical communication systems.

#### Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 60976029).

### References

- 1 Razavi B. Challenges in the design of high-speed clock and data recovery circuits. IEEE Commun Mag, 2002, 40: 94–101
- 2 Rogers J E, Long J R. A 10-Gb/s CDR/DEMUX with LC delay line VCO in 0.18- $\mu$ m CMOS. IEEE J Solid State Circuits, 2002, 37: 1781–1789
- 3 Sato F, Tezuka H, Soda M, et al. A 2.4 Gb/s receiver and a 1:16 demultiplexer in one chip using a super self-aligned selectively grown SiGe base (SSSB) bipolar transistor. IEEE J Solid State Circuits, 1996, 31: 1451–1457
- 4 Soda M, Shiori S, Morikawa T, et al. A 2.5-Gb/s one-chip receiver module for Gigabit-To-The-Home (GTTH) system. In: Proceedings of the IEEE 1999 Custom Integrated Circuits Conference, San Diego, USA, 1999. 273–276
- 5 Soliman S, Yuan F, Raahemifar K. An overview of design techniques for CMOS phase detectors. In: 2002 IEEE International Symposium on Circuits and Systems, Scottsdale, USA, 2002. 457–460
- 6 Kishine K, Ishii K, Ichino H. Loop-parameter optimization of a PLL for a low-jitter 2.5-Gb/s one-chip optical receiver IC with 1:8 DEMUX. IEEE J Solid State Circuits, 2002, 37: 38–50
- 7 Hauenschild J, Dorschky C, von Mohrenfels T W, et al. A plastic packaged 10-Gb/s BiCMOS clock and data recovery 1:4-Demultiplexer with external VCO. IEEE J Solid State Circuits, 1996, 31: 2056–2059
- 8 Chen Y M, Wang Z G, Zhang L, et al. Monolithic IC of SDH STM-16 optical receiver core circuits. In: Proceedings of 2005 International Conference on Communications, Circuits and Systems, Hong Kong, China, 2005. 27–30

- 9 Razavi B. Monolithic Phase-Locked Loops and Clock Recovery Circuits: Theory and Design. New York: IEEE Press, 1996
- 10 Makie-Fukuda K, Kikuchi T, Hotta M. Measurement of digital noise in mixed-signal integrated circuits. VLSI Circuits Symp Dig Tech, 1993, 3: 23–24
- 11 Kishine K, Ishii K, Hirose M, et al. A low-jitter, low-power 2.5-Gb/s one-chip optical receiver IC with 1:8 DEMUX. In: Proceedings of Bipolar/BiCMOS Circuits and Technology Meeting (BCTM), Minneapolis, USA, 1999. 177–180
- 12 Pallotta A, Centureli F, Trifiletti A. A low-power clock and data recovery circuit for 2.5 Gb/s SDH receivers. In: Proceedings of the 2000 International Symposium on Low Power Electronics and Design, Rapallo, Italy, 2000. 67–72
- 13 Chen W Z, Lu C H. Design and analysis of a 2.5-Gbps optical receiver analog front-end in a 0.35- $\mu$ m digital CMOS technology. IEEE Trans Circuits Syst, 2006, 53: 977–983
- 14 Tian J, Shi Y, Zheng Y D, et al. 1.25 Gb/s low power CMOS limiting amplifier for optical receiver. In: Proceedings of 7th International Conference on Solid-State and Integrated Circuits Technology, Beijing, China, 2004. 1469–1471