<table>
<thead>
<tr>
<th>Title</th>
<th>A Study on Modeling and Design Methodology for High-Performance On-Chip Interconnection (Dissertation)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Author(s)</td>
<td>Tsuchiya, Akira</td>
</tr>
<tr>
<td>Citation</td>
<td>Kyoto University (京都大学)</td>
</tr>
<tr>
<td>Issue Date</td>
<td>2005-11-24</td>
</tr>
<tr>
<td>URL</td>
<td><a href="https://doi.org/10.14989/doctor.k11961">https://doi.org/10.14989/doctor.k11961</a></td>
</tr>
<tr>
<td>Right</td>
<td></td>
</tr>
<tr>
<td>Type</td>
<td>Thesis or Dissertation</td>
</tr>
<tr>
<td>Textversion</td>
<td>author</td>
</tr>
</tbody>
</table>

Kyoto University
A Study on Modeling and Design Methodology for High-Performance On-Chip Interconnection

Akira Tsuchiya
Abstract

This thesis discusses the modeling and design methodology of high-performance on-chip interconnection. Owing to advances in LSI fabrication processes, chip performance is continuously improving. As the performance of LSIs improves, on-chip signaling is becoming a bottleneck for the performance of the whole chip. The performance of on-chip interconnects has been studied, but the main focus has been on signal propagation delay and power dissipation. Meanwhile, multi-core architecture is becoming mainstream in high-performance microprocessor design, and Network-on-Chip (NoC), an architecture that integrates a network of processors into a single chip, is also being discussed. Given these technology trends, the throughput of on-chip interconnection is becoming an important metric in circuit design. The core components of on-chip signaling are a driver, a receiver and an interconnect. In this research, modeling and performance estimation methods for each component are proposed.

First, parameter extraction from the physical structure of on-chip interconnects is discussed. Conventionally, on-chip interconnects are modeled by a lumped RC model. However, as the operating frequency becomes higher, the inductance of on-chip interconnects becomes significant. To evaluate such interconnect characteristics, the return current distribution is a crucial problem. In LSIs, a huge number of interconnects are integrated and it is difficult to identify the return paths among those wires. This thesis proposes a screening method to select the wires that have to be considered as return paths. Conventionally, only parallel wires are considered as candidates for the return path. This research evaluates the interconnect characteristics by real-chip measurement and field-solvers. Experimental results reveal that wide and dense orthogonal wires such as power/ground rails can be return paths and affect the interconnect characteristics. The effect of the silicon substrate is also discussed. Modeling of the lossy substrate is one of the difficulties in modeling on-chip interconnects. This research shows that the power/ground wires interrupt the coupling between the signal interconnects in upper layers and the substrate. The measurement results show that the effect of the lossy substrate is negligible if there is a power/ground grid in the lower interconnect layer.

Next, an interconnect modeling method is proposed. The characteristics of on-chip interconnects depend on frequency because of the skin-effect, the proximity-effect and the return current distribution. Interconnect models that can treat frequency-dependence have already been developed. On the other hand, a conventional frequency-independent model is used in the early stage of circuit design. Therefore, improving the accuracy of the conventional frequency-independent model is as important an issue as developing frequency-dependent models. This thesis proposes a method to determine a single extraction frequency based on the transfer characteristics of transmission-lines.

Then an analytical performance estimation of on-chip interconnects is proposed. In interconnect design, there are trade-offs among the interconnect length, the operating frequency and the attenuation of the interconnect. This research approximates the waveform on on-chip transmission-lines by a piecewise-linear (PWL) waveform model. By deriving the eye-opening analytically, the proposed method provides trade-off curves among design parameters and indicates in which region a single-end transmission-line or a differential pair should be used.

Driver and receiver circuits also play an important role in on-chip signaling. They are roughly classified into static CMOS inverters and current-mode logic (CML) buffers. Impedance matching is a common practice in driver design. This research proposes a driver sizing method for static CMOS drivers. The proposed method considers the loss of transmission-lines and improves the signal propagation delay by using a stronger driver than the impedance-matched driver. Then a performance estimation method for CML buffers is proposed. The proposed method focuses on the pole frequency and estimates the maximum operating frequency.

By merging the analytical performance estimation of on-chip interconnects and the performance estimation of driver/receiver circuits, this thesis proposes a performance estimation of high-speed signaling systems composed of a driver, an interconnect and a receiver. The contributions of the proposed method are performance estimation and trade-off analysis between single-end signaling and differential signaling in the early stage of circuit design. The proposed method is also applicable to performance prediction in future fabrication processes.
Acknowledgments

I would like to thank the many people who helped make my stay at Kyoto University a great experience. First of all, I would like to express my gratitude to Professor Hidetoshi Onodera of Kyoto University. He has been my advisor since my undergraduate research, and his excellent vision and leadership provided a great research environment. His apt advice led me to many successful investigations. I am highly grateful and honored to have had the wonderful opportunity to study in his group.

I am grateful to Dr. Masanori Hashimoto of Osaka University for his technical suggestions and thorough paper reviews. He gave me much motivation and inspiration. I thank Associate Professor Dr. Kazutoshi Kobayashi of Kyoto University for much advice, ranging from the fundamentals of LSIs to computing environments. I am grateful to Dr. Kenichi Okada of Tokyo Institute of Technology. Detailed technical discussions with him are always helpful.

Kyoto University is a wonderful place to study. I would like to thank Professor Shinji Tomita and Professor Takashi Matsuyama for their valuable advice on writing this thesis. Their suggestions during the review improved the maturity of this thesis. I am grateful to Professor Yukihiro Nakamura and the members of his laboratory for their interaction in technical discussions and daily life. I thank Dr. Takashi Hisakado of Kyoto University for a good discussion about electromagnetics, which deepened my understanding of classical electrodynamics. I thank the members of Onodera Laboratory for their contributions through active discussions. In particular, Mr. Akinori Shinmyo assisted with the driver and receiver design discussed in Chapter 5. I am grateful to Mr. Daisuke Hiramatsu and Mr. Yuuya Gotou for the discussions about on-chip interconnects. I also thank Mr. Yoichi Yuyama for giving several suggestions from a different angle.

I would like to express my appreciation to Professor Kazuya Masu and Mr. Hiroyuki Ito of Tokyo Institute of Technology for discussions on on-chip interconnects and measurement techniques. Mr. Toshiki Kanamoto of Renesas Technology Corp. also gave me many suggestions and comments. I am thankful for their contributions.

The VLSI chip in this study was fabricated through the chip fabrication program of the VLSI Design and Education Center (VDEC), the University of Tokyo. I would like to express my gratitude to VDEC for giving me many opportunities for chip fabrication. I appreciate the financial support from the Japan Society for the Promotion of Science.

Finally, I would like to thank my parents Masateru and Hiroko for their constant support and caring.
Contents

Abstract

Acknowledgments

1 Introduction
1.1 On-chip interconnects in high-performance LSIs
1.1.1 Performance trend and the interconnect bottleneck problem
1.1.2 Transmission-line effect
1.1.3 Parameter extraction of on-chip interconnects
1.1.4 Frequency-dependence of interconnect characteristics
1.1.5 Termination of interconnects
1.1.6 On-chip buffers
1.1.7 Signaling schemes to improve the performance of on-chip communication
1.2 Survey of related works
1.2.1 Modeling and design of on-chip interconnects
1.2.2 Driver and receiver design for high-speed signaling
1.2.3 Signaling methods for on-chip interconnection
1.3 Contributions of this thesis
1.4 Organization of this thesis
1.4.1 Modeling of on-chip interconnects
1.4.2 Interconnect RL extraction at a single representative frequency
1.4.3 Analytical performance estimation of on-chip transmission-lines
1.4.4 Driver/Receiver design for high-speed signaling
1.4.5 Design methodology of on-chip high-speed signaling

2 Modeling of on-chip interconnects
2.1 Introduction
2.2 Return path selection for loop RL extraction
2.2.1 Effect of the skin-effect on return current distribution
2.2.2 Indicator for return path selection
2.2.3 Flow of the proposed method
2.2.4 Experimental results
2.3 Effect of orthogonal interconnects
2.3.1 Test structure
2.3.2 Measurement and simulation results
2.4 Effect of lossy substrate
2.4.1 Test structure
2.4.2 Measurement results
2.5 Summary

3 Interconnect RL extraction at a single representative frequency
3.1 Introduction
3.2 Conventional extraction frequencies
3.3 Representative frequency for uniform transmission-lines
3.3.1 Transfer characteristic of open-ended transmission-lines
3.4 Experimental results of uniform transmission-lines
3.4.1 Experimental conditions and the metrics of accuracy
3.4.2 Pulse pattern versus accuracy
3.4.3 Transition time versus accuracy
3.4.4 Interconnect length versus accuracy
3.4.5 Overall results of uniform transmission-lines
3.4.6 Tolerance to extraction frequency variation
3.5 An extended method to determine an extraction frequency
3.5.1 Transfer characteristic of generic transmission-lines
3.5.2 Flow of the proposed method
3.6 Experimental results
3.6.1 H-tree topology
3.6.2 Stub-Bus topology
3.6.3 Results of overall experiments
3.7 Summary

4 Analytical performance estimation of on-chip transmission-lines
4.1 Introduction
4.2 Analytical estimation of interconnect performance
4.2.1 Figure of merit for signaling performance
4.2.2 Assumptions on derivation
4.2.3 Piecewise-linear waveform model
4.2.4 Derivation of eye-opening voltage
4.3 Verification of analytical estimation
4.3.1 Simulation setup
4.3.2 Eye-diagram vs. PWL waveform model
4.3.3 The effect of attenuation and crosstalk noise
4.3.4 Bit rate vs. eye opening voltage
4.3.5 Attenuation vs. eye opening voltage
4.3.6 Verification by circuit simulation
4.4 Trade-off analysis of on-chip interconnects
4.5 Design guideline for resistive termination
4.5.1 Termination for maximizing the eye-opening voltage
4.5.2 Sensitivity to the variation of resistance
4.6 Summary

5 Driver/Receiver design for high-speed signaling
5.1 Introduction
5.2 CMOS driver sizing for lossy transmission-lines
5.2.1 Modeling of on-chip interconnects
5.2.2 Driver output impedance and its impact on signal waveform
5.2.3 Equivalent driver output resistance
5.2.4 Driver sizing for lossless transmission-lines
5.2.5 Application to lossy transmission-lines
5.2.6 Ringing caused by impedance mismatch
5.3 Bandwidth of static CMOS drivers
5.3.1 Tapered static CMOS drivers
5.3.2 Bandwidth of tapered CMOS buffers
5.4 CML driver/receiver design based on the pole frequency
5.4.1 Tapered CML buffer
5.4.2 Pole frequency of tapered CML buffers
5.4.3 Performance estimation based on the pole frequency
5.4.4 Output amplitude and the performance of CML buffers
5.4.5 Performance prediction of CML buffers
5.4.6 Design of CML receivers
5.5 Summary

6 Design methodology of on-chip high-speed signaling
6.1 Introduction
6.2 Interconnects under study
6.3 Performance of single-end signaling
6.4 Performance of differential signaling
6.5 Comparison between single-end signaling and differential signaling
6.5.1 Required noise margin of single-end signaling and differential signaling
6.5.2 Single-end signaling versus differential signaling
6.6 Summary

7 Conclusion

Bibliography

Publication list

List of Tables

3.1 Range of parameters and representative frequencies.
3.2 Maximum errors when the period of input pulse changed.
3.3 Maximum errors when the transition time changed.
3.4 Maximum errors when the interconnect length changed.
3.5 Maximum errors in overall experiments.
3.6 Errors in the delay time and the transition time at the node E of the H-tree.
3.7 Errors in the delay time and the transition time at the node I of the stub-bus.
3.8 Statistical summary of overall experiments.

List of Figures

1.1 Delay for Metal 1 and global wiring versus feature size.
1.2 Effective resistivity predicted in Ref. [1].
1.3 RLC-model of on-chip interconnects.
1.4 Transmission-line effects on waveform.
1.5 Multi-layer on-chip interconnects.
1.6 Parasitics of multi-layer interconnects.
1.7 Return current distribution in low frequency.
1.8 Return current distribution in high frequency.
1.9 Return current distribution in parallel ground wires.
1.10 Frequency characteristics of the loop resistance and inductance.
1.11 Errors in extracted value. (at 100GHz)
1.12 Frequency-dependence of resistance and inductance. (co-planar structure, signal line width 4μm, ground line width 10μm, spacing 2μm)
1.13 The impact of frequency-dependence on transition waveform. (interconnect structure is shown in Fig. 1.12, $R_d = 150\Omega$)
1.14 Energy per bit and the resistance of the terminator (5mm length, $Z_0 = 100\Omega$, 20Gbps signaling).
1.15 Static CMOS inverter.
1.16 Transfer characteristics of static CMOS inverter.
1.17 CML differential buffer.
1.18 Transfer characteristics of CML differential buffer.

2.1 Return current distribution with and without skin effect.
2.2 Flow of the proposed method.
2.3 Extraction error of the loop impedance and the proposed indicator $\Delta U$ (uniform structure, at 1MHz).
2.4 Extraction error of the loop impedance and the proposed indicator $\Delta U$ (uniform structure, at 100GHz).
2.5 Frequency characteristics of selected interconnects (uniform structure).
2.6 Considered ground wires for RL extraction (uniform structure).
2.7 Evaluated interconnect structure.
2.8 Extraction error of the loop impedance and the proposed indicator $\Delta U$ (realistic structure, at 1MHz).
2.9 Extraction error of the loop impedance and the proposed indicator $\Delta U$ (realistic structure, at 100GHz).
2.10 Considered ground wires for RL extraction (realistic structure).
2.11 Frequency characteristics of selected interconnects (realistic structure of Fig. 2.10).
2.12 Extraction cost and the additional cost by the proposed method.
2.13 A signal wire and parallel/orthogonal power/ground wires (top view).
2.14 Test structure and eddy current in orthogonal interconnects.
2.15 Self-inductance of a co-planar line with orthogonal P/G wires (W = S = 4μm, measured).
2.16 Relationship between the self-inductance and the wire width of the orthogonal wires (density = 30%, field-solver).
2.17 Relationship between the self-inductance and the density of the orthogonal wires (W = 24μm, field-solver).
2.18 Inductance vs. the width and the density of orthogonal wires (at 10GHz, field-solver).
2.19 The cross section and the top view of test structures.
2.20 Structure of ground wires in M1 (top view).
2.21 Micrograph of a test structure (without M1 wires).
2.22 Self-resistance (spacing S = 2μm).
2.23 Self-resistance (spacing S = 19μm).

3.1 RLC ladder circuit model.
3.2 Frequency spectrum and the significant frequency $f_{sig}$.
3.3 Waveform at near-end and far-end. (interconnect structure is shown in Fig. 1.12, Z₀ = 55Ω, R₀ = 10Ω)
3.4 Open-ended transmission-line and equivalent series resonator.
3.5 Transfer characteristics of a transmission-line shown in Fig. 1.12, interconnect length is 5mm.
3.6 Frequency spectrum of waveform at the far-end.
3.7 Cross-sections of interconnects.
3.8 Equivalent circuit of coupled transmission-line.
3.9 Experimental circuit for transient analysis.
3.10 Definition of delay time, peak-to-peak voltage and crosstalk.
3.11 Voltage peak-to-peak when the period of pulse changed.
3.12 Delay when the period of pulse changed.
3.13 The waveform at the far-end of the aggressor.
3.14 The waveform at the far-end of the victim.
3.15 The waveform driven by transistors. (Transistor W/L = 720)
3.16 Voltage peak-to-peak when the transition time is changed.
3.17 Delay time when the transition time is changed.
3.18 Crosstalk noise peak-to-peak when the transition time changed.
3.19 Voltage peak-to-peak when the interconnect length changed.
3.20 Delay time when the interconnect length changed.
3.21 Normalized delay time when the interconnect length changed.
3.22 Crosstalk noise peak-to-peak when the interconnect length changed.
3.23 Extraction frequency versus errors.
3.24 Transmission-line.
3.25 Conceptual diagram of the proposed method.
3.26 Voltage gain estimated by the equivalent load impedance.
3.27 H-tree topology.
3.28 Waveform at the node E of H-tree.
3.29 Stub-bus topology.
3.30 Extraction frequencies by the proposed method.
3.31 Waveform at the node I of stub-bus.

4.1 An example of eye-diagram and the figure of merit.
4.2 Circuit model of a transmission-line with resistive termination.
4.3 PWL waveform model.
4.4 Cross section of the interconnect.
4.5 Experimental circuit.
4.6 An eye-diagram and its PWL waveform model (20Gbps).
4.7 The effect of crosstalk noise over the eye-opening.
4.8 The attenuation vs. interconnect width (at 10GHz).
4.9 Bit rate vs. eye opening (10mm length, Z₀ = 100Ω).
4.10 Attenuation vs. eye opening (at 20Gbps).
4.11 Cross section of the co-planar structure.
4.12 Eye-opening voltage versus the normalized impedance of the termination (with various attenuations, 20Gbps input, 10mm length)
4.13 Eye-opening voltage versus the normalized impedance (with various bit rate, n = 0.4, 10mm length)
4.14 Bit rate vs. maximum interconnect length with various receiver sensitivity (Vₑq).
4.15 Bit rate vs. maximum interconnect length with various attenuation. (high n means low attenuation.)
4.16 Eye diagram of 80Gbps signaling on 10mm differential interconnect.
4.17 Optimal normalized impedance versus signal bit rate.
4.18 Sensitivity to the variation of the resistance (attenuation n = 0.6, 10mm length interconnect)

5.1 Interconnect structure for R, L, C extraction
5.2 The model of transmission-line
5.3 Result of impedance matching based on the driver modeling of [2]. (A 0.13µm process, characteristic impedance 84Ω, 5mm long wire, pMOS W/L = 442, nMOS W/L = 158)
5.4 Result of impedance matching by Eq. (5.6) (A 0.13µm process, characteristic impedance 84Ω, 5mm long wire, pMOS W/L = 345, nMOS W/L = 143)
5.5 Input response on lossy and lossless transmission-line
5.6 Result of gate sizing for a lossy transmission-line (A 0.13µm process, characteristic impedance 86Ω, 5mm long wire, pMOS W/L = 369, nMOS W/L = 150)
5.7 Result of gate sizing for a lossy transmission-line in a 50nm process (characteristic impedance 86Ω, 5mm long wire, pMOS W/L = 133, nMOS W/L = 62)
5.8 Normalized voltage amplitude of the wave which reflects at near-end (normalized by the supply voltage, i = 1)
5.9 A tapered static CMOS buffer.
5.10 Experimental circuit for eye-diagram evaluation.
5.11 Relationship between the taper factor and the operating frequency (X = 3).
5.12 Relationship between the taper factor and the operating frequency (X = 9).
5.13 Relationship between the taper factor and the operating frequency (X = 27).
5.14 Tapered driver.
5.15 Gain curve of tapered driver.
5.16 Experimental circuit for eye-diagram evaluation.
5.17 Clock frequency versus eye-opening voltage (X = 3).
5.18 Clock frequency versus eye-opening voltage (X = 9).
5.19 Clock frequency versus eye-opening voltage (X = 27).
5.20 The pole frequency of a CML buffer versus the taper factor.
5.21 The pole frequency and the total bias current versus the number of stages (X = 9).
5.22 Output amplitude and the bandwidth of CML buffer (X = 9, 4 stages).
5.23 Performance prediction of CML driver.

6.1 Sectional structure of the interconnects under study.
6.2 Meaning of the voltage (2n − 1).
6.3 Performance limitation of the single-end signaling with various required eye-opening.
6.4 Performance limitation of the single-end signaling with various attenuation parameters.
6.5 Performance limitation of the differential signaling.
6.6 Comparison between the single-end signaling and the differential signaling.

Chapter 1
Introduction

1.1 On-chip interconnects in high-performance LSIs

Owing to advances in LSI (Large Scale Integration) fabrication technology, the operating frequency and the functionality of LSI chips are increasing. Reference [3] reports a multi-core processor that integrates 234 million transistors on a 15mm square chip with an operating frequency of 4.6GHz. A roadmap predicts that the global clock frequency will reach 9.5GHz in 2010 [4]. A big challenge in this era is high-speed, large-capacity signal transmission, and long-distance interconnects are considered to be the bottleneck of the whole system. The performance of the whole chip depends strongly on the performance of both transistors and interconnects. The performance of transistors continuously improves as the fabrication process scales. However, the performance of metal interconnects does not improve as drastically as that of transistors. Therefore the design of on-chip interconnects has become a crucial problem in high-performance LSIs.

This section introduces the issues of on-chip interconnection.

1.1.1 Performance trend and the interconnect bottleneck problem

Conventionally, chip performance has been limited by the performance of transistors. However, the performance of transistors improves rapidly with technology scaling, whereas the performance of on-chip interconnects does not improve much. Copper wires and low-k dielectrics have been introduced to improve performance by decreasing the resistance and the capacitance, but these are not conclusive solutions because the physical structure cannot be changed drastically. Therefore, in the near future, the performance of on-chip interconnects will limit the whole chip performance. Figure 1.1 shows the performance trend of gate delay and interconnect delay predicted by Ref. [4]. In this prediction, the length of a Metal 1 wire is assumed to scale with the technology, and that of a global wire is assumed to be constant because the chip size is not expected to scale in the future. According to Fig. 1.1, the gate delay and the RC delay of the Metal 1 wire decrease with technology scaling, whereas the delay of the global wiring increases. Repeater insertion can mitigate the delay but consumes power and chip area. From this trend, global wiring will become a bottleneck in the near future.

The main cause of the interconnect bottleneck problem is the scaling down of the interconnect size. As the width and the spacing of interconnects decrease, the resistance and the capacitance increase, and thus the RC delay increases. The RC delay can be maintained by keeping the size of interconnects unchanged; however, the required number of interconnects increases as system integration progresses. Therefore the interconnect bottleneck is considered a critical problem in future LSIs.

Moreover, Ref. [1] reports that the effective resistivity will increase due to the effect of barrier metal and electron scattering at the surface and the sidewall of the wire. Barrier metal is a coating on the copper that prevents copper atoms from penetrating into the dielectric. The predicted resistivity is shown in Fig. 1.2. Considering these effects, the performance of on-chip interconnects degrades and the bottleneck problem becomes more serious. Therefore a breakthrough for this problem has been a hot topic.

1.1.2 Transmission-line effect

As the frequency of a signal that propagates through interconnects becomes higher, the effect of inductance becomes significant. Also, by widening interconnects and using new wiring materials, the resistance is decreased and becomes comparable to the reactance [5]. Therefore high-performance interconnects, that is, long, wide and thick interconnects designed for high-speed signaling, cannot be modeled by the conventional RC interconnect model. On-chip high-performance interconnects are treated with the RLC model shown in Fig. 1.3.

Figure 1.4 shows a waveform obtained with a lumped RC interconnect model and a waveform obtained with a distributed RLC model. In the RLC model, the signal propagates through the interconnect as an electromagnetic wave, and the voltage changes stepwise because of reflection. As shown in Fig. 1.4, these transmission-line effects cause overshoot, undershoot and ringing. As a consequence, the signal propagation time and the signal transition time change, and the voltage spikes cause large crosstalk. Therefore it is necessary to consider transmission-line effects in circuit design.
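The stepwise far-end waveform and the overshoot described above can be illustrated with a simple bounce (lattice) calculation. The following is a minimal sketch, assuming a lossless line with an open far end and a purely resistive driver; the function name and the component values are hypothetical and are not taken from the thesis.

```python
def far_end_steps(v_dd, r_source, z0, n_arrivals):
    """Far-end voltage of a lossless, open-ended line after each wave arrival.

    Assumptions (illustrative only): no line loss, resistive driver,
    ideal open circuit at the receiver.
    """
    gamma_source = (r_source - z0) / (r_source + z0)  # reflection at the driver
    gamma_load = 1.0                                  # open end reflects fully
    incident = v_dd * z0 / (r_source + z0)            # first wave launched into the line
    voltage = 0.0
    steps = []
    for _ in range(n_arrivals):
        # The incident wave and its reflection add up at the far end.
        voltage += incident * (1.0 + gamma_load)
        steps.append(voltage)
        # The reflected wave travels back, reflects at the driver and returns.
        incident *= gamma_load * gamma_source
    return steps


# Hypothetical example: 25 ohm driver on a 100 ohm line, 1 V supply.
steps = far_end_steps(v_dd=1.0, r_source=25.0, z0=100.0, n_arrivals=8)
for k, v in enumerate(steps):
    print(f"arrival {k + 1}: {v:.3f} V")
```

With a driver impedance lower than the characteristic impedance, the first arrival exceeds the supply voltage and the following arrivals alternate around it, which is the overshoot, undershoot and ringing pattern. A real on-chip line is lossy, so the actual steps are smaller than in this sketch.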
1.1. *On-chip interconnects in high-performance LSIs*

Figure 1.2: Effective resistivity predicted in Ref. [1].

![RLC-model of on-chip interconnects.](image)

Figure 1.3: RLC-model of on-chip interconnects.

![Transmission-line effects on waveform.](image)

Figure 1.4: Transmission-line effects on waveform.
Conditions under which inductance must be considered

There are several methods to characterize the importance of transmission-line effects [5–7]. For example, according to Ref. [6], transmission-line effects should be considered when the interconnect length \( l \) satisfies

\[
\frac{t_r v}{2} = \frac{t_r}{2 \sqrt{LC}} < l < \frac{2}{R} \sqrt{\frac{L}{C}},
\]

(1.1)

where \( t_r \) is the transition time of the input pulse, \( R \), \( L \) and \( C \) are the resistance, inductance and capacitance per unit length, and \( v = 1/\sqrt{LC} \) is the propagation velocity. The lower limit is determined by the signal transition time: Equation (1.1) means that if twice the time-of-flight is smaller than the signal transition time, transmission-line effects can be ignored. The upper limit depends on the attenuation.
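As a numerical illustration of Eq. (1.1), the sketch below evaluates the lower and upper length limits for one set of per-unit-length parameters. The parameter values and the function name are hypothetical and chosen only to show the order of magnitude; they are not extracted from any structure in this thesis.

```python
import math


def inductance_window(t_r, r, l, c):
    """Return (lower, upper) interconnect length in metres from Eq. (1.1).

    t_r: input transition time [s]
    r, l, c: resistance [ohm/m], inductance [H/m], capacitance [F/m] per unit length
    """
    lower = t_r / (2.0 * math.sqrt(l * c))   # set by the transition time
    upper = (2.0 / r) * math.sqrt(l / c)     # set by the attenuation
    return lower, upper


# Hypothetical wide global wire: 5 ohm/mm, 0.5 nH/mm, 0.2 pF/mm, 50 ps edge.
lower, upper = inductance_window(t_r=50e-12, r=5e3, l=0.5e-6, c=0.2e-9)
print(f"transmission-line effects matter for {lower * 1e3:.1f} mm < l < {upper * 1e3:.1f} mm")
```

If the lower limit exceeds the upper limit, the window is empty and, under this criterion, the wire can be treated as an RC line for any length.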

1.1.3 Parameter extraction of on-chip interconnects

When on-chip interconnects are discussed, one of the difficulties is the modeling of on-chip inductance. The interconnect characteristics depend on the physical structure, the physical constants and the frequency of the signal. In LSIs, interconnects are arranged in multiple layers as shown in Fig. 1.5. Figure 1.6 shows the parasitics of multi-layer interconnects. Each interconnect has a resistance and a self-inductance. There are coupling capacitances among the interconnects, and the magnetic coupling is modeled as mutual inductances. Due to process scaling, on-chip interconnects are becoming thinner, narrower and denser. In a 65nm process, the minimum wiring pitch of global wiring is predicted to be 290nm. The small cross-sectional area increases the wire resistance, and the narrow spacing between adjacent wires increases the coupling capacitance. As the operating frequency becomes higher, the inductance also becomes significant and affects the circuit behavior [5–8].

Fundamentals of return current distribution

In characteristic extraction, ground (GND) plays an important role. An on-chip power/ground net is not an ideal supply and ground because it has non-zero impedance. Therefore the return current spreads widely and varies depending on circuit behavior. As shown in Fig. 1.7, the return current distribution at low frequency depends on the resistance of the power/ground wires. If the power/ground wires have the same resistance, the return current is distributed uniformly. At high frequency, the inductive coupling becomes significant and the return current concentrates in the nearest power/ground wire, as shown in Fig. 1.8. The return current distribution strongly affects the resistance and the inductance [5, 9].

![Diagram](image)

**Figure 1.7:** Return current distribution in low frequency.

**Figure 1.8:** Return current distribution in high frequency.

Figure 2.1 shows the frequency characteristic of the return current distribution. The y-axis is the return current in each ground wire when the signal wire is excited by a 1mA AC current source. The interconnect structure is shown in Fig. 2.1. There is a 1μm wide signal wire, and five ground wires are located at a pitch of 10μm in the same layer as the signal wire. The resistance and the inductance are extracted by a field-solver [10]. At low frequency, the same amount of return current flows in each ground wire because, in this case, the resistances of all ground wires are the same. As the frequency becomes higher, the return current concentrates in the closest ground wire (G1) because the reactance dominates the resistance. Near 100GHz, the return current distribution saturates to the distribution determined by the reactance alone. Note that not all of the return current concentrates in the nearest wire, no matter how high the frequency is.

As the return current distribution changes, the loop resistance and the loop inductance of the interconnect also vary with frequency. These variations are shown in Figure 1.10. The interconnect structure is the same as in Figure 2.1. As the frequency becomes higher, the return current concentrates in the closest ground wire, as shown in Figure 2.1. Therefore the loop resistance becomes higher and the loop inductance becomes smaller. Skin effect and proximity effect are also important factors that make the interconnect impedance frequency-dependent. The frequency dependence caused by the return current distribution is compared with and without the skin effect and the proximity effect. At high frequency, the resistance increases drastically due to the skin effect. However, the decrease in inductance caused by skin and proximity effects is smaller than that caused by the return current concentration. In the low frequency region, the skin effect is negligible and does not affect the return current distribution. At high frequency, the skin effect becomes significant, but in this region the return current distribution is determined by the inductance of the interconnects. The skin effect decreases the inductance, but only slightly. Therefore the skin effect does not affect the return current distribution in the high frequency region either. Consequently, the return current distribution can be estimated without a large loss of accuracy even if the skin effect and the proximity effect are ignored.
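The transition from a resistance-determined to a reactance-determined return current distribution can be illustrated with a deliberately crude current-divider model. The sketch below is an assumption-laden illustration and not the field-solver extraction used in this thesis: each return path is reduced to a series resistance and a loop inductance, the mutual coupling between the return loops is ignored, and the loop inductance uses the textbook two-wire approximation. The geometry, the resistance value and all names are hypothetical.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space [H/m]


def return_current_fractions(freq_hz, r_per_m, loop_l_per_m):
    """Magnitude of the return current share carried by each ground wire.

    Simplified model: the ground wires act as parallel branches with
    impedance R + j*omega*L_k; coupling between branches is ignored.
    """
    omega = 2.0 * math.pi * freq_hz
    admittances = [1.0 / complex(r_per_m, omega * l) for l in loop_l_per_m]
    total = sum(admittances)
    return [abs(y / total) for y in admittances]


# Hypothetical geometry: ground wires 10, 20, ..., 50 um from the signal wire.
distances = [10e-6 * k for k in range(1, 6)]
radius = 0.5e-6  # effective wire radius (assumption)
loop_l = [MU0 / math.pi * math.log(d / radius) for d in distances]
r_ground = 1.7e4  # ohm/m, roughly a 1 um x 1 um copper wire (assumption)

low = return_current_fractions(1e7, r_ground, loop_l)    # 10 MHz
high = return_current_fractions(1e11, r_ground, loop_l)  # 100 GHz
print("10 MHz :", [f"{x:.3f}" for x in low])
print("100 GHz:", [f"{x:.3f}" for x in high])
```

Even in this simplified picture, the low-frequency split is uniform because the resistances are equal, while the high-frequency split favors the nearest wire without ever assigning all of the current to it, since the loop inductance grows only logarithmically with distance.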

**Number of P/G wires and extraction accuracy**

As mentioned in Reference [9], the number of power/ground wires considered in extraction affects the inductance value. This is because the return current distribution differs from the actual one if only a few power/ground wires are considered. Figure 1.11 shows the extraction error at 100GHz versus the number of ground wires considered. The value extracted with 25 ground wires is taken as the correct value, and the y-axis shows the error from this value. Figure 1.11 indicates that the value extracted with only the nearest ground wire contains more than 30% error. It is a frequent misunderstanding that all the return current flows in the nearest ground wire at high frequency. However, Figure 1.11 demonstrates that considering only the nearest ground wire causes extraction error even at a frequency as high as 100GHz.

![Graph showing errors in extracted value](image)

Figure 1.11: Errors in extracted value. (at 100GHz)

Figure 1.11 indicates that several ground wires have to be considered for accurate RL extraction. There is a large error when only a few ground wires are considered. As the number of ground wires considered becomes larger, the extracted RL values approach the accurate values. However, tens or hundreds of ground wires have to be included to obtain accurate RL values, and it is generally difficult to predict how much error is included in RL values extracted with a limited number of P/G wires. Moreover, the on-chip interconnect structure is complicated. There are very wide power/ground wires, and shielding wires are sometimes placed to prevent crosstalk. In the lower layers, a huge number of narrow P/G wires are laid out. These interconnects have different widths, thicknesses and lengths, as shown in Fig. 1.5.

Effect of lossy substrate

Additionally, the current path is not provided by metal wires alone. In digital circuits, heavily doped silicon is used for the substrate. Heavily doped silicon is conductive, and this conductivity stabilizes the electric potential of the substrate. On the other hand, the lossy substrate affects the characteristics of the interconnects because return current and/or eddy current flows through it [11–13]. All these factors can be taken into account by performing real-chip measurement or numerical evaluation with 3D field-solvers. However, chip measurement and 3D numerical simulation require much time and cost. To achieve efficient extraction, it must be known in which situations each factor should be considered.

1.1.4 Frequency-dependence of interconnect characteristics

In Ref. [14], the necessity of a frequency-dependent model is discussed. Reference [14] compares a frequency-dependent model with an equivalent circuit extracted at DC from the viewpoint of signal delay, crosstalk noise and so on. The interconnect structures that Ref. [14] discusses have weak frequency dependence, and hence the authors conclude that DC extraction is good enough for signal delay evaluation. However, in actual chips, there are wide interconnects whose frequency dependence becomes significant at frequencies as low as 1 to 2GHz. Experimental results show that DC extraction causes considerable error for such interconnects. As for crosstalk noise, Ref. [14] reports that a frequency-dependent model is necessary. However, the authors examine only DC extraction and the frequency-dependent model; extraction at a representative frequency is not discussed. Therefore it is not clear whether crosstalk can be estimated using a certain representative frequency for RLC extraction. Crosstalk noise is also discussed, and it is shown that the proposed method can estimate crosstalk noise well.

Frequency-dependence of interconnect characteristics is mainly caused by skin and proximity effects and by the return current distribution. The variation of the characteristics is strongly related to the interconnect structure as well as the frequency. Skin and proximity effects are pronounced in wide and thick interconnects because the skin depth becomes comparable to the interconnect size at relatively low frequency.
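To make the comparison between skin depth and wire size concrete, the following sketch evaluates the standard skin depth formula \( \delta = \sqrt{\rho / (\pi f \mu_0)} \) at a few frequencies. The bulk copper resistivity is an assumed value; the effective resistivity of real on-chip wires is higher, which would make the skin depth somewhat larger.

```python
import math

MU0 = 4e-7 * math.pi   # permeability of free space [H/m]
RHO_CU = 1.68e-8       # bulk copper resistivity [ohm*m] (assumption)


def skin_depth(freq_hz, resistivity=RHO_CU):
    """Skin depth in metres for a non-magnetic conductor."""
    return math.sqrt(resistivity / (math.pi * freq_hz * MU0))


for f in (1e9, 2e9, 10e9):
    print(f"{f / 1e9:.0f} GHz: skin depth = {skin_depth(f) * 1e6:.2f} um")
```

At 1GHz the skin depth of copper is about 2μm, which is already comparable to the width of a wire that is several micrometers wide.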

Figure 1.12 shows an example of resistance and inductance characteristics. The resistance and inductance values are calculated by a field-solver [15]. The assumed interconnect structure is co-planar: the width of the signal line is 4μm, the width of the ground line is 10μm and their spacing is 2μm. In this case, from DC to 10GHz the resistance increases by 38% and the inductance decreases by 14%. The resistance and the inductance start changing at a relatively low frequency of 1 to 2GHz, and thus frequency-dependence can no longer be neglected when modeling interconnects in current high-performance circuits.

Figure 1.12: Frequency-dependence of resistance and inductance. (co-planar structure, signal line width 4μm, ground line width 10μm, spacing 2μm)
Impact of frequency-dependence on signal waveform

Generally, interconnects in LSIs are expressed by lumped RLC elements for circuit design. As explained in the previous section, the frequency-independent RLC ladder circuit in Fig. 3.1 is used to model on-chip interconnects. A number of frequency-dependent models have also been proposed [16–18]. In this chapter, the model of Ref. [18] is regarded as the golden frequency-dependent model. It is implemented in a circuit simulator [19] as v-element model. Although frequency-dependent models such as that of Ref. [18] can provide accurate waveforms, they do not tell designers which frequency component is important in circuit design. Conversely, if it is known which frequency component dominantly shapes the waveform at the far end, a frequency-independent model can be used. However, the frequency spectrum spreads widely and depends on the circuit behavior and the interconnect characteristics, and hence it is difficult to identify the most representative frequency from the spectrum. The goal of this research is to determine the representative frequency for modeling interconnects at a single frequency.

Figure 1.13 shows the impact of frequency-dependence on transient analysis. The simulated circuit is shown in Fig. 1.13. The interconnect shown in Fig. 1.12 is driven by a voltage source and a resistor \( R_d \) that correspond to a CMOS driver whose output impedance is 150\( \Omega \). The solid line labeled “FD” shows the voltage waveform at the far end obtained with the frequency-dependent model; here, “FD” is used as the abbreviation of “Frequency-Dependent model”. The dashed lines labeled “DC” and “\( f_{\text{sig}} \)” are the results of the frequency-independent models shown in Fig. 3.1. “DC” means the RLC ladder model extracted at DC, and “\( f_{\text{sig}} \)” corresponds to RLC extraction at the significant frequency. The number of ladder stages is 51. Both waveforms of the conventional frequency-independent models (“DC” and “\( f_{\text{sig}} \)”) are far from that of the frequency-dependent model (“FD”). For the signal propagation delay and the signal transition time (from 0.2V to 0.8V), the errors of “DC” are -28% in delay and -13% in transition time, and the errors of “\( f_{\text{sig}} \)” are 19% in delay and 10% in transition time. When the resistance R and the inductance L are extracted at DC, the extracted resistance is too low, whereas the resistance extracted at the significant frequency is too high. From these observations, it is expected that a certain frequency between DC and the significant frequency provides a waveform close to that of the frequency-dependent model. If this representative frequency can be determined systematically, interconnects can be modeled at a single frequency. In the following section, a way to determine the representative frequency is discussed.

1.1.5 Termination of interconnects

Generally, on-chip interconnects are terminated by the input capacitance of the receiver circuit. The input capacitance of CMOS circuits is negligibly small, so the termination can be regarded as open-ended. However, for high-speed signaling, resistive termination is used to achieve impedance matching and to suppress multiple reflections. The effects of resistive termination are explained below.

Signal waveform

The effect of resistive termination on the signal waveform is demonstrated here. Conventionally, the signal waveform swings from the ground level to the supply voltage because on-chip interconnects are open-ended lines. With resistive termination, the amplitude of the signal waveform becomes smaller than the supply voltage. This may be a drawback, but resistive termination improves the eye-diagram in the high bit rate region. Figure 4.9 shows the eye-opening voltage versus the signal bit rate. The interconnect is a co-planar structure and its characteristic impedance is 100Ω. The interconnect length is 10mm. The driver output impedance is adjusted to 100Ω. The supply voltage is 1V. In the case of the open-ended transmission-line, the eye-opening at low bit rate is large and close to the supply voltage. However, as the bit rate becomes higher, the eye-opening degrades very rapidly. On the other hand, if the receiver side of the interconnect is terminated by a 100Ω resistor, an eye-opening of about 100mV is obtained at 100Gbps signaling. Thus resistive termination is necessary for high-speed signaling.
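The two effects of a resistive terminator, suppressed reflection and reduced amplitude, can be sketched with the standard reflection coefficient and a DC voltage divider. This is a minimal illustration that neglects the resistance of the line itself, which is not negligible on chip; the function names are hypothetical.

```python
import math


def reflection_coefficient(z_load, z0):
    """Voltage reflection coefficient at the far end of a line."""
    if math.isinf(z_load):
        return 1.0  # open-ended line reflects the whole wave
    return (z_load - z0) / (z_load + z0)


def settled_high_level(v_dd, r_driver, r_term):
    """Steady-state high level at the receiver, wire resistance neglected."""
    if math.isinf(r_term):
        return v_dd  # no DC path: full swing
    return v_dd * r_term / (r_driver + r_term)


# Values quoted in the text: Z0 = 100 ohm, driver 100 ohm, supply 1 V.
for r_term in (math.inf, 100.0):
    gamma = reflection_coefficient(r_term, 100.0)
    level = settled_high_level(1.0, 100.0, r_term)
    print(f"R_term = {r_term}: gamma = {gamma:.2f}, high level = {level:.2f} V")
```

With a matched terminator the reflection vanishes, at the cost of halving the ideal swing; wire loss reduces the received amplitude further.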

**Power dissipation**

The serious problem in using resistive termination is the increase in static power dissipation. When a resistor is connected between the signal line and the ground, static current flows through the terminator. Figure 1.14 shows the energy per bit for 20Gbps signaling. The interconnect is 5mm long and its characteristic impedance is 100Ω. The solid line is the simulated energy per bit when resistive termination is used, and the dashed line is the energy per bit of the open-ended line. As shown in Fig. 1.14, resistive termination increases the energy per bit. As the resistance of the terminator becomes smaller, the power dissipation increases. When the terminator achieves impedance matching (100Ω), the energy per bit is 38% higher than in the open-ended case.
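The static component alone can be estimated with Ohm's law. The sketch below is a rough illustration under several assumptions that are not taken from the thesis: a 1V supply, a 100Ω driver output impedance, negligible wire resistance, and equal probability of high and low bits. It does not model the dynamic energy, so it does not reproduce the simulated curves of Fig. 1.14.

```python
def static_energy_per_bit(v_dd, r_driver, r_term, bit_rate, p_high=0.5):
    """Static energy per bit [J] drawn through a grounded terminator.

    Assumes current flows only while the line is held high and that the
    wire resistance is negligible (both are simplifications).
    """
    i_static = v_dd / (r_driver + r_term)   # DC current while the line is high
    return p_high * v_dd * i_static / bit_rate


# Hypothetical operating point: 1 V supply, 100 ohm driver, 20 Gbps.
e_matched = static_energy_per_bit(1.0, 100.0, 100.0, 20e9)
e_small = static_energy_per_bit(1.0, 100.0, 50.0, 20e9)
print(f"100 ohm terminator: {e_matched * 1e12:.3f} pJ/bit")
print(f" 50 ohm terminator: {e_small * 1e12:.3f} pJ/bit")
```

The estimate shows the trend stated above: a smaller terminator resistance draws more static current and therefore more energy per bit.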

When a resistive termination is used, the energy dissipation increases as shown in Fig. 1.14. On the other hand, resistive termination does not necessarily improve the eye-diagram. Therefore resistive termination should be used carefully.

Figure 1.13: The impact of frequency-dependence on transition waveform. (interconnect structure is shown in Fig. 1.12, \( R_d = 150\Omega \))

1.1.6 On-chip buffers

The driver and receiver circuits are classified into static CMOS buffers and current-mode logic (CML) buffers. Static CMOS inverters are commonly used because of their advantages in power dissipation: the static power dissipation of a static CMOS inverter is negligible if the leakage current is small. Static CMOS inverters are also area-efficient and are suitable for bus drivers [20]. In addition, static CMOS inverters can swing the output voltage from the ground level to the supply voltage level, which means that they have a large noise margin. On the other hand, CML buffers are used for high-speed signaling [21–25]. CML buffers can operate at higher frequency and are tolerant of common-mode noise.

Static CMOS inverters

A conventional CMOS inverter is shown in Fig. 1.15 and its input-output characteristic is shown in Fig. 1.16. When the input voltage $V_i$ is at the ground level, the pMOS transistor pulls up the output voltage to the supply voltage level. When the input voltage is raised and exceeds the logical threshold voltage, the nMOS transistor pulls the output voltage down to the ground level. The static CMOS inverter is area-efficient, and its static power dissipation is negligible if the leakage current is small. The large noise margin is another advantage of CMOS inverters: as shown in Fig. 1.16, CMOS inverters can swing the output voltage from the ground level to the supply voltage level. However, CMOS inverters have some disadvantages. One is the operating speed. CMOS inverters use a pMOS transistor to pull up the output voltage, and because a pMOS transistor is slower than an nMOS transistor, the maximum operating frequency is degraded. The other is that CMOS inverters suffer from crosstalk noise and ground bounce. When switching, a CMOS inverter causes a current surge on the power/ground net.

The design freedom of the static CMOS driver lies in the sizes of the pMOS and nMOS transistors. As the transistor size becomes larger, a larger current can flow through the transistors. A large transistor improves the operating speed but increases the power dissipation. At the same time, the gate capacitance becomes larger. The gate capacitance is usually a dominant part of the load capacitance of the preceding logic gates. Therefore gate sizing is one of the important topics in LSI design [26, 27].

Figure 1.14: Energy per bit and the resistance of the terminator (5mm length, $Z_0 = 100\Omega$, 20Gbps signaling).

For a driver of an on-chip transmission-line, an important parameter is the output impedance, in particular its relationship to the characteristic impedance of the transmission-line. In Section 5.2, a sizing method for driving on-chip transmission-lines is proposed.

**CML buffers**

To achieve a higher operating frequency, CML buffers have become a design option for driver/receiver circuits. Figure 1.17 shows a basic CML buffer. The CML buffer is based on a differential architecture. Its main components are two pull-up resistors $R_D$, two nMOS transistors for switching and a current source $I_{\text{tail}}$. The nMOS transistors control the current flow on each side of the differential pair according to the differential input. CML buffers can operate at high frequency because no pMOS transistor is used and the nMOS transistors are always in the saturation region. Figure 1.18 shows the transfer characteristic of the CML buffer. As the differential input $(V_{\text{in}1} - V_{\text{in}2})$ varies, each output voltage varies from $(V_{\text{DD}} - R_D I_{\text{tail}})$ to the supply voltage $V_{\text{DD}}$. Thus the differential output voltage $(V_{\text{out}1} - V_{\text{out}2})$ ranges from $-R_D I_{\text{tail}}$ to $R_D I_{\text{tail}}$. By assigning the two extremes to 0 and 1, the CML buffer can transmit a differential signal.

The CML buffer shown in Fig. 1.17 is the basis of the differential amplifier [28], and a design guideline for its use as a driver has already been discussed [20]. From Fig. 1.17, the design parameters of a CML buffer are the pull-up resistance, the nMOS transistor size and the bias current.

The pull-up resistance $R_D$ should simply be equal to the characteristic impedance of the transmission-line to achieve impedance matching.

The size of the nMOS transistor is determined by the constraint of CML buffer operation. As shown in Fig. 1.18, a certain input voltage is needed to swing the output voltage from $(V_{\text{DD}} - R_D I_{\text{tail}})$ to $V_{\text{DD}}$. When all of the bias current flows through one of the nMOS transistors, the output voltage swing reaches $R_D I_{\text{tail}}$. From the square law of the nMOS drain current, the relationship between the bias current and the minimum input voltage $\Delta V_{\text{in}, \text{min}}$ is expressed by

$$I_{\text{tail}} = \frac{\mu C_{\text{ox}} W}{2L} \Delta V_{\text{in}, \text{min}}^2,$$

(1.2)

where $\mu$ is the mobility, $C_{\text{ox}}$ is the gate capacitance per unit area, and $W$ and $L$ are the gate width and the gate length, respectively. The mobility $\mu$ and the gate capacitance $C_{\text{ox}}$ are determined by the fabrication process, and the gate length $L$ is set to the minimum value of the fabrication process. Therefore the only degree of freedom in the nMOS transistor is the gate width $W$. To drive the next stage, the output voltage has to be at least as large as the minimum input voltage:

$$\Delta V_{\text{out}} \geq \Delta V_{\text{in}, \text{min}}.$$

(1.3)

From Eq. (1.2) and Eq. (1.3), the gate width $W$ is determined by

$$W \geq \frac{2L I_{\text{tail}}}{\mu C_{\text{ox}} \Delta V_{\text{out}}^2}.$$

(1.4)

Equation (1.3) gives the lower limit of the output voltage. The upper limit of the output voltage depends on the threshold voltage $V_{\text{th}}$. For high-speed operation, the nMOS transistors should operate in the saturation region. From this constraint, the maximum output voltage is derived as [20]

$$V_{\text{out}} = R_D I_{\text{tail}} \leq V_{\text{th}}.$$

(1.5)

The last design parameter $I_{\text{tail}}$ is determined from the output voltage swing. The final stage of the CML driver has to drive the transmission-line and the receiver. In an on-chip transmission-line, the loss is not negligible. Therefore the lower bound of the output voltage $\Delta V_{\text{out}}$ is larger than the minimum input voltage of the receiver. Considering the attenuation in the transmission-line, the output voltage $\Delta V_{\text{out}}$ must satisfy

$$\Delta V_{\text{out}} \geq \frac{\Delta V_{\text{receiver}, \text{min}}}{\exp(-\alpha l)},$$

(1.6)

where $\alpha$ and $l$ are the attenuation constant and the length of the transmission-line respectively.
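The design constraints of Eqs. (1.4) to (1.6) can be chained into a small sizing calculation. The sketch below is only one possible reading of these constraints: it takes Eq. (1.6) with equality, sets the pull-up resistance equal to the characteristic impedance, and uses the swing $R_D I_{\text{tail}}$ to obtain the bias current. The function name and all numerical values (process constants, attenuation, receiver sensitivity) are hypothetical.

```python
import math


def size_cml_driver(z0, dv_receiver_min, alpha, length, v_th, mu_cox, gate_length):
    """Return (R_D, dV_out, I_tail, W_min) for a CML driver.

    z0: characteristic impedance [ohm]; alpha: attenuation constant [1/m];
    length: line length [m]; mu_cox: mobility times C_ox [A/V^2].
    """
    r_d = z0                                               # impedance matching
    dv_out = dv_receiver_min / math.exp(-alpha * length)   # Eq. (1.6) with equality
    if dv_out > v_th:                                      # Eq. (1.5) violated
        raise ValueError("required swing exceeds V_th")
    i_tail = dv_out / r_d                                  # swing = R_D * I_tail
    w_min = 2.0 * gate_length * i_tail / (mu_cox * dv_out ** 2)  # Eq. (1.4)
    return r_d, dv_out, i_tail, w_min


# Hypothetical numbers: 100 ohm line, 5 mm long, alpha = 50 /m, 0.2 V receiver
# sensitivity, V_th = 0.3 V, mu*C_ox = 300 uA/V^2, L = 90 nm.
r_d, dv_out, i_tail, w_min = size_cml_driver(
    z0=100.0, dv_receiver_min=0.2, alpha=50.0, length=5e-3,
    v_th=0.3, mu_cox=300e-6, gate_length=90e-9)
print(f"R_D = {r_d:.0f} ohm, swing = {dv_out * 1e3:.0f} mV, "
      f"I_tail = {i_tail * 1e3:.2f} mA, W >= {w_min * 1e6:.1f} um")
```

If the attenuation is so strong that the required swing exceeds the threshold voltage, the constraints cannot be met simultaneously, and the sketch reports this by raising an error.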
1.1.7 Signaling schemes to improve the performance of on-chip communication

As mentioned above, the interconnect bottleneck is becoming a serious problem. Conventionally, repeater insertion is a common method to reduce the interconnect delay [26]. As the required performance increases, several high-performance on-chip communication schemes have been developed [29–33]. As a result, the options in interconnect design are increasing, e.g., single-ended or differential signaling, static CMOS logic or current mode logic, and many special techniques such as wave pipelining. Designers have to choose the most suitable of these design options. However, there are many metrics, such as latency, throughput, interconnect resources and power, and the trade-offs among these metrics should be considered in circuit design.

1.2 Survey of related works

This section introduces related works to distinguish the contributions of this thesis.

1.2.1 Modeling and design of on-chip interconnects

One research category concerning on-chip interconnects is parameter extraction. At low frequency and for short interconnects, inductance is negligible and capacitance is the important parameter. Analytical methods to extract capacitance from the structure of interconnects have been proposed [34, 35]. However, analytical methods can be applied only to limited structures, and thus field-solvers have been developed to treat general structures [15, 36]. As the operating frequency becomes higher, the effect of inductance becomes significant [5–7, 37, 38], and inductance extraction and modeling methods have been discussed [17, 39–46]. At high frequency, the resistance also changes due to skin and proximity effects [16, 47]. For on-chip interconnect structures, the partial-element equivalent-circuit (PEEC) model [48] is suitable, and PEEC-based methods and field-solvers have been developed [10, 15].

In resistance and inductance extraction, the return current distribution is a crucial problem [49–51]. The return current distribution is especially complicated in on-chip interconnects because there are no robust ground wires. Therefore, in LSIs, the return current spreads from adjacent interconnects to distant power/ground wires, and the current distribution depends on the frequency. References [9, 52] point out the effect of the return current distribution on the accuracy of extraction and its impact on the time-domain waveform. At frequencies above the GHz range, the effect of the silicon substrate becomes important as well as that of metal wires [11–13, 53–59]. Orthogonal interconnects, which have been ignored so far, also affect the interconnect characteristics [60, 61]. To analyze such factors more accurately, 3D field solvers that solve Maxwell's equations have been developed [62].

How to model the extracted characteristics is also an important issue [63]. Due to skin and proximity effects and the return current distribution, the characteristics of on-chip interconnects depend on frequency. An interconnect model that can treat the frequency-dependence is necessary for exact circuit simulation, and several methods have been developed [14, 17, 18, 64, 65]. However, the conventional frequency-independent model has the advantage that a number of design methods and techniques have been developed for it so far. Therefore improving the modeling accuracy of the frequency-independent model is also important for circuit design.

1.2.2 Driver and receiver design for high-speed signaling

On-chip drivers and receivers are roughly classified into static CMOS logic and current mode logic (CML).

In the case of the static CMOS driver, impedance matching between the output impedance of the driver and the characteristic impedance of the transmission-line is common practice [26, 66–70]. Impedance matching avoids multiple reflections and minimizes the voltage overshoot. However, in on-chip transmission-lines, the attenuation is strong and the signal amplitude decreases during propagation. In Section 5.2, a method that improves the signal propagation delay by considering the attenuation of the transmission-line is proposed. The proposed method realizes signal propagation at the velocity of the electromagnetic wave.

On the other hand, CML drivers and receivers are used for high-speed signaling. CML was originally used in bipolar logic [71–73], and it is not common in CMOS circuits because CML requires static current flow and its power dissipation is larger than that of static CMOS logic. However, to achieve high-speed operation, CML has become an option even in CMOS circuits. CMOS CML is mainly used for clock distribution, high-speed data transmitters, multiplexers and demultiplexers [21–25, 74–76]. Several design guidelines have been proposed [20]. However, Ref. [20] focuses only on the signal propagation delay, and the trade-off among delay, bandwidth and power dissipation is not discussed well. Section 5.4 proposes a design guideline for CMOS CML drivers/receivers focusing on the bandwidth.

As techniques to improve the performance of the driver and the receiver, pre-emphasis and equalization are common in the transmitters and receivers of Ethernet, optical cable and so on [77]. These techniques have also been discussed for on-chip communication [33, 78]. However, for on-chip interconnection, pre-emphasis and equalization are considered not to be efficient [33]. Thus this thesis focuses on the fundamental driver and receiver circuits.

1.2.3 Signaling methods for on-chip interconnection

Conventionally, the design of long-distance signaling has been discussed from the viewpoint of the signal propagation delay. A common technique to decrease the delay is repeater insertion. The delay of an interconnect is proportional to the product of its resistance and its capacitance [79]. Since both are proportional to the interconnect length, the delay is proportional to the square of the length. Repeater insertion decreases the delay by dividing the long interconnect into shorter segments. A repeater insertion method that considers the inductance has also been proposed [6].
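The quadratic dependence on length, and the benefit of dividing the wire, can be illustrated numerically. The sketch below uses the commonly quoted 50% delay of a distributed RC line, approximately \( 0.38\,rcl^2 \), and represents each repeater by a fixed intrinsic delay. The fixed repeater delay, the wire parameters and the function names are simplifying assumptions made for illustration; a real repeater design also accounts for driver resistance and input capacitance.

```python
def wire_delay(r, c, length):
    """Approximate 50% delay [s] of a distributed RC wire.

    r: resistance per unit length [ohm/m], c: capacitance per unit length [F/m].
    """
    return 0.38 * r * c * length ** 2


def repeated_delay(r, c, length, n_segments, t_repeater):
    """Delay of the same wire split into equal segments with repeaters.

    t_repeater is a fixed per-stage delay (assumption).
    """
    segment = length / n_segments
    return n_segments * (wire_delay(r, c, segment) + t_repeater)


# Hypothetical global wire: 100 ohm/mm, 0.2 pF/mm, 10 mm long, 20 ps repeaters.
r, c, length = 1e5, 2e-10, 10e-3
print(f"unrepeated : {wire_delay(r, c, length) * 1e12:.0f} ps")
for n in (2, 4, 8):
    print(f"{n} segments : {repeated_delay(r, c, length, n, 20e-12) * 1e12:.0f} ps")
```

The wire contribution falls in inverse proportion to the number of segments while the repeater contribution grows linearly, so there is an optimum number of repeaters; the extra repeaters also cost power and area.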

As system integration grows, on-chip global interconnects become a serious bottleneck of the system performance. In high-performance chips, large-capacity and high-throughput on-chip interconnection is necessary [80, 81]. The architecture of microprocessors is moving toward multi-core [3]. Network-on-Chip (NoC), which integrates a micronetwork on a chip, is also discussed [82]. In this era, communication among cores and functional blocks is one of the big challenges. To break through the interconnect bottleneck, several methods have been proposed. One approach is to improve the performance and the reliability of on-chip buses by signal coding [83–87]. The other approach is to develop alternative signaling methods. References [29, 31] propose wave-pipelining for on-chip communication. Wave-pipelining improves the throughput by sending signals with a period shorter than the system clock. A method to propagate signals at the velocity of light [32] and signaling by short pulses [30, 88] have also been proposed. Reference [89] discusses the limit of the bit rate capacity of metal interconnects. On-chip optical interconnects are discussed as a breakthrough for this performance limitation [90–94]. Thus circuit designers have many choices in the design of on-chip interconnection, for example, conventional repeater insertion, differential signaling with CML, and special methods such as wave-pipelining. Each method and technique has been discussed individually, but it is not clear in what situation each should be used.

1.3 Contributions of this thesis

The goal of this research is to provide a comprehensive design guideline for on-chip high-performance interconnection. As described above, there are a number of difficulties in on-chip interconnect design. One strength of this thesis is that the discussion is based on real-chip measurements. The interconnect characteristics are evaluated by real-chip measurements or by field-solver simulations, which verifies the correlation between the models and the phenomena in the real world. The other contribution is trade-off analysis. Circuit design involves a number of trade-offs. Using simplified models, this thesis provides trade-off analyses of on-chip interconnects and of the on-chip driver and receiver. The proposed trade-off analyses make it possible to determine, at an early stage of the design flow, which signaling method should be used and whether the required performance can be realized. The proposed performance models also provide performance prediction for future processes. This thesis is a comprehensive study of on-chip interconnect design, including discussion of the physical structure, interconnect modeling for circuit design, and the design guideline for on-chip signaling.

1.4 Organization of this thesis

This section explains the organization of this thesis and gives an overview of each chapter. Chapter 2 discusses parameter extraction and clarifies which elements have to be considered. Next, a modeling method for frequency-dependent interconnects is proposed in Chapter 3. Then Chapter 4 proposes an analytical performance estimation of on-chip transmission-lines. In Chapter 5, buffer circuits for the driver and the receiver of on-chip interconnects are discussed. Finally, Chapter 6 consolidates the performance estimation methods of Chapter 4 and Chapter 5 and shows a performance estimation of an on-chip signaling system.

1.4.1 Modeling of on-chip interconnects

Chapter 2 discusses the interconnect characteristics and the physical structure. There are three topics. The first topic is return path selection. The return current distribution affects the loop characteristics of interconnects. To extract exact RL values, all return paths would have to be considered. However, this is impossible because there is a huge number of P/G wires in LSIs. As more wires are considered, the extraction accuracy improves but the extraction cost increases undesirably. Therefore the necessary and sufficient P/G wires have to be selected to perform accurate and efficient RL extraction. However, there has been no systematic method of return path selection. The proposed method focuses on the energy dissipated at the P/G wires and uses it to screen return paths. Experimental results reveal that the proposed method enables accurate and computationally efficient RL extraction that takes the return current distribution into account.

The second topic is the effect of orthogonal interconnects. Conventionally, orthogonal interconnects are not considered as a candidate of return path because magnetic coupling is weak. However, from real chip measurements and simulations by a field-solver, it is revealed that the wide and dense orthogonal wires can be a path of current flow and the effect on the interconnect characteristics is not negligible.

The third topic is the effect of the substrate. A conducting substrate affects the characteristics of an on-chip transmission-line. However, in many cases on actual chips, there are P/G wires between the signal wire and the substrate that may shield the substrate coupling. Measurement and simulation results are shown for on-chip transmission-lines with many narrow power/ground wires in a lower layer. Experimental results show that narrow power/ground wires in a lower layer running parallel to the signal wire, which are common in LSI power distribution networks, shield substrate coupling and suppress substrate loss. On the other hand, orthogonal power/ground wires in a lower layer hardly mitigate substrate coupling.

1.4.2 Interconnect RL extraction at a single representative frequency

Chapter 3 discusses the modeling of on-chip interconnects. The extracted interconnect characteristics have to be transformed into a model suitable for circuit design. This chapter proposes a method to determine a single frequency for interconnect RL extraction. The resistance and inductance of interconnects depend on frequency, and hence the extraction frequency strongly affects the modeling accuracy of interconnects. The proposed method determines the extraction frequency based on the transfer characteristic of the interconnect. By choosing the frequency where the transfer characteristic becomes maximum, the extracted RL values model the waveform accurately. It is experimentally verified that the proposed method provides accurate transition waveforms over various interconnect topologies.

1.4.3 Analytical performance estimation of on-chip transmission-lines

Chapter 4 proposes an analytical performance estimation of on-chip interconnects. On-chip global interconnects are considered to be a bottleneck of high-performance LSIs. However, the limitation of on-chip interconnects has not been examined sufficiently. This chapter proposes a performance estimation of on-chip global interconnects based on derived analytic expressions and detailed circuit simulation. Trade-off curves among bit rate, interconnect length, and eye opening both for single-end and for differential signaling are derived. The results show that differential signaling improves signaling performance several times compared with conventional single-end signaling.

As an application of the analytical performance estimation, a design guideline for the resistive termination of on-chip high-performance interconnects is proposed. Resistive termination for on-chip interconnects is one of the fundamental techniques to achieve high-speed signal transmission on LSIs. Resistive termination can improve the bandwidth of on-chip interconnects and the power dissipation. Therefore, a design guideline for resistive termination is necessary. This chapter proposes a method to determine the termination of on-chip interconnects. The termination derived by the proposed method provides minimum sensitivity to process variation as well as maximum eye-opening in voltage.
1.4.4 Driver/Receiver design for high-speed signaling

Chapter 5 discusses the driver and receiver circuits for on-chip interconnects. First, a transistor sizing method for the static CMOS driver is proposed. Sizing the driver so that its impedance matches that of the interconnect is a common method for transmission-line drivers. However, an impedance-matched driver may increase the signal propagation delay because of the strong attenuation of on-chip interconnects. By taking the attenuation into consideration, the proposed tuning achieves signal propagation at the velocity of the electromagnetic wave.

In the latter part of this chapter, a design guideline for a CMOS CML driver is proposed. CML is an architecture for high-speed operation and is becoming an option for on-chip signaling. This chapter proposes a bandwidth-driven design of a CML driver. The proposed method focuses on the pole of the tapered driver and finds a strong correlation between the pole and the eye-diagram. An analytical trade-off between the bandwidth and the power dissipation is provided, and it reveals the cost-performance ratio of increasing the number of stages. The proposed method also provides performance prediction for future processes.

1.4.5 Design methodology of on-chip high-speed signaling

Chapter 6 merges the proposed methods in Chapter 4 and Chapter 5. The consolidated method provides the trade-off analysis between the interconnect length and the maximum bit rate. The performance estimation indicates when the differential signaling with CML buffers is needed and what factor of signaling system limits the entire performance. The proposed method is based on the analytical methods and it can predict the performance of signaling systems from a few parameters such as interconnect length, attenuation, input pulse shape and so on.
Chapter 2

Modeling of on-chip interconnects

2.1 Introduction

This chapter discusses how to model on-chip interconnects. Fundamentally, the factors that decide the behavior of on-chip interconnects are the physical structure and physical constants, e.g., width, length, thickness, resistivity, permittivity and so on. In circuit design, the behavior of interconnects has to be expressed by resistance $R$, inductance $L$, conductance $G$ and capacitance $C$. As the operating frequency increases, on-chip interconnects come to have significant effects on the whole chip performance. Therefore, in high-performance design, accurate modeling of on-chip interconnects is needed. Modeling errors may cause crucial problems such as impedance mismatch and estimation errors in delay and signal transition time. In this chapter, methods to achieve accurate interconnect modeling from the structure and physical constants are discussed.

Conventionally, on-chip interconnects are modeled by the resistance and the capacitance [79]. If the skin-effect and proximity-effect are negligible, the interconnect resistance is determined by the area of the cross section, the length and the resistivity. The capacitance depends on the spatial relationships among adjacent interconnects, and analytical methods [34, 35] and field-solvers [15, 36] have been developed to evaluate the capacitance value. The effect of the inductance has been ignored because the reactance $\omega L$ is much smaller than the resistance $R$ when the operating frequency is in the MHz region. The shunt conductance is also ignored because the dielectric loss tangent of the ILD (interlayer dielectric) is small. The typical ILD in LSIs is silicon dioxide, whose dielectric loss tangent is 0.00068, about one-thirtieth of that of FR4. FR4 (Flame Retardant Type 4) is a resin commonly used for printed circuit boards (PCBs), and its dielectric loss tangent is 0.02. With advances in LSI fabrication technology, microprocessors now operate at multi-GHz clock frequencies. In such high-speed circuits, the effect of on-chip inductance becomes significant and has a remarkable impact on circuit design, e.g., timing analysis and noise estimation [5–7, 95–97]. However, it is hard to extract an accurate inductance for on-chip interconnects. The current flow is important in inductance extraction because the inductance is defined by a current loop and its magnetic flux. If there is a ground conductor whose impedance can be approximated as zero, the signal wire and the ground conductor form a current loop and the inductance can be easily defined. In LSIs, however, the impedance of the power/ground wires is not negligible and the current flow depends on the circuit behavior. Moreover, the conductivity of the silicon substrate of LSIs is typically 10 S/m, so the substrate can also be a current path. It is therefore difficult to specify the current loop in LSIs. This chapter has three major topics. The first topic is how many power/ground wires should be considered in inductance extraction. The second is the current flow in orthogonal interconnects. The last is the effect of the silicon substrate.
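The argument that inductance is negligible in the MHz region but not in the GHz region can be checked with a few lines of arithmetic. The sketch below is illustrative only: the wire resistance and inductance are hypothetical values chosen for a 1 mm global wire and are not taken from this thesis; only the two loss tangents come from the text above.

```python
import math


def reactance_to_resistance_ratio(freq_hz, r_ohm, l_henry):
    """Return omega*L / R for a wire with resistance R and inductance L."""
    return 2.0 * math.pi * freq_hz * l_henry / r_ohm


def crossover_frequency(r_ohm, l_henry):
    """Frequency at which the reactance omega*L equals the resistance R."""
    return r_ohm / (2.0 * math.pi * l_henry)


# Hypothetical values for a 1 mm wire (assumptions, not measured data).
R_WIRE = 20.0     # ohm
L_WIRE = 0.5e-9   # henry

ratio_10mhz = reactance_to_resistance_ratio(10e6, R_WIRE, L_WIRE)
ratio_10ghz = reactance_to_resistance_ratio(10e9, R_WIRE, L_WIRE)
f_cross = crossover_frequency(R_WIRE, L_WIRE)

# Loss tangents quoted in the text: FR4 (0.02) versus silicon dioxide (0.00068).
tan_delta_ratio = 0.02 / 0.00068

print("omega*L/R at 10 MHz:", ratio_10mhz)
print("omega*L/R at 10 GHz:", ratio_10ghz)
print("crossover frequency [Hz]:", f_cross)
print("FR4 / SiO2 loss tangent ratio:", tan_delta_ratio)
```

With these assumed values the reactance is a fraction of a percent of the resistance at 10 MHz and exceeds the resistance at 10 GHz, which is the regime change the paragraph above describes.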

Conventionally, RL extraction is performed considering only the nearest one or two P/G wires [51]. Extraction with one or two P/G wires assumes that almost all of the return current concentrates in the nearest P/G wires. This assumption is valid if wide power/ground wires are adjacent to the signal wire. In reality, however, the return current spreads widely even at multi-GHz frequencies. Reference [9] points out that extraction ignoring the return current distribution causes a serious error in the extracted inductance value. The return current distribution has to be considered when discussing high-performance interconnects [52]. However, there is a tremendous number of P/G wires in LSIs, and it is impossible to consider all of them because of the computational cost. To perform accurate and quick extraction, the P/G wires that carry the dominant return current have to be chosen. However, no systematic selection method has been proposed so far.

In Section 2.2, a method to screen adequate P/G wires is proposed. As the number of P/G wires considered in extraction increases, the extraction error decreases. Energy dissipation is used as an indicator to decide how many P/G wires should be considered. Experimental results show that the energy dissipation of the modeled interconnect system correlates closely with the extraction error. The proposed method iteratively evaluates the energy dissipation while increasing the number of return paths under consideration. Experimental results show that the return current distribution can be calculated without considering the skin-effect, which considerably reduces the computational cost of screening P/G wires. The proposed method avoids the unnecessary extraction cost of considering negligible P/G wires while maintaining the extraction accuracy, since it identifies the necessary and sufficient P/G wires at a small additional computational cost.

Section 2.3 discusses characteristics of on-chip interconnects with orthogonal power/ground wires. Conventionally, orthogonal interconnects are not thought to affect inductance and resistance [5]. However recent designs have wide and dense power/ground wires to achieve robust power delivery. Reference [60] reports that orthogonal ground shields decrease shunt conductance and dispersion by frequency-dependence of interconnect characteristics. However Ref. [60] analyzes a micro-strip structure that has a signal metal wire with a grounded substrate. This interconnect structure is hardly used for on-chip interconnection, because there are grounded metal wires in parallel to the signal wire. Thus it is not clear whether the orthogonal power/ground wires are negligible or not in on-chip interconnect modeling. In this section, the interconnect characteristics are evaluated by real chip measurement and field-solver. Experimental results show that the wide and dense orthogonal wires act as return paths and affect the inductance value of the signal wire.

Section 2.4 discusses the effect of the silicon substrate. A conducting substrate is one of the difficulties in modeling on-chip interconnects. The effect of the silicon substrate and its modeling have been discussed [11, 55, 98, 99]. Substrate coupling in a co-planar interconnect structure on a resistive substrate has been studied. In real chips, however, there are other power/ground wires and signal wires between the signal wire and the substrate. Reference [100] reports that interconnects in lower layers affect the inductance of the transmission-line. The interconnects in lower layers are expected to shield the coupling to the substrate. However, the interconnects in lower layers have various dimensions, directions and wire densities. Therefore it is not clear which wires in lower layers shield substrate coupling. This section reports the measurement results of transmission-lines with narrow ground wires in a lower layer, which represent the P/G wires of standard cells. Measurement results show that the effect of substrate loss depends on the structure of the wires in the lower layer. If the ground wires in the lower layer are orthogonal to the signal wire, they have no shielding effect and substrate loss is significant. On the other hand, if the ground wires in the lower layer are parallel to the signal wire, substrate loss is suppressed. A comparison with numerical analysis by a 3D field-solver shows that substrate loss is negligible when parallel ground wires exist in the lower layer. The experimental results reveal that considering parallel P/G wires in the lower layer is important and that substrate loss is not significant if parallel P/G wires exist between the transmission-lines and the substrate. The contribution of this work is to show which wires in the lower layer affect substrate coupling.

2.2 Return path selection for loop RL extraction

In this section, a method to select adequate wires for extracting the frequency characteristics is proposed. First the problem discussed in this section is introduced. Then the proposed method is described and experimental results are shown.

2.2.1 Effect of the skin-effect on return current distribution

This section shows the effect of the skin-effect on return current distribution. As explained in Section 1.1.3, return current distribution depends on the resistance and the inductance of wires. At high frequencies, the resistance and the inductance of each wire depend on frequency because of the skin- and proximity-effect. Figure 2.1 shows the return current distribution with and without the skin-effect. The interconnect structure is the same as that of Fig. 1.9. In Fig. 2.1, the solid lines show the result with the skin- and proximity-effect considered, and the dashed lines show the result without them. The current distribution obtained by ignoring the skin- and proximity-effect is almost the same as the result obtained by considering them. Therefore the current distribution can be estimated without considering the skin- and proximity-effect.

2.2.2 Indicator for return path selection

An indicator to determine which ground wires should be considered is proposed. The method proposed in this section calculates the energy dissipated at the ground wires when a signal wire is excited. Accurate estimation of the dissipated energy is a necessary condition for the return current distribution to be well estimated. In nature, the loop current flows in the path where the dissipated energy becomes the smallest. As the number of ground wires increases, the freedom of the return current paths increases, and hence the dissipated energy must decrease monotonically as the number of power/ground wires increases. Finally, the dissipated energy approaches a certain value. Therefore the configuration of ground wires whose energy dissipation is close to the saturated value corresponds to an accurate return current distribution.

Figure 2.1: Return current distribution with and without skin effect.

First, the PEEC model of the interconnects is evaluated. As mentioned in Section 2.2.1, skin effect is ignored because skin- and proximity-effects are secondary factors that determine return current distribution and less important. The interconnect resistances are determined by interconnect length, cross section and metal resistivity. The partial-self-inductance is determined similarly. The partial-mutual-inductance between paired wires is determined by the positional relationship of the pair of wires. Thus a PEEC model can be easily constructed by analytical methods [39].

From the PEEC model, the resistance matrix and the inductance matrix are obtained and the return current distribution can be calculated analytically. For low frequency region, ground wires are indexed in the ascending order of resistance. For high frequency region, ground wires are indexed in order of the distance from the signal wire, i.e., the closest ground wire to the signal wire is labeled 1. The return current flowing in the i-th ground is written as $i_i$.

Next, the energy dissipation at ground wires is calculated incrementally. At the beginning, the proposed method evaluates the signal wire only with the closest ground wire. In this case, all return current flows in the closest ground wire. Then the proposed method calculates the energy with two ground wires. When the number of considered ground wires is $n$, the energy consumption is written as $U_n$.

$$U_n = \sum_{j=1}^{n} R_{jj}\, i_j^2,$$

where $R_{jj}$ is the resistance of the $j$-th ground wire. The difference of the energy is defined as

$$\Delta U_n = \frac{U_{n-1} - U_n}{U_{n-1}}.$$  

The difference $\Delta U_n$ represents the relative energy variation when the $n$-th ground wire is added. If the difference $\Delta U_n$ becomes small, the extracted RL values are expected to converge to the accurate values. Therefore the proposed method adds ground wires until the difference $\Delta U_n$ becomes small enough. In extraction, the proposed method uses the ground wire set that makes the difference $\Delta U_n$ small enough.
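As a rough illustration of how $\Delta U_n$ behaves, consider an idealized case that is assumed here and is not one of the experiments in this thesis: $n$ identical ground wires of resistance $R$ share a total return current $I$ equally, inductance is ignored, and the dissipation in each wire is taken as $R\,i_j^2$. Then

$$U_n = \sum_{j=1}^{n} R\left(\frac{I}{n}\right)^2 = \frac{R I^2}{n}, \qquad \Delta U_n = \frac{U_{n-1} - U_n}{U_{n-1}} = 1 - \frac{n-1}{n} = \frac{1}{n}.$$

In this idealized case $\Delta U_2 = 0.5$ and $\Delta U_5 = 0.2$, so the indicator shrinks steadily as wires are added and a threshold on $\Delta U_n$ translates directly into a number of wires.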

As explained so far, return current flows such that the energy of the system becomes minimum. Section 2.2.4 will show experimental results indicating that $\Delta U$ gives the upper limit of the extraction error. To calculate $\Delta U$, a PEEC model has to be created each time the number of ground wires increases. In the proposed method, however, $\Delta U$ is calculated ignoring the skin- and proximity-effect, which saves the extra computational cost of deciding which ground wires have to be considered. The computational cost will be discussed in Section 2.2.4.

2.2.3 Flow of the proposed method

In this section, a return path screening method is proposed. As discussed so far, several return paths have to be considered for accurate RL extraction. The proposed method determines which ground wires should be considered as return paths by evaluating the variation of the energy consumed at the ground wires.

As mentioned in Section 1.1.3, the return current distribution depends on frequency. Therefore the contribution of each ground wire to return current path is also frequency dependent. In low frequency, less resistive wires strongly affect the return current distribution. In high frequency, on the other hand, ground wires that have strong inductive coupling with the signal wire have a great impact on return current distribution. Therefore the return current distribution is frequency dependent as shown in Fig. 2.1. To handle this frequency dependence, two configurations are merged; one is for low frequency, and the other is for high frequency.

The flow of the proposed method is summarized in Figure 2.2. The proposed method increments the number of considered P/G wires and judges from the value of $\Delta U$ whether enough P/G wires have been selected. In the low-frequency case, the energy difference $\Delta U_i$ is evaluated while adding ground wires in order from low-resistance wires to high-resistance ones. As mentioned in Section 2.2.1, at low frequency the resistance of the P/G wires is the dominant factor in the return current distribution, and the return current concentrates in low-resistance wires. The inductances are ignored because the resistance is much larger than the reactance. In the high-frequency case, the energy difference $\Delta U_i$ is evaluated while adding ground wires in descending order of the inductive coupling coefficient $k$. The inductive coupling coefficient is defined by $M/\sqrt{L_1L_2}$, where $M$ is the mutual inductance and $L_1$ and $L_2$ are the self-inductances. The coupling coefficient depends on the geometry of the two wires and is easily calculated from the inductance matrix of the PEEC model. The resistances are ignored in the high-frequency range because the reactance is much higher than the resistance. By combining these two sets, the proposed method obtains an adequate ground wire set that covers low to high frequencies.
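The low-frequency pass and the high-frequency ordering described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in this thesis: all function names and numerical values are hypothetical, the low-frequency currents are obtained by simple resistive current division for a unit return current, and the high-frequency part shows only the ordering by $k$ (evaluating $\Delta U$ there would require solving the inductance matrix, which is omitted). The cut after two wires in the high-frequency ordering is arbitrary and stands in for that omitted evaluation.

```python
import math


def dissipation_resistive(resistances):
    """U_n for a unit return current shared by parallel resistive ground wires.

    With inductance ignored, wire j carries i_j = (1/R_j) / sum_k(1/R_k),
    and U_n = sum_j R_j * i_j**2.
    """
    total_conductance = sum(1.0 / r for r in resistances)
    currents = [(1.0 / r) / total_conductance for r in resistances]
    return sum(r * i * i for r, i in zip(resistances, currents))


def screen_low_frequency(resistances, threshold):
    """Add wires in ascending resistance until Delta U_n <= threshold.

    Returns the indices (into `resistances`) of the selected ground wires.
    """
    order = sorted(range(len(resistances)), key=lambda j: resistances[j])
    selected = [order[0]]
    u_prev = dissipation_resistive([resistances[order[0]]])
    for j in order[1:]:
        selected.append(j)
        u_now = dissipation_resistive([resistances[k] for k in selected])
        delta_u = (u_prev - u_now) / u_prev
        if delta_u <= threshold:
            break
        u_prev = u_now
    return selected


def coupling_coefficient(mutual, self_signal, self_ground):
    """k = M / sqrt(L1 * L2), used to order wires for the high-frequency pass."""
    return mutual / math.sqrt(self_signal * self_ground)


def order_by_coupling(mutuals, self_signal, self_grounds):
    """Ground-wire indices sorted from strongest to weakest coupling."""
    ks = [coupling_coefficient(m, self_signal, lg)
          for m, lg in zip(mutuals, self_grounds)]
    return sorted(range(len(ks)), key=lambda j: ks[j], reverse=True)


# Hypothetical ground-wire resistances in ohms (illustration only).
R_GROUND = [5.0, 5.0, 5.0, 50.0, 50.0, 500.0]
low_set = screen_low_frequency(R_GROUND, threshold=0.15)

# Hypothetical partial inductances in nH; wire 1 couples most strongly.
high_order = order_by_coupling([0.30, 0.45, 0.20], 0.60, [0.60, 0.60, 0.60])

# Merge the two regimes: union of the low- and high-frequency selections.
merged = sorted(set(low_set) | set(high_order[:2]))
print(low_set, high_order, merged)
```

The loop stops at the first wire whose addition changes the dissipated energy by no more than the threshold, and that wire is kept in the set, following the statement that the method uses the ground wire set that makes $\Delta U_n$ small enough.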

2.2.4 Experimental results

This section shows some experimental results to verify that the proposed method can select adequate return paths. Then the computational cost of the proposed method is discussed. It is shown that the calculation cost to select the return paths is much smaller than that of RL extraction.

Figure 2.2: Flow of the proposed method.

Case of uniform interconnect structure

First, the case of simple uniform ground wires is shown. The interconnect structure is the same as that of Fig. 2.1.

Figure 2.3 shows the extraction error and the proposed indicator $\Delta U$ in low frequency region, where the resistance of wires is dominant to the return current distribution. Figure 2.4 shows the result in high frequency where the inductance is dominant to the return current distribution. In these figures, $\Delta Z$ is shown for comparison. $\Delta Z$ is the loop impedance difference and it is defined by

$$
\Delta Z_n = \frac{Z_{n-1} - Z_n}{Z_{n-1}} \tag{2.3}
$$

in the same way as the energy difference $\Delta U$. From Fig. 2.3, the indicator $\Delta U$ has a strong relationship with the extraction error. Extensive experiments show that the value of $\Delta U$ is the upper bound of the estimation error in resistance and inductance, which means that a threshold value of $\Delta U$ can be set that maintains the required estimation accuracy. For example, Fig. 2.3 shows that 10 ground wires are needed for extraction with 10% error. The real extraction error is 10% in loop resistance and 12% in loop inductance when 10 ground wires are considered. The proposed indicator $\Delta U$ also provides a good indication of error convergence in the high-frequency region. For example, Fig. 2.4 shows that 5 ground wires are enough to achieve 10% extraction error. The real extraction error is 5.3% in loop resistance and 3.2% in loop inductance. The proposed method based on $\Delta U$ can thus select adequate return paths. On the other hand, although the convergence tendency of $\Delta Z$ is close to that of the extraction error and $\Delta Z$ could be used as an indicator, it is difficult to set its threshold value because the ratio of the extraction error to $\Delta Z$ varies drastically.

Figure 2.3: Extraction error of the loop impedance and the proposed indicator \( \Delta U \) (uniform structure, at 1MHz).

Figure 2.4: Extraction error of the loop impedance and the proposed indicator \( \Delta U \) (uniform structure, at 100GHz).

Figures 2.3 and 2.4 make it possible to decide which ground wires should be considered. If the required extraction accuracy is 10%, the proposed method selects 10 ground wires. The frequency characteristics considering 10 ground wires are shown in Fig. 2.5. In this result, the skin-effect is ignored to show the effect of the return current distribution clearly. The maximum error of the proposed method is 13%, which is close to the required accuracy. This result verifies that the proposed method realizes adequate return path selection. For comparison, the case where only the nearest ground wire is considered and the case where $\Delta Z$ is used as the indicator are also shown in Fig. 2.5. Figure 2.6 shows the wires selected by each selection method. In Fig. 2.6, “Conventional” shows the case where only the nearest ground wire is considered as the return path, and $\Delta Z$ shows the case where ground wires are selected until $\Delta Z$ becomes 0.1. The maximum error of the conventional method is 92% and that of $\Delta Z$ is 28%. These errors are much larger than that of the proposed method, which confirms the necessity of considering the return current distribution.

Realistic case

In real chips, the interconnect structure is a complicated 3D structure. A bus structure is assumed and its cross section is shown in Fig. 2.7. There are 4$\mu$m wide ground wires at a pitch of 100$\mu$m. These wires represent P/G wires and shielding wires in the bus structure. In the lower layer, there are orthogonal interconnects, but they do not affect the return current distribution. In the layer below that, 1$\mu$m wide ground wires are located at a pitch of 10$\mu$m. This width and pitch correspond to the P/G wires of standard cells.

Figure 2.7: Evaluated interconnect structure.

Figure 2.8: Extraction error of the loop impedance and the proposed indicator $\Delta U$ (realistic structure, at 1MHz).

Figure 2.8 shows the relation between the number of considered return paths and the extraction error in the low-frequency region. Here the resistances of the ground wires dominantly decide the return current distribution, and the return current tends to flow in thick wires. Figure 2.8 shows that the proposed $\Delta U$ gives a good indication of error convergence. On the other hand, although the convergence tendency of $\Delta Z$ is close to that of the extraction error and $\Delta Z$ could be used as an indicator, it is difficult to set its threshold value because the ratio of the extraction error to $\Delta Z$ varies irregularly. In addition, the calculation of $\Delta Z$ requires more computation than that of $\Delta U$, and hence the proposed method adopts $\Delta U$ as the indicator.

Figure 2.9 shows the results in the high-frequency region. In the high-frequency region, the return current distribution depends on inductive coupling, and the return current concentrates in the nearer ground wires. Therefore the proposed method selects return paths starting from the nearest ground wire. In the case of Fig. 2.7, the return current concentrates in the thin wires in the lower layer. Figure 2.9 shows that the convergence of $\Delta U$ is close to that of the extraction error.

Figure 2.10 shows the selected wires when the target $\Delta U$ is set to 10% in Fig. 2.8 and Fig. 2.9. In Fig. 2.10, “Conventional” shows the case where the nearest two ground wires are considered as return paths. The conventional method selects 2 wires and the proposed method selects 16 wires. The extracted loop characteristics are shown in Fig. 2.11. The extraction error of the proposed method is less than 10%, which agrees with the required accuracy. The conventional method causes over 90% error in loop resistance and over 30% error in loop inductance.

Figure 2.9: Extraction error of the loop impedance and the proposed indicator $\Delta U$ (realistic structure, at 100GHz).

Figure 2.10: Considered ground wires for RL extraction (realistic structure).

Figure 2.11: Frequency characteristics of selected interconnects (realistic structure of Fig. 2.10).

Figure 2.12: Extraction cost and the additional cost by the proposed method.

Computational cost

The proposed method needs a certain extra cost to select return paths, and this computational cost is evaluated here. Figure 2.12 shows the relation between the number of considered wires and the extraction cost. The interconnect structure is that of Fig. 2.7, and a field-solver [10] runs on a 750MHz SPARC workstation. The solid line is the extraction cost only, and the dashed line labeled “extraction + return path selection” is the sum of the extraction time and the time to select return paths by the proposed method. The dashed line labeled “increase of extraction time” shows the ratio of the additional cost to the extraction cost. Figure 2.12 shows that the extraction cost increases rapidly as the number of wires increases. On the other hand, the additional cost of the proposed method is relatively small and grows slowly as the number of wires increases. This is because the proposed method ignores the skin-effect when it selects return paths. When the number of wires is large, the additional cost of the proposed method is only several percent of the extraction cost, which is much smaller than the cost of conservatively extracting with many interconnects. When the number of wires is small, the ratio of the additional cost to the extraction cost may exceed 30%, but in this case the absolute value of the additional cost is several seconds at most. From the above discussion, the proposed method can select adequate return paths with negligible additional computational cost.

2.3 Effect of orthogonal interconnects

In the previous section, parallel power/ground wires are discussed. This section discusses the effect of orthogonal interconnects on the characteristics of the signal wire.

Figure 2.13: A signal wire and parallel/orthogonal power/ground wires (top view).

Figure 2.14: Test structure and eddy current in orthogonal interconnects.

2.3.1 Test structure

Figure 2.14 shows the structure of the evaluated interconnects. In the top layer, three wires form a G-S-G co-planar structure. The width of each wire is 4\(\mu\)m and the spacing between wires is 2\(\mu\)m. The interconnect length is 800\(\mu\)m. In the lower layer, orthogonal wires are arrayed and all of them are connected to the ground wires in the top layer. The width \(W\) and the spacing \(S\) of the orthogonal wires are varied. An indicator “density” is defined for the orthogonal wires as \(W/(W+S)\), that is, the fraction of the area occupied by the orthogonal wires.
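The density indicator can be illustrated with a short sketch. The function names are hypothetical, and the spacings computed for the 30% density case are derived here from the definition \(W/(W+S)\); they are not values quoted in the thesis.

```python
def density(width_um, spacing_um):
    """Fraction of the lower-layer area covered by orthogonal wires: W / (W + S)."""
    return width_um / (width_um + spacing_um)


def spacing_for_density(width_um, target_density):
    """Spacing S that yields the target density for a given width W (from d = W / (W + S))."""
    return width_um * (1.0 - target_density) / target_density


# Measured test structure: W = S = 4 um.
d_measured = density(4.0, 4.0)

# Spacings implied by a fixed 30% density at the narrowest and widest widths
# of the field-solver sweep in Section 2.3.2 (derived, not quoted).
s_narrow = spacing_for_density(2.4, 0.30)
s_wide = spacing_for_density(24.0, 0.30)

print(f"density(W=4, S=4) = {d_measured:.2f}")
print(f"S for W=2.4um at 30% = {s_narrow:.1f} um, for W=24um = {s_wide:.1f} um")
```

At a fixed density, a wider orthogonal wire therefore also means a proportionally wider gap, so the width sweep changes the granularity of the pattern rather than the covered area.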

2.3.2 Measurement and simulation results

Conventionally, the orthogonal wires are considered not to affect the inductance of the signal wire because the magnetic coupling between the signal wire and the orthogonal wires is weak. However, when the frequency is high or the width of the orthogonal wires is large, eddy current flows in the orthogonal wires as shown in Fig. 2.14. The eddy current couples with the signal current and affects the interconnect characteristics. Figure 2.15 shows the self-inductance obtained from the measurement of the test structure with \(W = S = 4\mu\)m (density=50\%). The solid line labeled “Co-planar” is the result without orthogonal wires (density=0\%) and the dashed line labeled “Microstrip” is the result with a ground plane in the lower layer (density=100\%). At low frequencies up to 20GHz, the inductances of the co-planar structure with and without orthogonal wires are close. At higher frequencies, however, the orthogonal wires affect the inductance, and the difference between the co-planar structure with and without the orthogonal wires is about 15\% at maximum. Figure 2.16 shows the frequency characteristics for various widths of orthogonal wires from 2.4\(\mu\)m to 24\(\mu\)m. The density of the orthogonal wires is fixed to 30\%. Those structures are evaluated by a 3D field-solver [62]. As the width \(W\) becomes larger, the effect of the orthogonal wires becomes more significant and appears at a lower frequency. When the width of the orthogonal wires is 24\(\mu\)m, the inductance difference caused by the orthogonal wires exceeds 25\% at 50GHz. Figure 2.17 shows the inductance characteristics for various wire densities. The wire width \(W\) is fixed to 24\(\mu\)m. As the wire density becomes higher, the inductance with the orthogonal wires approaches that of the micro-strip structure. The maximum difference from the inductance of the co-planar structure is about 45% at 50GHz. Figure 2.18 shows the self-inductance at 10GHz versus the width and the density of the orthogonal wires. As shown in Fig. 2.18, the inductance is approximately linear in the width and the density. The maximum difference is about 40%.

Figure 2.15: Self-inductance of a co-planar line with orthogonal P/G wires (W = S = 4μm, measured).

Figure 2.16: Relationship between the self-inductance and the wire width of the orthogonal wires (density = 30%, field-solver).

### 2.4 Effect of lossy substrate

This section discusses the effect of the lossy substrate and shows that the power/ground wires of standard cells prevent the coupling between the signal wire and the substrate.

#### 2.4.1 Test structure

The cross section and the top view of the test structures are shown in Fig. 2.19. In the top layer (M5), three wires form a G-S-G co-planar structure. Each line width is 4μm and the spacing between signal and ground is $S = 2μm$ or $S = 19μm$. The length of the co-planar line is 600μm. In the lowest layer (M1), grounded wires that represent the power/ground wires of standard cells are arrayed. The M1 wires are located uniformly in an area of $180μm \times 600μm$, as shown in Fig. 2.19. Figure 2.20 shows the detailed structure of the M1 wires, which fall into two main types. In the “parallel” structure, the M1 ground wires are parallel to the co-planar wires in M5. In the “orthogonal” structure, the M1 wires are orthogonal to the co-planar wires. The width of the M1 wires is 1.2μm, and the spacing is 1.2μm or 9.6μm. The dimensions of the M1 wires are chosen assuming a 0.18μm standard cell library. All M1 wires are connected to each other at their ends and connected to the ground pad. For comparison, test structures with a ground plane in the M1 layer and test structures without M1 wires are also evaluated.

Figure 2.21 shows a micrograph of a test structure. The interconnect characteristics are evaluated by using a network analyzer.
Figure 2.17: Relationship between the self-inductance and the density of the orthogonal wires ($W = 24\mu m$, field-solver).

Figure 2.18: Inductance vs. the width and the density of orthogonal wires (at 10GHz, field-solver).

Figure 2.19: The cross section and the top view of test structures.

2.4.2 Measurement results

Figure 2.22 shows the measured self-resistance of the test structure whose spacing $S$ is 2$\mu$m. The curve labeled “plate” is the result with a ground plane placed in the M1 layer, and “w/o M1” is the result with no ground wires in M1. The measurement results are shown as lines with points. The dotted lines labeled “simulated” are the simulation results of a 3D field-solver [62] that does not consider substrate conductivity (substrate conductivity is set to 0). If the measurement result (labeled “measured”) agrees with the simulation result, the substrate effect is not significant. In Fig. 2.22, the measured results are close to the simulation results, which means that the substrate effect is not significant when the spacing $S$ is 2$\mu$m.

Figure 2.23 shows the resistance when the spacing $S$ is 19$\mu$m. For “w/o M1” and “orthogonal”, the simulation underestimates the self-resistance by about 30%. This is because the simulation ignores the substrate effect. On the other hand, the simulation results agree with the measured results for “parallel” and “plate”. Parallel M1 wires behave as a current return path and shield the magnetic coupling between the signal wire and the substrate. Therefore, if parallel ground wires or a ground plane exist in the lower layer, the substrate effect on the interconnect characteristics is negligible.

From the above discussion, parallel ground wires in the M1 layer prevent the substrate coupling even if the signal-ground spacing is 19$\mu$m. On the other hand, orthogonal wires cannot shield the substrate coupling. If there are P/G wires in the lower layer and they are parallel to the signal wire, the substrate loss is not significant. Considering the wires in the lower layer is therefore important for accurate modeling, even though the P/G wires are narrow.

Figure 2.22: Self-resistance (spacing $S = 2\mu m$).

Figure 2.23: Self-resistance (spacing $S = 19\mu m$).
2.5 Summary

A return path screening method for interconnect RL extraction is proposed. The proposed method evaluates the energy dissipated at ground wires, and judges whether the energy dissipation in the ground wires is small enough or not, because the return current flows in the paths with the minimum energy consumption in nature.

The return current distribution strongly depends on frequency. At low frequency, the resistance is dominant and the inductance becomes significant as frequency becomes higher. The proposed method calculates two windows for the resistance-dominant low-frequency region and for the inductance-dominant high-frequency region. By merging these two windows, the proposed method provides an adequate ground wire configuration that enables accurate extraction at all frequencies.

The proposed method can also save extraction cost. The extraction cost of a 3D field-solver increases exponentially as the number of considered wires increases. The proposed method can select adequate ground wires with negligibly small extra computational cost. Therefore the proposed method enables accurate and efficient extraction that considers the return current distribution.

The experimental results show that the orthogonal wires are not negligible in the multi-GHz region. The effect of the orthogonal wires on the self-inductance is approximately linear in the width and the density. The maximum error in inductance caused by ignoring the orthogonal wires is about 40% at 10GHz. Real-chip measurements and field-solver results reveal that orthogonal power/ground wires should be considered in the evaluation of the interconnect characteristics.

Substrate coupling and the ground wires in lower layer are discussed. From measurement results, shielding effect of the ground wires in lower layer depends on the direction of wires. The ground wires that are parallel to the signal wire prevent the magnetic coupling and suppress the substrate loss. When ground wires in lower layer are orthogonal to the signal wire, the self-resistance is almost the same as the result of the interconnect without M1 wires. Therefore the substrate loss is negligible if dense power/ground grid such as P/G for standard cells exists in the lower layer.
Chapter 3

Interconnect RL extraction at a single representative frequency

3.1 Introduction

In this chapter, a method to determine a single frequency for interconnect RL extraction is proposed. One difficulty of interconnect modeling is the frequency dependence of the characteristics. As shown in the previous chapter, interconnect characteristics, especially resistance and inductance, depend on frequency because of skin-effect, proximity-effect and return current distribution [5]. In frequency-dependent interconnects, behavior such as attenuation and phase velocity dispersion depends on frequency. In digital circuits, pulse waveforms are commonly used. The frequency spectrum of a pulse waveform spreads widely from DC to several times the clock frequency. Therefore, to model the behavior of interconnects precisely, designers have to take the frequency characteristics into consideration. Moreover, the input pulse pattern is not entirely periodic. The frequency spectrum varies depending on the pulse width and the period. The minimum pulse width and period are determined by the system clock, but on a signal line the pulse pattern depends on the circuit behavior.

To treat frequency-dependent interconnects, several frequency-dependent circuit models have been proposed [16–18, 64]. However, the frequency-dependent models are still not tractable in circuit design because frequency-independent interconnect models have been widely used. Moreover, using a frequency-dependent model increases the computational cost of simulation and requires extra cost for creating the model. If interconnect characteristics can be modeled well at a certain frequency, such extra cost is saved and the design techniques based on frequency-independent interconnect models can be applied, such as circuit reduction, buffer insertion and timing analysis [5, 26, 95]. Furthermore, frequency-independent RLC values give an intuitive prediction of fundamental interconnect characteristics such as characteristic impedance. However, determining a single extraction frequency is difficult.

In this chapter, an extraction frequency based on the transfer characteristic of interconnects is proposed. The extraction frequency is commonly determined from the shape of the input signal waveform, especially from the rise time, focusing on the spectrum of the input signal [5]. This is natural and reasonable when the incident waveform at the near-end (driver output) of the interconnects is discussed. On the other hand, the main interest of circuit designers is the analysis of the waveform at the far-end (receiver input). As signals propagate through an interconnect, high-frequency components attenuate easily. The dominant frequency components that determine the far-end waveform are different from those for the near-end waveform. Observation of the frequency spectrum at the far-end shows that the transfer characteristic of interconnects plays an important role in the far-end waveform. Therefore the proposed method focuses on the transfer characteristic of interconnects and selects, as the frequency for interconnect RL extraction, the frequency where the transfer characteristic becomes maximum.

First the transfer characteristic of uniform open-ended transmission-lines is discussed. In the case of uniform transmission-lines, the transfer characteristic becomes maximum at the resonance frequency where the quarter wavelength is equal to the interconnect length. Therefore the resonance frequency is obtained from the interconnect length and the velocity of the electromagnetic wave. Experimental results reveal that an RLC ladder circuit extracted at the resonance frequency provides accurate modeling of the waveforms at the far-end. The errors in the voltage amplitude, the signal propagation delay and the amplitude of crosstalk noise are less than 8%, whereas extraction at DC or at the significant frequency causes more than 30% error. Then an extended method that can handle nonuniform or branching interconnects is proposed. If the interconnect is branching or nonuniform, the transfer characteristic of each segment is different. The proposed method in this chapter gives a respective extraction frequency to every segment of an interconnect instead of enforcing a single extraction frequency on the entire interconnect. The proposed method systematically determines the extraction frequencies successively from the sinks to the source by replacing the downstream interconnect with the equivalent load impedance. Experimental results validate that the equivalent circuit of interconnects extracted at the proposed frequency achieves the most accurate waveform modeling compared with the conventional extraction frequencies. Experimental results show that the maximum errors are less than 10% in signal delay and signal transition time. The contribution of this chapter is that the proposed method realizes accurate transient analysis using a frequency-independent interconnect model. The proposed method is effective when the topology and the length of interconnects are fixed, for example, in post-layout extraction.

3.2 Conventional extraction frequencies

In digital circuits, a trapezoidal pulse that contains multiple frequency components is a common waveform. To model long interconnects that have transmission-line characteristics, the RLC ladder circuit in Fig. 3.1 is used. This ladder model cannot consider the frequency dependence of interconnect RL values. In order to derive the frequency-independent model of Fig. 3.1, a single extraction frequency has to be chosen.

There are several representative frequencies of a pulse waveform. One of them is the clock frequency. The width of the clock pulse is the minimum width of the signal pulse. When the clock period is $T_p$, the frequency $f_p = 1/T_p$ is one of the peaks in the frequency spectrum of the input pulse. Another is the significant frequency $f_{\text{sig}}$. The significant frequency $f_{\text{sig}}$ is expressed by the signal transition time $t_t$ and defined such that the signal energy from DC to $f_{\text{sig}}$ becomes 85% of all signal energy. An example of the frequency spectrum and the significant frequency is shown in Fig. 3.2. The pulse width $T_w$ is
3.3 Representative frequency for uniform transmission-lines

In this section, the transfer characteristic of open-ended uniform transmission-lines is discussed. From the transfer characteristic of transmission-lines, the important frequency component at the far-end is determined. A more general case is discussed in Section 3.5.

3.3.1 Transfer characteristic of open-ended transmission-lines

As mentioned in Section 3.2, conventional extraction frequencies are based on the input pulse shape. This means that the conventional methods focus on the frequency components at the near-end of interconnects. However, the far-end waveform is more important for circuit designers because it directly affects signaling delay. The far-end waveform becomes totally different because of attenuation and reflection. The proposed method aims to express far-end waveforms accurately. Figure 3.3 shows step responses obtained with a frequency-dependent (FD) model and a ladder extracted at the significant frequency $f_{sig}$. The interconnect structure is shown in Fig. 1.12 and the output impedance of the driver $R_d$ is 10Ω. As shown in Fig. 3.3, the ladder extracted at $f_{sig}$ models the incident wave of interconnects well, but a remarkable error occurs at the far-end. This error is mainly caused by overestimation of attenuation. The balance of the driver resistance $R_d$ and the characteristic impedance $Z_0$ determines the incident wave. Characteristic impedance is expressed as

$$Z_0 = \sqrt{\frac{R + j\omega L}{G + j\omega C}} \approx \sqrt{\frac{L}{C}}. \tag{3.1}$$

Approximately, characteristic impedance is proportional to square root of inductance $\sqrt{L}$. The attenuation of interconnect affects the waveform at the far-end. If the shunt conductance $G$ is negligible, the attenuation constant $\alpha$ is expressed as

$$\alpha = \text{Re} \left[ \sqrt{(R + j\omega L)(G + j\omega C)} \right] \approx \frac{R}{2} \sqrt{\frac{C}{L}}. \tag{3.2}$$

In LSIs, the shunt conductance $G$ is negligible because the dielectric loss tangent of silicon-dioxide is small. The typical value is 0.0006. The attenuation constant is roughly proportional to resistance $R$ and inversely proportional to the square root of inductance $\sqrt{L}$. From the above equations, variation of resistance strongly affects waveform propagation. Moreover, as shown in Fig. 1.12, the variation of resistance is larger than that of inductance. At 34GHz in Fig. 1.12, inductance decreases by about 21% from DC and resistance increases by about 110% from DC. The inductance decreases because of proximity-effect and the decrease in internal inductance, so the inductance value saturates at high frequency. On the other hand, resistance keeps increasing as frequency becomes higher. Therefore the estimation of resistance is critical for modeling far-end waveforms. The attenuation also strongly depends on the interconnect structure, such as the interconnect length. From the above discussion, the interconnect structure has to be considered when determining an extraction frequency.
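As a rough worked example of this sensitivity, the sketch below applies the low-loss proportionalities $Z_0 \propto \sqrt{L}$ and $\alpha \propto R/\sqrt{L}$ to the changes quoted for 34GHz (resistance about +110%, inductance about -21% relative to DC). The function name is hypothetical, capacitance is assumed to be frequency-independent, and the resulting ratios are derived here rather than taken from the thesis.

```python
import math


def low_loss_ratios(r_ratio, l_ratio):
    """Return (Z0 ratio, alpha ratio) relative to DC under the low-loss approximations.

    Z0 ~ sqrt(L / C) and alpha ~ (R / 2) * sqrt(C / L), with C assumed constant.
    """
    z0_ratio = math.sqrt(l_ratio)
    alpha_ratio = r_ratio / math.sqrt(l_ratio)
    return z0_ratio, alpha_ratio


# Values read from the text: R increases by about 110%, L decreases by about 21% at 34 GHz.
z0_ratio, alpha_ratio = low_loss_ratios(r_ratio=2.10, l_ratio=0.79)

print(f"Z0 changes by a factor of {z0_ratio:.2f}")
print(f"alpha changes by a factor of {alpha_ratio:.2f}")
```

Under these assumptions the characteristic impedance shifts by roughly 11% while the attenuation constant more than doubles, which is consistent with the observation that a ladder extracted at $f_{sig}$ reproduces the incident wave but overestimates attenuation at the far-end.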

To determine an extraction frequency for modeling the far-end of interconnects, the dominant frequency component at the far-end has to be identified. From the theory of open-ended transmission-line resonators, when the quarter wavelength $\lambda/4$ is equal to the interconnect length $l$, a transmission-line is equivalent to the series resonator shown in Fig. 3.4. When the interconnect length is $l$, the resonance frequency $f_{res}$ is expressed by

$$\frac{\lambda}{4} = l \Rightarrow f_{res} = \frac{\nu}{\lambda} = \frac{\nu}{4l}, \tag{3.3}$$

where $\nu$ is the velocity of the electromagnetic wave. The velocity $\nu$ is calculated by the expression $\nu = c/\sqrt{\epsilon_r \mu_r}$ where $c$ is the velocity of light in free space, $\epsilon_r$ is the relative permittivity and $\mu_r$ is the relative permeability. At the frequency $f_{res}$, the impedance of the series resonator becomes minimum and the attenuation of the frequency component at $f_{res}$ is minimum. This property is used in quarter-wavelength transmission-line resonators. Figure 3.5 shows the transfer characteristic of a transmission-line. The interconnect structure is the same as in Fig. 1.12 and the interconnect length is 5mm. The relative permittivity of SiO$_2$ is 4.0, so the velocity of the electromagnetic wave is $1.5\times10^8$ m/s. The frequency characteristics are extracted by a 2D field-solver and modeled by the frequency-dependent model [18]. In this case, the

Figure 3.3: Waveform at near-end and far-end. (interconnect structure is shown in Fig. 1.12, $Z_0 = 55\Omega$, $R_d = 10\Omega$)

Figure 3.4: Open-ended transmission-line and equivalent series resonator.

resonance frequency $f_{\text{res}}$ is $\nu/(4l) = 1.5 \times 10^8/(4 \times 5 \times 10^{-3}) = 7.5\text{GHz}$. The voltage gain $V_{\text{out}}/V_{\text{in}}$ becomes maximum at this frequency. Therefore the frequency component at $f_{\text{res}}$ strongly affects the waveform at the far-end.

This transfer characteristic affects the waveform at the far-end of transmission-lines. The frequency components near the resonance frequency tend to appear at the far-end. On the other hand, the frequency components near the antiresonance frequency hardly affect the waveform at the far-end. Therefore the frequency spectrum at the far-end depends on the input pulse and the transfer characteristic of the interconnect. If the frequency spectrum of the input pulse spreads over the resonance frequency, the frequency components around the resonance frequency are expected to affect the waveform at the far-end. If the resonance frequency is higher than the significant frequency, the frequency components around the resonance frequency are small because almost all of the frequency components concentrate in the range from DC to the significant frequency.
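The resonance peak can be reproduced with a minimal sketch. It assumes a lossless, frequency-independent line driven by an ideal source through $R_d$ with an open far-end, for which the gain is $1/|\cos\theta + j(R_d/Z_0)\sin\theta|$ with $\theta = 2\pi f l/\nu$. The thesis itself uses a field-solver-based frequency-dependent model, so this only illustrates where the peak lies. The function names are hypothetical; $Z_0 = 55\Omega$ and $R_d = 10\Omega$ are the values given in the caption of Fig. 3.3.

```python
import math

C0 = 299_792_458.0  # velocity of light in free space [m/s]


def wave_velocity(eps_r, mu_r=1.0):
    """Velocity of the electromagnetic wave, c / sqrt(eps_r * mu_r)."""
    return C0 / math.sqrt(eps_r * mu_r)


def resonance_frequency(length_m, eps_r, mu_r=1.0):
    """Quarter-wavelength resonance frequency, v / (4 l), as in Eq. (3.3)."""
    return wave_velocity(eps_r, mu_r) / (4.0 * length_m)


def lossless_gain(freq_hz, length_m, eps_r, z0_ohm, rd_ohm):
    """|V_far / V_source| of a lossless open-ended line driven through rd_ohm (assumption)."""
    theta = 2.0 * math.pi * freq_hz * length_m / wave_velocity(eps_r)
    return 1.0 / math.hypot(math.cos(theta), (rd_ohm / z0_ohm) * math.sin(theta))


LENGTH = 5e-3   # 5 mm interconnect
EPS_R = 4.0     # relative permittivity of SiO2

f_res = resonance_frequency(LENGTH, EPS_R)

# Scan 0.05 GHz .. 15 GHz and locate the frequency with the largest gain.
freqs = [i * 0.05e9 for i in range(1, 301)]
f_peak = max(freqs, key=lambda f: lossless_gain(f, LENGTH, EPS_R, 55.0, 10.0))

print(f"f_res = {f_res / 1e9:.2f} GHz, gain peak found at {f_peak / 1e9:.2f} GHz")
```

With the exact value of $c$ the resonance frequency comes out marginally below the 7.5GHz obtained with the rounded velocity $1.5\times10^8$ m/s. In this idealized model the gain at resonance is $Z_0/R_d$, so the peak only exists when the driver impedance is lower than the characteristic impedance; line loss would reduce it further.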

Figure 3.6 shows the frequency spectrum of the waveform at the far-end when a transmission-line is driven by a voltage source and a resistor. The transition time $t_t$ is varied from 10ps to 50ps, and hence the significant frequency changes from 34GHz to 6.8GHz. The resonance frequency is 7.5GHz. In this case, the significant frequency is nearly equal to or higher than the resonance frequency. The frequency spectrum has a unique peak at the resonance frequency even if the signal transition time $t_t$ is changed. This result indicates that the frequency component at the resonance frequency is an important factor that determines the waveform at the far-end of the interconnect. If the transmission-line is uniform and open-ended, the frequency $f_{res}$ is determined only by the interconnect length because the velocity $\nu$ can be assumed to be constant. This resonance frequency $f_{res}$ is proposed as an extraction frequency. In the following sections, the resonance frequency $f_{res}$ is denoted as $f_{\text{proposed}}$.

In Section 3.4, experimental results are shown to verify the accuracy of several extraction frequencies: DC, the proposed frequency $f_{\text{proposed}}$ and the significant frequency $f_{\text{sig}}$.

### 3.4 Experimental results of uniform transmission-lines

This section shows some experimental results. The modeling accuracy of each representative frequency is verified by circuit simulation. First experimental conditions and some metrics of accuracy are explained. Then the accuracy under various experimental conditions is evaluated, and the results show that the proposed frequency $f_{\text{proposed}}$ provides the most accurate modeling. The experimental results also reveal that the ladder extracted at frequency $f_{\text{proposed}}$ is accurate enough to simulate interconnect behaviors.

#### 3.4.1 Experimental conditions and the metrics of accuracy

In this section, experimental conditions and metrics of accuracy are described. To cover major possible situations in digital circuits, the proposed method is verified under various frequency-dependence and various waveforms. Frequency-dependence of interconnects is determined by the interconnect structures. Waveform variation is expressed by pulse period, duty ratio and transition time. Therefore the following parameters are varied for evaluation of the proposed and the conventional representative frequencies.

- pulse period and duty ratio ($f_{p}$ changes and others are fixed).

Figure 3.6: Frequency spectrum of waveform at the far-end.

Table 3.1: Range of parameters and representative frequencies.

<table>
<thead>
<tr>
<th>Parameter range</th>
<th>Corresponding freq. range</th>
</tr>
</thead>
<tbody>
<tr>
<td>$500\text{ps} \leq T_p \leq 5\text{ns}$</td>
<td>$200\text{MHz} \leq f_p \leq 2\text{GHz}$</td>
</tr>
<tr>
<td>$10\text{ps} \leq t_t \leq 100\text{ps}$</td>
<td>$3.4\text{GHz} \leq f_{\text{sig}} \leq 34\text{GHz}$</td>
</tr>
<tr>
<td>$0.5\text{mm} \leq l \leq 10\text{mm}$</td>
<td>$3.75\text{GHz} \leq f_{\text{proposed}} \leq 75\text{GHz}$</td>
</tr>
</tbody>
</table>

- pulse transition time ($f_{\text{sig}}$ changes and others are fixed).
- interconnect length ($f_{\text{proposed}}$ changes and others are fixed).
- interconnect structure and driver strength.

First the variation of the input pulse pattern is discussed. Pulse period and duty ratio are varied, which corresponds to various input pulse patterns on signal interconnects. In this experiment, the pulse frequency $f_p$ changes according to the pulse period, and the other extraction frequencies are fixed. Next, variation of the pulse transition time is discussed. Transition time decides the significant frequency, so $f_{\text{sig}}$ varies and the others are fixed in this experiment. Then the case in which the interconnect length changes is verified. The frequency $f_{\text{proposed}}$ varies with the interconnect length, and the other frequencies are fixed. The range of each parameter is listed in Table 3.1. When a parameter is changed, the corresponding representative frequency also changes. The ranges of the representative frequencies are also listed in Table 3.1.
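The correspondence between parameter ranges and frequency ranges in Table 3.1 can be checked with a short sketch. The relation $f_p = 1/T_p$ and $f_{\text{proposed}} = \nu/(4l)$ follow the text, while the factor 0.34 in $f_{\text{sig}} = 0.34/t_t$ is inferred here from the paired values in the text (10ps and 34GHz, 100ps and 3.4GHz) and is not quoted from a formula in this section. The function names are hypothetical.

```python
V_SIO2 = 1.5e8  # wave velocity in SiO2 [m/s], the rounded value used in Section 3.3


def pulse_frequency(period_s):
    """f_p = 1 / T_p."""
    return 1.0 / period_s


def significant_frequency(transition_s):
    """f_sig = 0.34 / t_t (factor inferred from the values in the text)."""
    return 0.34 / transition_s


def proposed_frequency(length_m, velocity=V_SIO2):
    """f_proposed = v / (4 l), the quarter-wavelength resonance frequency."""
    return velocity / (4.0 * length_m)


# (lowest frequency, highest frequency) for each parameter range in Table 3.1.
ranges_hz = {
    "f_p": (pulse_frequency(5e-9), pulse_frequency(500e-12)),
    "f_sig": (significant_frequency(100e-12), significant_frequency(10e-12)),
    "f_proposed": (proposed_frequency(10e-3), proposed_frequency(0.5e-3)),
}

for name, (low, high) in ranges_hz.items():
    print(f"{name}: {low / 1e9:.2f} GHz to {high / 1e9:.2f} GHz")
```

The printed ranges match the right-hand column of Table 3.1, and they show that for interconnects of a few millimeters $f_{\text{proposed}}$ lies between the pulse frequency and the significant frequency of a fast edge.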

The above conditions are examined with various interconnect structures and driver output impedances. On transmission-lines, the waveform strongly depends on the relation between the characteristic impedance of the interconnect and the output impedance of the driver. As for interconnect structures, two popular structures, micro-strip and co-planar, are used. The cross-sections of the two interconnect structures are shown in Fig. 3.7. $W_s$ is the width of the signal interconnect, $W_g$ is the width of the ground line, $S$ is the spacing between signal interconnects and $S_g$ is the spacing between the signal interconnect and the ground line. The parameters of the structure are varied in the ranges of $1\mu m \leq W_s \leq 8\mu m$, $8\mu m \leq W_g \leq 40\mu m$, $2\mu m \leq S \leq 8\mu m$ and $2\mu m \leq S_g \leq 8\mu m$. For circuit simulation, the RLC values are extracted from the interconnect structures by a field-solver [15]. The interconnects are expressed by the equivalent circuit shown in Fig. 3.8. The number of ladder sections is 51. The equivalent circuit is synthesized using the RLC values at the extraction frequency.
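To give a feel for the granularity of a 51-section ladder, the sketch below estimates the propagation delay of one section and compares it with the shortest transition time in Table 3.1. It assumes equal-length sections and the rounded velocity $1.5\times10^8$ m/s from Section 3.3; the rule of thumb that a section delay should be much shorter than the transition time is a general lumped-modeling guideline, not a criterion stated in the thesis. The function name is hypothetical.

```python
V_SIO2 = 1.5e8  # wave velocity in SiO2 [m/s], the rounded value used in Section 3.3


def section_delay(length_m, sections=51, velocity=V_SIO2):
    """Propagation delay of one ladder section, assuming equal-length sections."""
    return (length_m / sections) / velocity


delay_5mm = section_delay(5e-3)     # 5 mm line used in Sections 3.3 and 3.4.2
delay_10mm = section_delay(10e-3)   # longest line in Table 3.1

shortest_transition = 10e-12        # 10 ps, shortest transition time in Table 3.1
margin = shortest_transition / delay_10mm

print(f"section delay: {delay_5mm * 1e12:.2f} ps (5 mm), {delay_10mm * 1e12:.2f} ps (10 mm)")
print(f"shortest transition time / longest section delay = {margin:.1f}")
```

Even for the longest line, one section delays the signal by a fraction of the fastest edge, which suggests that discretization is unlikely to be the dominant source of error in the comparisons that follow.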

In transient analysis, the voltage waveforms are evaluated on the experimental circuit shown in Fig. 3.9. One of the two lines is stimulated by the input pulse, and the other is quiet. The stimulated line is labeled “Aggressor” and the quiet line “Victim”. The near-end of each line is terminated by a resistor that represents the output impedance of the driver. The output impedance of the driver is varied from 10$\Omega$ to 100$\Omega$. The far-end of each line is connected to a capacitive load that corresponds to the input capacitance of a receiver. The value of the capacitive loads is fixed to 50fF.

To verify the modeling accuracy, evaluation metrics are necessary. The $V_{dd}/2$ propagation delay time (Delay), the amplitude of overshoot/undershoot ($V_{pp}$) and the amplitude of far-end crosstalk noise

Figure 3.7: Cross-sections of interconnects.

Figure 3.8: Equivalent circuit of coupled transmission-line.

Figure 3.9: Experimental circuit for transient analysis.

Figure 3.10: Definition of delay time, peak-to-peak voltage and crosstalk.

\((V_{\text{noise}})\) are used as evaluation metrics. Figure 3.10 shows the definition of delay time, peak-to-peak voltage and crosstalk. These metrics are evaluated on the ladder extracted at each representative frequency and on the frequency-dependent model. The result of the frequency-dependent model is taken as the reference data. This means that evaluation results close to those of the frequency-dependent model are regarded as accurate.
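A minimal sketch of how such metrics might be computed from sampled waveforms is shown below. Fig. 3.10 is not reproduced here, so the exact definitions are assumptions: delay is taken as the time between the first upward $V_{dd}/2$ crossings of the input and the far-end waveform, peak-to-peak voltage as the maximum minus the minimum of the waveform, and the error as the signed deviation from the frequency-dependent reference. All function names and the sample data are hypothetical.

```python
def crossing_time(times, volts, level):
    """First time the waveform crosses `level` upward, using linear interpolation."""
    for i in range(1, len(times)):
        v0, v1 = volts[i - 1], volts[i]
        if v0 < level <= v1:
            frac = (level - v0) / (v1 - v0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("waveform never crosses the requested level")


def delay_half_vdd(times, v_in, v_out, vdd):
    """Vdd/2 propagation delay from the input waveform to the far-end waveform."""
    return crossing_time(times, v_out, vdd / 2.0) - crossing_time(times, v_in, vdd / 2.0)


def peak_to_peak(volts):
    """Peak-to-peak amplitude, assumed here to be max minus min of the samples."""
    return max(volts) - min(volts)


def relative_error_percent(value, reference):
    """Signed error of a ladder-model metric against the frequency-dependent reference."""
    return 100.0 * (value - reference) / reference


# Hypothetical sampled waveforms (time in ps, voltage in V), not thesis data.
t = [0.0, 1.0, 2.0, 3.0, 4.0]
v_in = [0.0, 1.0, 1.0, 1.0, 1.0]
v_out = [0.0, 0.0, 0.4, 1.2, 0.9]

delay = delay_half_vdd(t, v_in, v_out, vdd=1.0)
vpp = peak_to_peak(v_out)
err = relative_error_percent(1.09, 1.00)

print(f"delay = {delay:.3f} ps, Vpp = {vpp:.2f} V, error = {err:+.1f}%")
```

With this sign convention a positive error means that the ladder overestimates the metric, which matches the way the errors are reported in Tables 3.2 and 3.3.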

### 3.4.2 Pulse pattern versus accuracy

The accuracy when the input pulse pattern changes is discussed. In digital circuits, the width and period of the pulse depend on the applied input pattern. As discussed in Section 3.3, the frequency spectrum changes when the width \(T_w\) and the period \(T_p\) vary. The minimum \(T_p\) is determined by the clock frequency. \(T_w\) and \(T_p\) depend on the input pulse pattern. When the logic state of the interconnect does not change, \(T_w\) and \(T_p\) become large and the duty ratio changes. For digital circuits, the interconnect model has to be accurate even when \(T_p\) and \(T_w\) change. In this experiment, the peak-to-peak voltage and the signal delay time are used as evaluation metrics because crosstalk is not assumed to depend on the pulse period.

The errors in peak-to-peak voltage and delay time are shown in Fig. 3.11 and Fig. 3.12. The evaluated interconnect structure is a co-planar structure with 8\(\mu\)m signal wire width, 20\(\mu\)m ground wire width, 4\(\mu\)m spacing between interconnects and 5mm length. The output impedance of the drivers is 50\(\Omega\). The transition time of the input pulse is 10ps, and the period is 500ps. The x-axis is the period of the pulse \(T_p\), where the duty ratio is set to 50%. As shown in the figures, the errors of DC, \(f_{\text{proposed}}\) and \(f_{\text{sig}}\) are almost constant when the pulse period is changed. The proposed method achieves the most accurate modeling and its error is within 2%. The error of \(f_p\) approaches the error of DC as the period becomes large. This is because the pulse frequency \(f_p\) becomes lower as the period becomes larger.

The maximum errors are listed in Table 3.2. From Table 3.2, the maximum error when using \(f_{\text{proposed}}\) is 2% both in the peak-to-peak voltage and in the delay time. The maximum errors of DC and

Figure 3.11: Voltage peak-to-peak when the period of pulse changed.

Table 3.2: Maximum errors when the period of input pulse changed.

<table>
<thead>
<tr>
<th>Extraction Freq.</th>
<th>DC</th>
<th>$f_p$</th>
<th>$f_{\text{proposed}}$</th>
<th>$f_{\text{sig}}$</th>
</tr>
</thead>
<tbody>
<tr>
<td>Error in $V_{\text{pp}}$</td>
<td>+9.0%</td>
<td>+9.0%</td>
<td>-1.7%</td>
<td>-11.5%</td>
</tr>
<tr>
<td>Error in Delay</td>
<td>+9.1%</td>
<td>+9.1%</td>
<td>+2.0%</td>
<td>+1.4%</td>
</tr>
</tbody>
</table>

$f_p$ are the same, and their errors are about 9% in both peak-to-peak voltage and signal delay. The ladder extracted at $f_{\text{sig}}$ achieves the smallest error in signal delay, but its error in peak-to-peak voltage exceeds 10%.

Similar results are observed even if the duty ratio of the input pulse is varied from 10% to 50%. In that case, the maximum error of $f_{\text{proposed}}$ is about 3% both in the peak-to-peak voltage and in the delay time.

Examples of typical waveforms at the aggressor and the victim are shown next. The figures are the results when the pulse period $T_p$ is 5ns. In this case, the ladder extracted at DC and that extracted at $f_p$ produce almost the same results. Figure 3.13 shows the waveform at the far-end of the aggressor interconnect. In Fig. 3.13, the overshoot is overestimated by the ladder extracted at DC and underestimated by the ladder extracted at $f_{\text{sig}}$. From the viewpoint of signal delay, DC extraction overestimates the delay time. Figure 3.14 shows the waveform at the far-end of the victim interconnect. From the observation of the waveforms, the equivalent circuit extracted at $f_{\text{proposed}}$ is the closest to the frequency-dependent model.

As a more realistic case, an interconnect driven by MOS transistors is evaluated. The waveforms at the far-ends of aggressor and victim are shown in Fig. 3.15. A transistor model of a 130nm technology is used and the $W/L$ value of the driver is 720. From Fig. 3.15, DC extraction overestimates overshoot, delay and crosstalk. In contrast, $f_{\text{sig}}$ extraction underestimates them. $f_{\text{proposed}}$ extraction achieves the most accurate modeling even when the interconnect is driven by transistors.
3.4. Experimental results of uniform transmission-lines

Figure 3.12: Delay time when the period of pulse changed.

Figure 3.13: The waveform at the far-end of the aggressor.

Figure 3.14: The waveform at the far-end of the victim.

Figure 3.15: The waveform driven by transistors. (Transistor W/L = 720)

Figure 3.16: Voltage peak-to-peak when the transition time is changed.

Table 3.3: Maximum errors when the transition time changed.

<table>
<thead>
<tr>
<th>Extraction Freq.</th>
<th>DC</th>
<th>$f_{\text{proposed}}$</th>
<th>$f_{\text{sig}}$</th>
</tr>
</thead>
<tbody>
<tr>
<td>Error in $V_{\text{pp}}$</td>
<td>+9.0%</td>
<td>-3.0%</td>
<td>-11.5%</td>
</tr>
<tr>
<td>Error in Delay</td>
<td>+9.2%</td>
<td>+1.9%</td>
<td>-1.2%</td>
</tr>
<tr>
<td>Error in $V_{\text{noise}}$</td>
<td>+11.8%</td>
<td>-7.9%</td>
<td>-10.4%</td>
</tr>
</tbody>
</table>

### 3.4.3 Transition time versus accuracy

The results when the transition time is changed are shown. The significant frequency $f_{\text{sig}}$ is determined by the transition time. When the transition time $t_t$ is 10ps, $f_{\text{sig}}$ is 34GHz, and when $t_t$ is 100ps, $f_{\text{sig}}$ becomes 3.4GHz. Fig. 3.16 and Fig. 3.17 show the errors in the peak-to-peak voltage and the delay time. The simulation condition is the same as in Section 3.4.2. The error in crosstalk noise is shown in Fig. 3.18, and Table 3.3 lists the maximum errors when the transition time is varied.

From Fig. 3.16, extraction at DC causes a nearly constant error of about 9% in the peak-to-peak voltage. The extraction at $f_{\text{sig}}$ causes over 10% error when the transition time is small: $f_{\text{sig}}$ then becomes extremely high, so the attenuation on the interconnect is overestimated. From Fig. 3.17, the ladder extracted at DC causes about 9% error in the delay time. DC extraction overestimates the inductance, so the signal velocity is underestimated and the delay time is overestimated, especially when the transition time is small. The extraction at $f_{\text{proposed}}$ achieves less than 3% error in the peak-to-peak voltage and the delay time. Fig. 3.18 shows the same trend for the amplitude of crosstalk noise as for the peak-to-peak voltage: DC extraction causes a constant error and $f_{\text{sig}}$ causes a remarkable error when the transition time is small.

As seen in Table 3.3, DC extraction causes about 10% overestimation in $V_{\text{pp}}$, delay and $V_{\text{noise}}$. Resistance and inductance extraction at $f_{\text{sig}}$ causes over 10% underestimation in $V_{\text{pp}}$ and $V_{\text{noise}}$. The ladder extracted at $f_{\text{proposed}}$ steadily provides the most accurate estimation, and its maximum error is about 8%.
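As a quick numerical check of the two significant-frequency values quoted above, the following sketch assumes the relation $f_{\text{sig}} = 0.34/t_t$ that is used in Section 3.6.1. The function name is illustrative only.

```python
def significant_frequency(t_t: float) -> float:
    """Significant frequency [Hz] for a transition time t_t [s], assuming f_sig = 0.34 / t_t."""
    return 0.34 / t_t


# The two transition times discussed in the text: 10 ps and 100 ps.
for t_t in (10e-12, 100e-12):
    print(f"t_t = {t_t * 1e12:.0f} ps -> f_sig = {significant_frequency(t_t) / 1e9:.1f} GHz")
```

Under this assumption the sketch reproduces the 34GHz and 3.4GHz values given for $t_t = 10$ps and $t_t = 100$ps.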

Figure 3.17: Delay time when the transition time is changed.

Figure 3.18: Crosstalk noise peak-to-peak when the transition time changed.

Figure 3.19: Voltage peak-to-peak when the interconnect length changed.

Table 3.4: Maximum errors when the interconnect length changed.

<table>
<thead>
<tr>
<th>Extraction Freq.</th>
<th>DC</th>
<th>$f_{\text{proposed}}$</th>
<th>$f_{\text{sig}}$</th>
</tr>
</thead>
<tbody>
<tr>
<td>Error in $V_{pp}$</td>
<td>+10.2%</td>
<td>-2.4%</td>
<td>-15.7%</td>
</tr>
<tr>
<td>Error in Delay</td>
<td>+9.1%</td>
<td>+3.2%</td>
<td>+2.5%</td>
</tr>
<tr>
<td>Error in $V_{\text{noise}}$</td>
<td>+18.7%</td>
<td>-1.8%</td>
<td>-11.3%</td>
</tr>
</tbody>
</table>

### 3.4.4 Interconnect length versus accuracy

Here, the accuracy versus the interconnect length is discussed. The frequency $f_{\text{proposed}}$ depends on the interconnect length and the wave velocity. The wave velocity is determined by the relative permittivity, so it can be assumed to be constant within the same technology. The error in peak-to-peak voltage is shown in Fig. 3.19, and the error in delay time is shown in Fig. 3.20. Figure 3.21 shows the delay time normalized by the delay time of the FD model. Figure 3.22 shows the amplitude of the crosstalk noise. The simulation condition is the same as in Section 3.4.2. As seen in Fig. 3.19, the ladder extracted at $f_{\text{proposed}}$ achieves the minimum error in peak-to-peak voltage. DC extraction always overestimates $V_{pp}$, and $f_{\text{sig}}$ extraction causes underestimation when the interconnect becomes long. As shown in Fig. 3.20 and Fig. 3.21, DC extraction causes about 10% error when the interconnect becomes long. The errors of $f_{\text{proposed}}$ and $f_{\text{sig}}$ extraction are almost the same and below 4%. From Fig. 3.22, the crosstalk noise grows with the interconnect length while the interconnect is short, and the noise amplitude is almost constant when the length is more than 2mm. Figure 3.22 also shows that DC extraction overestimates and $f_{\text{sig}}$ extraction underestimates the crosstalk noise.

The maximum errors are listed in Table 3.4. DC and $f_{\text{sig}}$ may cause errors of more than 10%, but the maximum error of $f_{\text{proposed}}$ is about 3%. These results indicate that the ladder extracted at $f_{\text{proposed}}$ is robust against changes in the interconnect length.
Figure 3.20: Delay time when the interconnect length changed.

Figure 3.21: Normalized delay time when the interconnect length changed.

Figure 3.22: Crosstalk noise peak-to-peak when the interconnect length changed.

Table 3.5: Maximum errors in overall experiments.

<table>
<thead>
<tr>
<th>Extraction Freq.</th>
<th>DC</th>
<th>$f_{\text{proposed}}$</th>
<th>$f_{\text{sig}}$</th>
</tr>
</thead>
<tbody>
<tr>
<td>Error in $V_{\text{pp}}$</td>
<td>+22.5%</td>
<td>-4.6%</td>
<td>-28.0%</td>
</tr>
<tr>
<td>Error in Delay</td>
<td>+27.0%</td>
<td>+4.8%</td>
<td>+23.0%</td>
</tr>
<tr>
<td>Error in $V_{\text{noise}}$</td>
<td>+37.4%</td>
<td>+7.9%</td>
<td>-18.2%</td>
</tr>
</tbody>
</table>

### 3.4.5 Overall results of uniform transmission-lines

The above sections show that the frequency calculated from the time-of-flight, $f_{\text{proposed}}$, achieves the most accurate analysis. Table 3.5 shows the maximum errors over all of the results in this study. The experimental conditions are chosen so that the effectiveness of the proposed frequency is comprehensively confirmed; the number of conditions is about 14,000. The ladder extracted at DC or $f_{\text{sig}}$ causes errors beyond 20%. When a wide micro-strip interconnect is driven by a strong driver, DC and $f_{\text{sig}}$ tend to cause a large error. In contrast, the proposed frequency $f_{\text{proposed}}$ keeps the error below 8%. The above discussions demonstrate that the ladder extracted at the proposed frequency $f_{\text{proposed}}$ provides the most accurate modeling of frequency-dependent interconnects.

### 3.4.6 Tolerance to extraction frequency variation

The effect of the estimation error of $f_{\text{proposed}}$ on modeling accuracy is discussed. The proposed frequency is based on transmission-line resonator theory, and the proposed method assumes that transmission-lines have an ideal open end. However, in real chips, interconnects are terminated by the input capacitance of the receiver and, strictly speaking, the sink is not an ideal open end. The resonance frequency is therefore not exactly equal to $f_{\text{proposed}}$, but the difference is usually quite small because the input capacitance of a CMOS receiver is small.

Figure 3.23 shows the extraction frequency versus errors. The x-axis is the extraction frequency and the y-axis is the error from the frequency-dependent model. The experimental setup is the same as that of Fig. 3.13, with 5mm wire length and 10ps transition time. The proposed frequency $f_{\text{proposed}}$ is 7.5GHz. As shown in Fig. 3.23, the errors in $V_{pp}$ and in $V_{noise}$ become minimum at the proposed frequency. The error in delay becomes minimum at about 20GHz, but it is almost constant above 10GHz. From Fig. 3.23, the errors are below 2% in the region of $f_{\text{proposed}} \pm 30\%$. This result indicates that the proposed method is accurate enough even if the proposed frequency deviates to some extent from the exact resonance frequency.
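The numbers in this paragraph can be tied together with a short worked calculation. It assumes the open-ended quarter-wavelength relation $f_{\text{proposed}} = v/4l$ and a wave velocity of $v = 1.5 \times 10^{8}$ m/s. The velocity is not stated in this section and is inferred here from the quoted 7.5GHz at 5mm.

$$
f_{\text{proposed}} = \frac{v}{4l} = \frac{1.5 \times 10^{8}\,\text{m/s}}{4 \times 5 \times 10^{-3}\,\text{m}} = 7.5\,\text{GHz}, \qquad
0.7\, f_{\text{proposed}} = 5.25\,\text{GHz}, \quad 1.3\, f_{\text{proposed}} = 9.75\,\text{GHz}.
$$

Under this assumption, the $\pm 30\%$ region spans roughly 5.25GHz to 9.75GHz, which is far from both DC and $f_{\text{sig}} = 34$GHz.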

Figure 3.23 also shows that DC and the significant frequency $f_{\text{sig}} = 34\text{GHz}$ are far from the minimum-error frequency, which lies around $f_{\text{proposed}}$. The errors at DC and at the significant frequency are above 10%, whereas that of the proposed method is below 2%.

### 3.5 An extended method to determine an extraction frequency

The previous section shows a method which determines a single extraction frequency from the transfer characteristics of transmission-lines. This section extends the method to handle general interconnects, that is, branching and nonuniform wires. First the transfer characteristic of transmission-lines with a generic load is explained, and then the extended method is described.

#### 3.5.1 Transfer characteristic of generic transmission-lines

The basic idea of the proposed method is to choose the frequency where the transfer characteristic becomes maximum. The nature of the transfer characteristic of interconnects based on the transmission-line theory is explained. This section discusses a simple transmission-line as shown in Fig. 3.24. The characteristic impedance is $Z_0$, the propagation constant is $\gamma$, the length of the transmission-line is $l$ and the velocity of electromagnetic wave is $v$. The velocity $v$ is equal to $c/\sqrt{\varepsilon}$, where $c$ is the velocity of light in vacuum and $\varepsilon$ is the relative permittivity. The load impedance $Z_l$ is connected to the far-end of the transmission-line. The voltage at the near-end is written as $V_{in}$ and the voltage at the far-end as $V_{out}$.

According to the transmission-line theory, the voltage transfer characteristic $V_{out}/V_{in}$ is expressed as

$$\frac{V_{out}}{V_{in}} = \frac{1}{\cosh \gamma l + \frac{Z_0}{Z_l} \sinh \gamma l}. \quad (3.4)$$
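To make the behaviour of Eq. (3.4) concrete, the sketch below evaluates the voltage gain of a single line numerically. It is only an illustration under stated assumptions: the load enters through the ratio $Z_0/Z_l$ (zero for an ideal open end, so that the gain reduces to $1/\cosh \gamma l$), the propagation constant is approximated as $\gamma = \alpha + j\omega/v$ with a frequency-independent attenuation $\alpha$, and the values of $v$, $\alpha$ and the 5mm length are hypothetical inputs rather than extracted data.

```python
import cmath
import math


def voltage_gain(f, length, z0_over_zl=0.0, v=1.5e8, alpha=40.0):
    """|V_out / V_in| of a line of the given length [m] at frequency f [Hz].

    z0_over_zl : complex ratio Z_0 / Z_l (0 means an ideal open end).
    v          : wave velocity [m/s] (assumed value).
    alpha      : attenuation constant [Np/m] (assumed, frequency independent).
    """
    gamma_l = complex(alpha, 2.0 * math.pi * f / v) * length
    return abs(1.0 / (cmath.cosh(gamma_l) + z0_over_zl * cmath.sinh(gamma_l)))


def peak_frequency(length, f_max=15e9, step=10e6, **kwargs):
    """Grid frequency [Hz] with the largest gain between step and f_max."""
    freqs = [k * step for k in range(1, int(f_max / step) + 1)]
    return max(freqs, key=lambda f: voltage_gain(f, length, **kwargs))


f_peak = peak_frequency(5e-3)
print(f"open-ended 5 mm line: gain peaks near {f_peak / 1e9:.2f} GHz")
```

For the open-ended case the peak falls at the quarter-wavelength frequency $v/4l$, while a matched load ($Z_0/Z_l = 1$) gives a flat gain of $e^{-\alpha l}$ with no resonance peak.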

### 3.5.2 Flow of the proposed method

This section proposes a method to determine the extraction frequency based on the transfer characteristic. Figure 3.25 is the conceptual diagram of the proposed method. Interconnects are divided into segments at branch points or discontinuities and are treated as a tree whose root node is the output of the driver and whose leaf nodes are the inputs of the receivers. The proposed method determines the extraction frequency of each segment from leaf to root by replacing the downstream branches with equivalent load impedances.

#### Assumptions of the proposed method

The proposed method determines the extraction frequency from the topology of the interconnects and the length of each segment. The velocity of electromagnetic wave $v$ is assumed to be known and constant. In LSIs, this assumption is valid because the velocity $v$ depends on the relative dielectric constant $\varepsilon_r$, which is constant within the same fabrication process. The significant frequency $f_{sig}$ is also assumed to be known. As described in the following step, the significant frequency is used as the upper limit of the extraction frequency. Additionally, the characteristic impedances of all segments are assumed to be the same. Strictly speaking, this assumption is not correct. However, in LSIs, the characteristic impedance of on-chip interconnects does not vary very much even if the interconnect structure changes. The typical value for co-planar structures is from 50Ω to 100Ω. In the following section, the proposed method is experimentally verified.

#### Step 1. Determine the extraction frequency for terminal segments

If the length of the segment and the load impedance are known, the frequency $f_{\text{res}}$ at which the transfer characteristic of the segment becomes maximum can be determined from Eq. (3.4). The resonance frequency depends on the length of the segment and the load impedance.

Figure 3.25: Conceptual diagram of the proposed method.

Generally, the shorter the segment is, the higher the resonance frequency becomes, and $f_{\text{res}}$ can be too high to use as an extraction frequency. If the frequency component of the input is too small at that frequency, extracting at the frequency at which the transfer characteristic becomes maximum is meaningless. Therefore an upper limit of the extraction frequency should be set. As mentioned so far, the significant frequency $f_{\text{sig}}$ is defined as the frequency below which 85% of the total signal energy is contained. When the resonance frequency is higher than the significant frequency, the frequency components around the resonance frequency are small and hardly affect the waveform at the far-end. Therefore it is reasonable to use the significant frequency as the upper limit of the extraction frequency. The extraction frequency $f_{\text{proposed}}$ is expressed as

$$f_{\text{proposed}} = \min (f_{\text{res}}, f_{\text{sig}}). \quad (3.5)$$

Reference [101] reports the case of open-ended uniform transmission-lines. In CMOS circuits, the input capacitance of the gates is small, so the segments connected to the receivers can be regarded as open-ended transmission-lines. The extraction frequency of such an open-ended branch is the frequency at which a quarter wavelength equals the interconnect length: when the length of a segment is $l$, the resonance frequency $f_{\text{res}}$ is $v/4l$. The method in Ref. [101] cannot be applied to interconnects that have a large capacitive load or a resistive termination, because it assumes open-ended transmission-lines. By using Eq. (3.4), the proposed method can also be applied to interconnects that cannot be regarded as open-ended transmission-lines.
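The effect of a capacitive load on the resonance frequency can be illustrated with the lossless form of Eq. (3.4). For a load $Z_l = 1/(j\omega C)$ the gain peaks where $\cos(\omega l/v) - \omega C Z_0 \sin(\omega l/v) = 0$, which follows from substituting $\cosh \gamma l = \cos(\omega l/v)$, $\sinh \gamma l = j \sin(\omega l/v)$ and $Z_0/Z_l = j\omega C Z_0$. The sketch below solves this condition by bisection. The 5mm length, $Z_0 = 50\Omega$, $v = 1.5 \times 10^{8}$ m/s and both capacitance values are assumptions chosen for illustration.

```python
import math


def resonance_with_cap_load(length, c_load, z0=50.0, v=1.5e8):
    """First resonance [Hz] of a lossless line terminated by a capacitor c_load [F].

    Solves cos(w*l/v) - w*C*Z0*sin(w*l/v) = 0 on (0, v/4l), where the left-hand
    side falls monotonically from 1 to a non-positive value.
    """
    def denom(f):
        w = 2.0 * math.pi * f
        theta = w * length / v
        return math.cos(theta) - w * c_load * z0 * math.sin(theta)

    lo, hi = 0.0, v / (4.0 * length)   # the open-ended resonance is the upper bound
    for _ in range(100):               # plain bisection
        mid = 0.5 * (lo + hi)
        if denom(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


f_open = 1.5e8 / (4.0 * 5e-3)          # 7.5 GHz for an ideal open end
for c_load in (10e-15, 1e-12):         # hypothetical 10 fF and 1 pF loads
    f_res = resonance_with_cap_load(5e-3, c_load)
    print(f"C = {c_load * 1e15:.0f} fF: f_res = {f_res / 1e9:.2f} GHz "
          f"({100.0 * (f_res / f_open - 1.0):+.1f}% vs open end)")
```

With these assumed values the small load shifts the resonance by only a percent or two, while the large load moves it far below $v/4l$, which is the situation where the open-ended assumption no longer applies.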

#### Step 2. Replacing terminal segments with equivalent load impedances

At step 1, the extraction frequencies of the terminal segments are decided. To decide the extraction frequencies of the preceding segments, the segments whose extraction frequency is already decided are replaced with equivalent load impedances. This step corresponds to Fig. 3.25, II. With the equivalent load impedance in place, the extraction frequency can be calculated by Eq. (3.4). For example, in Fig. 3.25, the extraction frequency for the segment B-C can be calculated by replacing the segments C-E and C-F with their equivalent load impedances.

The load impedance of a certain segment is the input impedance of the downstream branches. For example, the load impedance of the segment B-C in Fig. 3.25 is the input impedance of the segments C-E and C-F connected in parallel. As shown in Fig. 3.25, II, the input impedance of a transmission-line can be modeled by an RLC series resonator circuit whose resonance frequency is equal to $f_{\text{res}}$. The resistance is ignored because the proposed method needs only the resonance frequency. The input impedance of a certain segment is expressed as

$$Z_{\text{in}} = \sqrt{\frac{L}{C}} \frac{1 - \omega^2 LC}{j \omega \sqrt{LC}} = \frac{Z_0}{j \tan \left( \frac{\omega l'}{v} \right)} \quad (3.6)$$

where $l'$ is equivalent length defined as $v/4f_{\text{res}}$. The characteristic impedances $Z_0$ of each segment are assumed to be the same value. Therefore the value of the inductance $L$ and the capacitance $C$ are determined from the characteristic impedance $Z_0$ and the resonance frequency $f_{\text{res}}$. This means that once the resonance frequency $f_{\text{res}}$ is calculated, the equivalent load impedance is uniquely determined.
Figure 3.26 shows an example of the transfer characteristic estimated by Eq. (3.4) with the equivalent load impedance defined by Eq. (3.6). The plotted quantity is the voltage gain between node A and node B of the branching wire shown in Fig. 3.26. The solid line is the transfer characteristic obtained by SPICE AC analysis. The dashed line labeled “Simulation with equivalent load” is the result of SPICE simulation using the equivalent load, and the dashed line labeled “formula” is that of Eq. (3.4) with the equivalent load impedance. The resonance frequency $f_{\text{open}}$ obtained when the segment A-B is regarded as open-ended is 37.5GHz. As shown in Fig. 3.26, Eq. (3.4) with the equivalent load impedance correctly estimates the peak of the transfer characteristic. On the other hand, $f_{\text{open}}$ becomes an antiresonance frequency of the segment A-B, and the transfer characteristic at $f_{\text{open}}$ becomes minimum. This result shows that the load impedance has to be considered to estimate the first peak frequency of the transfer characteristic. From the above discussion, the transfer characteristic can be estimated by Eq. (3.4) with the equivalent load impedance given by Eq. (3.6).

By replacing the terminal segments with equivalent load impedance, the terminal segments are eliminated and other segments become terminal segments. Then the extraction frequency can be determined for new terminal segments by returning to step 1. The proposed method determines extraction frequencies for each segment by iterating the step 1 and step 2.
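The iteration of step 1 and step 2 can be sketched as a recursive traversal from the leaves to the root. The code below is a minimal illustration, not the implementation used in this work. It assumes lossless segments with equal characteristic impedance, so that the peak of Eq. (3.4) occurs where $\cos(\omega l/v) - \sum_k \tan(\omega l'_k/v) \sin(\omega l/v) = 0$, with $l'_k$ the equivalent lengths of the downstream segments from Eq. (3.6). The function names and the dictionary layout are hypothetical. The segment lengths are those of the H-tree example in Section 3.6.1, and $v = 1.5 \times 10^{8}$ m/s is inferred from the 37.5GHz resonance of a 1mm open-ended segment.

```python
import math

V = 1.5e8      # wave velocity [m/s], inferred value (assumption)
F_SIG = 34e9   # significant frequency for a 10 ps transition time [Hz]


def first_resonance(length, child_equiv_lengths, v=V):
    """First zero of cos(th) - sum_k tan(th_k) * sin(th), th = w*l/v, th_k = w*l'_k/v."""
    if not child_equiv_lengths:
        return v / (4.0 * length)              # ideal open end: quarter wavelength

    def denom(f):
        w = 2.0 * math.pi * f
        load = sum(math.tan(w * lk / v) for lk in child_equiv_lengths)
        return math.cos(w * length / v) - load * math.sin(w * length / v)

    # denom falls monotonically from 1 to <= 0 below the first quarter-wave frequency
    lo, hi = 0.0, v / (4.0 * max([length] + list(child_equiv_lengths)))
    for _ in range(100):                       # plain bisection
        mid = 0.5 * (lo + hi)
        if denom(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def assign_frequencies(seg, segments, result, v=V, f_sig=F_SIG):
    """Fill result[seg] = (f_res, f_extraction); return the equivalent length l' of seg."""
    length, children = segments[seg]
    equiv = [assign_frequencies(c, segments, result, v, f_sig) for c in children]  # leaves first
    f_res = first_resonance(length, equiv, v)
    result[seg] = (f_res, min(f_res, f_sig))   # Eq. (3.5)
    return v / (4.0 * f_res)                   # l' = v / (4 f_res), used by the parent


# Sub-tree B-C of the H-tree: a 2 mm segment loaded by two 1 mm open-ended segments.
segments = {"B-C": (2e-3, ["C-E", "C-F"]), "C-E": (1e-3, []), "C-F": (1e-3, [])}
result = {}
assign_frequencies("B-C", segments, result)
for name, (f_res, f_ext) in result.items():
    print(f"{name}: f_res = {f_res / 1e9:.1f} GHz, extraction at {f_ext / 1e9:.1f} GHz")
```

For this sub-tree the sketch gives about 10GHz for B-C and the $f_{\text{sig}}$ limit of 34GHz for C-E and C-F, in line with the values quoted in Section 3.6.1.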

### 3.6 Experimental results

This section demonstrates experimental results of the extended method. First the experimental results of two major interconnect topologies, H-tree and stub-bus, are shown. Then statistics of experimental results in various situations are described.

### 3.6.1 H-tree topology

The first example is an H-tree topology. Figure 3.27 shows the topology of the H-tree. As shown in Fig. 3.27, the cross section of the interconnects is a co-planar structure. The segments A-B, B-C and B-D have the same structure, in which the signal wire width $W_s$ is $10\mu m$, the ground wire width $W_g$ is $4\mu m$, the spacing between wires $S$ is $2\mu m$ and the wire thickness $T$ is $1\mu m$. The other segments, C-E, C-F, D-G and D-H, have $W_s = 4\mu m$, $W_g = 4\mu m$, $S = 2\mu m$ and $T = 1\mu m$. The input transition time $t_i$ is set to $10ps$ and the output impedance of the driver is $50\Omega$.

The frequency determination process is shown step by step. First, the segments C-E, C-F, D-G and D-H are connected to the receivers. These segments can be assumed to be open-ended, and their resonance frequency is $f_{\text{res}} = v/4l = 37.5$GHz. However, the significant frequency $f_{\text{sig}}$ is calculated to be $0.34/(10 \times 10^{-12}) = 34$GHz. Therefore the extraction frequency of these segments is 34GHz. From Eq. (3.6), the equivalent load impedance of each of these segments is expressed as

$$Z_{\text{in}} = \frac{Z_0}{j \tan \left( \frac{\omega l'_{CE}}{v} \right)}$$

where $l'_{CE}$ is calculated to be 1.0mm from the resonance frequency 37.5GHz. For the segments B-C and B-D, the voltage gain is expressed as

$$\frac{V_{out}}{V_{in}} = \frac{1}{\cos \left( \omega l_{BC}/v \right) - 2 \tan \left( \omega l'_{CE}/v \right) \sin \left( \omega l_{BC}/v \right)}$$

where $l_{BC}$ is 2mm. From the above expression, the extraction frequency for the segments B-C and B-D is calculated to be 10GHz. In a similar way, the extraction frequency for the segment A-B is 3.9GHz.
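As a consistency check of the 10GHz value, the resonance condition of the B-C segment can be solved in closed form. The derivation below is a sketch that assumes lossless segments and $v = 1.5 \times 10^{8}$ m/s, a velocity inferred from the 37.5GHz resonance of the 1mm open-ended segments rather than stated explicitly. With $l_{BC} = 2 l'_{CE}$ and $x = \omega l'_{CE}/v$, setting the denominator of the voltage gain to zero gives

$$
\cos 2x - 2 \tan x \, \sin 2x = \left( 1 - 2 \sin^2 x \right) - 4 \sin^2 x = 1 - 6 \sin^2 x = 0
\;\; \Rightarrow \;\; x = \arcsin \frac{1}{\sqrt{6}} \approx 0.4205,
$$

$$
f = \frac{x \, v}{2 \pi \, l'_{CE}} \approx \frac{0.4205 \times 1.5 \times 10^{8}}{2 \pi \times 1.0 \times 10^{-3}}\,\text{Hz} \approx 10.0\,\text{GHz}.
$$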

Figure 3.28 shows the waveform at the node E. The waveforms of the RLC ladder models extracted at DC and at the proposed frequency are almost the same as that of the frequency-dependent model. When the ladder extracted at $f_{\text{sig}}$ is used, the waveform differs from that of the frequency-dependent model: extraction at $f_{\text{sig}}$ estimates the resistance too large, so the attenuation on the interconnect is overestimated. Table 3.6 shows the signal delay time and the transition time at the node E. The transition time is defined as the time to rise from 20% to 80% of the supply voltage. The error is calculated by taking the result of the frequency-dependent (FD) model as the reference. The ladder extracted at $f_{\text{sig}}$ causes a large error, 25% in delay and 42% in transition time. The extraction at DC also causes about 9% error in delay. The ladder extracted at the proposed frequency provides the most accurate modeling.
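The 20% to 80% transition-time definition above can be applied to a sampled waveform as in the following sketch. It is an illustration only: the helper names are hypothetical, the waveform is assumed to rise through both thresholds, and linear interpolation between samples is an assumption about how the crossing times are resolved.

```python
def crossing_time(times, volts, level):
    """Time at which a rising sampled waveform first crosses `level` (linear interpolation)."""
    samples = list(zip(times, volts))
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if v0 <= level <= v1 and v1 > v0:
            return t0 + (level - v0) * (t1 - t0) / (v1 - v0)
    raise ValueError("waveform never crosses the requested level")


def transition_time(times, volts, vdd, low=0.2, high=0.8):
    """Rise time between low*vdd and high*vdd, with vdd the supply voltage."""
    return crossing_time(times, volts, high * vdd) - crossing_time(times, volts, low * vdd)


# Synthetic example: an ideal 100 ps linear ramp from 0 to 1 V sampled every 10 ps.
times = [k * 10e-12 for k in range(11)]
volts = [k / 10 for k in range(11)]
print(f"20-80% transition time: {transition_time(times, volts, 1.0) * 1e12:.1f} ps")
```

For the synthetic ramp the result is 60% of the ramp duration, as expected from the definition.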

### 3.6.2 Stub-Bus topology

The second example is a stub-bus structure. Figure 3.29 shows the interconnect topology. The bus line A-B-C-D-E is a wide wire with signal width $W_s = 10\mu m$, ground width $W_g = 10\mu m$ and spacing $S = 2\mu m$. The stubs are short and thin wires with $W_s = 1\mu m$, $W_g = 1\mu m$ and $S = 2\mu m$. The

The extraction frequencies determined by the proposed method are shown in Fig. 3.30. The extraction frequency of the stubs is $f_{\text{sig}} = 34$ GHz because the stubs are short and their resonance frequency is 375 GHz. Figure 3.31 shows the waveform at the node I and Table 3.7 shows the errors in delay and transition time. The ladder model extracted at DC or at the significant frequency causes a serious error, especially in the signal transition time. As shown in Fig. 3.31, DC extraction underestimates the attenuation and $f_{\text{sig}}$ extraction overestimates it. DC extraction also causes 8% error in delay because it misestimates the phase velocity. On the other hand, the RLC ladder obtained by the proposed method provides accurate modeling of the frequency-dependent interconnect.
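The percentage errors in Table 3.7 follow from the delay and transition-time columns by taking the frequency-dependent (FD) result as the reference. The short sketch below recomputes them from the rounded table entries. Because the inputs are rounded, a recomputed value can differ slightly from the tabulated one. The dictionary layout and the function name are illustrative.

```python
def relative_error(value, reference):
    """Error in percent with respect to the reference (FD) result."""
    return 100.0 * (value - reference) / reference


# (delay [ps], transition time [ps]) at node I, copied from Table 3.7
table_3_7 = {
    "FD": (34.5, 19.5),
    "DC": (37.3, 6.8),
    "Proposed": (35.3, 21.2),
    "f_sig": (35.3, 40.6),
}

fd_delay, fd_tt = table_3_7["FD"]
errors = {
    name: (relative_error(delay, fd_delay), relative_error(tt, fd_tt))
    for name, (delay, tt) in table_3_7.items()
    if name != "FD"
}
for name, (e_delay, e_tt) in errors.items():
    print(f"{name}: delay error {e_delay:+.1f}%, transition-time error {e_tt:+.1f}%")
```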

### 3.6.3 Results of overall experiments

The proposed method is verified under various conditions. This section shows the statistical summary of all experiments. The topology of the net, the length of each segment, the interconnect structure, the driver size and the transition time of the input are varied. The number of segments in one net is varied from 1 to 5. The length of a segment is 200 μm–5 mm. The interconnect structure is a co-planar structure whose signal width $W_s$ and ground width are 1 μm–10 μm and whose spacing $S$ is 2 μm–8 μm. The output impedance of the driver is 25 Ω–100 Ω. The transition time of input pulse is

<table>
<thead>
<tr>
<th>Extraction Frequency</th>
<th>delay [ps]</th>
<th>error [%]</th>
<th>transition time [ps]</th>
<th>error [%]</th>
</tr>
</thead>
<tbody>
<tr>
<td>FD</td>
<td>34.5</td>
<td>—</td>
<td>19.5</td>
<td>—</td>
</tr>
<tr>
<td>DC</td>
<td>37.3</td>
<td>8.1</td>
<td>6.8</td>
<td>−65.4</td>
</tr>
<tr>
<td>Proposed</td>
<td>35.3</td>
<td>2.3</td>
<td>21.2</td>
<td>8.7</td>
</tr>
<tr>
<td>$f_{\text{sig}}$</td>
<td>35.3</td>
<td>2.3</td>
<td>40.6</td>
<td>108.2</td>
</tr>
</tbody>
</table>

Table 3.7: Errors in the delay time and the transition time at the node I of the stub-bus.

Table 3.8: Statistical summary of overall experiments.

<table>
<thead>
<tr>
<th>Extraction Frequency</th>
<th>Delay: max. error</th>
<th>Transition time: max. error</th>
<th>Delay: &gt; 5% ratio</th>
<th>Transition time: &gt; 5% ratio</th>
</tr>
</thead>
<tbody>
<tr>
<td>DC</td>
<td>-88.1%</td>
<td>-71.9%</td>
<td>11.5%</td>
<td>27.8%</td>
</tr>
<tr>
<td>proposed</td>
<td>-9.9%</td>
<td>-9.8%</td>
<td>5.4%</td>
<td>12.5%</td>
</tr>
<tr>
<td>$f_{\text{sig}}$</td>
<td>110.0%</td>
<td>160.3%</td>
<td>12.2%</td>
<td>35.2%</td>
</tr>
</tbody>
</table>

10ps–100ps. By changing those parameters, 9,545 patterns of net are examined and the waveforms at 43,199 nodes in total are observed.

Table 3.8 shows the summary of overall experiments. It contains the maximum error in delay time and transition time (columns labeled “Max. error”), and the ratio of nodes where the error is over 5% (columns labeled “> 5% ratio”). The ladder extracted at DC tends to underestimate the delay and transition time, and the ladder extracted at the significant frequency \( f_{\text{sig}} \) tends to overestimate them. For the ladder extracted at DC, the error in delay exceeds 5% at about 12% of all nodes and the maximum error is -88.1%; in transition time, the error exceeds 5% at 28% of the nodes and the maximum error is -71.9%. For the significant frequency, the error in delay is over 5% at about 12% of all nodes and the maximum is 110.0%, and the error in transition time is over 5% at about 35% of all nodes and the maximum is over 160%. Such errors are a serious problem when evaluating circuit behavior, for example in timing analysis. On the other hand, the proposed method keeps the error below 10% in both delay and transition time. The results above confirm that the RLC ladder extracted at the proposed frequency provides accurate modeling of frequency-dependent interconnects.

### 3.7 Summary

The frequency that should be used to extract RLC values is discussed. When frequency-independent equivalent circuits are used for circuit design, the extraction frequency must be carefully determined to maximize the fidelity of the interconnect characteristics. An RL extraction scheme that uses the frequency determined by the interconnect length is proposed. Experiments verify that the proposed frequency achieves the most accurate estimation of signal propagation delay and transition time. The maximum error is within 10% in delay and in transition time in the experiments. With the proposed representative frequency, RL extraction at a single frequency becomes accurate enough to model interconnect characteristics, and hence many effective design and analysis techniques that were developed without considering frequency-dependence can be exploited. The proposed method is effective when the topology and the length of interconnects are known, for example in post-layout extraction.
Chapter 4

Analytical performance estimation of on-chip transmission-lines

### 4.1 Introduction

In this chapter, an analytical performance estimation of on-chip transmission-lines is proposed. The global clock frequency is expected to exceed 9.5GHz in 2010 [4]. On such high-performance chips, one of the important challenges is high-speed and large-capacity data transmission. The performance of global interconnects is considered to limit the whole chip performance in the near future. To attack this problem, high-speed signaling and throughput-driven interconnection have recently become a hot research topic in both the design and EDA communities [31, 80]. Optical communication as an alternative to metal-wire signaling is also studied [102].

Current signaling schemes are roughly classified into single-end and differential signaling. Differential signaling is used for on-chip high-speed and long-distance interconnection as well as off-chip signaling, for example clock distribution [74]. On the other hand, single-end signaling is very common in chip design. Each scheme has both advantages and disadvantages, and the difference between single-end signaling and differential signaling is quite large in area and power, so it is desirable to decide which scheme to use in the early stage of chip design. Therefore circuit designers should be aware of the maximum performance of both signaling schemes, and know in what situations differential signaling is preferable, or even the sole solution.

The other problem in designing high-performance interconnects is termination. Resistive termination is one of the common and fundamental techniques for high-performance interconnection. Impedance matching by resistive termination eliminates multiple reflections of the signal wave and improves signal integrity. Therefore resistive termination is required to achieve high bit-rate signaling over 10Gbps [103]. On the other hand, resistive termination increases the power dissipation because static current flows through the terminator, so designers have to use it carefully. For PCB wires and cables, resistive termination is a common technique because impedance matching is important to prevent multiple reflections of the signal wave. In LSIs, however, the loss of the wire is significant. Even if multiple reflection occurs, the reflected wave attenuates while propagating on the interconnect. Therefore it is not clear under what conditions resistive termination should be used. Furthermore, process variation is becoming more and more appreciable in LSIs. The on-chip termination resistance realized by MOS transistors or polysilicon varies due to process variation. Therefore robustness to process variation must be examined when designing high-performance interconnection.

In this chapter, the trade-off analysis of on-chip interconnects is discussed. There are several factors that degrade signal integrity, namely attenuation, crosstalk and dispersion. Experimental results show that the main factor that inhibits high-speed signaling is attenuation in crosstalk-controlled interconnect structures. From the viewpoint of attenuation, the maximum eye opening is analytically derived for open-ended single-end signaling, terminated single-end signaling and differential signaling. Experimental results by circuit simulation verify that the analytical performance estimation is valid even when crosstalk noise and the frequency-dependence of interconnects are considered. The analytic estimation provides trade-off curves among bit rate, length and eye opening. They indicate the performance difference between single-end and differential signaling and reveal in which region differential signaling has a significant advantage over single-end signaling. The trade-off analysis shows that the voltage swing required by the receiver strongly affects the performance of the signal transmission system. Improving the receiver enables differential signaling to achieve tens of Gbps on interconnects with lengths of up to several centimeters.

The proposed analytical estimation method also provides a design guideline of resistive termination for on-chip lossy transmission-lines. The proposed method indicates the situations when the resistive termination should be used and the optimal value of termination resistance. Furthermore, the proposed method provides the sensitivity to the variation of the termination resistance. From the sensitivity, designers can decide the design margin for process variation. The contribution of this chapter is to provide a design guideline of termination for high-speed on-chip interconnection that gives both the maximum eye-opening in voltage and minimum sensitivity to process variation.

### 4.2 Analytical estimation of interconnect performance

This section derives analytic expressions that estimate the performance of on-chip global interconnects. The section focuses on the attenuation characteristics, the most dominant factor that limits global signaling, and performs an analytical performance estimation based on simplified interconnect and waveform models. The effects of crosstalk noise and dispersion are examined in Section 4.3, which confirms that the simplified model based on the attenuation is valid for on-chip interconnects.

#### 4.2.1 Figure of merit for signaling performance

The eye-diagram is commonly used to evaluate the feasibility and quality, including the bit error rate, of signaling systems [104]. Figure 4.1 shows an example of an eye-diagram. A large eye opening area means that the signaling has timing and noise margins. To evaluate the area of the eye opening, a rectangular or hexagonal eye mask is commonly used. However, for simplicity, this chapter uses the maximum eye opening in voltage shown in Fig. 4.1 as a figure of merit. In the case of on-chip signaling, attenuation is the most important factor that limits high-speed long-distance signaling. In this condition, the eye opening

#### 4.2.2 Assumptions on derivation

Assumptions used for the derivation of the analytic expressions are explained.

The first assumption is that the interconnect structure is designed to reduce crosstalk noise. Although crosstalk noise affects the eye-diagram, it can be suppressed in a well-designed interconnect structure by shielding and spacing. Section 4.3 experimentally verifies that the effect of crosstalk noise can be controlled by the interconnect structure and that attenuation is the dominant factor that degrades the eye-opening. The effect of waveform dispersion is also ignored. Interconnect characteristics are frequency dependent because of the skin and proximity effects and the return-current distribution, which causes waveform dispersion. However, the effect of waveform dispersion is small compared to that of the attenuation. Therefore, crosstalk and dispersion are not considered in the analytical estimation.

The second assumption concerns impedance matching. When driving transmission-lines, an impedance-matched driver is the optimum solution [67]. This chapter assumes that impedance matching is achieved at the driver. For conventional single-end signaling, the near-end is driven by a matched driver and the far-end is regarded as open-ended, because it is loaded only by the small input capacitance of the receiver. To examine the effect of the termination, single-end signaling with impedance-matched termination is also evaluated. For differential signaling, the near-end is driven in the same way as in single-end signaling, and the far-end of the differential pair is terminated by a bridge termination, which is commonly used in Low-Voltage Differential Signaling (LVDS).

#### 4.2.3 Piecewise-linear waveform model

The piecewise-linear (PWL) waveform model is described. This model assumes that the attenuation of the interconnects is the dominant factor that degrades the signal integrity. The circuit model of terminated transmission-lines is shown in Fig. 4.2. The resistance, inductance and capacitance per unit length are $R$, $L$ and $C$ respectively. The impedance $Z_0$ is the characteristic impedance of the transmission-line. At the receiver side, the interconnect is terminated by a resistor of resistance $R_t$. At the driver side, the driver is assumed to achieve impedance matching. In other words, the output impedance of the driver is equal to the characteristic impedance $Z_0$. For simplicity, the supply voltage $V_{dd}$ is 1V. This assumption does not lose generality because the circuit model in Fig. 4.2 is a linear circuit.

Figure 4.2: Circuit model of a transmission-line with resistive termination.

Figure 4.3: PWL waveform model.

The waveform at the receiver side of transmission-lines is modeled by the PWL waveform model shown in Fig. 4.3. Figure 4.3 is the eye-diagram formed by two isolated pulses ($0\cdots010\cdots0$ and $1\cdots101\cdots1$). If the crosstalk noise is small, these isolated pulses determine the eye-opening. In Fig. 4.3, the time $t_r$ is the transition time of the input pulse and the period $T$ is the minimum width of the input pulse. The voltage $V_r$ is the rise voltage that is determined from the attenuation and the termination of the interconnect. The voltage $V_T$ is the voltage at the time $T$, and it decides the maximum eye-opening voltage. The voltage $V_{max}$ is the voltage level when a continuous “1” is input to the interconnect; it is determined by the resistance of the terminator, the resistance of the interconnect and the output resistance of the driver. The time $t_{tof}$ is the signal time-of-flight, which is determined from the interconnect length and the velocity of the electromagnetic wave. From the analytical expression of the waveform on transmission-lines [67], if the driver achieves impedance matching, the voltage at the receiver side reaches $V_{max}$ at time $2t_{tof}$ after the rising edge. By using this characteristic, the voltage $V_T$ can be derived.

From the PWL waveform model in Fig. 4.3, the maximum eye-opening voltage \( V_{\text{eye}} \) is expressed as

\[
V_{\text{eye}} = \left\{ \begin{array}{ll}
\max\{V_{\text{max}} - 2(V_{\text{max}} - V_T), 0\} & (T - t_r < 2t_{\text{tof}}) \\
V_{\text{max}} & (T - t_r > 2t_{\text{tof}})
\end{array} \right..
\]

(4.1)

Resistive termination changes the reflection coefficient and the maximum voltage \( V_{\text{max}} \). Therefore designers can tune the eye-opening by using resistive termination.

#### 4.2.4 Derivation of eye-opening voltage

The amplitude of the pulse injected into the interconnect is \( V_{\text{near}} = V_{\text{dd}}/2 = 1/2 \) because the driver output impedance is assumed to be equal to the characteristic impedance \( Z_0 \). The pulse attenuates as it propagates on the lossy transmission-line. The amplitude of the attenuated pulse at the receiver side is expressed as

\[
V_{\text{far}} = V_{\text{near}} \exp(-\alpha l) = n/2,
\]

(4.2)

where the parameter \( \alpha \) is the attenuation constant of the interconnect, \( l \) is the interconnect length, and the parameter \( n \) is the attenuation parameter defined as \( n = \exp(-\alpha l) \). As the attenuation becomes weaker, the parameter \( n \) becomes larger, and if the line is lossless, \( n \) is equal to 1. Since the shunt conductance of on-chip interconnects is small, the attenuation parameter can be approximated as \( n \approx \exp(-Rl/2Z_0) \) [5].

The reflection coefficient \( \Gamma \) at the receiver side is expressed as \((R_t - Z_0)/(R_t + Z_0)\). Therefore the rise voltage \( V_r \) is calculated by

\[
V_r = V_{\text{far}} \times (1 + \Gamma) = \frac{n}{2} \frac{2Z_n}{Z_n + 1},
\]

(4.3)

where the parameter \( Z_n \) is the normalized impedance of the termination defined as \( Z_n = R_t/Z_0 \). \( Z_n = 0 \) means short-circuit termination, \( Z_n = 1 \) means matched termination and \( Z_n = \infty \) means open-ended.

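The maximum voltage \( V_{\text{max}} \) is determined by the DC resistances \( R_{\text{drv}}, R_l \) and \( R_t \), that is, the driver output resistance, the total resistance of the line and the termination resistance. Because the driver is impedance-matched, \( R_{\text{drv}} = Z_0 \) and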

\[
V_{\text{max}} = \frac{R_t}{Z_0 + R_l + R_t}.
\]

(4.4)

Here the attenuation parameter \( n \) is approximately expressed as follows [5]

\[
n = \exp(-\alpha l) \approx \exp\left(\frac{-R_{\text{line}}}{2} \sqrt{\frac{C}{L}}\right) \approx \exp\left(\frac{-R_{\text{line}}}{2Z_0}\right).
\]

(4.5)

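Therefore, by using the normalized impedance \( Z_n \) and identifying the line resistance \( R_l \) of Eq. (4.4) with \( R_{\text{line}} \), so that \( R_{\text{line}}/Z_0 = -2 \log n \), the maximum voltage \( V_{\text{max}} \) is expressed as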

\[
V_{\text{max}} = \frac{Z_n}{1 - 2 \log n + Z_n}.
\]

(4.6)
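As a quick numerical check of the step from Eq. (4.4) to Eq. (4.6), the following minimal sketch evaluates both forms of \( V_{\text{max}} \) for one set of resistances. The function names and the example values of `Z0`, `R_LINE` and `R_T` are illustrative assumptions, not values from this thesis; \( V_{\text{dd}} = 1 \) V and a matched driver are assumed as in the text.

```python
import math

def attenuation_parameter(r_line, z0):
    """Eq. (4.5): n ~ exp(-R_line / (2 * Z0)), R_line = total line resistance."""
    return math.exp(-r_line / (2.0 * z0))

def v_max_from_resistances(r_t, r_line, z0):
    """Eq. (4.4) with a matched driver (R_drv = Z0) and Vdd = 1 V."""
    return r_t / (z0 + r_line + r_t)

def v_max_normalized(z_n, n):
    """Eq. (4.6): the same quantity written with Z_n = R_t / Z0 and n."""
    return z_n / (1.0 - 2.0 * math.log(n) + z_n)

# Hypothetical example values (not taken from the thesis).
Z0 = 100.0      # characteristic impedance [ohm]
R_LINE = 180.0  # total line resistance [ohm]
R_T = 150.0     # termination resistance [ohm]

n = attenuation_parameter(R_LINE, Z0)
v_dc = v_max_from_resistances(R_T, R_LINE, Z0)
v_norm = v_max_normalized(R_T / Z0, n)
print(f"n = {n:.4f}, V_max (4.4) = {v_dc:.4f}, V_max (4.6) = {v_norm:.4f}")
```

Both expressions return the same voltage because \( -2 \log n = R_{\text{line}}/Z_0 \) under the approximation of Eq. (4.5).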

The voltage \( V_T \) is expressed as

\[
V_T = V_r + (V_{\text{max}} - V_r) \frac{T - t_r}{2t_{\text{tof}}}.
\]

(4.7)
From Eq. (4.7), the first equation of Eq. (4.1) is rewritten as

\[
\begin{aligned}
V_{\text{eye}} &= 2V_r + (V_{\text{max}} - V_r) \frac{T - t_r}{t_{\text{tof}}} - V_{\text{max}} \\
&= \left( \frac{Z_n}{1 - 2 \log n + Z_n} - \frac{n Z_n}{Z_n + 1} \right) \left( \frac{T - t_r}{t_{\text{tof}}} - 1 \right) + \frac{n Z_n}{Z_n + 1}.
\end{aligned}
\]

(4.8)

The equation above is valid in the region \( T < 2t_{\text{tof}} \) (more precisely \( T - t_r < 2t_{\text{tof}} \), the condition of Eq. (4.1)). As stated in Eq. (4.1), the maximum eye-opening \( V_{\text{eye}} \) is equal to \( V_{\text{max}} \) in the region \( T > 2t_{\text{tof}} \).
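The derivation of Eqs. (4.1) to (4.8) can be sketched as a single function. This is a minimal illustration under the assumptions of this chapter (matched driver, \( V_{\text{dd}} = 1 \) V, attenuation only); the function name `eye_opening` and the example operating point are assumptions made for illustration.

```python
import math

def eye_opening(z_n, n, period, t_rise, t_tof):
    """Maximum eye-opening voltage of the PWL model (Vdd = 1 V).

    z_n: normalized termination R_t / Z0 (math.inf for an open end)
    n:   attenuation parameter exp(-alpha * l), 0 < n <= 1
    """
    if math.isinf(z_n):
        v_max = 1.0                                      # open end: no DC divider
        v_r = n                                          # Eq. (4.3) with Gamma = 1
    else:
        v_max = z_n / (1.0 - 2.0 * math.log(n) + z_n)    # Eq. (4.6)
        v_r = n * z_n / (z_n + 1.0)                      # Eq. (4.3)
    if period - t_rise >= 2.0 * t_tof:
        return v_max                                     # second case of Eq. (4.1)
    v_t = v_r + (v_max - v_r) * (period - t_rise) / (2.0 * t_tof)  # Eq. (4.7)
    return max(v_max - 2.0 * (v_max - v_t), 0.0)         # first case of Eq. (4.1)

# Illustrative operating point: n = 0.4, t_tof = 66.7 ps, 80 Gbps, t_r = T / 10.
T = 1.0 / 80e9
eye_open = eye_opening(math.inf, 0.4, T, T / 10.0, 66.7e-12)
eye_term = eye_opening(1.3, 0.4, T, T / 10.0, 66.7e-12)
print(f"open-ended: {eye_open:.3f} V, Z_n = 1.3: {eye_term:.3f} V")
```

At this operating point the open-ended eye is closed while a finite termination leaves an opening, which illustrates why the termination resistance is treated as a design parameter.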

**Equations of typical cases**

In the previous section, the equation in a general form is derived. The equations specialized for some typical conditions of on-chip interconnects, that is, open-ended transmission-line, terminated transmission-line and terminated differential transmission-lines are shown.

**i. Open-ended single-end signaling**

On open-ended transmission-lines, the termination resistance \( R_t \) is infinite. Therefore the maximum voltage \( V_{\text{max}} \) is equal to the supply voltage, which is 1 in this chapter. Because \( R_t \) is infinite, the reflection coefficient \( \Gamma \) is equal to 1, so the rise voltage \( V_r \) is equal to \( n \). The eye opening \( V_{\text{eye}} \) is expressed as

\[
 V_{\text{eye}} = \begin{cases} 
\frac{1-n}{l/v} (T - t_r) + 2n - 1 & (T < 2t_{\text{tof}}) \\
 V_{\text{max}} = 1 & (T > 2t_{\text{tof}}) 
\end{cases}.
\]  

(4.9)

The derived expression indicates that the maximum eye opening \( V_{\text{eye}} \) is determined by the minimum period \( T \), the rise time \( t_r \), interconnect length \( l \) and the attenuation parameter \( n \). The velocity \( v \) is determined by the dielectric constant of metal insulator.

**ii. Terminated single-end signaling**

On the terminated transmission-lines, \( R_t \) is equal to \( Z_0 \), that is, \( Z_n = 1 \). Then \( V_r = n/2 \) and \( V_{\text{max}} = 1/\{2(1 - \log n)\} \), and substituting them into Eq. (4.8) gives the maximum eye opening

\[
 V_{\text{eye}} = \begin{cases} 
\frac{1}{l/v} \left( \frac{1}{2(1-\log n)} - \frac{n}{2} \right) (T - t_r) + n - \frac{1}{2(1-\log n)} & (T < 2t_{\text{tof}}) \\
 V_{\text{max}} = \frac{1}{2(1-\log n)} & (T > 2t_{\text{tof}}) 
\end{cases}.
\]  

(4.10)

**iii. Differential signaling**

In the case of differential signaling, the expression of the eye opening \( V_{\text{eye}} \) is simply twice that of Eq. (4.10).

\[
 V_{\text{eye}} = \begin{cases} 
\frac{1}{l/v} \left( \frac{1}{1-\log n} - n \right) (T - t_r) + 2n - \frac{1}{1-\log n} & (T < 2t_{\text{tof}}) \\
 V_{\text{max}} = \frac{1}{1-\log n} & (T > 2t_{\text{tof}}) 
\end{cases}.
\]  

(4.11)

Note that the attenuation constant of differential signaling is different from that of single-end signaling even if the interconnect structure is the same. This is because the interconnect characteristics for the differential mode have to be used when evaluating differential signaling. Therefore the resistance $R_{\text{line}}$ and the characteristic impedance $Z_0$ in Eq. (4.5) are different. In differential signaling, one signal wire of the pair becomes the current return path of the other wire. The return current is tightly confined, and hence the loop resistance of the differential pair is larger than that of single-end signaling and the loop inductance of the differential pair is smaller than that of single-end signaling. The capacitance of differential signaling is larger than that of single-end signaling because the voltages of the two wires transition in opposite directions. From Eq. (4.5), the attenuation parameter $n$ of differential signaling is therefore smaller than that of single-end signaling.
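The three typical cases can be compared side by side by evaluating Eq. (4.8) at \( Z_n \to \infty \) and \( Z_n = 1 \). The sketch below is an illustration only: it works with the normalized time \( (T - t_r)/t_{\text{tof}} \) (called `tau` here, a name introduced for this example), and the attenuation parameters 0.42 and 0.36 are the single-end and differential values quoted later in Section 4.3.4.

```python
import math

def eye_open_ended(n, tau):
    """Eq. (4.8) for Z_n -> infinity: V_max = 1, V_r = n."""
    if tau >= 2.0:
        return 1.0
    return max((1.0 - n) * tau + 2.0 * n - 1.0, 0.0)

def eye_terminated(n, tau):
    """Eq. (4.8) for Z_n = 1: V_max = 1 / (2 (1 - log n)), V_r = n / 2."""
    v_max = 1.0 / (2.0 * (1.0 - math.log(n)))
    v_r = n / 2.0
    if tau >= 2.0:
        return v_max
    return max((v_max - v_r) * tau + 2.0 * v_r - v_max, 0.0)

def eye_differential(n_diff, tau):
    """Twice the terminated single-end value, using the differential-mode n."""
    return 2.0 * eye_terminated(n_diff, tau)

# tau = (T - t_r) / t_tof; smaller tau means a higher bit rate or a longer line.
for tau in (1.5, 1.0, 0.5, 0.2):
    print(f"tau = {tau:3.1f}: open = {eye_open_ended(0.42, tau):.3f}, "
          f"terminated = {eye_terminated(0.42, tau):.3f}, "
          f"differential = {eye_differential(0.36, tau):.3f}")
```

Each expression is continuous at \( (T - t_r)/t_{\text{tof}} = 2 \), where the linear segment reaches \( V_{\text{max}} \). That makes a convenient sanity check on the slope terms.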

### 4.3 Verification of analytical estimation

This section shows some experimental results and demonstrates the validity of the analytical formulae in the previous section by detailed circuit simulation that considers crosstalk and dispersion as well as attenuation. First the conditions of circuit simulation are explained. Next the simulation results are compared with the analytical estimation to verify it.

#### 4.3.1 Simulation setup

The eye opening voltage is evaluated by circuit simulation. First, the interconnect parameters $R(f)$, $L(f)$ and $C$ are extracted by a 2D field-solver; a 2D solver suffices because the inductance of a long interconnect such as 10mm is proportional to its length. The shunt conductance is negligible in LSIs because the dielectric loss of the insulator is small. Figure 4.4 shows the interconnect structure, which corresponds to a bus structure for long-distance signaling. A 45nm process in a roadmap [4] is assumed as the fabrication process. In Fig. 4.4, M10 means the tenth metal layer, and M11 and M12 are assumed to be special thick layers for long distance interconnects or power/ground wires. In M12, there are seven signal lines ("S" in Fig. 4.4) and ten ground wires ("G" in Fig. 4.4). There are twenty ground wires in M10. In the M12 layer, 4μm-wide signal interconnects are aligned and shielding ground wires are placed every seven signal wires. The ground wires in the lower layers also affect the characteristics of the signal wires. Therefore the ground wires in the M10 layer are taken into consideration. In the M11 layer, there are some orthogonal interconnects, which are assumed to have the same width and pitch as those in M12. Orthogonal interconnects affect the capacitance but not the resistance or the inductance. The interconnect characteristics are modeled by a frequency dependent coupled transmission-line model [18] implemented in a circuit simulator [19].

Figure 4.5 shows the experimental circuit. Each signal wire is excited by an ideal voltage source with an ideal resistance. The input pulses of the signal wires are random non-return-to-zero (NRZ) patterns that are independent of each other. The pulse shape is trapezoidal with pulse period $T$ and transition time $T/10$. In the following sections, "bit rate" is defined as $1/T$. For simplicity, the supply voltage is set to 1V, which loses no generality because the circuit model is linear. The eye opening of each signaling scheme is evaluated for various pulse periods $T$ and interconnect lengths $l$.
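The stimulus described above can be sketched as a list of piecewise-linear breakpoints, one list per signal wire. This is a hypothetical helper, not the script used for the thesis experiments: the function name `nrz_pwl`, the choice to start each transition at the bit boundary, the pattern length and the random seed are all assumptions.

```python
import random

def nrz_pwl(bits, period, v_high=1.0, t_rise=None):
    """Return (time, voltage) breakpoints of a trapezoidal NRZ waveform.

    Each transition starts at a bit boundary and lasts t_rise
    (default: period / 10, as in the simulation setup).
    """
    if t_rise is None:
        t_rise = period / 10.0
    points = [(0.0, v_high * bits[0])]
    for k in range(1, len(bits)):
        if bits[k] != bits[k - 1]:
            t_start = k * period
            points.append((t_start, v_high * bits[k - 1]))      # hold old level
            points.append((t_start + t_rise, v_high * bits[k])) # linear ramp
    points.append((len(bits) * period, v_high * bits[-1]))
    return points

# Seven independent random patterns at 20 Gbps (T = 50 ps), one per signal wire.
rng = random.Random(1)
patterns = [[rng.randint(0, 1) for _ in range(16)] for _ in range(7)]
sources = [nrz_pwl(p, period=50e-12) for p in patterns]
print(sources[0][:4])
```

The breakpoint lists map directly onto the piecewise-linear voltage sources that most circuit simulators provide.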
#### 4.3.2 Eye-diagram vs. PWL waveform model

First an example of an eye-diagram and its PWL waveform model is shown. Figure 4.6 shows the comparison between the PWL waveform model and the circuit simulation. The signal bit rate is 20Gbps and the interconnect is the 10mm long open-ended single-end transmission-line shown in Fig. 4.4. At the rising and falling edges, the PWL waveform model has a certain error because of waveform distortion. However, the PWL waveform model is accurate enough to estimate the maximum eye-opening voltage. The comparison in various situations is shown in the following sections.

#### 4.3.3 The effect of attenuation and crosstalk noise

The proposed analytical model focuses on the attenuation of the interconnects and ignores other factors such as crosstalk noise and dispersion. Crosstalk noise, if present, disturbs the waveform and can become the limiting factor of the interconnect performance. This section discusses how attenuation and crosstalk noise each degrade the performance.

Figure 4.7 shows the bit rate vs. eye-opening curves under several crosstalk noise conditions. The line labeled “7 signal lines” shows the simulation result when the 7 wires in Fig. 4.4 are driven independently. This result corresponds to the performance under strong crosstalk noise. The line labeled “spacing” is the result when the signal lines S1, S3, S5 and S7 are removed. In other words, the spacing between signal lines is enlarged by 3 times. The line labeled “shielding” is the result when the signal lines S1, S3, S5 and S7 are grounded, so that each signal wire has shield wires on both sides. The line labeled “w/o crosstalk” means that only one signal line is excited and the other lines are quiet.

From Fig. 4.7, the eye-diagram of "7 signal lines" is degraded and the curve is far from the curve of analytical estimation. However, crosstalk noise can be eliminated by the spacing or the shielding. As shown in Fig. 4.7, the result of "shielding" is almost the same as that of "w/o crosstalk" and that of "formula". This means that the effect of crosstalk noise is small if the interconnect is well-designed against the crosstalk.

On the other hand, the attenuation of the interconnects cannot be eliminated. Figure 4.8 shows the attenuation constant as a function of interconnect width for co-planar and micro-strip structures. The attenuation constant decreases as the interconnect width increases. However, the decrease quickly saturates, and the attenuation constant does not fall to a small value even if wide wires are used for the signal line, since the skin and proximity effects force the current to concentrate near the surfaces of the signal and ground interconnects that face each other. From the above discussion, attenuation is the dominant factor in estimating the performance limitation. Therefore the following sections discuss crosstalk-controlled interconnect structures with attenuation.

#### 4.3.4 Bit rate vs. eye opening voltage

The bit rate versus the maximum eye opening is shown. Figure 4.9 shows the analytical estimation and the simulation results. The interconnect structure is that of Fig. 4.4. To evaluate differential signaling, two signal wires are driven by a differential signal and the other five signal wires are driven by random patterns, which simulates the worst condition of differential signaling embedded in a single-ended environment. In the case of single-end signaling, the S1, S3, S5 and S7 wires are replaced with ground wires, so that each signal wire has shield wires on both sides. In this case, the interconnect resource used by single-end signaling and that used by differential signaling are the same. The far-ends of the interconnects are open-ended. From Section 4.2, the eye opening of terminated single-end transmission-lines is half that of differential signaling, so open-ended single-end signaling and differential signaling are compared. The interconnect length is 10mm, and the attenuation parameter is $n = 0.42$ for single-end signaling and $n = 0.36$ for differential signaling. These attenuation parameters are calculated at the representative frequency proposed in Chapter 3. In this case, the representative frequency is 5GHz. In Fig. 4.9, the analytical estimation (labeled “formula”) is valid because it is close to the experimental results (labeled “circuit simulation”). Figure 4.9 shows that in the low bit rate region up to 20Gbps, the eye opening of single-end signaling is larger than that of differential signaling. This is because $V_{max}$ of single-end signaling is large. However, as the bit rate becomes higher, the eye opening of single-end signaling decreases very rapidly and becomes almost 0 over 40Gbps. This is because $V_{max} - V_T$ of single-end signaling becomes larger due to attenuation.

Figure 4.7: The effect of crosstalk noise over the eye-opening.

Figure 4.8: The attenuation vs. interconnect width (at 10GHz).

From Fig. 4.9, the discrepancy between the analytical estimation and the circuit simulation becomes larger as the bit rate becomes higher. In differential signaling, the difference is about 30% at 80Gbps and about 50% at 100Gbps, since the effect of waveform dispersion is not negligible at such high bit rates. Therefore the applicability of the analytical estimation is limited with respect to the bit rate. For example, from Fig. 4.9, the analytical estimation is applicable up to 80Gbps if the maximum error is limited to 30%.

#### 4.3.5 Attenuation vs. eye opening voltage

Next, the effect of attenuation on the eye opening is examined. The attenuation is changed by setting different values for the width and spacing of the interconnect structure shown in Fig. 4.4. In each configuration, the same value, from 1 μm to 6 μm, is used for both width and spacing. Figure 4.10 shows the eye opening as a function of the attenuation. Except for the width and spacing of the interconnects, the simulation set-up is the same as that of Section 4.3.4, at a signaling rate of 20 Gbps. As seen from Fig. 4.10, the maximum discrepancy is 0.07V. The analytical estimation (labeled "formula") gives a good prediction of the eye-opening under different attenuation values with different interconnect widths and spacings.

Figure 4.9: Bit rate vs. eye opening (10mm length, $Z_0 = 100Ω$).

#### 4.3.6 Verification by circuit simulation

Some experimental results are shown for the verification of the proposed method. The co-planar structure shown in Fig. 4.11 is used for the experiments. The wire width $W$, the spacing $S$ and the resistivity of the wire are varied. The relative permittivity of the insulator is set to 4.0, and the time-of-flight $t_{tof}$ of the 10mm wire is 66.7ps. The frequency characteristics are extracted by a 2D field-solver. For circuit simulation, the interconnects are modeled by a model that can represent the frequency dependence [18]. The analytical method, on the other hand, cannot handle the frequency dependence of the interconnect characteristics, so the parameters used in the analytical model are extracted at the representative frequency determined by the method proposed in Chapter 3. When the interconnect length is 10mm, the extraction frequency is 3.75GHz. A random NRZ pulse whose transition time is one tenth of the pulse width ($t_r = T/10$) is injected through the output resistance of the driver.
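The quoted time-of-flight can be reproduced from the permittivity alone. The sketch below assumes quasi-TEM propagation in a homogeneous, non-magnetic insulator, so that the velocity is \( c/\sqrt{\varepsilon_r} \); the function name is an illustrative choice.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum [m/s]

def time_of_flight(length_m, eps_r):
    """t_tof = l / v with v = c / sqrt(eps_r) (homogeneous dielectric assumed)."""
    return length_m * math.sqrt(eps_r) / C0

t_tof = time_of_flight(10e-3, 4.0)
print(f"t_tof = {t_tof * 1e12:.1f} ps")  # prints "t_tof = 66.7 ps"
```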

Figure 4.12 shows the maximum eye-opening voltage $V_{eye}$ for various values of the attenuation parameter $n$. The x-axis is the normalized impedance $Z_n$ of the termination and the input bit rate is fixed to 20Gbps. In Fig. 4.12, the results of the proposed method and of the circuit simulation are plotted. From Fig. 4.12, the curves by the proposed method are close to the simulation results. Figure 4.12 also shows that the eye-opening has a maximum at a certain termination resistance under strong attenuation. When the attenuation is weak ($n \geq 0.6$), the eye-opening increases with the normalized impedance. This means that the open-end termination maximizes the eye-opening. However, as the attenuation becomes strong ($n \leq 0.4$), the eye-opening reaches its maximum at a certain finite normalized impedance. The border of the region where the resistive termination is effective is discussed in the next section.

Figure 4.12: Eye-opening voltage versus the normalized impedance of the termination (with various attenuations, 20Gbps input, 10mm length)

Figure 4.13 shows the eye-opening voltage versus the normalized impedance. The attenuation parameter $n$ is fixed to 0.4 and the input bit rate is varied from 15Gbps to 80Gbps. Figure 4.13 also shows that the proposed method is close to the results of the circuit simulation. When the bit rate is low, the eye-opening monotonically increases as the normalized impedance increases. As the bit rate becomes higher, a maximum appears and the normalized impedance that maximizes the eye-opening becomes smaller. At 40Gbps or higher, the eye-opening of open-ended ($Z_n = \infty$) transmission-lines becomes almost zero. On the other hand, if the termination is adjusted to the optimal value, the eye-opening is over 150mV at 80Gbps. From Fig. 4.12 and Fig. 4.13, the resistive termination is necessary when the attenuation is strong and the input bit rate is high.

### 4.4 Trade-off analysis of on-chip interconnects

The proposed analytical performance estimation provides the trade-off curves of the interconnects. In this section, the performance tradeoffs among signaling scheme, bit rate, interconnect length and attenuation are shown.

Figure 4.13: Eye-opening voltage versus the normalized impedance (with various bit rate, \( n = 0.4 \), 10mm length)

The equations derived in Section 4.2 provide a trade-off curve between bit rate and interconnect length. Figure 4.14 shows the curves of single-end signaling and differential signaling. The conditions are the same as those of Section 4.3. In Fig. 4.14, $V_{\text{req}}$ is the eye opening $V_{\text{eye}}$ required for signal comparison; it depends on the sensitivity and noise margin of the receiver. The trade-off curve of single-end signaling does not change drastically with $V_{\text{req}}$. On the other hand, the trade-off curve of differential signaling strongly depends on $V_{\text{req}}$. As $V_{\text{req}}$ becomes smaller, the advantage of differential signaling becomes larger. Generally speaking, the comparison ability of a differential receiver is higher, and differential signaling does not depend on the integrity of a reference voltage given to the receiver [104]. If $V_{\text{req}}$ is $0.25V_{\text{dd}}$, differential signaling can achieve 100Gbps communications on a 10mm length interconnect. On the other hand, single-end signaling can perform 25Gbps signaling on 10mm length interconnects, and if the bit rate is 100Gbps, the interconnect length has to be within 2.5mm.
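A length versus bit-rate trade-off curve of this kind can be traced by solving \( V_{\text{eye}}(l) = V_{\text{req}} \) for the length. The sketch below does this by bisection for open-ended single-end signaling, Eq. (4.9). It is an illustration under simplifying assumptions that are not claimed to match the conditions behind Fig. 4.14: a length-independent attenuation constant taken from \( n = 0.42 \) at 10mm, a time-of-flight of 6.67ps per millimeter, and \( t_r = T/10 \). In the thesis the extraction frequency depends on the length, so the curves in Fig. 4.14 need not coincide with the numbers this sketch produces.

```python
import math

ALPHA_PER_MM = -math.log(0.42) / 10.0  # assumption: n = 0.42 at 10 mm, alpha constant
TOF_PER_MM = 6.67e-12                  # assumption: 66.7 ps per 10 mm (eps_r = 4.0)

def eye_open_ended(length_mm, bit_rate):
    """Eq. (4.9) with n = exp(-alpha * l), t_tof = l / v and t_r = T / 10."""
    period = 1.0 / bit_rate
    t_rise = period / 10.0
    n = math.exp(-ALPHA_PER_MM * length_mm)
    t_tof = TOF_PER_MM * length_mm
    if period - t_rise >= 2.0 * t_tof:
        return 1.0
    return max((1.0 - n) * (period - t_rise) / t_tof + 2.0 * n - 1.0, 0.0)

def max_length_mm(bit_rate, v_req, lo=0.01, hi=50.0, iterations=60):
    """Largest length whose eye opening still meets v_req (bisection).

    Relies on the eye opening decreasing monotonically with length.
    """
    if eye_open_ended(hi, bit_rate) >= v_req:
        return hi
    if eye_open_ended(lo, bit_rate) < v_req:
        return 0.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if eye_open_ended(mid, bit_rate) >= v_req:
            lo = mid
        else:
            hi = mid
    return lo

for gbps in (10, 20, 40, 80):
    print(f"{gbps:3d} Gbps: max length = {max_length_mm(gbps * 1e9, 0.25):.2f} mm")
```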

Figure 4.15 shows the trade-off curves between length and bit rate with various attenuation parameter \( n \). \( V_{\text{req}} \) is equal to 0.25\( V_{dd} \). From Fig. 4.15, the performance of differential signaling depends on the attenuation, and it gets close to single-end signaling as the attenuation becomes large, because \( V_{\text{max}} \) decreases.

From the above discussion, differential signaling is much superior to single-end signaling when \( V_{\text{req}} \) is small and \( n \) is not too small. Exploiting the better comparison characteristics of the differential receiver, we can receive the benefit of differential signaling.

An example of eye diagram is shown. Figure 4.16 is the eye diagram of 80Gbps signaling on 10mm differential transmission-line. From Fig. 4.16, the 80Gbps signal transmission can be realized if the receiver sensitivity \( V_{\text{req}} \) is 0.15V. The simulation conditions are the same as those explained in Section 4.3. The eye opening is roughly consistent with the analytical estimation and this result shows the validity of the analytical performance estimation.

### 4.5 Design Guideline for resistive termination

The proposed formulae provide a design guideline that indicates when the resistive termination should be used. From the derived formulae, the sensitivity of the eye-opening against the resistance variation is derived. The optimal termination by the proposed method maximizes the eye-opening voltage and minimizes the sensitivity to the resistance variation.

#### 4.5.1 Termination for maximizing the eye-opening voltage

Figure 4.14: Bit rate vs. maximum interconnect length with various receiver sensitivity ($V_{\text{req}}$).

Figure 4.15: Bit rate vs. maximum interconnect length with various attenuation. (high $n$ means low attenuation.)

Figure 4.16: Eye diagram of 80Gbps signaling on 10mm differential interconnect.

The resistance value where the eye-opening becomes maximum is derived from the derivative of Eq. (4.8). From the solution of the equation \( \partial V_{\text{eye}} / \partial Z_n = 0 \), the normalized impedance that makes the eye-opening maximum is expressed as

\[
Z_n = \frac{(1 - 2 \log n) \{(1 - \tau) - n(2 - \tau)\} - |2 \log n| \sqrt{n(1 - 2 \log n)(1 - \tau)(2 - \tau)}}{(\tau - 1)(1 - 2 \log n) + n(2 - \tau)},
\]

(4.12)

where the parameter \(\tau\) is defined as \(\tau = (T - t_r)/t_{\text{tof}}\) for simplicity. If the denominator of Eq. (4.12) approaches zero, the optimal normalized impedance goes to infinity. The bit rate where the denominator becomes zero is calculated from the equation

\[
\frac{T - t_r}{t_{\text{tof}}} = \frac{1 - 2n - 2 \log n}{1 - n - 2 \log n}.
\]

(4.13)

This equation indicates a critical bit rate. When the bit rate is higher than the critical rate, resistive termination is effective, and otherwise open-ended termination is optimal.
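The closed-form optimum can be cross-checked against a brute-force search over Eq. (4.8). In the sketch below, `tau` is taken as \( (T - t_r)/t_{\text{tof}} \), the normalization under which Eq. (4.12) is the stationary point of Eq. (4.8). The function names and the operating point (\( n = 0.4 \), \( t_{\text{tof}} = 66.7 \) ps, 80Gbps, \( t_r = T/10 \)) are chosen for illustration, and the code assumes \( n < 1 \).

```python
import math

def eye(z_n, n, tau):
    """Eq. (4.8) without clipping, Vdd = 1 V, tau = (T - t_r) / t_tof."""
    a = 1.0 - 2.0 * math.log(n)
    v_r = n * z_n / (z_n + 1.0)
    return (z_n / (a + z_n) - v_r) * (tau - 1.0) + v_r

def critical_tau(n):
    """Eq. (4.13): tau at which the denominator of Eq. (4.12) vanishes."""
    a = 1.0 - 2.0 * math.log(n)
    return (a - 2.0 * n) / (a - n)

def optimal_zn(n, tau):
    """Eq. (4.12); math.inf means that the open end is optimal."""
    if tau >= critical_tau(n):
        return math.inf
    a = 1.0 - 2.0 * math.log(n)
    num = (a * ((1.0 - tau) - n * (2.0 - tau))
           - abs(2.0 * math.log(n)) * math.sqrt(n * a * (1.0 - tau) * (2.0 - tau)))
    den = (tau - 1.0) * a + n * (2.0 - tau)
    return num / den

N = 0.4
T = 1.0 / 80e9
TAU = (T - T / 10.0) / 66.7e-12

z_opt = optimal_zn(N, TAU)
z_grid = max((0.001 * k for k in range(1, 10001)), key=lambda z: eye(z, N, TAU))
print(f"closed form: Z_n = {z_opt:.3f}, grid search: Z_n = {z_grid:.3f}, "
      f"eye = {eye(z_opt, N, TAU):.3f} V")
```

For this operating point the optimum lies slightly above the matched value \( Z_n = 1 \), and the resulting eye opening is consistent with the "over 150mV at 80Gbps" figure quoted in Section 4.3.6.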

Figure 4.17 shows the relationship between the bit rate and the optimal normalized impedance. The curves are the optimal normalized impedance and the vertical dashed lines are the borders determined by Eq. (4.13). From Fig. 4.17, the optimal normalized impedance becomes smaller as the bit rate becomes higher. The optimal normalized impedance also depends on the attenuation. The region where the resistive termination is effective is larger when the attenuation is strong (the attenuation parameter \(n\) is small). From the discussion above, the resistive termination is more effective where the bit rate is high and the attenuation is strong.

#### 4.5.2 Sensitivity to the variation of resistance

In recent designs, it is very important to take process variation into account, and the sensitivity to the variation must be discussed to decide the design margin.

The derivative of Eq. (4.8) expresses how the eye-opening changes when the normalized impedance changes. Therefore this derivative can be used as an indicator of the sensitivity to the resistance. The sensitivity of the eye-opening to the variation of the resistance value is defined as

\[ S = \frac{Z_n}{V_{\text{dd}}} \frac{\partial V_{\text{eye}}}{\partial Z_n}. \]  

(4.14)

The sensitivity \( S \) means the eye-opening variation in the percentage of the supply voltage when the resistance value of terminator changes 1%. If the terminator has the optimal resistance determined by the method in Section 4.5.1, the sensitivity \( S \) is equal to zero. Therefore the optimal resistance that makes the eye-opening maximum is also optimal from the viewpoint of process variation.
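The sensitivity of Eq. (4.14) can be evaluated directly from the analytic derivative of Eq. (4.8). The following sketch scans \( Z_n \) to locate where \( S \) peaks and where it crosses zero. It assumes \( V_{\text{dd}} = 1 \) V and \( \tau = (T - t_r)/t_{\text{tof}} \); the operating point (\( n = 0.4 \), 80Gbps, \( t_{\text{tof}} = 66.7 \) ps) and the scan grid are illustrative choices.

```python
import math

def sensitivity(z_n, n, tau):
    """Eq. (4.14) with Vdd = 1 V: S = Z_n * dV_eye/dZ_n, from Eq. (4.8)."""
    a = 1.0 - 2.0 * math.log(n)
    d_eye = (tau - 1.0) * a / (a + z_n) ** 2 + (2.0 - tau) * n / (z_n + 1.0) ** 2
    return z_n * d_eye

N = 0.4
T = 1.0 / 80e9
TAU = (T - T / 10.0) / 66.7e-12

zs = [0.01 * k for k in range(1, 501)]
s_vals = [sensitivity(z, N, TAU) for z in zs]
z_peak = zs[max(range(len(zs)), key=s_vals.__getitem__)]   # most sensitive Z_n
z_zero = next(z for z, s in zip(zs, s_vals) if s <= 0.0)   # first grid point with S <= 0
print(f"S peaks near Z_n = {z_peak:.2f} and crosses zero near Z_n = {z_zero:.2f}")
```

In this example the sensitivity peak lies below the zero crossing, which is the optimum of Section 4.5.1. The same ordering is observed in the discussion of Fig. 4.18.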

Figure 4.18 shows the sensitivity \( S \) and the normalized impedance. From Fig. 4.18, the normalized impedance where the sensitivity becomes maximum is smaller than the optimal normalized impedance. The sensitivity becomes maximum where \( Z_n = 1.2 \) at 10Gbps and \( Z_n = 0.5 \) at 80Gbps. Therefore the termination whose resistance value is smaller than the optimal is sensitive to process variation. As mentioned in Section 1.1.5, smaller terminator resistance increases the power dissipation. From the discussion above, designers should not use smaller terminator resistance than the optimal resistance, because all performance metrics, tolerance to process variation, power dissipation and eye-opening degrade.

### 4.6 Summary

The performance limitation of on-chip interconnects is discussed. It is important to know the maximum performance and the performance trade-offs to choose a proper signaling scheme. First, analytical expressions for performance estimation are derived. Under some assumptions, the maximum eye opening voltage is expressed by the attenuation parameter \( n \), the interconnect length \( l \) and the pulse shape. Then the analytical estimation is verified by circuit simulation. The analytical estimation is valid in crosstalk-controlled interconnect structures even though the estimation does not consider crosstalk and dispersion. The analytical estimation gives trade-off curves of interconnect performance. The advantage of differential signaling is significant when the attenuation is not so severe.

Then a design guideline is derived from the proposed analytical model. The analytical eye-opening estimation provides the optimal termination that maximizes the eye-opening. The proposed guideline is especially useful at the early stage of circuit design because the analytical model requires only the fundamental parameters, i.e. input bit rate, characteristic impedance and attenuation of the transmission-line. The optimal termination given by the guideline also minimizes the sensitivity to the variation of the terminator resistance. By using the proposed method, designers can decide the termination of the signal wire and the design margin for the variation of the resistance value.

As the operating frequency becomes higher, the effect of waveform distortion becomes significant. Future work includes developing a performance prediction that considers the effect of waveform distortion.
Chapter 5

Driver/Receiver design for high-speed signaling

### 5.1 Introduction

In this chapter, buffers for on-chip signaling are discussed. In contrast to conventional logic gates, driver circuits for on-chip transmission-lines have to take impedance matching and the attenuation of the interconnect into account. When driving on-chip transmission-lines, impedance matching to the characteristic impedance is important to prevent multiple reflections. In on-chip transmission-lines, the resistive loss is large and the signal attenuates while propagating. Therefore driver circuits must provide enough voltage swing, or else the receiver circuit cannot sense the signal from the arrived waveform. Receiver circuits sense the signal and amplify it for the following logic gates. The designs of driver circuits, receiver circuits and transmission-lines are interdependent.

In this chapter, the design of the driver and the receiver of on-chip long distance interconnects is discussed. First a sizing method for a static CMOS driver is proposed. Driver sizing methods that take transmission-line effects into consideration have been proposed [26, 66, 95], and using an impedance-matched driver is a common practice. The discussion in Chapter 4 is also based on an impedance-matched driver. The proposed driver sizing is a method to improve the performance by using an impedance-unmatched driver. On-chip interconnects are lossy and the traveling wave attenuates. Even though impedance matching is achieved at the near-end, a sufficient voltage wave does not necessarily reach the far-end. Taking attenuation into consideration, a higher amplitude should be injected at the near-end by using a stronger driver than the impedance-matched driver. However, an impedance-unmatched driver causes multiple reflections, which may lead to ringing. Ref. [95] does not consider ringing in driver sizing, though ringing harmfully affects circuit behavior [5]. The proposed method resizes a driver to the necessary and sufficient strength according to the attenuation property, and hence it can prevent ringing. The proposed method therefore provides signal propagation at the velocity of the electromagnetic wave without overshoot, undershoot or ringing.

Then a performance estimation method for CML buffers is proposed. Reference [20] introduces the fundamentals of the CML buffer and discusses the signal propagation delay. However, throughput is becoming a more important metric than latency [80]. In this section, a bandwidth-driven CML design is proposed. The proposed method focuses on the pole location of the linearized system. The observation shows that there is a strong correlation between the pole location and the opening of the eye-diagram. By focusing on the pole location, a guideline for the number of stages or the taper factor in a cascaded configuration is obtained. Moreover, CML can achieve the highest performance affordable in a given process; therefore the performance prediction of CML buffers provides the maximum operating frequency that can be realized in the process. The pole can be estimated from several parameters predicted in a roadmap [4], so the proposed method also provides a performance prediction for future processes.

5.2 CMOS driver sizing for lossy transmission-lines

This section proposes a driver sizing method for lossy transmission-lines. In CMOS driver design, the relationship between the driver output impedance and the characteristic impedance of the transmission-line is important.

For simplicity, driver sizing for lossless transmission-lines is explained first, and the driver sizing formula is derived by using the equivalent driver resistance. Then a driver sizing method that can handle lossy transmission-lines is proposed.

5.2.1 Modeling of on-chip interconnects

In this section, a coplanar structure is used for evaluation. A signal line is adjacent to a ground line as shown in Fig. 5.1. The resistance, the inductance and the capacitance are extracted by using a field-solver. The interconnect parameters, such as line width, thickness, spacing and vertical distance to the upper/lower layers, are taken from the 0.18–0.05μm processes predicted in the ITRS roadmap [105] and from an actual 0.13μm process. As explained above, transmission-line effects are significant on interconnects with large width and thickness, such as global interconnects and clock wires. Therefore the interconnects in the top metal layers are discussed. For example, the wire width is 1600nm and the thickness is 1000nm in a 0.13μm process. Because the resistance and the inductance depend on frequency, the frequency used for R and L extraction has to be determined. In this section, the significant frequency is used as the extraction frequency because waveforms at both the near-end and the far-end are evaluated. The modeling accuracy of the interconnects is not essential to the driver sizing method itself, so a certain amount of modeling error is not a serious matter. The signal transition time $t_s$ is set to 30ps based on a circuit simulation result of a ring oscillator. In this case, the significant frequency is 11GHz. The transistor model is a predicted model based on the ITRS roadmap [106]. The model parameters are generated such that the I-V characteristics become similar to those predicted in the ITRS roadmap.
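
As a rough check of the numbers above, the following sketch reproduces the 11GHz value from the 30ps transition time. It assumes the common rule of thumb that the significant frequency is $0.34/t_s$; the constant 0.34 and the function name are assumptions, since this section does not restate the definition.

```python
def significant_frequency(transition_time_s: float, k: float = 0.34) -> float:
    """Return k / t_s in hertz.

    k = 0.34 is an assumed rule-of-thumb constant, not a value taken
    from this section.
    """
    if transition_time_s <= 0.0:
        raise ValueError("transition time must be positive")
    return k / transition_time_s


f_sig = significant_frequency(30e-12)  # t_s = 30 ps as stated in the text
print(f"significant frequency: {f_sig / 1e9:.1f} GHz")
```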

5.2.2 Driver output impedance and its impact on signal waveform

Figure 5.1: Interconnect structure for R, L, C extraction

![Interconnect structure](image)

Figure 5.2: The model of transmission-line

![Transmission-line model](image)

An important parameter of a transmission-line is the characteristic impedance $Z_0$. The characteristic impedance of a lossless transmission-line is expressed by $Z_0 = \sqrt{L/C}$, where $L$ and $C$ are the inductance and the capacitance of the interconnect, respectively. When a driver whose output impedance is \( Z_{drv} \) drives a wire with characteristic impedance \( Z_0 \) (Fig. 5.2), the voltage amplitude of the injected wave \( V \) at the output of the driver is expressed as

\[
V = \frac{Z_0}{Z_0 + Z_{drv}} V_{dd}, \tag{5.1}
\]

where \( V_{dd} \) is the supply voltage. The injected waveform is controlled by the output impedance of the driver \( Z_{drv} \). The reflection at the far-end is represented by the characteristic impedance \( Z_0 \) and the load impedance \( Z_L \). The amplitude of the reflected wave \( V_r \) is expressed as

\[
V_r = \frac{Z_L - Z_0}{Z_L + Z_0} V_i, \tag{5.2}
\]

where \( V_i \) is the amplitude of the traveling wave at the far-end of the transmission-line. From Eq. (5.2), the reflected wave disappears when the impedances \( Z_L \) and \( Z_0 \) are matched. In CMOS circuits, the output load is a relatively small capacitance, and hence the end of the line can be assumed to be open. As the impedance of the open end is infinite, the reflected wave at the far-end has the same voltage amplitude as the incident wave. Therefore, the voltage at the far-end rises to twice the incident voltage. For example, when an injected wave whose amplitude is \( V_{dd}/2 \) reaches the far-end, the voltage at the far-end rises to \( V_{dd} \), that is, the sum of the traveling wave \( V_i = V_{dd}/2 \) and the reflected wave \( V_r = V_{dd}/2 \).
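
A minimal sketch of Eqs. (5.1) and (5.2) for the matched-driver, open-end case described above. The function names are illustrative; the supply voltage of 1.2V and the characteristic impedance of 83.8Ω are the values used in the examples of Sec. 5.2.3.

```python
import math


def injected_amplitude(z0: float, z_drv: float, vdd: float) -> float:
    """Eq. (5.1): amplitude of the wave injected at the driver output."""
    return z0 / (z0 + z_drv) * vdd


def reflected_amplitude(z_load: float, z0: float, v_incident: float) -> float:
    """Eq. (5.2): amplitude of the wave reflected at the far-end."""
    if math.isinf(z_load):
        # Open end: the limit of (Z_L - Z_0) / (Z_L + Z_0) is 1.
        return v_incident
    return (z_load - z0) / (z_load + z0) * v_incident


vdd, z0 = 1.2, 83.8
v_i = injected_amplitude(z0, z_drv=z0, vdd=vdd)        # matched driver
v_far = v_i + reflected_amplitude(math.inf, z0, v_i)   # open far-end
print(f"injected: {v_i:.2f} V, far-end after reflection: {v_far:.2f} V")
```
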
5.2.3 Equivalent driver output resistance

Modeling a CMOS gate as a resistance is a widely used technique, and several methods for evaluating the equivalent driver resistance have been proposed. One simple method derives the equivalent resistance such that the time required for charging/discharging a capacitive load, that is, the RC time constant, becomes the same as the delay time of the CMOS driver [2]. This method is useful for conventional RC delay evaluation. However, it is not suitable for impedance matching of transmission-lines, because it assumes that the voltage changes from 0 to \( V_{dd} \) exponentially, whereas the voltage on a transmission-line changes stepwise. An example of the driver sizing result is shown in Fig. 5.3. The transistor widths are adjusted so that the equivalent resistance of the driver derived by the above method becomes equal to the characteristic impedance. The interconnect is a lossless transmission-line whose length is 5mm and whose characteristic impedance is 83.8Ω.

If the driver resistance matches the impedance of the transmission-line, the voltage injected into the transmission-line should be \( V_{dd}/2 = 0.6V \). However, as shown in Fig. 5.3, a voltage above 0.6V appears at the near-end. This is because the driver modeling method [2] overestimates the equivalent resistance. Therefore this method is not suitable for impedance matching.

In this work, the equivalent resistance is evaluated from the drain voltage and the drain current of the MOSFETs. The drain current of a transistor is represented as \( I_d(V_d) \), where \( V_d \) is the source-drain voltage. Here the gate-source voltage \( V_{gs} \) is assumed to be the same as the supply voltage \( V_{dd} \). The equivalent resistance of the standard-width (\( w_0 \)) transistor, \( R_{eq0} \), is defined as

\[
R_{eq0}(V_d) = \frac{V_d}{I_d(V_d)}. \tag{5.3}
\]

When the transistor width becomes \( w_s \) times larger, the drain current is assumed to increase \( w_s \)-fold. This assumption is valid when the narrow channel effect can be ignored, i.e., when the ratio of the channel width \( W \) to the channel length \( L \) is large. The equivalent resistance of the driver whose transistor width is \( w_s \times w_0 \) is

\[ R_{eq} = \frac{R_{eq0}}{w_s}. \tag{5.4} \]

This resistance is proposed as the equivalent resistance of a driver. Using Eq. (5.4), the equivalent resistance for various drivers whose transistor widths are different can be evaluated quickly without any additional circuit simulations.

5.2.4 Driver sizing for lossless transmission-lines

Before discussing lossy transmission-lines, driver sizing for lossless transmission-lines is discussed for simplicity. The next section handles lossy transmission-lines.

In lossless transmission-lines, impedance matching between the driver and the transmission-line is the solution for signal propagation without overshoot, undershoot and ringing [26]. When impedance matching is achieved, the voltage of the injected wave is \( V_{dd}/2 \). The full voltage reflects at the far-end and the voltage at the far-end rises to \( V_{dd} \). At the near-end, no reflection occurs. In this section, a method to achieve impedance matching using the proposed equivalent output resistance is explained. In the proposed driver resistance model, the equivalent resistance depends on the source-drain voltage \( V_d \). Therefore the drain voltage \( V_d \) is determined first. Suppose the input signal of a driver, whose transistor width is \( w_s \times w_0 \), falls from the supply voltage \( V_{dd} \) to \( 0 \). In this case, the pMOS pulls up the output from \( 0 \) to \( V_{dd} \). When the output voltage at the near-end of the interconnect rises to \( V_{near} \), the source-drain voltage \( V_d \) of nMOS is \( V_{near} \). The injected voltage at the near-end is determined by Eq. (5.1) as

\[ V_{near} = \frac{Z_0}{R_{eq0}/w_s + Z_0} V_{dd}, \tag{5.5} \]

where \( Z_0 \) is the characteristic impedance of the transmission-line. This equation can be transformed as follows.

\[ w_s = \frac{V_{near} R_{eq0}}{(V_{dd} - V_{near}) Z_0}. \tag{5.6} \]

This equation determines the driver size \( w_s \) that injects the voltage \( V_{near} \). Applying \( V_{near} = V_{dd}/2 \) to Eq. (5.6), the gate width \( w_s \) that achieves impedance matching is obtained. In this case, \( Z_0 = R_{eq0}/w_s \) holds. The proposed technique can determine the driver size quickly, because the driver resistance for various sizes can be obtained instantly as explained in Sec. 5.2.3. The result of impedance matching is shown in Fig. 5.4, where the interconnect structure of the evaluated transmission-line is the same as that of Fig. 5.3. Eq. (5.6) realizes impedance matching with higher accuracy than the method used for Fig. 5.3.
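
The sizing step can be sketched as follows. This is an illustration only: it treats \( R_{eq0} \) as a single number, whereas in the text it depends on \( V_d \), and the value of 10kΩ for the standard-width transistor is a hypothetical placeholder rather than a value from this thesis.

```python
def near_end_voltage(w_s: float, vdd: float, r_eq0: float, z0: float) -> float:
    """Eq. (5.5): voltage injected by a driver that is w_s times the standard width."""
    return z0 / (r_eq0 / w_s + z0) * vdd


def driver_width_multiplier(v_near: float, vdd: float, r_eq0: float, z0: float) -> float:
    """Eq. (5.6): width multiplier w_s that injects v_near into the line."""
    if not 0.0 < v_near < vdd:
        raise ValueError("v_near must lie strictly between 0 and vdd")
    return v_near * r_eq0 / ((vdd - v_near) * z0)


VDD = 1.2        # supply voltage used in the examples of this section
Z0 = 83.8        # characteristic impedance of the line of Fig. 5.3
R_EQ0 = 10e3     # hypothetical equivalent resistance of the standard-width transistor

w_matched = driver_width_multiplier(VDD / 2, VDD, R_EQ0, Z0)
print(f"matched driver: w_s = {w_matched:.1f}, R_eq = {R_EQ0 / w_matched:.1f} ohm")
```

With \( V_{near} = V_{dd}/2 \) the result reduces to \( w_s = R_{eq0}/Z_0 \), as stated above.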

5.2.5 Application to lossy transmission-lines

In this section, Eq. (5.6) is applied to lossy transmission-lines. In the case of lossy transmission-lines, the propagating signal attenuates. Therefore the amplitude of the injected wave has to be higher by the amount of loss.

For signal propagation, it is sufficient that the voltage at the far-end exceeds the logical threshold. When the loss becomes severe, the voltage shifts continuously, as shown in Fig. 5.5.
Chapter 5. Driver/Receiver design for high-speed signaling

Figure 5.4: Result of impedance matching by Eq. (5.6) (A 0.13\(\mu\)m process, characteristic impedance 84\(\Omega\), 5mm long wire, pMOS W/L = 345, nMOS W/L = 143)

Figure 5.5: Input response on lossy and lossless transmission-line

Because of this voltage shift, it is better to adjust the rise voltage at the far-end lower than the supply voltage $V_{dd}$. Here the rise voltage at the far-end is represented by $r \times V_{dd}$, where $0 < r < 1$. This is consistent with the conclusion of Sec. 5.2.6 that it is better to set $r$ lower when the attenuation is large.

In order to transmit a signal whose amplitude is $r \times V_{dd}$ to the far-end on lossy transmission-lines, the amplitude of the traveling wave at the far-end has to be $rV_{dd}/2$. The traveling wave reflects at the far-end and the voltage at the far-end rises to $rV_{dd}$. On transmission-lines, the amplitude of a wave attenuates exponentially with the attenuation constant $\alpha$. If the shunt conductance is negligible, the attenuation constant is expressed as

$$\alpha = \sqrt{\frac{1}{2}\,\omega C \sqrt{R^2 + (\omega L)^2} - \frac{1}{2}\,\omega^2 LC}. \tag{5.7}$$

The relationship between the voltage amplitude at the near-end $V_{near}$ and the amplitude at the far-end $V_{far}$ is expressed as follows.

$$V_{far} = V_{near} e^{-\alpha t}. \tag{5.8}$$

Therefore, in order to transmit a signal whose amplitude is $V_{far} = rV_{dd}/2$ to the far-end, $V_{near}$ should be

$$V_{near} = \frac{rV_{dd}}{2e^{-\alpha t}}. \tag{5.9}$$

Removing $V_{near}$ from Eq. (5.6) and Eq. (5.9), the necessary and sufficient strength of the driver is given by

$$w_s = \frac{r R_{eq0}}{(2e^{-\alpha t} - r)Z_0}. \tag{5.10}$$

Eq. (5.10) determines the driver strength that realizes signal propagation at the velocity of the electromagnetic wave.
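
The chain from Eq. (5.7) to Eq. (5.10) can be sketched numerically as below. Several things here are assumptions made for illustration: the per-unit-length R, L and C values and the 10kΩ equivalent resistance are hypothetical, the quantity in the exponent of Eq. (5.8) is taken to be the wire length, and \( Z_0 \) is approximated by the lossless value \( \sqrt{L/C} \).

```python
import cmath
import math


def attenuation_constant(r: float, l: float, c: float, omega: float) -> float:
    """Eq. (5.7): attenuation constant in Np/m with shunt conductance neglected."""
    inner = 0.5 * (omega * c * math.hypot(r, omega * l) - omega**2 * l * c)
    return math.sqrt(max(inner, 0.0))  # guard against tiny negative round-off


def lossy_driver_width(r_target: float, n: float, r_eq0: float, z0: float) -> float:
    """Eq. (5.10): width multiplier w_s; n is the one-way attenuation factor."""
    denom = 2.0 * n - r_target
    if denom <= 0.0:
        raise ValueError("line too lossy: no finite driver reaches this r")
    return r_target * r_eq0 / (denom * z0)


# Hypothetical per-unit-length parameters (ohm/m, H/m, F/m), not thesis data.
R, L, C = 1.0e4, 0.5e-6, 1.0e-10
omega = 2.0 * math.pi * 11e9   # significant frequency of Sec. 5.2.1
length = 5e-3                  # 5 mm wire, as in the examples

alpha = attenuation_constant(R, L, C, omega)
# Cross-check: alpha is the real part of sqrt((R + jwL)(jwC)).
gamma = cmath.sqrt(complex(R, omega * L) * complex(0.0, omega * C))
n = math.exp(-alpha * length)
z0 = math.sqrt(L / C)          # lossless approximation of Z0
w_s = lossy_driver_width(0.70, n, 10e3, z0)
print(f"alpha = {alpha:.1f} Np/m (check {gamma.real:.1f}), n = {n:.3f}, w_s = {w_s:.1f}")
```

For a lossless line (`n = 1`) and `r_target = 1` the function returns \( R_{eq0}/Z_0 \), the impedance-matched size of Sec. 5.2.4.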

An example of the driver sizing result is shown. The required rise voltage at the far-end, $rV_{dd}$, is set to 70% of the supply voltage. This value of 70% is the sum of the logical threshold voltage and a certain margin. In this case, the traveling-wave amplitude $V_{far}$ should be 35% of $V_{dd}$. The transistor sizes of the driver are tuned by Eq. (5.10) with the condition $r = 0.70$. The result of circuit simulation is shown in Fig. 5.6.

The same experiment is also carried out for other fabrication processes. Fig. 5.7 shows the result in a 50nm process. This figure shows that the signal propagates through the interconnect at the velocity of the electromagnetic wave without distortion of the waveform. Similar results are obtained in the other processes.

5.2.6 Ringing caused by impedance mismatch

The proposed method makes the driver stronger than the impedance-matched driver to compensate for the loss. When impedance matching is not achieved at the near-end, a wave on the transmission-line is reflected repeatedly at both ends of the line. These reflections may cause ringing, which distorts the waveform and degrades signal integrity. In order to avoid this problem, the wave that remains on the line after it has once reached the far-end should be small enough.

Figure 5.6: Result of gate sizing for a lossy transmission-line (A 0.13\(\mu\)m process, characteristic impedance 86\(\Omega\), 5mm long wire, pMOS \(W/L = 369\), nMOS \(W/L = 150\))

Figure 5.7: Result of gate sizing for a lossy transmission-line in a 50nm process (characteristic impedance 86\(\Omega\), 5mm long wire, pMOS \(W/L = 133\), nMOS \(W/L = 62\))
Here the condition under which the multiply-reflected wave becomes sufficiently small is discussed. The voltage amplitude of the wave on a transmission-line decreases gradually to 0 through attenuation and reflection at the near-end. The reflection at the far-end does not change the amplitude of the wave, since the far-end is open. The decay of the reflected wave is therefore described by the attenuation $n = e^{-\alpha t}$ and the reflection coefficient $\Gamma$. The wave reflects at the near-end with the reflection coefficient $\Gamma$, where $\Gamma$ is represented as $(R_{eq} - Z_0)/(R_{eq} + Z_0)$. The reflection changes the voltage amplitude of the wave from $V$ to $\Gamma V$, where $V$ is the voltage amplitude before the reflection. The voltage amplitude of the wave is also multiplied by $e^{-\alpha t}$ (Eq. (5.8)) each time the wave propagates along the line. The voltage $V^{(i)}_{\text{near}}$ is defined as the voltage amplitude of the wave after the $i$-th reflection at the near-end. Similarly, $V^{(i)}_{\text{far}}$ represents the amplitude after the $i$-th reflection at the far-end.

$$
\begin{aligned}
V^{(i)}_{\text{near}} &= n^{2i}\,\Gamma^{i}\,\frac{1-\Gamma}{2}\,V_{dd} \\
V^{(i)}_{\text{far}} &= n^{2i-1}\,\Gamma^{i-1}\,\frac{1-\Gamma}{2}\,V_{dd}
\end{aligned} \tag{5.11}
$$

where $n = e^{-\alpha t}$. Strictly speaking, the equivalent resistance $R_{eq}$ changes as the voltage at the near-end rises, and the reflection coefficient $\Gamma$ changes with it. Here, however, $\Gamma$ is assumed to be constant.

$$
\Gamma = \frac{R_{eq} - Z_0}{R_{eq} + Z_0}. \tag{5.12}
$$

From Eqs. (5.5) and (5.9), the relationship among $Z_0$, $R_{eq}$, $\Gamma$ and $n$ can be expressed as

$$
\frac{Z_0}{R_{eq} + Z_0} = \frac{r}{2n}. \tag{5.13}
$$

$\Gamma$ is transformed as follows from Eqs. (5.12) and (5.13).

$$
\Gamma = 1 - 2\frac{Z_0}{R_{eq} + Z_0} = 1 - 2\frac{r}{2n} = 1 - \frac{r}{n}. \tag{5.14}
$$

From Eq. (5.14), it is clear that when $n$ equals $r$, the reflection coefficient becomes 0 and no reflections occur at the near-end. The condition $r = n$ means that impedance matching is achieved at the near-end. $V^{(i)}_{\text{near}}$ and $V^{(i)}_{\text{far}}$ are expressed as follows using the variables of $i$, $n$ and $r$.

$$
\begin{aligned}
V^{(i)}_{\text{near}} &= n^{2i-1}\left(1 - \frac{r}{n}\right)^{i}\frac{r}{2}\,V_{dd} \\
V^{(i)}_{\text{far}} &= n^{2(i-1)}\left(1 - \frac{r}{n}\right)^{i-1}\frac{r}{2}\,V_{dd}
\end{aligned} \tag{5.15}
$$

The voltage amplitude of the wave reflected at the near-end after one round trip, $V^{(1)}_{\text{near}}$, is evaluated under various values of $n$ and $r$. The variable $r$, which represents the normalized amplitude of the signal transmitted to the far-end, is varied between 0.5 and 1. The results are shown in Fig. 5.8.

At a glance, the maximum voltage of the reflected wave is as much as 25% of the supply voltage. However, this maximum can be reduced by tuning $r$ according to $n$. When an interconnect is given, the attenuation $n$ is fixed, but $r$ is usually tunable. The reflections at the near-end can be removed by setting $r$ equal to $n$. In this case, the driver impedance is matched to the characteristic impedance of the line. Even when a minimum value $r_{\text{min}}$ is imposed and $r$ cannot be set to $n$, the multiple reflections are suppressed by setting $r$ to the allowed value closest to $n$. An example with $r_{\text{min}} = 0.8$ is highlighted by a thick line in Fig. 5.8. The amplitude of the reflected wave is decreased by the adjustment of $r$, and the maximum amplitude is below 10% of the supply voltage. The proposed method can therefore propagate signals without overshoot, undershoot and ringing.
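
The tuning rule above can be sketched as follows. The bookkeeping is an assumption made for this illustration: the wave leaves the far-end with amplitude $rV_{dd}/2$, is attenuated by $n$ on the way back, and is then multiplied by $\Gamma = 1 - r/n$ at the near-end. The function names are hypothetical.

```python
def near_end_reflection_coefficient(r: float, n: float) -> float:
    """Eq. (5.14): reflection coefficient at the near-end."""
    return 1.0 - r / n


def first_reflected_amplitude(r: float, n: float) -> float:
    """Amplitude after the first near-end reflection, normalized by Vdd.

    Assumed bookkeeping: r/2 leaves the far-end, times n for the return
    trip, times Gamma for the near-end reflection.
    """
    return n * near_end_reflection_coefficient(r, n) * r / 2.0


def choose_r(n: float, r_min: float, r_max: float = 1.0) -> float:
    """Pick the allowed r closest to n, which minimizes |Gamma|."""
    return min(max(n, r_min), r_max)


for n in (0.9, 0.7, 0.6):
    r = choose_r(n, r_min=0.8)
    print(f"n = {n:.1f}: r = {r:.1f}, reflected = {first_reflected_amplitude(r, n):+.3f} Vdd")
```

Under this bookkeeping, $n = 0.5$ with $r = 1$ gives a magnitude of 0.25, in line with the 25% figure quoted above, although the range of $n$ plotted in Fig. 5.8 is not stated in the text.
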

Figure 5.8: Normalized voltage amplitude of the wave which reflects at near-end (normalized by the supply voltage, $i = 1$)

![Normalized voltage amplitude](image)

Figure 5.9: A tapered static CMOS buffer.

5.3 Bandwidth of static CMOS drivers

In this section, the bandwidth of static CMOS drivers is discussed.

5.3.1 Tapered static CMOS drivers

Figure 5.9 shows a tapered CMOS buffer. When the load of the buffer is an on-chip transmission-line, the characteristic impedance is typically from 50$\Omega$ to 100$\Omega$. To achieve impedance matching with the on-chip interconnect, the size of transistors becomes large. For example in a 90nm process, the transistor size has to be $W/L = 508$ for the pMOS transistor and $W/L = 212$ for the nMOS transistor. The tapered buffer is an effective technique to compose such large inverters. In Fig. 5.9, $N$ is the number of stages and $u$ is the taper factor. Each stage of the tapered buffer is $u$ times larger than the previous stage. The final stage of the tapered buffer is $u^{N-1}$ times larger than the first stage. Here the ratio $u^{N-1}$ is written as $X$ for simplicity.

Figure 5.10: Experimental circuit for eye-diagram evaluation.

As the taper factor $u$ becomes smaller, the load capacitance of each stage becomes relatively small and the operating frequency is improved. However, a large number of stages is required to reduce the taper factor. As the number of stages increases, the power dissipation increases. Assuming that the power dissipation is proportional to the gate width, the power dissipation of the tapered buffer is expressed as

$$\sum_{n=1}^{N} \frac{P_1}{u^{n-1}} = P_1 \frac{1 - (1/u)^N}{1 - 1/u}, \tag{5.16}$$

where $P_1$ is the power dissipation of the final stage. Therefore there is a trade-off between the bandwidth and the power dissipation.
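
A small numerical sketch of this power model: each stage is assumed to dissipate $1/u$ of the power of the stage that follows it, so the total is a geometric series. The function names are illustrative, and the power is normalized to the final stage.

```python
def tapered_power_sum(p_final: float, u: float, n_stages: int) -> float:
    """Stage-by-stage sum: the final stage dissipates p_final, each earlier one 1/u of the next."""
    return sum(p_final / u**k for k in range(n_stages))


def tapered_power_closed_form(p_final: float, u: float, n_stages: int) -> float:
    """Closed form of the same geometric series."""
    if u == 1.0:
        return p_final * n_stages
    return p_final * (1.0 - (1.0 / u) ** n_stages) / (1.0 - 1.0 / u)


# Example: X = u**(N-1) = 9 realized with N = 3 stages (u = 3).
print(tapered_power_sum(1.0, 3.0, 3), tapered_power_closed_form(1.0, 3.0, 3))
```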

5.3.2 Bandwidth of tapered CMOS buffers

Here the bandwidth of tapered CMOS buffers is experimentally evaluated by circuit simulation [19]. The experimental circuit is shown in Fig. 5.10. The transistors in the final stage are tuned to have a 50\(\Omega\) output impedance. A 90nm process is assumed for the transistor model, and the transistor sizes of the final stage are $W/L = 508$ for the pMOS transistor and $W/L = 212$ for the nMOS transistor. To evaluate the buffer performance, no load is connected to the output of the final stage. The input is a random NRZ pulse sequence with a trapezoidal pulse shape. The eye-diagram at the output of the final stage is evaluated.

The number of stages is changed for the cases where the ratio $X$ is 3, 9 or 27. The eye-opening voltage versus the operating frequency is shown in Fig. 5.11–Fig. 5.13. As shown in Fig. 5.11–Fig. 5.13, the maximum operating frequency increases as the taper factor is reduced. The experimental results also show that the benefit of adding a stage decreases as the number of stages increases. When designing tapered CMOS drivers, the number of stages should be determined in consideration of the trade-off between the bandwidth and the power dissipation.

5.4 CML driver/receiver design based on the pole frequency

This section proposes a performance estimation method for CML buffers. The proposed method focuses on the pole location of the tapered CML buffer and estimates the bandwidth.
Figure 5.11: Relationship between the taper factor and the operating frequency ($X = 3$).

Figure 5.12: Relationship between the taper factor and the operating frequency ($X = 9$).

Figure 5.13: Relationship between the taper factor and the operating frequency ($X = 27$).

5.4.1 Tapered CML buffer

An on-chip CML driver has to drive a differential transmission-line. The characteristic impedance of on-chip differential lines is typically in the range from 50Ω to 200Ω. To drive such a low-impedance load, a tapered driver is used. Therefore the number of stages and the taper factor are also design parameters.

Figure 5.14 shows a tapered driver. In Fig. 5.14, the number of stages is \( N \) and the taper factor is \( u \). In this discussion, the input stage of the tapered driver is called the 1st stage and the last stage is the \( N \)-th stage. The transistor size and the bias current scale up stage by stage with the taper factor \( u \). Conversely, the pull-up resistance scales down with the taper factor \( u \). The relationship between the \( k \)-th stage and the \( (k+1) \)-st stage is expressed as

\[
\begin{aligned}
R_{D,k+1} &= R_{D,k}/u \\
W_{k+1} &= uW_k \\
I_{\text{tail},k+1} &= uI_{\text{tail},k}
\end{aligned} \tag{5.17}
\]

where the subscripts \( k+1 \) and \( k \) denote the \( (k+1) \)-st stage and the \( k \)-th stage, respectively. The parameter \( X \) is the ratio between the first stage and the last stage and is equal to \( u^{N-1} \).

The latency of tapered CML buffers is discussed in Ref. [20], which concludes that the delay of a CML buffer behaves similarly to that of a static CMOS buffer and that the number of stages becomes optimal when the number of stages \( N \) satisfies

\[
N \approx \ln\left(u^{N-1}\right). \tag{5.18}
\]

In other words, the optimal taper factor is Napier's constant \( e \). However, the bandwidth is also an important metric of a CML driver. In the following part, a design guideline to maximize the bandwidth is proposed.

5.4.2 Pole frequency of tapered CML buffers

The frequency characteristics of the small-signal gain of CML buffers are discussed.

As the taper factor is scaled down, the bandwidth of each stage increases because the load capacitance of each stage decreases. However, as the number of stages increases, the gain of the cascaded driver drops quickly at high frequency. The proposed method focuses on the frequency characteristics of the voltage gain. A factor that determines the frequency characteristics is the pole frequency of the system.

Except for the final stage, the load of each stage is the CML buffer in the next stage. The load of the nMOS transistor of the $k$-th stage is the pull-up resistor $R_{DL_k}$, drain-backgate capacitance $C_{DB_k}$, and the gate capacitance of the next stage $C_{G_{k+1}}$. Therefore the pole frequency $\omega_{pk}$ is expressed by

$$\omega_{pk} = \frac{1}{R_{DL_k} (C_{DB_k} + C_{G_{k+1}})}. \tag{5.19}$$

Here by defining the pull-up resistance, the drain-backgate capacitance and the gate capacitance of the final stage as $R_D$, $C_{DB}$, and $C_G$ respectively, Eq. (5.19) is rewritten as

$$\omega_{pk} = \frac{1}{u^{N-k} R_D \left( \frac{C_{DB}}{u^{N-k}} + \frac{C_G}{u^{N-k-1}} \right)} = \frac{1}{R_D (C_{DB} + u C_G)}. \tag{5.20}$$

Therefore the poles of all stages except the final stage are at the same frequency. Figure 5.15 shows gain curves of the tapered driver. The driver is designed for a differential transmission-line whose differential characteristic impedance is $100\Omega$. The fabrication technology is a 90nm CMOS process. In the final stage, the pull-up resistance is $50\Omega$, the transistor size $W/L$ is 675 and the bias current is 9.6mA. The ratio $X$ between the first and the last stage is fixed to 9 and the number of stages is changed. The smooth curves in Fig. 5.15 are calculated by circuit simulation [19] and the piecewise-linear curves are derived from the pole frequency calculated by Eq. (5.20). When the number of stages is 2, the pole frequency is relatively low because the taper factor $u$ is large and the load capacitance is large. As the number of stages increases, the taper factor decreases and the pole location shifts to a higher frequency.
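
The statement that all intermediate stages share one pole can be checked numerically. The sketch below scales the final-stage values back to stage $k$ as in Eq. (5.17) and compares the result with the closed form of Eq. (5.20). The 50Ω pull-up resistance is taken from the text; the capacitance values are hypothetical placeholders.

```python
import math


def stage_pole(k: int, n_stages: int, u: float, r_d: float, c_db: float, c_g: float) -> float:
    """Pole (rad/s) of stage k (1 <= k < N) built from final-stage values r_d, c_db, c_g."""
    scale = u ** (n_stages - k)
    r_dk = scale * r_d                            # pull-up resistance grows toward the input
    c_dbk = c_db / scale                          # drain capacitance shrinks with the width
    c_g_next = c_g / u ** (n_stages - k - 1)      # gate capacitance of stage k+1
    return 1.0 / (r_dk * (c_dbk + c_g_next))


def pole_closed_form(u: float, r_d: float, c_db: float, c_g: float) -> float:
    """Eq. (5.20): common pole of every stage except the last."""
    return 1.0 / (r_d * (c_db + u * c_g))


R_D = 50.0       # final-stage pull-up resistance from the text
C_DB = 100e-15   # hypothetical final-stage drain-backgate capacitance
C_G = 150e-15    # hypothetical final-stage gate capacitance
N, U = 4, 9 ** (1 / 3)

for k in range(1, N):
    f_k = stage_pole(k, N, U, R_D, C_DB, C_G) / (2 * math.pi)
    print(f"stage {k}: pole at {f_k / 1e9:.2f} GHz")
```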

The pole frequency is based on a small-signal analysis. Strictly speaking, this small-signal analysis has a certain error because the voltage amplitude of CML buffers cannot be assumed to be infinitesimal. The next section experimentally verifies the correlation between the pole frequency based on the small-signal analysis and the eye-diagram obtained by transient analysis.

5.4.3 Performance estimation based on the pole frequency

To discuss the effect of the frequency characteristics on time-domain waveforms, eye-diagrams are evaluated. The experimental circuit is shown in Fig. 5.16. The tapered driver is designed as discussed above. The maximum differential output voltage $\Delta V_{\text{max}} = 2R_D I_{\text{tail}}$ is 0.48V. The transistor model is taken from a 90nm process. To evaluate the performance of the driver, the load of the final stage is an ideal resistor that has the same value as the differential characteristic impedance of the transmission-line. The input is a random NRZ pulse sequence. The pulse shape is trapezoidal, with a transition time of one tenth of the minimum pulse width. The eye-diagram at the output of the final stage is evaluated.

Figures 5.17, 5.18 and 5.19 show the eye-opening voltage as a function of the clock frequency. The ratio $X$ between the first stage and the last stage is set to 3, 9 and 27. As the clock frequency becomes higher, the eye-diagram degrades and the eye-opening voltage decreases. The arrows at the top of each figure point to the pole frequency calculated by Eq. (5.20). From Figs. 5.17–5.19, the pole frequency indicates the frequency where the eye-diagram starts degrading. The pole frequency depends on the taper factor $u$, and thus the necessary taper factor can be determined from the pole frequency expressed by Eq. (5.20).

Figure 5.20 shows the pole frequency versus the taper factor of the driver used for Figs. 5.17–5.19. As the taper factor approaches 1, the pole frequency becomes higher. However, in order to decrease the taper factor, the number of stages has to be increased. Figure 5.21 shows the pole frequency as a function of the number of stages. The taper factor $u$ is expressed as

$$u = X^{1/(N-1)}. \tag{5.21}$$

Therefore as the number of stages increases, the efficiency of adding the stage decreases. On the other hand, the power dissipation increases as the number of stages increases. The dashed line in Fig. 5.21 shows the sum of the bias current normalized by the current in the final stage. The sum of the current is expressed as

$$\sum_{n=1}^{N} \frac{I_{\text{tail}}}{u^{n-1}} = I_{\text{tail}} \frac{1 - (1/u)^N}{1 - 1/u}, \tag{5.22}$$

where $I_{\text{tail}}$ is the bias current in the final stage. As shown in Fig. 5.21, the sum of the current is almost proportional to the number of stages. From the discussion above, increasing the number of stages improves the maximum operating frequency and increases the power dissipation at the same time.

Figure 5.17: Clock frequency versus eye-opening voltage ($X = 3$).

Figure 5.18: Clock frequency versus eye-opening voltage ($X = 9$).

Figure 5.19: Clock frequency versus eye-opening voltage ($X = 27$).

Figure 5.20: The pole frequency of a CML buffer versus the taper factor.

Figure 5.21: The pole frequency and the total bias current versus the number of stages \((X = 9)\).

As the number of stages increases, the rate of bandwidth improvement decreases, although the power dissipation increases almost linearly with the number of stages. For example, when the number of stages is changed from 3 to 4, the bandwidth improves by 31% and the power dissipation increases by 23%. However, when the number of stages is changed from 4 to 5, the improvement of the bandwidth is only 13% in spite of a 19% increase in power dissipation. Therefore the number of stages should be determined considering the trade-off between the performance and the power dissipation, and the proposed method provides a cost-performance estimate with regard to the number of stages.
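
The trade-off can be tabulated with a short sketch built on Eqs. (5.20)–(5.22). The ratio $C_{DB}/C_G = 1$ used for the relative pole frequency is a hypothetical placeholder, so the printed percentages illustrate the trend only and are not expected to reproduce the simulated figures quoted above.

```python
def taper_factor(x: float, n_stages: int) -> float:
    """Eq. (5.21): taper factor for a first-to-last ratio x and n_stages >= 2."""
    return x ** (1.0 / (n_stages - 1))


def normalized_total_current(u: float, n_stages: int) -> float:
    """Left-hand side of Eq. (5.22) divided by the final-stage tail current."""
    return sum(u ** (-k) for k in range(n_stages))


def relative_pole(u: float, cdb_over_cg: float) -> float:
    """Pole frequency of Eq. (5.20) up to the constant factor 1 / (R_D * C_G)."""
    return 1.0 / (cdb_over_cg + u)


X = 9.0
CDB_OVER_CG = 1.0   # hypothetical capacitance ratio

prev = None
for n in range(2, 7):
    u = taper_factor(X, n)
    pole = relative_pole(u, CDB_OVER_CG)
    current = normalized_total_current(u, n)
    if prev is not None:
        print(f"N = {n}: pole +{(pole / prev[0] - 1) * 100:.0f}%, "
              f"current +{(current / prev[1] - 1) * 100:.0f}%")
    prev = (pole, current)
```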

5.4.4 Output amplitude and the performance of CML buffers

Here the relationship between the output amplitude and the bandwidth of CML driver is discussed. When the output amplitude is changed, the tail current changes because the pull up resistance \(R_D\) is fixed to the characteristic impedance of the interconnect. The gate width of the nMOS transistor depends on the tail current and the capacitances vary with the gate width.

From Eq. (1.4), the gate width \(W\) is proportional to the tail current \(I_{\text{tail}}\) and the inverse of the square of the output voltage \(V_{\text{out}}\). The tail current is expressed as

\[
I_{\text{tail}} = \frac{V_{\text{out}}}{R_D}. \tag{5.23}
\]

This equation leads to the gate width as

\[
W = \frac{2L}{\mu C_{\text{ox}} R_D \Delta V_{\text{out}}}. \tag{5.24}
\]

Therefore the gate width becomes small as the output voltage becomes large because the gate-source voltage of the nMOS transistor increases. From the above discussion, the bandwidth of the CML driver improves as the output amplitude increases. Figure 5.22 shows the eye-opening when the output voltage is changed. The transistor model and the load of the CML driver are the same as those of Section 5.4.3.
The ratio between the first stage and the final stage is fixed to 9 and the number of stages is fixed to 4. The output voltage is varied from 0.18V to 0.36V. The eye-openings are evaluated by circuit simulation [19] and the dashed line labeled “Pole location by the proposed equation” shows the relationship between the output voltage and the pole frequency. From Fig. 5.22, the bandwidth improves as the output voltage increases. Figure 5.22 also shows that the frequency where the eye-opening starts to degrade is predictable by the proposed formula, Eq. (5.20).

When the output voltage is changed, the bandwidth and the power dissipation are proportional to the output voltage. Therefore, when the maximum bandwidth is required, the output voltage has to be set to the threshold voltage, because the output voltage is limited by the threshold voltage in order for the transistors to operate in the saturation region. In the case of Fig. 5.22, the threshold voltage in the saturation region is 0.37V and the maximum bandwidth is 29GHz.
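
The dependence of the gate width on the output amplitude in Eq. (5.24) can be illustrated as below. The gate length and the value of $\mu C_{\text{ox}}$ are hypothetical placeholders chosen only to give plausible magnitudes; the 50Ω pull-up resistance and the 0.18V to 0.36V sweep follow the text.

```python
def cml_gate_width(l_gate: float, mu_cox: float, r_d: float, dv_out: float) -> float:
    """Eq. (5.24): gate width in metres for a given output amplitude."""
    if dv_out <= 0.0:
        raise ValueError("output amplitude must be positive")
    return 2.0 * l_gate / (mu_cox * r_d * dv_out)


L_GATE = 90e-9    # hypothetical gate length (m)
MU_COX = 300e-6   # hypothetical mu * C_ox (A/V^2)
R_D = 50.0        # pull-up resistance matched to the line

for dv in (0.18, 0.24, 0.30, 0.36):
    w = cml_gate_width(L_GATE, MU_COX, R_D, dv)
    print(f"dV_out = {dv:.2f} V -> W = {w * 1e6:.1f} um (W/L = {w / L_GATE:.0f})")
```

Since the capacitances scale with the gate width, halving the width by doubling the output amplitude is what moves the pole of Eq. (5.20) to a higher frequency.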

5.4.5 Performance prediction of CML buffers

By using Eq. (5.20), the maximum performance of future processes can be predicted. From Eq. (5.20), the pole frequency depends on the pull-up resistance, the drain-backgate capacitance, the gate capacitance and the taper factor. The resistance is determined by the characteristic impedance of the transmission-line, and the characteristic impedance does not change drastically with technology scaling. For simplicity, two rules are assumed.

- When the gate length $L$ is scaled, the area of the drain scales in the same ratio. By this assumption, the drain-backgate capacitance can be considered to be proportional to the product of the gate length $L$ and the gate width $W$.

- The drain-backgate capacitance per area scales in the same ratio as the gate capacitance per area.

![Performance prediction of CML driver](image)

Figure 5.23: Performance prediction of CML driver.

From Eq. (1.2), the gate width is proportional to $L/(\mu C_{\text{ox}})$. As a result, if the transmission-line to be driven and the taper factor are fixed, the pole frequency $\omega_p$ scales as

\[
\omega_p \propto \frac{\mu C_{\text{ox}}}{L^2} \frac{1}{R_D(C_{DB} + uC_G)}. \tag{5.25}
\]

As the technology scales, $\mu C_{\text{ox}}$ increases and the gate capacitance decreases [4]. Assuming the capacitance $C_{DB}$ scales at the same ratio as the gate capacitance $C_G$, the performance trend is predicted as shown in Fig. 5.23. The x-axis is the technology node and the y-axis is the pole frequency normalized by the value at the 100nm process. The characteristics of the future process are taken from a roadmap [4]. From Fig. 5.23, the performance of the CML driver improves as the technology scales. The pole frequency expressed by Eq. (5.20) indicates the maximum performance of CML driver in a certain fabrication process. Therefore combining the performance prediction of on-chip interconnect proposed in Section 4.2 and that discussed in this section, the maximum performance of the system can be estimated.
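The trend of Eq. (5.25) can be sketched numerically as below. This is only an illustration of the normalization used for Fig. 5.23: the per-node scaling values are made-up placeholders rather than roadmap data, $u$ is taken to be the taper factor, and $C_{DB}$ and $C_G$ are interpreted as per-area capacitances relative to the 100nm node (an assumption based on the two scaling rules above).

```python
def pole_frequency_trend(mu_cox, gate_length, c_db, c_g, taper=2.0, r_d=50.0):
    """Quantity proportional to omega_p according to Eq. (5.25)."""
    return mu_cox / gate_length**2 / (r_d * (c_db + taper * c_g))


def normalized_trend(nodes, reference):
    """Normalize the trend of every node by the value at the reference node."""
    ref = pole_frequency_trend(**nodes[reference])
    return {name: pole_frequency_trend(**p) / ref for name, p in nodes.items()}


# Hypothetical relative values (100nm = 1.0); NOT taken from the roadmap [4].
NODES = {
    "100nm": dict(mu_cox=1.0, gate_length=1.0, c_db=1.0, c_g=1.0),
    "70nm": dict(mu_cox=1.2, gate_length=0.7, c_db=0.9, c_g=0.9),
    "50nm": dict(mu_cox=1.4, gate_length=0.5, c_db=0.8, c_g=0.8),
}

for name, value in normalized_trend(NODES, "100nm").items():
    print(f"{name}: normalized pole frequency = {value:.2f}")
```

Because the result is normalized, the fixed resistance $R_D$ cancels out; only the relative change of $\mu C_{\text{ox}}$, $L$ and the capacitances matters.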

5.4.6 Design of CML receivers

The CML receiver is also the CML buffer shown in Fig. 1.17. The difference between CML drivers and CML receivers is their constraint of the pull up resistance $R_D$. In the CML driver, the pull up resistance has to be matched with the characteristic impedance of the interconnect. The CML receiver has more design freedom than the driver because the pull up resistance can be varied.

Here the output voltage is assumed to be determined by the logic gates that follow the receiver. From Eq. (5.24), the gate width of the transistor is inversely proportional to the pull up resistance $R_D$. When the resistance $R_D$ increases, the tail current $I_{\text{th}}$ decreases and the nMOS transistor shrinks. Therefore the product $R_D C_{DB}$ stays constant even if the resistance $R_D$ is changed. The pole frequency of the receiver is expressed as

\[
\omega_p = \frac{1}{R_D (C_{DB} + C_{in})}, \tag{5.26}
\]

where \( C_{DB} \) is the drain-backgate capacitance of the receiver and \( C_{in} \) is the input capacitance of the next logic gates. If the output voltage \( V_{out} \) and the capacitance \( C_{in} \) are fixed, the pole frequency is controlled by the resistance \( R_D \) and there is a trade off between the pole frequency and the tail current. As the resistance \( R_D \) increases, the pole location shifts to lower frequency and the tail current decreases. By using Eq. (5.20) in receiver design, the necessary and sufficient bandwidth is realized and the power dissipation can be suppressed to the minimum.
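The trade-off between the pole frequency and the tail current can be sketched as follows. The sketch assumes, as stated above, that the product $R_D C_{DB}$ is constant, and additionally assumes that the output swing equals the tail current times $R_D$. The function names and all numerical values are hypothetical placeholders.

```python
import math


def receiver_pole_hz(r_d, rc_db, c_in):
    """Pole frequency [Hz] from Eq. (5.26), with R_D*C_DB held constant (rc_db)."""
    return 1.0 / (2.0 * math.pi * (rc_db + r_d * c_in))


def max_pullup_for_bandwidth(f_target_hz, rc_db, c_in):
    """Largest R_D whose pole frequency is still at or above f_target_hz."""
    tau = 1.0 / (2.0 * math.pi * f_target_hz)
    if tau <= rc_db:
        raise ValueError("target unreachable: R_D*C_DB alone is too slow")
    return (tau - rc_db) / c_in


def tail_current(v_out, r_d):
    """Tail current, assuming the output swing is I_tail * R_D."""
    return v_out / r_d


# Placeholder values (assumptions for illustration only).
RC_DB = 2e-12   # constant product R_D * C_DB [s]
C_IN = 20e-15   # input capacitance of the next gates [F]
V_OUT = 0.3     # required output swing [V]

r_d = max_pullup_for_bandwidth(10e9, RC_DB, C_IN)
print(f"R_D = {r_d:.0f} ohm, I_tail = {tail_current(V_OUT, r_d) * 1e3:.2f} mA")
```

Choosing the largest resistance that still meets the bandwidth target gives the smallest tail current, which is the sense in which the power dissipation is minimized.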

5.5 Summary

In this chapter, the driver and receiver design for on-chip interconnects is discussed. First, a driver sizing method for lossy transmission-lines in VLSI is proposed. Impedance matching is a common practice in the design of transmission-line drivers. The proposed method instead uses an impedance-unmatched driver to improve the performance of the signaling system. The proposed method determines the injected voltage properly, considering attenuation and reflection. Therefore the proposed method realizes signal propagation at the velocity of the electromagnetic wave without deteriorating waveforms. Experiments verify that the proposed method is effective in 0.18-0.05\( \mu \)m technologies. In this thesis, the impedance-unmatched driver is discussed only for static CMOS drivers. Applying this method to CML drivers is future work.

Then a bandwidth estimation of CML buffers based on the pole frequency is proposed. By focusing on the pole frequency of the tapered CML driver, the proposed method derives the relationship between the number of stages or the taper factor and the bandwidth of the CML driver. The effect of increasing the number of stages on the bandwidth and the power dissipation is expressed by simple equations. The proposed method also provides a performance prediction for future processes. Combining this method with the performance prediction of on-chip interconnects proposed in the previous chapter indicates the maximum performance that can be realized in a certain process.
Chapter 6

Design methodology of on-chip high-speed signaling

6.1 Introduction

In this chapter, the performance of on-chip high-speed signaling systems is discussed. The performance of on-chip interconnects is discussed in Chapter 4 and the performance of the driver and the receiver is discussed in Chapter 5. This chapter merges these two performance estimation methods and proposes a performance estimation of the total signaling system. Conventionally, on-chip signaling is single-end and static CMOS drivers/receivers are used. The bottleneck of the signaling system used to be the CMOS driver; however, as the performance of transistors improves, interconnects are becoming the bottleneck of the system. Therefore it is important to discuss the performance of the interconnect and that of the driver/receiver together. Moreover there is an option to use differential signaling and CML buffers. As discussed so far, CML and differential signaling can achieve high performance but they require considerable cost in power dissipation and interconnect resources. It is therefore also a crucial question in which situations differential signaling is required.

This chapter proposes a performance estimation method based on the methods discussed in Chapters 4 and 5. The proposed method provides the maximum performance of the conventional single-end signaling and the differential signaling using CML. The performance estimation by the proposed method is based on analytical performance estimation and the required parameters can be obtained in the early stage of circuit design. The selection of the signaling method strongly affects the entire chip design because it may limit the whole chip performance and the required power and interconnect resource are quite different between single-end signaling and differential signaling. Therefore which method should be used is a crucial problem. The contribution of the proposed method is providing a design guideline of on-chip signaling in the early stage of circuit design.

6.2 Interconnects under study

First the interconnect structure under study is explained. To discuss the maximum performance, thick and wide interconnects are used. The cross section of the structure is shown in Fig. 6.1. The interconnects are assumed to be in the top layer of a certain process and the thickness of the wire is 1μm. The width of each interconnect is 4μm to reduce the resistive loss. The spacing is adjusted so that the characteristic impedance becomes 50Ω for single-end and 100Ω for differential. The dielectric constant is 3.0, assuming that a low-k dielectric is used. The frequency-dependent characteristics are extracted by a 2D field-solver. In the analytical performance estimation, the characteristics at the frequency determined by the method proposed in Chapter 3 are used. For example, the extraction frequency is 8.6GHz for a 5mm long single-end line. At the extraction frequency, the attenuation constant α is 71Np/m in the single-end interconnect and 89Np/m in the differential pair.

Figure 6.1: Sectional structure of the interconnects under study. (Labels in the figure: single-end and differential structures with ground (G) and signal (S) wires; dimensions 4μm, 1.6μm, 3.6μm, 15μm and 10μm; air above the wires; ILD (k=3.0); Si substrate (conductivity=0).)

6.3 Performance of single-end signaling

This section discusses the performance of the conventional single-end signaling using static CMOS inverters. The maximum performance of the static CMOS buffer depends on the characteristic impedance of the interconnect and the taper factor. For example, the maximum operating frequency of a 4-stage buffer whose taper factor is 2.1 is about 15GHz from Fig. 5.12. The performance of an open-ended transmission-line is discussed in Chapter 4, and here the equation of the eye-opening voltage is restated.

\[
V_{\text{eye}} = \begin{cases}
\dfrac{1-n}{l/v} (T - t_r) + 2n - 1 & (T < 2t_{\text{tot}}) \\
V_{\text{max}} = 1 & (T > 2t_{\text{tot}})
\end{cases}, \tag{6.1}
\]

where \( n \) is the attenuation parameter and is equal to \( \exp(-\alpha l) \), \( \alpha \) is the attenuation constant, \( l \) is the interconnect length, \( v \) is the speed of the electromagnetic wave, \( T \) and \( t_r \) are the minimum pulse width and the transition time of the input pulse respectively, and \( t_{\text{tot}} \) is the signal time-of-flight. The supply voltage is assumed to be 1V. In the following discussion, the transition time \( t_r \) is assumed to be one tenth of the minimum pulse width \( T \). Here the required eye-opening at the input of the receiver is written as \( V_{\text{req}} \). From Eq. (6.1), the maximum operating frequency \( f_{\text{max}} \) is expressed as

\[
f_{\text{max}} = \frac{1}{2T} = \frac{1}{2} \left( 1 - \frac{t_r}{T} \right) \frac{1 - n}{(l/v)(V_{\text{req}} - 2n + 1)} = \frac{9}{20} \frac{1 - n}{(l/v)(V_{\text{req}} - 2n + 1)}. \tag{6.2}
\]

The above equation indicates that as the required receiver input \( V_{\text{req}} \) approaches the voltage \( (2n - 1) \), the maximum operating frequency increases toward infinity. The meaning of the voltage \( (2n - 1) \) is shown in Fig. 6.2. Assuming impedance matching at the near-end, the rise voltage at the far-end is $n$. Therefore the eye-opening voltage at the transition is $(2n - 1)$. If the voltage $(2n - 1)$ exceeds the required eye-opening $V_{\text{req}}$, the maximum eye-opening $V_{\text{eye}}$ certainly exceeds the required eye-opening $V_{\text{req}}$.

Figure 6.2: Meaning of the voltage $(2n - 1)$.

Figure 6.3: Performance limitation of the single-end signaling with various required eye-opening.

Figure 6.4: Performance limitation of the single-end signaling with various attenuation parameters.
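The relationship between the eye-opening and the maximum operating frequency for a 1V supply can be checked numerically with the sketch below. It uses the attenuation constant quoted in Section 6.2 (71Np/m for the single-end line) and a required eye-opening of 0.8V as an example value. The propagation velocity is an assumption here: it is taken as the speed of light divided by $\sqrt{3.0}$, that is, a wave travelling entirely in the k=3.0 dielectric. Function names are illustrative.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum [m/s]


def eye_opening(pulse_width, t_r, n, tof):
    """Eye-opening voltage for a 1 V supply: linear below 2*tof, else 1."""
    if pulse_width < 2.0 * tof:
        return (1.0 - n) / tof * (pulse_width - t_r) + 2.0 * n - 1.0
    return 1.0


def f_max_single_end(alpha, length, v_req, velocity, tr_ratio=0.1):
    """Maximum operating frequency; infinite when V_req <= 2n - 1."""
    n = math.exp(-alpha * length)
    margin = v_req - (2.0 * n - 1.0)
    if margin <= 0.0:
        return math.inf  # the interconnect does not limit the performance
    tof = length / velocity
    return 0.5 * (1.0 - tr_ratio) * (1.0 - n) / (tof * margin)


VELOCITY = C0 / math.sqrt(3.0)  # assumed: wave fully inside the k=3.0 ILD

f = f_max_single_end(alpha=71.0, length=5e-3, v_req=0.8, velocity=VELOCITY)
print(f"f_max for a 5 mm line: {f / 1e9:.1f} GHz")
```

Under these assumptions the 5mm line is limited to roughly 11.7GHz, while a 1mm line returns infinity because $(2n - 1)$ already exceeds the required eye-opening.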

Figure 6.3 shows the performance limitation of the single-end interconnect shown in Fig. 6.1. The horizontal line labeled “limitation of driver” shows the performance limitation of a static CMOS driver discussed in Section 5.3. The region surrounded by the x-axis, the y-axis, the limitation of driver and the limitation of interconnect is the region where the single-end signaling using static CMOS logic is functional. As the receiver input $V_{\text{req}}$ becomes larger, the performance degrades and the interconnect length at which the voltage $V_{\text{req}}$ equals $2n - 1$ becomes shorter. This is because a large receiver input is difficult to realize on long interconnects due to their attenuation. Figure 6.4 shows the performance limitations with various attenuation constants. As discussed in Chapter 4, the sensitivity to the attenuation is relatively low and the curves do not shift drastically even if the attenuation constant is tripled. Therefore the required receiver input $V_{\text{req}}$ is an important parameter for the single-end signaling.

In reality, the eye-diagram degrades because of the waveform distortion caused by the dispersion of the interconnect characteristics. To suppress the waveform distortion, several techniques have been developed [30, 107, 108]. These methods improve signal integrity and bring the situation closer to that assumed in Chapter 4.

### 6.4 Performance of differential signaling

A discussion similar to that of the previous section applies to the differential signaling. From the performance estimation in Chapter 4, the maximum operating frequency limited by the interconnect is expressed as

\[
f_{\text{max}} = \frac{9}{20} \frac{1 - \log n}{V_{\text{req}} - 2n + \frac{1}{1 - \log n}}. \tag{6.3}
\]

In this equation, the receiver input \( V_{\text{req}} \) is the differential amplitude. From Eq. (6.3), the interconnect does not limit the performance in the region

\[
V_{\text{req}} \leq 2n - \frac{1}{1 - \log n}. \tag{6.4}
\]

Figure 6.5 shows the performance limitation of the differential signaling using a CML. Trade-off curves are obtained from the discussion in Chapter 4. The difference from static CMOS buffers is that CML buffers have the limitation in the output amplitude. As explained in Chapter 5, the maximum output voltage of the CML driver is limited to the threshold voltage \( V_{\text{th}} \). Using the receiver input \( V_{\text{req}} \) and the attenuation parameter \( n \), this constraint is written as

\[
V_{\text{req}} \leq 2nV_{\text{th}}. \tag{6.5}
\]

Because of this requirement, the length of the interconnect is limited.
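The length limit implied by Eq. (6.5) follows from substituting $n = \exp(-\alpha l)$, which gives $l \leq -\ln\left(V_{\text{req}}/(2V_{\text{th}})\right)/\alpha$. The sketch below evaluates this bound. The inputs are assumptions that combine values quoted in different places of this thesis: the 89Np/m attenuation of the differential pair (Section 6.2), the 0.37V threshold voltage of the Fig. 5.22 case, and a 0.4V differential receiver input (Section 6.5.2). The attenuation is treated as independent of length, which is a simplification.

```python
import math


def max_length_cml(v_req, v_th, alpha):
    """Longest line [m] satisfying Eq. (6.5): V_req <= 2 * exp(-alpha*l) * V_th."""
    ratio = v_req / (2.0 * v_th)
    if ratio > 1.0:
        raise ValueError("V_req exceeds 2*V_th: no length satisfies Eq. (6.5)")
    return -math.log(ratio) / alpha


limit = max_length_cml(v_req=0.4, v_th=0.37, alpha=89.0)
print(f"maximum interconnect length: {limit * 1e3:.1f} mm")
```

With these inputs the bound is about 6.9mm, which is in line with the 7mm limit reported for the CML driver in Section 6.5.2.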

### 6.5 Comparison between single-end signaling and differential signaling

This section compares the single-end signaling and the differential signaling and discusses in which situation the single-end signaling or the differential signaling should be used.

#### 6.5.1 Required noise margin of single-end signaling and differential signaling

As discussed in Section 6.3 and Section 6.4, the required receiver input \( V_{\text{req}} \) is an important parameter that determines the entire performance. In the ideal case, the receiver input \( V_{\text{req}} \) is determined from the input-output curve of the receiver. However in real chips, a certain amount of noise margin is required to guarantee correct operation. Single-end signaling is susceptible to noise and requires a larger noise margin than differential signaling. A crucial problem for the receiver is power/ground bounce [109]. Especially in static CMOS circuits, the current surge of switching causes larger power/ground bounce than in CML buffers [31]. Reference [20] reports that the power/ground bounce may exceed half of the supply voltage. On the other hand, a CML buffer can reject common-mode noise because of its differential architecture. Moreover, its static current flow prevents large current surges and suppresses the power/ground bounce.

Figure 6.5: Performance limitation of the differential signaling.

6.5.2 Single-end signaling versus differential signaling

Figure 6.6 shows the estimated performance limitation of the conventional single-end signaling and the differential signaling using CML. The transistor model is assumed to be a 90nm process whose supply voltage is 1.0V. The interconnect structure is the same as shown in Fig. 6.1. The required receiver input $V_{\text{req}}$ is 0.4V (0.1V at each side) in differential signaling and is 0.8V in single-end signaling. The number of stages of the tapered driver is 4 and the taper factor is 2.08 for both the single-end signaling and the differential signaling. To maximize the bandwidth, the output voltage of the CML driver is set to the threshold voltage. The noise margin for power/ground bounce is assumed to be 20% of the supply voltage in differential signaling and 60% in single-end signaling.

As shown in Fig. 6.6, the CML driver limits the whole performance. The hatched region in Fig. 6.6 is where the differential signaling using CML is the only solution. When the interconnect length is short, the difference in system performance is the difference in driver performance. The CML driver can achieve about twice the operating frequency of the static CMOS driver. As the interconnect becomes longer, the interconnect performance becomes the bottleneck in the single-end signaling. As shown in Fig. 6.6, the maximum performance of single-end signaling drops on 4mm or longer interconnects. On the other hand, the maximum performance of the differential signaling is always limited by the performance of the CML driver. On 4mm or shorter interconnects, the performance of the differential signaling is about twice that of the single-end signaling. As the interconnect becomes longer, the advantage of the differential signaling becomes larger. The differential signaling can operate 3 times faster on a 5mm long interconnect and 3.8 times faster on a 6mm long interconnect. Therefore the differential signaling using CML buffers is suitable for high-speed and long-distance signaling. However on 7mm or longer interconnects, the CML driver cannot transmit the signal because of the constraint of Eq. (6.5). The proposed method provides guidelines on the situations in which the differential signaling should be used.
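The single-end side of this comparison can be sketched by taking the system limit as the smaller of the driver limit and the interconnect limit. The sketch is not a reproduction of Fig. 6.6: it assumes a fixed 71Np/m attenuation for every length (the thesis re-extracts the characteristics at a length-dependent frequency), a propagation velocity of the speed of light divided by $\sqrt{3.0}$, and the 15GHz driver limit quoted in Section 6.3. Names are illustrative.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum [m/s]


def interconnect_limit(alpha, length, v_req, velocity):
    """Interconnect-limited frequency of the open-ended single-end line."""
    n = math.exp(-alpha * length)
    margin = v_req - (2.0 * n - 1.0)
    if margin <= 0.0:
        return math.inf
    return (9.0 / 20.0) * (1.0 - n) * velocity / (length * margin)


def system_limit(driver_limit, alpha, length, v_req, velocity):
    """System performance is bounded by the slower of driver and interconnect."""
    return min(driver_limit, interconnect_limit(alpha, length, v_req, velocity))


VELOCITY = C0 / math.sqrt(3.0)  # assumed velocity in the k=3.0 dielectric
CMOS_DRIVER_LIMIT = 15e9        # static CMOS driver limit quoted in Section 6.3

for mm in range(1, 8):
    f = system_limit(CMOS_DRIVER_LIMIT, 71.0, mm * 1e-3, 0.8, VELOCITY)
    bottleneck = "driver" if f == CMOS_DRIVER_LIMIT else "interconnect"
    print(f"{mm} mm: {f / 1e9:5.1f} GHz (limited by {bottleneck})")
```

Under these simplifications the driver remains the bottleneck up to 4mm and the interconnect takes over at 5mm, which matches the qualitative behavior described above; the exact ratios in Fig. 6.6 depend on the length-dependent extraction and are not reproduced here.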

### 6.6 Summary

An analytical performance estimation of on-chip high-performance signaling is proposed. The proposed method is based on the method in Chapter 4 for interconnects and on the discussion in Chapter 5 for driver and receiver circuits. By combining these two methods, the proposed method enables the performance estimation of the signaling system composed of the driver, the transmission-line and the receiver. The proposed method shows under what conditions the differential signaling is needed. Because the proposed method is analytical, the performance estimation can be performed from a few basic parameters such as the supply voltage, the threshold voltage, the gate capacitance and the attenuation of the interconnects. Therefore the proposed method is effective for performance estimation and for deciding the signaling strategy in the early stage of circuit design.
Chapter 7

Conclusion

This thesis discusses modeling and design methodology of on-chip high-performance interconnection. As LSI performance improves, on-chip long-distance signaling is becoming a bottleneck of the whole chip performance. To break through the interconnect bottleneck, several techniques have been developed, such as differential signaling, wave pipelining and short pulse signaling. Each technique has advantages and disadvantages, thus LSI designers have to choose carefully which method to use. The proposed methodology provides trade-off analysis of on-chip signaling systems and contributes to early stage design and the decision of the signaling strategy.

Chapter 2 discusses parameter extraction of on-chip interconnects. The inductance of long-distance and high-speed interconnects is not negligible. To extract accurate interconnect parameters, the current distribution is a critical issue. In LSIs, there are a huge number of interconnects and the current flow depends on the frequency. In this chapter, a method is developed that screens the wires to be included in return paths. The proposed method is based on the energy dissipation and can select the necessary and sufficient wires. This chapter also discusses the effect of orthogonal interconnects and the silicon substrate. The measurement results and the simulation by field-solvers show that wide and dense orthogonal wires such as power/ground rails may act as return paths. If the orthogonal interconnects are ignored, the extracted inductance value has an error of more than 40% at maximum. The silicon substrate is one of the difficulties in on-chip interconnect modeling. This research finds that the power/ground lines for standard cells, that is, narrow and dense wires, can intercept the coupling between the silicon substrate and the interconnects in upper layers.

Chapter 3 proposes a method to improve the modeling accuracy of the conventional frequency-independent interconnect model. In accurate simulation such as sign-off verification, interconnect models that can handle the frequency-dependence of the characteristics are required. On the other hand, in the early stage of circuit design, the conventional frequency-independent model is suitable because many methods and techniques have been proposed for it so far. Therefore it is important to improve the modeling accuracy of the frequency-independent model. The proposed method focuses on the transfer characteristics and decides at which frequency the characteristics should be extracted. Experimental results show that the modeling error in the signal delay and the signal transition time is less than 10% using the proposed method.

In Chapter 4, an analytical performance estimation of on-chip interconnects is proposed. Using the piecewise-linear waveform model, the proposed method derives equations of the eye-opening at the far-end of interconnects. The derived equations provide trade-off curves among the interconnect length, the operating frequency, the attenuation and the required eye-opening voltage. Comparison with circuit simulation shows that the accuracy of the proposed method is sufficient for rough performance estimation or for discussing performance trends.

Chapter 5 discusses the design of on-chip drivers and receivers. The drivers and the receivers are classified into the conventional static CMOS inverters and the current-mode logic buffers. For the static CMOS inverters, a driver sizing method to achieve signal transmission at the speed of the electromagnetic wave is proposed. The proposed method decides the gate sizes considering the attenuation caused by the loss of the transmission-line and shows that an impedance-unmatched driver is an option in the design of on-chip transmission-line drivers. CML buffers are used for high-speed operation. This research proposes a performance estimation method based on small signal analysis. The proposed method focuses on the pole frequency of tapered CML buffers and uses the pole frequency as an indicator of the maximum operating frequency. Experimental results show that the performance estimation matches the result of circuit simulation.

Chapter 6 merges the methods of performance estimation proposed in Chapters 4 and 5 and discusses the performance estimation of the total signaling system. The design parameters of interconnect and driver/receiver circuits have mutual interdependencies. The proposed method reveals the relationship among the design parameters and provides trade-off analysis. The performance estimation discussed in this thesis is based on an analytical approach and provides performance estimation from a few parameters such as the interconnect length, the attenuation and the characteristic impedance of the interconnect, transistor model and so on. Therefore the proposed method contributes to the early stage of circuit design or performance prediction in the future fabrication process.

This thesis mainly focuses on single-end and differential signaling with a pair of driver and receiver circuits. This is the most basic and frequently used building block for on-chip interconnection. In some cases, there may be a cascaded interconnection relayed by repeaters. Also, in future signaling schemes, coding or error correction techniques may be applied to enhance the communication performance of on-chip signaling. Future work includes the development of a design methodology considering such advanced communication schemes.
Bibliography


Publication list

Major publications


