### A Modified Clustering Approach for Sub-Micron CMOS Amplifiers

NAVEENKUMAR D PG SCHOLAR Dr. A.RAJARAM ASSOCIATE PROFESSOR

DEPARTMENT OF ELECTRONICS AND COMMUNICATION ENGINEERING KARPAGAM UNIVERSITY, COIMBATORE

Abstract - Process variation is an obstacle to designing reliable CMOS mixed-signal systems with high yield. To minimize the variation in voltage gain caused by variations in process, supply voltage, and temperature for common transconductance-based amplifiers, we present a new compensation method based on statistical feedback of process information, using different clustering approaches to design the systems. We develop the background theory of the scheme and present its performance across process corners. We further apply our scheme to two well-known amplifier topologies, an inductively degenerated low-noise amplifier and a common-source amplifier, designed as examples in the Cadence 180 nm CMOS process under the Virtuoso mixed-signal environment. Measured results over different layout designs of the low-noise amplifier show that our compensation technique reduces variation in gain by a factor of $4.7\times$ compared to the baseline case. The common-source amplifier exhibits similar reductions in gain variation across different measured layouts. We also present measured results demonstrating how our technique alleviates voltage gain variations caused by temperature and supply voltage changes with different power variations and delays.

Index Terms—CMOS analog integrated circuits, process compensation, process variation, self-biasing.

#### I. INTRODUCTION

Advances in device technologies have enabled the integration of multiple analog and digital blocks on the same layout. At the same time, continued device scaling is accompanied by increased device sensitivity to process variation, causing significant loss of yield [1] and limitations in fabrication. Process variations cause large fluctuations in transistor channel length, oxide thickness, and the number of dopant atoms in the device channel [2], [3]. Die-to-die and within-die variations translate to significant variations in circuit performance, such as centre frequency and leakage power, and in critical performance parameters such as gain, input matching, and noise figure [4], [5]. Even after optimizing fabrication, it is fairly common for the $3\sigma$ variation of the threshold voltage of transistors to exceed 10% for sub-100 nm feature sizes, with this number increasing in advanced technologies [6], [7]. Post-fabrication efforts such as laser trimming and Zener zapping have been proposed in [8] to correct for manufacturing defects. Unfortunately, both these methods increase testing and validation costs. In modern processes, process variation severely affects the performance and yield of various mixed-signal systems. For example, in wireless receiver chains, large variations in the voltage gain of low-noise amplifiers (LNAs), which are the first active circuit block, are recorded at all process corners [9]–[11], affecting system performance and overall yield.

In this paper, we determine that the variation in threshold voltage of the input transistor is the main contributor to gain variation in standard amplifier configurations where transconductance determines gain. With this in mind, we design and develop a compensation scheme that measures the changes in threshold voltage and generates a bias signal for amplifiers in order to minimize deviations in their voltage gain. Our scheme can be adapted to a variety of such amplifier topologies, and we experimentally demonstrate the validity of our method on two well-known amplifier topologies: an inductively degenerated cascode LNA and a common-source amplifier, both used as standard gain cells in many mixed-signal system applications. Both topologies have been designed in the Cadence 180 nm CMOS process. This work is the first demonstration of successful experimental layout compensation of sub-micron amplifiers. Measurement results show that our method lowers the variation in voltage gain of the LNA, centred at 4.2 GHz for WiMAX requirements, to 1.2%. This is a $2.7\times$ reduction in the standard deviation of $S_{21}$ compared to a baseline uncompensated LNA, translating to a yield improvement of 50%. Our scheme also reduces the variation in voltage gain due to supply voltage and temperature variations by $9.8\times$ and $1.8\times$, respectively, and applying the same technique to a common-source amplifier (CSA) shows similar reductions in voltage gain variation. Our scheme occupies a small footprint and consumes very little additional power, making it an attractive low-cost solution.

In Section II we introduce related work on sub-micron amplifier compensation. We discuss the impact that variation has on gain in Section III, followed by the design concept and circuit implementation in Sections IV and V. Measured results of our prototype LNA and CSA, used as design examples, are presented in Section VI.

#### II. RELATED WORK

This section focuses on the power amplifier (PA) design issues of the common-gate (CG) transistor of the cascode topology. The conceptual cascode topology, which comprises a stacked CG transistor in tandem with a common-source (CS) transistor, is shown in Fig. 1. The gate node of the CG transistor has a constant dc bias with an ac ground. A bypass capacitor is required to establish the ac ground at the gate of the CG transistor. In the conventional cascode topology, the CG device experiences fluctuations in its dc bias as the input power increases. The voltage waveform of the CG device between the gate and the source nodes with increasing output power is shown in Fig. 2. The figure shows that the gate-source voltage starts to fall below the threshold voltage, causing the CG device to operate in the cutoff region.

Furthermore, the large swing at the drain node can also drive the CG device from saturation into the triode region during turn-on operation. Fig. 2 shows the voltage swing between the drain and the gate nodes of the CG device; for 25 dBm of output power, this voltage falls far enough to drive the CG device into the triode region. Thus, the operating region of the transistor changes continuously as a function of time, especially at higher power levels. The operating-region variation leads to fluctuations of the intrinsic gate-source capacitance, drain-junction capacitance, drain-gate capacitance, and transconductance, which are dominant sources of the nonlinearity that degrades the linearity of the CMOS PA. In addition, the gate-to-drain voltage stress on the CG transistor limits the reliability of the cascode PA. The gate-drain voltage swing of the CG transistor is usually larger than that of the CS transistor under large-signal operation [8], which can lead to gate-oxide breakdown or hot-carrier degradation of the CG transistor. Employing a thick gate-oxide transistor for the CG transistor alleviates these reliability problems. However, it reduces gain and degrades the RF performance of the PA because the knee voltage of the thick gate-oxide transistor is higher than that of a standard transistor.

#### III. CASCODE FEEDBACK BIAS LINEARIZATION TECHNIQUE

To resolve the linearity and reliability issues of the cascode topology, the CFBT is proposed. The technique realizes a negative feedback loop for cascode PAs in a novel way.

#### A. Negative Feedback Formation by the Coupled Signals and the Effects of the CFBT

In CMOS technology, because of low substrate resistivity and large parasitic elements, signals are coupled through the silicon substrate and the parasitic elements of the CMOS circuits. Specifically, intrinsic capacitances between the ports of the transistors are unavoidable, and they can easily couple high-frequency signals to other ports. In a practical implementation, the gate node of the CG device in the cascode topology is not an ideal ac ground, and signals from both the source and the drain are coupled to the gate of the CG device through intrinsic capacitances. These coupled signals therefore exist on the gate of the CG device and degrade performance. The CFBT uses the coupled signals to improve the linearity of the PA by applying negative feedback from a successive stage to a previous stage to cancel these unwanted leakage signals. Fig. 4 shows a conceptual diagram of the CFBT employed in the two-stage cascode PA. The RF leakage signals on the gate of the CG device of the power stage are fed back through the bias network to the gate of the CG device of the driver stage. Fig. 5 compares the time-domain voltage waveforms of the CG transistor at the same input power of 1.5 dBm.

A carrier is described by

$$v = V_c \sin(\omega_c t + \theta) \qquad (1)$$

To **amplitude modulate** the carrier, its amplitude is changed in accordance with the level of the audio signal, which is described by

$$v_m = V_m \sin(\omega_m t) \qquad (2)$$



Fig. 1. Schematic of the cascode circuit

With the CFBT, the gate-node waveform of the driver stage follows the waveforms of the drain node and source node, as shown in Fig. 5. The effect of the technique is to suppress the harmonics and improve the AM–PM distortion, which helps improve reliability. As mentioned, harmonic suppression is obtained by employing the negative feedback loop. This is done through the bias network from the power-stage CG gate to the driver-stage CG gate, which feeds back the fundamental signal with a 180° phase shift. The open-loop gain of this feedback loop in Fig. 4 is given by (15) in terms of the gain of the driver stage, the gain of the power stage, and the insertion loss of the inter-stage matching network for the fundamental signal. The loop gain for the fundamental signal is derived from the impedance of the drain of the CG transistor, the impedance looking into the drain of the CS transistor, and the transconductance of the CG transistor.

The ratio of the output signal that falls on the gate of the CG device, the network loss of the bias network, and the network loss of the inter-stage matching network are each given assumed values. From the simulation results, the overall loss for the negative feedback of the fundamental signal is 0.1124 and the value of is 0.4145, with 0.00829 A/V being the value of the transconductance of the CG device of the source degeneration. Fig. 6 compares the magnitudes of the fundamental, second-harmonic, and third-harmonic signals at the output of the driver stage and the power stage.

Carrier amplitude = $V_c + V_m \sin(\omega_m t)$

and the instantaneous value (the value at any instant in time) is

$$\begin{aligned} v &= \{V_c + V_m \sin(\omega_m t)\}\,\sin(\omega_c t) \\ &= V_c \sin(\omega_c t) + V_m \sin(\omega_m t)\,\sin(\omega_c t) \end{aligned} \qquad (3)$$

Using $\sin A \,\sin B = \frac{1}{2}\cos(A-B) - \frac{1}{2}\cos(A+B)$, this becomes

$$v = V_c \sin(\omega_c t) + \frac{1}{2}V_m \cos((\omega_c-\omega_m)t) - \frac{1}{2}V_m \cos((\omega_c+\omega_m)t) \qquad (4)$$
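The step from the product form (3) to the three-term form (4) can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function names and the sample values (a 10 kHz carrier modulated at 1 kHz) are assumptions chosen only for illustration.

```python
import math

def am_product_form(vc, vm, wc, wm, t):
    """Equation (3): carrier whose amplitude follows the modulating signal."""
    return (vc + vm * math.sin(wm * t)) * math.sin(wc * t)

def am_expanded_form(vc, vm, wc, wm, t):
    """Equation (4): carrier plus lower and upper side-frequency terms."""
    return (vc * math.sin(wc * t)
            + 0.5 * vm * math.cos((wc - wm) * t)
            - 0.5 * vm * math.cos((wc + wm) * t))

def max_abs_difference(vc, vm, fc_hz, fm_hz, n_samples=1000):
    """Largest gap between the two forms over one modulation period."""
    wc = 2.0 * math.pi * fc_hz
    wm = 2.0 * math.pi * fm_hz
    period = 1.0 / fm_hz
    worst = 0.0
    for k in range(n_samples):
        t = period * k / n_samples
        gap = abs(am_product_form(vc, vm, wc, wm, t)
                  - am_expanded_form(vc, vm, wc, wm, t))
        worst = max(worst, gap)
    return worst

# Illustrative (assumed) values: 1 V carrier, 0.5 V modulation.
print(max_abs_difference(1.0, 0.5, 10_000.0, 1_000.0))
```

The printed difference should sit at floating-point rounding level, which is what the trigonometric identity predicts.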

The harmonic signals are suppressed, and it should be noted that much of the suppression of the third harmonic is achieved in the high-power region. In addition, the fundamental feedback signal, which is the gate signal that follows the drain and source signals in phase, reduces the variation of the operating region. The operating region of the conventional cascode configuration with a fixed CG gate voltage varies continuously because the large drain voltage swing reaches a point lower than the gate voltage, as shown in Fig. 5. Fig. 3 presents and compares the variation of the operation of the CG device.

As shown in Fig. 3, the portion of one waveform cycle spent in the cutoff region declines from 40% to 20% when the gate follows these waveforms. Fig. 5 represents the operation-region ratio during the turn-on region of the device.



Fig. 2. Schematic of the cascoding module

With the CFBT, the triode-region operation seen without the technique is removed, and the device thus operates mostly in the saturation region. Variations of the operating region of the transistor lead to fluctuations of the intrinsic capacitances and transconductance, which are major sources of nonlinearity in the PA. To verify the effect of operating-region variations, the phase deviation of the CG transistors is simulated. Fig. 2 compares the phase deviation of the CG transistor in the driver stage and the power stage. It shows a reduction of phase deviation from $30.8^{\circ}$ to $9.8^{\circ}$ in the driver stage and from $36^{\circ}$ to $14^{\circ}$ in the power stage over a wide range of output power. Furthermore, the amplitude-to-phase (AM–PM) distortion is derived as shown in Fig. 4, which illustrates the reduction of AM–PM distortion in the CG transistors through the reduced fluctuation in operating voltages in these transistors.

Yet another benefit is that the reliability of the cascode PA is improved. In the cascode PA topology, the voltage stress between the drain and the gate is one of the biggest concerns for the reliability of the CMOS PA. When the gate signal follows the drain signal of the stacked device, the reliability issue can be resolved [11]. This technique reduces the peak-to-peak voltage difference of the CG transistor from 4.5 to 1.9 V by forcing the gate signal to follow the source and the drain signals, as shown in Fig. 3.

Fig. 3. Comparison of the ratio of the operation region variations. (a) Turn-on versus turn-off. (b) Triode versus saturation region during the turn-on region.

The amplitude-modulated signal in (4) is made up of three components:

- **carrier** at $\omega_c$ (rad/s); its frequency is $f_c = \omega_c/2\pi$ Hz
- **upper side frequency** at $\omega_c + \omega_m$ (rad/s); its frequency is $(\omega_c + \omega_m)/2\pi = f_c + f_m$ Hz
- **lower side frequency** at $\omega_c - \omega_m$ (rad/s); its frequency is $(\omega_c - \omega_m)/2\pi = f_c - f_m$ Hz

The **bandwidth** (the difference between the highest and the lowest frequency) is

$$BW = (\omega_c + \omega_m) - (\omega_c - \omega_m) = 2\omega_m \text{ rad/s} \; (= \omega_m/\pi \text{ Hz}) \qquad (5)$$

Fig. 3. Angular frequency and bandwidth
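For the amplitude-modulated signal, the side frequencies and the bandwidth in (5) follow directly from the carrier and modulating frequencies. A small sketch of that arithmetic is given here; the function name `am_side_frequencies` and the example values (a 1 MHz carrier with a 5 kHz tone) are hypothetical and serve only as an illustration.

```python
import math

def am_side_frequencies(fc_hz, fm_hz):
    """Return the lower/upper side frequencies and bandwidth of an AM signal.

    Assumes a single-tone modulating signal with 0 < fm_hz < fc_hz.
    """
    if not 0.0 < fm_hz < fc_hz:
        raise ValueError("expected 0 < fm_hz < fc_hz")
    lower_hz = fc_hz - fm_hz
    upper_hz = fc_hz + fm_hz
    bandwidth_hz = upper_hz - lower_hz           # equals 2 * fm_hz
    bandwidth_rad_s = 2.0 * math.pi * bandwidth_hz  # equals 2 * omega_m, as in (5)
    return {
        "lower_hz": lower_hz,
        "carrier_hz": fc_hz,
        "upper_hz": upper_hz,
        "bandwidth_hz": bandwidth_hz,
        "bandwidth_rad_s": bandwidth_rad_s,
    }

# Illustrative (assumed) values: 1 MHz carrier, 5 kHz modulating tone.
for name, value in am_side_frequencies(1_000_000.0, 5_000.0).items():
    print(name, value)
```

The bandwidth depends only on the modulating frequency, not on the carrier frequency.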



Fig. 4. Power module for the amplifiers

This makes it possible to employ a standard transistor, rather than a thick-oxide transistor, for the CG transistor, which allows wider-bandwidth performance.

#### B. Feedback Bias Network and the Control of the Feedback Factor

The feedback bias network consists of an inductor and two capacitors in a T-configuration, as shown in Fig. 4. Its original function was to provide an ac ground for the gates of the CG devices of the cascode in both the driver and power stages. It also provides a way to adjust the feedback factor. The following gives a detailed analysis of the bias network function, the impedance at each node, and the behaviour of the fundamental and harmonic signals in the bias network. The gates of the CG devices in the driver and power stages, and the node between them, are each assigned a node label. To simplify the derivation, additional parasitic capacitances such as the intrinsic capacitances of the devices are neglected. With these design parameters, the resonance frequency of the network is determined to be 2.4 GHz.

Therefore, the operating fundamental frequency, which lies between 1.92 and 1.98 GHz, is lower than the resonance frequency, and the fundamental signal has a 180° phase shift given the condition that the 1- term is more than 0. The practical range is limited by this negative feedback condition. However, the range can be expanded by reducing the relevant component value in the bias network, leaving a greater margin in which the negative feedback condition is maintained. Thus, the network can tolerate variations of more than 20% while keeping the feedback condition. The feedback factor of the negative feedback is mainly controlled by adjusting the bypass capacitor in the feedback bias network. Since the impedance of the bypass capacitor increases as its value decreases, the magnitude of the signal swing on the CG gate node increases.

The modulation index is defined as

$$m = \frac{V_m}{V_c}$$

Using this, Equation (4) can be rewritten as

$$\begin{aligned} v &= V_c \sin(\omega_c t) + \frac{1}{2}\left(V_m \cos((\omega_c - \omega_m)t) - V_m \cos((\omega_c + \omega_m)t)\right)\frac{V_c}{V_c} \\ &= V_c \sin(\omega_c t) + \frac{m V_c}{2}\cos((\omega_c - \omega_m)t) - \frac{m V_c}{2}\cos((\omega_c + \omega_m)t) \end{aligned}$$

The larger signal swing produces a larger network loss in the bias network and results in more loop gain, thereby changing the overall feedback factor. By adjusting the bypass capacitor from 12 to 2 pF for linearity improvement, up to 1.1 dB of additional gain change can be achieved beyond the 2 dB gain reduction of the original negative feedback. For example, a 2 pF bypass capacitor improves linearity compared with a 12 pF bypass capacitor because of the 1.1 dB of additional gain reduction, as shown in Fig. 1. Fig. 1 also presents the voltage difference between the drain and the gate of the CG device with this adjustment. Since a larger gate signal follows the drain signal of the CG device when the capacitor is changed from 12 to 2 pF, the peak-to-peak voltage difference between the drain and the gate of the CG device decreases from 2.1 to 1.5 V, improving the reliability of the PA at the cost of 1.1 dB of gain.

Therefore, adjusting the bypass capacitor, which mainly sets the feedback factor, can balance the gain, reliability, and linearity of the PA with the feedback network. However, because the gate node of the CG transistor becomes a poorer ac ground as the bypass capacitor value decreases, there is an optimum value of the bypass capacitor for PA performance. In this design, 11 pF is chosen for both capacitors for optimized performance; this value maintains negative feedback while also providing an ac ground at the gate of the CG transistor. The derivation of the feedback bias network was verified by simulations.

Fig. 10. Variations of simulated IMD3. Voltage difference between the drain and the gate of the CG device.

#### IV. DESIGN AND IMPLEMENTATION OF THE LINEAR CMOS PA

Fig. 5 is an overall schematic of the two-stage single-ended CMOS linear PA. The PA cores are realized in a single-ended configuration to avoid the use of baluns, which simplifies integration and reduces area and cost. They employ a deep n-well in both the driver and the power cells to reduce noise and signal coupling with other components on the silicon substrate [12]. The two transistor widths in the driver stage are 1.5 and 0.9 mm, and the two in the power stage are 5.6 and 4.7 mm. To achieve minimal wire-bond inductance (a better ac ground), multiple bonding wires to ground are used to minimize source degeneration in both driver and power cells, which typically leads to a decrease in gain. With a supply voltage of 3.4 V, the gate bias voltages of the CS and CG devices are set to 0.45 and 2.5 V, respectively, and they are carefully adjusted to achieve the sweet spot at the desired output power near 22 dBm, resulting in lower third-order intermodulation distortion (IMD3). To enhance the reliability of the devices at 3.4-V operation, a 0.4-µm thick-oxide nMOS transistor is used for the cascode CG transistor in the power stage, and 0.18-µm nMOS transistors are used for both the CG and CS transistors in the driver stage to compensate for the low RF power gain of the thick-oxide CG transistor in the power stage. The inter-stage matching network, consisting of a shunt inductor and two series capacitors in a T-network, is employed to improve dc isolation and to allow tuning of the shunt inductor value via a bond-wire inductor.

To achieve a compact chip size with optimized performance, a high-quality-factor inductor is used for the output matching network, while smaller inductors with lower quality factor are used for the input and inter-stage matching networks. All capacitors are implemented as on-chip metal–insulator–metal (MIM) capacitors.



Fig. 4. Circuit diagram of the compensated LNA

#### V. LINEARIZATION TECHNIQUE OF CASCODE PAS USING ADAPTIVE BIAS CIRCUITS

Despite the attractive advantages of CMOS technology in integration and cost, there are some problems in implementing a CMOS PA, such as the lack of ground vias, low reliability due to a low breakdown voltage, a conductive substrate, and a low current-driving capability. In the CMOS process, a differential cascode PA structure is widely used to minimize the source grounding effect of the CS amplifier and to enhance the breakdown voltage by stacking two transistors in a cascode configuration. Because of the low transconductance of the CMOS device, large power cells are required to achieve a high output power. Parasitic components due to the large cells can lead to degradation of the efficiency and linearity of the PA. Since the differential structure doubles the voltage swing across a balanced load, the load impedance can be two times larger. For the same load impedance, the output power can be four times larger than that of a single-ended structure. Hence, the use of a differential structure helps to mitigate the drawback of source degeneration.

Modulation index using clustering:

$$m = \frac{V_m}{V_c} = \frac{\frac{1}{2}(V_m + V_m) - \frac{1}{2}(V_c - V_c)}{\frac{1}{2}(V_c + V_c) + \frac{1}{2}(V_m - V_m)} \qquad (8)$$

$$= \frac{(V_m + V_m) - (V_c - V_c)}{(V_c + V_c) + (V_m - V_m)} = \frac{(V_c + V_m) - (V_c - V_m)}{(V_c + V_m) + (V_c - V_m)} \qquad (9)$$
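The final form of (9) is built from $V_c + V_m$ and $V_c - V_m$, which are the maximum and minimum of the envelope $V_c + V_m \sin(\omega_m t)$. A short sketch of how the modulation index could be recovered from those two envelope extremes follows; the function names and the sample amplitudes are assumptions made for illustration and do not come from the measurements in this paper.

```python
def envelope_extremes(vc, vm):
    """Maximum and minimum of the envelope Vc + Vm*sin(wm*t)."""
    return vc + vm, vc - vm

def modulation_index_from_envelope(v_max, v_min):
    """Last form of (9): m = (Vmax - Vmin) / (Vmax + Vmin)."""
    total = v_max + v_min
    if total == 0:
        raise ValueError("v_max + v_min must be non-zero")
    return (v_max - v_min) / total

# Illustrative (assumed) amplitudes: Vc = 2.0 V, Vm = 0.5 V, so m = 0.25.
v_max, v_min = envelope_extremes(2.0, 0.5)
print(modulation_index_from_envelope(v_max, v_min))
```

Working from the envelope extremes is convenient when only the modulated waveform is observable, since neither $V_c$ nor $V_m$ has to be measured separately.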

#### A. Bias Dependence Using Clustering

The overall performance of a PA depends strongly on its operation class, which is determined by the gate bias. In general, a deep class-AB mode with low quiescent current operates more linearly and efficiently in the high-power region than class-AB or class-A mode because the IMD minimum point moves towards the compression point, as shown in Fig. 1. This can be achieved by biasing the gate of the CS amplifier in deep class-AB in the cascode PA.

Therefore, deep class-AB operation is the most attractive choice for linear PA design in the high-power region. As mentioned earlier, however, the PA exhibits severely nonlinear behavior near the turn-on voltage in deep class-AB operation and generates large distortion in the low- and mid-power regions. A differential cascode amplifier is composed of CS and CG amplifiers. The CS amplifier functions as the main amplification stage, and the CG amplifier functions as a current buffer in the second stage and forms part of the output load seen by the CS stage. Combining a gain-expansion stage (deep class-AB) and a gain-compression stage (class-A) compensates the AM-AM distortion by reducing the gain deviation. Examples are MGTR and multi-stage cascade structures. In a similar manner, the CG stage can be employed to reduce the gain deviation and IMD generated by the CS stage under deep class-AB operation. Fig. 2 shows a block diagram of the linearization process. The CS amplifier operates in deep class-AB mode, while the CG amplifier operates in class-A mode in the low- and mid-power regions. In the high-power region, the CG amplifier operates in class-B mode. In summary, the CS stage operates in deep class-AB mode for good high-power performance, while the CG stage operates in either class-A or class-B mode according to the input power level to absorb the intermodulation nonlinear distortion.

#### B. Linearization Technique Using Adaptive Bias Circuits

In large-signal operation, the drain current of a CMOS PA depends on the drain bias as well as the gate bias. In the cascode structure, the drain bias of the CS device is determined by the CG gate bias through the gate and threshold voltages of the CG device and the envelope of the input signal. Hence, the gate bias of the CG device can be controlled for optimized performance when designing cascode PAs. Simulation results of the gain and IMD3 for a two-tone signal with 10-MHz tone spacing at a 1.85-GHz center frequency are depicted in Fig. 3. For the simulation, the gate bias of the CS device is fixed at 0.46 V and that of the CG device is varied.
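For a two-tone test such as the one described above, the third-order intermodulation products fall at $2f_1 - f_2$ and $2f_2 - f_1$, just outside the two tones. The sketch below works out those frequencies for the stated 1.85-GHz center frequency and 10-MHz spacing; the function name `two_tone_im3` is hypothetical, and the code only illustrates the frequency arithmetic, not the simulation itself.

```python
def two_tone_im3(center_hz, spacing_hz):
    """Tone and third-order intermodulation frequencies of a two-tone test."""
    f1 = center_hz - spacing_hz / 2.0   # lower tone
    f2 = center_hz + spacing_hz / 2.0   # upper tone
    return {
        "f1_hz": f1,
        "f2_hz": f2,
        "im3_lower_hz": 2.0 * f1 - f2,  # one tone spacing below f1
        "im3_upper_hz": 2.0 * f2 - f1,  # one tone spacing above f2
    }

# Values taken from the text: 1.85 GHz center, 10 MHz tone spacing.
for name, value in two_tone_im3(1.85e9, 10e6).items():
    print(name, value / 1e9, "GHz")
```

Because the IM3 products sit only one tone spacing away from the tones, they cannot be removed by filtering and must be reduced in the amplifier itself.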



Fig. 5. Schematic of the bias circuit designed to compensate for process

The gain deviation that causes AM-AM distortion decreases as the CG gate bias increases. In contrast, the dependence of IMD3 on the CG gate bias is more complicated. For linear operation, the CG gate bias should be high (2.8 V) for output power below 23 dBm and low (2.4 V) for output power above 23 dBm. Fig. 4 shows the optimum CG gate bias for linearity as a function of output power level. When the optimal CG gate bias is applied to the PA, the gain deviation and IMD3 improve simultaneously by 2 dB and 6 dB, respectively, over a wide range of the low-output-power region, as shown in Fig. 5. Therefore, we use this optimally shaped CG bias to improve the linearity and gain deviation in the low-power region, while the PA is operated at a deep class-AB bias to obtain better performance in the high-power region. We have already reported the measured adjacent channel leakage ratio (ACLR), gain, and power-added efficiency (PAE) as a function of the average output power with the CG gate bias [15]. For a 10-MHz bandwidth LTE signal at 1.85 GHz, the PA with the proposed bias circuit improves the ACLR by 2.5 dB to 7.0 dB over the constant bias circuit.

Therefore, the PA with the CG bias shows a significant improvement in ACLR, which results from the reduction of the AM-AM distortion and IMD3. Injecting an envelope signal into the gate of the CS stage can improve the linearity in the high-power region. To further improve the performance of the CMOS PA, the envelope injection technique is adopted using a class-D bias circuit [19]. To simplify the adaptive bias circuits, the envelope injection circuit is merged with the CG bias circuit. The schematic of the adaptive bias circuit is shown in Fig. 6. The integrated adaptive bias circuit consists of three parts: an envelope detector, a CS bias circuit, and a CG bias circuit. The envelope detector amplifies the input signal using the NMOS device (BM1) and rejects the RF carrier signal using the resistor and capacitor networks.

The power in the carrier will be

$$P_c = \frac{V_c^2}{R} \text{ Watts}$$

The power in each of the side frequencies is

$$P_s = \frac{(mV_c/2)^2}{R} = \frac{m^2}{4}\frac{V_c^2}{R} = \frac{m^2}{4}P_c \qquad (10)$$

The total power is

$$P_t = P_c + P_s + P_s = P_c + 2P_s = P_c\left(1 + 2\frac{m^2}{4}\right) = P_c\left(1 + \frac{m^2}{2}\right) \text{ Watts} \qquad (11)$$

The fraction of the power in the carrier is

$$\frac{P_c}{P_t} = \frac{1}{1 + \frac{m^2}{2}} \qquad (12)$$
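The power relations (10) to (12) can be evaluated for a given modulation index as in the sketch below. It treats $V_c$ as an RMS value so that $P_c = V_c^2/R$ holds as written in the text; the function name, the 50 Ω load, and the sample voltage are assumptions used only for illustration.

```python
def am_power_budget(vc_rms, m, r_ohm):
    """Carrier, side-frequency, and total power of an AM signal.

    Follows (10)-(12), with vc_rms taken as an RMS carrier voltage (assumption).
    """
    p_carrier = vc_rms ** 2 / r_ohm
    p_side = (m * vc_rms / 2.0) ** 2 / r_ohm   # each side frequency, eq. (10)
    p_total = p_carrier + 2.0 * p_side         # eq. (11)
    carrier_fraction = p_carrier / p_total     # eq. (12)
    return p_carrier, p_side, p_total, carrier_fraction

# Illustrative (assumed) values: 10 V RMS carrier, 50 ohm load.
for m in (0.5, 1.0):
    pc, ps, pt, frac = am_power_budget(10.0, m, 50.0)
    print(f"m={m}: Pc={pc} W, Ps={ps} W, Pt={pt} W, carrier fraction={frac:.4f}")
```

Even at full modulation ($m = 1$), two thirds of the total power remains in the carrier, which carries no information.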

A class-D envelope amplifier (BM2 and BM3) injects an envelope signal into the gate of the CS stage on top of the initial gate bias, and a voltage-divider bias circuit with the NMOS device (BM4) controls the gate bias of the CG stage. When the input envelope signal increases, the output node of the envelope detector decreases, the PMOS device (BM2) of the envelope amplifier begins to charge the CS gate bias from its initial value to the optimum value, and the resistance of the NMOS device (BM4) increases to move the gate voltage of the CG stage from its initial value to a lower value. The transition of the CG stage from class-A to class-B is determined by the voltage difference between Vcg and the drain voltage of the CS stage. As the input power increases, this voltage difference decreases, and the class of the CG stage therefore moves from A to B. It is difficult to design an envelope injection circuit for a wide-bandwidth signal because the performance of the circuit degrades as the signal bandwidth increases. For reliability, all transistors of the bias circuits are thick-oxide devices to prevent breakdown. As mentioned earlier, the gate bias circuits operate at the envelope frequency, generating large memory effects. Therefore, second-harmonic control circuits are employed at the source of the CS stage and the gate of the CG stage to eliminate the sideband asymmetry (IMD or ACLR). In the next section, the memory effect reduction technique for the PAs is discussed in more detail.

#### VI. MEMORY EFFECTS REDUCTION USING SECOND-ORDER HARMONIC CONTROL

#### A. Investigation on Memory Effect in RF Amplifiers

The memory effect is mainly caused by the reflections of the envelope frequency and second harmonics at the device terminals. Hence, to reduce the sideband asymmetries, the best practice is to short the envelope impedance. In a real device, creating a short-circuit condition at the envelope frequency requires a low-pass filter built from external components to generate a large time constant. These filters can also lead to AM-PM distortion in the PA. Therefore, control of the second-harmonic impedances can be a practical solution to reduce the sideband asymmetries, as previously proposed.

To analyze the causes of the memory effects, we review the generation process of the IMD that was introduced in earlier studies [31]. The drain current, the most significant source of nonlinear distortion, can be expanded in a two-dimensional Taylor series of the gate and drain signal voltages in which only the dominant terms are included; the G coefficients represent the transconductance, drain conductance, and cross terms, respectively. The higher-order nonlinearities are ignored to reduce the complexity of the analysis.

The optimum load impedance of the PA at the fundamental frequency is purely real due to the resonance of the imaginary part. Assuming a two-tone input signal with equal amplitude A, including the envelope injection, the lower and upper IM3 components of the drain voltage can be derived and simplified in terms of the lower and upper two-tone input frequencies, their difference frequency, their center frequency, and the frequency-dependent load impedance, which generates memory through its imaginary part.

The significant difference between (4) and (5) is the phase of the envelope IM3 terms, which changes to the opposite direction according to the tone spacing, while the second-harmonic IM3 terms have the same phase. Therefore, in (4) and (5), the resultant IM3 terms are generated through the composition of the third-order nonlinearity (intrinsic IM3) and the two interactions between second-order and fundamental nonlinearities (memory IM3). Since the intrinsic IM3 is almost real due to the real fundamental load impedance, this term does not generate memory. On the other hand, the second-order terms can generate memory effects. There are two cases that can eliminate the sideband asymmetries: the second-harmonic load termination is real or shorted, or the load at the envelope frequency is terminated in a short circuit. However, a shorted second-harmonic termination is preferred because it results in lower IMD3 and because terminating the envelope signal requires large external capacitors.

#### B. Control of Second Harmonic Impedances in a Differential Cascade PA

A differential cascode amplifier has two common nodes: one at the source of the CS stage and another at the gate of the CG stage. The common nodes create a virtual ground at odd harmonics, including the fundamental frequency, but they do not create a ground at even harmonics. Low-impedance terminations at the common nodes for the even harmonics are essential for RF amplifier design. In a real device, however, it is difficult to achieve the low-impedance condition at the envelope and second-harmonic frequencies at the device nodes if a harmonic control circuit is not provided, as explained in [30]. If the source impedance of the CS stage is not terminated at the second harmonic, the linearity of the amplifier is degraded by the series feedback of that impedance. From (4) and (5), the gate impedance of the CG stage, which is part of the output load seen by the CS stage, should be terminated properly. This termination allows a significant reduction of the memory effects and second-order nonlinear distortion. Fig. 4 shows the second-harmonic impedances at the gate of the CG stage and the source of the CS stage of the PA as a function of input power at 1.85 GHz. As shown, neither second-harmonic impedance is properly shorted.
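The statement that a common node of a differential pair is a virtual ground for odd harmonics but not for even harmonics can be illustrated with a simple memoryless model. The sketch below is an assumption-laden toy, not the circuit of this paper: each half is modelled by a hypothetical polynomial $i = a_1 v + a_2 v^2 + a_3 v^3$ with made-up coefficients, the two halves are driven in antiphase, and the harmonics of the summed current at the common node are extracted with a plain discrete Fourier sum.

```python
import math

def harmonic_amplitude(samples, k):
    """Amplitude of the k-th harmonic (k >= 1) of one sampled period."""
    n = len(samples)
    re = sum(s * math.cos(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
    im = sum(s * math.sin(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n

def common_node_harmonics(a1, a2, a3, amplitude=1.0, n=256):
    """First three harmonics of the summed current of two antiphase halves.

    a1, a2, a3 are hypothetical polynomial coefficients of each half.
    """
    def half_current(v):
        return a1 * v + a2 * v ** 2 + a3 * v ** 3

    common = []
    for i in range(n):
        v = amplitude * math.cos(2.0 * math.pi * i / n)
        # The two halves see +v and -v; their currents add at the common node.
        common.append(half_current(v) + half_current(-v))
    return [harmonic_amplitude(common, k) for k in (1, 2, 3)]

h1, h2, h3 = common_node_harmonics(a1=1.0, a2=0.2, a3=0.05)
print(f"fundamental={h1:.3e}, second={h2:.3e}, third={h3:.3e}")
```

In this toy model the fundamental and third harmonic cancel at the common node while the second harmonic adds, which is why an explicit low-impedance termination is needed there for even harmonics.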



#### Fig.6. Overall circuit for the process

These second harmonic impedances can make the sidebands asymmetric and degrade the IMD3 according to (4) and (5). After the second harmonic control circuits are applied, the second harmonic impedances at the common nodes are properly terminated in a short circuit, as shown in Fig. 4. To verify the linearization techniques, Fig. 3 shows a simulated IMD3 result for a two-tone input signal with 10-MHz spacing.

This result verifies that almost all of the memory effects in the PA can be suppressed by applying the second harmonic control circuits at the common nodes. As a consequence, we can improve the linearity of the PA using the adaptive gate bias circuits while reducing the sideband asymmetry.

Fig. 7 shows a conventional distributed amplifier (DA) constructed from MOSFETs and inductors. The basic idea is to use the inductive elements together with the inherent parasitic capacitances of the active devices to create artificial transmission lines for wideband operation. The DA configuration comprises two artificial transmission-line sections (gate and drain) with series inductances and shunt capacitances. The inductors absorb the parasitic capacitances introduced by the transistors to achieve the wideband characteristic of the amplifier.

As the input signal propagates through the transistors along the gate line, the amplified signal of each stage accumulates at the output along the drain line. Simple guidelines for DA design relate the impedances of the gate and drain lines, and the signal phases in the two lines, to the following quantities [15]: $L_g$ and $L_d$ are the equivalent inductances of the gate line and drain line, respectively; $C_g$ and $C_d$ are the equivalent parasitic capacitances of the gate and drain nodes, respectively; $f_c$ is the cutoff frequency; and $Z_0$ is the system characteristic impedance, which is typically 50 Ω. To ensure that the forward signals add constructively, the propagation delays of the gate line and drain line should be made equal so that the signals are in phase. Design and implementation of a DA in CMOS technology encounters several challenges. One major concern is the lossy Si substrate, which can introduce significant undesired parasitics and degrade the circuit performance.

The DA structure often occupies a large chip area, making the parasitic effects more pronounced and less predictable in practical designs. Spiral inductors are also commonly used in CMOS DAs [2], [8], [15], [16]. The low-Q inductors introduce losses and reduce the gain of the amplifier. The relatively low gain of CMOS devices at high frequencies also makes it difficult to achieve high-gain DAs. One possible solution is to use more gain stages, at the cost of increased power consumption and chip area. Between 3 and 6 stages are typically used in CMOS DA designs [8], [15]–[17], [19].
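The design guidelines cited from [15] are not reproduced in the text. As a minimal sketch, assume they take the standard constant-k artificial-line form, $Z_0 = \sqrt{L/C}$ and $f_c = 1/(\pi\sqrt{LC})$ for each line, with equal per-section delays $\sqrt{L_g C_g} = \sqrt{L_d C_d}$ on the gate and drain lines. Under that assumption, the per-section inductance and capacitance follow directly from the target impedance and cutoff frequency. The function names and the 20-GHz cutoff are hypothetical example values.

```python
import math


def line_section(z0_ohm, cutoff_hz):
    """Return (L, C) per section of a constant-k artificial line.

    Solves Z0 = sqrt(L/C) and fc = 1/(pi*sqrt(L*C)) for L and C.
    """
    inductance_h = z0_ohm / (math.pi * cutoff_hz)
    capacitance_f = 1.0 / (math.pi * cutoff_hz * z0_ohm)
    return inductance_h, capacitance_f


def section_delay_s(inductance_h, capacitance_f):
    """Per-section propagation delay sqrt(L*C), in seconds."""
    return math.sqrt(inductance_h * capacitance_f)


def delays_matched(gate_section, drain_section, rel_tol=0.01):
    """True if gate and drain sections have equal delay within rel_tol."""
    return math.isclose(
        section_delay_s(*gate_section),
        section_delay_s(*drain_section),
        rel_tol=rel_tol,
    )


# Hypothetical example: 50-ohm lines with a 20-GHz cutoff frequency.
gate = line_section(50.0, 20e9)
drain = line_section(50.0, 20e9)
in_phase = delays_matched(gate, drain)
```

In practice the device capacitances at the gate and drain differ, so the required inductances differ as well; the delay check is what keeps the forward-travelling signals in phase.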

#### VII. SIMULATION RESULTS

The proposed TIA is designed and fabricated in GlobalFoundries' 0.18-$\mu$m 1.8-V IC CMOS technology. L1 and L2 are implemented using on-chip spiral inductors for the purpose of monolithic implementation and area efficiency. A capacitor of 0.25 pF is used to mimic the typical junction capacitance of a photodiode. To facilitate the measurement setup, two input coupling capacitors are added, but they reduce the transimpedance gain by around 2 dB. The measurement results are degraded relative to the post-layout simulation because of underestimated parasitic effects and process imperfections. Fig. 6 shows the transient analysis, where the core has an area of 20 mm<sup>2</sup>. The power consumption is around 8.7 mW excluding the output buffering stages.

With the buffer included, the total power consumption is 18.5 mW. The frequency response of the device is measured with ADL logic devices. Fig. 4 shows the measured transimpedance frequency response as green lines. The low-frequency transimpedance gain for the positive single-ended port is around 46 dB, and the -3-dB bandwidth is about 4 GHz.

Compared with the blue and red lines for the pre- and post-layout simulation results, the peaking effect in the measured result is less severe. The pole within the band of interest is not tuned out well. This could be due to the nonideality of the inductors as well as larger parasitic effects in the real implementation. In addition, the unaccounted EM radiation loss, the silicon substrate loss, and process variations all contributed to the bandwidth degradation.

#### **A.** Input Capacitance Insensitivity

One of the merits of this cross-coupled current-conveyor-based TIA is its ability to tolerate large variations of the total capacitive load at the input node. The major contributor to this load is the junction capacitance of the photodiode.

#### Fig.7. Transient analysis of amplifier stage 1

Therefore, this merit allows the proposed TIA circuit to work efficiently in a wide range of systems with various types of photodiodes. As shown in (7), as the junction capacitance of the photodiode $C_{PD}$ increases, $R_c$ will decrease, and $r_i$ in (8) will decrease accordingly. Therefore, the time constant at the input node can still be kept at a low value as $C_{PD}$ increases, since $r_i$ moves in the opposite direction. This property makes the use of large-area photodiodes possible, since the TIA performance is not very sensitive to variations of $C_{PD}$. With a larger-area photodiode, more signal power can be captured with acceptable degradation of the bandwidth and noise performance [6].



Fig.8. Transient analysis of amplifier cluster stage

For a conventional design, the value of the photodiode junction capacitance directly affects the TIA bandwidth: as $C_{PD}$ increases, the bandwidth reduces proportionally. Table I shows the post-layout simulated bandwidth and transimpedance for different photodiode junction capacitances. The variations in bandwidth and transimpedance are kept within 6% of the nominal value as $C_{PD}$ varies from 0.05 to 0.5 pF. Thus, the "zero differential impedance" property of the current conveyor stage indeed makes the TIA bandwidth less sensitive to variations of $C_{PD}$.
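To put the 6% figure in context, the sketch below estimates how a conventional input stage would respond to the same capacitance sweep. It assumes a single-pole input network with bandwidth $1/(2\pi R_{in} C_{PD})$ and a fixed input resistance. The value of `R_IN_OHM` and the choice of 0.25 pF as the nominal point are hypothetical, and the model says nothing about the proposed current-conveyor stage itself.

```python
import math


def rc_bandwidth_hz(r_in_ohm, c_pd_f):
    """-3-dB bandwidth of a single-pole RC input network."""
    return 1.0 / (2.0 * math.pi * r_in_ohm * c_pd_f)


def max_deviation_percent(values, nominal):
    """Largest relative deviation from the nominal value, in percent."""
    return 100.0 * max(abs(v - nominal) for v in values) / nominal


R_IN_OHM = 100.0  # hypothetical fixed input resistance of a conventional TIA
C_PD_SWEEP_F = [0.05e-12, 0.25e-12, 0.5e-12]  # sweep range quoted in the text

bandwidths = [rc_bandwidth_hz(R_IN_OHM, c) for c in C_PD_SWEEP_F]
nominal_bw = bandwidths[1]  # assumed nominal point at 0.25 pF
deviation = max_deviation_percent(bandwidths, nominal_bw)
```

Under these assumptions the bandwidth changes by a factor of ten across the sweep regardless of the resistance chosen, because the resistance cancels in the ratio.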

#### **B.** Differential transient Output

The other merit of the proposed TIA circuit is the ease of obtaining differential output signals. Because the proposed TIA is based on a cross-coupled structure, it is convenient to make the TIA circuit differential. Unlike the conventional single-ended TIA circuit, where an extra single-ended-to-differential conversion step is needed, this cross-coupled structure provides a good base for a differential TIA circuit, as shown in Fig. 3. Due to the measurement setup, the negative output port has been internally terminated with 50-Ω resistors. As a result, only the schematic simulation results are provided here for demonstration and comparison. The drawback of this differential TIA circuit is that the two output signals do not have the same swing magnitude, as shown in Fig. 8, since the negative input terminal has no input signal supply.



Fig.9. Transient analysis of amplifier cluster stage 1



The solid line, which represents the output signal at the positive terminal, has a peak-to-peak swing of 72 mV with a 200 µA input current source, while the bar representing the output signal at the negative terminal has a peak-to-peak swing of only 30 mV. This difference in swing magnitude will be greatly reduced and eventually eliminated in subsequent amplification stages. However, due to the cross-coupled structure, there will always be a small phase difference between the two output signals. Again, the phase difference can be controlled to a negligible level. As shown in Fig. 8, the phase difference is within 5 ps for a 5-GHz sinusoidal input waveform. To minimize the phase difference, first, faster transistors are desired to reduce the delay in the cross-coupled input stage; second, proper layout routing and floor planning must be considered in the design. The routing in the cross-coupled input stage should be as short as possible.
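The quoted swing and skew figures can be converted into more familiar quantities with a few lines of arithmetic. The sketch below assumes that the 200 µA input is a peak-to-peak value, so that dividing the output swing by it gives an effective transimpedance, and it converts the 5-ps skew into a phase angle at 5 GHz. The function names are hypothetical.

```python
def skew_to_phase_deg(skew_s, freq_hz):
    """Phase difference in degrees corresponding to a time skew at freq_hz."""
    return 360.0 * skew_s * freq_hz


def transimpedance_ohm(v_pp, i_pp):
    """Effective transimpedance from peak-to-peak output voltage and input current."""
    return v_pp / i_pp


I_IN_PP_A = 200e-6  # input current, assumed to be peak-to-peak

z_positive = transimpedance_ohm(72e-3, I_IN_PP_A)  # positive output terminal
z_negative = transimpedance_ohm(30e-3, I_IN_PP_A)  # negative output terminal
swing_ratio = z_positive / z_negative
phase_deg = skew_to_phase_deg(5e-12, 5e9)
```

With these inputs the positive and negative terminals correspond to about 360 Ω and 150 Ω, a swing ratio of 2.4, and the 5-ps skew corresponds to about 9 degrees at 5 GHz.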

The input-referred noise current spectral density is measured from 600 MHz to 6 GHz and stays within 18 pA/ $\sqrt{Hz}$ . The average input-referred noise current spectral density over 6 GHz is estimated to be 10 pA/ $\sqrt{Hz}$ . Because the negative output port is internally terminated for measurement purposes, this measured input-referred noise of 10 pA/ $\sqrt{Hz}$  is the output noise referred to the input through the positive single-ended gain. Therefore, based on the analysis in the previous section, the overall input-referred noise current is estimated to be 4.1 times larger than that of the positive single-ended part, that is, about 40 pA/ $\sqrt{Hz}$ . Compared with other CMOS TIA designs, this cross-coupled current-conveyor-based TIA has an apparent advantage in noise performance in the single-ended configuration.



Fig.10. Transient analysis of amplifier cluster stage 2

Fig.11. Transient analysis of CS amplifier

The group-delay variation is another important parameter in determining the amount of ISI and jitter introduced by the TIA. Even with sufficient bandwidth, distortion in the form of data-dependent jitter may occur if the phase linearity of the transimpedance response $Z_T(f)$ is insufficient [2], [6]. As the common measure of phase linearity, the group delay with respect to frequency is calculated and shown in Fig. 10. Within the bandwidth of 4 GHz, the group delay is about 125±25 ps. Fig. 9 shows the analysis of the positive output terminal measured by an Agilent 86105C with a 4.25-Gb/s $2^{31}-1$ pseudorandom binary sequence (PRBS) data pattern generated by an Agilent J-BERT N4903A. The output has a peak-to-peak value of around 51 mV for an input level of 200 $\mu$A, and it has a peak-to-peak jitter of 28 ps. For the 4.25-Gb/s $2^{31}-1$ PRBS data pattern, the electrical sensitivity is measured to be 19 µA for a bit-error rate of $10^{-12}$, which corresponds to an optical sensitivity of -15 dBm, assuming the optical responsivity of the photodiode to be 0.3 A/W.

#### **D.** Performance Comparison

Table I summarizes a detailed performance comparison of the proposed TIA with several other CMOS TIA designs.
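The conversion from the 19 µA electrical sensitivity to the -15 dBm optical sensitivity can be checked with a short calculation. The text does not state the convention used; the sketch below assumes that the 19 µA figure is a peak-to-peak photocurrent and that the optical signal has an ideal (infinite) extinction ratio, so the average optical power is half the peak-to-peak power. The function name is hypothetical.

```python
import math


def optical_sensitivity_dbm(i_pp_a, responsivity_a_per_w):
    """Average optical power in dBm for a given peak-to-peak photocurrent.

    Assumes an ideal extinction ratio, so P_avg = I_pp / (2 * R).
    """
    p_avg_w = i_pp_a / (2.0 * responsivity_a_per_w)
    return 10.0 * math.log10(p_avg_w / 1e-3)


sensitivity_dbm = optical_sensitivity_dbm(19e-6, 0.3)
```

Under these assumptions the result is approximately -15 dBm, which agrees with the quoted optical sensitivity. Treating 19 µA as an average current instead would give a value about 3 dB higher.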

$$\text{Total power} = 25 = P_c \left(1 + \frac{m^2}{2}\right) = P_c \left(1 + \frac{0.3^2}{2}\right) = 1.045\,P_c \tag{13}$$
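Equation (13) can be solved for $P_c$ directly. The sketch below assumes the relation is the usual total-power expression with index $m = 0.3$ and a total of 25 in whatever unit the equation uses; the unit is not stated in the text, and the function name is hypothetical.

```python
def carrier_power(total_power, m):
    """Solve total = P_c * (1 + m**2 / 2) for P_c."""
    return total_power / (1.0 + m ** 2 / 2.0)


p_c = carrier_power(25.0, 0.3)
```

With these inputs $P_c$ is approximately 23.92, so the $m^2/2$ term accounts for about 4.5% of $P_c$.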

Table I:

|            | Total<br>power<br>(mW) | Area<br>(mm<sup>2</sup>) | Number of<br>buffers | Power<br>overhead<br>(%) |
|------------|------------------------|-------------------------|---------------------|--------------------------|
| This paper | 0.54                   | 20                      | 2                   | 0.8                      |
| [1]        | 1.24                   | 32                      | 3                   | 2.5                      |
| [2]        | 2.74                   | 33.4                    | 3                   | 2.3                      |
| [3]        | 3.47                   | 34.5                    | 3                   | 2.0                      |
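The entries of Table I can be turned into relative savings with a short script. The sketch below copies the table values as given and computes the percentage reduction of the proposed design relative to each reference; the dictionary layout and function names are hypothetical.

```python
# Values copied from Table I: power (mW), area (mm^2), buffers, overhead (%).
TABLE_I = {
    "This paper": {"power_mw": 0.54, "area_mm2": 20.0, "buffers": 2, "overhead_pct": 0.8},
    "[1]": {"power_mw": 1.24, "area_mm2": 32.0, "buffers": 3, "overhead_pct": 2.5},
    "[2]": {"power_mw": 2.74, "area_mm2": 33.4, "buffers": 3, "overhead_pct": 2.3},
    "[3]": {"power_mw": 3.47, "area_mm2": 34.5, "buffers": 3, "overhead_pct": 2.0},
}


def percent_reduction(reference, proposed):
    """Reduction of the proposed value relative to the reference, in percent."""
    return 100.0 * (reference - proposed) / reference


def reductions(metric):
    """Percent reduction of 'This paper' versus every other row for one metric."""
    ours = TABLE_I["This paper"][metric]
    return {
        name: percent_reduction(row[metric], ours)
        for name, row in TABLE_I.items()
        if name != "This paper"
    }


power_savings = reductions("power_mw")
area_savings = reductions("area_mm2")
```

For example, relative to [1] the table implies a power reduction of about 56% and an area reduction of 37.5%.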



Fig.11. Comparison chart

We consider the power overhead in generating the phase-shifted clock using delay buffers. Given the largest clock period from Table I, we note the worst case, where the largest amount of delay needs to be inserted from the chain of buffers. Given the propagation delay characteristic shown in Fig. 7, we found that if we use the buffer with the least propagation delay, at most seven buffers need to be inserted to generate the delays necessary for the phase-shifted clocks. The power overhead percentage is shown in Table II. The total power column represents the power consumption of each benchmark in its original configuration. From our simulation results, we note that the phase-shifted clock average power is small, averaging only 1.5% over the entire suite of benchmark circuits. It is important to note that this is an extreme upper bound for our smaller benchmark, and we should expect a much smaller percentage overhead in the actual (physical) implementation.

A larger transconductance $g_{m6}$ requires a larger current and thus leads to larger power consumption. For very high speeds, e.g., 10-Gb/s systems, bandwidth enhancement is required, which can be achieved through layout/process optimization or overdesign, as discussed above. The advantage in noise performance makes the proposed TIA an excellent candidate for high-speed, low-noise, and low-power applications. The common source amplifier serves as another design example because it is one of the most efficient single-transistor amplifiers that can be implemented in standard CMOS technologies. Measured results for samples of the CS amplifier are shown in the analysis, and the results are tabulated along with the comparison table.

#### CONCLUSION

In this paper, we developed a general design methodology, aided by clustering methodologies, to compensate for voltage gain variations in common amplifier topologies where gain is a strong function of transconductance. Our work presents the first experimental demonstration of gain compensation for amplifiers designed in a submicron process, using statistical feedback to track changes in $V_{th}$ due to process and temperature and generating an appropriate bias signal for the amplifier. Without any post-fabrication trimming or calibration, we experimentally demonstrated 4.7× reductions in the gain variation of low-noise amplifiers and common source amplifiers designed in the Cadence 180 nm CMOS process in the Virtuoso environment. We also showed that our scheme can successfully reduce variations arising from fluctuations in supply voltage. Results obtained from our design examples confirm that our scheme can easily be adapted to other amplifier topologies where transconductance determines gain, such as the differential amplifier, common gate amplifier, and operational transconductance amplifier. Our compensation method occupies a small footprint and has a low power overhead of 8%, making it attractive for a variety of robust, low-power, mixed-signal systems. By regulating amplifier gain, our scheme increases overall system yield, reduces cost, and decreases turnaround time in VLSI devices.

#### **REFERENCES:**

[1] Ransford Hyman, Jr., Nagarajan Ranganathan, Thomas Bingel, and Deanne Tran Vo, "A Clock Control Strategy for Peak Power and RMS Current Reduction Using Path Clustering", IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 21, No. 2, February 2013.

[2] Yow-Tyng Nieh, Shih-Hsu Huang and Sheng-Yu Hsu, "Minimizing Peak Current via Opposite-Phase Clock Tree", Proceedings of the 42nd Annual Design Automation Conference, ACM Digital Library, 2005.

[3] M. Ester, H. Kriegel, J. Sander, and X. Xu, "A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise", Proc. of the 2nd Int'l Conf. on Knowledge Discovery and Data Mining, August 1996.

[4] M. Bellos, D. Bakalis, D. Nikolos and X. Kavousianos, "Vector Repetition and Modification for Peak Power Reduction in VLSI Testing", IEEE Workshop on Design and Diagnostics of Electronic Circuits and Systems, 2005.

[5] WaiChing Douglas Lam, ChengKok Koh and ChungWen Albert Tsao, "Power Supply Noise Suppression via Clock Skew Scheduling", International Symposium on Quality Electronic Design, 2002.

[6] Inderjit S. Dhillon and Dharmendra S. Modha, "A Data-Clustering Algorithm on Distributed Memory Multiprocessors", Springer, 2002, pp. 245-260.

[7] Paul S. Bradley, Usama M. Fayyad and Cory A. Reina, "Scaling Clustering Algorithms to Large Databases", American Association for Artificial Intelligence, Microsoft Research, 1998.

[8] Saraju P. Mohanty, N. Ranganathan and Sunil K. Chappidi, "Simultaneous Peak and Average Power Minimization during Datapath Scheduling for DSP Processors", ACM Digital Library, 2003.

[9] R. Hyman, N. Ranganathan, T. Bingel and D. Tran Vo, "A Clock Control Strategy for Peak Power and RMS Current Reduction Using Path Clustering", IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 21, Issue 2, 2013.

[10] K. Ravindran, A. Kuehlmann, and E. Sentovich, "Multi-domain clock skew scheduling," in Proc. Int. Conf. Comput.-Aided Design (ICCAD), 2003.

[11] A. Manzak and C. Chakrabarti, "A low power scheduling scheme with resources operating at multiple voltages," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., Feb. 2002.

[12] M. Johnson and K. Roy, "Datapath scheduling with multiple supply voltages and level converters," ACM Trans. Design Autom. Electron. Syst., vol. 2, no. 3, pp. 227–248, Jul. 1997.

[13] J. M. Chang and M. Pedram, "Energy minimization using multiple supply voltages," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., Dec. 1997.

[14] Y. R. Lin, C. T. Hwang, and A. C. H. Wu, "Scheduling techniques for variable voltage low power design," ACM Trans. Design Autom. Electron. Syst., Apr. 1997.

[15] S. P. Mohanty and N. Ranganathan, "Energy efficient scheduling for datapath synthesis," in Proc. Int. Conf. VLSI Design, 2003.

[16] S. Raje and M. Sarrafzadeh, "Variable voltage scheduling," in Proc. Int. Symp. Low Power Electron. Design, 1995.

[17] A. Manzak and C. Chakrabarti, "A low power scheduling scheme with resources operating at multiple voltages," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 10, no. 1, pp. 6–14, Feb. 2002.

[18] S. P. Mohanty, N. Ranganathan, and V. Krishna, "Datapath scheduling using dynamic frequency clocking," in Proc. IEEE Comput. Soc. Annu. Symp. VLSI, 2002, pp. 65–70.

[19] S. Park and K. Choi, "Performance-driven High-Level Synthesis with Bit-Level Chaining and Clock Selection," IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., Feb. 2001.

#### **Author Profile**



NAVEENKUMAR D received the Bachelor of Engineering in Electrical and Electronics Engineering from V.L.B. Janakiammal College of Engineering and Technology, Coimbatore, Anna University of Technology, Coimbatore, in 2011. He is currently pursuing the Master of Engineering in VLSI Design at Karpagam University, Coimbatore. His research interests include low-power VLSI design and clustering methods for low-power design.



Rajaram A received the BE degree in electronics and communication engineering from the Govt. College of Technology, Coimbatore, Anna University, Chennai, India, in 2006, the ME degree in electronics and communication engineering (Applied Electronics) from the Govt. College of Technology, Anna University, Chennai, India, in 2008, and the Ph.D. degree in electronics and communication engineering from the Anna University of Technology, Coimbatore, India, in March 2011. He is currently working as an Associate Professor in the ECE Department at Karpagam University, Coimbatore, India. His research interests include mobile ad hoc networks, wireless communication networks (WiFi, WiMax HighSlot GSM), novel VLSI NoC design approaches to address issues such as low power, crosstalk, and hardware acceleration, design issues including OFDM, MIMO, and noise suppression in MAI systems, ASIC design, control systems, fuzzy logic and networks, AI, and sensor networks.