#### Intelligent Test Methodology Targeting "Zero Failure" for Automotive Chips with Built-In Artificial Intelligence

# Process Resilient Fault-Tolerant Delay-Locked Loop Using TMR with Dynamic Timing Correction

楊竣宇 (Darren J.-Y. Yang) 03/26

IC-Design Exploration Lab, Department of Electrical Engineering, National Tsing Hua University, Hsinchu, Taiwan



# Outline

- Introduction
- Evolution of Fault and Error Tolerant (FET) DLL

- Experimental Results
- Conclusion

# Outline

- Introduction
  - Background
  - Problem
  - Objective
- Evolution of Fault and Error Tolerant (FET) DLL
- Experimental Results
- Conclusion

# **Clock Distribution Problem**

- In a heterogeneously integrated multi-die IC
  - Functional dies are designed and fabricated in different process technologies
- Important building blocks
  - Phase-Locked Loop (PLL)
  - Delay-Locked Loop (DLL)



C.-Y. Cheng, S.-Y. Huang, D.-M. Kwai, and Y.-F. Chou, "DLL-Assisted Clock Synchronization Method for Multi-Die ICs", Proc. of IEEE Int'l Conf. on Computer Design, pp. 473-476, Nov. 2017.

# **DLL-Assisted Clock Synchronization**

- Objective: all flip-flops (FFs) in every die receive the clock at the same time
- Different dies $\rightarrow$ different within-die clock latencies
  - $\rightarrow$ FFs receive the clock signal with skews

- The inter-die clock skew is to be minimized by inserting DLLs
- The width of each DLL box denotes its input-to-output delay

## Architecture of a Basic DLL

- The basic DLL consists of 3 components:
  - (1) Phase Detector (2) Controller (3) Tunable Delay Line (TDL)
- A DLL operates in two stages:
  - (1) Phase Locking (2) Phase Tracking
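The interplay of the three components can be illustrated with a small behavioral model. This is only a sketch under assumed numbers: the clock period, the fixed path delay, the TDL step size, and the bang-bang (one step per cycle) controller are all hypothetical choices, not details from the slides.

```python
def simulate_dll(period_ps=1000.0, fixed_delay_ps=732.0, step_ps=5.0,
                 max_code=255, cycles=400):
    """Behavioral sketch of a digital DLL (all parameter values are illustrative)."""
    code = 0        # controller state: current TDL setting
    history = []    # phase error observed in each cycle
    for _ in range(cycles):
        # Tunable Delay Line: clk_ref -> clk_out delay for the current code.
        delay_ps = fixed_delay_ps + code * step_ps
        # Phase Detector: signed error of clk_out against the nearest clk_ref edge,
        # wrapped into [-period/2, +period/2).
        err_ps = (delay_ps + period_ps / 2) % period_ps - period_ps / 2
        # Controller (bang-bang): add delay if clk_out is early, remove it if late.
        if err_ps < 0 and code < max_code:
            code += 1
        elif err_ps > 0 and code > 0:
            code -= 1
        history.append(err_ps)
    return code, history


final_code, errors = simulate_dll()
print(final_code, max(abs(e) for e in errors[-100:]))
```

In this model the first cycles correspond to phase locking (the error shrinks by one step per cycle), and the remaining cycles correspond to phase tracking (the code dithers around the lock point, so the error stays within one TDL step).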



# Performance We Care About: Max. Phase Error

#### Definition of maximum phase error

- The worst-case phase error between clk\_ref and clk\_out over a time frame (e.g., 1000 cycles) after the DLL has locked
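As a minimal sketch of this definition, the function below computes the maximum phase error from two lists of rising-edge timestamps. The function name, the pairing of edges by index, and the wrap-around to half a period are assumptions made for illustration.

```python
def max_phase_error(ref_edges_ps, out_edges_ps, period_ps):
    """Worst-case |phase error| between paired rising edges of clk_ref and clk_out.

    Each error is wrapped into [-period/2, +period/2) so that a clk_out edge
    is compared with the nearest clk_ref edge.
    """
    worst = 0.0
    for t_ref, t_out in zip(ref_edges_ps, out_edges_ps):
        err = (t_out - t_ref + period_ps / 2) % period_ps - period_ps / 2
        worst = max(worst, abs(err))
    return worst


# Three observed cycles after lock (timestamps in ps, illustrative values).
print(max_phase_error([0, 1000, 2000], [1004, 1993, 3010], 1000))  # 10.0
```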



# **Objective of This Work**

- Build a Fault and Error Tolerant (FET) scheme for the clock subsystem
  - Especially for the Delay-Locked Loop (DLL)



# **Objective of This Work**

- Hardware redundancy is suitable for the clock subsystem
- Active redundancy would need an additional circuit to monitor the DLL
- We adopt passive redundancy: Triple Modular Redundancy (TMR)



# Outline

- Introduction
- Evolution of Fault and Error Tolerant (FET) DLL
  - Naïve FET-DLL Architecture
  - FET-DLL with static timing correction
  - Process-Resilient FET-DLL with dynamic timing correction
- Experimental Results
- Conclusion

# **Naïve FET-DLL Architecture**

- Three primitive DLLs determine the output through a VOTER
- All three DLLs perform phase locking simultaneously
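A minimal sketch of TMR voting on sampled clock waveforms is shown below. It assumes the VOTER is a bitwise 2-out-of-3 majority function, which is the usual choice for TMR; the waveform values are illustrative.

```python
def majority(a: int, b: int, c: int) -> int:
    """2-out-of-3 majority of three logic values (0 or 1)."""
    return (a & b) | (b & c) | (a & c)


def tmr_clock(phi1, phi2, phi3):
    """Vote sample by sample over the three DLL outputs."""
    return [majority(a, b, c) for a, b, c in zip(phi1, phi2, phi3)]


good = [0, 1, 0, 1, 0, 1]
stuck_at_0 = [0, 0, 0, 0, 0, 0]   # one faulty DLL output
print(tmr_clock(stuck_at_0, good, good) == good)  # True
```

A single faulty input is outvoted by the two healthy ones, so the output still follows the good clock.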



# **Output Lagging Problem**

#### The VOTER circuit takes time to compute its output

- $\delta_{voter1} = \mathrm{Delay}(\phi_1 \rightarrow \mathrm{clk\_out})$
- $\delta_{voter2} = \mathrm{Delay}(\phi_2 \rightarrow \mathrm{clk\_out})$
- $\delta_{voter3} = \mathrm{Delay}(\phi_3 \rightarrow \mathrm{clk\_out})$
- $\max(\delta_{voter1}, \delta_{voter2}, \delta_{voter3}) > 100\,\mathrm{ps}$





# FET-DLL with static timing correction

#### Fix the output-lagging problem

Add a dummy voter circuit to the feedback path of each DLL



### **Detailed Voter circuit and Its dummy**

The inputs of the dummy voter are assigned so that the feedback signal alone determines its output
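One possible input assignment can be sketched as follows. It assumes the voter is a 2-out-of-3 majority gate and that the two unused inputs of the dummy are tied to constant 0 and 1; the actual assignment is shown only in the slide figure, so this choice is hypothetical.

```python
def majority(a: int, b: int, c: int) -> int:
    """2-out-of-3 majority of three logic values (0 or 1)."""
    return (a & b) | (b & c) | (a & c)


def dummy_voter(fb: int) -> int:
    """Hypothetical dummy voter: with the other inputs tied to 0 and 1,
    the majority function reduces to the feedback signal itself."""
    return majority(fb, 0, 1)


print([dummy_voter(fb) for fb in (0, 1)])  # [0, 1]
```

With this assignment the dummy reuses the same gate structure as the voter, so it contributes a comparable delay while passing the feedback signal through logically unchanged.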



## **Timing Relationships after timing correction**

- clk\_out and {fb1, fb2, fb3} have similar phases
- $\rightarrow$ Since {fb1, fb2, fb3} are in phase with clk\_ref
- $\rightarrow$ clk\_out is roughly in phase with clk\_ref
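The reasoning above can be written as a short timing relation. Here $t_x$ denotes the rising-edge time of signal $x$, and $\delta_{dummy\,i}$ is a symbol introduced only for this sketch: the delay of the dummy voter in the feedback path of DLL $i$. The relation assumes the fault-free case in which the three $\phi_i$ are aligned.

$$
t_{fb_i} = t_{\phi_i} + \delta_{dummy\,i}, \qquad t_{\mathrm{clk\_out}} = t_{\phi_i} + \delta_{voter\,i}
$$

$$
t_{\mathrm{clk\_out}} - t_{fb_i} = \delta_{voter\,i} - \delta_{dummy\,i} \approx 0 \quad \text{when } \delta_{dummy\,i} \approx \delta_{voter\,i}
$$

Each DLL locks $t_{fb_i}$ to clk\_ref, so the residual offset of clk\_out from clk\_ref is the difference between the voter delay and the dummy delay.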



### **Process Variation Issue**

Ideally, the delay of each dummy voter {V1, V2, V3} equals the delay of the VOTER.

In reality, process variation can cause a mismatch between them.
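The effect of such a mismatch can be sketched numerically. The model below assumes a single voter delay for all three inputs, a majority voter (whose rising edge follows the second of the three input edges, i.e. the median), and that each DLL has locked its own feedback signal to clk\_ref. All delay values are hypothetical.

```python
from statistics import median


def clk_out_offset_ps(voter_delay_ps, dummy_delays_ps):
    """Offset of clk_out from clk_ref after all three DLLs have locked.

    DLL i locks fb_i to clk_ref, so phi_i leads clk_ref by its dummy delay.
    A majority voter switches on the median input edge, then adds its own delay.
    """
    phi_edges_ps = [-d for d in dummy_delays_ps]
    return median(phi_edges_ps) + voter_delay_ps


print(clk_out_offset_ps(120, [120, 120, 120]))  # 0  (ideal matching)
print(clk_out_offset_ps(120, [110, 118, 131]))  # 2  (mismatched dummies)
print(clk_out_offset_ps(120, [0, 0, 0]))        # 120 (no dummy: output lags)
```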

# Outline

- Introduction
- Evolution of Fault and Error Tolerant (FET) DLL
  - Naïve FET-DLL Architecture
  - FET-DLL with static timing correction
  - Process-Resilient FET-DLL with dynamic timing correction
- Experimental Results
- Conclusion

# **FET-DLL with Dynamic Calibration**

Enhance the FET-DLL by incorporating "dynamic timing correction"



# **Tunable Delay Element (TDE)**

- The input-to-output delay can be tuned at two levels
  - Tunable driving strength, controlled by the $\beta$-code
  - Tunable output capacitance, controlled by the $\gamma$-code
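How the two codes could shape the delay can be sketched with a first-order RC model. Everything in it is an assumption for illustration: the $\beta$-code is taken as the number of enabled parallel driver branches, the $\gamma$-code as the number of switched-in unit capacitors, and the resistance and capacitance values are hypothetical.

```python
def tde_delay_ps(beta: int, gamma: int,
                 r_unit_ohm: float = 8000.0,   # hypothetical: one driver branch
                 c_fixed_ff: float = 10.0,     # hypothetical: fixed output load
                 c_unit_ff: float = 1.0) -> float:  # hypothetical: one unit capacitor
    """First-order estimate: delay ~ 0.69 * R_drive * C_load."""
    if beta < 1 or gamma < 0:
        raise ValueError("beta must be >= 1 and gamma must be >= 0")
    r_drive_ohm = r_unit_ohm / beta            # more branches -> stronger drive
    c_load_ff = c_fixed_ff + gamma * c_unit_ff # more capacitors -> heavier load
    return 0.69 * r_drive_ohm * c_load_ff * 1e-3  # ohm * fF = 1e-3 ps


for beta, gamma in [(1, 0), (2, 0), (1, 4)]:
    print(beta, gamma, round(tde_delay_ps(beta, gamma), 2))
```

In this model a larger $\beta$-code shortens the delay and a larger $\gamma$-code lengthens it.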



## **Overall Online Calibration Procedure**



#### Calibration of One DLL at a Time
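The procedure itself appears only in the slide figures, so the following is a purely hypothetical sketch of what a one-at-a-time calibration could look like. It assumes each DLL has a tunable dummy delay with a linear code-to-delay relation, and that the offset between clk\_out and fb_i can be observed for each code; none of these names or values come from the slides.

```python
def calibrate_one(offset_ps_for_code, codes):
    """Return the tuning code that minimizes |clk_out - fb_i| for one DLL."""
    return min(codes, key=lambda code: abs(offset_ps_for_code(code)))


def make_offset_model(voter_delay_ps, dummy_base_ps, step_ps):
    """Hypothetical linear model: offset = voter delay - tunable dummy delay."""
    def offset(code):
        return voter_delay_ps - (dummy_base_ps + code * step_ps)
    return offset


def calibrate_round_robin(offset_models, codes):
    """Calibrate the DLLs sequentially, never more than one at a time."""
    return [calibrate_one(model, codes) for model in offset_models]


# Three DLLs whose dummy voters start with different (process-dependent) delays.
models = [make_offset_model(120.0, base, 2.0) for base in (100.0, 108.0, 116.0)]
print(calibrate_round_robin(models, range(16)))  # [10, 6, 2]
```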



# Once one DLL-instance is faulty ...



# Outline

- Introduction
- Evolution of Fault and Error Tolerant (FET) DLL
- Experimental Results
- Conclusion

# Layout of FET-DLL with Dynamic Calibration

- The FET-DLL is designed in a 90 nm CMOS process
- The primitive DLL instance is the synthesizable design from the following reference



Z.-H. Zhang, W. Chu, and S.-Y. Huang, "A Ping-Pong Methodology for Boosting the Resilience of Cell-Based Delay-Locked Loop", IEEE Access, Vol. 7, pp. 97928-97937, Aug. 2019.

# 1<sup>st</sup> Post-Layout Simulation Scenario

- A random timing drift occurs at $\phi_1$

**Output clk\_out is not affected!**

# 2<sup>nd</sup> Post-Layout Simulation Scenario

- A short-pulse error occurs at $\phi_1$

**Output clk\_out is not affected!**

# Max. Phase Error Comparison

- The performance of 4 versions of the DLL design (5 corners)

| DLL Version                               | Max. Phase Error (ps) |
|-------------------------------------------|-----------------------|
| Primitive DLL (not fault/error tolerant)  | 10                    |
| FET-DLL with Naïve TMR                    | 130                   |
| FET-DLL with Static Timing Correction     | 20                    |
| FET-DLL with Dynamic Timing Calibration   | 11                    |

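Dynamic timing calibration reduces the max. phase error by 45% relative to static timing correction (20 ps $\rightarrow$ 11 ps).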
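The 45% figure is consistent with the table values for static timing correction (20 ps) and dynamic timing calibration (11 ps). A minimal check, with a helper name chosen only for this example:

```python
def relative_reduction(before_ps: float, after_ps: float) -> float:
    """Fractional reduction when going from before_ps to after_ps."""
    return (before_ps - after_ps) / before_ps


# Static timing correction (20 ps) vs. dynamic timing calibration (11 ps).
print(f"{relative_reduction(20, 11):.0%}")  # 45%
```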

# Outline

- Introduction
- Evolution of Fault and Error Tolerant (FET) DLL
- Experimental Results
- Conclusion

## Conclusion



## **Future Work**

- FET-DLL with graceful degradation via a low-cost Excessive Phase Error Detector






# Thank you!