

# Thermo-electrical modelling of the ATLAS ITk Strip Detector

Graham Beck<sup>a</sup>, Kurt Brendlinger<sup>b</sup>, Yu-Heng Chen<sup>b</sup>, Georg Viehhauser<sup>c</sup>

<sup>a</sup>*Queen Mary University of London, Mile End Road, London E1 4NS, UK*

<sup>b</sup>*Deutsches Elektronen-Synchrotron DESY, Notkestraße 85, 22607 Hamburg*

<sup>c</sup>*University of Oxford, Keble Rd, Oxford OX1 3RH, UK*

## Abstract

In this paper we discuss the use of linked thermal and electrical network models to predict the behaviour of a complex silicon detector system. We use the silicon strip detector for the ATLAS Phase-II upgrade to demonstrate the application of such a model and its performance. With this example, the thermo-electrical model is used to test design choices, validate specifications, predict key operational parameters such as cooling system requirements, and optimize operational aspects like the temperature profile over the lifetime of the experiment. The model can reveal insights into the interplay of conditions and components in the silicon module, and it is a valuable tool for estimating the headroom to thermal runaway, all with very moderate computational effort.

*Keywords:* Silicon detector, Thermal runaway, Thermal management, Cooling

## 1. Introduction

The temperatures in silicon detector systems are critically important to their performance. Fundamentally, the leakage current of a silicon sensor has a pronounced temperature dependence

$$I \propto T_S^2 e^{-T_A/T_S}, \quad (1)$$

where  $T_S$  is the sensor temperature and  $T_A \simeq 7000$  K. Leakage currents in the silicon sensor can become particularly significant after irradiation, and the heat generated by these leakage currents, together with the heat from front-end electronic components on the detector, needs to be removed by a cooling system. The capability of the cooling system to remove this heat is limited by the temperature of the local cold sink (typically a circulated fluid) and the thermal impedance of the heat path between the source (electronics and sensor) and the sink. Due to the strong growth of leakage power with temperature, there is a critical temperature  $T_{\text{crit}}$  above which the heat cannot be removed quickly enough, and the detector becomes thermally unstable ('thermal runaway')<sup>1</sup>. Understanding the thermal behaviour and the headroom to thermal runaway is crucial for the design of a silicon detector system. Even before the limit of thermal stability is reached, temperatures in silicon detector systems have a major impact on key system parameters such as power supply capacity and cable dimensions, necessitating an accurate estimate.

In addition to the silicon, there can be aspects of the front-end electronics that have a temperature dependence. In the strip system for the ATLAS Phase-II upgrade [1], which is the subject of this case study, there are two additional temperature-dependent heat sources. The first is a radiation damage effect in the readout electronics, which leads to an increase in the digital power of the chip whose magnitude depends on the total ionisation dose (TID) and the temperature of the chip [1]. This phenomenon was first observed in the ATLAS IBL [2]. The other temperature dependence of a power source stems from the converter chip (FEAST [3]) used in the on-detector DC-DC converter system supplying power to the front-end electronics.

In principle, the temperatures in the system for a given set of operational parameters (power density, thermal conductivities, etc.) can be predicted by finite element analysis (FEA) to an accuracy that is limited only by the quality of the input parameters. However, this is a time-consuming process and can be prohibitively difficult if a number of local heat sources depend non-linearly on temperature. A simplification to this problem that allows for an analytical solution in the case of a simple heat source topology has been developed in [4]. Here we develop this method further to include several temperature-dependent non-linear heat sources in the front-end electronics. The resulting set of equations cannot be solved analytically anymore, but the solution can be found with little effort using numerical problem solvers. This enables us to predict with some confidence the temperatures and power requirements in the ATLAS strip system throughout Phase-II operation. The results from this prediction have been used throughout the ATLAS strip project to consistently dimension the different systems (cooling, power, services, etc.), including an appropriate margin due to the inclusion of a common set of safety factors. This method can be easily adapted to any other system by adjusting the model to the system-specific geometries and parameters.

<sup>1</sup>In a real detector system, the resulting growth of sensor temperature would be arrested by overcurrent limits in the power supplies, resulting in a reduction of the bias voltage. At the same time, the increased current leads to an increase of the noise, such that the overall result is a degradation of the S/N performance of the system.

### 1.1. The ATLAS strip system

The strip system for the ATLAS Phase-II upgrade consists of two parts: the barrel system, comprised of four concentric cylindrical barrels, and two endcaps consisting of six disks each.

In the barrel, the detector modules are made of square sensors ($96.85 \times 96.72 \text{ mm}^2$) with a hybrid on top, which hosts the front-end chips (ABC130 [5] and HCC [1]) as well as circuitry to convert the supply voltage of more than 10 V to the chip voltage of 1.5 V, controlled by the FEAST chip. The modules are glued onto both sides of a composite sandwich that contains two parallel thin-wall titanium cooling pipes embedded in carbon foam (Allcomp K9) between two facesheets of UHM carbon fibre (3 layers of K13C2U/EX1515) with a co-cured Kapton/copper low-mass tape. A model of this geometry is shown in Fig. 1. During final operation, cooling will be achieved by evaporating CO<sub>2</sub> in the cooling pipes with a final target temperature no higher than $-35^\circ\text{C}$ anywhere along the stave.

The geometry of the stave is uniform along its length, with the exception of the end region of the stave, where an End-Of-Substructure (EOS) card is mounted on both surfaces. The EOS card shares part of its heat path with the module; underneath the EOS card, the thermal path is degraded by the presence of electrically-insulating ceramic pipe sections. The thermal and electrical properties of a module adjacent to the EOS card (hereafter referred to as an ‘EOS module’) are sufficiently different from other modules along the length of the stave (‘normal modules’) to warrant separate treatment in the thermo-electrical model of the barrel.



Figure 1: Strip barrel local support geometry. On the left, a complete stave is shown (EOS card in the foreground). The right picture shows a cross-section of the stave with the two cooling pipes visible inside the core.

The endcap system consists of two endcaps composed of 6 disks each. Each disk contains 32 ‘petals,’ the local substructure depicted in Fig. 2. Both sides of the petal are loaded with 6 silicon modules, each with a distinct design, located at increasing radius from the beam pipe and labeled R0 through R5 (where ‘R’ stands for ring). Each endcap module consists of one or two irregularly-shaped silicon sensors and a varying number of front-end chips and DC-DC converters. The EOS card is located adjacent to the R5 module, but the cooling pipes run directly underneath it without a shared heat path, in contrast to the barrel EOS card. The remaining module and petal core design details are largely identical to the barrel module description above. Because of the unique geometry of each module in a petal, each of the six different types of module is modelled separately in the thermo-electrical model.



Figure 2: The geometry of the endcap strip petal, featuring 6 distinct module designs. A close-up of the R0 module is shown on the right.

### 1.2. Radiation environment

A key input to the thermo-electrical calculation is the radiation environment of the strip system, as several inputs depend on radiation damage effects. The sensor leakage current can be parametrized as a function of the fluence expressed in 1 MeV neutron-equivalents, and the TID effect on the digital chip current will be described as a function of the total ionizing dose and the dose rate (more details on its dependencies can be found in Section 7).

Predictions for both of these quantities have been generated for each point in the ITk using the FLUKA particle transport code and the PYTHIA8 event generator (Fig. 3) [6]. In the barrel system, both of these distributions display a strong dependence on $r$ but a weak $z$-dependence. Accordingly, we make the simplifying assumption that modules within the same barrel layer have identical fluence and TID, and model four different radiation profiles (one for each barrel layer). In the endcaps, the radiation levels vary significantly over the length of the petals and from disk to disk; therefore, we model each disk and ring position separately (36 in total).

## 2. The electrical model

The electrical model consists of low-voltage (LV) and high-voltage (HV) circuits, depicted in Fig. 4. The LV current (11 V) is used to power the hybrid controller chips (HCCs), ATLAS Binary Chips (ABCs) and Autonomous Monitoring and Control chip (AMAC) located on PCBs that are glued directly onto the surface of the sensor. These chips require between 1.5 and 3.3 V, which are provided by the temperature-dependent FEAST DC-DC converter (labeled bPOL12V in Fig. 4) and an LDO regulator (labeled bPOL12). The number of chips and converters on each module varies according to the design of each different module type (barrel short-strip and long-strip modules, and six different endcap module designs). A barrel or endcap module contains 10–28 ABC chips, 1–4 HCCs, and 1–2 of each of the other components (linPOL12V/bPOL12V/AMAC).

The LV current is also delivered to the EOS card to power various data transfer components (the GBLD, LpGBT and GBTIA). A FEAST identical to the one used on the module is used to step the voltage down from 11 V to 2.5 V, and an additional LDO regulator brings the voltage down further for some components. Some modules (the short-strip barrel staves) contain two GBLD and LpGBT chips.

Figure 3: The ATLAS ITk radiation environment. (a) 1 MeV neutron equivalent fluence and (b) total ionizing dose. Both plots are for an integrated luminosity of $4000 \text{ fb}^{-1}$ [6].

Figure 4: The electrical model of the ITk Strip barrel and endcap modules. Green arrows represent temperature-dependent heat sources, while orange arrows are temperature-independent. Grey squares are chips.

The bus tape, which carries both LV and HV currents, has a small ohmic resistance, which impacts the module in two ways. First, the tape itself will generate some heat according to the amount of current passing through it; this source of heat is accounted for in the model, although its contribution to the total module power is negligible. Second, due to the voltage loss along the traces, a slightly lower voltage is supplied to modules on one end of the substructure (farthest from the EOS card). The treatment of this effect is slightly different in the barrel and endcap models: in the barrel, the voltage delivered to every module is averaged to 10.5 V; in the endcap, the $\Delta V$ is estimated based on the calculated expected power loss along the tape for each module. In both cases, the impact of using a different treatment is small.

Finally, the HV current provides the voltage bias on the silicon sensors. An HV multiplexer switch (HVMUX) can be used to disconnect the sensor from the bias line (it requires a $10\text{ M}\Omega$ resistor parallel to the sensor in order to function). Two HV filters with an effective resistance of $10\text{ k}\Omega$ are situated in series with the sensor. The nominal operating voltage of the sensor is expected to be 500 V, but the system is designed to handle a voltage bias of up to 700 V.
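As a rough illustration of the HV branch, the sketch below computes the voltage across the sensor and the power dissipated in each element. It assumes that the filter resistance sits in series with the parallel combination of the sensor and the HVMUX resistor. That topology, the function name `hv_branch` and the example leakage current of 1 mA are assumptions made for illustration, not values taken from the model.

```python
def hv_branch(v_bias, i_leak, r_filter=10e3, r_parallel=10e6):
    """Return the sensor voltage (V) and dissipated powers (W) in the HV branch.

    Assumed topology (not confirmed by the text): the filter resistance is in
    series with the sensor, and the HVMUX resistor is in parallel with it.
    """
    # Kirchhoff: v_bias = v_sensor + r_filter * (i_leak + v_sensor / r_parallel)
    v_sensor = (v_bias - r_filter * i_leak) / (1.0 + r_filter / r_parallel)
    i_parallel = v_sensor / r_parallel
    i_total = i_leak + i_parallel
    return {
        "v_sensor": v_sensor,
        "i_total": i_total,
        "p_sensor": v_sensor * i_leak,
        "p_hvmux_resistor": v_sensor * i_parallel,
        "p_filter": r_filter * i_total ** 2,
    }

# Nominal 500 V bias with an illustrative 1 mA leakage current.
nominal = hv_branch(v_bias=500.0, i_leak=1e-3)
```

The three power terms add up to the power delivered by the supply, `v_bias * i_total`, which gives a quick consistency check on the circuit algebra.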

## 3. The thermal model

The thermal network consists of heat sources (some of which are temperature-dependent) and thermal resistances. The latter are given by the properties of the mechanical design (heat conductivities of the materials) and the geometry of the heat path. The geometry is generally 3-dimensional, but it is the strategy of the simple network models to lump the 3D behaviour into one thermal resistance parameter. In the models discussed here, we have used a granularity corresponding to single detector modules for which the thermal resistance has been modelled. The temperatures in the model are then given for the nodes in the network in analogy to the potentials in an electrical network.<sup>2</sup>

The complexity of the thermal network used in this study, depicted in Fig. 5, is given by the variety of temperature-dependent heat sources in the ATLAS strip system. These sources consist of the digital power for each type of chip, the FEAST chip providing the on-detector DC-DC conversion, and the sensor leakage power. In the ATLAS ITk strip modules, all of these components are located on top of the sensors, such that the heat generated in them flows through the sensor into the support structure, the stave (barrel) or petal (endcap) core with the embedded cooling pipe. In the network model, the heat flow from these sources combines and travels through a common impedance $R_M$ to the sink at a temperature $T_C$. For each of the temperature-dependent heat sources (ABC, HCC, FEAST and the sensor) we have added a resistance from the common temperature $T_{\text{mod}}$ to allow for a finite and different heat path for each of them. Finally, the EOS card adjacent to the last module on the barrel stave or endcap petal is modeled as an additional source of heat with an independent impedance for its unique thermal path.



Figure 5: Thermal network model.

This is a more complex thermal network than the one studied in Ref. [4], for which an analytical solution for the determination of thermal stability is given. In particular, because of the non-linear temperature dependence of some of the heat sources, it is not possible in the present case to solve the set of equations describing the model analytically. However, the set of equations is still sufficiently small to solve numerically using tools such as Mathematica (used in the barrel model) or Python (used in the endcap system).

---

<sup>2</sup>Historically, Fourier's description of heat conduction pre-dated and inspired Ohm's work on electrical resistive networks. Here we followed the opposite direction.

## 4. Obtaining thermal impedances using FEA

The cooling path between the sources dissipating electrical power and the cooling fluid is 3-dimensional and includes components with orthotropic thermal conductivity. Hence the prediction of temperature at any node of the model requires a 3D thermal FEA [7, 8]. However, the thermal conductivities of the components along the path are approximately constant, so that the temperature rise $\Delta T_i$ above the coolant temperature of any node $i$ ($i = \text{ABC, HCC, AMAC, FEAST, tape, RHV, or sensor}$) in the thermal network model is adequately described by a linear sum of contributions from individual sources, i.e.:

$$\Delta T_i \equiv T_i - T_C = R_i P_i + (R_C + R_M) \sum_j P_j, \quad (2)$$

where the index $j$ runs over all powered nodes. (We have momentarily ignored the contribution from the EOS card.)
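A minimal sketch of how Eq. 2 can be evaluated for all nodes at once is given below. The function name `node_temperatures`, the dictionary layout and all numerical values are illustrative assumptions; the impedances are merely chosen to lie in the ranges quoted later in this section, and the EOS card is ignored as in Eq. 2.

```python
def node_temperatures(t_coolant, r_common, r_node, power):
    """Evaluate Eq. 2: T_i = T_C + R_i * P_i + (R_C + R_M) * sum_j P_j.

    r_common is the combined impedance R_C + R_M in degC/W.
    """
    total_power = sum(power.values())  # every powered node heats the common path
    return {
        name: t_coolant + r_node.get(name, 0.0) * p + r_common * total_power
        for name, p in power.items()
    }

# Illustrative numbers only (impedances in degC/W, powers in W).
r_node = {"ABC": 1.0, "HCC": 12.0, "FEAST": 15.0, "sensor": 0.5}
power = {"ABC": 3.0, "HCC": 0.4, "FEAST": 1.5, "sensor": 0.1}
temps = node_temperatures(t_coolant=-35.0, r_common=1.2, r_node=r_node, power=power)
```

With these inputs the common path alone raises every node by 6 °C above the coolant, and the small-footprint components sit several degrees higher still because of their own impedances.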

In order to extract the thermal impedances for the thermal network model, the finite element model is run multiple times, with each heat source (or group of similar sources) switched on in turn with a representative amount of heat. In each of these cases, the temperature is calculated for all nodes in the thermal network model (Fig. 5). The temperature of a node is here taken as the average of the temperatures for all the gridpoints in the FEA model within the volume of the object corresponding to the node<sup>3</sup>. The thermal impedances are then obtained from a fit of Eq. 2 using the temperature data for all nodes for all cases of heat injection.

Because of the nature of the network, the fitted value for the common impedance $R_{CM} = R_C + R_M$ is determined by the observed temperature rises of components where no heat is injected. The linearity of this relationship is illustrated in Fig. 6. The values for the additional component-specific impedances are in turn driven by the observed temperature rise in the component with the heat injected in that particular component.
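The logic of this extraction can be sketched with a simplified estimator that replaces the fit by plain averaging: $R_{CM}$ is taken from the unpowered nodes, and each $R_i$ from the excess rise of the powered node. The function `extract_impedances`, the data layout and the synthetic temperature rises (generated from Eq. 2 with $R_{CM} = 1.2\ ^{\circ}\text{C}/\text{W}$) are assumptions for illustration, not the procedure or data used for the ITk.

```python
def extract_impedances(runs):
    """Estimate R_CM and the node-specific R_i from single-source FEA runs.

    `runs` maps the powered node name to a tuple of
    (injected power in W, {node name: temperature rise above coolant in degC}).
    """
    # R_CM: average of dT / P over all nodes that were NOT powered in a run.
    ratios = [
        rise / p_inj
        for source, (p_inj, rises) in runs.items()
        for node, rise in rises.items()
        if node != source
    ]
    r_cm = sum(ratios) / len(ratios)
    # R_i: rise of the powered node per watt, beyond the common contribution.
    r_node = {
        source: rises[source] / p_inj - r_cm
        for source, (p_inj, rises) in runs.items()
    }
    return r_cm, r_node

# Synthetic "FEA" results, one run per powered source.
runs = {
    "HCC": (0.5, {"HCC": 6.6, "FEAST": 0.6, "sensor": 0.6}),
    "FEAST": (2.0, {"HCC": 2.4, "FEAST": 32.4, "sensor": 2.4}),
    "sensor": (1.0, {"HCC": 1.2, "FEAST": 1.2, "sensor": 1.7}),
}
r_cm, r_node = extract_impedances(runs)
```

A least-squares fit over all nodes and all runs, as described above, weights the data differently but relies on the same separation between powered and unpowered nodes.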



Figure 6: The relationship between the temperature rise observed in the FEA for a specific component and the heat injected in another component. The slope of the fitted line is the estimate for  $R_{CM}$ . (a) The fit for a short-strip barrel module adjacent to the EOS card. (b) The fit for the endcap R0 module. For each data point marker, the source of power is indicated by the shape, and the measured component is indicated by the color. The blue band represents a  $\pm 20\%$  error band on the fit for  $R_{CM}$ .

<sup>3</sup>This is particularly interesting in the case of the sensor, which fills a large volume, with a potentially large range of temperatures. Ref. [4] is not clear about the exact definition of the sensor temperature to be used for the calculation of the thermal impedance. In fact, at the time we were still using the maximum sensor temperature for the calculation of thermal stability. Since then we have acquired more experience with thermal network models and found that the best agreement is achieved if the average sensor temperature is used.

For a barrel module, the agreement of the network temperatures using the thermal impedances from the fit with the data from FEA is better than $0.5^{\circ}\text{C}$ for all nodes. This procedure is performed for both an EOS module and a normal module. The thermal impedance from the sensor to the sink ($R_{\text{CM}}$) is consistently between $1.1$ and $1.4\ ^{\circ}\text{C}/\text{W}$, but higher values (between $10$ and $20\ ^{\circ}\text{C}/\text{W}$) are found for other impedances in the network ($R_{\text{HCC}}$ and $R_{\text{FEAST}}$), mostly because these are for components with a small footprint constituting a bottleneck for the heat flow.

For the endcap modules, the procedure to determine the thermal impedances is performed for each of the 6 module types. $R_{\text{CM}}$ ranges from $0.6$ to $1.4\ ^{\circ}\text{C}/\text{W}$, with other nodes between $5$ and $20\ ^{\circ}\text{C}/\text{W}$. Because the location of powered components is more irregular on an endcap module, the difference between the predicted temperatures of the linear network and the FEA can reach up to $1.2\ ^{\circ}\text{C}$ for key temperature-dependent nodes. To compensate for this additional degree of uncertainty, the thermal impedance safety factor used in the endcap is increased by a factor of 2 compared to the barrel modules (see Section 7.2).

There are two recognised departures from linearity of the thermal path: the rise in thermal conductivity of the silicon sensor with decreasing temperature, and the rise in heat transfer coefficient (HTC) of the evaporating $\text{CO}_2$ coolant with increasing thermal flux. The FEA models are run using mean values for these quantities appropriate to the operating conditions, and the thermo-electrical model results are insensitive to the variations expected in practice. However, if this level of realism is required and if reliable parametrizations for these dependencies can be obtained, then the inclusion of such variations in the model is possible.

## 5. Other model inputs

The two temperature-dependent elements of the thermo-electrical model (the radiation-induced digital current increase in the front-end chips, and the efficiency of the FEAST-controlled DC-DC converter) are described in this section. Both effects are studied experimentally and fit with functional forms in order to accurately represent them in the model. The uncertainties in the experimental data and in our modelling assumptions are estimated here and considered in the evaluation of safety factors, described in detail in Section 7.2.

### 5.0.1. DC-DC converter

The DC-DC converter, controlled by the FEAST chip, supplies a low-voltage (1.5 V) current to the ABC130 and HCC front-end chips on the module. The efficiency of the FEAST depends on its temperature as well as the output (load) current delivered to the front-end chips. To correctly model the FEAST efficiency, experimental measurements have been performed to characterize the dependence and fitted with a functional form.

For the measurement, the FEAST power board was glued to an aluminum cold plate, cooled with $\text{CO}_2$, and powered with the nominal working input and output voltages (11 V input, 1.5 V output). The temperature of the FEAST was measured with an NTC thermistor and a PTAT sensor residing on the FEAST for a range of load currents up to the maximum design current of 4 A.

The data was then fit with a function with sufficient parameters to ensure reasonable agreement; the choice of functional form has no physical interpretation. Figure 7 depicts the FEAST efficiency data and the parametrized fit used in the model. The parametrization fits the data with an accuracy better than 1%; this uncertainty in the FEAST efficiency modelling is small compared to other uncertainty sources, and is therefore neglected in our model.
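How an efficiency value enters the power balance can be sketched as follows. The function `feast_power` is a hypothetical helper; the efficiency is passed in as a plain number, because the fitted parametrization of Fig. 7 is not reproduced here, and the example value of 75% is arbitrary.

```python
def feast_power(i_load, efficiency, v_out=1.5):
    """Return (input power, heat dissipated in the converter) in watts.

    i_load is the load current in A, efficiency the ratio of output to input
    power. In the model the efficiency would come from the fit to the data,
    evaluated at the FEAST temperature and load current.
    """
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    p_out = v_out * i_load      # power delivered to the front-end chips
    p_in = p_out / efficiency   # power drawn from the LV supply
    return p_in, p_in - p_out   # the difference is dissipated as heat

# Arbitrary example: 2 A load at an assumed efficiency of 75%.
p_in, heat = feast_power(i_load=2.0, efficiency=0.75)
```

Because the efficiency itself depends on the FEAST temperature, the dissipated heat feeds back into the thermal network, which is why the converter is treated as a temperature-dependent source.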

### 5.0.2. Digital current increase of chips using 130 nm CMOS technology

The ABC and HCC chips, designed using IBM 130 nm CMOS 8RF technology, are known to suffer from an increase in digital current when subjected to a high-radiation environment [1]. This phenomenon, known as the “TID bump,” is well-studied [9, 10] and has a characteristic shape whereby the effect reaches a maximum as a function of the accumulated dose and then gradually diminishes (see Fig. 8).

Figure 7: The FEAST efficiency model based on experimental data. (a) The experimental data points characterizing the FEAST efficiency are plotted as dots and color coded for load current. The data is compared to the analytic fit, evaluated in curves of equal current. (b) The same analytic fit, presented as a function of current load for curves of equal temperature.

In an effort to characterize the nature of the TID bump in the ABC and HCC chips empirically, many irradiation campaigns have been conducted using a variety of radiation sources, testing the effect at different temperatures and dose rates. The data collected from these studies was used to develop a model of the TID bump that estimates the digital current increase given the total ionizing dose, the dose rate, and the operating temperature of the chip. This parametrization, which is depicted in Fig. 8, is used as an input to the thermo-electrical model in order to correctly model the ABC and HCC currents. The TID bump is assumed to fully apply to the HCC digital current, and apply to 69% of the ABC digital current (according to our understanding of its digital circuitry).
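The partial application of the scale factor can be written down in a few lines. In this sketch the helper `digital_current`, the nominal currents and the scale factor of 1.5 are illustrative assumptions; only the 69% fraction is taken from the text.

```python
ABC_TID_FRACTION = 0.69  # share of the ABC digital current affected (from the text)

def digital_current(i_nominal, tid_scale, affected_fraction=1.0):
    """Digital current after applying the TID scale factor to part of it."""
    affected = affected_fraction * i_nominal
    # The unaffected part stays at its nominal value, the rest is scaled.
    return (i_nominal - affected) + tid_scale * affected

# Illustrative nominal currents in amperes and an example scale factor.
hcc = digital_current(0.10, tid_scale=1.5)
abc = digital_current(0.04, tid_scale=1.5, affected_fraction=ABC_TID_FRACTION)
```

For the same scale factor, the relative increase of the ABC current is therefore smaller than that of the HCC current.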

The TID bump displays certain key features, which are reflected in the parametrization: first, the effect is larger at colder temperatures and higher dose rates. This means it can be mitigated by operating the chips at higher temperature (note that the dose rate is determined by the LHC operational conditions). Second, the figure illustrates how chips receiving different dose rates will reach their maximum digital current increase at different times. This feature is particularly important when modelling the total power consumed by the barrel and endcap systems. In both systems, the dose rate varies significantly depending on the position of the module in the detector. The effect means that the maximum system power will be smaller than the sum of the maximum power of each module, as each chip reaches its maximum at a different point in time.
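The last point can be illustrated with a toy calculation. The bump shape `toy_bump` below is a made-up log-normal-like curve, and the peak months are chosen arbitrarily; neither is related to the parametrization shown in Fig. 8. The sketch only demonstrates that the peak of the summed profile is lower than the sum of the individual peaks when the peaks occur at different times.

```python
import math

def toy_bump(month, peak_month, height=0.8, width=0.5):
    """Toy scale factor for the digital current (purely illustrative)."""
    if month <= 0:
        return 1.0
    x = math.log(month / peak_month) / width
    return 1.0 + height * math.exp(-0.5 * x * x)

months = range(0, 14 * 12 + 1)
# Modules at a higher dose rate peak earlier (peak months chosen arbitrarily).
profiles = [[toy_bump(m, peak) for m in months] for peak in (6, 24, 60)]

# Sum the three module profiles month by month to get the "system" profile.
system_profile = [sum(values) for values in zip(*profiles)]
max_of_sum = max(system_profile)
sum_of_max = sum(max(p) for p in profiles)
```

Dimensioning a power system on `sum_of_max` would be unnecessarily conservative in such a case, which is one reason for simulating each position in the detector over the full lifetime.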

The TID bump is an important source of uncertainty in our model. The experimental data exhibit a relatively large variation in the TID bump effect, in particular between different batches of the same type of chip delivered by the manufacturer, suggesting an unknown effect in the fabrication process. To estimate the uncertainty in the TID bump, the parametrized function is fit again using only the worst-performing data (defined as having the largest TID bump effect). This “pessimistic” parametrization is used as a safety factor to estimate the detector performance in worst-case scenarios.

The irradiations of individual chips have typically been performed at constant dose rate and temperature. However, both of these parameters will vary as a function of time in the scenarios that we attempt to model. In our current parametrization, we use only the instantaneous value of these two parameters, thus neglecting any possible history of the TID effect for a given chip. We also ignore any short-term effects due to variations in the dose rate on the scale of hours or days. This approach is mandated by the lack of more varied experimental data and the absence of a good theoretical model for this effect. This probably constitutes the largest source of unknown error in our model.

### 5.0.3. Radiation-dependent leakage current

The radiation-induced sensor leakage current can be parametrized as a function of the hadron fluence expressed in 1 MeV equivalent neutrons. The parametrizations we have used for the evaluation of our model are shown in Fig. 9 for a reference sensor temperature of $-15\ ^{\circ}\text{C}$ [11]. In the model, the leakage current is scaled to a given sensor temperature using Eq. 1.
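The temperature scaling follows directly from Eq. 1 by taking the ratio of the currents at two temperatures. A minimal sketch, in which the function name `scale_leakage` and the example temperatures are assumptions:

```python
import math

T_A = 7000.0  # K, from Eq. 1

def scale_leakage(i_ref, t_ref_c, t_c):
    """Scale a leakage current from t_ref_c to t_c (both in degC) using Eq. 1."""
    t_ref = t_ref_c + 273.15  # convert to kelvin
    t = t_c + 273.15
    return i_ref * (t / t_ref) ** 2 * math.exp(-T_A * (1.0 / t - 1.0 / t_ref))

# Ratio of the leakage current at -8 degC to that at the -15 degC reference.
ratio = scale_leakage(1.0, t_ref_c=-15.0, t_c=-8.0)
```

With $T_A \simeq 7000$ K the current roughly doubles for a temperature rise of 7 °C around the reference point, which shows how strongly the sensor power reacts to small temperature changes.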



Figure 8: Parametrization of the impact of the total ionizing dose on the magnitude of the front-end chip digital current (the TID bump), presented as a function of time. The current is multiplied by a scale factor that is modeled as a function of total ionizing dose, dose rate, and temperature, based on experimental data.



Figure 9: Parametrization used for the leakage current at  $-15^{\circ}\text{C}$  as a function of the fluence for two different sensor bias voltages [11].

## 6. Running the model

The thermo-electrical model constructs a profile of the sensor module operation conditions over the lifetime of the detector in the following manner. First, the total module power (including all components, but excluding the sensor leakage power) and the sensor temperature assuming no leakage current ($T_0$) are calculated using a reasonable set of initial component temperatures. The initial value for the module power is used to solve for the sensor power and temperature accounting for leakage current, using the thermal balance equation and the relationship from Eq. 1. Using this calculated sensor leakage current and temperature, the power and temperature of the module components are updated given the initial (year 0, month 0) startup parameters.

Next, the module conditions of the following month (year 0, month 1) are calculated. Using the component temperatures calculated from the previous month and the operational parameters (ionizing dose and dose rates) from the current month, the module total power (excluding sensor leakage) is again calculated, and subsequently the sensor temperature and leakage current are computed. Following this, the module component temperatures and power values are derived for this month. This process is repeated in one-month steps until the final year of operation, or until a real solution for the sensor temperature does not exist, indicating that thermal runaway conditions have been reached.
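The core of this loop, the solution of the thermal balance and the detection of runaway, can be sketched in a strongly simplified form. The sketch below keeps $T_0$ fixed and includes only the leakage feedback, so the updates of the chip and converter powers described above are left out. The thermal balance is written as $T_S = T_0 + R\,P_\text{leak}(T_S)$ and solved by fixed-point iteration starting from $T_0$; the function names, the iteration scheme, the divergence threshold and all numbers are assumptions for illustration and not the implementation used for the ITk.

```python
import math

T_A = 7000.0  # K, from Eq. 1

def leakage_power(t_sensor_k, p_ref, t_ref_k=258.15):
    """Leakage power at t_sensor_k, scaled from p_ref at t_ref_k with Eq. 1."""
    ratio = (t_sensor_k / t_ref_k) ** 2
    return p_ref * ratio * math.exp(-T_A * (1.0 / t_sensor_k - 1.0 / t_ref_k))

def solve_sensor_temperature(t0_k, r_thermal, p_ref, max_iter=10000, tol=1e-6):
    """Solve T_S = T_0 + R * P_leak(T_S); return None if no stable solution."""
    t = t0_k
    for _ in range(max_iter):
        t_new = t0_k + r_thermal * leakage_power(t, p_ref)
        if t_new > t0_k + 100.0:   # iterates are diverging: thermal runaway
            return None
        if abs(t_new - t) < tol:   # converged to the stable (lower) solution
            return t_new
        t = t_new
    return None  # no convergence is treated as runaway in this sketch

def run_lifetime(t0_k, r_thermal, p_ref_by_month):
    """Step through the months; stop at the first month without a solution."""
    history = []
    for month, p_ref in enumerate(p_ref_by_month):
        t_sensor = solve_sensor_temperature(t0_k, r_thermal, p_ref)
        if t_sensor is None:
            return history, month  # month in which runaway is reached
        history.append(t_sensor)
    return history, None

# Arbitrary example: the reference leakage power grows linearly with time.
history, runaway_month = run_lifetime(248.15, 1.2, [0.1 * m for m in range(169)])
```

Starting the iteration at $T_0$ makes the sequence of temperatures increase monotonically, so it either settles on the lower, stable solution or grows without bound when no solution exists. Very close to the critical point the convergence becomes slow, and a bracketing root finder would be a more robust choice.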

In the barrel subsystem, the above procedure is performed four separate times to represent the radiation conditions of the four barrel layers located at different radii from the beam axis<sup>4</sup> for both a normal and an EOS-type module. Thus, eight modules are simulated in total for the barrel (4 layers $\times$ normal/EOS), and they are combined in their proper proportion to simulate the entire barrel system.

In the endcap subsystem, the total ionizing dose and dose rates vary significantly depending on the position of the module; furthermore, the design of each module on a petal differs significantly. Therefore, all 36 module types (6 rings $\times$ 6 disks) are simulated independently, and combined to represent the full endcap.

We have implemented this algorithm in Mathematica (barrel) and Python (endcaps). In both cases, the calculation for a set of operating conditions over the full lifetime of the LHC takes between 5 and 10 minutes on a standard PC, thus enabling a quick turn-around for systematic studies of the parameter space.

## 7. Outputs of the thermo-electrical model

The thermo-electrical model provides a wide range of predictions for the operation of the strip system. A detailed discussion of all results would only be of interest to ITk strip system experts and is beyond the scope of this article. Instead, we present here a subset of results to demonstrate the capabilities and use of the thermo-electrical model for the design of the detector system.

### 7.1. Operational scenarios

To study the different aspects of our predictions for the operation of the ITk strip system throughout its lifetime, we performed the calculation of the system parameters over the expected 14 years of operation in monthly steps as outlined in Section 6. Time-dependent operational inputs to the calculation were taken from the expected performance of the HL-LHC (Fig. 10a). For the cooling temperature, which can be adjusted during data taking using detector control systems, we studied flat cooling profiles with temperatures as low as $-35^{\circ}\text{C}$, the lowest evaporation temperature achievable with the ITk evaporative $\text{CO}_2$ cooling system, as well as a ‘ramp’ scenario in which the cooling temperature starts at $0^{\circ}\text{C}$ and is gradually lowered down to $-35^{\circ}\text{C}$ over the course of 10 years (Fig. 10b).



Figure 10: (a) Expected HL-LHC performance and (b) ‘cooling ramp’ scenario for the coolant temperature. Year-long shutdowns of the LHC are anticipated in years 5 and 9.

---

<sup>4</sup>The correct module type, short-strip in the inner two layers and long-strip for the outer two layers, is used for each layer.

### 7.2. Safety factors

274    To ensure the robustness of the system design against errors in the assumptions used in the model, we  
 275    also evaluate the model using a set of input parameters with some key inputs degraded. The set of safety  
 276    factors used is given in Table 1. Each safety factor has been estimated individually based on experience, the  
 277    complexity of the system aspect described by the parameter, and from available data or the absence of such  
 278    data. Note that the model can be evaluated with all the safety factors listed in Table 1 used together, a  
 279    situation that is unlikely to occur in the real system, to provide a worst-case estimate for the performance of  
 280    the ITk strip system. The individual effects of the different safety factors are demonstrated in Fig. 11.

Table 1: Safety factors.

| Safety factor             | Value                  | Reason                                                                       |
|---------------------------|------------------------|------------------------------------------------------------------------------|
| Fluence                   | 50%                    | Accuracy of fluence calculations and uncertainties in material distributions |
| Thermal impedance         | 10% barrel, 20% endcap | Local support build tolerances, thermal network assumptions                  |
| Digital current           | 20%                    | Final chip performance and parametrization of TID effect                     |
| Analog current            | 5%                     | Final chip performance                                                       |
| Tape electrical impedance | 10%                    | Electrical tape manufacturing tolerances                                     |
| Bias voltage              | 700 V                  | Increased bias voltage from nominal 500 V to maintain S/N                    |
| TID parametrization       | Nominal/Pessimistic    | Different data sets for fit of TID bump                                      |



Figure 11: Comparing the impact of different safety factors on (a) the sensor temperature and (b) the module power for the endcap R3-type module, using a flat cooling scenario ( $-30^{\circ}\text{C}$ ). The dotted line depicts the effect of all safety factors applied at once.
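One way to keep the nominal and degraded input sets side by side is to treat the safety factors of Table 1 as modifiers of a nominal input record. The sketch below is an illustration only: the class and field names are hypothetical, and the percentages in Table 1 are assumed to be fractional increases over the nominal value (for example, 50% on fluence becomes a scale factor of 1.5).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelInputs:
    """Hypothetical container for the inputs affected by Table 1."""
    fluence_scale: float = 1.0
    thermal_impedance_scale: float = 1.0
    digital_current_scale: float = 1.0
    analog_current_scale: float = 1.0
    tape_impedance_scale: float = 1.0
    bias_voltage_v: float = 500.0          # nominal bias voltage (Table 1)
    tid_parametrization: str = "nominal"


# Thermal impedance safety factor differs between barrel and endcap (Table 1).
THERMAL_IMPEDANCE_FACTOR = {"barrel": 1.10, "endcap": 1.20}


def with_all_safety_factors(nominal, region):
    """Return a copy of the inputs with every Table 1 safety factor applied.

    ASSUMPTION: percentages are interpreted as increases over nominal.
    """
    return replace(
        nominal,
        fluence_scale=nominal.fluence_scale * 1.50,
        thermal_impedance_scale=(nominal.thermal_impedance_scale
                                 * THERMAL_IMPEDANCE_FACTOR[region]),
        digital_current_scale=nominal.digital_current_scale * 1.20,
        analog_current_scale=nominal.analog_current_scale * 1.05,
        tape_impedance_scale=nominal.tape_impedance_scale * 1.10,
        bias_voltage_v=700.0,
        tid_parametrization="pessimistic",
    )


worst_case_endcap = with_all_safety_factors(ModelInputs(), "endcap")
```

Applying a single factor at a time, as in Fig. 11, would correspond to calling `replace` with only one modified field.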

It is important to note that combining multiple safety factors can have a compounding effect on the system. As an example, the effect of an increased bias voltage combined with a larger digital current will result in a much higher sensor leakage current at the detector end-of-life than either situation occurring individually. The analytical model allows for scenarios like these to be examined quickly and effectively.

### 7.2.1. Module properties

Several module properties predicted by the thermo-electrical model are shown in Figures 12 and 13 for the barrel system. The different radiation-dependent effects occur on different timescales. The maximum in the digital chip power due to the TID effect occurs relatively early (in year 1 to 4), although the bump has a long tail, particularly in the outer layers of the barrel. The sensor leakage power, on the other hand, grows towards the end of the lifetime of the ITk. If the leakage current continued to increase in the case of further irradiation, or if the cooling temperature were raised, this growth would ultimately lead to thermal runaway. Due to the radial dependence of the radiation environment, the radiation-induced effects are most pronounced in the innermost barrel layers.



Figure 12: Examples of barrel module performance predictions for a flat cooling scenario ( $-30^{\circ}\text{C}$ ) including safety factors. (a) Power per module. (b) Temperatures for different nodes of an end-of-stave barrel module in the innermost barrel. The discontinuities in year 5 and 9 are due to anticipated year-long shutdowns of the LHC.



Figure 13: Examples of barrel module performance predictions for the ramp cooling scenario including safety factors. (a) Sensor temperature in the innermost barrel modules. (b) Power in an end-of-stave barrel module in the innermost layer.

### 7.2.2. System properties

One of the key concerns for the design of the strip system is thermal stability of the system. If the cooling temperature is too high to limit the leakage power from the radiation-damaged sensors to a level where the heat can still be removed, the system is unstable (it goes into ‘thermal runaway’). To find the cooling temperature  $T_C$  at which this condition is reached, we run the thermo-electrical model repeatedly, increasing  $T_C$  in steps of  $5^{\circ}\text{C}$ , until the model finds thermal runaway (as described in Section 6). In the endcap strip system, this occurs at a cooling temperature of  $-15^{\circ}\text{C}$  under nominal conditions; in this scenario, thermal runaway would be reached in the 12<sup>th</sup> year of operation. With safety factors applied, thermal runaway would occur at a cooling temperature of  $-25^{\circ}\text{C}$  (in year 11). In the barrel system, where the radiation environment is slightly less intense, the conditions for thermal runaway occur at the same cooling temperatures but two years later than in the endcaps. As the design cooling temperature of the ITk cooling system is  $-35^{\circ}\text{C}$ , we have confidence that the ITk strip system has a sufficient margin for thermal stability.
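The scan over cooling temperatures can be illustrated with a strongly simplified stand-in for the full model. The sketch below replaces the thermo-electrical network by a single thermal impedance and a fixed end-of-life leakage power that scales with temperature according to Eq. (1). The impedance value is the one quoted for the network model in Section 8; the front-end temperature rise, the reference leakage power, and the runaway cap are hypothetical numbers chosen for illustration, so the resulting temperature is not a prediction for the ITk.

```python
import math

T_A = 7000.0          # K, from Eq. (1)
R_T = 1.132           # K/W, sensor-to-coolant impedance quoted in Section 8
# Hypothetical illustration values (not taken from the paper):
DT_FRONT_END = 10.0   # K, temperature rise due to front-end power alone
Q_REF = 0.5           # W, end-of-life leakage power at T_REF
T_REF = 253.15        # K (-20 degC)
T_CAP = 373.15        # K, above this the iteration is declared runaway


def leakage_power(t_sensor):
    """Leakage power (W) at sensor temperature t_sensor (K), using Eq. (1)."""
    return Q_REF * (t_sensor / T_REF) ** 2 * math.exp(
        -T_A * (1.0 / t_sensor - 1.0 / T_REF))


def steady_state(t_cool_degc, tol=1e-6, max_iter=10_000):
    """Return the stable sensor temperature (degC), or None on runaway.

    Fixed-point iteration of T_S = T_0 + R_T * Q(T_S), starting from the
    base temperature T_0 = coolant temperature + front-end rise.
    Non-convergence within max_iter is also treated as runaway.
    """
    t_base = t_cool_degc + 273.15 + DT_FRONT_END
    t_sensor = t_base
    for _ in range(max_iter):
        t_next = t_base + R_T * leakage_power(t_sensor)
        if t_next > T_CAP:
            return None
        if abs(t_next - t_sensor) < tol:
            return t_next - 273.15
        t_sensor = t_next
    return None


def first_runaway_coolant_temp(start=-35, stop=0, step=5):
    """Raise the coolant temperature in 5 degC steps until runaway is found."""
    for t_cool in range(start, stop + step, step):
        if steady_state(t_cool) is None:
            return t_cool
    return None
```

In the full model the same scan is repeated for the whole lifetime, with leakage power and electronics power changing month by month, which is why the year in which runaway is reached is also an output.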

Beyond the issue of stability, the thermo-electrical model delivers predictions for the development of current and power requirements for the overall system. Some of the predictions are shown in Fig. 14. Again, the different timescales of the various radiation-induced effects are visible; ignoring this time dependence could lead to overspecification of some system aspects like the total cooling power.



Figure 14: Examples of system performance predictions. (a) Barrel total power requirements. The plot shows the stacked power requirements for the four barrel layers (orange: innermost barrel, blue: outermost barrel). Full colour indicates power from the front-end electronics, greyed parts are contributions from HV power for the four barrels. The discontinuities in year 5 and 9 are due to anticipated year-long shutdowns of the LHC. (b) The power requirements for each of the 36 simulated endcap modules, labeled according to their ring type and disk position. The solid black line indicates the average module power. Both predictions use a scenario with flat  $-30^{\circ}\text{C}$  cooling and including all safety factors.

The predictions from this model are now used throughout the strip project to consistently size the power supply and cooling systems. Including safety factors in the predictions gives us some confidence that the designs are robust; by using commonly agreed safety factors, we ensure a consistent use of safety factors throughout the project and prevent safety factor creep.

Because of the different timescales for the peak power due to the TID effect and the radiation-induced sensor leakage, there is room to optimize the cooling temperature profile to minimize the total power in the strip system while avoiding thermal runaway. The thermo-electrical model is a powerful tool to plan such an optimized cooling profile. In fact, the cooling ‘ramp’ scenario introduced in Section 7.1 is the result of such an optimization (see Fig. 15).

## 8. Model performance verification

The accuracy of the predictions of the thermo-electrical model is affected by two major factors: the quality of the input parameters, and the error introduced by reducing the complex 3D geometry into a linear thermal impedance network. The former has been discussed throughout this paper where the different inputs have been presented. For the latter, we have studied the agreement of predictions from the network model with the more accurate results obtained from FEA for selected states of the system.

To verify the level of this agreement, we have calculated the sensor temperature curve for a barrel EOS-type module up to thermal runaway, both in the full FEA and in the network model. For this exercise, we do not vary any of the input parameters in the model other than the sensor leakage power with its temperature dependence. The resistor values in the network model are the same as used throughout for our model, obtained as described in Section 4. For the power from the various electronics components, the FEAST efficiency and the TID scale factor, we have used representative nominal values.

Figure 15: (a) Sensor leakage current and (b) total power of the endcap R1-type module for eight different flat cooling profiles, ranging from $0^{\circ}\text{C}$ to $-35^{\circ}\text{C}$, as well as the cooling ramp scenario specified in Fig. 10b (dashed curve). The curves that are discontinued before year 14 correspond to scenarios that have reached thermal runaway. The cooling ramp scenario has been selected to minimize the module power while keeping the sensor leakage current stable throughout the lifetime of the ITk.

Since we keep the variable inputs in the model constant for this study, we can reduce the complex thermal network to its Thévenin equivalent, which is identical to the network studied in Ref. [4], and use the analytical expressions given there. The reduced network is described by the base temperature $T_0$, given by the coolant temperature and the temperature rise due to the front-end electronics alone, and the total thermal impedance $R_t$ from the sensor to the coolant. With the resistances and representative power numbers described above, the former is $-21.9^{\circ}\text{C}$ and the latter 1.132 K/W in the network model, compared to $-22.4^{\circ}\text{C}$ and 1.147 K/W obtained directly from the FEA. The comparison of the predicted sensor temperatures for both cases is shown in Fig. 16. Despite a large temperature variation of about $10^{\circ}\text{C}$ across the sensor, the network model predicts the runaway in good agreement with the FEA<sup>5</sup>. This gives us confidence that the use of a thermal network model is not likely to significantly degrade the predictions beyond the errors introduced by other inputs to the model.
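For a reduced network of this kind, the onset of runaway can be located without iterating to a steady state. The sketch below is our own illustration and not necessarily the form of the expressions in Ref. [4]. It assumes the steady state satisfies $T_S = T_0 + R_t\,Q(T_S)$ with $Q$ following Eq. (1), so that $dQ/dT_S = Q\,(2/T_S + T_A/T_S^2)$. Runaway then sets in where the line and the leakage curve are tangent, $R_t\,dQ/dT_S = 1$, which fixes the temperature rise at the critical point to $T_S^2/(2T_S + T_A)$. The reference leakage power `Q_REF` at `T_REF` is a hypothetical value, so the resulting number is not expected to reproduce the critical temperature quoted in this section.

```python
import math

T_A = 7000.0     # K, from Eq. (1)
R_T = 1.132      # K/W, network-model value quoted above
# Hypothetical illustration values (not taken from the paper):
Q_REF = 0.5      # W, leakage power at T_REF
T_REF = 253.15   # K (-20 degC)


def leakage_power(t_sensor):
    """Leakage power (W) at sensor temperature t_sensor (K), using Eq. (1)."""
    return Q_REF * (t_sensor / T_REF) ** 2 * math.exp(
        -T_A * (1.0 / t_sensor - 1.0 / T_REF))


def critical_point(lo=150.0, hi=400.0, n_iter=200):
    """Return (critical sensor temperature, critical base temperature) in K.

    Solves the tangency condition R_T * dQ/dT = 1 by bisection; the left-hand
    side grows monotonically with temperature, so the root is unique.
    """
    def slope_minus_one(t):
        return R_T * leakage_power(t) * (2.0 / t + T_A / t ** 2) - 1.0

    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if slope_minus_one(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    t_sensor_crit = 0.5 * (lo + hi)
    # Temperature rise at tangency: T_S^2 / (2 T_S + T_A)
    t_base_crit = t_sensor_crit - t_sensor_crit ** 2 / (2.0 * t_sensor_crit + T_A)
    return t_sensor_crit, t_base_crit


t_sensor_crit, t_base_crit = critical_point()
print(f"critical base temperature: {t_base_crit - 273.15:.1f} degC")
```

Any base temperature above `t_base_crit` leaves the steady-state equation without a solution, which is the runaway condition in this simplified picture.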

## 9. Conclusions

We have developed a model of the ATLAS ITk strip system that is based on the interplay between a thermal and an electrical network model. The set of equations in the model can be numerically solved using standard data analysis software in a short time, allowing for a quick turn-around for systematic studies of the system performance. The complexity of these networks is given by the number of interconnected components between the networks, many of which have a non-linear dependence on the temperature or electrical power. This approach can be easily adopted for any other silicon detector system.

In the case of the ATLAS strip system, several temperature-dependent heat sources had to be modeled. In addition to the sensor leakage current, these are the radiation-induced increase of the digital front-end power ('TID bump') and the efficiency of the DC-DC conversion system. The outputs of the model give us confidence that the ITk strip system will be thermally stable until the end of LHC Phase-II operation, even with the inclusion of safety factors on key inputs. Furthermore, the model provides information for benchmark system parameters like cooling, supply power and currents in power cables, which is used in the specification of these systems. The use of the model outputs throughout the strip project ensures consistent specifications, including a common strategy on safety factors. Using the thermo-electrical model, we can also propose an optimized cooling temperature ‘ramp’ scenario, which stabilizes leakage power throughout the lifetime of the experiment while minimizing the TID bump.

<sup>5</sup>The critical temperature here is $-12.4^{\circ}\text{C}$, which is higher than the numbers given in Section 7.2.2, because this study ignores temperature-dependent effects such as the FEAST efficiency, which can only be modelled in the network model.

Figure 16: (a) Thévenin equivalent of the thermal network. (b) Result of sensor surface temperature calculations using FEA. The EOS card is to the left and the cooling pipes run from top to bottom about a quarter of the module width from each edge. (c) Difference of average sensor and coolant temperature, comparing FEA (dots) and the network model prediction (line). The bars on the FEA data indicate minimum and maximum sensor temperature. The dotted vertical line indicates the critical temperature derived analytically using the network model ($-12.4^{\circ}\text{C}$).

We have verified the performance of the thermal network model compared to a full FEA treatment, and we are confident that the level of disagreement is smaller than the uncertainty introduced by the model inputs. Among the inputs, the most likely source of unknown error stems from the limitations in our understanding of the parametrization of the TID effect.

## 10. Acknowledgements

The evaluation of the thermo-electrical model depends critically on the input parameters to the model. To capture the whole of the system, these need to distill all that is known of the system, and we are therefore indebted to the whole of the ITk strip community. In particular, we would like to thank Tony Affolder, Kyle Cormier, Ian Dawson, Sergio Diez Cornell, Laura Gonella, Ashley Greenall, Alex Grillo, Paul Keener, Steve McMahon, Paul Miyagawa, Craig Sawyer, Francis Ward and Tony Weidberg for all their inputs to this work.

## References

- [1] ATLAS Collaboration, Technical Design Report for the ATLAS Inner Tracker Strip Detector, Tech. Rep. CERN-LHCC-2017-005, ATLAS-TDR-025, CERN, Geneva (Apr 2017).  
URL <https://cds.cern.ch/record/2257755>
- [2] Radiation induced effects in the ATLAS Insertable B-Layer readout chip, Tech. Rep. ATL-INDET-PUB-2017-001, CERN, Geneva (Nov 2017).  
URL <https://cds.cern.ch/record/2291800>
- [3] A. Affolder, B. Allongue, G. Blanchot, F. Faccio, C. Fuentes, A. Greenall, S. Michelis, DC-DC converters with reduced mass for trackers at the HL-LHC, Journal of Instrumentation 6 (11) (2011) C11035.  
URL <http://stacks.iop.org/1748-0221/6/i=11/a=C11035>
- [4] G. Beck, G. Viehhauser, Analytic model of thermal runaway in silicon detectors, Nucl. Instrum. Meth. A618 (2010) 131–138. doi:10.1016/j.nima.2010.02.264.
- [5] N. Lehmann, Tracking with self-seeded Trigger for High Luminosity LHC, Master’s thesis, Section of Electrical and Electronical Engineering, École Polytechnique Fédérale de Lausanne, Lausanne Switzerland (2014).  
URL <https://documents.epfl.ch/users/n/nl/nlehmann/www/SelfSeededTrigger_MasterThesis/SelfSeededTrigger_NiklausLehmann_Thesis.pdf>
- [6] Atlas experiment - radiation simulation public results [cited 2018-11-17].  
URL <https://twiki.cern.ch/twiki/bin/view/AtlasPublic/RadiationSimulationPublicResults#FLUKA_Simulations>
- [7] M. Smith, ABAQUS/Standard User's Manual, Version 6.9, Simulia, 2009.
- [8] ANSYS, Inc., Ansys academic research mechanical, release 18.2.  
URL <http://www.ansys.com/>
- [9] F. Faccio, G. Cervelli, Radiation-induced edge effects in deep submicron CMOS transistors, IEEE Transactions on Nuclear Science 52 (6) (2005) 2413–2420. doi:10.1109/TNS.2005.860698.
- [10] F. Faccio, H. J. Barnaby, X. J. Chen, D. M. Fleetwood, L. Gonella, M. McLain, R. D. Schrimpf, Total ionizing dose effects in shallow trench isolation oxides, Microelectronics Reliability 48 (7) (2008) 1000–1007, 2007 Reliability of Compound Semiconductors (ROCS) Workshop. doi:10.1016/j.microrel.2008.04.004.  
URL <http://www.sciencedirect.com/science/article/pii/S0026271408000826>
- [11] M. Mikestikova, Internal ATLAS communication. Marcela.Mikestikova@cern.ch.