

# Thermoelectric modeling of the ATLAS ITk Strip Detector

Graham Beck<sup>a</sup>, Kurt Brendlinger<sup>b</sup>, Yu-Heng Chen<sup>b</sup>, Georg Viehhauser<sup>c</sup>

<sup>a</sup>*Queen Mary University of London, London, UK*

<sup>b</sup>*Deutsches Elektronen-Synchrotron DESY, Notkestraße 85, 22607 Hamburg*

<sup>c</sup>*Oxford University, Oxford, England, UK*

## Abstract

In this paper we discuss the use of linked electrical and thermal network models to predict the behaviour of a complex silicon detector system. We use the silicon strip detector for the ATLAS Phase-II upgrade to demonstrate the application of such a model and its performance. In this example, the thermoelectric model is used to test design choices, validate specifications, and predict key operational parameters such as cooling system requirements. The model reveals insights into the interplay of conditions and components in the silicon module, can be used to optimize operational aspects such as the temperature profile over the lifetime of the experiment, and is a valuable tool for estimating the headroom to thermal runaway, all with very moderate computational effort.

*Keywords:* Silicon detector, Thermal runaway, Thermal management, Cooling

## 1. Introduction

The temperatures in silicon detector systems are critically important to their performance. Fundamentally, the leakage current of a silicon sensor has a pronounced temperature dependence

$$I \propto T_S^2 e^{-T_A/T_S}, \quad (1)$$

where  $T_S$  is the sensor temperature and  $T_A \simeq 7000$  K. Leakage currents can become particularly significant after irradiation of the silicon material. The heat generated by these leakage currents in the silicon sensor, together with the heat from front-end electronic components on the detector, needs to be removed by a cooling system. The capability of the cooling system to remove this heat is limited by the temperature of the local cold sink (typically a circulated fluid) and the thermal impedance of the heat path between the source (electronics and sensor) and the sink. Due to the strong growth of leakage power with temperature, there is a critical temperature  $T_{\text{crit}}$  above which the heat cannot be removed quickly enough, and the detector becomes thermally unstable ('thermal runaway')<sup>1</sup>. Understanding the thermal behaviour and the headroom to thermal runaway is crucial for the design of a silicon detector system. Furthermore, even before the limit of thermal stability is reached, knowledge of temperatures in silicon detector systems is important, as they have a major impact on key system parameters such as power supply capacity and cable dimensions.
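As a rough illustration of how steep the dependence in Eq. 1 is, the following minimal sketch computes the ratio of leakage currents at two sensor temperatures. The function name `leakage_scale` and the example temperatures are assumptions made for this illustration; only the functional form and $T_A \simeq 7000$ K come from the text.

```python
import math

T_A = 7000.0  # K, value quoted with Eq. 1

def leakage_scale(t_from_c, t_to_c, t_a=T_A):
    """Ratio I(t_to) / I(t_from) according to Eq. 1; temperatures in deg C."""
    t_from = t_from_c + 273.15  # convert to kelvin
    t_to = t_to_c + 273.15
    return (t_to / t_from) ** 2 * math.exp(-t_a * (1.0 / t_to - 1.0 / t_from))

# Example: warming a sensor from -15 degC to -8 degC (illustrative temperatures)
ratio = leakage_scale(-15.0, -8.0)
print(f"leakage current ratio: {ratio:.2f}")
```

With these inputs the current roughly doubles for a temperature increase of about 7 °C, which is the positive feedback that drives thermal runaway.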

In addition to the silicon, there can be aspects of the front-end electronics that have a temperature dependence. In the strip system for the ATLAS Phase-II upgrade [1], which is the subject of this case study, there are two additional temperature-dependent heat sources. The first is a radiation damage effect in the readout electronics, which leads to an increase in the digital power of the chip whose magnitude depends on the total ionisation dose (TID) and the temperature of the chip [1]. This phenomenon was first observed in the ATLAS IBL [2]. The other temperature dependence of a power source stems from the converter chip (FEAST [3]) used in the on-detector DC-DC converter system supplying power to the front-end electronics.

In principle, the temperatures in the system for a given set of operational parameters (power density, thermal conductivities, etc.) can be predicted by finite element analysis (FEA) to an accuracy that is limited only by the quality of the input parameters. However, this is a time-consuming process and can be prohibitively difficult if a number of local heat sources depend non-linearly on temperature. A simplification of this problem that allows for an analytical solution in the case of a simple heat source topology has been developed in [4]. Here we develop this method further to include several temperature-dependent non-linear heat sources in the front-end electronics. The resulting set of equations can no longer be solved analytically, but the solution can be found with little effort using numerical problem solvers. This enables us to predict with some confidence the temperatures and power requirements in the ATLAS strip system throughout Phase-II operation. The results from this prediction have been used throughout the ATLAS strip project to consistently dimension the different systems (cooling, power, services, etc.), including an appropriate margin due to the inclusion of a common set of safety factors. This method can be easily adapted to any other system by adjusting the model to the system-specific geometries and parameters.

<sup>1</sup>In a real detector system, the resulting growth of sensor temperature would be arrested by overcurrent limits in the power supplies, resulting in a reduction of the bias voltage. At the same time, the increased current leads to an increase of the noise, such that the overall result is a degradation of the S/N performance of the system.

### 1.1. The ATLAS strip system

The strip system for the ATLAS Phase-II upgrade [1] consists of two parts: the barrel system, comprising four concentric cylindrical barrels, and two endcaps consisting of six disks each.

In the barrel, the detector modules are made of square sensors ($96.85 \times 96.72 \text{ mm}^2$) with a hybrid on top, which hosts the front-end chips (ABC130 [5] and HCC [1]) as well as circuitry to convert the supply voltage of more than 10 V to the chip voltage of 1.5 V. This circuitry is controlled by the FEAST chip. The modules are glued onto both sides of a composite sandwich that contains two parallel thin-wall titanium cooling pipes embedded in carbon foam (Allcomp K9 - ref) between two facesheets of UHM carbon fibre (3 layers of K13C2U/EX1515) with a co-cured Kapton/copper low-mass tape. A model of this geometry is shown in Fig. 1. During final operation, cooling will be achieved by evaporating CO<sub>2</sub> in the cooling pipes with a final target temperature no higher than $-35^\circ\text{C}$ anywhere along the stave.

The geometry of the stave is uniform along its length, with the exception of the end region of the stave, where an End-Of-Structure (EOS) card is mounted on both surfaces. The EOS card shares part of its heat path with the module; underneath the EOS, the thermal path is degraded by the presence of electrically-insulating ceramic pipe sections. The thermal and electrical properties of a module adjacent to the EOS card (hereafter referred to as an ‘EOS module’) are sufficiently different from other modules along the length of the stave (‘normal modules’) to warrant separate treatment in the thermoelectric model of the barrel.



Figure 1: Strip barrel local support geometry. On the left, a complete stave is shown (EOS card in the foreground). The right picture shows a cross-section of the stave with the two cooling pipes visible inside the core.

The endcap system consists of two endcaps composed of 6 disks each. Each disk contains 32 ‘petals,’ the local substructure depicted in Fig. 2. Both sides of the petal are loaded with 6 silicon modules, each with a distinct design, located at increasing radius from the beam pipe and labeled R0 through R5 (where ‘R’ stands for ring). Each endcap module consists of one or two irregularly-shaped silicon sensors, and a varying number of front-end chips and DC-DC converters. The EOS card is located adjacent to the R5 module, but the cooling pipes run directly underneath it without a shared heat path, in contrast to the barrel EOS. The remaining module and petal core design details are largely identical to the barrel module description above.

Figure 2: Endcap strip geometry

### 1.2. Radiation environment

A key input to the calculation is the radiation environment of the strip system, as several inputs depend on radiation damage effects. The sensor leakage current can be parametrized as a function of the fluence expressed in 1 MeV neutron-equivalents, and the TID effect on the digital chip current will be described as a function of the total ionizing dose rate (more details on its dependencies can be found in Section 7).

Predictions for both of these parameters at each point in the ITk have been generated using the FLUKA particle transport code and the PYTHIA8 event generator (Fig. 3) [6]. Both distributions display a weak dependence on $z$ in the barrel, whereas they vary significantly along $r$ and $z$ over the length of the endcap petals. Because of this, and because of the linear uniformity of the stave design compared to the more complex geometry along a petal, we modelled only two types of modules for the barrel (a generic module along the linear part of the stave and the module next to the end-of-structure card), but six different types of modules in a petal.



Figure 3: ATLAS ITk radiation environment. 1 MeV neutron equivalent fluence (left) and total ionizing dose (right). Both plots are for an integrated luminosity of  $4000 \text{ fb}^{-1}$  [6].

## 2. The electrical model

The electrical model consists of low-voltage (LV) and high-voltage (HV) circuits, depicted in Fig. 4. The LV supply (11 V) is used to power the hybrid controller chips (HCCs), ATLAS Binary Chips (ABCs) and Autonomous Monitoring and Control chip (AMAC) located on PCBs that are glued directly onto the surface of the sensor. These chips require between 1.5 and 3.3 V, which is provided by the temperature-dependent FEAST DC-DC converter (labeled bPOL12V in Fig. 4) and an LDO regulator (labeled bPOL12). The number of chips and converters on each module varies according to the design of each module type (barrel short-strip and long-strip modules, and six different endcap module designs). A barrel or endcap module contains 10–28 ABC chips, 1–4 HCCs, and 1–2 of each of the other components (linPOL12V/bPOL12V/AMAC).

The low-voltage current is also delivered to the EOS card to power various data transfer components (the GBLD, LpGBT and GBTIA). A FEAST identical to the one used on the module is used to step the voltage down from 11 V to 2.5 V, and an additional LDO regulator brings the voltage down further for some components. Some modules (the short-strip barrel staves) contain two GBLD and LpGBT chips.

Figure 4: The electrical model of the ITk Strip barrel and endcap modules. Green arrows represent temperature-dependent heat sources, while orange arrows are temperature-independent. Grey squares are chips.

The bus tape, which carries both LV and HV currents, has a small ohmic resistance, which impacts the module in two ways. First, the tape itself generates some heat according to the amount of current passing through it; this source of heat is accounted for in the model, although its contribution to the total module power is negligible. Second, due to the voltage loss along the traces, a slightly lower voltage is supplied to modules at the end of the substructure farthest from the EOS. The treatment of this effect is slightly different in the barrel and endcap models: in the barrel, the voltage delivered to every module is averaged to 10.5 V; in the endcap, the $\Delta V$ is estimated based on the calculated expected power loss along the tape for each module. In both cases, the impact of using a different treatment is small.

Finally, the HV current provides the voltage bias on the silicon sensors. An HV multiplexer switch (HVMUX) can be used to disconnect the sensor from the bias line (it requires a $10\text{ M}\Omega$ resistor parallel to the sensor in order to function). Two HV filters with an effective resistance of $10\text{ k}\Omega$ are situated in series with the sensor. The nominal operating voltage of the sensor is expected to be 500 V, but the system is designed to handle a voltage bias of up to 700 V.

## 3. The thermal model

The thermal network consists of heat sources (some of which are temperature-dependent) and thermal resistances. The latter are given by the properties of the mechanical design (heat conductivities of the materials) and the geometry of the heat path. The geometry is generally 3-dimensional, but it is the strategy of the simple network models to lump the 3-dimensional behaviour into one thermal resistance parameter. In the models discussed here, we have used a granularity corresponding to single detector modules for which the thermal resistance has been modelled. The temperatures in the model are then given for the nodes in the network in analogy to the potentials in an electrical network.<sup>2</sup>

The complexity of the thermal network used in this study (see Fig. 5) stems from the variety of temperature-dependent heat sources in the ATLAS strip system. These sources consist of the digital power for each type of chip, the heat generated by the FEAST chip providing the on-detector DC-DC conversion, and the sensor leakage currents. In the ATLAS ITk strip modules, all of these components are located on top of the sensors, such that the heat generated in them flows through the sensor into the support structure, the stave (barrel) or petal (endcap) core with the embedded cooling pipe. In the network model, the heat flow from these sources is combined and flows through a common impedance $R_M$ to the sink at a temperature $T_C$. For each of the temperature-dependent heat sources (ABC, HCC, FEAST and the sensor) we have added a resistance from the common temperature $T_{mod}$ to allow for a finite and different heat path for each of them. Finally, the End-of-substructure (EOS) card adjacent to the last module on the barrel stave (endcap petal) is modeled as an additional source of heat with an independent impedance for its unique thermal path.

---

<sup>2</sup>Historically, Fourier's description of heat conduction pre-dated and inspired Ohm's work on electrical resistive networks. Here we followed the opposite direction.



Figure 5: Thermal network model.

This is a more complex thermal network than the one studied in Ref. [4], for which an analytical solution for the determination of thermal stability is given. In particular, because of the non-linear temperature dependence of some of the heat sources, it is not possible in the present case to solve the set of equations describing the model analytically. However, the set of equations is still sufficiently small to solve numerically using programming languages such as Mathematica (used in the barrel model) or Python (used in the endcap system).

## 4. Obtaining thermal impedances using FEA

The cooling path between the sources dissipating electrical power and the cooling fluid is 3-dimensional and includes components with orthotropic thermal conductivity. Hence the prediction of temperature at any node of the model requires a 3D thermal FEA [7, 8]. However, the thermal conductivities of the components along the path are approximately constant, so that the temperature rise $\Delta T_i$ above the coolant temperature of any node $i$ ($i = \text{ABC}, \text{HCC}, \text{AMAC}, \text{FEAST}, \text{tape}, \text{RHV}$, or sensor) in the thermal network model is adequately described by a linear sum of contributions from individual sources, i.e.:

$$\Delta T_i \equiv T_i - T_C = R_i P_i + (R_C + R_M) \sum_j P_j, \quad (2)$$

where we have momentarily ignored the EOS contribution.
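The linear superposition in Eq. 2 can be evaluated directly once the impedances are known. The sketch below is a minimal illustration with made-up powers and impedances (they are not the ITk values); the function name `node_temperature_rises` and the dictionary layout are assumptions of this example, and $R_C + R_M$ is passed as a single number `r_cm`.

```python
def node_temperature_rises(powers, r_own, r_cm):
    """Eq. 2: dT_i = R_i * P_i + (R_C + R_M) * sum_j P_j, for every node i.

    powers: heat dissipated in each node in W
    r_own:  node-specific impedance R_i in degC/W
    r_cm:   common impedance R_C + R_M in degC/W
    """
    total_power = sum(powers.values())
    return {node: r_own[node] * p + r_cm * total_power for node, p in powers.items()}

# Illustrative inputs only, chosen for easy arithmetic
powers = {"ABC": 3.0, "HCC": 0.4, "FEAST": 1.0, "sensor": 0.2}
r_own = {"ABC": 1.0, "HCC": 12.0, "FEAST": 15.0, "sensor": 0.5}
rises = node_temperature_rises(powers, r_own, r_cm=1.2)

for node, dt in rises.items():
    print(f"{node:>6}: {dt:5.2f} degC above coolant")
```

Every node shares the same common term, so a small-footprint component with a large $R_i$ can run much hotter than the sensor even at modest power.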

In order to extract the thermal impedances for the thermal network model, the finite element model is run multiple times, with each heat source (or group of similar sources) switched on in turn with a representative amount of heat. In each of these cases, the temperature is calculated for all nodes in the thermal network model (Figure 5). The temperature of a node is here taken as the average of the temperatures for all the gridpoints in the FEA model within the volume of the object corresponding to the node<sup>3</sup>. The thermal impedances are then obtained from a fit of Eq. 2 using the temperature data for all nodes for all cases of heat injection.

Because of the nature of the network, the fitted value for the common impedance  $R_{CM} = R_C + R_M$  is determined by the observed temperature rises of components where no heat is injected. The linearity of this relationship is illustrated in Fig. 6. The values for the additional component-specific impedances are in turn driven by the observed temperature rise in the component with the heat injected in that particular component.
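The extraction described above can be sketched in a few lines. This is a simplified two-step version under stated assumptions, not the simultaneous fit of Eq. 2 used for the results: the common impedance is taken as the through-origin least-squares slope over all unheated nodes, and each node-specific impedance is whatever remains of the heated node's own rise. The function name `fit_impedances`, the data layout and the synthetic "FEA" numbers are all hypothetical.

```python
def fit_impedances(runs):
    """Estimate R_CM and the R_i from single-source heat-injection runs.

    runs: list of (heated_node, power_W, {node: temperature_rise_degC})
    """
    # R_CM: slope of dT versus injected power for nodes that were NOT heated
    sum_xy = sum_xx = 0.0
    for heated, power, rises in runs:
        for node, dt in rises.items():
            if node != heated:
                sum_xy += power * dt
                sum_xx += power * power
    r_cm = sum_xy / sum_xx

    # R_i: rise of the heated node beyond the common contribution
    r_own = {heated: rises[heated] / power - r_cm for heated, power, rises in runs}
    return r_cm, r_own

# Synthetic data generated from R_CM = 1.2 and R_i = 1.0, 12.0, 0.5 degC/W
runs = [
    ("ABC", 3.0, {"ABC": 6.6, "HCC": 3.6, "sensor": 3.6}),
    ("HCC", 0.5, {"ABC": 0.6, "HCC": 6.6, "sensor": 0.6}),
    ("sensor", 1.0, {"ABC": 1.2, "HCC": 1.2, "sensor": 1.7}),
]
r_cm, r_own = fit_impedances(runs)
print(r_cm, r_own)
```

With real FEA output the unheated nodes will not lie exactly on one line, and the scatter around the fitted slope gives a feeling for how well the linear network describes the module.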



Figure 6: The relationship between the temperature rise observed in the FEA for a specific component and the heat injected in another component. The slope of the fitted line is the estimate for  $R_{CM}$ . (a) The fit for a short-strip barrel module adjacent to the EOS. (b) The fit for the endcap R0 module. For each data point marker, the source of power is indicated by the shape, and the measured component is indicated by the color. The blue band represents a ±20% error band on the fit for  $R_{CM}$ .

For a barrel module, the agreement of the network temperatures using the thermal impedances from the fit with the data from FEA is better than 0.5°C for all nodes. This procedure is performed for both an EOS module and a normal module. The thermal impedance from the sensor to the sink ( $R_{CM}$ ) is consistently between 1.1 and 1.4 °C/W, but higher values (between 10 and 20 °C/W) are found for other impedances in the network ( $R_{HCC}$  and  $R_{FEAST}$ ), mostly because these are for components with a small footprint constituting a bottleneck for the heat flow.

For the endcap modules, the procedure to determine the thermal impedances is performed for each of the 6 module types. $R_{CM}$ ranges from 0.6 to 1.4 °C/W, with the other impedances lying between 5 and 20 °C/W. Because the location of powered components is more irregular on an endcap module, the difference between the predicted temperatures of the linear network and the FEA can reach up to 1.2 °C for key temperature-dependent nodes. To compensate for this additional degree of uncertainty, the thermal impedance safety factor is increased by a factor of 2 compared to the barrel modules (see Section 7.2).

There are two recognised departures from linearity of the thermal path: the rise in thermal conductivity of the silicon sensor with decreasing temperature, and the rise in heat transfer coefficient (HTC) of the evaporating CO<sub>2</sub> coolant with increasing thermal flux. The FEA models are run using mean values for these quantities appropriate to the operating conditions, and the thermoelectric model results are insensitive to the variations expected in practice. However, if this level of realism is required and if reliable parametrizations for these dependencies can be obtained, then inclusion of such variations in the model is possible.

<sup>3</sup>This is particularly interesting in the case of the sensor, which fills a large volume, with a potentially large range of temperatures. Ref. [4] is not clear about the exact definition of the sensor temperature to be used for the calculation of the thermal impedance. In fact, at the time we were still using the maximum sensor temperature for the calculation of thermal stability. Since then we have acquired more experience with thermal network models and found that the best agreement is achieved if the average sensor temperature is used.

## 5. Other model inputs

The two temperature-dependent elements of the thermoelectric model (the radiation-induced digital current increase in the front-end chips, and the efficiency of the FEAST DC-DC converter) are described in this section. Both effects are studied experimentally and fitted with functional forms in order to accurately represent them in the model. The uncertainties in the experimental data and in our modeling assumptions are estimated here and considered in the evaluation of safety factors, described in detail in Section 7.2.

### 5.0.1. DC-DC converter

The DC-DC converter (FEAST) supplies a low-voltage (1.5 V) current to the ABC130 and HCC front-end chips on the module. The efficiency of the FEAST depends on its temperature as well as on the output (load) current delivered to the front-end chips. To correctly model the FEAST efficiency, experimental measurements have been performed to characterize this dependence and fitted with a functional form.

To measure the FEAST efficiency, the FEAST power board was glued to an aluminum cold plate, cooled with CO<sub>2</sub>, and powered with the nominal working input and output voltages (11 V input, 1.5 V output). The temperature of the FEAST was measured with an NTC thermistor and PTAT sensor residing on the FEAST, for a range of load currents up to the maximum design current of 4 A<sup>4</sup>.

The data were then fitted with a function with sufficient parameters to ensure reasonable agreement; the choice of functional form has no physical interpretation. Figure 7 depicts the FEAST efficiency data and the parameterized fit used in the model. The parameterization fits the data with an accuracy better than 1%; this uncertainty in the FEAST efficiency modeling is small compared to other uncertainty sources, and is therefore neglected in our model.
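The efficiency enters the model through a simple energy balance: whatever the converter draws from the LV line but does not deliver to its load is dissipated as heat in the converter. The sketch below shows this bookkeeping. The function name `feast_power_budget` and the efficiency of 0.75 are illustrative assumptions and do not come from the measured parameterization; in the model the efficiency would be evaluated from the fit at the current FEAST temperature and load.

```python
def feast_power_budget(v_in, v_out, i_load, efficiency):
    """Power bookkeeping for a DC-DC stage with a given efficiency (0 < eff <= 1)."""
    if not 0.0 < efficiency <= 1.0:
        raise ValueError("efficiency must be in (0, 1]")
    p_out = v_out * i_load          # power delivered to the front-end chips, W
    p_in = p_out / efficiency       # power drawn from the LV line, W
    return {
        "p_out": p_out,
        "p_in": p_in,
        "heat": p_in - p_out,       # dissipated in the converter itself, W
        "i_in": p_in / v_in,        # current drawn from the LV line, A
    }

# Nominal voltages from the text; load current and efficiency are assumed values
budget = feast_power_budget(v_in=11.0, v_out=1.5, i_load=2.0, efficiency=0.75)
print(budget)
```

Because the efficiency depends on the FEAST temperature, and the dissipated heat in turn raises that temperature, this stage is one of the feedback loops that the network model has to resolve numerically.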



Figure 7: The FEAST efficiency model based on experimental data. (a) The experimental data points characterizing the FEAST efficiency are plotted as dots and color coded for load current. The data is compared to the analytic fit, evaluated in curves of equal current. (b) The same analytic fit, presented as a function of current load for curves of equal temperature.

### 5.0.2. Digital current increase of chips using 130 nm CMOS technology

The ABC and HCC chips, designed using IBM 130 nm CMOS 8RF technology, are known to suffer from an increase in digital current when subjected to a high-radiation environment [1]. This phenomenon, known as the “TID bump,” is well-studied [9, 10] and has a characteristic shape whereby the effect reaches a maximum as a function of the accumulated dose and then gradually diminishes (see Fig. 8).

<sup>4</sup>FEAST datasheet: [http://project-dcdc.web.cern.ch/project-dcdc/public/Documents/FEASTMod_Datasheet.pdf](http://project-dcdc.web.cern.ch/project-dcdc/public/Documents/FEASTMod_Datasheet.pdf).



Figure 8: Parametrization of the impact of the total ionizing dose on the magnitude of the front-end chip digital current (the TID bump), presented as a function of time. The current is multiplied by a scale factor that is modeled as a function of total ionizing dose, dose rate, and temperature, based on experimental data.

In an effort to characterize the nature of the TID bump in the ABC and HCC chips empirically, many irradiation campaigns have been conducted using a variety of radiation sources, testing the effect at different temperatures and dose rates. The data collected from these studies were used to develop a model of the TID bump that estimates the digital current increase given the total ionizing dose, the dose rate, and the operating temperature of the chip. This parameterization, which is depicted in Fig. 8, is used as an input to the thermoelectric model in order to correctly model the ABC and HCC currents. The TID bump is assumed to apply fully to the HCC digital current, and to 69% of the ABC digital current (according to our understanding of its digital circuitry).

The TID bump displays certain key features, which are reflected in the parameterization. First, the effect is larger at colder temperatures and higher dose rates. This means it can be mitigated by operating the chips at higher temperature (note that the dose rate is fixed by the LHC conditions). Second, Fig. 8 also illustrates how chips receiving different dose rates will reach their maximum digital current increase at different times. This feature is particularly important when modeling the total power consumed by the barrel and endcap systems. In both systems, the dose rate varies significantly depending on the position of the module in the detector. As a result, the maximum system power will be smaller than the sum of the maximum power of each module, as each chip reaches its maximum at a different point in time.
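The last point can be illustrated with a toy calculation. The bump shape used below is a hypothetical stand-in chosen only because it rises to a single maximum and then decays; it is not the parameterization of Fig. 8, and the peak months and heights are invented for this example.

```python
import math

def toy_bump(month, peak_month, height):
    """Toy current scale factor that peaks at 1 + height when month == peak_month."""
    x = month / peak_month
    return 1.0 + height * x * math.exp(1.0 - x)

months = range(0, 169)  # 14 years in monthly steps

# Two hypothetical module groups whose bumps peak at different times
inner = [toy_bump(m, peak_month=12, height=1.0) for m in months]
outer = [toy_bump(m, peak_month=48, height=1.0) for m in months]

sum_of_peaks = max(inner) + max(outer)
peak_of_sum = max(a + b for a, b in zip(inner, outer))

print(f"sum of individual maxima: {sum_of_peaks:.2f}")
print(f"maximum of the sum:       {peak_of_sum:.2f}")
```

Dimensioning a power supply from the sum of individual maxima would therefore overestimate the requirement, which is why the modules are simulated over time and then combined.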

The TID bump is an important source of uncertainty in our model. The experimental data suggest a relatively large variation in the TID bump effect, in particular between different batches of the same type of chip delivered by the manufacturer, which points to an unknown effect in the fabrication process. To estimate the uncertainty in the TID bump, the parameterized function is fitted again using only the worst-performing data (defined as having the largest TID bump effect). This “pessimistic” parameterization is used as a safety factor to estimate the detector performance in worst-case scenarios.

The irradiations of individual chips have typically been performed at constant dose rate and temperature. However, both of these parameters will vary as a function of time in the scenarios that we attempt to model. In our current parametrization, we use only the instantaneous value of these two parameters, thus neglecting any possible history of the TID effect for a given chip. We also ignore any short-term effects due to variations in the dose rate on the scale of hours or days. This approach is mandated by the lack of more varied experimental data and the absence of a good theoretical model for this effect. This probably constitutes the largest source of unknown error for our model.

### 5.0.3. Radiation-dependent leakage current

The radiation-induced leakage current can be parametrized as a function of the hadron fluence expressed in 1 MeV equivalent neutrons. The parametrizations we have used for the evaluation of our model are shown in Fig. 9 for a reference sensor temperature of $-15^{\circ}\text{C}$. In our model, the leakage current is scaled to a given sensor temperature using Eq. 1.



Figure 9: Parametrization used for the leakage current at  $-15^{\circ}\text{C}$  as a function of the fluence for two different sensor bias voltages [11].

## 6. Running the model

The thermoelectric model constructs a profile of the sensor module operation conditions over the lifetime of the detector in the following manner. First, the total module power (including all components, but excluding the sensor leakage power) and the sensor temperature assuming no leakage current ($T_0$) are calculated using a reasonable set of initial component temperatures. The initial value for the module power is used to solve for the sensor power and temperature accounting for leakage current, using the thermal balance equation and the relationship from Eq. 1. Using this calculated sensor leakage current and temperature, the power and temperature of the module components are updated given the initial (year 0, month 0) startup parameters.

Next, the module conditions of the following month (year 0, month 1) are calculated. Using the component temperatures calculated from the previous month and the operational parameters (ionizing dose and dose rates) from the current month, the module total power (excluding sensor leakage) is again calculated, and subsequently the sensor temperature and leakage current are computed. Following this, the module component temperatures and power values are derived for this month. This process is repeated in one-month steps until the final year of operation, or until a real solution for the sensor temperature does not exist, indicating that thermal runaway conditions have been reached.
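The core of this procedure, solving the sensor thermal balance each month and stopping when no real solution exists, can be sketched as follows. This is a heavily reduced illustration under stated assumptions and not the Mathematica or Python implementation used for the results: the non-sensor power is held constant, a single hypothetical resistance `r_th` connects the sensor to the coolant, the leakage power at the reference temperature grows linearly with time, and all numerical values are invented. The solver scans upward from $T_0$ for the first sign change of the balance and refines it by bisection, so a solution interval narrower than the scan step could be missed.

```python
import math

T_A = 7000.0    # K, from Eq. 1
T_REF = 258.15  # K, reference temperature of -15 degC for the leakage input

def leakage_power(t_s, p_ref):
    """Leakage power at sensor temperature t_s (K), scaled from p_ref via Eq. 1."""
    return p_ref * (t_s / T_REF) ** 2 * math.exp(-T_A * (1.0 / t_s - 1.0 / T_REF))

def solve_sensor_temperature(t_0, r_th, p_ref, step=0.05, span=60.0):
    """Stable root of T = t_0 + r_th * P_leak(T), or None if none is found."""
    def balance(t):
        return t_0 + r_th * leakage_power(t, p_ref) - t

    lo = t_0
    for k in range(1, int(span / step) + 1):
        hi = t_0 + k * step
        if balance(hi) <= 0.0:           # heat removal has caught up: root bracketed
            for _ in range(60):          # bisection refinement
                mid = 0.5 * (lo + hi)
                if balance(mid) > 0.0:
                    lo = mid
                else:
                    hi = mid
            return 0.5 * (lo + hi)
        lo = hi
    return None                          # no real solution: thermal runaway

def run_lifetime(months, t_coolant, r_th, p_other, p_ref_per_month):
    """Step through the lifetime; returns (history, runaway_month or None)."""
    history = []
    for month in range(months):
        t_0 = t_coolant + r_th * p_other      # sensor temperature without leakage
        p_ref = p_ref_per_month * month       # toy growth of leakage with time
        t_s = solve_sensor_temperature(t_0, r_th, p_ref)
        if t_s is None:
            return history, month
        history.append((month, t_s - 273.15, leakage_power(t_s, p_ref)))
    return history, None

# Invented inputs: -35 degC coolant, 1.2 K/W, 5 W of non-sensor power
history, runaway = run_lifetime(168, 238.15, 1.2, 5.0, 0.05)
print("runaway month:", runaway, "final sensor temperature:", round(history[-1][1], 2))
```

In the full model the non-sensor power is itself recomputed every month from the component temperatures of the previous month, the dose and the dose rate, which is what couples the electrical and thermal networks.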

In the barrel subsystem, the above procedure is performed four separate times to represent the radiation conditions of the four barrel layers located at different radii from the beam axis<sup>5</sup> for both a normal and an EOS-type module. Thus, eight modules are simulated in total for the barrel (4 layers $\times$ normal/EOS), and they are combined in their proper proportion to simulate the entire barrel system.

In the endcap subsystem, the total ionizing dose and dose rates vary significantly depending on the position of the module; furthermore, the design of each module on a petal differs significantly. Therefore, all 36 module types (6 rings $\times$ 6 disks) are simulated independently, and combined to represent the full endcap.

We have implemented this algorithm in Mathematica (barrel) and Python (endcaps). In both cases, the calculation for a set of operating conditions over the full lifetime of the LHC takes between 5 and 10 minutes on a standard PC, thus enabling a quick turn-around for systematic studies of the parameter space.

---

<sup>5</sup>The correct module type, short-strip in the inner two layers and long-strip for the outer two layers, is used for each layer.

## 7. Outputs of the thermoelectric model

The thermoelectric model provides a wide range of predictions for the operation of the strip system. A detailed discussion of all results would only be of interest to ITk strip system experts and is beyond the scope of this article. Instead, we present here a subset of results to demonstrate the capabilities and use of the thermoelectric model for the design of the detector system.

### 7.1. Operational scenarios

To study the different aspects of our predictions for the operation of the ITk strip system throughout its lifetime, we performed the calculation of the system parameters over the expected 14 years of operation in monthly steps as outlined in Section 6. Time-dependent inputs to the calculations were taken from the expected performance of the LHC (Fig. 10a) and different profiles for the cooling temperature. We studied flat cooling temperature scenarios at different temperatures with the lowest being $-35^{\circ}\text{C}$, the lowest evaporation temperature achievable with the ITk evaporative $\text{CO}_2$ cooling system, and a ‘ramp’ scenario in which the cooling temperature starts at $0^{\circ}\text{C}$ and is gradually lowered to $-35^{\circ}\text{C}$ over the course of 10 years (Fig. 10b).



Figure 10: (a) Expected LHC performance and (b) ‘cooling ramp’ scenario for the coolant temperature. Year-long shutdowns of the LHC are anticipated in years 5 and 9.

### 7.2. Safety factors

To ensure the robustness of the system design against errors in the assumptions used in the model, we also evaluate the model using a set of input parameters with some key inputs degraded. The set of safety factors used is given in Table 1. Each safety factor has been estimated individually, based on experience, on the complexity of the system aspect described by the parameter, and on the available data (or the absence of such data). The model can also be evaluated with all the safety factors listed in Table 1 applied together; this combination is unlikely to occur in the real system and provides a worst-case estimate for the performance of the ITk strip system. The individual effects of the different safety factors are demonstrated in Fig. 11.
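One way to organize such an evaluation is to express Table 1 as a set of overrides on the nominal inputs, applied either one at a time or all together. The sketch below is illustrative only: the `ModelInputs` container and its field names are hypothetical, and the percentages in Table 1 are interpreted here as multiplicative increases of the nominal values (an assumption).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ModelInputs:
    """Hypothetical container; scale factors multiply the nominal inputs."""
    fluence_scale: float = 1.0
    thermal_impedance_scale: float = 1.0
    digital_current_scale: float = 1.0
    analog_current_scale: float = 1.0
    tape_impedance_scale: float = 1.0
    bias_voltage_v: float = 500.0           # nominal bias voltage (Table 1)
    tid_parametrization: str = "nominal"


def safety_factor_overrides(region: str) -> dict:
    """Table 1 as field overrides; region is 'barrel' or 'endcap'."""
    return {
        "fluence_scale": 1.50,
        "thermal_impedance_scale": {"barrel": 1.10, "endcap": 1.20}[region],
        "digital_current_scale": 1.20,
        "analog_current_scale": 1.05,
        "tape_impedance_scale": 1.10,
        "bias_voltage_v": 700.0,
        "tid_parametrization": "pessimistic",
    }


def single_factor_variants(nominal: ModelInputs, region: str) -> dict:
    """One degraded input at a time (cf. the per-factor curves of Fig. 11)."""
    overrides = safety_factor_overrides(region)
    return {name: replace(nominal, **{name: value})
            for name, value in overrides.items()}


def all_factors(nominal: ModelInputs, region: str) -> ModelInputs:
    """All safety factors applied at once (the worst-case evaluation)."""
    return replace(nominal, **safety_factor_overrides(region))
```

Keeping the overrides in one place is what makes it straightforward to use the same agreed safety factors in every study.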

*7.2.1. Module properties*

Several module properties predicted by the thermo-electrical model are shown in Figs. 12 and 13. The different radiation-dependent effects occur on different timescales. The maximum in the digital chip power due to the TID effect occurs relatively early (in years 1 to 4), although the bump has a long tail, particularly in the outer layers of the barrel. The sensor leakage power, on the other hand, grows towards the end of the lifetime of the ITk. If the leakage current continued to increase with further irradiation, or if the cooling temperature were raised, this growth would ultimately lead to thermal runaway. Due to the radial dependence of the radiation environment, the radiation-induced effects are most pronounced in the innermost layers.

Table 1: Safety factors.

| Safety factor on          | Value                  | Reason                                                                       |
|---------------------------|------------------------|------------------------------------------------------------------------------|
| Fluence                   | 50%                    | Accuracy of fluence calculations and uncertainties in material distributions |
| Thermal impedance         | 10% barrel, 20% endcap | Local support build tolerances, thermal network assumptions                  |
| Digital current           | 20%                    | Final chip performance and parametrization of TID effect                     |
| Analog current            | 5%                     | Final chip performance                                                       |
| Tape electrical impedance | 10%                    | Electrical tape manufacturing tolerances                                     |
| Bias voltage              | 700 V                  | Increased bias voltage from nominal 500 V to maintain S/N                    |
| TID parametrization       | Nominal/Pessimistic    | Different data sets for fit of TID bump                                      |



Figure 11: Comparing the impact of different safety factors on (a) the sensor temperature and (b) the module power for the R3 endcap module. The dotted line depicts the effect of all safety factors applied at once.

### 7.2.2. System properties

One of the key concerns for the design of the strip system is its thermal stability. If the cooling temperature is too high to limit the leakage power from the radiation-damaged sensors to a level where the heat can still be removed, the system is unstable (it goes into ‘thermal runaway’). In the endcap strip system, this occurs at a cooling temperature of $-15^{\circ}\text{C}$ under nominal conditions; in this scenario, thermal runaway would be reached in the 12<sup>th</sup> year of operation. With safety factors applied, thermal runaway would occur at a cooling temperature of $-25^{\circ}\text{C}$ (in year 11). In the barrel system, where the radiation environment is slightly less intense, the conditions for thermal runaway occur at the same cooling temperatures but two years later than in the endcaps. As the design cooling temperature of the ITk cooling system is $-35^{\circ}\text{C}$, we have confidence that the ITk strip system has a sufficient margin for thermal stability.
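The runaway condition can be made explicit for the simplest case of a single thermal impedance between sensor and coolant. The notation used here is introduced for illustration only: $R_t$ is the thermal impedance, $T_0$ the sensor temperature in the absence of leakage power, and $Q(T_S)$ the leakage power at fixed bias voltage, which follows the temperature dependence of Eq. (1). The heat balance then reads

$$T_S = T_0 + R_t\,Q(T_S), \qquad Q(T_S) \propto T_S^2\, e^{-T_A/T_S}.$$

A stable solution exists only as long as the heating curve does not rise faster than the cooling line, so the limit of stability is reached when

$$R_t \left.\frac{dQ}{dT_S}\right|_{T_{\text{crit}}} = R_t\,Q(T_{\text{crit}})\,\frac{T_A + 2\,T_{\text{crit}}}{T_{\text{crit}}^2} = 1.$$

Combining the two expressions gives

$$T_{\text{crit}} - T_0 = \frac{T_{\text{crit}}^2}{T_A + 2\,T_{\text{crit}}}.$$

For $T_A \simeq 7000$ K and $T_{\text{crit}} \approx 250$ K the right-hand side is about 8 K. In this simplified picture the sensor can therefore self-heat by only about 8 K before stability is lost, which illustrates why the margin in cooling temperature matters so much.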

Beyond the issue of stability, the thermo-electrical model delivers predictions for the development of current and power requirements for the overall system. Some of the predictions are shown in Fig. 14. Again, the different timescales of the various radiation-induced effects are visible; ignoring this time dependence could lead to overspecification of some system aspects like the total cooling power.

The predictions from this model are now used throughout the strip project to consistently size the power supply and cooling systems. Including safety factors in the predictions gives us some confidence that the designs are robust; by using commonly agreed safety factors, we ensure a consistent use of safety factors throughout the project and prevent safety factor creep.

Figure 12: Examples of barrel module performance predictions for a flat cooling scenario ($-30^{\circ}\text{C}$) including safety factors. (a) Power per module. (b) Temperatures for different nodes of an end-of-stave barrel module in the innermost barrel.

Figure 13: Examples of barrel module performance predictions for the ramp cooling scenario including safety factors. (a) Sensor temperature in the innermost barrel modules. (b) Power in an end-of-stave barrel module in the innermost layer. The discontinuities in years 5 and 9 are due to anticipated year-long shutdowns of the LHC.

Because of the different timescales for the peak power due to the TID effect and the radiation-induced sensor leakage, there is room to optimize the cooling temperature profile for minimal total power in the strip system. The thermo-electrical model is a powerful tool to plan such an optimized cooling profile. In fact, the cooling ‘ramp’ scenario introduced in Section 7.1 is the result of such an optimization (Fig. 15).
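The leakage-levelling part of such an optimization can be illustrated with Eq. (1) alone. The sketch below is a deliberately simplified stand-in and not the optimization performed for the ITk: it assumes that the leakage current at fixed temperature is proportional to the accumulated fluence, it ignores the TID effect, and it works with the sensor temperature rather than the coolant temperature (the offset between the two would come from the thermal network). The function names and the end-of-life temperature are hypothetical.

```python
import math

T_A = 7000.0  # K, activation temperature from Eq. (1)


def temperature_factor(t_s: float) -> float:
    """Temperature dependence of the leakage current, Eq. (1), arbitrary units."""
    return t_s ** 2 * math.exp(-T_A / t_s)


def levelling_temperature(fluence_fraction: float, t_end_k: float,
                          lo: float = 150.0, hi: float = 350.0) -> float:
    """Sensor temperature (K) giving the same leakage current as at end of life.

    fluence_fraction: accumulated fluence divided by the end-of-life fluence.
    t_end_k: sensor temperature at end of life (must lie between lo and hi).
    ASSUMPTION: leakage current proportional to fluence at fixed temperature.
    The result is capped at 'hi' for very small fluence fractions.
    """
    if not 0.0 < fluence_fraction <= 1.0:
        raise ValueError("fluence_fraction must be in (0, 1]")
    target = temperature_factor(t_end_k)
    # Bisection: the scaled current increases monotonically with temperature.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if fluence_fraction * temperature_factor(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Illustrative use: allowed sensor temperature at a few stages of the lifetime,
# for a hypothetical end-of-life sensor temperature of 248.15 K (-25 C).
for fraction in (0.1, 0.25, 0.5, 1.0):
    t_k = levelling_temperature(fraction, 248.15)
    print(f"fluence fraction {fraction:4.2f}: {t_k - 273.15:6.1f} C")
```

Early in the lifetime, when little fluence has been accumulated, the same leakage current is reached only at a considerably higher temperature, which is the freedom a ramp scenario exploits.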

## 8. Model performance verification

The quality of the predictions of the thermo-electrical model is affected by two major factors: the quality of the input parameters, and the error introduced by reducing the complex 3D geometry into a linear thermal impedance network. The former has been discussed throughout this paper where the different inputs have been presented. For the latter, we have studied the agreement of predictions from the network model with the more accurate results obtained from FEA for selected states of the system.

To verify the level of this agreement, we have calculated the sensor temperature curve for a barrel end-of-stave module up to thermal runaway. For this exercise, we do not vary any of the input parameters in the model other than the sensor leakage power. We can therefore reduce the complex thermal network to its Thevenin equivalent, which is identical to the network studied in Ref. [4], and use the analytical expressions given there. The comparison of this prediction with the FEA result is shown in Fig. 16. Despite a large temperature variation of about $15^{\circ}\text{C}$ across the sensor, the network model predicts the runaway within $1^{\circ}\text{C}$ of the result from the FEA. This agreement gives us confidence that the use of a thermal network model is not likely to significantly degrade the predictions beyond the errors introduced by other inputs to the model.
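A sensor temperature curve of this kind can also be obtained numerically for a single-impedance equivalent network. The sketch below is an illustration under stated assumptions and does not reproduce the analytical expressions of Ref. [4]: the heat balance is taken as T_S = T_0 + R_t Q(T_S), with T_0 the sensor temperature without leakage power, R_t the equivalent thermal impedance, and Q following Eq. (1); all function names and the numerical values in the usage lines are hypothetical.

```python
import math

T_A = 7000.0  # K, activation temperature from Eq. (1)


def leakage_power(t_s, q_ref, t_ref):
    """Leakage power (W) at sensor temperature t_s (K), scaled from q_ref at t_ref."""
    return q_ref * (t_s / t_ref) ** 2 * math.exp(T_A * (1.0 / t_ref - 1.0 / t_s))


def leakage_slope(t_s, q_ref, t_ref):
    """Derivative dQ/dT_S in W/K."""
    return leakage_power(t_s, q_ref, t_ref) * (2.0 * t_s + T_A) / t_s ** 2


def bisect(f, lo, hi, n_iter=200):
    """Root of an increasing function f with f(lo) < 0 < f(hi)."""
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def runaway_point(r_t, q_ref, t_ref):
    """Return (t_crit, t0_crit): sensor and no-leakage temperatures at runaway.

    Runaway is reached where r_t * dQ/dT_S = 1 (tangency of heating and cooling).
    The search bracket of 150 K to 400 K is an assumption.
    """
    t_crit = bisect(lambda t: r_t * leakage_slope(t, q_ref, t_ref) - 1.0, 150.0, 400.0)
    t0_crit = t_crit - r_t * leakage_power(t_crit, q_ref, t_ref)
    return t_crit, t0_crit


def stable_sensor_temperature(t_0, r_t, q_ref, t_ref):
    """Stable solution of T_S = T_0 + R_t * Q(T_S), or None beyond runaway."""
    t_crit, t0_crit = runaway_point(r_t, q_ref, t_ref)
    if t_0 >= t0_crit:
        return None
    balance = lambda t: t - r_t * leakage_power(t, q_ref, t_ref) - t_0
    return bisect(balance, t_0, t_crit)


# Illustrative numbers only: 10 K/W and 1 W of leakage power at 253.15 K.
t_crit, t0_crit = runaway_point(r_t=10.0, q_ref=1.0, t_ref=253.15)
print(f"runaway at T_0 = {t0_crit - 273.15:.1f} C (sensor at {t_crit - 273.15:.1f} C)")
```

Scanning `stable_sensor_temperature` over a range of T_0 values traces the temperature curve up to the runaway point, which is the quantity compared with FEA in Fig. 16c.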



Figure 14: Examples of system performance predictions. (a) Barrel total power requirements for flat $-30^{\circ}\text{C}$ cooling, including safety factors. The plot shows the stacked power requirements for the four barrels (orange: innermost barrel, blue: outermost barrel). Full colour indicates power from the front-end electronics; greyed parts are contributions from HV power for the four barrels. The discontinuities in years 5 and 9 are due to anticipated year-long shutdowns of the LHC. (b) The power requirements for each of the 36 endcap modules, labeled according to their ring type and disk position, for flat $-30^{\circ}\text{C}$ cooling and with safety factors. The solid black line indicates the average power of the modules.

## 9. Conclusions

We have developed a model of the ATLAS ITk strip system that is based on the interplay between a thermal and an electrical network model. The set of equations in the model can be numerically solved using standard data analysis software in a short time, allowing for a quick turn-around for systematic studies of the system performance. The complexity of these networks is given by the number of interconnected components between the networks, many of which have a non-linear dependence on the temperature or electrical power. This approach can be easily adopted for any other silicon detector system.

In the case of the ATLAS strip system, several temperature-dependent heat sources had to be modeled. In addition to the sensor leakage current, these are the radiation-induced increase of the digital front-end power (‘TID bump’) and the efficiency of the DC/DC conversion system. The outputs of the model give us confidence that the ITk strip system will be thermally stable until the end of LHC Phase-II operation, even with the inclusion of safety factors on key inputs. Furthermore, the model provides information for benchmark system parameters like cooling, supply power and currents in power cables, which is used in the specification of these systems. The use of the model outputs throughout the strip project ensures consistent specifications, including a common strategy on safety factors. Using the thermo-electrical model we can also propose an optimized cooling temperature ‘ramp’ scenario, which equalizes leakage power throughout the lifetime of the experiment while minimizing the TID bump.

We have verified the performance of the thermal network model compared to a full FEA treatment, and we are confident that the level of disagreement is smaller than the uncertainty introduced by the model inputs. Among the inputs, the most likely source of unknown error stems from the limitations in our understanding of the parametrization of the TID effect.

## 10. Acknowledgements

The evaluation of the thermo-electrical model depends critically on the input parameters to the model. To capture the whole of the system, these need to distill all that is known of the system, and we are therefore indebted to the whole of the ITk strip community. In particular, we would like to thank Tony Affolder, Kyle Cormier, Ian Dawson, Sergio Diez Cornell, Laura Gonella, Ashley Greenall, Alex Grillo, Paul Keener, Steve McMahon, Paul Miyagawa, Craig Sawyer, Francis Ward and Tony Weidberg for all their inputs to this work.

Figure 15: Performance of the cooling ‘ramp’ scenario specified in Fig. 10b. The dashed lines represent the ramp scenario, which has been selected so that the sensor leakage current (a) is stable throughout the lifetime of the ITk. A higher coolant temperature in the first few years reduces the TID effect, keeping the current load on the FEAST (b) well below its specified maximum of 4 A.

Figure 16: (a) Thevenin equivalent of the thermal network. (b) Result of surface temperature calculations using FEA. (c) Average temperature above cooling, comparing FEA (dots) and the network model prediction (dotted line).

## References

- [1] ATLAS Collaboration, Technical Design Report for the ATLAS Inner Tracker Strip Detector, Tech. Rep. CERN-LHCC-2017-005, ATLAS-TDR-025, CERN, Geneva (Apr 2017). URL <https://cds.cern.ch/record/2257755>
- [2] Radiation induced effects in the ATLAS Insertable B-Layer readout chip, Tech. Rep. ATL-INDET-PUB-2017-001, CERN, Geneva (Nov 2017). URL <https://cds.cern.ch/record/2291800>
- [3] A. Affolder, B. Allongue, G. Blanchot, F. Faccio, C. Fuentes, A. Greenall, S. Michelis, DC-DC converters with reduced mass for trackers at the HL-LHC, Journal of Instrumentation 6 (11) (2011) C11035. URL <http://stacks.iop.org/1748-0221/6/i=11/a=C11035>
- [4] G. Beck, G. Viehhauser, Analytic model of thermal runaway in silicon detectors, Nucl. Instrum. Meth. A618 (2010) 131–138. doi:10.1016/j.nima.2010.02.264.
- [5] N. Lehmann, Tracking with self-seeded Trigger for High Luminosity LHC, Master’s thesis, Section of Electrical and Electronical Engineering, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland (2014). URL <https://documents.epfl.ch/users/n.nl/nlehmann/www/SelfSeededTrigger_MasterThesis/SelfSeededTrigger_NiklausLehmann_Thesis.pdf>
- [6] ATLAS experiment - radiation simulation public results [cited 2018-11-17]. URL <https://twiki.cern.ch/twiki/bin/view/AtlasPublic/RadiationSimulationPublicResults#FLUKA_Simulations>
- [7] M. Smith, ABAQUS/Standard User’s Manual, Version 6.9, Simulia, 2009.
- [8] ANSYS, Inc., Ansys academic research mechanical, release 18.2. URL <http://www.ansys.com/>
- [9] F. Faccio, G. Cervelli, Radiation-induced edge effects in deep submicron CMOS transistors, IEEE Transactions on Nuclear Science 52 (6) (2005) 2413–2420. doi:10.1109/TNS.2005.860698.
- [10] F. Faccio, H. J. Barnaby, X. J. Chen, D. M. Fleetwood, L. Gonella, M. McLain, R. D. Schrimpf, Total ionizing dose effects in shallow trench isolation oxides, Microelectronics Reliability 48 (7) (2008) 1000–1007, 2007 Reliability of Compound Semiconductors (ROCS) Workshop. doi:10.1016/j.microrel.2008.04.004. URL <http://www.sciencedirect.com/science/article/pii/S0026271408000826>
- [11] M. Mikestikova, Internal ATLAS communication. Marcela.Mikestikova@cern.ch.