



# ATLAS NOTE

February 9, 2015



Draft version 0.2


## Trigger Processor Design Review Report


The NSW Trigger Processor Working Group


### Abstract


The trigger signals from both the small Thin Gap Chambers (sTGC) and the Micromegas (MM) chambers will be processed by the New Small Wheel (NSW) Trigger Processor cards located in USA15. The trigger algorithms will be implemented in conventional FPGAs and housed in ATCA crates. The two chamber technologies will share the same hardware and, as much as possible, the same firmware. The output of the trigger processor will be sent to the new Sector Logic boards, where it is combined with the Big Wheel trigger to define the final Level-1 trigger signal. This document describes the current status of the NSW Trigger Processor project.


© 2015 CERN for the benefit of the ATLAS Collaboration.

Reproduction of this article or parts of it is allowed as specified in the CC-BY-3.0 license.

## Contents

- **1 Introduction**
  - 1.1 Overview of the NSW trigger
  - 1.2 System granularity and terminology
  - 1.3 Requirements and Limitations
- **2 Trigger Processor Specifications**
  - 2.1 Trigger Processor Latency
  - 2.2 Interface to Micromegas
    - 2.2.1 ART Data Protocol
    - 2.2.2 Decoding ART Data
  - 2.3 Interface to sTGC
  - 2.4 Interface to Sector Logic
    - 2.4.1 NSW Trigger Data Format
    - 2.4.2 Combination of sTGC and MM trigger data
    - 2.4.3 Matching to Sector Logic Boards
  - 2.5 Ancillary Functions
- **3 Trigger Algorithms and Performance**
  - 3.1 Micromegas Trigger Algorithm
    - 3.1.1 MM Fitter Algorithm
      - 3.1.1.1 Description
      - 3.1.1.2 Implementation
      - 3.1.1.3 Misalignment Configurations and Corrections
      - 3.1.1.4 Performance
    - 3.1.2 MM Look-Up-Table Algorithm
      - 3.1.2.1 Principle of the algorithm
      - 3.1.2.2 Algorithm Implementation
      - 3.1.2.3 Algorithm Performance
      - 3.1.2.4 Summary
  - 3.2 sTGC Trigger Algorithm
    - 3.2.1 The pre-trigger from the pad towers
    - 3.2.2 Finding track segments and calculating their parameters
    - 3.2.3 Compensating for misalignments
- **4 Trigger Processor Hardware Platforms**
  - 4.1 Specification comparison
    - 4.1.1 ATCA Standard Interfaces
    - 4.1.2 Optical i/o for detector data and Sector Logic
    - 4.1.3 AMC to AMC lateral communication
  - 4.2 Selection Criteria
- **5 Testing**
  - 5.1 MM Implementation Initial Testing
  - 5.2 Pattern Generators
    - 5.2.1 The Micromegas ART Pattern Generator
    - 5.2.2 The sTGC Pattern Generator
  - 5.3 Cosmic Ray Testing
  - 5.4 Vertical Slice and Test Beam
- **6 Phase-2 Compatibility**
- **7 Project Organization**
  - 7.1 Responsibilities
  - 7.2 Schedule
- **8 Conclusion**
- **Appendix**
  - **A Fibers Layout**

## 1 Introduction

This document presents the status of the New Small Wheel (NSW) Trigger Processor (TP) project and is the main source of information for the ATLAS Design Review. More detailed information can be found in the New Small Wheel Technical Design Report [1]. The two options for the hardware platform are described in [2], and further details are provided in [3–5]. The trigger processor specifications are presented in Section 2. These include a description of the interfaces to the two NSW trigger detectors (MM and sTGC), of the interface to the muon end-cap Sector Logic, and of the ancillary functionality of the trigger processor. Section 3 presents the trigger algorithms under consideration for both detector technologies, including studies of the corresponding expected trigger performance and information regarding trigger duplicate handling. The plans for testing the trigger processor hardware and firmware are presented in Section 5, followed by a brief discussion of the compatibility of the project with Phase-II. The document concludes with a concise overview of the project organization in Section 7, including the schedule and responsibilities.

### 1.1 Overview of the NSW trigger

The main goal of the NSW trigger is to provide additional information to the muon Level-1 (L1) trigger in the endcap region $1.0 < |\eta| < 2.4$, in order to dramatically reduce fake triggers arising from particles that are not high-$p_T$ muons originating in the interaction point (IP). A major source of fake triggers is low-energy particles, mainly protons, generated in the material located between the Small Wheel and the end-cap middle station, the Big Wheel (BW). As shown in Fig. 1, these particles can cross the end-cap trigger chambers at an angle similar to that of real high-$p_T$ muons. The NSW trigger signal is based on track segments produced online by the small Thin Gap Chambers (sTGC) and Micromegas chambers (MM) comprising the NSW detectors. These candidate track segments are input to the new Sector Logic (SL), which uses the information to corroborate trigger candidates from the BW TGC chambers. The Sector Logic sends Level-1 trigger candidates to the ATLAS Muon Central Trigger system.

### 1.2 System granularity and terminology

The endcap trigger system operates independently on each detector endcap (side A and side C). Several other factors define the granularity of the endcap trigger system. The following recalls the terminology and boundary conditions of the overall endcap trigger system:

- A **New Small Wheel sector** comprises 1/16<sup>th</sup> of a wheel (i.e. of one endcap). Each wheel has eight large and eight small sectors. The detector and trigger sectors coincide.
- The **Big Wheel trigger detector** (BW-TGC) has 12 detector sectors per wheel. Each sector is further divided into trigger sectors: four trigger sectors per detector sector in the endcap (larger radius) region and two trigger sectors in the forward region. This segmentation results in a total of 48 trigger sectors in the endcap (larger radius) region and 24 trigger sectors in the forward region.
- A **Sector Logic board** serves two adjacent trigger sectors. For each side (A/C), there are 24 SL boards for the endcap region and 12 SL boards for the forward region.
- A SL board receives data from at most three NSW sectors.



Figure 1: Schematic of the muon endcap trigger. The existing Big Wheel trigger accepts all three tracks shown. With the addition of the NSW to the muon end-cap trigger, only track ‘A’, the desired track, which is confirmed by both the Big Wheel and the NSW, will be accepted. Track ‘B’ will be rejected because the NSW does not find a track coming from the interaction point that matches the Big Wheel candidate. Track ‘C’ will be rejected because the NSW track does not point to the interaction point (IP). The NSW logic restricts $\Delta\theta$ to values consistent with the track originating from the IP.

- For the NSW trigger processor, there is one **FPGA** per detector technology (MM or sTGC) per NSW sector. MM and sTGC information is processed by separate algorithms in separate FPGAs. Depending on the hardware platform chosen, the two FPGAs will be located on the same mezzanine card (SRS option) or on two separate mezzanine cards (LAr option).
- A **NSW trigger processor ATCA board** corresponds to two NSW sectors and therefore serves a NSW octant. It contains two or four mezzanine cards, depending on the hardware platform ultimately chosen.
- One NSW trigger sector needs to deliver data to up to seven SL boards. The maximum fan-out of seven is needed for large NSW sectors, due to the overlap with the BW trigger sectors when multiple scattering, misalignments and magnetic field deformations are taken into account.
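The counts quoted in this list follow from a few multiplications, which the short sketch below reproduces for one side (A or C). The function and constant names are illustrative only; the FPGA and ATCA board totals are derived here from the per-sector statements above rather than quoted from the text.

```python
# Per-side (A or C) granularity of the endcap trigger, from the numbers in this section.
BW_DETECTOR_SECTORS = 12           # BW-TGC detector sectors per wheel
ENDCAP_TRIG_PER_DET_SECTOR = 4     # trigger sectors per detector sector, endcap region
FORWARD_TRIG_PER_DET_SECTOR = 2    # trigger sectors per detector sector, forward region
TRIG_SECTORS_PER_SL_BOARD = 2      # one SL board serves two adjacent trigger sectors
NSW_SECTORS = 16                   # eight large plus eight small sectors per wheel
TECHNOLOGIES = 2                   # MM and sTGC, one FPGA each per NSW sector
NSW_SECTORS_PER_ATCA_BOARD = 2     # one ATCA board serves one octant


def granularity():
    """Return the derived counts for one endcap side."""
    endcap = BW_DETECTOR_SECTORS * ENDCAP_TRIG_PER_DET_SECTOR
    forward = BW_DETECTOR_SECTORS * FORWARD_TRIG_PER_DET_SECTOR
    return {
        "endcap_trigger_sectors": endcap,
        "forward_trigger_sectors": forward,
        "endcap_sl_boards": endcap // TRIG_SECTORS_PER_SL_BOARD,
        "forward_sl_boards": forward // TRIG_SECTORS_PER_SL_BOARD,
        "tp_fpgas": NSW_SECTORS * TECHNOLOGIES,
        "tp_atca_boards": NSW_SECTORS // NSW_SECTORS_PER_ATCA_BOARD,
    }


if __name__ == "__main__":
    for name, value in granularity().items():
        print(f"{name}: {value}")
```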

### 1.3 Requirements and Limitations

The main requirement for the NSW trigger is to provide track vector candidates from the NSW detectors to be matched to track segments from the BW. The angular resolution of the NSW track vector candidates should be 1 mrad. In the data-taking period immediately after the installation of the NSW during Long Shutdown 2 (LS2), the BW trigger granularity is limited to an angular resolution of 3 mrad or larger. During Long Shutdown 3 (LS3), the BW trigger electronics and a new MDT Level-1 trigger will be deployed, allowing for 1 mrad angular resolution in the BW.

**Matching resolution requirements** The NSW measures the radial coordinates in two planes, the azimuthal coordinate, $\phi$, and the angle, $\Delta\theta$, of track segments inside the wheel, i.e. before the end-cap toroid. $\Delta\theta$ is the angle of the segment with respect to an ‘infinite momentum track’, i.e. a line from the IP to the segment's radial position in the NSW. The radial coordinate is measured by high-precision strips in both detectors. For the sTGC, $\phi$ is determined by the triggering tower of sTGC pads, and for the MM, by small-angle stereo strips. The angle $\Delta\theta$ is to be measured to an accuracy close to 1 mrad. The NSW trigger logic will reject track segments with $|\Delta\theta|$ above a threshold in the range 7 to 15 mrad (the final value will be determined from future studies). The corroboration with the BW trigger is done by projecting the ‘infinite momentum track’ through the $R$-$\phi$ point of the segment in the NSW onto the BW’s $R$-$\phi$ array of Regions-of-Interest (RoI). This matching requires the NSW trigger candidates to have:

- $\phi$-resolution of 20 mrad
- $\eta$-resolution of 0.005

The angle $\Delta\theta$ is passed to the Sector Logic but is not used in the Phase-I trigger decision.
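The definition of $\Delta\theta$ and the pointing cut can be illustrated with a minimal sketch. It assumes a two-dimensional $R$-$z$ geometry with the IP at the origin, a segment described by its position and local slope $dR/dz$, and a configurable cut; the function names and the example coordinates are hypothetical, not taken from the trigger firmware.

```python
import math


def delta_theta_mrad(r_mm, z_mm, segment_slope):
    """Angle (mrad) between a segment and the 'infinite momentum track'.

    r_mm, z_mm    : position of the segment in the NSW (IP assumed at the origin)
    segment_slope : local slope dR/dz of the reconstructed segment
    """
    theta_ip = math.atan2(r_mm, z_mm)       # straight line from the IP to the segment
    theta_seg = math.atan(segment_slope)    # measured direction of the segment
    return (theta_seg - theta_ip) * 1.0e3


def passes_pointing_cut(dtheta_mrad, cut_mrad=15.0):
    """Keep segments with |dtheta| within the cut (7 to 15 mrad range in the text)."""
    return abs(dtheta_mrad) <= cut_mrad


if __name__ == "__main__":
    # Hypothetical segment at R = 2 m, z = 7.5 m, slightly non-pointing.
    dth = delta_theta_mrad(2000.0, 7500.0, 0.275)
    print(f"dtheta = {dth:.2f} mrad, accepted = {passes_pointing_cut(dth)}")
```

A perfectly pointing segment has `segment_slope == r_mm / z_mm` and therefore $\Delta\theta = 0$.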

**Simulation limitations** The trigger model in the Athena simulation of both the sTGC and the MM trigger is not complete. There is also the need to simulate correlated backgrounds. The following studies are particularly urgent:

- Probability of finding track segments as a function of radius, in particular the probability for more than four candidates, in either MM or sTGC, and for a total of more than eight candidates when duplicates are removed.
- The $\phi$-resolution needed for matching to the BW RoIs.
- Effects due to misalignment within the NSW and between the NSW and BW.

## 2 Trigger Processor Specifications

The context diagram in Figure 2 shows the different interfaces to the NSW Trigger Processor and the signal flow through the trigger system. The NSW Trigger Processor is implemented in FPGAs on mezzanine cards of an ATCA carrier card. One NSW sector (16 per end-cap wheel) is implemented on one FPGA (Xilinx Virtex 7 XC7VX690T) per detector technology. Input and output fibers connect directly to the mezzanine cards. Some services are on the carrier card, which has an Ethernet connection. The plan for the fiber connections from the front ends to the readout and trigger processors in USA-15 is shown in Appendix A.

### Mezzanine card connections

- 32 input fibers from ADDC (MM) or Router (sTGC)
- Several output fibers to the Sector Logic
- An output fiber to FELIX, which carries the following data flows on different E-links:
  - Level-1 Accept event readout
  - Exception messages
  - Statistics
  - Sampled events
  - Algorithm parameters
  - TTC and BC clock



Figure 2: Trigger Processor context diagram

### Services via the ATCA carrier

- Configuration of the FPGA itself via Ethernet
- Temperature and voltage monitoring to DCS via Shelf Manager Ethernet

### 2.1 Trigger Processor Latency

The trigger signal must be delivered to the MuCTPI within 57 Bunch Crossings (BCs), an increase of 3.5 BCs with respect to the pre-Phase-I system. The new Sector Logic requires 16 BCs from the time it receives the trigger signals from the NSW and BW until it delivers its result to the MuCTPI (5 BCs for the serializer/deserializer, 9 BCs for trigger processing, and 2 BCs for transmission via fiber). This leaves 43 BCs (1075 ns) for the full NSW trigger processing chain (including 2 BCs to merge the MM and sTGC trigger streams and the time for the signal transmission to the Sector Logic).

The latency of the current design of the NSW trigger chain is at the boundary of the required value, leaving no contingency. It is therefore imperative to keep the latency as low as possible at every step of the chain, including at the Trigger Processor. The TP latency estimate is given in Table 1 for both chamber technologies. The accounting includes the input/output serializers, the trigger algorithm, the transmission time to the Sector Logic, and the algorithm that merges the MM and sTGC streams. The merging is currently planned to occur in the sTGC FPGA.

| Step                                                | sTGC min (ns) | sTGC max (ns) | MM min (ns) | MM max (ns) | Notes                                                  |
|-----------------------------------------------------|---------------|---------------|-------------|-------------|--------------------------------------------------------|
| Input deserializer (Rx)                             | 40            | 40            | 44          | 44          |                                                        |
| Trigger algorithm                                   | 56            | 56            | 56          | 56          | 320 MHz clock                                          |
| Stream merging algorithm                            | 25            | 50            | n/a         | n/a         | Assigned to sTGC                                       |
| Re-synch to 320 MHz clock driving output serializer | 0             | 3.1           | 0           | 3.1         | 45° phase chosen to best match pipeline length         |
| Output to Sector Logic serializer (Tx only)         | 25            | 30            | 25          | 30          | Deserializer counted in the Sector Logic latency budget |
| Fiber to Sector Logic                               | 20            | 25            | 20          | 25          | 4-5 m fiber at 5 ns/m                                  |
| Total                                               | 166           | 204.1         | 145         | 158.1       |                                                        |

Table 1: Trigger Processor latency for the MM and sTGC streams. The time required to merge the two streams is assigned only to the sTGC stream, since the merging would occur in the sTGC FPGA.
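As a cross-check of the totals in Table 1, the per-step estimates can be summed directly. The sketch below simply transcribes the table (with the MM merging entries taken as zero, since merging is assigned to the sTGC stream) and also expresses each total in bunch crossings of 25 ns; the dictionary keys are illustrative names.

```python
BC_NS = 25.0  # one LHC bunch crossing

# Step: (sTGC min, sTGC max, MM min, MM max), all in ns, transcribed from Table 1.
LATENCY_NS = {
    "input_deserializer": (40, 40, 44, 44),
    "trigger_algorithm": (56, 56, 56, 56),
    "stream_merging": (25, 50, 0, 0),       # merging is assigned to the sTGC FPGA only
    "resync_320MHz": (0, 3.1, 0, 3.1),
    "output_serializer": (25, 30, 25, 30),
    "fiber_to_sector_logic": (20, 25, 20, 25),
}


def totals_ns():
    """Column sums in the order (sTGC min, sTGC max, MM min, MM max)."""
    return tuple(round(sum(column), 1) for column in zip(*LATENCY_NS.values()))


def totals_bc():
    """The same totals expressed in bunch crossings."""
    return tuple(round(t / BC_NS, 2) for t in totals_ns())


if __name__ == "__main__":
    labels = ("sTGC min", "sTGC max", "MM min", "MM max")
    for label, ns, bc in zip(labels, totals_ns(), totals_bc()):
        print(f"{label}: {ns} ns = {bc} BC")
```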

### 2.2 Interface to Micromegas

The ART (address in real time) data from an entire sector will be transmitted to a single trigger processor via 32 ADDCs. Each ADDC will transmit its data on a single fiber optic link. The trigger processor therefore uses 32 fibers to receive the ART data from one sector. Since the MM and sTGC trigger processors will share the same ATCA carrier card, each carrier will support two sectors.

#### 2.2.1 ART Data Protocol

The ART data from the ADDC will be transmitted using the GigaBit Transceiver (GBT) architecture and transmission protocol in a low-latency widebus mode at a rate of 4.8 Gb/s. The trigger processor will take advantage of the GBT firmware developed by the GBT Project to implement the receivers.

The GBT packet in widebus mode provides 112 data bits and arrives once every bunch crossing. One ADDC will service 32 VMMs, and each packet can contain ART data from a maximum of eight triggered VMMs. Each VMM will be uniquely identified to determine which MM strip on the sector was hit.

There are two options for how the data packet bits will be defined. They differ in how the VMM ID information is encoded. The first option will identify every triggered VMM by asserting the corresponding bit in a 32-bit hit list, as shown in Figure 3. The second option will encode each VMM ID in a list. For both options, the triggered strip number within each VMM will be provided in an encoded list. The first option would move the VMM ID encoding task from the ADDC ASIC to the trigger processor FPGA. Both options, shown in Tables 2 and 3, use the full 112 bits provided by the GBTx widebus mode.

| Field        | 0b1010 | BCID | ERR_FLAGS | HIT_LIST | ARTDATA_PARITY | 8 × ART_DATA |
|--------------|--------|------|-----------|----------|----------------|--------------|
| Width (bits) | 4      | 12   | 8         | 32       | 8              | 8 × 6        |

Table 2: Option 1 ADDC GBT packet format.



Figure 3: Option 1 VMM ID encoding using 32-bit hit list.

| Field        | HIT_CNT | BCID | 8 × VMMID | ARTDATA_PARITY | 8 × ART_DATA |
|--------------|---------|------|-----------|----------------|--------------|
| Width (bits) | 4       | 12   | 8 × 5     | 8              | 8 × 6        |

Table 3: Option 2 ADDC GBT packet format.

- BCID = 12-bit bunch crossing ID
- HIT\_LIST = 32-bit list of flags corresponding to each of the 32 VMMs (0: no hit, 1: hit). A register controls whether this is a filtered (i.e. 8 hits max) or an unfiltered copy of the VMM flags registered in a particular BC.
- HIT\_CNT = 4-bit number of hits (range 0 - 8; 9 - 15 invalid)
- VMMID = 5-bit address of triggered VMM
- ART\_DATA = 6-bit triggered VMM strip number
- ARTDATA\_PARITY = 8-bit parity of the ART data, computed by each of the 32 ART de-serializer units. Each bit corresponds to one of the ART data fields selected by the priority unit.
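To make the two packet layouts concrete, the following sketch packs and unpacks a 112-bit word for either option and recovers the (VMM, strip) pairs. The field widths come from Tables 2 and 3 and the list above. The bit ordering (first field in the most significant bits), the ordering of the eight slots, and the pairing of the k-th set bit of HIT\_LIST with the k-th ART\_DATA slot are assumptions made for illustration, as are all function names.

```python
PACKET_BITS = 112

# Field layouts, most significant field first (ordering within the word is assumed).
OPTION1 = [("HEADER", 4), ("BCID", 12), ("ERR_FLAGS", 8),
           ("HIT_LIST", 32), ("ARTDATA_PARITY", 8), ("ART_DATA", 48)]
OPTION2 = [("HIT_CNT", 4), ("BCID", 12), ("VMMID", 40),
           ("ARTDATA_PARITY", 8), ("ART_DATA", 48)]


def pack(fields, layout):
    """Concatenate the fields of one packet into a 112-bit integer."""
    word = 0
    for name, width in layout:
        value = fields[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack(word, layout):
    """Split a 112-bit integer into its named fields."""
    fields, shift = {}, PACKET_BITS
    for name, width in layout:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


def split_slots(value, n_slots, width):
    """Split a packed list into n_slots values, first slot in the top bits."""
    mask = (1 << width) - 1
    return [(value >> (width * (n_slots - 1 - i))) & mask for i in range(n_slots)]


def option1_hits(fields):
    """(vmm_id, strip) pairs: k-th set HIT_LIST bit pairs with k-th ART_DATA slot."""
    vmms = [i for i in range(32) if (fields["HIT_LIST"] >> i) & 1]
    return list(zip(vmms, split_slots(fields["ART_DATA"], 8, 6)))


def option2_hits(fields):
    """(vmm_id, strip) pairs for the first HIT_CNT slots."""
    n_hits = fields["HIT_CNT"]
    if n_hits > 8:
        raise ValueError("HIT_CNT values 9 to 15 are invalid")
    vmms = split_slots(fields["VMMID"], 8, 5)
    strips = split_slots(fields["ART_DATA"], 8, 6)
    return list(zip(vmms[:n_hits], strips[:n_hits]))
```

Both layouts sum to 112 bits, which is the check that motivates the 4-bit width of HIT\_CNT. With Option 1 the trigger processor has to derive the VMM addresses from the hit list itself (the loop in `option1_hits`), which is the encoding task that would otherwise sit in the ADDC.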

#### 2.2.2 Decoding ART Data

Once the ADDC GBT packet is received, the ART data is decoded into a strip number. This number represents the strip's distance from the beam line. Each fiber will have an associated geographic address that is used in the decoding process to set the location of strip 0. The strip number is then multiplied by a constant to calculate the slope of the line from the interaction point to the strip. The slopes for each ART hit are then sent, along with the strip number, to the trigger processor algorithm. Since the fiber location will provide information used to calculate the strip number, the ADDC will have a debug mode that can be used to diagnose cabling issues.
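A minimal sketch of this decoding step is given below. It assumes that the global strip number is built from a per-fiber offset for strip 0, the VMM address and the 6-bit ART strip number (64 channels per VMM), and that the multiplicative constant is the strip pitch divided by the distance of the detector plane from the IP. The function name, the offset convention and the numerical values in the example are hypothetical placeholders, not detector parameters.

```python
STRIPS_PER_VMM = 64  # 6-bit ART_DATA strip number


def decode_art_hit(fiber_strip0, vmm_id, art_strip, slope_per_strip):
    """Return (global strip number, slope) for one ART hit.

    fiber_strip0    : global number of strip 0 for this fiber, set by its
                      geographic address (assumed convention)
    vmm_id          : 0..31, address of the triggered VMM on this ADDC
    art_strip       : 0..63, strip number within the VMM
    slope_per_strip : constant converting strip number to slope, e.g.
                      strip pitch / z of the plane (assumed)
    """
    if not 0 <= vmm_id < 32 or not 0 <= art_strip < STRIPS_PER_VMM:
        raise ValueError("VMM address or strip number out of range")
    strip = fiber_strip0 + vmm_id * STRIPS_PER_VMM + art_strip
    return strip, strip * slope_per_strip


if __name__ == "__main__":
    # Placeholder numbers: 0.45 mm pitch, plane at z = 7500 mm.
    strip, slope = decode_art_hit(2048, 3, 10, 0.45 / 7500.0)
    print(strip, round(slope, 4))
```

Because the offset depends only on which fiber the data arrived on, a swapped cable shifts every decoded strip number by a fixed amount, which is the kind of error the ADDC debug mode is meant to expose.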

### 2.3 Interface to sTGC

### 2.4 Interface to Sector Logic

The track vector information from the NSW is combined with the results from the Big Wheel TGC (BW-TGC) by the new Sector Logic board located in USA-15. The partitioning of the Trigger Sectors and the granularities of the Regions-of-Interest (RoI) for the new Sector Logic board are the same as for the current pre-Phase-I system. The same optical links and data format are used for signals from the BW-TGC, while new input optical links have been introduced to receive the NSW trigger information.

The BW-TGC, which covers the range $1.0 < |\eta| < 2.4$, consists of three stations (TGC1, TGC2 and TGC3). The trigger algorithm extrapolates pivot-plane (TGC3) hits to the IP to construct roads following the infinite-momentum (straight) path for a track. Deviations ($\Delta R$ and $\Delta\phi$) of hits in the trigger planes from this path are related to the momentum of the track. Coincidence signals<sup>1</sup> are generated independently for $R$ and $\phi$. The hit position information, with the granularity of the RoIs, and the deviations ($\Delta R$ and $\Delta\phi$) are sent to the Sector Logic board.

The NSW provides the Sector Logic with information on the candidate track vectors that point to the IP within $\pm(7 - 15)$ mrad<sup>2</sup>: the position ($R$ and $\phi$) and the deviation of the incidence angle at the NSW from a straight line to the IP ($\Delta\theta$). In Phase-I, the final trigger decision is taken solely by merging the $R$-$\phi$ coincidence of signals from the BW-TGC and the NSW. In Phase-II, the $\Delta\theta$ information will be combined with similar information from the BW chambers to improve the background rejection and sharpen the trigger threshold. Since the NSW trigger system needs to be compatible with the Phase-II requirements, the angle $\Delta\theta$ is required to be measured with an accuracy close to 1 mrad even in Phase-I.

#### 2.4.1 NSW Trigger Data Format

The output data format of a track segment from the NSW trigger processor is shown in Table 4. One track segment is represented as 24 bits of data, which consist of 2 bits of segment-type information for each detector (sTGC and MM), 5 bits for $\Delta\theta$, 6 bits and 8 bits for the $\phi$ and $R$ position information, respectively, and 1 spare bit. The required resolutions (1 bit) are approximately 1 mrad ($\pm 15$ mrad full scale) for $\Delta\theta$, 20 mrad for $\phi$ and 0.005 in pseudo-rapidity $\eta$.

The 2-bit segment-type information can provide an indication of the segment candidate quality, in addition to the detector. Any value other than 00 indicates the segment was found by the corresponding detector. Up to a 3-level categorization can be encoded (01, 10 and 11), but the specifics of its definition still need to be established.

Even though the Sector Logic is blind to the track segment origin (either MM or sTGC) and quality, the segment-type bits should still be transmitted. They can be used to monitor the trigger operation, *e.g.* to study the BW matches for each segment type. The latency saved by omitting these four bits per candidate is small.

| Field:       | sTGC type | MM type | $\Delta\theta$ (mrad) | $\phi$ index | $R$ index | spare |
|--------------|-----------|---------|-----------------------|--------------|-----------|-------|
| Num of bits: | 2         | 2       | 5                     | 6            | 8         | 1     |

Table 4: Data format of a track vector candidate sent by the NSW trigger processor to the Sector Logic (24 bits per track vector). The sTGC and MM type information can encode the quality of the candidate.
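The 24-bit candidate word of Table 4 can be sketched as follows. The field widths are those of the table; the placement of the fields within the word (table order, first field in the most significant bits) and the two's-complement coding of $\Delta\theta$ in 1 mrad steps are assumptions for illustration only, and the function names are hypothetical.

```python
# Field widths from Table 4, in table order (bit placement is an assumption).
SEGMENT_LAYOUT = [("stgc_type", 2), ("mm_type", 2), ("dtheta", 5),
                  ("phi", 6), ("r", 8), ("spare", 1)]


def pack_segment(**fields):
    """Pack one track vector candidate into a 24-bit integer."""
    word = 0
    for name, width in SEGMENT_LAYOUT:
        value = fields.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_segment(word):
    """Inverse of pack_segment."""
    fields, shift = {}, 24
    for name, width in SEGMENT_LAYOUT:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields


def encode_dtheta(dtheta_mrad):
    """Quantize to 1 mrad steps, clamp to +-15 mrad, 5-bit two's complement (assumed)."""
    q = max(-15, min(15, round(dtheta_mrad)))
    return q & 0x1F


def decode_dtheta(code):
    """Return the signed value in mrad for a 5-bit code."""
    return code - 32 if code >= 16 else code


if __name__ == "__main__":
    word = pack_segment(stgc_type=1, mm_type=2, dtheta=encode_dtheta(-3.2),
                        phi=33, r=200)
    print(f"0x{word:06X}", unpack_segment(word))
```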

The data is transmitted from the sTGC and MM trigger processors to the Sector Logic via optical fibers. The baseline is to send eight candidates per LHC bunch crossing (40 MHz) on two fibers, four candidates per stream, at 6.4 Gb/s (128 bits per bunch crossing, using 8b/10b encoding). Figure 4 shows the format of the transmitted data. It comprises two IDLE codes for alignment purposes, three bytes of data for each of up to four track segments, the NSW Sector ID (4 bits) and a BCID number (12 bits). Each 2-byte word is transferred after 8b/10b encoding at 320 MHz.

<sup>1</sup> The TGC1 station has three layers and the outer two stations (TGC2 and TGC3) each have two layers, resulting in a total of seven layers. A 3-out-of-4 coincidence is required for the doublet planes of TGC2 and TGC3, for both wires and strips; a 2-out-of-3 coincidence is required for the triplet (TGC1) wire planes; and 1-out-of-2 possible hits for the triplet strip planes.

<sup>2</sup> The value of this requirement is configurable and it will be optimized once better trigger simulation is available.

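| word | byte0                   | byte1                   |
|------|-------------------------|-------------------------|
| 0    | comma                   | comma                   |
| 1    | segment-1 (byte 1 of 3) | segment-1 (byte 2 of 3) |
| 2    | segment-1 (byte 3 of 3) | segment-2 (byte 1 of 3) |
| 3    | segment-2 (byte 2 of 3) | segment-2 (byte 3 of 3) |
| 4    | segment-3 (byte 1 of 3) | segment-3 (byte 2 of 3) |
| 5    | segment-3 (byte 3 of 3) | segment-4 (byte 1 of 3) |
| 6    | segment-4 (byte 2 of 3) | segment-4 (byte 3 of 3) |
| 7    | SectorID                | BCID                    |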

Figure 4: Data format from the NSW trigger processor to the Sector Logic. The data is transmitted with 8b/10b encoding in one bunch crossing (16 bytes at 6.4 Gb/s). The 24-bit segment format is shown in Table 4. The comma character is an idle code for alignment purposes in the 8b/10b encoding. The NSW Sector ID is 4 bits and the BCID is 12 bits.
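The frame of Figure 4 and its bandwidth can be checked with a short sketch. The byte counts follow the text; the choice of comma byte (0xBC, the byte value of the K28.5 character), big-endian byte order, the placement of the Sector ID in the upper four bits of the last word, and the use of an all-zero word for an empty segment slot are assumptions, and the function names are hypothetical.

```python
COMMA = 0xBC  # byte value of K28.5, assumed here as the idle/comma character


def build_frame(segments, sector_id, bcid):
    """Build the 16-byte payload sent on one fiber in one bunch crossing.

    segments  : up to four 24-bit track segment words (Table 4 format)
    sector_id : 4-bit NSW sector ID
    bcid      : 12-bit bunch crossing ID
    """
    if len(segments) > 4:
        raise ValueError("at most four segments per fiber per bunch crossing")
    if not (0 <= sector_id < 16 and 0 <= bcid < 4096):
        raise ValueError("sector_id is 4 bits and bcid is 12 bits")
    padded = list(segments) + [0] * (4 - len(segments))  # empty slot = all zeros
    frame = bytearray([COMMA, COMMA])
    for seg in padded:
        frame += seg.to_bytes(3, "big")
    frame += ((sector_id << 12) | bcid).to_bytes(2, "big")
    return bytes(frame)


def line_rate_gbps(payload_bytes=16, bc_rate_mhz=40.0):
    """Line rate after 8b/10b encoding (10 line bits per payload byte)."""
    return payload_bytes * 10 * bc_rate_mhz / 1000.0


if __name__ == "__main__":
    frame = build_frame([0xABCDEF, 0x123456], sector_id=5, bcid=0x123)
    print(frame.hex(), len(frame), "bytes,", line_rate_gbps(), "Gb/s")
```

Sixteen bytes per bunch crossing correspond to eight 2-byte words per 25 ns, i.e. the 320 MHz word rate quoted in the text.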

The model of the trigger in the Athena simulation of both the sTGC and the MM trigger is incomplete. There is also a need to simulate correlated backgrounds. As a result, the probability of finding track segments as a function of the radius is not known. The main quantities of interest are the probability of finding more than four sTGC or MM candidates, as well as more than eight total candidates when duplicates are removed.

**Data overflow** Information about data overflow is planned to be added. An overflow bit would be set if, for a BCID, more than the maximum number of transmittable candidates is found, and thus incomplete information is sent. One possibility is to use the spare bit of the last track candidate word sent to the Sector Logic. This option requires no additional bits.

If the simulation shows that eight candidates are insufficient, the following possibilities, although highly undesirable, could be considered to enable transmitting up to 12 candidates:

- Increase from two to three fibers. This option would exceed the number of serializers available with the intended Kintex FPGA and increase the complexity of the Sector Logic. Recently, however, it was found that the Sector Logic boards may in fact be able to handle up to 10 links from the NSW, not six as initially thought. This means that a third fiber might be available in case of need. More studies would be necessary to explore the eventual use of this possibility, which is not our baseline.
- Transmit at 9.6 Gb/s. However, the Kintex GTX cannot operate at this rate (higher or lower rates are possible). Note that this does not require any change at the Trigger Processor end.
- Transmit at 8 Gb/s, which would be sufficient to provide 10 candidates.
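The candidate counts quoted for the different options follow from the frame structure of Figure 4, as the sketch below shows. It assumes that the per-frame overhead stays at four bytes (two commas plus the Sector ID/BCID word) and that each candidate occupies three bytes at any line rate; the function name and parameters are illustrative.

```python
def candidates_per_bc(line_rate_gbps, n_fibers=2, overhead_bytes=4,
                      segment_bytes=3, bc_ns=25.0):
    """Number of 24-bit candidates that fit in one bunch crossing.

    Assumes 8b/10b encoding (10 line bits per payload byte) and a fixed
    overhead of two comma bytes plus two bytes of Sector ID and BCID.
    """
    line_bits = round(line_rate_gbps * bc_ns)   # Gb/s times ns gives bits
    payload_bytes = line_bits // 10
    per_fiber = (payload_bytes - overhead_bytes) // segment_bytes
    return n_fibers * per_fiber


if __name__ == "__main__":
    print("baseline, 2 fibers at 6.4 Gb/s:", candidates_per_bc(6.4))
    print("3 fibers at 6.4 Gb/s:", candidates_per_bc(6.4, n_fibers=3))
    print("2 fibers at 8 Gb/s:", candidates_per_bc(8.0))
    print("2 fibers at 9.6 Gb/s:", candidates_per_bc(9.6))
```

Under these assumptions the baseline gives eight candidates, 8 Gb/s gives ten, and either a third fiber or 9.6 Gb/s gives twelve.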

#### 2.4.2 Combination of sTGC and MM trigger data

The MM and sTGC trigger processors compute track vectors independently. The sTGC trigger produces a maximum of four candidates per sector, driven by its hardware design. The MM trigger has no such limitation and can theoretically find more candidates if they are present.

The sTGC and MM trigger information from one NSW 1/16<sup>th</sup> sector will be processed in the same Trigger Processor ATCA board [4, 6]. The two algorithms (see Section 3) will run in separate FPGAs, connected by several high-speed, low-latency LVDS pairs. Depending on the hardware platform chosen, the two FPGAs will be located on one AMC mezzanine card (SRS option [5]) or two (LAr option [3]). A NSW trigger processor ATCA board serves one NSW octant, i.e. two NSW sectors, and therefore contains two or four mezzanine cards (depending on the hardware platform).

Given that the MM trigger system is expected to be faster than the sTGC one, the MM trigger results will be sent to the sTGC FPGA for the stream merging stage. With this arrangement, the data transfer between the FPGAs does not add to the latency. The merging algorithm, which includes duplicate removal, is expected to take up to 50 ns, increasing the overall NSW latency budget to 43 BCs. The actual algorithm implementation is, however, needed for a full evaluation of its impact on the latency.

Fast connectivity between the MM and sTGC FPGAs is imperative for implementing the merging algorithm. In the current design, the SRS option offers 64 LVDS high-speed connections between the FPGAs, which would fully satisfy the bandwidth requirements, while the LAr option offers only eight. Studies are ongoing to determine by how much this number could be increased.

The trigger stream merging will limit the number of NSW track vector candidates to eight (subject to simulation results, see Section 2.4.1). Possible selection criteria for track vectors are:

- Remove duplicates, where a duplicate is a candidate with the same $R$ and $\phi$ and a similar $\Delta\theta$. The conditions for a $\Delta\theta$ match still need to be evaluated, taking into account the intrinsic resolutions and the relative alignment of the two detectors.
- Select according to a ‘quality’ flag defined by the trigger algorithm. For example, if the vector was produced from only one of the two four-layer modules of the sTGC because the track passed through a support in the second module, its resolution is degraded and the vector would have a lower quality.
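The two selection criteria can be combined in a simple merging step, sketched below in software form. This is only an illustration of the logic, not the firmware algorithm, which is still to be implemented: the $\Delta\theta$ tolerance, the numeric quality scale, the rule that duplicates are only sought between the two detectors, and all names are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    r: int         # R index
    phi: int       # phi index
    dtheta: int    # dtheta in mrad
    quality: int   # higher is better (assumed scale)
    source: str    # "sTGC" or "MM"


def merge_streams(stgc, mm, dtheta_tol=1, max_out=8):
    """Merge the two candidate streams, drop duplicates, keep at most max_out.

    A candidate is treated as a duplicate if a better (or equal-quality,
    earlier) candidate from the other detector has the same R and phi
    indices and a dtheta within dtheta_tol.
    """
    kept = []
    # Stable sort: higher quality first; sTGC before MM at equal quality.
    for seg in sorted(list(stgc) + list(mm), key=lambda s: -s.quality):
        duplicate = any(
            k.source != seg.source
            and k.r == seg.r
            and k.phi == seg.phi
            and abs(k.dtheta - seg.dtheta) <= dtheta_tol
            for k in kept
        )
        if not duplicate:
            kept.append(seg)
    return kept[:max_out]


if __name__ == "__main__":
    stgc = [Segment(10, 3, 2, 3, "sTGC"), Segment(50, 7, 0, 1, "sTGC")]
    mm = [Segment(10, 3, 3, 2, "MM"), Segment(80, 1, -4, 2, "MM")]
    for seg in merge_streams(stgc, mm):
        print(seg)
```

In this example the MM candidate at the same $R$ and $\phi$ as the first sTGC candidate is removed as a duplicate, leaving three candidates ordered by quality.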

Although both sTGC and MM trigger modules are in the same ATCA board with high-bandwidth connectivity, an algorithm that would merge hits and then find vectors has been disfavored at this time due to the additional latency requirements and increased complexity.

#### 2.4.3 Matching to Sector Logic Boards

Figure 5 shows the projection of the NSW sectors into the BW-TGC pivot plane, which is divided into two regions, end-cap ($|\eta| < 1.9$) and forward ($|\eta| > 1.9$). The end-cap region is divided into 48 trigger sectors in $\phi$, while the forward region is divided into 24 trigger sectors. A trigger sector, represented in the figure by the green lines, is a logical unit that is treated independently in the trigger<sup>3</sup>.

---

<sup>3</sup> Remember that the TGC has 12 detector sectors. Thus, each TGC detector sector comprises four end-cap trigger sectors and two forward trigger sectors.

The thick red lines in Figure 5 show the projective boundaries of the NSW detector, which covers $1.3 < |\eta| < 2.4$ and whose structure has octant symmetry. Each octant has a large NSW sector and a small NSW sector<sup>4</sup>, corresponding to 16 NSW Trigger Sectors per endcap. The boundaries of the NSW sectors (indicated by the red lines) do not coincide with the segmentation of the BW-TGC trigger sectors. Each large NSW sector geometrically covers six BW-TGC Trigger Sectors, while each small NSW sector covers seven (notice the very thin overlap regions in the NSW small sector case).

To take into account deformations in the endcap magnetic field, multiple scattering and misalignments, one NSW sector needs to deliver information also to the BW-TGC sectors adjacent to the geometrically overlapping ones. This means that the information from a large NSW sector has to be corroborated with that of six endcap trigger sectors and four forward trigger sectors. Similarly, information from a small NSW sector has to be combined with four endcap trigger sectors and three forward trigger sectors.

The granularity of the Regions-of-Interest is indicated by the thin red lines. The sizes of the RoIs are approximately $0.025 \times 0.030$ in $\eta - \phi$. There are 560 (366) RoIs corresponding to each large (small) NSW sector.



Figure 5: Trigger Sector segmentation (green lines) and projection of the NSW sectors (thick red lines) into the pivot plane (TGC3). The blue shapes represent the coverage of a SL board. The number of SL boards covered by each NSW sector is indicated in blue.

There are two types of Sector Logic boards, the “Endcap Sector Logic” board and the “Forward Sector Logic” board. A single Sector Logic board serves two adjacent trigger sectors, therefore 24 Endcap and 12 Forward Sector Logic boards per side are required. Figure 5 shows the mapping of the SL boards (in blue) to the trigger sectors. The trigger information from each NSW trigger sector is fanned out and delivered to the corresponding SL boards. Five to seven replicas will be required from each NSW sector because a single SL board serves two trigger sectors. The maximum fan-out of seven is needed for the NSW large sector.

<sup>4</sup> Note that in the NSW the detector and trigger sectors coincide, unlike in the case of the BW-TGC.

Several replication possibilities are currently being considered:

**1:7 fan-out with active fan-outs:** slightly higher latency due to optical-electrical-optical conversion and additional fiber length

**1:7 fan-out with passive optical splitter:** rejected since signal loss is too high

**2×(1:4) passive optical splitter:** perhaps possible, depending on whether the transmitter optical power is sufficient. Would need to be checked and/or tested.

**4×(1:2) passive optical splitter:** OK, but only if the trigger processor has enough outputs. Requires eight output optical links.

**7 fibers directly from TP:** OK, but only if the trigger processor has enough outputs. Requires 14 output optical links.

At this point, both platform options should be able to provide 14 optical output links for this purpose. Pending a deeper evaluation, the baseline is that the TP will provide the seven streams (2 × 7 fibers) directly to the SL, without the need for a fan-out, as seen in Figure 6. Note however that neither the electro-optics for additional links nor fan-outs have been included in the NSW costing.
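The link counts quoted for the replication options can be reproduced with a short arithmetic sketch. It assumes that every stream leaving the TP occupies two fibers, as suggested by the "2 × 7 fibers" of the baseline, and that each stream is split passively 1:`split_ratio` downstream; the function name is illustrative only.

```python
import math

FIBERS_PER_STREAM = 2  # assumption taken from "seven streams (2 x 7 fibers)"
SL_BOARDS_MAX = 7      # maximum fan-out, needed for the NSW large sector


def tp_output_links(split_ratio):
    """Optical links the TP must drive if each stream is split 1:split_ratio."""
    streams_from_tp = math.ceil(SL_BOARDS_MAX / split_ratio)
    return streams_from_tp * FIBERS_PER_STREAM


for ratio in (1, 2, 4, 7):
    print(f"1:{ratio} split -> {tp_output_links(ratio)} TP output links")
```

Under these assumptions the direct option needs 14 links and the 4×(1:2) option needs eight, in agreement with the list above; the values for the 1:4 and 1:7 options follow from the same assumption and are not quoted in the text.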



Figure 6: Interface between the NSW trigger electronics and the Sector Logic. One NSW trigger sector will connect to up to seven SL boards via optical links.

It has also been proposed that splitting the fibers, at the NSW Trigger Processor, as belonging to different $\phi$ regions within a sector might reduce the need to fan out the output, without compromising the maximum number of candidates transmitted. This would however introduce undesirable hard boundaries into the SL. A third fiber from inner regions of the large sectors might also be used. More studies would be needed to explore these possibilities.

## 2.5 Ancillary Functions

In addition to implementing the trigger algorithms for the MM and sTGC systems, the NSW Trigger Processor has to perform several ancillary functions. These include time synchronization, configuration, monitoring and debugging mode, among others. Several functions are common to both MM and sTGC and their firmware can be shared as well-defined packages. These functions are:

### Interface for algorithm configuration parameters

Parameters for the algorithms must be stored at runtime. Examples are the $\Delta\theta$ and other cuts, the BCID offset, alignment parameters, the parts of the detector to be considered as disabled, road size, etc. Configuration can be done via Ethernet and the carrier board or via an E-link from FELIX that is available on the link that brings the TTC information. Readback of the parameters must also be provided.

### TTC interface

FELIX provides the following TTC information on an E-link:

Level-1 Accept, BCR, ECR, system[3..0], user[7] from the 8-bit TTC broadcast packet

The E-link provides the 40 MHz BC clock which is used to synchronize the output to the Sector Logic. Included here is logic to synchronize a local bunch-crossing ID to the Sector Logic by means of a configurable BC offset loaded on BCR into a local BCID register. Note that the various input links will not have the same phase and may not even be matched to the same BC clock. Input processors must ensure that all sources are aligned to the same BC clock. Both technologies include a BCID or its low bits in the packet sent on every bunch-crossing. Should the offset of this BCID from the local BCID differ from what is expected, an exception message (see below) must be sent. The firmware to do this alignment is not shared.
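The BCID handling described above can be sketched in Python as follows. This is a behavioural illustration under assumptions, not the firmware: the class and function names are hypothetical, the number of low bits carried in the packet is left as a parameter, and the counter is assumed to wrap at the 3564 bunch-crossing slots of an LHC orbit.

```python
BC_PER_ORBIT = 3564  # bunch-crossing slots per LHC orbit


class LocalBcid:
    """Local BCID register, reloaded with a configurable offset on BCR."""

    def __init__(self, bc_offset):
        self.bc_offset = bc_offset % BC_PER_ORBIT
        self.value = self.bc_offset

    def on_bcr(self):
        # Bunch Counter Reset: load the configured offset.
        self.value = self.bc_offset

    def on_clock(self):
        # One tick of the 40 MHz BC clock.
        self.value = (self.value + 1) % BC_PER_ORBIT


def bcid_mismatch(local_bcid, packet_bcid, n_bits, expected_offset):
    """Return True if the packet BCID low bits disagree with the expectation.

    A True result corresponds to the condition under which an exception
    message would be sent.
    """
    mask = (1 << n_bits) - 1
    expected = ((local_bcid + expected_offset) % BC_PER_ORBIT) & mask
    return (packet_bcid & mask) != expected
```

The expected value is computed modulo the orbit length before the low bits are taken, so that the comparison remains valid across the orbit wrap-around.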

### Level-1 output buffer

For bunch-crossings in which at least one segment has been found, the input data and the output segment data sent to the Sector Logic are stored, along with their BCID, for later matching to the BCID of a Level-1 Accept. Those bunch-crossings that have Level-1 Accepts (and possibly those preceding and following) are transferred to the Level-1 output buffer (aka derandomizer). The data must be stored for the duration of the Level-1 latency. The output bandwidth should be sufficient for the rather small fixed input and output data lengths at the full Level-1 rate of 400 kHz. If not, this logic could provide a BUSY output to the RODBUSY system when its output buffer becomes close to full.

### Monitored event buffer

A random sample of complete events is collected for sending to a monitoring process. Example selection criteria are: any event, events with at least one segment found, events with segments outside the $\Delta\theta$ cut, etc. The data buffered as one event include all the input data and the output segment data sent to the Sector Logic for a given BC.

### Statistics buffer

Statistics are continuously collected and periodically transferred to the Statistics buffer. Statistics include the number of bunch-crossings that have candidates that are not accepted by Level-1, their distribution in $R\phi$, the multiplicity of segments per bunch-crossing, etc.

### Exception buffer

In the course of processing, exceptional conditions may be found, usually due to corrupted data. A convenient way to handle these is to store an exception code and some context data into a buffer which will be passed to the monitoring PC via FELIX.

### Playback mode to fake input links

For development and testing we require that simulated data can be injected in place of the data received by the links to the Front End. One way to do this is via an E-link from FELIX that is available on the link that brings the TTC information. Full-speed testing.

### Segment output to Sector Logic and to the “other” detector

Segments are sent out either to the Sector Logic via the FPGA serializer or to the “other” detector via a parallel LVDS bus. The candidate packet to be sent to the Sector Logic must be prepared from the segments found. Clones must be made and the output links to the Sector Logic must be driven. If the segments found are to be sent to the “other” detector’s Trigger Processor, the segment data must be sequenced out onto the parallel LVDS bus.

### Merge buffers into the output GBT link to FELIX

The Level-1, Monitoring, Statistics and Exception buffers are merged, using different FELIX “stream-IDs”, onto a fiber link to FELIX. FELIX then routes them to the ROD and Monitoring PCs.

Since the MM uses the GBTx to transmit the data from the Front End and the sTGC uses native FPGA serializers, the link interface firmware cannot be shared. Note that the monitoring of board temperatures and voltages is done by the ATCA Shelf Manager using IPMI. The fiber plant is described in Appendix A.


## 3 Trigger Algorithms and Performance


### 3.1 Micromegas Trigger Algorithm

Two MM trigger algorithms are considered, the “MM Fitter algorithm” and the “MM Look-Up-Table algorithm”. They are described in detail below in Section 3.1.1 and Section 3.1.2 respectively. Comparing and contrasting these algorithms will allow the development of an optimized final algorithm.

The MM Fitter Algorithm is considered the baseline algorithm for now. It is currently the one that has been studied more extensively, including studies of the impact of misalignments on its efficiency and resolution. The algorithm is mostly complete and it has been implemented on an FPGA evaluation board. It is also the algorithm currently implemented in the ATLAS Trigger Simulation software.

The studies are however not complete since they don’t include a proper background simulation, and in particular, effects from coherent background. A more realistic implementation of the trigger simulation is also required. Work is on-going on the trigger and cavern background simulations. Once these are finalized, we plan to test both algorithms with the conditions expected at high luminosity. If the MM Look-Up-Table algorithm is shown to be more robust, changes to the final algorithm will be considered and implemented as necessary.

**3.1.1 MM Fitter Algorithm**

This section describes the algorithm introduced in Ref. [7] and its performance. It also includes details of its implementation on an evaluation board and measurements performed in that set-up, as well as recent studies of the impact of misalignments on the algorithm.

**3.1.1.1 Description**

This algorithm has been described in detail in Ref. [7]. Here, its main features are summarized. The algorithm has four functionally distinct sets of operations:

1. translation of hardware addresses into equivalent track slopes fixed to the IP,
2. determination of the presence of a multi-plane coincidence,
3. parallel calculation of global $\theta$ (azimuth of the track position at the entrance of the NSW) and local $\theta$ (direction, at the entrance of the NSW, referred to as $\theta_{rec}$ in Section 3.1.2) angles with parallel strips and global average stereo strips, using the multi-plane coincidence,
4. calculation of $\Delta\theta$, global $\theta$ (referred to as $\theta$ in what follows) and $\phi$.

The first two items are performed by many *finders*, which do not consume a significant amount of resources but significantly reduce the throughput towards the second half of the algorithm. Items 3 and 4 are performed by *fitters*, which consume most of the resources allocated to the algorithm but are only run in the presence of a solid track candidate. Figure 7 shows the algorithmic flow with smaller functional units. They are labeled for easy reference when discussing the algorithm implementation.

**3.1.1.2 Implementation**

A 1/16<sup>th</sup>-sector-wide slice of the full algorithm above has been implemented and the design is being tested using a Xilinx VC707 Development board. This board includes a Virtex XC7VX485T FPGA. The implementation includes two ADDC GBT interfaces and the associated trigger processor algorithm. Extrapolating from this implementation, the resources are estimated to be ~70% of the Xilinx V7485 chip. There are pin-compatible upgrades to the target chip if more resources are needed. Specifically, each of the steps shown in Figure 7 has been implemented as follows.

- (A) Incoming strip hit addresses are converted to global slope values using a multiplication with a constant. A strip's stored slope value is defined as the orthogonal distance between a given strip and the beam line divided by the $z$ location of the relevant detection plane. It is precomputed taking into account a strip offset and a $z$ position stored for each of the 8 planes and 16 radial segments of each wedge (one segment is read out per MMFE8).
- (B) Hit slope values are stored in a circular buffer defined as $(N \text{ slope-roads}) \times (8 \text{ planes}) \times (T)$, where $T$ is the cyclical buffer depth and corresponds to the number of bunch crossings over which coincidences between planes are allowed. A track candidate is identified once a minimum hit threshold is met. The value of $N$ has been optimized to maximize efficiency while being resilient to backgrounds [7], and corresponds to about 4 (56) strips per slope road for horizontal (stereo) strips. Hits are kept in the buffer for a fixed number of bunch crossings that is configurable. For the studies in this document, two bunch crossings are used.




Figure 7: The block diagram is constructed with time flowing downward; therefore tasks on the same horizontal line are accomplished in parallel. Blocks correspond to operations comprising the algorithm, solid flow lines represent the flow of data, and light dotted lines represent fit abandonment signals, which can be triggered at multiple points throughout the algorithm. X in this diagram refers to horizontal strips, while U and V refer to the two sets of stereo strips (with a  $+1.5^\circ$  and  $-1.5^\circ$  stereo tilt respectively). Blocks after step D are approximately sized to represent their relative processing times.

- (C) Each slope-road of the buffer is checked once per bunch crossing to determine if a coincidence threshold has been met. Coincidence requires a minimum number of planes to be hit and the oldest piece of data to be timing out. Coincidence identification is accomplished using the binary hit configuration for a given slope-road as the address of a lookup table that is pre-populated with pass or no-pass signals for various hit configurations of the eight planes. Rather than searching the entire buffer for hits, only active areas are interrogated for coincidence verification.

- (D) Slope-road contents containing the track candidate are read and cleared from the buffer and relevant track components are forwarded for processing.
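Steps B to D can be illustrated with a simplified behavioural model in Python. It is a sketch under assumptions rather than a description of the firmware: the class name, the road width and the look-up table content (pass if at least `min_planes` planes are hit) are hypothetical, one hit is kept per plane and road, and a road is cleared as a whole when its oldest hit times out.

```python
import math

N_PLANES = 8


def build_coincidence_lut(min_planes):
    """256-entry table addressed by the 8-bit plane hit pattern (step C)."""
    return [bin(pattern).count("1") >= min_planes
            for pattern in range(1 << N_PLANES)]


class SlopeRoadBuffer:
    def __init__(self, n_roads, road_width, window_bc, min_planes):
        self.road_width = road_width
        self.window_bc = window_bc  # T: bunch crossings a hit stays valid
        self.lut = build_coincidence_lut(min_planes)
        # hits[road][plane] holds (slope, bc) or None
        self.hits = [[None] * N_PLANES for _ in range(n_roads)]

    def add_hit(self, plane, slope, bc):
        """Step B: store a hit slope in its slope-road."""
        road = math.floor(slope / self.road_width)
        if 0 <= road < len(self.hits) and self.hits[road][plane] is None:
            self.hits[road][plane] = (slope, bc)

    def tick(self, bc):
        """Steps C and D: called once per bunch crossing."""
        candidates = []
        for road, planes in enumerate(self.hits):
            stored = [h for h in planes if h is not None]
            if not stored:
                continue  # inactive road, nothing to check
            oldest_bc = min(h[1] for h in stored)
            if bc - oldest_bc < self.window_bc - 1:
                continue  # oldest hit is not timing out yet
            pattern = sum(1 << p for p, h in enumerate(planes) if h is not None)
            if self.lut[pattern]:
                candidates.append((road, planes))
            self.hits[road] = [None] * N_PLANES  # read and clear the road
        return candidates
```

With `window_bc=2`, hits arriving in two consecutive bunch crossings can contribute to the same candidate, matching the two-bunch-crossing window used for the studies in this document.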

Once a candidate track is identified, the following steps (E-I) are completed in parallel:

- (E) A local slope is calculated using a least squares fit of available horizontal-strip hits in the proposed track. Several constants are stored in a look-up table for the 11 possible combinations (indexed by $k = \{1..11\}$) of $n = \{2, 3, 4\}$ horizontal hits to speed up the fit ($M_X^{\text{local}}$).

- (F) A global horizontal hit slope, which is anchored to the IP, is calculated as the average of registered $n = \{2, 3, 4\}$ horizontal-strip hits in the proposed track candidate ($M_X^{\text{global}}$).

- (G) A global stereo (U) hit slope, which is anchored to the IP, is calculated as the average of registered $n = \{1, 2\}$ U hits in the proposed track candidate.

- (H) A global stereo (V) hit slope, which is anchored to the IP, is calculated as the average of registered $n = \{1, 2\}$ V hits in the proposed track candidate.

- (I) Stereo-strip background hits are further filtered from proposed tracks by judging how correlated two stereo-strip hits are with one another. In particular, strips with the same stereo tilt are compared between the two multiplets. If they are consistent between the two multiplets, the pair is kept, while otherwise it is discarded. If only one multiplet registers a hit on a strip of a given stereo tilt, it is kept.

- (J) $\Delta\theta$ is calculated using previously fitted local and global horizontal slopes ($M_X^{\text{local}}$ and $M_X^{\text{global}}$). This calculation is accomplished using a small $\phi$ angle approximation. In local coordinates, for which $\phi = 0$ at the middle of the module, this approximation introduces at most a 4% bias in the $\Delta\theta$ calculation. In order to speed up the calculation, the quantity $1/(1 + M_X^{\text{local}} M_X^{\text{global}})$ is calculated using a reciprocal look-up table that introduces a negligible error. Tracks with negative local slopes (originating from outside the detector) are rejected at this step. Candidates with $\Delta\theta > 15$ mrad are also rejected at this step.

- (K) A $\theta$ and a $\phi$ are calculated using previously calculated stereo and horizontal slopes. In particular, if hits exist for both stereo tilts, the cartesian position along the horizontal strip direction is calculated using the two stereo hit slopes (U and V). If only one exists, the intersection point of the stereo strip and the horizontal strip is calculated. This requires the storage of two quantities, $A \equiv \csc(1.5^\circ)$ and $B \equiv \cot(1.5^\circ)$, two products and an addition. The horizontal slope provides the other cartesian coordinate. The two cartesian coordinates are transformed into $\theta$ and $\phi$ using a look-up table. If the two cartesian coordinates do not correspond to a $\theta$ and $\phi$ in the wedge (which can happen in cases with significant background contamination), the candidate is rejected.

- (L) A $\Delta\theta$, $\theta$ and $\phi$ are offered as a trigger signal.
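Steps E, F and J can be sketched numerically as follows. The $z$ positions of the four horizontal-strip planes are hypothetical, and the form of the stored constants (one weight per hit and per combination) is only one possible choice. The numerator of `delta_theta` is an assumption based on the tangent-difference identity, since the text only specifies the factor $1/(1 + M_X^{\text{local}} M_X^{\text{global}})$; the sketch also divides directly instead of using a reciprocal look-up table.

```python
from itertools import combinations

# Hypothetical z positions (mm) of the four horizontal-strip (X) planes.
Z_X_PLANES = (7000.0, 7011.0, 7200.0, 7211.0)


def build_fit_constants(z_planes):
    """For each hit combination, weights w_i such that slope = sum(w_i * y_i)."""
    table = {}
    for n in (2, 3, 4):
        for planes in combinations(range(len(z_planes)), n):
            z = [z_planes[p] for p in planes]
            z_mean = sum(z) / n
            norm = sum((zi - z_mean) ** 2 for zi in z)
            table[planes] = [(zi - z_mean) / norm for zi in z]
    return table


FIT_CONSTANTS = build_fit_constants(Z_X_PLANES)  # 11 combinations


def local_slope(hits):
    """Step E: least-squares dy/dz from {plane index: global hit slope}."""
    planes = tuple(sorted(hits))
    weights = FIT_CONSTANTS[planes]
    # Convert each IP-anchored slope back to a radial position y = slope * z.
    y = [hits[p] * Z_X_PLANES[p] for p in planes]
    return sum(w * yi for w, yi in zip(weights, y))


def global_slope(hits):
    """Step F: average of the IP-anchored hit slopes."""
    return sum(hits.values()) / len(hits)


def delta_theta(m_local, m_global):
    """Step J, assuming tan(delta_theta) ~ delta_theta (result in radians)."""
    return (m_local - m_global) / (1.0 + m_local * m_global)
```

For a track pointing back to the IP all hit slopes are equal, so the local and global slopes coincide and `delta_theta` vanishes.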

Timing estimates for these steps have been performed using the evaluation board described at the beginning of this section. These estimates do not include the last look-up table to go from cartesian to cylindrical coordinates in step K. The trigger algorithm's longest path is $\approx 56$ ns, assuming all necessary hits arrive promptly and track fitting begins immediately. The splitting of the latency at each step of the algorithm in clock ticks is summarized in Figure 8. The last look-up table should only increase this latency by 1 clock tick, that is to 59 ns. Therefore, the latency of the algorithm is slightly above two bunch crossings.

Figure 8: Latency of the Harvard algorithm in clock ticks in each major step of the algorithm.

**3.1.1.3 Misalignment Configurations and Corrections**

The performance in Section 3.1.1.4 is evaluated for ideal conditions and for samples simulated with several misalignment effects. This section describes the types of misalignment effects considered, how they are simulated and the implementation of misalignment corrections in the FPGA algorithm. All misalignments considered refer to misalignments of one multiplet with respect to the other in the same sector. Misalignments affecting the full sector are not addressed, because they only affect the $\theta$ and $\phi$ determination, which requires less precision. However, correction techniques similar to those explored here can be applied to full sector position inaccuracies.

The following misalignment configurations are considered:

1. $r$, displacements orthogonal to the beam axis: one or part of a multiplet shifted up or down with respect to the other,
2. $\phi$, rotations along an axis perpendicular to the wedge of one multiplet with respect to the other, with the axis going through the center of the lower edge of the chamber,
3. $\theta_{\text{tilt}}$, one plane tilted towards the IP with respect to the other,
4. $z$, one multiplet displaced along the beam axis.

Figure 9 illustrates each of these cases. Figures 9 (a) and (b) correspond to two cases of the first misalignment type considered above.



Figure 9: Illustration of the position of the different multiplets in the different misalignment configurations considered in this section. In all cases except (c), the IP is to the left and the image shows the $r$-$z$ plane passing through the center of the wedge. In (c), the $r$-$\phi$ plane is shown as it would look from the IP.


These misalignments are simulated through the use of true hits in the Athena simulation. The true hit position that causes the trigger to fire is known. It is thus trivial to overlay a misaligned geometry over the true hits and recalculate the strips that were hit. Misalignments corresponding to up to 5 mm shifts in the relative positions of the two wedges are considered. For the $\phi$ rotations and the $\theta_{\text{tilt}}$ this corresponds to up to 1.36 mrad.

The performance of the trigger algorithm is studied with and without misalignments. In addition, corrections have been implemented for all cases illustrated in Figure 9 except case (c). The corrections are implemented in steps A and E of the algorithm, described in Section 3.1.1.2, without any additional resource overhead. In particular, as already detailed, $2 \times 8 \times 16$ constants, corresponding to 8 planes and 16 MMFEs, store the radial offset and $z$ position of the corresponding strips. These constants can be updated to perfectly counteract the effects of the cases illustrated above in (a), (b) and (e). Case (d) can also be corrected with some loss of accuracy, since only 16 $z$ positions are stored along the tilted plane. The $z$ positions stored can be optimized, but for the studies in this document the middle $z$ position of the strips read out by each MMFE is used. The loss of accuracy from these simplified corrections is quantified in the next section. Corrections for case (c) have not been studied yet, but the effects of such a misalignment have been studied and are shown in the next section.
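The way such constants can absorb a radial or $z$ displacement is illustrated by the Python sketch below. The strip pitch, the number of strips per MMFE8, the nominal geometry and all function names are hypothetical; only the table dimensions (8 planes, 16 segments, two constants each) are taken from the text. The sketch computes the slope as a radial position divided by $z$, which is equivalent to a multiplication by a stored constant plus a stored offset term.

```python
N_PLANES, N_SEGMENTS = 8, 16
STRIPS_PER_SEGMENT = 512  # hypothetical: strips read out by one MMFE8
STRIP_PITCH_MM = 0.45     # hypothetical strip pitch


def nominal_constants(r0_mm, z_planes_mm):
    """Return (offset, z) tables indexed as [plane][segment]."""
    offset = [[r0_mm + seg * STRIPS_PER_SEGMENT * STRIP_PITCH_MM
               for seg in range(N_SEGMENTS)] for _ in range(N_PLANES)]
    z = [[z_planes_mm[p]] * N_SEGMENTS for p in range(N_PLANES)]
    return offset, z


def strip_slope(offset, z, plane, segment, strip):
    """Step A: global slope of a strip from the stored constants."""
    r = offset[plane][segment] + strip * STRIP_PITCH_MM
    return r / z[plane][segment]


def apply_shift(offset, z, planes, dr_mm=0.0, dz_mm=0.0):
    """Update the constants of the given planes for a measured displacement."""
    for p in planes:
        for seg in range(N_SEGMENTS):
            offset[p][seg] += dr_mm
            z[p][seg] += dz_mm
```

A tilt as in case (d) would instead assign a different $z$ correction to each of the 16 segments of a plane, which is why that correction is only approximate.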

**3.1.1.4 Performance**

The performance of the algorithm is studied using single muon events generated with Athena full simulation. A geometry with two equal quadruplets, each with two horizontal strips and two different-tilt stereo strips (xxuv), was used. Most of the studies presented here use a geometry in which the quadruplets are placed with horizontal strips closer to the IP (xxuv xxuv).

The samples were generated using muons originating at the IP. However, additional samples with typical parameters of the ATLAS beamspot during Run 1, in which the origin of the muons follows Gaussian distributions of mean $(x, y, z) = (0.05, 0.06, -1)$ mm and width $(\sigma_x, \sigma_y, \sigma_z) = (0.01, 0.01, 70)$ mm, have also been checked. The samples with a realistic beamspot (not shown here) show a degradation of the $\theta$ resolution of about a factor of two, but no noticeable effects on the $\Delta\theta$ or $\phi$ resolutions or the trigger efficiency. Two samples using muons with energies of 200 and 1000 GeV have been studied. Each sample has 20000 events with muons pointing to one large sector of the NSW. Of the digitized hits found in the simulation, the first registered signal per channel was chosen for trigger construction. A function mimicking a VMM chip's 100 ns deadtime and coverage of 64 strips was also applied to the data. This function guarantees that a maximum of one hit, true or background, is registered for each group of 64 strips in each event.
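A minimal sketch of such a VMM-mimicking selection is given below. The hit tuple layout and the function name are assumptions; the only properties taken from the text are the grouping of 64 strips per VMM and the choice of the earliest hit in each group.

```python
VMM_CHANNELS = 64  # strips covered by one VMM chip


def vmm_first_hit(hits):
    """Keep only the earliest hit per VMM.

    hits: iterable of (plane, strip, time_ns, is_background) tuples
    (hypothetical layout). Returns the surviving hits, sorted.
    """
    earliest = {}
    for hit in hits:
        plane, strip, time_ns, _ = hit
        key = (plane, strip // VMM_CHANNELS)
        if key not in earliest or time_ns < earliest[key][2]:
            earliest[key] = hit
    return sorted(earliest.values())


hits = [
    (0, 10, 12.0, False),  # true hit
    (0, 40, 5.0, True),    # earlier background hit on the same VMM
    (0, 70, 20.0, False),  # next VMM on the same plane
    (1, 10, 30.0, False),  # different plane
]
print(vmm_first_hit(hits))
```

In this example the background hit on strip 40 arrives first and therefore masks the true hit on strip 10, which is read out by the same VMM.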

Three sets of studies complement the performance studies in ideal conditions:

- a study of the effects of incoherent background,
- a study of the impact of misalignments and the corrections designed to mitigate them,
- a study of the performance with quadruplets with mirror-image placement (xxuv uvxx), proposed recently to increase the lever-arm for fitting the bending coordinate using horizontal (x) strips.

The following variables are used to parameterize performance. Fit efficiency is defined as the efficiency for a track to trigger the detector given that a minimum number of trigger hits exist before background is included. In particular, n-horizontal/n-stereo refers to events for which there were at least n hits in the horizontal strips and the same number in the stereo strips before the addition of background hits. Distributions are calculated with respect to truth definitions. In particular, the distributions of $\theta^{\text{fit}} - \theta^{\text{true}}$, $\phi^{\text{fit}} - \phi^{\text{true}}$, and $\Delta\theta^{\text{fit}} - \Delta\theta^{\text{true}}$, for which the true coordinates and direction are defined at the entrance of the NSW, are studied. The distributions are fitted with Gaussians using a one-step recursive fit. The raw results are first fitted to a Gaussian. The quoted resolutions are then obtained through a second fit in the range $[\mu_1 - 3\sigma_1, \mu_1 + 3\sigma_1]$, where $\mu_1$ and $\sigma_1$ are the mean and standard deviation of the first fit. Tails are defined as the fraction of events outside the range $[\mu_1 - 3\sigma_1, \mu_1 + 3\sigma_1]$.
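The one-step recursive procedure and the tail definition can be sketched as follows. As a simplifying assumption, the sample mean and standard deviation are used in place of a Gaussian fit to a histogram, so the numbers produced by this sketch are not expected to reproduce the quoted resolutions exactly; the function name is illustrative.

```python
import statistics


def one_step_recursive_fit(residuals, n_sigma=3.0):
    """Return (mu, sigma, tail_fraction) for a list of fit-minus-true values."""
    # First pass over the raw distribution.
    mu1 = statistics.fmean(residuals)
    sigma1 = statistics.pstdev(residuals)
    lo, hi = mu1 - n_sigma * sigma1, mu1 + n_sigma * sigma1
    # Second pass restricted to [mu1 - 3 sigma1, mu1 + 3 sigma1].
    core = [x for x in residuals if lo <= x <= hi]
    tail_fraction = 1.0 - len(core) / len(residuals)
    return statistics.fmean(core), statistics.pstdev(core), tail_fraction
```

The returned `sigma` plays the role of the quoted resolution and `tail_fraction` that of the tails quoted in parentheses in the tables.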

Sample distributions, integrated over all $\eta$, are shown for the $\phi$, $\theta$ and $\Delta\theta$ reconstruction in Figures 10, 11 and 12.



Figure 10: Distribution of reconstructed  $\phi$  minus true  $\phi$  value of the track at the entrance of the NSW for muons of  $E = 200 \text{ GeV}$  (a) and  $E = 1 \text{ TeV}$  (b). The xxuv xxuv configuration without background is used.



Figure 11: Distribution of reconstructed global  $\theta$  minus true global  $\theta$  value of the track at the entrance of the NSW for muons of  $E = 200 \text{ GeV}$  (a) and  $E = 1 \text{ TeV}$  (b). The xxuv xxuv configuration without background is used.



Figure 12: Distribution of reconstructed  $\Delta\theta$  minus true  $\Delta\theta$  value of the track at the entrance of the NSW for muons of  $E = 200$  GeV (a) and  $E = 1$  TeV (b). The xxuv xxuv configuration without background is used.

The distributions do not change significantly with energy, except for an increase in the tails for the highest energy studied. This is caused by an increase in events with a large number of secondary particles, which throw off the fit when they occur.

Figure 13 (a) shows the efficiency of the algorithm for events for which truth hits are found for different thresholds. The $E = 200$ GeV sample and an ideal geometry without background are used. The efficiency increases with the number of hits, as expected, because of the redundancy added by the additional hits, which makes the fit more likely to succeed even if some hits are not usable. The efficiency of the algorithm is very close to 100%. Figure 13 (b) shows the true efficiency of each category, where the denominator includes all muons at the entrance of the spectrometer. In this case the x-axis should be interpreted as an inclusive axis (the 2X, 1U or 1V case refers to tracks with at least that many hits, so it includes all events falling in the other bins). This figure thus includes the algorithmic efficiency in Figure 13 (a) and the detector efficiency. It also quantifies the fraction of tracks that satisfy a given coincidence threshold requirement. Finally, Figure 13 (c) shows the efficiency with the same definition as was used in Figure 23 for easy comparison. It should be noted that the minimum fit requirement in Figure 13 (c) is two horizontal strips and one stereo strip. This is a rather loose requirement, so it should be no surprise that the efficiencies are higher than for Figure 23. However, tighter requirements can also be observed in the figure as the x-axis increases.

The $\phi^{\text{fit}}$, $\theta^{\text{fit}}$ and $\Delta\theta^{\text{fit}}$ resolutions for the ideal geometry for $E = 200$ GeV muons without backgrounds, obtained from a fit, are summarized in Table 5.

### Effects from incoherent background

The impact of backgrounds is studied in the geometry with perfect alignment. Incoherent background extrapolated from measurements performed in Run 1 [8] is added outside of Athena to each generated event as hits after the digitization step. Background hits are uniformly distributed in $\phi$, and in time across two bunch crossings (50 ns). Once generated for each event, the background hits are combined with true event hits and a VMM-mimicking function is called by the simulation to choose only the earliest arrival hit for each VMM chip, which allows for the background to sometimes mask a true trigger hit.

Figure 13: Algorithmic efficiency (a) showing for different coincidence thresholds (CT) at the truth trigger digit level, the fraction of tracks that are fit. Total efficiency (b) showing for different coincidence thresholds at the reconstructed level, the fraction of total muons incident in the NSW surface that are fit with those thresholds. The efficiency for different thresholds given that at least six hits are found at the truth level, as defined in Figure 23, is shown in (c).

Table 5: Efficiency and resolution (mrad) for muons of $E = 200$ GeV for different coincidence thresholds without background.

| Track Type            | Fit Efficiency | $\sigma(\theta^{\text{fit}})$ (tails) | $\sigma(\phi^{\text{fit}})$ (tails) | $\sigma(\Delta\theta^{\text{fit}})$ (tails) |
|-----------------------|----------------|---------------------------------------|-------------------------------------|---------------------------------------------|
| 4-horizontal/4-stereo | 99.8%          | 0.27 (0.83%)                          | 2.3 (1.5%)                          | 1.6 (0.65%)                                 |
| 3-horizontal/3-stereo | 99.5%          | 0.28 (0.89%)                          | 3.0 (0.97%)                         | 1.7 (0.47%)                                 |
| 2-horizontal/2-stereo | 99.0%          | 0.29 (0.97%)                          | 3.2 (1.1%)                          | 1.9 (1.0%)                                  |

The $\phi^{\text{fit}}$, $\theta^{\text{fit}}$ and $\Delta\theta^{\text{fit}}$ resolutions for the ideal geometry for $E = 200$ GeV muons with backgrounds are summarized in Table 6.

Table 6: Efficiency and resolution (mrad) for muons of  $E = 200 \text{ GeV}$  for different coincidence thresholds with background.

| Track Type            | Fit Efficiency | $\sigma(\theta^{\text{fit}})$ (tails) | $\sigma(\phi^{\text{fit}})$ (tails) | $\sigma(\Delta\theta^{\text{fit}})$ (tails) |
|-----------------------|----------------|---------------------------------------|-------------------------------------|---------------------------------------------|
| 4-horizontal/4-stereo | 99.3%          | 0.30 (1.4%)                           | 4.2 (1.5%)                          | 1.7 (0.37%)                                 |
| 3-horizontal/3-stereo | 99.0%          | 0.31 (1.9%)                           | 4.8 (2.1%)                          | 2.0 (0.58%)                                 |
| 2-horizontal/2-stereo | 98.2%          | 0.32 (2.0%)                           | 5.1 (2.2%)                          | 2.3 (0.82%)                                 |

### Effects from misalignments within the NSW

For the misalignments studied in this section, no changes in efficiency or in the resolution of $\theta$ and $\phi$ have been observed. However, a degradation in the resolution of the $\Delta\theta$ reconstruction has been observed.

Figure 14 shows the relative change in $\Delta\theta^{\text{fit}}$ resolution as a function of misalignment for the different misalignment configurations illustrated in Figure 9.

For displacements, the degradation shows a certain periodicity related to the strip width. This is clear in Figures 14 (a) and (b). The degradation is higher when only half of the detector is radially shifted (b) because a double-peak structure is formed, which increases the peak width more significantly. Rotations (c) and (d) cause monotonic changes in performance. The degradation is more significant when the tilt is away from the IP (positive axis) because more strips are forced to cover the same solid angle, increasing the effective distance between the strip that should have been hit and the strip that is actually hit. Figure 14 (e) shows a fixed level of degradation at lower levels of misalignment, again caused by the discrete strip width. At around 1 mm a two-peak structure appears in the distribution, which causes an increase in the RMS. As the distribution gets further smeared the double-peak structure disappears, yielding a monotonically increasing degradation. Additional results can be found in Ref. [9].

Corrections have been developed for all cases except (c). With these corrections only case (c) remains relevant, and a degradation of up to 25% in the $\Delta\theta^{\text{fit}}$ resolution (from 1.7 mrad to 2.1 mrad in the 4-horizontal/4-stereo case) is expected for 5 mm misalignments.

### Performance with the xxuv uvxx configuration

Recently, it has been decided to build sectors with the mirror-image placement (resulting in an xxuv uvxx horizontal/stereo strip configuration). New trigger simulations have not yet been produced for this configuration. However, estimates of the improvement in the $\Delta\theta^{\text{fit}}$ resolution have been obtained [10].

The estimate has been performed using a toy simulation of the trigger algorithm, which does not include any ionization or scattering effects. This simulation only demonstrates resolution effects arising from the width of the strips. Based on this simulation, an ideal geometrical $\Delta\theta^{\text{fit}}$ resolution of 0.89 mrad has been estimated for the 4-horizontal/4-stereo hit category in the standard geometry. This number can



Figure 14: Relative change in  $\Delta\theta^{\text{fit}}$  resolution as a function of misalignment for different misalignment configurations. Figures (a)-(e) are laid out to match the corresponding misalignment as illustrated in Figure 9.

be used together with the resolution obtained in the full simulation (and shown in Table 5) to estimate the impact of different physics effects (ionization, multiple scattering, etc.) on the resolution. In particular, using $\sigma_{\text{athena}}^2 = \sigma_{\text{physics}}^2 + \sigma_{\text{geometry}}^2$, one can estimate $\sigma_{\text{physics}}$. The toy simulation can then be used to calculate the new $\sigma_{\text{geometry}}^{\text{new}} = 0.79$ mrad for the mirror-image geometry. This, combined with $\sigma_{\text{physics}}$, provides the estimate of the quantity of interest, $\sigma_{\text{athena}}^{\text{new}}$. Using this procedure, one estimates that with the new geometry the ideal $\Delta\theta^{\text{fit}}$ resolution for the 4-horizontal/4-stereo hit category improves to 1.65 mrad from 1.7 mrad, or by less than 5%, in the case with backgrounds. These studies are reported in more detail in Ref. [10] and need to be repeated using the ATLAS full simulation.
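The quadrature procedure described above can be reproduced with a few lines of arithmetic. The following is a minimal sketch using only the numbers quoted in the text; the function names are illustrative, and the toy-simulation value of 0.79 is assumed to be in mrad.

```python
import math

def physics_term(sigma_full: float, sigma_geometry: float) -> float:
    """Remove the geometric term in quadrature from the full-simulation resolution."""
    return math.sqrt(sigma_full**2 - sigma_geometry**2)

def combine(sigma_physics: float, sigma_geometry: float) -> float:
    """Add the physics and geometry terms in quadrature."""
    return math.sqrt(sigma_physics**2 + sigma_geometry**2)

sigma_full = 1.7        # mrad, full simulation, 4-horizontal/4-stereo, with background
sigma_geom_std = 0.89   # mrad, toy simulation, standard geometry
sigma_geom_new = 0.79   # mrad (assumed unit), toy simulation, mirror-image geometry

sigma_phys = physics_term(sigma_full, sigma_geom_std)
sigma_new = combine(sigma_phys, sigma_geom_new)
improvement = 1.0 - sigma_new / sigma_full

print(f"sigma_physics = {sigma_phys:.2f} mrad, new resolution = {sigma_new:.2f} mrad")
```

With these inputs the physics term is about 1.45 mrad and the new resolution about 1.65 mrad, a relative improvement of roughly 3%, consistent with the "less than 5%" quoted in the text.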

### 3.1.2 MM Look-Up-Table Algorithm

#### 3.1.2.1 Principle of the algorithm

The eight MM layers that form a large sector are numbered from 1 to 8 (1, 2, 5 and 6 being the X layers, 3 and 7 the U layers, and 4 and 8 the V layers) and grouped in pairs: (1,5),(2,6),(3,7),(4,8),(1,6) and (2,5), so six pairs in total. When a muon passes through the detector, hits are created on each of the layers. For each considered pair, the first strips that triggered for both layers of the pair are used to calculate the corresponding slope with the formula:

$$\text{slope}_{\text{pair}} = \frac{y_{\text{layer}2} - y_{\text{layer}1}}{z_{\text{layer}2} - z_{\text{layer}1}}$$

where $y_{\text{layer}1,2}$ and $z_{\text{layer}1,2}$ are the strip positions in the ($y,z$) plane. Figure 15 illustrates the pair selection and the angle calculation with respect to the interaction point (IP). The distance between the layers is 126 mm for pairs (1,5), (2,6), (3,7) and (4,8), 137 mm for pair (1,6) and 115 mm for pair (2,5), as shown in Figure 16.



Figure 15: Illustration of pair selection and angle calculation with respect to the interaction point.

The slopes calculated in this way are then compared to each other as shown in Figure 16. If a certain number of them (above some threshold value defined in the selection logic) are equal within a given precision, they are considered as forming a unique track corresponding to one muon and are retained for further processing. All these calculations are done locally; the condition of a pointing track (i.e. coming from the interaction point) is then added in order to determine the Regions of Interest (RoI).
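The pair-slope calculation and the coincidence requirement can be sketched as follows. This is an illustration only, not the firmware logic: the function names, the tolerance and the threshold are hypothetical, the layer $z$ offsets are chosen merely to reproduce the 126, 137 and 115 mm spacings quoted above, and the stereo rotation of the U and V layers is ignored.

```python
# The six layer pairs named in the text (1-based layer numbers).
PAIRS = [(1, 5), (2, 6), (3, 7), (4, 8), (1, 6), (2, 5)]

def pair_slope(y1, z1, y2, z2):
    """Slope of one pair: (y_layer2 - y_layer1) / (z_layer2 - z_layer1)."""
    return (y2 - y1) / (z2 - z1)

def pair_slopes(hits):
    """hits maps layer -> (y, z) in mm of the first strip that fired.
    Pairs with a missing layer cannot be built and are skipped."""
    slopes = {}
    for a, b in PAIRS:
        if a in hits and b in hits:
            (y1, z1), (y2, z2) = hits[a], hits[b]
            slopes[(a, b)] = pair_slope(y1, z1, y2, z2)
    return slopes

def largest_compatible_group(slopes, tolerance):
    """Size of the largest set of slopes lying within `tolerance` of each other."""
    values = sorted(slopes.values())
    best = 0
    for i, low in enumerate(values):
        best = max(best, sum(1 for v in values[i:] if v - low <= tolerance))
    return best

def is_track(hits, threshold, tolerance):
    return largest_compatible_group(pair_slopes(hits), tolerance) >= threshold

# Hypothetical layer positions (mm): only the spacings are taken from the text.
Z0 = 7000.0
DZ = {1: 0.0, 2: 11.0, 3: 22.0, 4: 33.0, 5: 126.0, 6: 137.0, 7: 148.0, 8: 159.0}
hits = {layer: (0.3 * (Z0 + dz), Z0 + dz) for layer, dz in DZ.items()}

print(is_track(hits, threshold=4, tolerance=1e-3))
```

Dropping layer 1 from `hits` removes pairs (1,5) and (1,6), leaving four slopes; this is the loss of information per missing hit that the pair method implies.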

In this study, the detector is segmented into several panel regions, which originally corresponded to the segmentation into four chambers. With the present design, with only two chambers, the panels correspond to



Figure 16: Representation of pairs formed by the algorithm (top) and comparison between the slopes (bottom).

the logical segmentations (for instance, two or more panel regions per chamber). Figure 17 shows the electronic implementation of the trigger for two bunch crossings (BC) for one panel region. In order to allow for noise, multiple hits and multiple muons in a given panel region, up to 8 hits per layer and per BC can be considered by the algorithm. Hence, for each layer pair, a maximum of $8 \times 8 = 64$ slopes is calculated. For 2 BC there are 4 sets of 6 layer pairs, with 64 slope calculations for each pair. The algorithm described above intervenes at the “SLOPE SELECT” step in Figure 17, pre-selecting the track candidate for further processing. Performance and latency are also given in Figure 17, assuming a 320 MHz clock frequency.
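The counting in the paragraph above can be checked directly. The sketch below assumes that the "4 sets" are the four combinations of bunch crossing for the two layers of a pair; the constant names are illustrative.

```python
from itertools import product

MAX_HITS_PER_LAYER_PER_BC = 8
N_LAYER_PAIRS = 6
BUNCH_CROSSINGS = (0, 1)  # two consecutive bunch crossings

# Every hit on the first layer is combined with every hit on the second layer.
slopes_per_pair = MAX_HITS_PER_LAYER_PER_BC ** 2

# (BC of the first layer, BC of the second layer): 2 x 2 combinations.
bc_combinations = list(product(BUNCH_CROSSINGS, repeat=2))

total_slopes = len(bc_combinations) * N_LAYER_PAIRS * slopes_per_pair
print(slopes_per_pair, len(bc_combinations), total_slopes)  # 64 4 1536
```

Under that assumption, one panel region requires up to 1536 slope calculations for two bunch crossings.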

#### 3.1.2.2 Algorithm Implementation

The speed of the algorithm is essential to meet the requirement on the trigger response time: a decision has to be made in less than 100 ns. To limit the number of calculations, a Look Up Table (LUT) solution has been adopted. Its operating principle is as follows. For each layer pair, a panel region corresponds to a given range of aperture angles from the IP, as illustrated in Figure 18. These aperture angles are delimited by the borders between the adjacent panel regions. Based on simple geometrical considerations, it is possible to store all the possible values of the slopes (difference between strip numbers of the two layers) that a pointing track can have in a given panel region



Figure 17: Electronic implementation for one panel region.

with a certain granularity in a 32-bit register. Then, for each layer pair, a Look Up Table is used to check whether each of the 64 slopes is within the range of allowed values for this panel, as presented in Figure 19. Each time a match is found, the bit corresponding to the slope value is set in a 32-bit output register. Bit 0 is set when there is no match. For each layer pair, there are four such registers corresponding to the four possible BC combinations.

In the following step, “Slope Select Logic” in Figure 17, the registers corresponding to the six layer pairs are compared. If there is a minimum number of compatible values within a given precision (for example within two slope bits), the combination is considered valid and is kept for further processing. The track candidate slope is then determined by the average slope of the accepted segments. A track constraint to the IP could be implemented in the future.
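The register filling and the slope-select comparison can be modeled in software as below. This is a hedged sketch, not the firmware: the linear binning of strip-number differences into bits 1 to 31, the function names, and the reading of "within two slope bits" as a window of ±2 bits are all assumptions.

```python
N_BITS = 32

def build_lut(d_min, d_max):
    """Map each allowed strip-number difference to a bit index 1..31
    (hypothetical linear binning; bit 0 is reserved for 'no match')."""
    span = d_max - d_min + 1
    return {d: 1 + (d - d_min) * (N_BITS - 1) // span for d in range(d_min, d_max + 1)}

def fill_register(differences, lut):
    """Set one bit per slope; slopes outside the LUT set bit 0."""
    reg = 0
    for d in differences:
        reg |= 1 << lut.get(d, 0)
    return reg

def slope_select(registers, min_pairs, window=2):
    """Bit indices around which at least `min_pairs` registers have a bit set
    within +/- `window` bits."""
    candidates = []
    for bit in range(1, N_BITS):
        lo, hi = max(1, bit - window), min(N_BITS - 1, bit + window)
        mask = ((1 << (hi + 1)) - 1) & ~((1 << lo) - 1)  # bits lo..hi
        if sum(1 for r in registers if r & mask) >= min_pairs:
            candidates.append(bit)
    return candidates

lut = build_lut(0, 31)
# One slope per layer pair; the last one is outside the allowed range.
registers = [fill_register([d], lut) for d in (10, 10, 11, 10, 9, 40)]
print(slope_select(registers, min_pairs=5))  # [9, 10, 11]
```

In the example, five of the six pairs agree within the window, so bits 9 to 11 are returned as candidate slopes, while the sixth register carries only the no-match flag in bit 0.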

In order to account for the larger distance from the IP of the second multilayer, projective panel regions are defined which are slightly different from the physical segmentation. The principle is depicted in Figure 20.

**Construction of the Look Up Table:** The aperture angles are obtained from the simulation as shown in Figure 18. Table 7 gives these aperture angles as well as the expected maximum value of the difference of slopes, expressed in terms of number of strips ($N_s$). $N_s$ depends on the distance between the two layers of the pair; the numbers given in Table 7 are for the pairs with a 126 mm distance. These numbers are



Figure 18: Angle aperture for different panel regions.



Figure 19: Operating principle of the LUT.



Figure 20: Implementation of projective panel regions.

slightly smaller for the pair with a 115 mm distance and slightly higher for the pair with a 137 mm distance.

| Panel region | Aperture angle ( $^{\circ}$ ) | Number of strips ( $N_s$ ) |
|--------------|-------------------------------|----------------------------|
| 0            | 4.73                          | 31                         |
| 1            | 5.93                          | 36                         |
| 2            | 6.45                          | 36                         |
| 3            | 6.86                          | 36                         |

Table 7: Number of strips for the LUT in each panel region

#### 3.1.2.3 Algorithm Performance

The algorithm is tested using a simulated sample of dimuon events. The sample consists of 385 000 dimuon events with one muon per end-cap, using the old MM layout containing four panel regions. The muons originate from the IP and are uniformly distributed both in transverse momentum ($4 < p_T < 100$ GeV) and in the $\phi$ coordinate. In the $\eta$ coordinate, they are flat in the range $1 < |\eta| < 3.2$. This study is performed without any background simulation.

**Slope comparison:** Assuming the ideal case of events with eight hits (one hit per layer), the difference between the six slope values is shown in Figure 21(a). The distribution is sharply peaked at zero, but the RMS of 0.005 shows that the hit strip in the multilayers can only be known with a precision of $\pm 2$ strips. The tail extends to $\pm 10$ strips. This is due to the known ionization problem: the earliest strip hit within a VMM is not always the closest to the real track. As a result, more than one bin in the LUT often matches the track candidate. Figure 21(b) shows the number of bins hit in the LUT (a value of 2, for example, means that the six slopes of the track candidate span two different bins in the LUT). This complicates the electronic implementation, but Figure 21(c) shows that, most of the time, if different bins are hit in the LUT they are directly adjacent. This property makes an electronic scan of the hit bins in the LUT feasible.

**Intrinsic efficiency:** The efficiency of the algorithm is defined as the ratio between the number of tracks that pass the algorithm requirement and the total number of simulated tracks. The efficiency is calculated for the ideal case of events with eight hits (one hit per layer). Figure 22 shows the intrinsic efficiency as a function of $\eta$ and $p_T$. The dips observed in the efficiency distribution versus $\eta$ are explained by the gaps between the different panel regions, located at $\eta \approx 1.43$, $\eta \approx 1.68$ and $\eta \approx 2.05$. Some track candidates overlap two panel regions and so cannot be taken into account by the LUT (which is defined per panel region). The global efficiency of the algorithm is 99.6% for projective panel regions and 99.2% otherwise. This inefficiency cannot be fully recovered since the electronic implementation is not fully projective.

**Efficiency for six hits:** The efficiencies for different pair requirements are shown in Figure 23: the pairs $U$ and $V$ with four, three or two $X$ pairs, and the pair $U$ or $V$ with four, three, two or one $X$ pair. At least one $X$ pair is mandatory to obtain the $\eta$ coordinate with enough precision, and one of the two pairs $U$ and $V$ is needed to determine the $\phi$ coordinate. These efficiencies are calculated for the case of only six hits (hits may be missing because the track passes through a gap between two panel regions). For comparison, the efficiency of the Harvard algorithm with a requirement of only two $X$ layers and the $U$ and $V$ layers is 99%. The Saclay algorithm loses more information when hits are missing: when one hit is missing in one layer, the pair cannot be built, so the information of two hits is lost (and of four hits if the hit is missing on an $X$ layer), which explains this efficiency difference. However, the method of pairs should be more robust and precise for background filtering and rejection.



Figure 21: (a) Difference of slope values. (b) Number of bins hit in the LUT. (c) Difference between hit bins in the LUT.

**Resolution:** The resolution at the muon spectrometer entrance is calculated as the difference between the angle $\theta_{rec}$ as reconstructed by the algorithm and $\theta_{truth}$ as simulated at the muon spectrometer entrance. The RMS of this distribution, shown in Figure 24, gives a resolution of 1.7 mrad.

**Calculation of the x-coordinate:** The calculation of the $x$-coordinate using the stereo strips, and its firmware implementation, still have to be done. The $x$-coordinate will be used to determine the $\phi$ angle of the particle in order to build the Region of Interest characterized by $(\Delta\eta, \Delta\phi)$. This calculation has not been performed in this study because the stereo strips were not implemented in the simulated samples. However, the formula that would be used for this calculation is:

$$\Delta X = \frac{Y_U - Y_X - \Delta Y_\theta}{\tan(\phi_0)}$$

where $Y_U$ is the average of the two $y$-coordinates of the hit strips of the two $U$ layers, $Y_X$ is the corresponding average for the two layers of one $X$ pair, and $\phi_0$ is the stereo angle ($\phi_0 = 1.5^\circ$). The term $\Delta Y_\theta$ is given by $\Delta Y_\theta = (Z_U - Z_X) \times \tan(\theta)$, with $Z_U$ the average of the two $z$-coordinates of the hit strips of the two $U$ layers, $Z_X$ the same for the two $X$ layers, and $\tan(\theta)$ the average of the slopes of the six pairs. This calculation, done for



Figure 22: Distribution of efficiency versus  $\eta$  (left) and  $p_T$  (right) in the ideal case of eight hits (one hit per layer).



Figure 23: Efficiencies for different pair requirements in the case of six hits.

the four $X$ pairs, once with the $U$ pair and once with the $V$ pair, allows the $x$-coordinate of the hits on each layer to be determined.
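As a numerical illustration of the $\Delta X$ formula, the sketch below evaluates it for a single $X$ pair and the $U$ pair. The function name and all input values are hypothetical; only the stereo angle of 1.5° and the structure of the formula come from the text.

```python
import math

STEREO_ANGLE_DEG = 1.5  # phi_0 from the text

def delta_x(y_u, y_x, z_u, z_x, tan_theta, stereo_angle_deg=STEREO_ANGLE_DEG):
    """Delta X = (Y_U - Y_X - Delta Y_theta) / tan(phi_0),
    with Delta Y_theta = (Z_U - Z_X) * tan(theta)."""
    delta_y_theta = (z_u - z_x) * tan_theta
    return (y_u - y_x - delta_y_theta) / math.tan(math.radians(stereo_angle_deg))

# Hypothetical inputs (mm): a track with tan(theta) = 0.3 whose stereo strips
# are shifted by the amount corresponding to an offset of 10 mm along x.
tan_theta = 0.3
z_x, z_u = 7000.0, 7022.0
y_x = 2100.0
y_u = y_x + (z_u - z_x) * tan_theta + 10.0 * math.tan(math.radians(STEREO_ANGLE_DEG))

print(f"{delta_x(y_u, y_x, z_u, z_x, tan_theta):.3f} mm")
```

Because $\tan(1.5^\circ) \approx 0.026$, an uncertainty on $Y_U - Y_X$ is amplified by a factor of about 38 in $\Delta X$, which is why the $x$-coordinate is much less precise than the $y$-coordinate.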

#### 3.1.2.4 Summary

The proposed algorithm for the MicroMegas New Small Wheel trigger processor, based on the Look Up Table optimization and strengthened with electronic tests, has an intrinsic efficiency of more than 99% and can detect a particle with a precision better than 2 mrad on the polar angle, which fulfills the requirements for the trigger. The construction of the Region of Interest of a track is still to be implemented in order to check whether the total response time is less than 100 ns.

The algorithm still has to be tested in the presence of background, which increases both the efficiency and the appearance of fakes, i.e. muon detections that are in reality due to other products of the collision. The system of pairs should filter the background because the detection of one hit is confirmed by the hit four



Figure 24:  $\theta$  resolution at the muon spectrometer entrance.

layers farther, and the fake particles have trajectories that are not geometrically compatible with the LUT content. However, the ionization problem and the intrinsic resolution of the detector limit the background rejection.

Recently, we have started work on the production of a cavern background simulation. This simulation can be performed using two different approaches: one is fully based on GEANT 4, the second uses FLUGG. Both were used in the past for the simulation of physics events. The second approach is currently under validation and the samples are expected to be produced in the next two months.

## 3.2 sTGC Trigger Algorithm

### 3.2.1 The pre-trigger from the pad towers

The sTGC trigger is based on determining track coordinates from the centroids of strip charges. Transmitting $\approx 280,000$ strip charges off-detector would require bandwidths that are not practical with today's optical interconnect technology. To reduce the amount of data sent to the off-detector trigger processors, the Pad Trigger uses an 8-fold coincidence along towers of detector pads to identify regions where a possible muon candidate has passed. See Figure 25 and Ref. [padTower]. Only information from sTGC strips passing through the tower selected by the Pad Trigger is transmitted off-detector to the Trigger Processor.

It is expected that sometimes a layer of strips will not be usable because its cluster of strips is very wide due to a $\delta$-ray, or its signals are saturated due to a neutron track, or the detector layer has become defective. The algorithm described below discards such clusters and defective layers. There are enough layers that precise centroids can nonetheless be calculated with fewer than the full four layers of a quadruplet.

An important parameter is the number of track segments expected to be found, given the rate of background tracks that successfully pass through all layers of the NSW and could trigger the pad trigger. An estimate of the probability of 1, 2, and 3 segments in a sector is shown in Figure 26. See Ref. [11] for details of the calculation and assumptions. The planned sTGC trigger path supports transfer of the strip data for up to four track segments. The figure shows that the losses from this limit are negligible.



Figure 25: The Pad Trigger selects the strips passing through a pad tower made from a coincidence of overlapping pads. One layer of the selected strips is shown.



Figure 26: Left: The fraction of BCs with one segment (candidate) for each of the three quadruplets. Right: The fraction of BCs with one, two, and three segments in a sector. From Ref. [11]

### 3.2.2 Finding track segments and calculating their parameters

The sTGC algorithm calculates the $\Delta\theta$, $\phi$ index and R index of a track segment. The algorithm input is an active band of strips for each of the 8 layers. The information of a band consists of 17 strip “charges” measured by the 6-bit flash ADC in the VMM front-end ASIC. The 17 strips include the 13 strips of the band itself plus two strips from each of the neighboring bands, which allow for charge spreading to adjacent bands. First, the centroid for each of the 8 layers is calculated using the FPGA’s built-in DSP (Digital Signal Processing) blocks. Figure 27 shows the centroid calculation algorithm for a single layer.



Figure 27: The centroid algorithm for a single layer

Two separate configurable thresholds are provided, one for filtering the input strip signal and another for defining the signal width in number of strips. A valid strip band is defined as a band that has one to five active strips; in all other cases the layer centroid calculation result will have its “width valid” flag set to zero. When the “width valid” flag of a layer is zero, the layer does not participate in further calculations.

A local center of mass is calculated from the five values in the 5-strip window. The calculation formula is a weighted mean: $\frac{\sum_{n=1}^{5} n \times Q_n}{\sum_{n=1}^{5} Q_n}$. The band’s global offset comes from the band-id, i.e. the row index of the active pad tower, and is added to the local-to-band calculation result.
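The single-layer centroid can be modeled in software as below. This is a sketch under stated assumptions rather than the firmware of Figure 27: the way the two thresholds are applied, the default threshold values, the placement of the 5-strip window around the highest charge, and the expression of the band offset in strip units are all hypothetical.

```python
def layer_centroid(charges, band_offset, charge_threshold=2, width_threshold=4, window=5):
    """Return (centroid in strip units, width_valid) for one layer.

    charges: the 17 ADC values of the active band (6-bit, 0..63).
    band_offset: global offset of the band in strip units (assumed convention).
    """
    # Threshold 1 (assumed use): suppress small input signals.
    filtered = [q if q >= charge_threshold else 0 for q in charges]
    # Threshold 2 (assumed use): a strip counts towards the width if above it.
    n_active = sum(1 for q in filtered if q >= width_threshold)
    if not 1 <= n_active <= 5:
        return None, False  # "width valid" flag is zero
    # Assumed window placement: centered on the highest charge, kept inside the band.
    peak = max(range(len(filtered)), key=filtered.__getitem__)
    start = min(max(peak - window // 2, 0), len(filtered) - window)
    q_win = filtered[start:start + window]
    # Weighted mean with n = 1..5, as in the formula above.
    local = sum(n * q for n, q in enumerate(q_win, start=1)) / sum(q_win)
    return band_offset + start + local, True

charges = [0] * 7 + [10, 30, 10] + [0] * 7   # symmetric 3-strip cluster
print(layer_centroid(charges, band_offset=0))  # (9.0, True)
print(layer_centroid([20] * 17, band_offset=0))  # (None, False): cluster too wide
```

For the symmetric cluster the centroid falls on the central strip (the ninth strip of the band, counting from one), while the fully occupied band is rejected by the width check, as expected for a $\delta$-ray-like cluster.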

For each of the two 4-layer quadruplets, a centroid is calculated as the average of its valid layer centroids. This is shown in Figure 28.



Figure 28: Calculation of the quadruplet centroid

It is possible that, during execution of the algorithm, only one valid quadruplet centroid is found (a quadruplet centroid is valid if it has at least one valid layer centroid). This situation occurs, for example, when the track passes through the frame of one of the quadruplets. In this case we use the coordinates within the valid quadruplet to calculate a $\Delta\theta$, albeit with much poorer accuracy. In order that the algorithm has the same latency in either case, the low-quality result for each of the quadruplets is calculated in parallel with the main algorithm. If the main algorithm fails, one of the low-quality results may be used instead.

Having the pivot quadruplet centroid value allows the algorithm to define a range for the valid values of the confirmation quadruplet centroid. A valid range for the confirmation quadruplet centroid value is defined by the maximum deviation of $\pm 15$ mrad from the infinite momentum track angle, as shown in Figure 29. The maximum allowed deviation, $\Delta\theta_{cut}$, is configurable.



Figure 29: Definition of the valid range of the confirmation quadruplet centroid values, i.e. within  $\Delta\theta_{cut}$

The $\Delta\theta$ margins LUT defines valid RB (quadruplet B centroid) value ranges for each possible RA (quadruplet A centroid) value. Each range maps RB onto one of the valid values of $\Delta\theta$; out-of-range values of RB are marked by setting the “valid” flag to zero. Figure 30 shows a block diagram of the $\Delta\theta$ calculation path.
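A software model of such a margins LUT might look as follows. It is a sketch only: the geometry (the $z$ positions of the two quadruplets), the sign convention for $\Delta\theta$, the coarse list of RB values and all names are assumptions, and the quantization of $\Delta\theta$ to the output word is not modeled.

```python
import math

def delta_theta_mrad(r_a, r_b, z_a, z_b):
    """Local segment angle minus the infinite-momentum angle through centroid A."""
    theta_local = math.atan2(r_b - r_a, z_b - z_a)
    theta_inf = math.atan2(r_a, z_a)  # straight line from the IP through (r_a, z_a)
    return 1e3 * (theta_local - theta_inf)

def build_margin_lut(r_a_values, r_b_values, z_a, z_b, cut_mrad=15.0):
    """For each RA, keep only the RB values whose delta-theta is inside the cut."""
    lut = {}
    for r_a in r_a_values:
        row = {}
        for r_b in r_b_values:
            dt = delta_theta_mrad(r_a, r_b, z_a, z_b)
            if abs(dt) <= cut_mrad:
                row[r_b] = dt
        lut[r_a] = row
    return lut

def lookup(lut, r_a, r_b):
    """Return (delta_theta, valid); out-of-range RB gives valid = False."""
    dt = lut.get(r_a, {}).get(r_b)
    return (dt, True) if dt is not None else (0.0, False)

Z_A, Z_B = 7000.0, 7300.0  # hypothetical quadruplet positions (mm)
lut = build_margin_lut([2000.0], [2086.0, 2100.0], Z_A, Z_B)
print(lookup(lut, 2000.0, 2086.0))
print(lookup(lut, 2000.0, 2100.0))
```

With these hypothetical positions, RB = 2086 mm lies within about 1 mrad of the pointing direction and is accepted, while RB = 2100 mm deviates by more than 40 mrad and is flagged invalid. Precomputing the table moves the trigonometry out of the latency-critical path, which is the point of the LUT approach described above.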



Figure 30: The Look-up-Table scheme for producing  $\Delta\theta$  within the desired cut

The R index is the projection of the trajectory onto the Big Wheel and is calculated directly from the RA and RB values using mapping conversion LUTs, as shown in Figure 31.

**FPGA resources needed** One segment finder consumes 1.2% of the LUT resources of the Virtex 7 690T FPGA, 9% of the Block RAMs and 1.8% of the DSPs. This leaves ample resources to add two single-quadruplet segment finders and the misalignment correction calculator, all replicated four times. The ancillary functions must also be added.



Figure 31: Calculation of the R index

### 3.2.3 Compensating for misalignments

The sTGC algorithm provides quadruplet misalignment correction for the cases shown in Figure 32:

- (1) displacement along the r-axis (axis orthogonal to the beam axis),
- (2) displacement along the z-axis (beam axis),
- (3) rotational displacement around the detector edge orthogonal to the r-axis,
- (4) rotational displacement around the r-axis,
- (5) rotational displacement around the axis parallel to the z-axis.



Figure 32: The possible misalignments of a sTGC quadruplet

Each of the six quadruplets of a sector is divided into sections by band-ID and $\phi$-ID. For each of these sections, four constant geometrical parameters ($dr$, $dr^2$, $drd\phi$, $d\phi$) are stored in six 2-D look-up tables. Each cell of the LUT contains a set of the four values for each combination of band-ID (or predefined contiguous set of band-IDs) and $\phi$-ID (or predefined contiguous set of $\phi$-IDs). Five measured mechanical alignment values for each of the six quadruplets of a sector are stored in a RAM and updated by the calibration process. The total correction offset for one quadruplet is calculated as a dot product of the two entries. This misalignment calculation, as shown in Figure 33, is done in parallel with the layer centroid algorithm and is added to its result, i.e. it costs one additional (320 MHz) clock cycle of latency.



Figure 33: LUT-based misalignment correction algorithm. Numbers in orange represent the possible chamber misalignment cases according to Figure 32.

## 4 Trigger Processor Hardware Platforms

There are two candidate platforms for the NSW Trigger Processor. This section summarizes their differences. There are several documents [3–6] available for this review that describe the different boards in detail. All boards use Xilinx Virtex FPGAs and the two mezzanine card options use the same FPGA, a Xilinx Virtex 7 690T.

## 4.1 Specification comparison

### 4.1.1 ATCA Standard Interfaces

Both platforms adhere to current ATCA standards and provide all required interfaces.

### 4.1.2 Optical i/o for detector data and Sector Logic

It is required that each Trigger Processor be capable of operating independently of the other as well as in the normal operating mode of MM/sTGC pairs in tandem. Each TP must therefore be capable of driving a sufficient number of fibers to the Trigger Logic. The requirement is two independent channels carrying up to four trigger candidates each, as well as a 7-fold fan-out to accommodate up to seven partitions. Thus, each TP requires $2 \times 7 = 14$ fiber outputs running at >6.4 Gbps each. Each TP also requires 32 fiber inputs to accommodate one sector for either the MM or sTGC. For the MM\_TP, an input bandwidth of 6.4 Gbps is required. A fiber link to FELIX for TTC, Level-1 data and monitoring/config data is also required.

**SRS AMC cards**

The SRS AMC card provides 3x Avago $\mu$Pod optical transmitter modules and the same number of receiver modules. Each module has 12 channels which are specified to operate at up to 10 Gbps, for a total of 36 channels each of transmitters and receivers.

---

**LAr AMC cards**

The LAr AMC card provides 4x each Avago $\mu$Pod transmitter and receiver modules for a total of 48 i/o channels. For use in the NSW Trigger Processor, only three each of the transmitter and receiver modules would be populated, for a total of 36 i/o channels. Data rates of >10 Gbps have been demonstrated on the LAr card as this is a requirement for operation in the LAr detector.

### 4.1.3 AMC to AMC lateral communication

Each MM TP must transfer its hit candidates to the neighboring sTGC TP within one bunch crossing. The requirements document states that up to eight candidates must be transferred in that time. Each transfer requires 21 bits of data plus 6 for BCID for a total of 175 bits/BC. Thus, the aggregate bandwidth for lateral communication between AMC cards is 175 bits / 25 ns = 7.0 Gbps.

**SRS AMC cards**

The SRS AMC card provides two independent Trigger Processors per card, with one dedicated to MicroMegas and the other to sTGC. Each has its own FPGA and there are 64 LVDS pairs connecting the two. To transmit the required hit data from MM to sTGC using all 64 lines, each would have to run at 100 MHz. Since the two FPGAs are within ~10 cm of each other and on the same board, speeds beyond ~1 Gbps should be easily achievable. As the SRS AMC card is still in development, it is not available for testing at this moment.

**LAr AMC cards**

Each AMC card will be configured as a MM TP or as a sTGC TP. The FPGA on each card has, at the present time, 8 LVDS lines which are transmitted to the ATCA carrier card through a connector. The LVDS output lines from one MM AMC connect to an FPGA on the ATCA carrier, are transferred to corresponding LVDS outputs on that FPGA, and then to the sTGC AMC through its connector. Thus, the LVDS lines travel from the MM AMC to the sTGC AMC through two AMC connectors and an FPGA. Bandwidth for this arrangement has been tested to ~600 Mbps, but without the intervening FPGA present. The designers plan to test to >1 Gbps and are confident they can achieve higher bandwidth than has been demonstrated at present. For present purposes, a comfortable operating speed for these connections of 500 Mbps is assumed until such time as higher rates have been verified.

The present iteration of the AMC (called OTC for Optical Test Card) has 8 such LVDS pairs. At 500 Mbps each, aggregate bandwidth is 4.0 Gbps which is sufficient for transferring up to 5 trigger candidates per BC but not 8. The LAr AMC designers will attempt to expand the number of LVDS lines in subsequent iterations to accommodate the required bandwidth. At currently achieved bandwidth, a minimum number of LVDS lines in future board iterations would be 16, with slightly more being desirable.
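The bandwidth figures quoted for the two platforms can be cross-checked with simple arithmetic. The sketch below takes the 175 bits per BC, the 25 ns bunch-crossing period, the line counts and the per-line rates from the text; the function names are illustrative.

```python
import math

BC_PERIOD_NS = 25.0

def required_gbps(bits_per_bc):
    """Bits per nanosecond is numerically equal to Gbps."""
    return bits_per_bc / BC_PERIOD_NS

def lines_needed(total_gbps, per_line_mbps):
    """Smallest number of lines that carries total_gbps at the given per-line rate."""
    return math.ceil(total_gbps * 1000.0 / per_line_mbps)

need = required_gbps(175)                  # 7.0 Gbps lateral bandwidth
srs_rate_mbps = need * 1000.0 / 64         # per-line rate with 64 LVDS pairs
otc_aggregate_gbps = 8 * 500 / 1000.0      # present OTC: 8 pairs at 500 Mbps
otc_bits_per_bc = otc_aggregate_gbps * BC_PERIOD_NS
lines_at_500 = lines_needed(need, 500)

print(need, srs_rate_mbps, otc_aggregate_gbps, otc_bits_per_bc, lines_at_500)
# 7.0 109.375 4.0 100.0 14
```

The 64-pair option needs about 109 Mbps per line, in line with the roughly 100 MHz quoted for the SRS card. At 500 Mbps per line the strict minimum is 14 lines, and the 16 lines quoted in the text sit just above this figure.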

---

## 4.2 Selection Criteria

The main difference between the platforms appears to be one of topology. The LAr platform has four independent and identical AMC cards which can implement one trigger processor of each flavor. The two cards need to communicate laterally through the ATCA card and, at the moment, there are only 8 LVDS lines available for that purpose. These have been tested to 600 MHz and perhaps can go higher. The designers may also be able to increase the number of lines to 16, thus enabling operation at a lower clock frequency. Assuming that the required lateral bandwidth can be achieved, it is unlikely that operation could be extended to higher rates of hit candidates.

The SRS implements the system with “double wide” AMC cards with the two flavors of trigger processor on each and 64 LVDS pairs between them. Each can be expected to operate at ~1 GHz, which is significantly higher than required for 8 hit candidates per BC, and there would be no problem extending that to significantly higher throughput. The board design and layout have been completed; however, the board has not yet been produced, so no module exists yet for testing.

The selection criterion boils down to the ability of either platform to deliver the required lateral bandwidth of 7.0 Gbps, or higher for upgrades. The LAr platform will need additional design/fabrication cycles, as will the SRS, and the latter has not yet completed one cycle. Given that, the selection would require that a development time be allowed before the decision is finalized.

## 5 Testing

The testing of the trigger processor and algorithm implementation in hardware, for both micromegas and sTGC, will be performed in four ways. Initially, signals from upstream electronics have been generated in the trigger processor FPGA and looped back in order to test the implementation of the algorithm in hardware. Testing of the full electronics chain requires further work. In the absence of realistic chambers, the detector and front-end output will be emulated by pattern generators in order to characterize the full trigger chain, including the ADDC. Additionally, the trigger system will be tested on real detectors with cosmic rays and in a test beam.

### 5.1 MM Implementation Initial Testing

A slice containing all of the elements of the Trigger Processor design has been implemented with no timing errors and is being tested using a Xilinx VC707 Development board. This board includes a Virtex XC7VX485T-2FFG1761C FPGA. The implementation includes two ADDC GBT interfaces and the associated trigger processor algorithm. We are currently working on the timing closure of a full design.

To exercise the trigger processor design we have developed an evaluation-board-based ADDC emulator. This design can be used to supply properly formatted ADDC GBT packets through an optical transmitter, as they would be sent from an actual ADDC. The same ART data used for simulations is being used for hardware testing. We are also generating pseudo-random ART data to test the GBT communication and the timing of the packet decoder. A block diagram of the data flow can be seen in Figure 34.

Figure 34: Initial testing configuration using an ADDC emulator

To evaluate the hardware implementation, we compare the hardware results with results generated using a computer simulation of the algorithm. The initial testing of the hardware has shown the entire algorithm is working functionally. There are some differences between the hardware and software results due to the number of significant bits used in the hardware. It is likely we will want to increase the precision used in some of the hardware calculations. We expect this would increase the latency by roughly 6 to 9 ns. Since the calculations are done in an area of the implementation that has a low multiplication factor, increasing the precision will have a low impact on the resources used.
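The effect of limited precision on such a hardware/software comparison can be illustrated with a minimal sketch. It assumes that the hardware truncates a quantity to a fixed number of fractional bits while the software reference uses floating point. The input values and bit widths are hypothetical and are not taken from the firmware.

```python
import math
from typing import Iterable


def to_fixed(x: float, frac_bits: int) -> float:
    """Truncate x to a fixed-point grid with frac_bits fractional bits."""
    scale = 1 << frac_bits
    return math.floor(x * scale) / scale


def max_abs_diff(values: Iterable[float], frac_bits: int) -> float:
    """Largest difference between the floating-point and truncated values."""
    return max(abs(v - to_fixed(v, frac_bits)) for v in values)


# Hypothetical slope-like quantities standing in for the software reference.
slopes = [0.1 + 0.0137 * i for i in range(40)]

for bits in (8, 12, 16):
    # The truncation error stays below one least-significant bit, 2**-bits.
    print(bits, max_abs_diff(slopes, bits))
```

Each added fractional bit halves the upper bound on the discrepancy, which is the trade-off against latency and resources discussed above.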

We have begun integration testing with the BNL ADDC and have successfully transmitted data to the Trigger Processor using the ADDC's VTTx ASIC.

### 5.2 Pattern Generators

#### 5.2.1 The Micromegas ART Pattern Generator

In order to properly test the full trigger electronics chain without access to a large number of chambers, it is necessary to implement a pattern generator to simulate the ART (address in real time) output from the VMM chips. This pattern generator will interface with the ART Data Driver Card (ADDC), which will then transmit information via fiber optic link to the trigger processor (TP).

The firmware for the ART Pattern Generator (APG) has been written and is currently being tested on a Xilinx Virtex 7 FPGA in a VC707 development board. Each development board has FMC output connectors. A mezzanine card has been developed to adapt these FMC connectors to the MiniSAS connectors expected by the ADDC card.

The signal emulation will be accomplished by reformatting simulated muon events in the NSW in Athena and sending them via an Ethernet interface to the FPGA. On the FPGA, the hits will be sorted to the correct ART output and clocked out according to the timing indicated in the event. Finally, the 6 bits of the strip number are output serially through the FMC connector according to the LVDS standard expected by the ADDC inputs. Figure 35 shows the general chain of the testing configuration using the APG, and Figure 36 shows the full program flow of the APG from simulation to serialized output on the evaluation board FPGA.
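The sorting and serialization steps can be sketched in software as follows. This is only an illustration of the data flow, not the APG firmware: the `Hit` fields, the one-output-per-VMM indexing and the MSB-first bit order are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Hit:
    art_line: int  # hypothetical index of the ART output the hit belongs to
    strip: int     # strip number, 0..63 so that it fits in 6 bits
    bc: int        # bunch crossing at which the hit is clocked out


def strip_to_bits(strip: int) -> List[int]:
    """Serialize a strip number into 6 bits, MSB first (bit order assumed)."""
    if not 0 <= strip < 64:
        raise ValueError("strip number must fit in 6 bits")
    return [(strip >> i) & 1 for i in range(5, -1, -1)]


def schedule(hits: List[Hit]) -> Dict[int, List[Tuple[int, List[int]]]]:
    """Group hits by ART output and order each group by bunch crossing."""
    out: Dict[int, List[Tuple[int, List[int]]]] = {}
    for h in sorted(hits, key=lambda h: (h.art_line, h.bc)):
        out.setdefault(h.art_line, []).append((h.bc, strip_to_bits(h.strip)))
    return out


event = [Hit(art_line=1, strip=37, bc=5), Hit(0, 3, 2), Hit(1, 0, 1)]
for line, words in schedule(event).items():
    print(line, words)
```

Each entry of the returned dictionary corresponds to one serial output, listing the bunch crossing and the bit pattern to be driven on it.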

The APG will be used in conjunction with the BNL ADDC and the trigger processor to test the performance of the electronics and trigger algorithm as a function of various parameters of interest (including track slope, hit rates, etc.). Performance metrics for the algorithm will include efficiency and fake rates. Performance metrics for the electronics will include latency measurements and stress tests with large track and/or background rates.
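As an illustration of how the two algorithm metrics might be computed from a pattern-generator run, the sketch below uses hypothetical definitions: efficiency is the fraction of events containing a simulated muon in which at least one matched trigger candidate is found, and the fake rate is the number of unmatched candidates per event. The definitions used in the actual studies may differ.

```python
from typing import Iterable, NamedTuple, Tuple


class EventResult(NamedTuple):
    has_muon: bool   # a simulated muon was injected in this event
    matched: int     # trigger candidates matched to the simulated muon
    unmatched: int   # trigger candidates with no matching simulated muon


def efficiency_and_fake_rate(events: Iterable[EventResult]) -> Tuple[float, float]:
    """Return (efficiency, fake candidates per event) under the stated definitions."""
    events = list(events)
    with_muon = [e for e in events if e.has_muon]
    if not events or not with_muon:
        raise ValueError("need at least one event containing a simulated muon")
    efficiency = sum(1 for e in with_muon if e.matched > 0) / len(with_muon)
    fake_rate = sum(e.unmatched for e in events) / len(events)
    return efficiency, fake_rate


sample = [
    EventResult(True, 1, 0),
    EventResult(True, 0, 1),
    EventResult(False, 0, 1),
    EventResult(True, 2, 0),
]
eff, fake = efficiency_and_fake_rate(sample)
print(eff, fake)
```

Binning the input events in track slope or background rate before calling the function gives the metrics as a function of those parameters.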



Figure 35: Testing configuration for APG.

#### 5.2.2 The sTGC Pattern Generator

A Matlab program generates patterns for testing the sTGC algorithm. Values for track angle, hit radius, hit intensity and hit $\phi$ position are taken from a uniform distribution within predefined limits. The event parameters are then parsed into trigger processor algorithm inputs for each of the eight layers. Figure 37 shows a graphical representation of one quadruplet image of one generated event. The expected algorithm output is calculated for each generated event for verification.
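The generation step can be sketched as below. The example is written in Python rather than Matlab and is not the program used by the sTGC group: the parameter limits, the layer positions and the straight-track projection onto the eight layers are all hypothetical choices made for illustration.

```python
import math
import random
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical limits; the values used in the real generator are not given here.
LIMITS: Dict[str, Tuple[float, float]] = {
    "angle": (0.15, 0.55),       # track polar angle [rad]
    "radius": (1000.0, 4500.0),  # hit radius at the first layer [mm]
    "intensity": (0.1, 1.0),     # relative hit intensity
    "phi": (-0.15, 0.15),        # hit phi position within the sector [rad]
}

# Hypothetical z offsets of the eight layers relative to the first one [mm].
LAYER_Z = [0.0, 11.0, 22.0, 33.0, 300.0, 311.0, 322.0, 333.0]


@dataclass(frozen=True)
class Event:
    angle: float
    radius: float
    intensity: float
    phi: float


def generate_event(rng: random.Random) -> Event:
    """Draw each event parameter uniformly within its predefined limits."""
    return Event(**{name: rng.uniform(lo, hi) for name, (lo, hi) in LIMITS.items()})


def layer_radii(event: Event) -> List[float]:
    """Radius of a straight track at each of the eight layers (simplified model)."""
    return [event.radius + z * math.tan(event.angle) for z in LAYER_Z]


rng = random.Random(2015)  # fixed seed so that a run can be reproduced
events = [generate_event(rng) for _ in range(100)]
print(layer_radii(events[0]))
```

A fixed seed keeps the generated sample reproducible, so the expected algorithm output computed for each event can be compared with the hardware result run after run.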

This mechanism was implemented on two identical sTGC trigger demo FPGA boards. One board emulated the detector outputs using preloaded generated event data patterns for several events, and the other was configured as the Trigger Processor. As a next step towards algorithm testing, simulated muon event data will be parsed into an sTGC trigger processor input pattern. This will allow verification of the algorithm functionality and its optimization. Using the same simulated events will also allow comparison between the sTGC and Micromegas trigger processing algorithms. As a final test, playback pattern generation will be employed. In this test, real event data can be recorded and parsed as an input pattern.

### 5.3 Cosmic Ray Testing

With real detectors available, it will be possible to do more detailed testing of the trigger processor. Testing with cosmic rays is an important aspect of the trigger testing plan. A cosmic ray test stand is not subject to test beam schedules, allowing for rapid iteration and debugging in the early stages of development. This is especially important because it allows both the MM and sTGC systems to debug the interface with a real VMM chip. Thus, part of the testing plan is to assemble realistic detector prototypes for the MM and sTGC and place them in cosmic ray telescopes.

It is important to ensure that the cosmic ray muons used have sufficient momentum ($O(1\,\text{GeV})$) so as to avoid effects of multiple scattering. Additionally, it is useful for the cosmic ray test setup to have a large angular acceptance, a good angular resolution, and good timing resolution to allow precise characterization of the trigger's behavior. Such test setups are available at Harvard for the MM chambers and Weizmann for the sTGC chambers. The test stands will include small-scale realistic detector setups, including a small assembled MM octuplet and an sTGC prototype **more info for sTGC?**.

The Harvard cosmic ray test setup consists of three planes of scintillators sandwiching a 2-m thick concrete block. It can detect cosmic ray muons of greater than 0.8 GeV with a time resolution better than 1.5 ns. It has a coverage of $2.5\,\text{m}^2$, an angular acceptance of up to 25 degrees from vertical and a 1.5 degree angular resolution.

**some info about Weizmann stand here**



Figure 36: Block diagram representing the detailed flow of the APG.

Once the realistic detector setups are assembled, the ultimate goal is to test the full trigger chain with the chambers. First, the chain will be tested with the trigger processor still in the evaluation board platform, mainly testing the full communication pathway from chamber readout to trigger processor. Once this is found to be working, the trigger processor will move to its eventual home on the hardware platform in an ATCA crate. There, more detailed tests with the cosmic rays can be performed, including full measurements of the latency, trigger efficiency, and hit throughput. Additionally, tests for the signal integrity and bit errors through the whole electronics chain can be done.

The key asset of the cosmic ray test setup is the ability to iterate rapidly and debug issues as they arise, without having to wait for beam time at a test beam.

### 5.4 Vertical Slice and Test Beam

Figure 37: An image on the background showing an active pad tower (defined by the overlapping pads shown in grey) as a dark square, where the dots indicate which of the layers in the pad tower have pad signals above threshold. The foreground image shows the pulse heights of the strips passing through the pad tower for this event.

Part of the testing strategy for the micromegas trigger includes the installation of the Micromegas Small Wheel (MSW) in the ATLAS cavern. This will allow the micromegas trigger to be tested under real cavern background conditions. The electronics chain for the MSW will include the VMM2 readout chip and the ADDC card for transmitting GBT signals to an evaluation board located in USA15. The evaluation board will receive the GBT signals from the ADDC and decode them. On an ATLAS Level-1 accept, the evaluation board will format all of the ART data within 4 bunch crossings of the ATLAS trigger and send them to the ATLAS DAQ system via Ethernet. Events will also be sent via a second Ethernet link to a desktop PC to allow for real-time monitoring. Recording the data in this way will allow for testing of the trigger algorithm and implementation using ART data from micromegas in realistic cavern conditions. The trigger algorithm for segment reconstruction can be implemented in the evaluation board and its output recorded. The collected ART data can also later be fed through existing pattern generators to allow for quick debugging of the trigger electronics, offering a more realistic set of tests than those generated in Athena. Figure 38 shows the trigger chain envisioned for the MSW.
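The selection of ART data around a Level-1 accept can be sketched as follows. The example assumes a symmetric window of ±4 bunch crossings around the bunch crossing of the trigger and ignores any fixed latency offset between the hit and the accept; the buffer layout and function names are hypothetical. The wrap-around at 3564 reflects the number of bunch crossings in one LHC orbit.

```python
from typing import List, Tuple

ORBIT_BCS = 3564  # bunch crossings per LHC orbit, BCID runs from 0 to 3563

ArtRecord = Tuple[int, str]  # (BCID, opaque ART payload), hypothetical layout


def bc_distance(a: int, b: int) -> int:
    """Distance in bunch crossings between two BCIDs, allowing for orbit wrap."""
    d = (a - b) % ORBIT_BCS
    return min(d, ORBIT_BCS - d)


def select_window(buffer: List[ArtRecord], l1a_bc: int, window: int = 4) -> List[ArtRecord]:
    """Keep the ART records within `window` bunch crossings of the accept."""
    return [rec for rec in buffer if bc_distance(rec[0], l1a_bc) <= window]


buffer = [(3562, "a"), (2, "b"), (10, "c"), (0, "d")]
print(select_window(buffer, l1a_bc=1))
```

With an accept at BCID 1, the record at BCID 3562 from the previous orbit is kept because it lies three bunch crossings away once the wrap-around is taken into account.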



Figure 38: MSW Trigger Chain proposal

## 6 Phase-2 Compatibility

The current latency budget for the Phase 1 trigger processor for the New Small Wheel, and for the forward muon system in general, is extremely tight. While the NSW trigger as designed now for installation in Long Shutdown 2 (LS2) already meets the Phase 2 requirement of an angular resolution of 1 mrad, it is quite possible that one may further lower the thresholds for muon momentum by taking advantage of the increased Level-1 latency time to do a more refined calculation of muon pointing and momentum, or to improve robustness and redundancy, e.g. in case of missing layers.

Currently, prompt signals from the Micromegas detectors and sTGC are used to form track segments in the NSW. A cut is made on the pointing back to the interaction region in a set of trigger processors, and the resulting $\Delta\theta$ and RoI information is transmitted to Sector Logic, which looks for a coincidence with prompt signals from the Big Wheel. Currently, the latency budget for the NSW trigger processor algorithm is approximately 100 ns, when fiber optic delays, serialization, deserialization and other factors are taken into consideration, along with the processing and transmission time associated with Sector Logic.

With a much larger amount of latency available in Phase 2, and anticipated advances in FPGAs, it makes sense to seriously consider a more powerful trigger processing scheme that can take advantage of additional processing time to run a much more refined trigger algorithm. In this case, the Phase 1 trigger processor hardware (ATCA cards + mezzanines), excluding the ATCA crates and optical fibers, would be replaced. Since such a proposal is for equipment in USA15, it has little to no impact on the current plans, but can provide for the possibility of a much more refined Level-1 trigger that should be capable of pushing down the momentum threshold for forward muons significantly by including the more fine-grained information that becomes usable given the additional latency.

## 7 Project Organization

The NSW Trigger Processor project is being carried out by a Working Group coordinated by João Guimarães da Costa. The working group activities are integrated in the NSW Electronics group coordinated by Lorne Levinson. The group consists of institutes that have been involved in building the current muon trigger system and new institutes. There are several physicists and engineers that have already contributed to the project and more are joining now. The following sections cover the commitment and responsibilities of the participants and the current schedule.

### 7.1 Responsibilities

The NSW Trigger Processor project is being carried out by a collaboration of several ATLAS institutions. The project comprises three major aspects: the hardware platform, the firmware, and trigger studies and testing. The hardware area also includes the input and output fibers. The firmware is divided into the trigger algorithm and ancillary functions. In addition, many studies and tests are required for the success of the project. Whenever feasible, tasks are common to the MM and sTGC technologies. Below is an organigram of the different tasks with the corresponding manpower. This is a snapshot of the current situation and how we expect the manpower to evolve in the near future. Physicists (P) include senior and postdoc level staff, while engineers (E) and students (S) are listed separately. The estimated full-time equivalent commitment for each individual is also indicated.

Work to be done during the final installation at CERN is not included in this chart. It is expected that much of the manpower listed here will eventually transition to those tasks.



Figure 39: Organization and manpower dedicated to the Trigger Processor project, including engineers (E), physicists (P) and students (S). The full-time equivalent estimations for each individual are also provided.

## Software

- Michigan...

### 7.2 Schedule

## 8 Conclusion

This document describes the current status of the NSW trigger processor. It documents the specifications, the trigger algorithms developed and the testing procedures. There are some open issues to be investigated, including a study of the bandwidth between FPGAs in the LAr card, the relative alignment of the BW and NSW and a study of the expected trigger rates.

TO DO:

1. update latency section (DONE but need Lorne's input)
2. update trigger platform text and selection criteria (waiting on John Oliver)
3. improve phase-2 compatibility
4. Organization section
   - a) update responsibilities section (need text update)
   - b) add schedule
5. Algorithm sections
   - a) include comparison of MM trigger algorithms? (waiting on Samira and David)
6. Testing section
   - a) improve fig. 34 and its description (waiting on Nathan)
   - b) cosmic section missing sTGC information (waiting on Lorne)
   - c) vertical slice at CERN (waiting on Lorne)
7. conclusions

## Appendix

### A Fibers Layout



Figure 40: Optical fiber plant for the sTGC, including both trigger and readout fibers. The current plan is for four fibers per layer (i.e. from each Router), instead of three.

Micromegas fibre plan

512 bi-dir GBT links  
1024 uni-dir trigger links



Figure 41: Optical fiber plant for the Micromegas, including both trigger and readout fibers
