



**CERN/LHCC 2000 - 38**  
**CMS TDR 6.1**  
**December 15, 2000**

# CMS

## The TriDAS Project

# Technical Design Report, Volume 1: The Trigger Systems

Also available at <http://cmsdoc.cern.ch/cms/TDR/TRIGGER-public/trigger.html>

---

### CMS TriDAS Project

---

**Chairperson Institution Board:** Paris Sphicas, MIT-CERN, [paris.sphicas@cern.ch](mailto:paris.sphicas@cern.ch)

| Project Manager                                                                                 | Trigger Project Manager                                                                                                    | Resource Manager                                                                          |
|-------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------|
| Sergio Cittolin<br>CERN<br><a href="mailto:sergio.cittolin@cern.ch">sergio.cittolin@cern.ch</a> | Wesley H. Smith<br>University of Wisconsin<br><a href="mailto:wsmith@hep.physics.wisc.edu">wsmith@hep.physics.wisc.edu</a> | Joao Varela<br>LIP/Lisboa<br><a href="mailto:joao.varela@cern.ch">joao.varela@cern.ch</a> |

| CMS Spokesperson                                                                                         | CMS Technical Coordinator                                                           |
|----------------------------------------------------------------------------------------------------------|-------------------------------------------------------------------------------------|
| Michel Della Negra<br>CERN<br><a href="mailto:Michel.Della.Negra@cern.ch">Michel.Della.Negra@cern.ch</a> | Alain Herve<br>CERN<br><a href="mailto:Alain.Herve@cern.ch">Alain.Herve@cern.ch</a> |

---

## CMS Trigger TDR Editorial Board

---

W.H. Smith, Chair

|                |            |
|----------------|------------|
| Ph. Busson     | S. Dasu    |
| J. Hauser      | G. Heath   |
| J. Krolikowski | J. Varela  |
| A. Taurok      | G. Wrochna |
| P. Zotto       |            |

---




## CMS Trigger TDR Chapter Editors

---

|      |                                                                                                                     |
|------|---------------------------------------------------------------------------------------------------------------------|
| 1.   | W. H. Smith                                                                                                         |
| 2.   | G. Wrochna                                                                                                          |
| 3.   | J. Varela, S. Dasu                                                                                                  |
| 4.   | Ph. Busson, J. Varela, D. Baden                                                                                     |
| 5.   | S. Dasu                                                                                                             |
| 6.   | G. Heath                                                                                                            |
| 7.   | J. Varela                                                                                                           |
| 8.   | G. Wrochna                                                                                                          |
| 9.   | P. Zotto, A. Montanari, R. Martinelli, G. M. Dallavalle, F. Odorici                                                 |
| 10.  | J. Erö, M. Fierro, M. Brugger, G. M. Dallavalle                                                                     |
| 11.  | J. Hauser, P. Padley                                                                                                |
| 12.  | D. Acosta                                                                                                           |
| 13.  | J. Krolikowski, G. Wrochna, M. Konecki, M. Kudla, A. Ranieri,<br>E. Pietarinen, K. Banzuzi, K. Pozniak, P. Zalewski |
| 14.  | C.-E. Wulz, A. Taurok, H. Sakulin                                                                                   |
| 15.  | C.-E. Wulz, A. Taurok                                                                                               |
| 16.  | J. Varela                                                                                                           |
| 17.  | J. Varela                                                                                                           |
| 18.  | W. H. Smith                                                                                                         |
| 19.  | F. Szoncso                                                                                                          |
| 20.  | W. H. Smith, J. Varela                                                                                              |

---

## CMS Collaboration

**Yerevan Physics Institute, Yerevan, ARMENIA**

G.L. Bayatian, N. Grigorian, V.G. Khachatryan, A. Margarian, A.M. Sirunyan, S. Stepanian

**Institut für Hochenergiephysik der ÖAW, Wien, AUSTRIA**

W. Adam, M. Brugger, J. Erö, M. Fierro, M. Friedl, R. Fruehwirth, J. Hrubec, A. Jeitler, M. Krammer, M. Pernicka, P. Porth, H. Rohringer, L. Rurua<sup>1</sup>, A. Taurok, G. Walzel, R. Wedenig, C.-E. Wulz

**Research Institute for Nuclear Problems, Minsk, BELARUS**

V.G. Baryshevsky, A. Fedorov, N. Gorodichenine, M. Korzhik, O. Mishevitch<sup>2</sup>, V. Panov, R. Zuyeuski

**National Centre for Particle and High Energy Physics, Minsk, BELARUS**

I. Akushevich, N. Chekhlova, V. Chekhovsky, O. Dvornikov, I. Emeliantchik, A. Khomitch, V. Kolpaschikov, M. Kryvamaz, V. Kuvshinov, A. Litomin, V. Mossolov, A. Panfilenko, S. Reutovich, N. Shumeiko, A. Solin, R. Stefanovitch, V. Strazhev, S. Vetokhin, Y. Yurenja, V. Zalessky

**Research Institute of Applied Physical Problems, Minsk, BELARUS**

F. Ermalitsky, P. Kuchinsky, V. Lomako

**Byelorussian State University, Minsk, BELARUS**

V. Petrov, V. Prosolovich

**Vrije Universiteit Brussel, Brussels, BELGIUM**

O. Devroede, R. Goorens, J. Lemonne, S. Tavernier, F. Udo<sup>3</sup>, W. Van Doninck<sup>2</sup>, L. Van Lancker

**Université Libre de Bruxelles, Brussels, BELGIUM**

D. Bertrand, G. De Lentdecker, J. Stefanescu, C. Vander Velde, P. Vanlaer

**Université Catholique de Louvain, Louvain-la-Neuve, BELGIUM**

D. Favart, J. Govaerts, G. Gregoire, V. Lemaitre, A. Ninane, K. Piotrkowski, O. Van der Aa

**Université de Mons-Hainaut, Mons, BELGIUM**

I. Boulogne, E. Daubie, P. Herquet

**Universitaire Instelling Antwerpen, Wilrijk, BELGIUM**

W. Beaumont, T. Beckers, E. De Langhe, F. Moortgat, F. Verbeure, V. Zhukov<sup>4</sup>

**Institute for Nuclear Research and Nuclear Energy, Sofia, BULGARIA**

K. Abadjiev, T. Anguelov, I. Atanassov, J. Damgov, L. Dimitrov, V. Genchev, G. Georgiev, P. Iaydjiev, B. Kounov, L. Penchev, P. Raykov<sup>5</sup>, G. Sultanov, I. Vankov, P. Vankov

**University of Sofia, Sofia, BULGARIA**

N. Darmenov, A. Gritskov, A. Jordanov, L. Litov, M. Mateev, P. Petev, V. Spassov, M. Tchijov, R. Tsenov, G.V. Velev<sup>6</sup>

**Institute of High Energy Physics, Beijing, CHINA PR**

J.G. Bian, C. Chen, G.M. Chen, Y.N. Guo, J.T. He, C.H. Jiang, B.N. Jin, Z.J. Ke, J. Li, W.G. Li, X.N. Li, J.F. Qiu, B.W. Shen, X.Y. Shen, H.Y. Sheng, Y.Y. Wang, R.S. Xu, M. Yang, B.Y. Zhang, J.W. Zhang, S.Q. Zhang, W.R. Zhao, Z. Zhao, J.P. Zheng, G.Y. Zhu, Y.S. Zhu

**Peking University, Beijing, CHINA PR**

Y. Ban, J. Cai, J.E. Chen, H.T. Liu, S.Q. Liu, B.Q. Lou, S.J. Qian, Y.L. Ye

**University for Science and Technology of China, Hefei, Anhui, CHINA PR**

Q. An, Z. Bian, H. Chen, Z.F. Gong, C. Li, C.S. Shi, L.Z. Sun, X.L. Wang, Z.M. Wang, J. Wu, S.W. Ye, Z.P. Zhang

**Shanghai Institute of Ceramics, Shanghai, CHINA PR (Associated Institute)**

Q. Deng, P.J. Li, D.Z. Shen, Z.L. Xue, H. Yuan

**National Central University, Chung-Li, CHINA (TAIWAN)**

Y.H. Chang, A.E. Chen, A. Go, W. Lin

**National Taiwan University, Taipei, CHINA (TAIWAN)**

P. Chang, G.W.S. Hou, K. Ueno

**Technical University of Split, Split, CROATIA**

N. Godinovic, M. Milin<sup>7</sup>, I. Puljak, I. Soric, M. Stipcevic<sup>7</sup>, J. Tudoric-Ghemo

**University of Split, Split, CROATIA**

Z. Antunovic, M. Dzelalija, K. Marasovic

**University of Cyprus, Nicosia, CYPRUS**

A. Hasan<sup>5</sup>, P.A. Razis, A. Vorvolakos

**National Institute of Chemical Physics and Biophysics, Tallinn, ESTONIA**

A. Hall, E. Lippmaa, J. Lippmaa, M. Raidal, J. Subbi

**Laboratory of Advanced Energy Systems, Helsinki University of Technology, Espoo, FINLAND**

P.A. Aarnio

**Helsinki Institute of Physics, Helsinki, FINLAND**

K. Banzuzi, M.A. Heikkinen, J.V. Heinonen, A. Honkanen, V.J. Karimäki, H.M. Katajisto, R. Kinnunen, K. Lassila-Perini, V. Lefébure<sup>2</sup>, E. Pietarinen, E. Tuominen, J. Tuominiemi, D. Ungaro, T.P. Vanhala, C. Williams

**Department of Physics, University of Helsinki, Helsinki, FINLAND**

S. Lehti, T. Linden

**Department of Physics, University of Jyväskylä, Jyväskylä, FINLAND**

J. Aystö, R. Julin, V. Ruuskanen

**Dept. of Physics & Microelectronics Instrumentation Lab., University of Oulu, Oulu, FINLAND**

S. Kallijarvi, A.J. Keranen, L. Palmu, K. Remes, E. Suhonen, T. Tuuva

**Digital and Computer systems Laboratory, Tampere University of Technology, Tampere, FINLAND**

J. Niittylahti, O. Vainio

**Laboratoire d'Annecy-le-Vieux de Physique des Particules, IN2P3-CNRS, Annecy-le-Vieux, FRANCE**

Y.W. Baek, D. Boget, J. Ditta, G. Drobychev, J.P. Guillaud, M. Maire, J.P. Mendiburu, P. Nedelec, J.P. Peigneux, M. Schneegans<sup>2</sup>, D. Sillou

**DSM/DAPNIA, CEA/Saclay, Gif-sur-Yvette, FRANCE**

M. Anfreville, P. Bonamy, C. Bouchand, R. Chipaux, M. Dejardin, D. Denegri, F.X. Gentit, A. Givernaud, F. Kircher, Y. Lemoigne, E. Locci, J.P. Lottin, M.C. Nguyen, J.P. Pansart, A. Payn, J. Rander, J.M. Reymond, F. Rondeaux, A. Rosowsky, P. Roth, P. Verrecchia

**Laboratoire de Physique Nucléaire des Hautes Energies, Ecole Polytechnique, IN2P3-CNRS, Palaiseau, FRANCE**

J. Badier, M. Bercher, J. Bourotte<sup>2</sup>, P. Busson, D. Chamont, C. Charlot, L. Dobrzynski, J. Gilly, M. Haguenauer, A. Karar, G.B. Kim, L. Kluberg, D. Lecouturier, P. Matricon, G. Milleret, P. Mine, R. Morano, P. Paganini, P. Poilleux, T. Romantreau

**Institut de Recherches Subatomiques, IN2P3-CNRS - ULP, LEPSI Strasbourg, UHA Mulhouse, Strasbourg, FRANCE**

A. Albert<sup>8</sup>, J.D. Berst<sup>9</sup>, R. Blaes<sup>8</sup>, J.M. Brom, F. Charles<sup>8</sup>, J. Coffin, F. Didierjean, F. Drouhin<sup>8</sup>, J.P. Ernenwein<sup>8</sup>, J.C. Fontaine<sup>8</sup>, W. Geist, U. Goerlach, J.M. Helleboid, D. Huss<sup>8</sup>, C. Illinger<sup>9</sup>, P. Juillot, A. Lounis, C. Maazouzi, S. Moreau, Y. Riahi, I. Ripp-Baudot, T. Todorov<sup>2</sup>, D. Vintache, A. Zghiche

**Institut de Physique Nucléaire de Lyon, IN2P3-CNRS, Univ. Lyon I, Villeurbanne, FRANCE**

M. Ageron, M. Bedjidian, E. Chabanat, C. Combaret, D. Contardo, P. Depasse, O. Drapier, M. Dupanloup, H. El Mamouni, J. Fay, S. Gascon-Shotkin, N. Giraud, C. Girerd, M. Goyot, R. Haroutounian, B. Ille, P. Lebrun, M. Lethuillier, J.P. Martin, H. Mathez, L. Mirabito, S. Muanza, S. Perries, P. Sahuc, G. Smadja, S. Tissot, J.P. Walder, F. Zach

**High Energy Physics Institute, Tbilisi State University, Tbilisi, GEORGIA**

N. Amaglobeli, Y. Bagaturia, L. Glonti, V. Kartvelishvili, R. Kvataladze, D. Mzavia, T. Sakhelashvili<sup>10</sup>, R. Shanidze, Z. Tsamalaidze

**Institute of Physics Academy of Science, Tbilisi, GEORGIA**

N. Djaoshvili, I. Iashvili<sup>11</sup>, A. Kharchilava, N. Roinishvili, V. Roinishvili

**RWTH, I. Physikalisches Institut, Aachen, GERMANY**

C. Berger, W. Braunschweig, J. Breibach, W.H. Gu, A. Heister, W. Karpinski, T. Kirn, S. Konig, C. Kukulies, A. Ostaptchouk, D. Pandoulas, G. Pierschel, F. Raupach, S. Schael, D. Schmitz, A. Schultz Von Dratzig, R. Siedling, W. Wallraff, B. Wittmer

**RWTH, III. Physikalisches Institut A, Aachen, GERMANY**

K. Banicz, S. Bechstein, A. Bohm, K. Bosseler, H. Faissner, H. Fesefeldt, S. Hermann, A. Ivannikov, D. Rein, H. Reithler, H. Szczesny, M. Tonutti, M. Wegner

**RWTH, III. Physikalisches Institut B, Aachen, GERMANY**

M. Axer, F. Beissel, V. Commichau, G. Flüge, K. Hangarter, D. Macke, J. Mnich, A. Nowack, M. Petertill, P. Schmitz, R. Schulte, L. Sonnenschein, A. Zander

**Humboldt-Universität zu Berlin, Berlin, GERMANY**

M. Grunewald, T. Hebbeker, K. Hoepfner, A. Rosca

**Institut für Experimentelle Kernphysik, Karlsruhe, GERMANY**

V. Bartsch, P. Blum, W. De Boer, A. Dierlamm, G. Dirkes, V. Drollinger, M. Erdmann, M. Feindt, E. Grigoriev, F. Hartmann, F. Hauler, A. Heiss, T. Muller, F. Roederer, H.J. Simonis, A. Skiba, A. Theel, W.H. Thummel, T. Weiler, S. Weseler

**University of Athens, Athens, GREECE**

L. Resvanis

**Institute of Nuclear Physics "Demokritos", Attiki, GREECE**

P. Adzic, M. Barone, I. Bozovic-Jelisavcic, G. Fanourakis, T. Geralis, S. Harissopoulos, P. Kokkinias, A. Kyriakis, D. Loukas, A. Markou, C. Markou, N. Mastroyiannopoulos, J. Mousa, I. Siotis, M. Spyropoulou-Stassinaki, A. Staveris Polykalas, A. Tsirigotis, S. Tzamarias, A. Vayaki, K. Zachariadou, M. Zupan

**University of Ioánnina, Ioánnina, GREECE**

A. Asimidis, I. Evangelou, P. Kokkas, N. Manthos, O. Mitropoulos, K. Prouskas, F.A. Triantis, N. Tzoulis

**KFKI Research Institute for Particle and Nuclear Physics, Budapest, HUNGARY**

Z. Bagoly, G. Bencze<sup>2</sup>, A. Csilling, C. Hajdu, P. Hidas, D. Horvath<sup>12</sup>, G. Odor, A. Ster, L. Urban, G. Vesztergombi, P. Zalan, M. Zsenei

**Institute of Nuclear Research ATOMKI, Debrecen, HUNGARY**

G. Dajko, A. Fenyvesi, J. Molnar, J. Palinkas, D. Sohler, Z.L. Trocsanyi, J. Vamosi, J. Vegh

**Kossuth Lajos University, Debrecen, HUNGARY**

L. Baksay, T. Bondar, S. Juhasz, G. Marian, S. Nagy, P. Raics, J. Szabo, Z. Szabo, S. Szegedi, Z. Szillasi, T. Sztaricskai, P. Tarjan, G. Zilizi

**Panjab University, Chandigarh, INDIA**

S.B. Bala, V. Bhatnagar, M. Kaur, J.M. Kohli, J. Singh

**Bhabha Atomic Research Centre, Mumbai, INDIA**

S. Borkar<sup>2</sup>, V. Chandraatre, R.K. Chaudhury, M.Y. Dixit, M. Ghodgaonkar, B. John, S.K. Kataria, A.K. Mohanty, A. Topkar

**Tata Institute of Fundamental Research - EHEP, Mumbai, INDIA**

T. Aziz, Sunanda Banerjee, S. Chendvankar, P.V. Deshpande, S.N. Ganguli, A. Gurtu, S. Katta, M. Maity, K. Mazumdar, M.R. Patil, S.C. Tonwar

**Tata Institute of Fundamental Research - HEGR, Mumbai, INDIA**

B.S. Acharya, Sudeshna Banerjee, S. Bheesette, S. Dugad, S.D. Kalmani, M.R. Krishnaswamy, V.R. Lakkireddi, N.K. Mondal, N. Panyam, N. Vemuri, P. Verma

**University of Delhi South Campus, New Delhi, INDIA**

A. Bhardwaj, R.K. Shivpuri, V.K. Verma

**Università di Bari, Politecnico di Bari e Sezione dell' INFN, Bari, ITALY**

M. Abbrescia, A. Colaleo, D. Creanza, M. De Palma, L. Fiore, G. Iaselli, F. Loddo, G. Maggi, B. Marangelli, M. Menegotto, S. My, S. Natali, S. Nuzzo, G. Pugliese, V. Radicci, A. Ranieri, F. Romano, F. Ruggieri, G. Selvaggi, L. Silvestris<sup>2</sup>, P. Tempesta, G. Zito

**Università di Bologna e Sezione dell' INFN, Bologna, ITALY**

A. Benvenuti, P. Capiluppi, F. Cavallo, M. Cuffiani, I. D'Antone, G.M. Dallavalle, F. Fabbri, A. Fanfani, G. Giacomelli, P. Giacomelli<sup>13</sup>, C. Grandi, M. Guerzoni, S. Marcellini, P. Mazzanti, A. Montanari, C. Montanari, F. Navarria, F. Odorici, A. Perrotta, A. Rossi, T. Rovelli, G. Siroli, R. Travaglini

**Università di Catania e Sezione dell' INFN, Catania, ITALY**

S. Albergo, V. Bellini, P. Castorina, S. Cavalieri, M. Chiorboli, S. Costa, L. Lo Monaco, R. Potenza, V. Russo, A. Tricomi, C. Tuve

**Università di Firenze e Sezione dell' INFN, Firenze, ITALY**

L. Bellucci, U. Biggeri, E. Borchi, M. Bruzzi, A. Buffini, S. Busoni, G. Castellini, C. Civinini, R. D'Alessandro, E. Focardi, G. Landi, M. Lenzi, M. Meschini, C. Minelli, G. Parrini, M. Pieri, S. Pirollo, S. Sciortino

**Università di Genova e Sezione dell' INFN, Genova, ITALY**

M. Bozzo, P. Fabbricatore, S. Farinon, R. Musenich, C. Priano

**Laboratori Nazionali di Legnaro e Sezione dell' INFN, Legnaro, ITALY (Associated Institute)**

L. Berti, M. Biasotto, U. Gastaldi, M. Gulmini, G. Maron, N. Toniolo

**Università di Padova e Sezione dell' INFN, Padova, ITALY**

P. Azzi, N. Bacchetta, M. Bellato, M. Benettoni, D. Bisello, A. Candelori, A. Castro, P. Checchia, E. Conti, M. De Giorgi, A. De Min, U. Dosselli, C. Fanin, F. Gasparini, U. Gasparini, F. Gonella, A. Kaminski, S. Lacaprara, I. Lippi, M. Loreti, A.T. Meneguzzo, M. Michelotto, F. Montecassiano, A. Neviani, A. Paccagnella, S. Paoletti, M. Passaseo, M. Pegoraro, P. Ronchese, I. Stavitski, E. Torassa, L. Ventura, S. Ventura, M. Verlato, P. Zotto<sup>14</sup>, G. Zumerle

**Università di Pavia e Sezione dell' INFN, Pavia, ITALY**

S. Altieri, G. Belli, G. Bruno, R. Guida, M. Merlo, S.P. Ratti, C. Riccardi, P. Torre, P. Vitulo

**Università di Perugia e Sezione dell' INFN, Perugia, ITALY**

M.M. Angarano, E. Babucci, M. Biasini, G.M. Bilei, M.T. Brunetti, F. Ceccotti, B. Checcucci, M. Giorgi, P. Lariccia, G. Mantovani, D. Passeri, P. Placidi, V. Postolache, R. Santinelli, A. Santocchia, A. Scorzoni, L. Servoli, G. Tommasi

**Università di Pisa e Sezione dell' INFN, Pisa, ITALY**

F. Angelini, G. Bagliesi, A. Bardi, A. Basti, R. Bellazzini, J. Bernardini, L. Borrello, F. Bosi, P.L. Braccini, A. Brez, R. Carosi, R. Castaldi, G. Chiarelli, V. Ciulli, M. Dell'Orso, R. Dell'Orso, S. Donati, S. Dutta, L. Foa, S. Galeotti, P. Giannetti, A. Giassi, S. Giusti, G. Iannaccone, L. Latronico, F. Ligabue, N. Lumb, G. Magazzu, M. Mariani, M.M. Massai, A. Messineo, O. Militaru<sup>15</sup>, A. Moggi, F. Morsani, F. Palla, G. Punzi, F. Raffaelli, L. Ristori, G. Sanguinetti, A. Sciaba, G. Segneri, G. Sguazzoni, G. Spandre, M. Spezziga, F. Spinella, A. Starodumov, R. Tenchini, L. Teodorescu<sup>15</sup>, G. Tonelli, A. Venturi, P.G. Verdini, J. Wang, Z. Xie

**Università di Roma I e Sezione dell' INFN, Roma, ITALY**

S. Baccaro<sup>16</sup>, L. Barone, A. Bartoloni, M. Castellani, A. Cecilia, I. Dafinei, F. De Notaristefani, M. Diemoz, A. Festinesi<sup>16</sup>, E. Longo<sup>2</sup>, P. Meridiani, M. Montecchi<sup>16</sup>, G. Organtini, M. Puccini<sup>16</sup>, E. Valente, A. Zullo

**Università di Torino e Sezione dell' INFN, Torino, ITALY**

N. Amapane, M. Arneodo, F. Bertolino, C. Biino-Palestini, R. Cirio, M. Costa, D. Dattola, F. Daudio, V. Del Duca, L. Demaria, G. Favro, M.I. Ferrero, S. Maselli, V. Monaco, C. Peroni, A. Romero, R. Sacchi, A. Solano, A. Staiano, A. Vitelli

**Cheju National University, Cheju, KOREA**

Y.J. Kim

**Chungbuk National University, Chongju, KOREA**

Y.U. Kim

**Kangwon National University, Chunchon, KOREA**

S.K. Nam

**Wonkwang University, Iksan, KOREA**

S.Y. Bahk

**Chonnam National University, Kwangju, KOREA**

H.I. Jang, J.Y. Kim, T.I. Kim, I.T. Lim

**Dongshin University, Naju, KOREA**

M.Y. Pac

**Seonam University, Namwon, KOREA**

S.J. Lee

**Konkuk University, Seoul, KOREA**

J.T. Rhee

**Korea University, Seoul, KOREA**

S. Ahn, B. Hong, S.J. Hong, Y.J. Kim, K.S. Lee, H.K. Park, S.K. Park, K.S. Sim

**Seoul National University of Education, Seoul, KOREA**

D.G. Koo

**Seoul National University, Seoul, KOREA**

B.J. Kim, S.B. Kim, I.H. Park

**Sungkyunkwan University, Suwon, KOREA**

B.G. Cheon, Y.I. Choi

**Kyungpook National University, Taegu, KOREA**

W.H. Chung, S.W. Ham, H.M. Jeon, D.H. Kim, G. Kim, W.Y. Kim, S.K. Oh, S. Ro, D.C. Son

**National Centre for Physics, Quaid-I-Azam University, Islamabad, PAKISTAN**

Z. Aftab, M.A. Ahmad, J. Alam Jan, N. Bhatti, K. Hasanain, H.R. Hoorani<sup>2</sup>, M.K. Khan, S.M. Khan, A. Niaz, R. Riazuddin, T. Solaija

**Ghulam Ishaq Khan Institute of Engineering Sciences and Technology, Swabi, PAKISTAN**

J. Ahmad, I.U. Awan, N. Iftikhar, M.A. Khan, M.U. Mirza, A. Muhammad, J. Zeb

**Institute of Experimental Physics, Warsaw, POLAND**

M. Cwiok, H. Czyrkowski, R. Dabrowski, W. Dominik, M. Kazana, J. Krolikowski, I. Kudla, P. Majewski, M. Pietrusinski, K. Pozniak, P. Zych

**Soltan Institute for Nuclear Studies, Warsaw, POLAND**

R. Gokieli<sup>2</sup>, M. Gorski, L. Goscilo, G. Wrochna, P. Zalewski

**Laboratório de Instrumentação e Física Experimental de Partículas, Lisboa, PORTUGAL**

C. Almeida<sup>30</sup>, N. Almeida, J. Augusto<sup>30</sup>, T. Barata Monteiro, P. Bordalo<sup>31</sup>, N. Cardoso<sup>30</sup>, S. Da Mota Silva, J. Da Silva, O.P. Dias<sup>30</sup>, J. Gomes, F.M. Goncalves<sup>30</sup>, S. Ramos<sup>31</sup>, M. Santos<sup>30</sup>, J. Semiao<sup>30</sup>, S. Sequeira Lopes Tavares<sup>2</sup>, S. Silva, C. Simoes Azevedo, I. Teixeira<sup>30</sup>, J.P. Teixeira<sup>30</sup>, J. Varela<sup>2, 31</sup>

**Joint Institute for Nuclear Research, Dubna, RUSSIA**

S. Afanasiev, I. Anissimov, D. Bandurin, A. Belkov, S. Chatrchyan, A. Cheremukhin, C.V. Cheshkov, A. Chvyrov, A. Dmitriev, V. Elsha, Y. Erchov, M. Finger, M. Finger, I. Golutvin, N. Gorbunov, I. Gramenitsky, I. Ivantchenko, A. Janata, V. Kalagin, V. Karjavin, S. Khabarov, V. Khabarov, Y. Kiryushin, V. Kolesnikov, V. Konoplyanikov, V. Korenkov, I. Kossarev, A. Koutov, T. Kracikova, V. Krasnov, A. Litvinenko, V. Lysiakov, A. Malakhov, G. Mechtcheriakov, I. Melnichenko, V. Mitsyn, P. Moissenz, S. Movchan, V. Palichik, V. Perelygin, Y. Petukhov, M. Popov, D. Pose, R. Pose, A. Samoshkin, M. Savina, S. Sergeev, S. Shmatov, A. Skatchkova, V. Smirnov, D. Smolin, E. Tikhonenko, V. Uzhinskii, N. Vlasov, A. Volodko, A. Yukaev, N. Zamiatin, A. Zarubin, P. Zarubin, E. Zubarev

**Petersburg Nuclear Physics Institute, Gatchina (St Petersburg), RUSSIA**

A. Atamanchuk, V. Barashko, N. Bondar, L. Chtchipounov, A. Denissov, G. Gavrilov, V. Golovtsov, Y. Gusev, Y. Ivanov, O. Kisilev, V. Kozlov, E. Kouznetsova, E. Lobatchev, G. Makarenkov, E. Orichtchine, A. Petrunin, O. Prokofiev<sup>17</sup>, V. Rasmislovich, B. Razmyslovich, V. Sknar, I. Smirnov, S. Sobolev, V. Soulimov, V. Souvorov, A. Vassiliev, G. Velitchko, S. Volkov, A. Vorobyov

**Institute for Nuclear Research, Moscow, RUSSIA**

G.S. Atoyan, V. Bolotov, S. Gninenko, N. Goloubev, E.V. Gushin, M. Kirsanov, A. Kovzelev, N. Krasnikov, S. Laptev, V. Matveev, A. Pashenkov, A. Poliarouch, V.E. Postoev, A. Proskouriakov, A. Sadovski, I. Semeniouk, V. Shmatkov, A. Skassyrskaya<sup>6</sup>, A. Toropin<sup>6</sup>

**Institute for Theoretical and Experimental Physics, Moscow, RUSSIA**

E. Dorochkevitch, V. Gavrilov, V. Kaftanov, A. Khanov<sup>18</sup>, I. Kisselievitch, V. Kolosov, M. Kossov, S. Koulechov, A. Krokhotine, A. Oulianov, N. Stepanov, V. Stoline, S. Uzunyan

**P.N. Lebedev Physical Institute, Moscow, RUSSIA**

E. Devitsin, A.M. Fomenko, N. Konovalova, V. Kozlov, A.I. Lebedev, N. Loktionova, N. Lvova, S. Potashov, S.V. Rusakov, A. Terkulov

**Moscow State University, Moscow, RUSSIA**

A. Belsky, V. Bodyagin, E. Boos, A. Cherstnev, A. Demianov, M. Dubinin, L. Dudko, A. Erchov, R. Gloukhov, A. Gribushin, V. Ilin, O.L. Kodolova, V. Korotkikh, A. Krioukov, N.A. Kruglov, I.P. Loktin, V. Mikhailin, S. Petrouchanko, A. Poukhov, L. Sarycheva, V. Slad, A. Sniguirev, I. Vardanyan, A. Vassiliev

**High Temperature Technology Center of Research & Development Institute of Power Engineering (HTTC RDIPE), Moscow, RUSSIA (Associated Institute)**

D. Chmelev, A. Ivanov, V. Koudinov, O. Logatchev, S. Onishchenko, A. Orlov, V. Sakharov, V. Smetannikov, S. Zavodthikov

**Institute for High Energy Physics, Protvino, RUSSIA**

A. Abramov, V. Abramov, I. Azhgirey, S. Bitioukov, K. Datsko, A. Dolgopolov, V. Evdokimov, V. Falaleev<sup>2</sup>, P. Goncharov, A. Inyakin, V. Katchanov, Y. Kharlov, V. Klioukhine, E. Kolatcheva, A. Korablev, Y. Korneev, A. Kostritski, A. Krinitsyn, V. Kryshkin, O. Lapyguina, A. Levine, A. Markov, V. Medvedev, M. Oukhanov, V. Pak, D. Patalakha, V. Petrov, V. Pikalov, V. Potapov, A. Riabov, A. Sannikov, P. Shagin, Z. Simonova, E. Skvortsova, S. Slabospitski, A. Sobol, V. Solovianov, V. Sougonyaev, S. Stepouchkine, A. Surkov, A. Sytin, B. Tchuiko, S. Tereschenko, S. Troshin, L. Turchanovich, N.E. Tyurin, A. Uzunian, A. Volkov, A. Zaichenko, S. Zelepoukine

**Russian Federal Nuclear Centre - Scientific Research Institute for Technical Physics (RFNC-VNIITF), Snezhinsk, RUSSIA (Associated Institute)**

A. Andriyash, A. Chtcherbakov, D. Gorchkov, D. Griaznykh, O. Gueinak, D. Korotchine, S. Kotegov, A. Maloiaroslavtsev, M. Naoumenko, I. Pavlov, S. Samarine, R. Skripov

**Slovak University of Technology, Bratislava, SLOVAK REPUBLIC**

P. Ballo, J. Lipka, V. Necas, M. Seberini, K. Vitazek

**Centro de Investigaciones Energeticas Medioambientales y Tecnologicas, Madrid, SPAIN**

M. Aguilar-Benitez, J. Alberdi, J.M. Barcala, M. Cerrada, N. Colino, M. Daniel, M. Fernandez Garcia, A. Ferrando, M.C. Fouz, M.I. Josa, P. Ladron De Guevara, J. Marin, A. Molinero, J.C. Oller, J.L. Pablos, J. Puerta Pelayo, L. Romero, J. Salicio, C. Willmott

**Universidad Autónoma de Madrid, Madrid, SPAIN**

C. Albajar

**Universidad de Oviedo, Oviedo, SPAIN**

J. Cuevas

**Instituto de Física de Cantabria (IFCA), CSIC-Universidad de Cantabria, Santander, SPAIN**

P. Arce, E. Calvo, C. Figueroa, G. Gomez Ceballos, I. Gonzalez, J.M. Lopez, M.A. Lopez Virto, J. Marco, F. Matorras, T. Rodrigo, A. Ruiz Jimeno, I. Vila

**Universität Basel, Basel, SWITZERLAND**

P. Garcia-Abia, L. Tauscher, S. Vlachos, M. Wadhwa

**CERN, European Organization for Nuclear Research, Geneva, SWITZERLAND**

D. Abbaneo<sup>6</sup>, R. Alemany-Fernandez, A. Annenkov<sup>19</sup>, P. Aspell, E. Auffray, P. Azzurri, P. Baillon, A. Ball, R. Barillere, D. Barney, D. Blechschmidt, P. Bloch, M. Bosteels, S. Braibant, H. Breuker, P. Brooks<sup>20</sup>, D. Campi, A. Caner, E. Cano, F. Carena, A. Cattai, F. Cavallari, G. Cervelli, R. Chierici, J. Christiansen, S. Cittolin, B. Cure, C. D'Ambrosio, A. De Roeck, T. De Visser, D. Delikaris, M. Della Negra, A. Elliott-Peisert, B. Faure, A. Favara, P. Figueiredo, H. Foeth, R. Folch, A. Frey, W. Funk, A. Furtjes, A. Gaddi, J.C. Gayde, H. Gerwig, K. Gill<sup>21</sup>, W. Glessing, R. Goudard, P. Gras, J.P. Grillet, J. Guteleber, F. Hahn, S. Hameed Khan<sup>22</sup>, R. Hammarstrom, M. Hansen, E.H.M. Heijne, A. Hervé, A. Honma, M. Huhtinen, V. Innocente, W. Jank, P. Janot, P. Jarron, M. Kado, K. Kloukinas, C. Koch, M. Konecki, Z. Kovacs, V. Lara, C. Lasseur, J.M. Le Goff, M. Lebeau, P. Lecoq, M. Letheren, M. Liendl, C. Ljuslin, B. Lofstedt, R. Loos, R. Mackenzie, R. Malina, M. Mannelli, E. Manola-Poggiali, A. Marchioro, J.C. Marin, C. Mariotti, C. Martinez Rivero, J. Matheson, J.M. Maugain, F. Meijers, M. Mermoud, E. Meschi, E. Migliore, J. Mocholi Moncholi<sup>23</sup>, A. Moutoussi, N. Neumeister, A. Nikitenko<sup>24</sup>, A. Oh, A. Onnela, M. Oriunno, L. Orsini, G. Pa'Sztor, L. Pape, C. Palomares Espiga, G. Passardi, P. Petagna, A. Pfeiffer, M. Pimiä, R. Pintus, E. Piotto, B. Pirollet, A. Placci, J.P. Porte, H. Postema, J. Pothier, R. Principe, A. Quadt, A. Racz, P. Rebecchi, P. Reis, S. Reynaud, H. Rezvani Naraghi, R. Ribeiro, J. Roche, P. Rodrigues Simoes Moreira, T. Rohe, G. Rolandi, H. Sakulin, D. Samyn, J.C. Santiard, W. Schleifer, R. Schmidt, M. Schroder, C. Schwick, P. Sempere Roldan, P. Sharp<sup>25</sup>, P. Siegrist, F. Sikler, A. Simma, P. Spagnolo, P. Sphicas<sup>26</sup>, H. Stockinger, F. Szoncso, B.G. Taylor, D. Tchougounov<sup>19</sup>, T. Toifl, N. Toth<sup>20</sup>, D. Treille, J. Troska, E. Tsesmelis, A. Tsirou, F. Van Lingen<sup>20</sup>, F. Vasey, T. Virdee<sup>21</sup>, H. Voss, W. Weingarten, J.P. Wellisch, P. Wertelaers, M. Wilhelmsson, I.M. Willers, M. Winkler<sup>27</sup>, S. Wynhoff

**Paul Scherrer Institut, Villigen, SWITZERLAND**

M. Barbero, R. Baur, W. Bertl, K. Deiters, P. Dick, A. Dijksman, K. Gabathuler, J. Gobrecht, G. Heidenreich, R. Horisberger, Q. Ingram, D. Kotlinski, R. Morf, D. Renker, R. Schnyder

**Institut für Teilchenphysik, Eidgenössische Technische Hochschule (ETH), Zürich, SWITZERLAND**

H. Anderhub, G. Antchev<sup>28</sup>, A. Badertscher, A. Barczyk, B. Betev, A. Biland, B. Blau, D. Bourilkov<sup>28</sup>, A. Bueno, M. Campanelli, P. Cannarsa, C. Carpanese, N. Chivarov<sup>28</sup>, M. Dittmar<sup>2</sup>, L. Djambazov, R. Eichler, W. Erdmann<sup>10</sup>, G. Faber, J.L. Faure, M. Felcini, K. Freudenreich, I. Gil Botella, C. Grab, M. Hilgers, H. Hofer, A. Holzner, I. Horvath, C. Humbertclaude<sup>2</sup>, B. Iliev<sup>28</sup>, P. Ingenito, J. Kuipers, P. Le Coultre, P. Lecomte, B. List, W. Lustermann, I. Nanov, F. Nessi-Tedaldi, R.A. Ofierzynski, A. Patino Revuelta<sup>2</sup>, F. Pauss, G. Rahal, J.F. Rico, C.H. Rivetta<sup>2</sup>, U. Roeser, A. Rubbia, H. Rykaczewski, A. Schoning, N. Sinanis, H. Suter, S. Udriot, J. Ulbricht, I. Veltchev<sup>28</sup>, G. Viertel, H. Von Gunten, S. Waldmeier-Wicki, A. Zanet<sup>2</sup>

**Universität Zürich, Zürich, SWITZERLAND**

C. Amsler, R. Kaufmann, H. Pruys, C. Regenfus, P. Riedler, P. Robmann, T. Speer, S. Steiner

**Cukurova University, Adana, TURKEY**

I. Dumanoglu, E. Eskut, A. Kayis Topaksu, A. Kuzucu-Polatoz, G. Onengut, N. Ozdes Koca

**Middle East Technical University, Physics Department, Ankara, TURKEY**

A.M. Guler, M. Serin-Zeyrek, R. Sever, P. Tolun, H. Yildiz, M. Zeyrek

**Bogazici University, Department of Physics, Istanbul, TURKEY**

E. Gulmez, R. Unalan

**Institute of Single Crystals of National Academy of Science, Kharkov, UKRAINE**

A. Borisenko, B. Grinev, V. Lebedev, V. Lyubynskiy, V. Senchyshyn, V. Vasilchuk

**National Scientific Center, Kharkov Institute of Physics and Technology, Kharkov, UKRAINE**

L. Levchuk, A. Nemashkalo, V. Popov, P. Sorokin

**Kharkov State University, Kharkov, UKRAINE**

S. Duplij, N.A. Kluban, I. Zalyubovskiy

**University of Bristol, Bristol, UNITED KINGDOM**

D.S. Bailey, J.J. Brooke, D. Cussans, R.D. Head, G.P. Heath, H.F. Heath, C.K. Mackay, D.M. Newbold, A.D. Presland, M.G. Probert, V.J. Smith, R.J. Tapper

**Centre for Complex Cooperative Systems, University of the West of England, Bristol, UNITED KINGDOM (Associated Institute)**

N. Baker, A. Barry, G. Chevenier, F. Estrella, G. Mathers<sup>2</sup>, R. McClatchey<sup>2</sup>

**Rutherford Appleton Laboratory, Didcot, UNITED KINGDOM**

S.A. Baird, R.A. Barlow, E. Bateman, K.W. Bell, R.M. Brown, D.J. Cockerill, J.A. Coughlan, L.G. Denton, P.S. Flower, V.B. Francis, M. French, J. Greenhalgh, R. Halsall, W.J. Haynes, F.R. Jacob, P.W. Jeffreys, L. Jones, B.W. Kennedy, L. Lintern, A.B. Lodge, J. Maddox, S. Martin, Q.R. Morrissey, P. Murray, P. Rabbett, A.A. Shah, B. Smith, S. Spagnolo, M. Sproston, R. Stephenson, P. Thayaparan, I. Tomalin, M. Torbet, J. Williams

**Imperial College, University of London, London, UNITED KINGDOM**

M. Apollonio, G. Barber, R. Beuselinck, D. Britton, W. Cameron, E. Corrin, G. Davies, C. Foudas, J. Fulcher, G. Hall, J. Hays, G. Iles, B.C. Macevoy, N. Marinelli, E.M. McLeod, E. Noah Messomo, D.M. Raymond, P.J. Savage, C. Seez, L. Toudup, P. Walsham

**Brunel University, Uxbridge, UNITED KINGDOM**

R. Broadhead, B. Camanzi, P.R. Hobson, D.C. Imrie, A. McKemey, O. Sharif, S.J. Watts

**University of California at Davis, Davis, California, USA**

R. Breedon, P.T. Cox, J. Gunion, S. Hershman, B. Holbrook, W. Ko, R. Lander, P. Murray, D. Pellett, J. Smith, M. Tripathi, R. Vogt

**University of California San Diego, La Jolla, California, USA**

S. Bhattacharya, J.G. Branson, I. Fisk, J.P. Fryckman, D. Macfarlane, M. Mojaver, H.P. Paar, G. Raven, V. Sharma, A. White

**University of California at Los Angeles, Los Angeles, California, USA**

K. Arisaka, A. Attal, F. Chase, D. Cline, R. Cousins, S. Erhan, J. Hauser, M. Lindgren, C. Matthey, S. Otwinowski, Y. Pischalnikov, P. Schlein, Y. Shi, B. Tannenbaum, M. Von Der Mey, H.G. Wang

**California Institute of Technology, Pasadena, California, USA**

J. Bunn, G. Denis, P. Galvez, M. Gataullin, M. Hafeez<sup>2</sup>, T. Hickey, K. Holtman, I. Legrand, V. Litvine, H.B. Newman, A. Samar, S. Shevchenko, R. Wilkinson, L. Xia, R.Y. Zhu

**University of California, Riverside, California, USA**

R. Clare, I. Crotty<sup>2</sup>, J.W. Gary, J.G. Layter, H. Rick, B.C. Shen, V. Sytnik, D. Zer-Zion<sup>2</sup>

**Fairfield University, Fairfield, Connecticut, USA**

C.P. Beetz, G. Cirino, V. Podrasky, C. Sanzeni, D. Winn

**University of Florida, Gainesville, Florida, USA**

D. Acosta, P. Avery, S. Dolinsky, R.D. Field, L. Gorn, S. Klimenko, J. Konigsberg, A. Korytov, A. Madorsky, G. Mitselmakher<sup>17</sup>, A. Nomerotski, P. Ramond, B. Scurlock, S.M. Wang, J. Yelton

**Florida State University, Tallahassee, Florida, USA**

H. Baer, M. Bertoldi, H. Goldman, S. Hagopian, V. Hagopian, K.F. Johnson, H.B. Prosper, J. Thomaston, H. Wahl

**Fermi National Accelerator Laboratory, Batavia, Illinois, USA**

G. Apolinari, M. Atac, S. Aziz, E. Barsotti, L.A.T. Bauerick, A. Baumbaugh, U. Baur, A. Beretvas, M. Binkley, M. Bowden, J.N. Butler, N. Chester, I. Churin, S. Cihangir, M. Crisler, D. Denisov, M. Diesburg, D.P. Eartly, J.E. Elias, S. Feher, B. Flaugher, J. Freeman, I. Gaines, H. Glass, D.A. Glenzinski, J. Goldstein, D. Green, J. Hanlon, S. Hansen, R.M. Harris, J. Incandela, U. Joshi, S. Kwan, M. Lamm, S. Lammel, D. Lazic, R. Lee, R. Lipton, M. Litmaath, S. Los, P. Lukens, K. Maeshima, J.M. Marraffino, S. Mishra, N. Mokhov, C. Moore, S. Muzaffar, V. O'Dell, J. Patrick, R. Pordes, R. Raja, P. Rapidis, M. Reichanadter, A. Ronzhin, V. Rykalin, T. Shaw, M. Shea, E. Skup, R.P. Smith, L. Spiegel, D. Stuart, I. Suzuki, S. Tkaczyk, R. Tschirhart, R. Vidal, R. Wands, H. Wenzel, J. Whitmore, W.J. Womersley, W.M. Wu, A. Yagil, V. Yarba

**University of Illinois at Chicago (UIC), Chicago, Illinois, USA**

M.R. Adams, C.E. Gerber, K. Papageorgiou, J. Solomon

**Northwestern University, Evanston, Illinois, USA**

B. Gobbi, S. Malik, R. Tilden

**University of Notre Dame, Notre Dame, Indiana, USA**

B. Baumbaugh, J.M. Bishop, N.M. Cason, M. Hildreth, D.J. Karmgard, R. Ruchti, J. Warchol, M. Wayne

**Purdue University - Tasks D & G, West Lafayette, Indiana, USA**

V.E. Barnes, G. Bolla, D. Bortoletto, A. Bujak, A.F. Garfinkel, L. Gutay, M. Kopal, A.T. Laasanen, S. Medved, I. Pal, C. Rott, A. Roy, A. Sedov

**Iowa State University, Ames, Iowa, USA**

E.W. Anderson, H. Chakir, J.M. Hauptman, J. Krane

**The University of Iowa, Iowa City, Iowa, USA**

U. Akgun, A.S. Ayan, A. Cooper, M. Fountain, E. McCliment, J.P. Merlo, M.J. Miller, Y. Onel, I. Schmidt

**Johns Hopkins University, Baltimore, Maryland, USA**

B.A. Barnett, C.Y. Chien, H.S. Cho, G. Liang, M. Swartz, X. Xie

**University of Maryland, College Park, Maryland, USA**

S. Abdullin<sup>24</sup>, S. Arcelli, D. Baden, R. Bard, S.C. Eno, D. Fong, T. Grassi, N.J. Hadley, R.G. Kellogg, S. Kunori, A. Sharma, A. Skuja

**Boston University, Boston, Massachusetts, USA**

R. Carey, E. Hazen, E. Kearns, E. Machado, J. Miller, D. Osborne, B.L. Roberts, J. Rohlf, L. Sulak, J.D. Sullivan, S. Wu

**Northeastern University, Boston, Massachusetts, USA**

G. Alverson, H. Fenker, I. Gaponenko, J. Moromisato, Y.V. Musienko<sup>29</sup>, S. Nicol, T. Paul, S. Reucroft, J. Swain, L. Taylor, E. Von Goeler, T. Yasuda

**Massachusetts Institute of Technology, Cambridge, Massachusetts, USA**

G. Bauer, J. Friedman, C. Paus, S. Pavlon, L. Rosenson, K.S. Sumorok, S. Tether, J. Tseng

**University of Minnesota, Minneapolis, Minnesota, USA**

P. Cushman, A.H. Heering, I. Kronkvist, R. Rusack, A. Singovsky, P. Vikas

**University of Mississippi, University, Mississippi, USA**

K. Bhatt, M. Boone, L. Cremaldi, R. Kroeger, J. Reidy, D. Sanders, D. Summers

**University of Nebraska-Lincoln, Lincoln, Nebraska, USA**

W.B. Campbell, D.R. Claes, C. Lundstedt, G.R. Snow

**Rutgers, the State University of New Jersey, Piscataway, New Jersey, USA**

E. Bartz, J. Conway, T. Devlin, J. Doroshenko, P.F. Jacques, M.S. Kalelkar, T. Koeth, A. Lath, L. Perera, S. Schnetzer, S. Somalwar, R. Stone, G. Thomson, T.L. Watts

**Princeton University, Princeton, New Jersey, USA**

J.M. Bussat, P. Denes, V. Gupta, J. Mans, D. Marlow, P. Piroue, D. Stickland, C. Tully, T. Wildish

**University of Rochester, Rochester, New York, USA**

S.R. Blusk, A. Bodek, H. Budd, P. De Barbaro, A. Dyshkant, G. Ginther, M.C. Kruse, D. Ruggiero, W. Sakumoto, P. Slattery, P. Tipton

**The Ohio State University, Columbus, Ohio, USA**

B. Bylsma, L.S. Durkin, J. Gilmore, J. Gu, D. Herman, C. Kim, D. Larsen, T.Y. Ling, C.J. Rush, V. Sehgal

**Carnegie Mellon University, Pittsburgh, Pennsylvania, USA**

T. Ferguson, J. Russ, N. Terentyev, H. Vogel, I. Vorobiev

**Rice University, Houston, Texas, USA**

N. Adams, M. Corcoran, G. Eppley, J. Lamas-Valverde, M. Matveev, H.E. Miettinen, T. Nussbaum, P. Padley, E. Platner, J. Roberts, P. Yepes

**Texas Tech University, Lubbock, Texas, USA**

N. Akchurin, J. Cranshaw, V. Nagaslaev, V. Papadimitriou, A. Sill, R. Wigmans

**University of Texas at Dallas, Richardson, Texas, USA**

R.C. Chaney, E.J. Fenyves, H.D. Hammack, M.R. O'Malley, D. Suson, A.V. Vassiliev

**Virginia Polytechnic Institute and State University, Blacksburg, Virginia, USA**

H. Meyer, L. Mo, T.A. Nunamaker

**University of Wisconsin, Madison, Wisconsin, USA**

D. Carlsmith, P. Chumney, S. Dasu, F. Feyzi, M. Jaworski, J. Lackey, R. Loveless, S. Lusin, D. Reeder, W.H. Smith

**Institute of Nuclear Physics of the Uzbekistan Academy of Sciences, Ulugbek, Tashkent, UZBEKISTAN**

A. Avezov, M. Belov, N. Bisenov, A. Gafarov, E. Gasanov, E. Ibragimova, G. Kim, Y. Koblik, N. Rakhmatov, I. Rustamov, I. Shukrullo, A. Urkinbaev, B.S. Yuldashev

1. Also at Inst. of Physics Academy of Science, Tbilisi, Georgia
2. Also at CERN, Geneva, Switzerland
3. Also at NIKHEF, Amsterdam, Netherlands
4. Also at Moscow State Univ., Moscow, Russia
5. Also at Inst. für Teilchenphysik, ETH, Zürich, Switzerland
6. Also at Univ. di Pisa e Sez. dell' INFN, Pisa, Italy
7. Also at Institute Rudjer Boskovic, Zagreb, Croatia
8. Also at Université de Haute-Alsace, Mulhouse, France
9. Also at Université Louis Pasteur, Strasbourg, France
10. Also at Paul Scherrer Inst., Villigen, Switzerland
11. Also at Humboldt-Univ. zu Berlin, Berlin, Germany
12. Also at Institute of Nuclear Research ATOMKI, Debrecen, Hungary
13. Also at Univ. of California, Riverside, California, USA
14. Also at Politecnico di Milano, Milano, Italy
15. Also at Univ. of Bucharest, Bucuresti-Magurele, Romania
16. Also at ENEA - Casaccia Research Center, S. Maria di Galeria, Italy
17. Also at Fermi National Accelerator Lab., Batavia, Illinois, USA
18. Also at Kansas State Univ., Manhattan, Kansas, USA
19. Also at Bogoroditsk Tech. Plant (BTCP), Moscow
20. Also at Centre for Complex Coop. Systems, Univ. of the West of England, Bristol, UK
21. Also at Imperial College, Univ. of London, London, United Kingdom
22. Also at AI Techn. Corp. of Pakistan (PVT) Ltd, Islamabad, Pakistan
23. Also at CIEMAT, Madrid, Spain
24. Also at Inst. for Theoretical and Exp. Phys., Moscow, Russia
25. Also at RAL, Didcot, United Kingdom
26. Also at MIT, Cambridge, Massachusetts, USA
27. Also at HEPHY, Wien, Austria
28. Also at Inst. for Nucl. Research and Nucl. Energy, Sofia, Bulgaria
29. Also at Inst. for Nucl. Research, Moscow, Russia
30. Also at INESC, Lisbon, Portugal
31. Also at IST, Technical University of Lisbon, Portugal

## Acknowledgements

The CMS Trigger and Data Acquisition group wish to thank all the technical staff involved in the design, prototyping, and testing work for their invaluable contributions.

We wish to express our thanks to Kirsti Aspola, Madeleine Azeglio, Delphine Labrousse, Anne Lissajoux, Guy Martin, Sandra Monachon, and Marie-Claude Pelloux for their help and assistance with innumerable tasks. Their expertise and dedication are truly appreciated.

Special thanks go to Sergio Cittolin for his artist's view of the Trigger project, which graces the cover of this report.

## Table of Contents

|                                                       |            |
|-------------------------------------------------------|------------|
| <b>CMS Trigger TDR Editorial Board</b>                | <b>i</b>   |
| <b>CMS Collaboration</b>                              | <b>ii</b>  |
| <b>Acknowledgements</b>                               | <b>xv</b>  |
| <b>Table of Contents</b>                              | <b>xvi</b> |
| <b>1 General Overview</b>                             | <b>1</b>   |
| <b>1.1 Introduction</b>                               | <b>1</b>   |
| <b>1.2 Summary of Requirements</b>                    | <b>1</b>   |
| 1.2.1 Physics Requirements                            | 1          |
| 1.2.2 System Requirements                             | 3          |
| 1.2.3 Rate Requirements                               | 3          |
| 1.2.4 Structural Requirements                         | 4          |
| <b>1.3 Overview of Trigger Structure</b>              | <b>5</b>   |
| 1.3.1 Level 1                                         | 5          |
| 1.3.2 High Level Triggers                             | 6          |
| <b>1.4 Overview of Level 1 Trigger Organization</b>   | <b>6</b>   |
| 1.4.1 Introduction                                    | 6          |
| 1.4.2 Calorimeter Trigger                             | 7          |
| 1.4.3 Muon Trigger                                    | 8          |
| 1.4.4 Global Trigger                                  | 8          |
| 1.4.5 Timing, Trigger and Control System              | 9          |
| 1.4.6 Physical Location of the Trigger Electronics    | 9          |
| 1.4.7 Physical Realization of the Trigger Electronics | 9          |
| 1.4.8 Coordinates and unit requirements               | 11         |
| <b>2 Requirements</b>                                 | <b>13</b>  |
| <b>2.1 Physics Requirements</b>                       | <b>13</b>  |
| 2.1.1 Cross Sections and Rates                        | 13         |
| 2.1.2 Physics Simulation Tools                        | 14         |
| 2.1.3 Review of Physics Channels                      | 14         |
| 2.1.4 Trigger Requirements                            | 23         |
| 2.1.5 Background                                      | 24         |
| <b>2.2 Calorimeter Trigger Requirements</b>           | <b>25</b>  |
| <b>2.3 Muon Trigger Requirements</b>                  | <b>26</b>  |
| <b>2.4 Trigger Efficiency Measurement</b>             | <b>27</b>  |
| 2.4.1 Electron/Photon and Muon Triggers               | 27         |
| 2.4.2 Triggering of leptons inside jets               | 28         |
| 2.4.3 $\tau$ Trigger                                  | 28         |
| 2.4.4 Jet Triggers                                    | 28         |
| 2.4.5 Missing $E_T$ Trigger                           | 29         |
| 2.4.6 Technical Triggers                              | 29         |
| <b>2.5 Requirements for Heavy Ion Runs</b>            | <b>29</b>  |
| <b>3 Calorimeter Trigger Introduction</b>             | <b>33</b>  |
| <b>3.1 CMS Calorimetry</b>                            | <b>33</b>  |
| 3.1.1 Electromagnetic Calorimeter                     | 33         |
| 3.1.2 Hadronic Calorimeter                            | 34         |

---

|                                                                 |           |
|-----------------------------------------------------------------|-----------|
| <b>3.2 Calorimeter Requirements</b> | <b>34</b> |
| <b>3.3 Calorimeter Trigger Algorithms</b> | <b>36</b> |
| 3.3.1 Geometry and Definitions | 37 |
| 3.3.2 Trigger Primitives | 40 |
| 3.3.3 Electron and Photon Triggers | 42 |
| 3.3.4 Jet and $\tau$ Triggers | 44 |
| 3.3.5 Energy Triggers | 45 |
| 3.3.6 Quiet and MIP Bits | 46 |
| 3.3.7 Calorimeter Trigger Output | 47 |
| <b>3.4 Algorithm Performance</b> | <b>47</b> |
| 3.4.1 Simulation Programs | 47 |
| 3.4.2 Electron and Photon Trigger Efficiencies | 47 |
| 3.4.3 Electron and Photon Trigger Background Rates | 48 |
| 3.4.4 Jet Trigger Rate and Efficiency | 50 |
| 3.4.5 $\tau$ Trigger Rate and Efficiency | 50 |
| 3.4.6 Missing $E_T$ Trigger Rate and Efficiency | 52 |
| 3.4.7 Sample Trigger Table | 52 |
| <b>3.5 Overall Structure</b> | <b>55</b> |
| 3.5.1 Calorimeter Trigger Subdivisions | 55 |
| 3.5.2 Trigger Primitives | 56 |
| 3.5.3 Regional Calorimeter Trigger | 57 |
| 3.5.4 Global Calorimeter Trigger | 58 |
| <b>3.6 System Robustness</b> | <b>59</b> |
| <b>4 Calorimeter Trigger Primitive Generation</b> | <b>61</b> |
| <b>4.1 Requirements</b> | <b>61</b> |
| 4.1.1 Functional Requirements | 61 |
| 4.1.2 Performance Requirements | 62 |
| 4.1.3 Interface Requirements | 62 |
| 4.1.4 Testing Requirements | 66 |
| 4.1.5 Upgradability or Flexibility Requirements | 66 |
| <b>4.2 System Overview</b> | <b>66</b> |
| <b>4.3 Information from the Calorimeter</b> | <b>68</b> |
| 4.3.1 Trigger Tower Definition | 68 |
| 4.3.2 Data Formats for ECAL and HCAL | 69 |
| 4.3.3 Synchronization of the Trigger Tower Channels | 70 |
| 4.3.4 Suppression of Bad Channels | 70 |
| 4.3.5 Zeroing of Channels during Monitoring | 70 |
| 4.3.6 Linearization of ECAL Data and Scale Transformation | 71 |
| 4.3.7 Linearization of HCAL Data and Scale Transformation | 71 |
| <b>4.4 Bunch Crossing Assignment</b> | <b>71</b> |
| 4.4.1 Principles of the Bunch Crossing IDentification | 71 |
| 4.4.2 Functional Description of the BCID | 73 |
| <b>4.5 Energy Sums</b> | <b>73</b> |
| 4.5.1 Principles of Energy Filtering | 73 |
| 4.5.2 ECAL Adder Tree Structure | 74 |
| 4.5.3 Non Linear Transformation of the Energy | 74 |
| <b>4.6 Fine Structure Bits</b> | <b>74</b> |

---

|                                                |                                                           |            |
|------------------------------------------------|-----------------------------------------------------------|------------|
| 4.6.1 | Algorithms to extract the Fine Structure Bits | 74 |
| 4.6.2 | Generation of ECAL Fine Grain Bit | 76 |
| 4.6.3 | Generation of HCAL feature bit | 76 |
| <b>4.7</b> | <b>Synchronization and Latency</b> | <b>76</b> |
| 4.7.1 | Synchronization Principles | 77 |
| 4.7.2 | Distribution of Clock, L1A, Reset and BC0 | 77 |
| 4.7.3 | Synchronization of the Detector Links | 79 |
| 4.7.4 | Synchronization of Trigger Primitives | 80 |
| 4.7.5 | Synchronization Setting-up, Monitoring and Recovery | 80 |
| 4.7.6 | The Trigger Synchronization Circuit | 82 |
| 4.7.7 | Latency | 83 |
| <b>4.8</b> | <b>Interface to Regional Trigger</b> | <b>84</b> |
| 4.8.1 | Data Frame Format | 84 |
| 4.8.2 | ECAL Interface to Regional Trigger | 85 |
| 4.8.3 | HCAL Interface to Regional Trigger | 87 |
| 4.8.4 | The Trigger Link | 87 |
| <b>4.9</b> | <b>Simulation Results</b> | <b>87</b> |
| 4.9.1 | Software Tools | 87 |
| 4.9.2 | Performance of the Bunch Crossing Assignment | 87 |
| 4.9.3 | Transverse Energy Resolution and Linearity | 89 |
| 4.9.4 | Resolution on the Fine Grain Variable | 90 |
| <b>4.10</b> | <b>Prototypes and Tests</b> | <b>91</b> |
| 4.10.1 | ECAL Results | 91 |
| <b>4.11</b> | <b>Status and Schedule</b> | <b>96</b> |
| 4.11.1 | ECAL Status and Schedule | 96 |
| 4.11.2 | HCAL Status and Schedule | 97 |
| <b>5</b> | <b>Regional Calorimeter Trigger</b> | <b>101</b> |
| <b>5.1</b> | <b>System Requirements</b> | <b>101</b> |
| 5.1.1 | Physics Requirements | 101 |
| 5.1.2 | Data Acquisition Requirements | 101 |
| <b>5.2</b> | <b>System Specifications</b> | <b>101</b> |
| 5.2.1 | Input Specification | 101 |
| 5.2.2 | Output Specification | 102 |
| 5.2.3 | Latency | 102 |
| <b>5.3</b> | <b>System Overview</b> | <b>102</b> |
| 5.3.1 | System Functionality | 102 |
| 5.3.2 | Calorimeter Trigger Tower Mapping | 103 |
| 5.3.3 | Crate, Backplane and Cards | 103 |
| 5.3.4 | Crate Power, Cooling | 106 |
| 5.3.5 | Clock and Control | 107 |
| 5.3.6 | Backplane | 107 |
| 5.3.7 | VME | 107 |
| 5.3.8 | Backplane Data Sharing | 108 |
| 5.3.9 | Inter-crate Data Sharing | 108 |
| 5.3.10 | Implementation of Algorithms | 108 |
| <b>5.4</b> | <b>Receiver Card</b> | <b>109</b> |
| 5.4.1 | Overview | 109 |

|             |                                                |            |
|-------------|------------------------------------------------|------------|
| 5.4.2 | Serial Links | 111 |
| 5.4.3 | Phase ASIC | 111 |
| 5.4.4 | Link Error Handling | 113 |
| 5.4.5 | Look-up Tables | 113 |
| 5.4.6 | Energy Sums | 114 |
| 5.4.7 | Adder ASIC | 114 |
| 5.4.8 | Backplane Drivers | 116 |
| 5.4.9 | Boundary Scan ASIC | 116 |
| <b>5.5</b> | <b>Electron Identification Card</b> | <b>117</b> |
| 5.5.1 | Overview | 117 |
| 5.5.2 | Input | 117 |
| 5.5.3 | Electron Isolation ASIC | 117 |
| 5.5.4 | Output | 120 |
| <b>5.6</b> | <b>Jet/Summary Card</b> | <b>120</b> |
| 5.6.1 | Input | 120 |
| 5.6.2 | Electron/Photon Processing | 120 |
| 5.6.3 | Sort ASIC | 120 |
| 5.6.4 | 4x4 ET Sums | 122 |
| 5.6.5 | $\tau$ Veto Bit | 122 |
| 5.6.6 | MIP Bit | 122 |
| 5.6.7 | Quiet Bit | 122 |
| 5.6.8 | Output Processing | 122 |
| <b>5.7</b> | <b>HF Crate</b> | <b>122</b> |
| <b>5.8</b> | <b>Cluster Crate</b> | <b>123</b> |
| 5.8.1 | Jet and $\tau$ Sorting | 123 |
| 5.8.2 | Missing and Total $E_T$ sums | 123 |
| 5.8.3 | Connection to Global Calorimeter Trigger | 124 |
| 5.8.4 | Error Detection | 124 |
| 5.8.5 | System Test Errors | 124 |
| 5.8.6 | Boundary Scan | 124 |
| 5.8.7 | Run Time Errors | 125 |
| <b>5.9</b> | <b>Latency</b> | <b>125</b> |
| <b>5.10</b> | <b>Prototypes and Tests</b> | <b>126</b> |
| 5.10.1 | Overview | 126 |
| 5.10.2 | Crate | 126 |
| 5.10.3 | Backplane | 126 |
| 5.10.4 | Clock and Control Card | 127 |
| 5.10.5 | Receiver Card | 128 |
| 5.10.6 | Adder ASIC | 128 |
| 5.10.7 | Electron ID Card | 129 |
| 5.10.8 | Receiver Mezzanine Cards | 129 |
| 5.10.9 | Serial Link Test Card | 129 |
| 5.10.10 | Prototype Summary | 129 |
| <b>5.11</b> | <b>Status and Schedule</b> | <b>130</b> |
| <b>6</b> | <b>Global Calorimeter Trigger</b> | <b>135</b> |
| <b>6.1</b> | <b>Introduction</b> | <b>135</b> |
| <b>6.2</b> | <b>System Specification</b> | <b>136</b> |

|            |                                                      |            |
|------------|------------------------------------------------------|------------|
| 6.2.1 | Trigger Object Sort | 136 |
| 6.2.2 | Jet Count | 136 |
| 6.2.3 | Global Energy Calculation | 137 |
| 6.2.4 | Luminosity Monitoring | 138 |
| 6.2.5 | Trigger Data Capture | 138 |
| 6.2.6 | Control, Test and Monitoring | 139 |
| 6.2.7 | GCT Extensions | 139 |
| <b>6.3</b> | <b>System Overview</b> | <b>140</b> |
| <b>6.4</b> | <b>Implementation: Interfaces</b> | <b>142</b> |
| 6.4.1 | Regional Calorimeter Trigger Interface | 142 |
| 6.4.2 | Global Trigger Interface | 147 |
| 6.4.3 | Global Muon Trigger Interface | 147 |
| 6.4.4 | TRC Interface | 148 |
| 6.4.5 | TTC Interface | 148 |
| <b>6.5</b> | <b>Implementation: Processing</b> | <b>148</b> |
| 6.5.1 | Trigger Processor module | 148 |
| 6.5.2 | Control, Setup and Test | 153 |
| 6.5.3 | Processor Crate Layout | 154 |
| 6.5.4 | Algorithm Implementation | 154 |
| 6.5.5 | Details of latency calculation | 159 |
| <b>6.6</b> | <b>Prototypes and Tests</b> | <b>160</b> |
| 6.6.1 | FPGA Processing Tests | 160 |
| 6.6.2 | Data Link Tests | 163 |
| <b>6.7</b> | <b>Status and Schedule</b> | <b>165</b> |
| <b>7</b> | <b>Calorimeter Trigger Readout and Control</b> | <b>167</b> |
| <b>7.1</b> | <b>Requirements</b> | <b>167</b> |
| 7.1.1 | Requirements on Trigger Data Readout | 167 |
| 7.1.2 | Requirements on Calorimeter Trigger Control | 168 |
| <b>7.2</b> | <b>System Overview</b> | <b>169</b> |
| 7.2.1 | Operational Environment | 169 |
| 7.2.2 | Readout Hardware | 169 |
| 7.2.3 | Control Software | 171 |
| <b>7.3</b> | <b>Calorimeter Trigger Data</b> | <b>173</b> |
| 7.3.1 | Data Description and Rates | 173 |
| <b>7.4</b> | <b>Calorimeter Trigger Readout</b> | <b>174</b> |
| 7.4.1 | Readout of Trigger Primitives | 174 |
| 7.4.2 | Readout of Regional Trigger Data | 177 |
| 7.4.3 | Readout of Global Calorimeter Trigger Data | 177 |
| 7.4.4 | Calorimeter Trigger Readout Crate | 177 |
| <b>7.5</b> | <b>Calorimeter Trigger Control</b> | <b>178</b> |
| 7.5.1 | Calorimeter Trigger Back-end Subsystem | 179 |
| 7.5.2 | Front-end Calorimeters Subsystem | 180 |
| 7.5.3 | Calorimeter Regional Trigger Subsystem | 181 |
| 7.5.4 | Calorimeter Global Trigger Subsystem | 181 |
| 7.5.5 | Calorimeter Trigger Readout Subsystem | 181 |
| 7.5.6 | Calorimeter Trigger TTC Subsystem | 182 |
| <b>7.6</b> | <b>Interfaces</b> | <b>182</b> |

---

|            |                                               |            |
|------------|-----------------------------------------------|------------|
| 7.6.1      | Hardware Interfaces .....                     | 182        |
| 7.6.2      | Software Interfaces.....                      | 183        |
| 7.6.3      | User Interface.....                           | 183        |
| <b>7.7</b> | <b>Prototype and Tests .....</b>              | <b>184</b> |
| 7.7.1      | Data Concentrator Card.....                   | 184        |
| 7.7.2      | Readout PMC .....                             | 185        |
| 7.7.3      | Boundary Scan Controller.....                 | 185        |
| 7.7.4      | Control Software .....                        | 186        |
| <b>7.8</b> | <b>Status and Schedule .....</b>              | <b>186</b> |
| <b>8</b>   | <b>Muon Trigger Introduction .....</b>        | <b>187</b> |
| <b>8.1</b> | <b>Muon Detector .....</b>                    | <b>187</b> |
| 8.1.1      | Drift Tubes .....                             | 187        |
| 8.1.2      | Cathode Strip Chambers .....                  | 189        |
| 8.1.3      | Resistive Plate Chambers .....                | 189        |
| 8.1.4      | Magnetic Field .....                          | 190        |
| <b>8.2</b> | <b>Muon Trigger Overall Structure .....</b>   | <b>191</b> |
| <b>8.3</b> | <b>Algorithms and Implementation .....</b>    | <b>193</b> |
| 8.3.1      | Drift Tube Trigger.....                       | 194        |
| 8.3.2      | CSC Trigger .....                             | 197        |
| 8.3.3      | RPC Trigger .....                             | 199        |
| 8.3.4      | Global Muon Trigger .....                     | 200        |
| <b>8.4</b> | <b>Summary of Algorithm Performance .....</b> | <b>201</b> |
| 8.4.1      | Simulation Tools .....                        | 201        |
| 8.4.2      | Background .....                              | 202        |
| 8.4.3      | Muon isolation .....                          | 206        |
| 8.4.4      | Trigger efficiency and rates .....            | 208        |
| <b>8.5</b> | <b>Muon Trigger for Heavy Ion Runs .....</b>  | <b>210</b> |
| 8.5.1      | Low Momentum Threshold .....                  | 210        |
| 8.5.2      | Trigger Rates .....                           | 210        |
| <b>8.6</b> | <b>System robustness .....</b>                | <b>213</b> |
| <b>9</b>   | <b>Drift Tube Local Trigger .....</b>         | <b>219</b> |
| <b>9.1</b> | <b>Requirements .....</b>                     | <b>219</b> |
| <b>9.2</b> | <b>System Overview .....</b>                  | <b>220</b> |
| 9.2.1      | Drift Tube Local Trigger Layout .....         | 221        |
| 9.2.2      | Trigger Boards .....                          | 222        |
| 9.2.3      | Server Board .....                            | 225        |
| <b>9.3</b> | <b>Bunch and Track Identifier .....</b>       | <b>225</b> |
| 9.3.1      | Working Principle .....                       | 225        |
| 9.3.2      | Algorithm Description .....                   | 226        |
| 9.3.3      | Hardware Implementation.....                  | 228        |
| <b>9.4</b> | <b>Track Correlator .....</b>                 | <b>230</b> |
| 9.4.1      | Algorithm Description .....                   | 230        |
| 9.4.2      | Hardware Implementation.....                  | 232        |
| <b>9.5</b> | <b>Trigger Server .....</b>                   | <b>236</b> |
| 9.5.1      | Track Sorter Slave.....                       | 237        |
| 9.5.2      | Track Sorter Master.....                      | 240        |
| 9.5.3      | Trigger Server in the Longitudinal View ..... | 245        |
| 9.5.4       | Timing of the Drift Tube Local Trigger System ..... | 245        |
| <b>9.6</b>  | <b>Sector Collector .....</b>                       | <b>247</b> |
| 9.6.1       | Data Exchange with Regional Muon Trigger.....       | 248        |
| <b>9.7</b>  | <b>Chamber Electronics Control System .....</b>     | <b>249</b> |
| 9.7.1       | Drift Tubes Control Interface .....                 | 249        |
| 9.7.2       | Control Board.....                                  | 250        |
| <b>9.8</b>  | <b>Synchronization and Latency .....</b>            | <b>252</b> |
| 9.8.1       | Synchronization Procedure.....                      | 252        |
| 9.8.2       | Latency Determination .....                         | 257        |
| <b>9.9</b>  | <b>Prototypes .....</b>                             | <b>258</b> |
| 9.9.1       | BTI Prototype and Test Bench Performance.....       | 258        |
| 9.9.2       | TRACO Prototype and Test Bench Performance.....     | 260        |
| 9.9.3       | TSS Prototype and Test Bench Performance .....      | 260        |
| 9.9.4       | TSM Prototype and Test Bench Performance .....      | 261        |
| <b>9.10</b> | <b>Test and Simulation Results .....</b>            | <b>261</b> |
| 9.10.1      | Bunch and Track Identifier.....                     | 262        |
| 9.10.2      | Track Correlator and Trigger Server .....           | 271        |
| <b>9.11</b> | <b>Noise Reduction .....</b>                        | <b>277</b> |
| 9.11.1      | Noise Generation Mechanisms.....                    | 277        |
| 9.11.2      | Noise Reduction Methods .....                       | 279        |
| 9.11.3      | Dimuons Detection Efficiency .....                  | 283        |
| <b>9.12</b> | <b>Radiation Tests .....</b>                        | <b>283</b> |
| 9.12.1      | Gamma Irradiation Studies.....                      | 284        |
| 9.12.2      | Neutron Irradiation Studies .....                   | 286        |
| <b>9.13</b> | <b>Status and Schedule .....</b>                    | <b>288</b> |
| <b>10</b>   | <b>Drift Tube Track-Finder.....</b>                 | <b>293</b> |
| <b>10.1</b> | <b>Requirements .....</b>                           | <b>293</b> |
| <b>10.2</b> | <b>System Overview .....</b>                        | <b>293</b> |
| <b>10.3</b> | <b>Subsystem Interfacing .....</b>                  | <b>294</b> |
| 10.3.1      | Input Data .....                                    | 294        |
| 10.3.2      | Output Data .....                                   | 296        |
| 10.3.3      | Internal Data Exchange .....                        | 297        |
| <b>10.4</b> | <b>Track-Finder Algorithm .....</b>                 | <b>297</b> |
| 10.4.1      | The Barrel DT Track-Finder Algorithm.....           | 297        |
| 10.4.2      | Barrel-Endcap Overlap Region Handling .....         | 304        |
| 10.4.3      | $\eta$ Track-Finder Algorithm.....                  | 306        |
| <b>10.5</b> | <b>Track-Finder Hardware Implementation .....</b>   | <b>308</b> |
| 10.5.1      | DTTF Processor .....                                | 309        |
| 10.5.2      | Barrel-Endcap Overlap Region Handling .....         | 317        |
| 10.5.3      | The $\eta$ Track-Finder Implementation .....        | 318        |
| <b>10.6</b> | <b>Muon Sorter .....</b>                            | <b>319</b> |
| 10.6.1      | Wedge Sorter .....                                  | 320        |
| 10.6.2      | Barrel Sorter Board .....                           | 321        |
| <b>10.7</b> | <b>Synchronization and Latency .....</b>            | <b>322</b> |
| 10.7.1      | TTC Interface .....                                 | 322        |
| 10.7.2      | Synchronization Procedure and Control.....          | 322        |
| 10.7.3      | Latency Budget.....                                 | 323        |

---

|                                                     |            |
|-----------------------------------------------------|------------|
| <b>10.8 Subsystem Controls .....</b>                | <b>323</b> |
| 10.8.1 Board-level Control Hardware .....           | 323        |
| 10.8.2 Board-Level Monitoring Solutions .....       | 323        |
| 10.8.3 DTTF General Control System .....            | 325        |
| <b>10.9 Simulation Results .....</b>                | <b>326</b> |
| 10.9.1 Overall Performance .....                    | 326        |
| 10.9.2 Extrapolation Filter Effects .....           | 331        |
| 10.9.3 $\eta$ Track-Finder Performance .....        | 331        |
| 10.9.4 Trigger Rates .....                          | 332        |
| 10.9.5 Radiation Background.....                    | 332        |
| 10.9.6 Muon Chamber Mis-alignment Effects.....      | 336        |
| <b>10.10 Prototypes and Tests .....</b>             | <b>337</b> |
| <b>10.11 Status and Schedule .....</b>              | <b>338</b> |
| 10.11.1 Design Status.....                          | 338        |
| 10.11.2 Prototype Status .....                      | 339        |
| 10.11.3 Control Software Status .....               | 339        |
| 10.11.4 Schedule .....                              | 339        |
| <b>11 Cathode Strip Chamber Local Trigger .....</b> | <b>341</b> |
| <b>11.1 Requirements .....</b>                      | <b>341</b> |
| <b>11.2 Overview .....</b>                          | <b>342</b> |
| <b>11.3 Cathode Signal Processing .....</b>         | <b>348</b> |
| 11.3.1 Amplification and Shaping .....              | 348        |
| 11.3.2 Trigger Digitization.....                    | 349        |
| 11.3.3 Cathode LCT Pattern-Finding.....             | 350        |
| <b>11.4 Anode Signal Processing .....</b>           | <b>352</b> |
| 11.4.1 CSC Anode Digitization .....                 | 354        |
| 11.4.2 Anode LCT Pattern Finding.....               | 354        |
| <b>11.5 Cathode-Anode Correlation .....</b>         | <b>357</b> |
| 11.5.1 Bunch Crossing Alignment.....                | 357        |
| 11.5.2 Cathode-Anode Matching .....                 | 357        |
| 11.5.3 LCT Data Transmission to the MPC.....        | 358        |
| <b>11.6 Muon Port Card Selection of LCTs .....</b>  | <b>358</b> |
| 11.6.1 MPC Functionality .....                      | 359        |
| 11.6.2 MPC Input Synchronization.....               | 359        |
| 11.6.3 Selection Logic .....                        | 360        |
| 11.6.4 Interface to Sector Receiver Card .....      | 360        |
| <b>11.7 Clock and Control Board .....</b>           | <b>360</b> |
| <b>11.8 Synchronization and Latency .....</b>       | <b>361</b> |
| 11.8.1 Synchronization Procedure .....              | 362        |
| 11.8.2 TMB Synchronization of ALCT Data .....       | 362        |
| 11.8.3 Latency Determination.....                   | 362        |
| <b>11.9 Prototypes .....</b>                        | <b>363</b> |
| 11.9.1 Comparator ASIC Prototypes .....             | 363        |
| 11.9.2 Cathode LCT Prototypes.....                  | 363        |
| 11.9.3 Anode LCT Prototypes .....                   | 366        |
| 11.9.4 TMB and CCB Prototypes .....                 | 369        |
| 11.9.5 MPC Prototype.....                           | 370        |

---

|              |                                                     |            |
|--------------|-----------------------------------------------------|------------|
| 11.9.6       | Radiation Resistance .....                          | 371        |
| <b>11.10</b> | <b>Simulation Status and Results .....</b>          | <b>374</b> |
| <b>11.11</b> | <b>Maintenance and Operation .....</b>              | <b>377</b> |
| <b>11.12</b> | <b>Status and Schedule .....</b>                    | <b>379</b> |
| <b>12</b>    | <b>Cathode Strip Chamber Track-Finder.....</b>      | <b>381</b> |
| <b>12.1</b>  | <b>Requirements .....</b>                           | <b>381</b> |
| 12.1.1       | Physics Requirements.....                           | 381        |
| 12.1.2       | Boundary Between DT and CSC Track-Finders .....     | 382        |
| <b>12.2</b>  | <b>System Overview .....</b>                        | <b>382</b> |
| <b>12.3</b>  | <b>System Interfaces .....</b>                      | <b>383</b> |
| 12.3.1       | Crate and Backplane.....                            | 383        |
| 12.3.2       | Transition Modules.....                             | 385        |
| 12.3.3       | Clock and Control Module .....                      | 385        |
| 12.3.4       | Crate Power and Cooling .....                       | 386        |
| <b>12.4</b>  | <b>Sector Receiver .....</b>                        | <b>386</b> |
| 12.4.1       | Optical link inputs .....                           | 387        |
| 12.4.2       | Backplane inputs .....                              | 388        |
| 12.4.3       | Sector Receiver Outputs to Trigger Path.....        | 388        |
| 12.4.4       | Sector Receiver Outputs to DAQ Path .....           | 388        |
| 12.4.5       | Hardware Implementation.....                        | 388        |
| <b>12.5</b>  | <b>Sector Processor .....</b>                       | <b>392</b> |
| 12.5.1       | Overview .....                                      | 392        |
| 12.5.2       | Bunch Crossing Analyzer.....                        | 393        |
| 12.5.3       | Extrapolation Unit .....                            | 394        |
| 12.5.4       | Track Assembly Unit.....                            | 397        |
| 12.5.5       | Final Selection Unit.....                           | 398        |
| 12.5.6       | Assignment Unit.....                                | 400        |
| 12.5.7       | Hardware Implementation .....                       | 402        |
| <b>12.6</b>  | <b>CSC Muon Sorter .....</b>                        | <b>402</b> |
| 12.6.1       | Algorithm .....                                     | 403        |
| 12.6.2       | Hardware Implementation .....                       | 404        |
| <b>12.7</b>  | <b>Synchronization and Latency .....</b>            | <b>405</b> |
| 12.7.1       | Synchronization Procedure.....                      | 405        |
| 12.7.2       | Latency Determination .....                         | 405        |
| <b>12.8</b>  | <b>System Monitoring .....</b>                      | <b>406</b> |
| <b>12.9</b>  | <b>Simulation Results .....</b>                     | <b>406</b> |
| <b>12.10</b> | <b>Prototypes and Tests .....</b>                   | <b>411</b> |
| 12.10.1      | Sector Receiver.....                                | 411        |
| 12.10.2      | Sector Processor .....                              | 413        |
| 12.10.3      | Track-Finder Crate Test .....                       | 414        |
| <b>12.11</b> | <b>Maintenance and Operation .....</b>              | <b>416</b> |
| <b>12.12</b> | <b>Status and Schedule .....</b>                    | <b>417</b> |
| <b>13</b>    | <b>RPC Trigger .....</b>                            | <b>419</b> |
| <b>13.1</b>  | <b>Requirements and design considerations .....</b> | <b>419</b> |
| <b>13.2</b>  | <b>System Overview .....</b>                        | <b>419</b> |
| <b>13.3</b>  | <b>RPC Front End .....</b>                          | <b>421</b> |
| 13.3.1       | Overview .....                                      | 421        |
| 13.3.2       | RPC Electrical Characteristics .....                     | 423        |
| 13.3.3       | Front End Chip Characteristics .....                     | 423        |
| 13.3.4       | FEB Functional Description.....                          | 424        |
| <b>13.4</b>  | <b>Optical Communication System .....</b>                | <b>425</b> |
| 13.4.1       | Overview .....                                           | 425        |
| 13.4.2       | Data Compression Scheme .....                            | 427        |
| 13.4.3       | The Link Board .....                                     | 428        |
| 13.4.4       | Splitter System in the Counting Room Electronics .....   | 430        |
| <b>13.5</b>  | <b>Trigger Crates .....</b>                              | <b>430</b> |
| 13.5.1       | Overview .....                                           | 430        |
| 13.5.2       | Trigger Board.....                                       | 432        |
| 13.5.3       | PAC Trigger Processor .....                              | 434        |
| 13.5.4       | The Readout System .....                                 | 440        |
| <b>13.6</b>  | <b>Sorting and Ghostbusting .....</b>                    | <b>444</b> |
| 13.6.1       | Ghostbusting .....                                       | 444        |
| 13.6.2       | Ghostbusting within a tower .....                        | 447        |
| 13.6.3       | Ghostbusting between towers .....                        | 449        |
| 13.6.4       | Sorting .....                                            | 450        |
| <b>13.7</b>  | <b>Simulation Results .....</b>                          | <b>453</b> |
| 13.7.1       | Overview .....                                           | 453        |
| 13.7.2       | Simulation of the Pre-defined Patterns .....             | 454        |
| 13.7.3       | Geometry and acceptance .....                            | 455        |
| 13.7.4       | Simulation of RPC Performance.....                       | 456        |
| 13.7.5       | Main Results .....                                       | 458        |
| 13.7.6       | Transfer Losses in the OCS .....                         | 459        |
| 13.7.7       | Robustness of the RPC Muon Trigger .....                 | 462        |
| <b>13.8</b>  | <b>Latency, Synchronization, BX Identification .....</b> | <b>463</b> |
| 13.8.1       | Latency .....                                            | 463        |
| 13.8.2       | The Synchronization Unit .....                           | 463        |
| <b>13.9</b>  | <b>Diagnostics and Calibration .....</b>                 | <b>467</b> |
| 13.9.1       | General description .....                                | 467        |
| 13.9.2       | Diagnostics and Calibration on the Detector .....        | 469        |
| 13.9.3       | Diagnostics and Calibration in the Counting Room .....   | 470        |
| <b>13.10</b> | <b>Milestones, Prototypes, Test Results .....</b>        | <b>471</b> |
| 13.10.1      | Important Passed Milestones: .....                       | 471        |
| 13.10.2      | Tests of Data Compression .....                          | 471        |
| 13.10.3      | Irradiation Tests for the Link and FEB Components .....  | 472        |
| 13.10.4      | LINX - the Data Transfer Test System .....               | 473        |
| 13.10.5      | PAC Prototypes 1 and 2 .....                             | 473        |
| 13.10.6      | Sorting ASIC Prototype .....                             | 475        |
| 13.10.7      | Readout Board Prototype .....                            | 476        |
| <b>13.11</b> | <b>Status and Schedule .....</b>                         | <b>478</b> |
| <b>14</b>    | <b>Global Muon Trigger .....</b>                         | <b>481</b> |
| <b>14.1</b>  | <b>Requirements .....</b>                                | <b>481</b> |
| 14.1.1       | Functional Requirements .....                            | 481        |
| 14.1.2       | Input Requirements .....                                 | 481        |
| <b>14.2</b>  | <b>System Overview .....</b>                             | <b>482</b> |

---

|                                                                  |            |
|------------------------------------------------------------------|------------|
| <b>14.3 Input Processing of DT, CSC and RPC Data .....</b>       | <b>484</b> |
| 14.3.1 Input channels.....                                       | 484        |
| 14.3.2 Input synchronization .....                               | 485        |
| <b>14.4 Input from the Global Calorimeter Trigger .....</b>      | <b>485</b> |
| 14.4.1 ISOLATION and MIP bit logic.....                          | 486        |
| <b>14.5 Matching, Merging and Sorting Logic .....</b>            | <b>486</b> |
| 14.5.1 Match and Pair Logic .....                                | 486        |
| 14.5.2 Rank assignment.....                                      | 487        |
| 14.5.3 Muon Merger Logic .....                                   | 487        |
| 14.5.4 Sorting Logic .....                                       | 488        |
| <b>14.6 Latency .....</b>                                        | <b>488</b> |
| <b>14.7 Output Processing and Monitoring .....</b>               | <b>489</b> |
| <b>14.8 Simulation Results .....</b>                             | <b>490</b> |
| 14.8.1 Samples .....                                             | 490        |
| 14.8.2 GMT Algorithm .....                                       | 490        |
| 14.8.3 Efficiencies .....                                        | 491        |
| 14.8.4 Ghosts.....                                               | 494        |
| 14.8.5 Turn-on curves .....                                      | 495        |
| 14.8.6 Trigger Rates .....                                       | 495        |
| <b>14.9 Status and Schedule .....</b>                            | <b>498</b> |
| <b>15 Global Trigger.....</b>                                    | <b>499</b> |
| <b>15.1 Introduction .....</b>                                   | <b>499</b> |
| <b>15.2 System Overview .....</b>                                | <b>501</b> |
| 15.2.1 Functionality.....                                        | 501        |
| 15.2.2 Implementation.....                                       | 501        |
| <b>15.3 Hardware Details .....</b>                               | <b>503</b> |
| 15.3.1 Input from Global Calorimeter Trigger.....                | 503        |
| 15.3.2 Input from Global Muon Trigger .....                      | 503        |
| 15.3.3 Summary of input bits .....                               | 503        |
| 15.3.4 Synchronisation and Latency Buffer Hardware .....         | 504        |
| 15.3.5 Synchronisation FPGA.....                                 | 505        |
| 15.3.6 Dual Port Memories for input data.....                    | 506        |
| <b>15.4 Logic .....</b>                                          | <b>506</b> |
| 15.4.1 Standard algorithms.....                                  | 509        |
| 15.4.2 Special algorithms .....                                  | 511        |
| <b>15.5 Final Decision Logic Module .....</b>                    | <b>512</b> |
| <b>15.6 Synchronisation Procedure .....</b>                      | <b>513</b> |
| 15.6.1 Fine time adjustment of input channels.....               | 513        |
| 15.6.2 Bunch Crossing Synchronisation .....                      | 514        |
| 15.6.3 Synchronisation of Muon and Calorimeter Trigger data..... | 515        |
| <b>15.7 Latencies of the GMT and the GT .....</b>                | <b>515</b> |
| <b>15.8 Output Processing and Monitoring .....</b>               | <b>516</b> |
| 15.8.1 L1A requests.....                                         | 517        |
| 15.8.2 Monitoring requests.....                                  | 517        |
| 15.8.3 Readout Processors.....                                   | 517        |
| 15.8.4 Interface to Data Acquisition .....                       | 517        |
| 15.8.5 Hardware monitoring and failures.....                     | 518        |
| <b>15.9 Simulation .....</b>                               | <b>518</b> |
| <b>15.10 Prototypes and Tests .....</b>                    | <b>519</b> |
| 15.10.1 Custom prototype backplane.....                    | 519        |
| 15.10.2 Prototype input boards PSB .....                   | 519        |
| 15.10.3 Prototype logic board GTL .....                    | 522        |
| <b>15.11 Status and Schedule .....</b>                     | <b>522</b> |
| <b>16 Trigger Control.....</b>                             | <b>525</b> |
| <b>16.1 System Requirements .....</b>                      | <b>525</b> |
| 16.1.1 Requirements on L1A Control .....                   | 526        |
| 16.1.2 Requirements on Fast Controls .....                 | 527        |
| 16.1.3 Requirements on Fast Monitoring.....                | 528        |
| 16.1.4 Requirements on Calibration and Test Triggers ..... | 528        |
| 16.1.5 Requirements on Subsystems Reset.....               | 530        |
| 16.1.6 Requirements on Partitioning.....                   | 530        |
| 16.1.7 Requirements on Trigger Control Software.....       | 531        |
| <b>16.2 Overview of the Trigger Control System .....</b>   | <b>532</b> |
| 16.2.1 Architecture.....                                   | 532        |
| 16.2.2 Interfaces .....                                    | 533        |
| 16.2.3 Components .....                                    | 533        |
| <b>16.3 Trigger Control Interfaces .....</b>               | <b>534</b> |
| 16.3.1 Interface to Global Trigger.....                    | 534        |
| 16.3.2 LHC Clock Interface .....                           | 534        |
| 16.3.3 Interface to TTC System .....                       | 534        |
| 16.3.4 Interface to Fast Monitoring Network.....           | 537        |
| 16.3.5 Interface to DAQ Event Manager .....                | 538        |
| <b>16.4 Trigger Control System Components .....</b>        | <b>538</b> |
| 16.4.1 Fast Control Generator.....                         | 538        |
| 16.4.2 Fast Monitoring Receiver.....                       | 542        |
| 16.4.3 Trigger Throttling Subsystem .....                  | 542        |
| 16.4.4 Deadtime Monitor and Counters .....                 | 543        |
| 16.4.5 Calibration and Test Control .....                  | 544        |
| <b>16.5 Partitioning .....</b>                             | <b>547</b> |
| 16.5.1 Trigger Control Partitioning.....                   | 547        |
| 16.5.2 Sub-detector partitions .....                       | 548        |
| <b>16.6 Trigger Control Software .....</b>                 | <b>549</b> |
| 16.6.1 Operational environment.....                        | 549        |
| 16.6.2 Control Software Framework.....                     | 550        |
| <b>16.7 Software Functions .....</b>                       | <b>552</b> |
| <b>16.8 Software Design .....</b>                          | <b>553</b> |
| 16.8.1 The software process.....                           | 553        |
| 16.8.2 The software development environment .....          | 553        |
| <b>17 Synchronization and Latency .....</b>                | <b>555</b> |
| <b>17.1 Introduction .....</b>                             | <b>555</b> |
| <b>17.2 Latency .....</b>                                  | <b>557</b> |
| 17.2.1 L1 Latency Budget.....                              | 557        |
| 17.2.2 L1 Latency Components .....                         | 558        |
| <b>17.3 Overall Trigger Pipeline Alignment .....</b>       | <b>563</b> |

---

|                    |                                                              |            |
|--------------------|--------------------------------------------------------------|------------|
| 17.3.1             | Synchronization of the Detector Signals with the Clock ..... | 563        |
| 17.3.2             | Bunch Crossing Assignment of Trigger Primitives .....        | 565        |
| 17.3.3             | Trigger Subsystems Alignment .....                           | 567        |
| 17.3.4             | Alignment of TTC .....                                       | 568        |
| 17.3.5             | Global Alignment .....                                       | 568        |
| <b>17.4</b>        | <b>Alignment with LHC Crossings .....</b>                    | <b>568</b> |
| 17.4.1             | Histograms of Occupancy per Bunch.....                       | 569        |
| 17.4.2             | Subdetectors Bunch Crossing Identification .....             | 571        |
| <b>17.5</b>        | <b>Alignment with Readout .....</b>                          | <b>571</b> |
| 17.5.1             | Sub-detector Frontend Pipelines .....                        | 571        |
| 17.5.2             | Alignment of L1A with Readout Pipelines .....                | 572        |
| 17.5.3             | Bunch Number and Event Number .....                          | 572        |
| 17.5.4             | Synchronization of Event Fragments .....                     | 573        |
| <b>17.6</b>        | <b>Timing Setup Procedures .....</b>                         | <b>573</b> |
| 17.6.1             | Cable Length .....                                           | 573        |
| 17.6.2             | Timing of TTC distribution .....                             | 573        |
| 17.6.3             | Timing of Trigger Pipeline .....                             | 574        |
| 17.6.4             | Timing of Readout Pipelines .....                            | 574        |
| <b>17.7</b>        | <b>Monitoring and Diagnostics Procedures .....</b>           | <b>575</b> |
| 17.7.1             | Data and Trigger Links.....                                  | 575        |
| 17.7.2             | Bunch Crossing Assignment .....                              | 575        |
| 17.7.3             | Synchronization between L1A and pipeline data.....           | 576        |
| 17.7.4             | Synchronization of event fragments.....                      | 576        |
| <b>17.8</b>        | <b>Operation with Test and Calibration Triggers .....</b>    | <b>576</b> |
| 17.8.1             | Special Beam Conditions .....                                | 576        |
| 17.8.2             | Test and Calibration Triggers.....                           | 577        |
| <b>18</b>          | <b>Installation and Maintenance .....</b>                    | <b>579</b> |
| <b>18.1</b>        | <b>On-Detector Electronics .....</b>                         | <b>579</b> |
| <b>18.2</b>        | <b>Counting Room Electronics .....</b>                       | <b>579</b> |
| <b>18.3</b>        | <b>Electronics Maintenance .....</b>                         | <b>580</b> |
| <b>18.4</b>        | <b>Configuration Control .....</b>                           | <b>580</b> |
| <b>19</b>          | <b>Safety and Environment .....</b>                          | <b>583</b> |
| <b>19.1</b>        | <b>On-Detector Electronics .....</b>                         | <b>583</b> |
| <b>19.2</b>        | <b>Counting Room Electronics .....</b>                       | <b>583</b> |
| <b>19.3</b>        | <b>Electrical and Non-Ionising Radiation Safety .....</b>    | <b>584</b> |
| <b>19.4</b>        | <b>Fire Safety .....</b>                                     | <b>584</b> |
| <b>19.5</b>        | <b>Radiation Tolerance .....</b>                             | <b>584</b> |
| <b>19.6</b>        | <b>Radiation Levels and Access .....</b>                     | <b>584</b> |
| <b>20</b>          | <b>Project Management .....</b>                              | <b>587</b> |
| <b>20.1</b>        | <b>Institutes and Responsibilities .....</b>                 | <b>587</b> |
| <b>20.2</b>        | <b>Management Organization .....</b>                         | <b>588</b> |
| <b>20.3</b>        | <b>Overall Schedule .....</b>                                | <b>588</b> |
| <b>20.4</b>        | <b>Costs and Resources .....</b>                             | <b>589</b> |
| <b>Appendix A:</b> | <b>Acronyms and Abbreviations .....</b>                      | <b>591</b> |
| <b>Appendix B:</b> | <b>CMS Trigger and Data Acquisition Membership.....</b>      | <b>597</b> |



# 1 General Overview

## 1.1 Introduction

For the nominal LHC design luminosity of $10^{34} \text{ cm}^{-2}\text{s}^{-1}$, an average of 17 events occurs at each beam crossing, and crossings are spaced 25 ns apart (a crossing frequency of 40 MHz). This input rate of $10^9$ interactions every second must be reduced by a factor of at least $10^7$ to 100 Hz, the maximum rate that can be archived by the on-line computer farm. CMS has chosen to reduce this rate in two steps. At the first level all data are stored for 3.2 $\mu\text{s}$, after which no more than 100 kHz of the stored events are forwarded to the High Level Triggers. This must be done for all channels without dead time. The Level-1 (L1) system is based on custom electronics. The High Level Trigger (HLT) system relies upon commercial processors. The L1 system uses only coarsely segmented data from the calorimeter and muon detectors, while holding all the high-resolution data in pipeline memories in the front-end electronics. The HLT is provided by a subset of the on-line processor farm which, in turn, passes a fraction of these events to the remainder of the on-line farm for more complete processing.
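The rate budget described above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the figures quoted in this section; the variable names are illustrative and are not taken from any CMS software.

```python
# Back-of-envelope rate budget using only the figures quoted in the introduction.
CROSSING_RATE_HZ = 40_000_000   # one beam crossing every 25 ns
EVENTS_PER_CROSSING = 17        # average at design luminosity
L1_MAX_OUTPUT_HZ = 100_000      # at most 100 kHz forwarded to the HLT
ARCHIVE_RATE_HZ = 100           # maximum rate archived by the on-line farm

# Interaction rate seen by the detector (the text rounds this to 10^9 per second).
interaction_rate_hz = EVENTS_PER_CROSSING * CROSSING_RATE_HZ   # 680_000_000

# Reduction factors per beam crossing for each of the two steps.
l1_reduction = CROSSING_RATE_HZ // L1_MAX_OUTPUT_HZ            # 400
hlt_reduction = L1_MAX_OUTPUT_HZ // ARCHIVE_RATE_HZ            # 1000
total_crossing_reduction = l1_reduction * hlt_reduction        # 400_000

print(f"interaction rate : {interaction_rate_hz:.1e} Hz")
print(f"L1 reduction     : {l1_reduction}")
print(f"HLT reduction    : {hlt_reduction}")
print(f"total (crossings): {total_crossing_reduction}")
```

Counted in beam crossings, the two steps correspond to factors of at least 400 at L1 and 1000 at the HLT. The factor of $10^7$ quoted in the text is relative to the rounded interaction rate of $10^9$ per second rather than to the crossing rate.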

The physical size of the CMS detector and underground caverns imposes constraints on signal propagation that combine with electronics technology to require 3.2 $\mu\text{s}$, equivalent to 128 beam crossings at 25 ns spacing, for any primary decision to discard data from a particular beam crossing. During this 3.2 $\mu\text{s}$ period, trigger data must be collected from the front-end electronics, decisions must be developed that discard a large fraction of the data while retaining the small portion coming from interactions of interest, and these decisions must be propagated to the front-end buffers of the readout electronics.
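The relation between the latency and the pipeline depth can be illustrated with a toy model. This is a sketch only: the class `FrontEndPipeline` and its `clock` method are hypothetical names introduced for illustration, and the model ignores everything about the real front-end electronics except the fixed 128-crossing delay between a sample entering the pipeline and its L1 decision arriving.

```python
from collections import deque

LATENCY_NS = 3200        # 3.2 microseconds, from the text
BUNCH_SPACING_NS = 25    # 25 ns between beam crossings, from the text
DEPTH = LATENCY_NS // BUNCH_SPACING_NS   # 128 crossings


class FrontEndPipeline:
    """Toy fixed-latency pipeline (hypothetical, for illustration only)."""

    def __init__(self, depth):
        self.depth = depth
        self.buffer = deque()

    def clock(self, sample, l1_accept):
        """Advance one crossing.

        `sample` is the data of the current crossing; `l1_accept` is the
        decision for the sample that entered `depth` crossings earlier.
        Returns that older sample if it is accepted, otherwise None.
        """
        self.buffer.append(sample)
        if len(self.buffer) <= self.depth:
            return None                    # pipeline still filling
        oldest = self.buffer.popleft()     # sample from `depth` crossings ago
        return oldest if l1_accept else None


# Usage: accept only the sample recorded at crossing 42.
pipe = FrontEndPipeline(DEPTH)
kept = []
for bx in range(300):
    accept = (bx - DEPTH == 42)            # decision arrives 128 crossings late
    out = pipe.clock(bx, accept)
    if out is not None:
        kept.append(out)
print(DEPTH, kept)
```

Because a new sample arrives at every crossing, the buffer must hold exactly as many samples as there are crossings in the latency window; a shallower buffer would overwrite data before its decision arrives.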

The trigger is the start of the physics event selection process. A decision to retain an event for further consideration has to be made every 25 ns. This decision is based on the event's suitability for inclusion in one of the various data sets to be used for analysis. The data sets to be taken are determined by CMS physics priorities as a whole. These data sets include di-lepton and multi-lepton data sets for top and Higgs searches, lepton plus jet data sets for top physics, and inclusive electron data sets for calorimeter calibrations. In addition, other samples are necessary for measuring efficiencies in event selection and studying backgrounds. The trigger has to select these samples in real time along with the main data samples.

This document describes the CMS L1 trigger system. The CMS HLT and DAQ system will be described in the CMS DAQ Technical Design Report scheduled for release at the end of 2001.

## 1.2 Summary of Requirements

### 1.2.1 Physics Requirements

The CMS L1 trigger is based on the identification of muons, electrons, photons, jets, and missing transverse energy. The trigger must have a sufficiently high and understood efficiency at a sufficiently low threshold to ensure a high yield of events in the final CMS physics plots to provide enough statistics and a high enough efficiency for these events so that the correction for this efficiency does not add appreciably to the systematic error of the measurement.

The physics requirements on the L1 trigger are:

- The CMS trigger system should be capable of selecting leptons and jets over the pseudorapidity range  $|\eta| < 2.5$ , with an efficiency which is very high, above a selected threshold in transverse momentum.
- For the single lepton triggers it is required that the trigger is fully efficient ( $> 95\%$ ) in the pseudorapidity range  $|\eta| < 2.5$ , with a threshold of  $p_T > 40 \text{ GeV}/c$ .
- For the dilepton trigger, it is required that the trigger is fully efficient ( $> 95\%$ ) in the pseudorapidity range  $|\eta| < 2.5$  with thresholds of  $p_T > 20$  and  $15 \text{ GeV}/c$  for the first and second leptons respectively.
- Single photon and diphoton triggers are required to have thresholds similar to those of the leptons.
- Single and multiple jet triggers are required with a well defined efficiency over the entire rapidity range  $|\eta| < 5$  in order to reconstruct jet spectra that overlap with data attainable at lower energy colliders such as the Tevatron. For higher transverse momenta the jet trigger should also be fully efficient.
- A missing transverse energy trigger with a threshold of about  $100 \text{ GeV}$  is required.

The above requirements are chosen to provide a high efficiency for the hard scattering physics to be studied at the LHC. This physics includes signals such as top decays to electrons and muons, Higgs decays to two photons or four leptons, W-W scattering, supersymmetry, Z' and top decays. Many of these processes involve W decays. Due to the low mass of the W compared to the energies at the LHC, the tightest constraint on a single lepton trigger comes from a requirement to trigger on W decays. Therefore, to be able to trigger on hard scattering physics signals, we have set up the requirements, as benchmarks, that the L1 single muon and calorimeter isolated electron triggers provide about a 50% efficiency for identifying W decay leptons (integrated over all lepton momenta). This results in a single lepton trigger  $p_T$  threshold of about  $40 \text{ GeV}$ . The Higgs ( $100 \text{ GeV}/c^2$  mass) decay to two photons with 95% efficiency determines the requirement for the isolated double photon trigger to have about a  $20 \text{ GeV}/c$   $p_T$  threshold. Note that the L1  $p_T$  cutoffs are, and ought to be, somewhat smaller than the offline physics analysis cuts. The reason for such a requirement is that the efficiency turn-on curves for the L1 trigger will be somewhat softer than can be achieved with a full analysis including the best resolutions and calibration corrections.

We must also plan for the evolution in the physics processes studied as the LHC luminosity increases from about  $10^{32}$  to  $10^{34} \text{ cm}^{-2}\text{s}^{-1}$ . Lower  $p_T$  thresholds and removal or relaxation of isolation cuts will be useful to maximize the physics output during lower luminosity and also to match with the Tevatron data.

The CMS trigger system will operate in heavy ion running. The purpose is to probe the quark gluon plasma. This requires the lowest possible muon threshold in order to study quarkonia production, as well as triggers on photons and jets. Heavy ion collisions will occur every 125 ns, but will have a much higher multiplicity than pp interactions. Due to the high resulting data volume, the L1 rate will be limited in heavy ion running to about 5 kHz for central Pb-Pb collisions.

The high multiplicities of individual events imply that the luminosity will need to be limited to reduce event pileup.

### 1.2.2 System Requirements

The trigger has to be inclusive, local, measurably efficient, and fill the DAQ bandwidth with a high purity stream. The local philosophy of the trigger implies an initial trigger selection of electrons, photons, muons and jets that relies on local information tied directly to their distinctive signatures, rather than on global topologies. For example, electron showers are small and extremely well defined in the transverse and longitudinal planes. Information from a few ECAL and HCAL calorimeter towers (at the L1 trigger), the preshower detector, and a small region of the tracking volume (at higher trigger levels) are sufficient for electron identification. The only global entities are neutrinos (from a global sum of missing  $E_T$ ).

For the trigger to be measurably efficient the tools to measure lepton and jet efficiencies must be built into the trigger architecture from the start. One such tool is overlapping programmable triggers so that multiple triggers with different thresholds and cuts can run in parallel. A second tool is prescaled triggers of lower threshold or weaker criteria that run in parallel with the more strict triggers. A third tool is prescaling of a particular trigger with one of its cuts removed.
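
As a rough illustration of the second tool, a prescaled trigger can be modelled as a counter that lets through one out of every N events satisfying a condition. The class name, thresholds and event values below are hypothetical and are not taken from the trigger design.

```python
class Prescaler:
    """Accept one out of every `factor` events that satisfy a trigger condition."""

    def __init__(self, factor: int) -> None:
        if factor < 1:
            raise ValueError("prescale factor must be >= 1")
        self.factor = factor
        self._count = 0

    def accept(self, condition_passed: bool) -> bool:
        if not condition_passed:
            return False
        self._count += 1
        if self._count == self.factor:
            self._count = 0
            return True
        return False


# Hypothetical example: a low-threshold trigger prescaled by 4 running in
# parallel with an unprescaled (factor 1) high-threshold trigger.
low = Prescaler(4)
high = Prescaler(1)
et_values = [12, 45, 18, 22, 60, 15, 30, 25, 11, 70]   # invented ET values in GeV
low_bits = [low.accept(et > 10) for et in et_values]
high_bits = [high.accept(et > 40) for et in et_values]
```

Both triggers are evaluated on every event, so the prescaled low-threshold sample overlaps the high-threshold one and can later serve as a reference for it.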

The requirement on the use of DAQ bandwidth implies two conditions. First, each level of the trigger attempts to identify leptons and jets as efficiently as possible, while keeping the output bandwidth within requirements. The selected event sample should include all events that would be found by the full offline reconstruction. Hence, the cuts in the trigger must be consistent with those of the offline. Second, since the bandwidth to permanent storage media is limited, events must be selected with care at the final trigger level.

The measurement of trigger efficiency requires the flexibility to have overlapping triggers so that efficiencies can be measured from the data. The overlaps include different thresholds, relaxed individual criteria, prescaled samples with one criterion missing, and overlapping physics signatures. For example, measurement of the inclusive jet spectrum uses several triggers of successively higher thresholds, with the lower thresholds prescaled by factors that allow a reasonable rate to storage. These triggers overlap in jet energy all the way down to minimum bias events so that the full spectrum can be constructed accurately. The efficiency and bias of each higher threshold can be measured from the data sets of lower threshold.
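
The last point can be sketched in code: the efficiency of a higher-threshold trigger is estimated, bin by bin, as the fraction of events recorded by a lower-threshold reference trigger that also fired the higher one. The function, trigger-bit names and toy events are hypothetical, and a real measurement would also need statistical errors.

```python
def trigger_efficiency(events, reference_bit, test_bit, et_bins):
    """Fraction of reference-triggered events that also fired `test_bit`, per ET bin.

    events:  list of dicts with key "et" (GeV) and boolean trigger bits.
    et_bins: list of (low, high) edges in GeV; low inclusive, high exclusive.
    Returns one efficiency per bin, or None for a bin with no reference events.
    """
    result = []
    for low, high in et_bins:
        ref = [e for e in events if e[reference_bit] and low <= e["et"] < high]
        if not ref:
            result.append(None)
            continue
        passed = sum(1 for e in ref if e[test_bit])
        result.append(passed / len(ref))
    return result


# Toy events (invented): "jet30" is the prescaled reference, "jet60" the trigger under test.
events = [
    {"et": 35, "jet30": True, "jet60": False},
    {"et": 55, "jet30": True, "jet60": False},
    {"et": 58, "jet30": True, "jet60": True},
    {"et": 65, "jet30": True, "jet60": True},
    {"et": 72, "jet30": True, "jet60": True},
    {"et": 90, "jet30": False, "jet60": True},   # not kept by the prescaled reference
]
bins = [(30, 50), (50, 60), (60, 100)]
eff = trigger_efficiency(events, "jet30", "jet60", bins)
```

Only events selected by the reference trigger enter the denominator, which is why the reference threshold must lie well below the turn-on of the trigger being measured.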

### 1.2.3 Rate Requirements

The CMS L1 trigger rate is limited by the speed of the detector electronics readout and the rate at which the data can be harvested by the data acquisition system. The L1 trigger electronics itself is pipelined and deadtimeless, and as such can render a decision on every beam crossing. The design capability of the readout, event builder and event filter are each at 100 kHz. However, since CMS plans to exploit the funding resources and computing technology advances in the most effective manner, the approach to this design capability will be evolutionary. At the luminosity expected at the experiment turn-on, the maximum capability will not be required. Therefore, those components of the data acquisition system that can be easily scaled with additional purchases of “off the shelf” commercial hardware can be planned for initial operation at lower rates. The event filter is presently planned for an input capacity of 75 kHz at the experiment turn-on. This capacity may be adjusted through the purchase of additional computing nodes which can be rapidly installed. The operation of the event filter at 75 kHz means that the event builder has to operate at 75 kHz and will be implemented with this capability. The event builder will be based on commercial switch and network technology that can also be scaled simply and quickly. The individual detector readout systems have many custom components and as such are not easily scalable. They will be implemented with the full 100 kHz L1 trigger capability, although initial operation will be restricted to 75 kHz. Given the initial operation of the L1 trigger and readout throughput at 75 kHz, for the purposes of this TDR, we will restrict ourselves to consideration of the planned initial L1 trigger rate of 75 kHz from here on.

The uncertainties in estimates of cross sections at high energies and limited knowledge of branching ratios impose a large error on the estimated trigger rates. In addition we cannot assume that the CMS DAQ system will always run at its maximum design capacity. Therefore, we provide for a safety margin of a factor of three from the planned initial 75 kHz maximum L1 output rate to 25 kHz, in designing algorithms for L1 triggers. Furthermore, this 25 kHz bandwidth of L1 output has to be shared amongst both muon and calorimeter triggers. Therefore, we have selected a target rate of 12.5 kHz for the individual totals of the calorimeter and muon triggers. The detailed list of triggers with their rates is presented in Chapter 15.
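
The rate budget described above amounts to simple arithmetic, sketched below. The menu entries and their rates are invented for illustration and are not the values of Chapter 15; the equal split between calorimeter and muon triggers ignores any overlap between the two.

```python
INITIAL_L1_RATE_KHZ = 75.0   # planned maximum L1 output rate at turn-on
SAFETY_FACTOR = 3.0          # margin for cross-section and DAQ uncertainties
N_TRIGGER_CLASSES = 2        # calorimeter and muon triggers

budget_khz = INITIAL_L1_RATE_KHZ / SAFETY_FACTOR       # 25 kHz design target
per_class_khz = budget_khz / N_TRIGGER_CLASSES         # 12.5 kHz each


def within_budget(trigger_rates_khz, limit_khz=per_class_khz):
    """Return True if the summed rates (kHz) of a trigger menu fit within the limit."""
    return sum(trigger_rates_khz.values()) <= limit_khz


# Hypothetical calorimeter menu; these rates are invented for illustration only.
example_menu = {"single_e": 5.0, "double_e": 2.0, "jets": 3.0, "missing_et": 1.5}
print(budget_khz, per_class_khz, within_budget(example_menu))
```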

### 1.2.4 Structural Requirements

The time between beam crossings at the LHC is 25 ns, which is too short to read out the Megabytes of data for each event and to provide a trigger decision. The data are therefore stored in a pipeline and the first level trigger decision is transmitted to the detector electronics within 3.2  $\mu$ s after the crossing. In order to avoid deadtime, the trigger electronics must itself be pipelined: every process in the trigger must be repeated every 25 ns. This has important consequences for the requirements on the structure of the trigger system. The fact that each piece of logic must accept new data every 25 ns means that no piece of individual data processing can take more than 25 ns. This prohibits the use of iterative algorithms, such as jet-finding based on finding a seed tower and then adding the surrounding towers to make a jet energy sum.

The high operational speed and pipelined architecture also require that specific data be brought to specific points in the trigger system for processing and that data cannot be fetched based on analysis of other data in an event. The data must flow synchronously across the trigger logic in a deterministic manner in the same way for each crossing. At any moment there are many crossings being processed in order in the various stages of the trigger logic. The consequence is that most of the trigger operations are either simple arithmetic operations or functions using memory lookup tables where an address of data produces a result previously written into the memory.
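
The lookup-table idea can be illustrated in software: a function is evaluated once for every possible input address, and processing a crossing then reduces to memory reads and a sum. This is a sketch only; the 8-bit address width and the nonlinear mapping are invented and do not correspond to any actual CMS table.

```python
ADDRESS_BITS = 8   # assumed address width, for illustration only


def build_lut(func, address_bits=ADDRESS_BITS):
    """Precompute func for every possible address, as when loading a RAM."""
    return [func(addr) for addr in range(1 << address_bits)]


def hypothetical_et_scale(code):
    """Invented nonlinear mapping from an 8-bit code to a saturating 8-bit value."""
    return min(255, (code * code) // 64 + code // 2)


ET_LUT = build_lut(hypothetical_et_scale)


def process_crossing(tower_codes):
    """Apply the same fixed operations to every crossing:
    one table read per tower followed by a sum, with no data-dependent branching."""
    return sum(ET_LUT[code] for code in tower_codes)


print(process_crossing([0, 16, 16]))
```

Changing the algorithm then means writing new table contents rather than changing the logic, which is the flexibility the text relies on.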

The requirement that data are read in, calculations performed and the decision transmitted to the detector electronics in 3.2  $\mu$ s restricts the data usable for trigger calculations to those immediately available after the crossing has occurred. The longest time to have data presented for processing is found in the Muon Barrel Drift Chambers where the drift time of 400 ns must pass before all signals are collected. The restriction to promptly available data means that the trigger system can use data from neither the preshower nor the tracker, and that extensive processing or corrections cannot be applied to the data used in the trigger calculations.

## 1.3 Overview of Trigger Structure

### 1.3.1 Level 1

The design of the CMS Trigger and Data Acquisition system is illustrated in Fig. 1.1.



**Fig. 1.1:** CMS Trigger and Data Acquisition System

At the first level all information about the event is preserved. The first level decision is made, with negligible deadtime, on a subset of the total information available for the events. Made at a fixed time after the interaction has occurred, a first level decision is issued every 25 ns. The L1 trigger system must be able to accept a new event every 25 ns. The L1 pipeline data storage time is 3.2  $\mu$ s. Since signal propagation delays are included in this pipeline time, the L1 trigger calculations must be done in many cases in less than 1  $\mu$ s. If the first level trigger generates an accept, the event data are moved or assigned to a buffer for readout and processing by the High Level Triggers.

The limit of 3.2  $\mu$ s is imposed by the amount of data storage in the tracker and preshower front-end buffers before readout after a L1 accept. The quantity of tracker and preshower data requires an architecture which provides for storage of event data before a L1 accept and readout of the event data (at maximum 100 kHz out of the input rate of 40 MHz bunch crossings) corresponding to the L1 accepts. This architecture prevents use of the tracker data in the L1 trigger decision.
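
The storage scheme described above can be modelled as a fixed-depth first-in, first-out buffer: each crossing's data wait in the pipeline until the corresponding L1 decision arrives, and are then either copied to a readout buffer or dropped. The following is a behavioural sketch with names of our own choosing, not a description of the front-end hardware.

```python
from collections import deque

PIPELINE_DEPTH = 128   # 3.2 us latency / 25 ns per crossing


def run_front_end(samples, l1_decisions, depth=PIPELINE_DEPTH):
    """Simulate a front-end pipeline.

    samples[i] is the data of crossing i; l1_decisions[i] is the L1 accept bit
    for crossing i, which becomes available `depth` crossings later.
    Returns the (crossing, sample) pairs copied to the readout buffer.
    """
    pipeline = deque()
    readout = []
    for now, sample in enumerate(samples):
        pipeline.append((now, sample))
        if len(pipeline) > depth:
            # The oldest entry belongs to crossing now - depth; its decision is due.
            crossing, old_sample = pipeline.popleft()
            if l1_decisions[crossing]:
                readout.append((crossing, old_sample))
    return readout


samples = list(range(1000, 1400))                      # 400 crossings of dummy data
decisions = [i in (5, 200, 390) for i in range(400)]   # invented accept pattern
kept = run_front_end(samples, decisions)
```

In this toy run crossing 390 is still inside the pipeline when the simulation ends, so only crossings 5 and 200 reach the readout buffer.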

The L1 trigger involves the calorimetry and muon systems as well as some correlation of information from these systems. The L1 decision is based on the presence of local objects such as photons, electrons, muons, and jets, using information from the calorimeters and muon systems in a given element of  $\eta$ - $\phi$  space. It also employs global sums of  $E_T$  and missing  $E_T$ . Each of these items is tested against several  $p_T$  or  $E_T$  thresholds.

The global compilation of this information is used to decide whether to keep (i.e. trigger on) the data from a particular beam crossing. The L1 logic also has the ability to monitor and control trigger rates, hot and dead channels, and other pathological conditions. The maximum design trigger rate of 100 kHz for L1 corresponds to a minimum rejection rate of  $10^4$  at design luminosity of  $10^{34} \text{cm}^{-2}\text{s}^{-1}$ . This maximum L1 trigger rate is set by the average time to read information for processing by the HLT and the average time for completion of processing steps in the HLT logic.

### 1.3.2 High Level Triggers

The CMS Level-1 Trigger System is required to reduce the input interaction rate of 1 GHz to a filtered event rate of 75 kHz. For physics analysis and further event filtering, the data corresponding to each selected event must then be moved from about 512 front-end buffers to a single location. To match the capabilities of the mass storage and offline computing systems, the final output of the experiment should not exceed 100 events per second.

These functions will be performed by a system employing a high performance readout network to connect the sub-detector readout units via a switch fabric to the event filter units (which are implemented by a computer farm). The flow of event data will be controlled by an event manager system.

In order to optimize the data flow, the filter farm performs event selection in progressive stages by applying a series of High Level Trigger filters. The initial filtering decision is made on a subset of the data, from detector components such as the calorimeter and muon systems. This avoids saturating the system bandwidth by reading out the large volume of tracking data at 75 kHz. It is expected that initial filtering can reduce the event rate by at least one order of magnitude. The remainder of the full event data are only transferred to the farm after passing these initial filters and the final High Level Trigger algorithms are then applied to the complete event.

The High Level Triggers have access to all the information used in L1 since this is stored locally in the L1 trigger crates. Consequently, High Level Triggers can make further combinations and other topological calculations on the digital list of objects transmitted from L1.

More importantly, much information is not available on the time scale of the L1 trigger decision. This information is then used in the High Level Triggers. This information includes that from the tracker and the full granularity of the calorimeters. Eventually, the High Level Triggers use the full event data for the decision to keep an event.

The High Level Triggers, implemented as a processing farm that is designed to achieve a rejection factor of  $10^3$ , write up to 100 events/second to mass storage. The last stage of High Level Trigger processing does reconstruction and event filtering with the primary goal of making data sets of different signatures on easily accessed media.

## 1.4 Overview of Level 1 Trigger Organization

### 1.4.1 Introduction

The L1 Trigger System is organized into three major subsystems: the L1 calorimeter trigger, the L1 muon trigger, and the L1 global trigger. The muon trigger is further organized into subsystems representing the three different muon detector systems: the Drift Tube (DT) trigger in the barrel, the Cathode Strip Chamber (CSC) trigger in the endcap, and the Resistive Plate Chamber (RPC) trigger covering both barrel and endcap. The L1 muon trigger also has a global muon trigger that combines the trigger information from the DT, CSC and RPC trigger systems and sends this to the L1 global trigger. A diagram of the L1 trigger system is shown in Fig. 1.2.



**Fig. 1.2:** Overview of Level 1 Trigger

The data used as input to the L1 trigger system as well as the input data to the global muon trigger, global calorimeter trigger and the global trigger are transmitted to the DAQ for storage along with the event readout data. In addition, all trigger objects found, whether they were responsible for the L1 trigger or not, are also sent. The decision whether to trigger on a specific crossing or to reject that crossing is transmitted via the Trigger Timing and Control system to all of the detector subsystem front end and readout systems.

### 1.4.2 Calorimeter Trigger

The calorimeter trigger begins with trigger tower energy sums formed by the ECAL, HCAL and HF upper level readout Trigger Primitive Generator (TPG) circuits from the individual calorimeter cell energies. For the ECAL, these energies are accompanied by a bit indicating the transverse extent of the electromagnetic energy deposit. For the HCAL, the energies are accompanied by a bit indicating the presence of minimum ionizing energy. The TPG information is transmitted over high speed copper links to the Regional Calorimeter Trigger (RCT), which finds candidate electrons, photons, taus, and jets. The RCT separately finds both isolated and non-isolated electron/photon candidates. The RCT transmits the candidates along with sums of transverse energy to the Global Calorimeter Trigger (GCT). The GCT sorts the candidate electrons, photons, taus, and jets and forwards the top 4 of each type to the global trigger. The GCT also calculates the total transverse energy and total missing energy vector. It transmits this information to the global trigger as well. The RCT also transmits an  $(\eta, \phi)$  grid of quiet regions to the global muon trigger for muon isolation cuts.
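
The sorting step of the GCT can be pictured as selecting the four highest-ranked candidates of a given type. The software sketch below uses `heapq.nlargest` for brevity; the hardware uses dedicated sorting logic, and the `Candidate` fields are assumptions made for the example.

```python
import heapq
from typing import NamedTuple


class Candidate(NamedTuple):
    rank: int        # e.g. an ET code; higher is better (assumed convention)
    eta_index: int
    phi_index: int


def top_candidates(candidates, n=4):
    """Return the n highest-ranked candidates, best first."""
    return heapq.nlargest(n, candidates, key=lambda c: c.rank)


# Invented electron/photon candidates from the regional trigger.
regional = [
    Candidate(12, 3, 7),
    Candidate(40, 1, 2),
    Candidate(5, 0, 0),
    Candidate(33, 4, 9),
    Candidate(27, 2, 11),
    Candidate(8, 6, 1),
]
forwarded = top_candidates(regional)
```

Each forwarded candidate keeps its coordinates, so later stages can still apply location-dependent criteria.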

### 1.4.3 Muon Trigger

Each of the L1 muon trigger systems has its own trigger logic. The RPC strips are connected to a Pattern Comparator Trigger (PACT), which is projective in  $\eta$  and  $\phi$ . The PACT forms trigger segments which are connected to segment processors which find the tracks and calculate the  $p_T$ . The RPC logic also provides some hit data to the CSC trigger system to improve resolution of ambiguities caused by 2 muons in the same CSC.

The Cathode Strip Chambers form Local Charged Tracks (LCT) from the Cathode Strips, which are combined with the Anode wire information for bunch crossing identification on a Trigger Motherboard. The LCT pattern logic assigns a  $p_T$  and quality, which is used to sort the LCTs on the Motherboard and on the Muon Port Card (MPC) that collects LCTs from up to 9 CSC chambers. The top 3 LCTs from all the MPCs in a sector are transmitted to the CSC Track Finder, which combines the LCTs into full muon tracks and assigns  $p_T$  values to them. The CSC and Drift Tube Track-Finders exchange track segment information in the region where the chambers overlap.

The Barrel Muon Drift Tubes are equipped with Bunch and Track Identifier (BTI) electronics that finds track segments from coincidences of aligned hits in 4 layers of one drift tube superlayer (SL). The track segment positions and angles are sent to the Track Correlator (TRACO), which attempts to combine the segments from the two SLs measuring the  $\phi$  coordinate. The best combinations from all TRACOs of a single chamber together with the SL  $\eta$  segments are collected by the Trigger Server. The Trigger Server then sends the best two segments (if found) to the Track Finder, which combines the segments from different stations into full muon tracks and assigns  $p_T$  values to them.

The Global Muon Trigger sorts the RPC, DT and CSC muon tracks, converts these tracks into the same  $\eta$ ,  $\phi$  and  $p_T$  scale, and validates the muon sign. It then attempts to correlate the CSC and DT tracks with RPC tracks. It also correlates the found muon tracks with an  $\eta$ - $\phi$  grid of quiet calorimeter towers to determine if these muons are isolated. The final ensemble of muons is sorted based on initial quality, correlation and  $p_T$ , and the 4 top muons are then sent to the Global Trigger.
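
A much simplified picture of the last two steps is given below: each muon is flagged as isolated if its  $\eta$ - $\phi$  region appears in the set of quiet regions, and the muons are then sorted and truncated to four. The dictionary keys, the region indexing and the sort key (quality first, then  $p_T$ ) are assumptions made for this sketch; the actual ranking also takes the DT/CSC-RPC correlation into account.

```python
def flag_isolated(muons, quiet_regions):
    """Return copies of the muons with an added boolean 'isolated' flag.

    quiet_regions is a set of (eta_idx, phi_idx) pairs marked quiet by the calorimeter.
    """
    return [
        dict(m, isolated=(m["eta_idx"], m["phi_idx"]) in quiet_regions)
        for m in muons
    ]


def select_top_muons(muons, n=4):
    """Sort by quality first, then pT (both descending), and keep the n best."""
    return sorted(muons, key=lambda m: (m["quality"], m["pt"]), reverse=True)[:n]


# Invented muon candidates and quiet-region grid.
muons = [
    {"pt": 25, "quality": 3, "eta_idx": 2, "phi_idx": 5},
    {"pt": 40, "quality": 2, "eta_idx": 1, "phi_idx": 1},
    {"pt": 10, "quality": 3, "eta_idx": 0, "phi_idx": 7},
    {"pt": 60, "quality": 1, "eta_idx": 4, "phi_idx": 3},
    {"pt": 5, "quality": 0, "eta_idx": 3, "phi_idx": 3},
]
quiet = {(2, 5), (4, 3)}
best = select_top_muons(flag_isolated(muons, quiet))
```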

### 1.4.4 Global Trigger

The Global Trigger accepts muon and calorimeter trigger information, synchronizes matching sub-system data arriving at different times and communicates the Level-1 decision to the timing, trigger and control system for distribution to the sub-systems to initiate the readout. The global trigger decision is made using logical combinations of the trigger data from the Calorimeter and Muon Global Triggers.

The CMS L1 system sorts ranked trigger objects, rather than histogramming objects over a fixed threshold. This allows all trigger criteria to be applied and varied at the Global Trigger level rather than earlier in the trigger processing. All trigger objects are accompanied by their coordinates in  $(\eta, \phi)$  space. This allows the Global Trigger to vary thresholds based on the location of the trigger objects. It also allows the Global Trigger to require trigger objects to be close to or opposite each other. In addition, the presence of the trigger object coordinate data in the trigger data, which is read out first by the DAQ after a L1A, permits a quick determination of the regions of interest where the more detailed HLT analyses should focus. Besides handling physics triggers, the Global Trigger provides for test and calibration runs, not necessarily in phase with the machine, and for prescaled triggers, as this is an essential requirement for checking trigger efficiencies and recording samples of large cross section data.
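
Because every object carries its  $(\eta, \phi)$  coordinates, location-dependent thresholds and topological requirements reduce to simple comparisons. The sketch below shows one such condition, a pair of objects roughly opposite in  $\phi$ ; the threshold values, the  $\eta$  boundary and the minimum separation are invented for the example.

```python
import math


def delta_phi(phi1, phi2):
    """Smallest absolute azimuthal separation in radians, in the range [0, pi]."""
    d = abs(phi1 - phi2) % (2 * math.pi)
    return 2 * math.pi - d if d > math.pi else d


def threshold_for(eta, central=20.0, forward=30.0, boundary=1.5):
    """Hypothetical eta-dependent ET threshold in GeV."""
    return central if abs(eta) < boundary else forward


def back_to_back_pair(objects, min_dphi=2.5):
    """True if any two objects above their local threshold are roughly opposite in phi."""
    passing = [o for o in objects if o["et"] >= threshold_for(o["eta"])]
    for i in range(len(passing)):
        for j in range(i + 1, len(passing)):
            if delta_phi(passing[i]["phi"], passing[j]["phi"]) >= min_dphi:
                return True
    return False


pair = [
    {"et": 25.0, "eta": 0.3, "phi": 0.1},
    {"et": 35.0, "eta": 2.0, "phi": 3.1},
]
print(back_to_back_pair(pair))
```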

The Global L1 Trigger transmits a decision to either accept (L1A) or reject each bunch crossing. This decision is transmitted through the Trigger Throttle System (TTS) to the Timing Trigger and Control system (TTC). The TTS allows the reduction by prescaling or shutting off of L1A signals in case the detector readout or DAQ buffers are at risk of overflow.

### 1.4.5 Timing, Trigger and Control System

The TTC system provides for the transmission of the L1A and a precise 40 MHz bunch crossing clock along with other fast data to the detector subsystems over a network of optical fibers. The TTC system uses optical-broadcast technology developed by the RD-12 Collaboration [1.1]. The TTC system in CMS is divided into a series of zones. Within each zone signals can be broadcast from a single laser source to more than a thousand destinations over a passive network composed of a hierarchy of optical tree couplers. Active optical/electrical converters (TTCrx) at each fiber destination provide programmable coarse and fine deskew to compensate for different particle flight times and detector, electronics, propagation and test generator delays. Prototype TTC hardware has been used successfully to provide clock and control signals in laboratory and beam tests by CMS. The results from these tests are described in the subsequent chapters for the individual trigger subsystems.

### 1.4.6 Physical Location of the Trigger Electronics

CMS L1 Trigger electronics is located in both the underground interaction hall, UXC55, and in the underground counting house, USC55, as shown in Fig. 1.3. The electronics in the counting house are connected to the electronics on the detector with optical fibres which penetrate the shielding wall between the two sections of the cavern via tunnels that minimize the connecting path length in order to minimize latency.

All of the L1 Calorimeter trigger is located in the counting room. The trigger primitive generation is performed in the calorimeter readout crates in the counting house which are connected via optical fibres to digitizers on the detector. The muon drift chambers, RPCs and CSCs form track segments in logic mounted in the chambers and on the outside of the detector. These segments are then transmitted over optical fibres to track-finder logic in the counting room.

The Global L1 trigger is located in the counting room in racks close to the other L1 trigger components. The Global L1 decision to keep or reject a particular crossing is transmitted through the Timing Trigger and Control system (TTC) over optical links from the control room via the tunnels through the shielding wall out to the front end electronics buffers on the detector where the data is stored during the level 1 latency.

### 1.4.7 Physical Realization of the Trigger Electronics

**Fig. 1.3:** CMS underground interaction hall and counting house.

Much of the logic in the trigger system is contained in custom Application Specific Integrated Circuits (ASICs), semi-custom or gate-array ASICs, Field Programmable Gate Arrays (FPGAs), Programmable Logic Devices (PLDs), or discrete logic such as Random Access Memories (RAM) that are used for memory Look-Up Tables (LUT)<sup>1</sup>. The time of writing of this Technical Design Report is coincident with remarkable progress in FPGA technology, both in speed and number of gates. The CMS trigger group is taking full advantage of these advances and will continue to do so. As is described in the following chapters, each trigger subsystem has performed careful optimization of their designs fully cognizant of the evolution of technology consistent with the overall project schedule. Where possible and where the added flexibility offers an advantage and is cost effective, designs incorporate new FPGA technology. Where the dataflow is predetermined by cables and backplanes or where the highest speed is needed or where standard operations such as adding or sorting are used, ASICs appear in the designs.

The key to a good trigger system is flexibility. The CMS L1 trigger electronics has been designed to provide maximum flexibility. Not only are all thresholds programmable, but as mentioned above, algorithms are either implemented in FPGAs or LUTs. Reprogramming the FPGAs or downloading new LUT contents allows for major revisions of the trigger algorithms. The only aspect of the trigger system that is fixed is which data is brought to which point for processing. However, this is determined by the detector elements, size of showers and curvature of tracks, which are well known and basic features of the CMS detector and LHC physics. Therefore, in the sections that follow, when various algorithms, thresholds and cuts are discussed, it should be understood that these are examples to prove that the trigger electronics as proposed will be able to handle the physics requirements of CMS. These algorithms, thresholds and cuts are not by any means the final choices that will be made by CMS. They are simply illustrative of the type of trigger configuration that CMS will run.

---

<sup>1</sup>. Note: All trademarks, copyright names and products referred to in this document are acknowledged as such.

The L1 trigger system sustains a large dataflow. This is carried on optical fibers, on copper cables, or on backplanes within crates. The data carried by these means may be sent in parallel at either 40 MHz, or a higher multiple of this frequency, or converted from parallel to serial and transmitted at a higher rate on a single line or pair of lines. Serial data transmission has the advantage of transmitting more data per cable wire or backplane pin but the disadvantage of extra latency for the parallel to serial and serial to parallel operations plus the risk of data errors involved with the encoding, high frequency transmission and link synchronization. In many cases this requires the overhead of monitoring and error detection bits. Copper cables in general avoid the necessity for optical drivers with their cost, size and power requirements, but have limited length capability. As is described in the subsequent chapters, each trigger subsystem has performed careful optimization of their designs to see if parallel data transmission and copper cables are feasible. Typically, where card space for mounting sufficient connectors is not available, serial data transmission has been employed and where distances are too great for copper cable of a reasonable size, optical fibers are used.

### 1.4.8 Coordinates and unit requirements

Establishing the coordinate system of CMS is of primary importance to the trigger systems since their channels are laid out in correspondence to this coordinate system. The location of trigger objects and the application of spatial correlations therefore depend on use of the same coordinate system by all components, so the coordinate system is explicitly described here. All input data use this coordinate system to combine data from different subsystems. The coordinate definitions are as follows:

- Definition: CMS is NORTH of LHC centre;
- x = horizontal axis pointing southwards toward the centre of LHC;
- y = vertical axis pointing upwards;
- z = horizontal axis pointing westwards in beam direction, parallel to B-field;
- $\phi = 0^\circ$  in x-axis,  $\phi = 90^\circ$  in y-axis;
- $\eta = 0$  in x-y plane,  $\eta > 0$  for positive z-axis,  $\eta < 0$  for negative z-axis;
- Origin = CMS collision point.
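
For illustration, the conversion from Cartesian coordinates to  $(\eta, \phi)$  under these conventions can be written as below. The text does not spell out a formula for  $\eta$ ; the sketch assumes the standard pseudorapidity definition  $\eta = -\ln\tan(\theta/2)$ , which equals  $\operatorname{asinh}(z/r_T)$  with  $r_T = \sqrt{x^2 + y^2}$ . The function name is ours.

```python
import math


def eta_phi(x, y, z):
    """Return (eta, phi in degrees) of a point relative to the CMS collision point.

    phi is measured from the x-axis toward the y-axis, in the range [0, 360).
    eta uses the standard pseudorapidity definition (an assumption of this sketch).
    """
    r_t = math.hypot(x, y)
    if r_t == 0.0:
        raise ValueError("eta is undefined on the beam axis")
    phi = math.degrees(math.atan2(y, x)) % 360.0
    eta = math.asinh(z / r_t)   # identical to -ln(tan(theta / 2))
    return eta, phi


print(eta_phi(1.0, 0.0, 0.0))   # point on the x-axis
print(eta_phi(0.0, 1.0, 1.0))   # point above the beam line, displaced along +z
```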

#### Reference

- [1.1] 1999 Status Report on the RD-12 Project, CERN/LHCC 2000-002, 3 January 2000.



# 2 Requirements

## 2.1 Physics Requirements

### 2.1.1 Cross Sections and Rates

Cross sections of phenomena to be studied at LHC span many orders of magnitude. This is illustrated in Fig. 2.1.



**Fig. 2.1:** Inclusive proton-(anti)proton cross sections  $\sigma$  for basic physics processes. Interaction rates for the nominal luminosity are given on the right hand scale.

The enormous range of the cross sections makes triggering at the LHC a very challenging task. The trigger system has to select efficiently a few interesting events among millions of background ones. Note the relatively high cross section for b-quarks. The physics of b-quarks is very interesting in itself, because of quark mixing and CP violation phenomena. On the other hand it is one of the main sources of leptons, which constitute a huge background for other processes. High  $p_T \pi^0$ 's generated in jet fragmentation and direct photon processes form a large background for the electron trigger. The top quark discovered at the Tevatron will be studied in detail at the LHC at low luminosity. On the other hand, top is a very severe background to more exotic physics at high luminosity, because it has signatures very similar to those of many expected new particles.

In the following sections of this chapter we are going to review basic physics channels to be studied at LHC. Since there is a vast literature on the subject we do it here only in a brief, tabular form. Our goal is to derive requirements for the trigger.

### 2.1.2 Physics Simulation Tools

Most of the physics studies for CMS, including those discussed in this chapter, were done with event generators, like PYTHIA [2.1], ISAJET [2.2], or their supersymmetric extensions. Some results were obtained on the particle level, without simulating the detector. More advanced studies were performed with the CMSIM program [2.3], which simulates in detail the detector response. In some cases the digitisation and reconstruction was performed by its object oriented version called ORCA [2.4]. For more information concerning the simulation of physics processes discussed in this chapter we refer the reader to the quoted papers. Trigger simulation tools are described in Sections 3.4.1 and 8.4.1.

### 2.1.3 Review of Physics Channels

In this section we review the physics channels to be studied at LHC, looking for possible ways of triggering. We refer to CMS documents quoting cuts applied in Monte Carlo analysis. Trigger threshold on corresponding objects should never be lower. The following trigger objects are considered:

- $\mu$  — muon (any),
- $\mu_i$  — isolated muon (no nearby jet),
- $e$  — electron/photon (isolated),
- $e_b$  — b-electron (from b-quark decay),
- jet — local energy concentration in the calorimeter,
- $\tau$  — tau trigger (a narrow jet),
- $E_T^{\text{miss}}$  — missing transverse energy,
- $2\mu, 2e, e\mu, 2$  jets, 3 jets, etc. — multi-object triggers.

Only isolated electrons and photons are considered, because the background from QCD jets would otherwise be unmanageable. This is not satisfactory for b-quark physics, and therefore a dedicated *b-electron trigger* is required. Possible implementations of these triggers are discussed in Chapters 3, 4 and 5.

Whenever we consider a multi-object trigger for a given channel, its efficiency is complemented by the corresponding single-object triggers. For example, some dimuon events may escape a  $p_T > 10 \text{ GeV}/c$  two-muon trigger if one of the muons is beyond the trigger acceptance. However, those among them which have one muon of  $p_T > 20 \text{ GeV}/c$  can be recovered by a single  $\mu$  trigger set at  $20 \text{ GeV}/c$ . In this sense we can say that the two-muon trigger implies the use of a single muon trigger, which we write in shorthand as " $2\mu$  implies  $\mu$ ". This means that wherever in the tables of this chapter we quote the two-muon trigger, the single muon one is also applicable. The complete set of such dependencies is given below:

- $\mu_i$  implies  $\mu$
- $e_b$  implies  $e$
- 4 jets implies 3 jets, which implies 2 jets, which implies 1 jet
- any multi-object trigger implies all corresponding single object triggers
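
The dimuon example above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not a description of the actual trigger logic: the function names are hypothetical, an event is reduced to the list of muon $p_T$ values seen inside the trigger acceptance, and the thresholds (10 and 20 GeV/c) are those quoted in the example.

```python
def count_above(pts, threshold):
    """Number of trigger objects with pT (GeV/c) strictly above the threshold."""
    return sum(1 for pt in pts if pt > threshold)


def two_muon_trigger(muon_pts, threshold=10.0):
    """Fires when at least two muons in acceptance exceed the threshold."""
    return count_above(muon_pts, threshold) >= 2


def single_muon_trigger(muon_pts, threshold=20.0):
    """Fires when at least one muon in acceptance exceeds the threshold."""
    return count_above(muon_pts, threshold) >= 1


def dimuon_channel_accepted(muon_pts):
    """'2mu implies mu': the channel is accepted by the OR of both triggers."""
    return two_muon_trigger(muon_pts) or single_muon_trigger(muon_pts)


# Hypothetical events: pT of the muons seen inside the trigger acceptance.
both_muons_seen = [12.0, 11.0]   # passes the two-muon trigger directly
one_lost_hard = [25.0]           # second muon outside acceptance, first is hard
one_lost_soft = [12.0]           # second muon outside acceptance, first is soft

if __name__ == "__main__":
    for name, pts in [("both seen", both_muons_seen),
                      ("one lost, hard", one_lost_hard),
                      ("one lost, soft", one_lost_soft)]:
        print(name, dimuon_channel_accepted(pts))
```

In this sketch the second event fails the two-muon trigger but is recovered by the single muon trigger, while the third is lost to both.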

Channels with a low value of the (cross section  $\times$  branching ratio) product require high luminosity to collect reasonable statistics. Others are better studied at low luminosity because, for example, the pileup of several pp interactions typical of high luminosity can spoil vertex finding. We denote this in the following way:

- H — high luminosity:  $L = 10^{34} \text{ cm}^{-2} \text{s}^{-1}$
- L — low luminosity:  $L = 10^{33} \text{ cm}^{-2} \text{s}^{-1}$
- VL — very low luminosity:  $L = 10^{32} \text{ cm}^{-2} \text{s}^{-1}$
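
The link between the (cross section $\times$ branching ratio) product, the luminosity and the collected statistics is the standard relation $R = \sigma \times BR \times L$. A minimal sketch of this arithmetic is given below. The luminosity values are those listed above; the function names, the example cross section and the running time are illustrative assumptions, not numbers taken from this report.

```python
CM2_PER_BARN = 1.0e-24  # 1 barn = 1e-24 cm^2

# Luminosity labels used in the tables of this chapter (cm^-2 s^-1).
LUMINOSITY = {"H": 1.0e34, "L": 1.0e33, "VL": 1.0e32}


def event_rate_hz(sigma_barn, label, branching_ratio=1.0):
    """Production rate in Hz: R = sigma * BR * L."""
    return sigma_barn * branching_ratio * CM2_PER_BARN * LUMINOSITY[label]


def expected_events(sigma_barn, label, seconds, branching_ratio=1.0):
    """Number of events produced during a given running time."""
    return event_rate_hz(sigma_barn, label, branching_ratio) * seconds


if __name__ == "__main__":
    sigma = 1.0e-12      # hypothetical 1 pb cross section times branching ratio
    run_time = 1.0e7     # assumed running time in seconds, for illustration only
    for label in ("VL", "L", "H"):
        print(label, event_rate_hz(sigma, label), expected_events(sigma, label, run_time))
```

The sketch makes the scaling explicit: each step from VL to L to H multiplies the event yield of a given channel by ten.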

References in the tables are given in the following convention:

- **x.y.z** — chapters of the CMS Technical Proposal [2.5]
- **LOI x.y.z** — chapters of the CMS Letter Of Intent [2.6]
- **yy-xxx** — CMS Technical Note **CMS TN/yy-xxx**
- **yy/xxx** — CMS Internal Note **CMS IN yy/xxx**
- **Nyy/xxx** — CMS Note **CMS NOTE yy/xxx**
- **CRyy/xxx** — CMS Conference Report **CMS CR yy/xxx**

Empty cells correspond to areas where a study has not yet been done or the information is not available. The notation  $p_T > 0$  means that no trigger threshold is required; it is enough to observe the muon in the detector.

### Standard Model higgs

The quest for the higgs particle is a major goal of the LHC. Many physics channels have been envisaged to cover the entire range of possible higgs masses, from today's limit up to almost  $1 \text{ TeV}/c^2$ . Possible ways of triggering are reviewed in Table 2.1. It is already known from present experiments that the higgs cannot be too light. Triggering should not be very difficult, because relatively high thresholds can provide high acceptance.

**Table 2.1:** Search for Standard Model higgs

| physics channel                                                                                                               | references                              | $\mathcal{L}$ | offline cut (GeV/c)                                                               | trigger                            |
|-------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------|---------------|-----------------------------------------------------------------------------------|------------------------------------|
| $H \rightarrow \gamma\gamma$                                                                                                  | 12.1.2, 93-75, 94-289<br>94-290, CR97/6 | H             | $p_T(\gamma) > 40, 25$                                                            | 2e                                 |
| $WH \rightarrow \gamma\gamma$<br>$t\bar{t}H \rightarrow \gamma\gamma$                                                         | 12.1.4, 93-86<br>94-247                 | H             | $p_T(\gamma) > 40, 20$<br>$p_T(\ell) > 20$                                        | 2e<br>$e\mu_i$                     |
| $H \rightarrow \gamma\gamma$ jet                                                                                              |                                         | H             |                                                                                   | 2e, jet                            |
| $t\bar{t}H \rightarrow b\bar{b}$<br>$t \rightarrow bW \rightarrow q\bar{q}, \bar{t} \rightarrow \bar{b}W \rightarrow \ell\nu$ | N99/1                                   | H             | $p_T(\ell) > 20$<br>$E_T^{jet} > 10$                                              | $e, \mu_i$<br>6 jets               |
| $H \rightarrow ZZ^* \rightarrow 4\ell$                                                                                        | 12.1.5, 93-85<br>94-214, 95-18          | L             | $p_T(e) > 20, 15, 10, 10$<br>$p_T(\mu) > 10-20, 5-10, 5, 5$                       | 2e, $2\mu_i$<br>$e\mu_i$           |
|                                                                                                                               | 95-19, 95-59, 95-101<br>96-100, N97/43  | H             | $p_T(e) > 20, 15, 10, 10$<br>$p_T(\mu) > 20, 10, 5, 5$                            | 2e, $2\mu_i$<br>$e\mu_i$           |
| $H \rightarrow ZZ \rightarrow 4\ell$                                                                                          | 12.1.6, 93-79<br>93-101, 95-11          | L             | $p_T(e) > 20, 15, 10, 10$<br>$p_T(\mu) > 10, 5, 5, 5$                             | 2e, $2\mu_i$<br>$e\mu_i$           |
|                                                                                                                               | 95-18, 95-19<br>95-76, 96-92            | H             | $p_T(e) > 20, 15, 10, 10$<br>$p_T(\mu) > 20, 10, 5, 5$                            | 2e, $2\mu_i$<br>$e\mu_i$           |
| $WH \rightarrow ZZ^* \rightarrow 4\ell$                                                                                       | N99/71                                  | H             | $p_T(e) > 20, 15, 7, 7$<br>$p_T(\mu) > 20, 10, 5, 5$                              | 2e, $2\mu_i$<br>$e\mu_i$           |
| $H \rightarrow ZZ \rightarrow \ell\ell\nu\nu$                                                                                 | 12.1.7<br>93-87<br>95-75                | L             | $E_T^{miss} > 100$<br>$p_T(\ell) > 20, 20$<br>$p_T(Z \rightarrow \ell\ell) > 60$  | $E_T^{miss}$<br>2e, $2\mu_i$       |
|                                                                                                                               | 12.1.7, 92-49<br>94-179<br>95-75        | H             | $E_T^{miss} > 100$<br>$p_T(\ell) > 20, 20$<br>$p_T(Z \rightarrow \ell\ell) > 200$ | $E_T^{miss}$<br>2e, $2\mu_i$       |
| $H \rightarrow ZZ \rightarrow \ell\ell jj$                                                                                    | 12.1.8, 93-88<br>95-75                  | L             | $p_T(\ell) > 20$<br>$p_T(Z \rightarrow jj) > 100$                                 | 2e, $2\mu$<br>2 jets               |
|                                                                                                                               | 12.1.8, 92-49<br>94-178, 95-75          | H             | $p_T(\ell) > 50$<br>$p_T(Z \rightarrow jj) > 150$                                 | 2e, $2\mu$<br>2 jets               |
| $H \rightarrow WW \rightarrow \ell\nu jj$                                                                                     | 12.1.8<br>93-88                         | L             | $E_T^{miss} > 100$<br>$p_T(\ell) > 20$<br>$p_T(W \rightarrow jj) > 150$           | $E_T^{miss}$<br>$e, \mu$<br>2 jets |
|                                                                                                                               | 12.1.8, 92-49<br>94-178<br>95-154       | H             | $E_T^{miss} > 150$<br>$p_T(\ell) > 150$<br>$p_T(W \rightarrow jj) > 300$          | $E_T^{miss}$<br>$e, \mu$<br>2 jets |
| $H \rightarrow WW \rightarrow \ell^+\nu \ell^-\bar{\nu}$                                                                      | N97/83, N98/89                          | L,H           | $p_T(\ell) > 20, 10$                                                              | 2e, $2\mu_i, e\mu_i$               |

See also general reports: N97/57, N97/80.

We consider in detail the following channels:

- $H(115 \text{ GeV}/c^2) \rightarrow \gamma\gamma$
- $H(120 \text{ GeV}/c^2) \rightarrow ZZ^* \rightarrow 4l$
- $H(500 \text{ GeV}/c^2) \rightarrow ZZ \rightarrow ll\nu\nu$
- $H(800 \text{ GeV}/c^2) \rightarrow ZZ \rightarrow lljj$

Figures 2.2-2.3 show how the trigger acceptance depends on the thresholds. Full acceptance is preserved for a single lepton cut at ~20 GeV/c and a two-lepton cut at 10-15 GeV/c. The highest single photon threshold one can consider is ~40-50 GeV/c for the light higgs (110-120  $\text{GeV}/c^2$ ) and the highest dilepton threshold is ~20-30 GeV/c. Beyond that the trigger acceptance is seriously degraded. These numbers will be used in the next chapter to derive requirements for the trigger and data acquisition.



**Fig. 2.2: a)** Acceptance of the single and double photon trigger for  $H \rightarrow \gamma\gamma$  ( $m_H=115$  GeV).



**Fig. 2.2: b)** Acceptance of the single and double lepton trigger for  $H \rightarrow ZZ^* \rightarrow 4l$  ( $m_H=120$  GeV).



**Fig. 2.3: a)** Acceptance of the single and double photon trigger for  $H \rightarrow ZZ \rightarrow ll\nu\nu$  ( $m_H=500$  GeV).



**Fig. 2.3: b)** Acceptance of the single and double lepton trigger for  $H \rightarrow ZZ \rightarrow lljj$  ( $m_H=800$  GeV).

### SUSY higgs

In the case of supersymmetric models there is more than one higgs particle. The SM higgs triggers are all used for SUSY higgs searches as well, but new decay channels can be used to increase sensitivity. They are listed in Table 2.2. SUSY higgs decaying to  $\tau\bar{\tau}$  and  $b\bar{b}$  pairs is an important signal because of enhanced cross sections of these channels. Acceptance for the  $\tau\bar{\tau}$  channel with taus decaying to e or  $\mu$  places strong constraints on trigger thresholds. A dedicated  $\tau$  trigger for hadronic  $\tau$  decays is foreseen to improve efficiency for  $H \rightarrow \tau\bar{\tau}$  searches.

**Table 2.2:** Search for SUSY higgs.

| physics channel                                                                | references                                      | $\mathcal{L}$ | offline cut (GeV/c)                                                              | trigger                          |
|--------------------------------------------------------------------------------|-------------------------------------------------|---------------|----------------------------------------------------------------------------------|----------------------------------|
| $h, H \rightarrow \gamma\gamma$                                                | see SM $H \rightarrow \gamma\gamma$<br>+ 96-102 | H             | $p_T(\gamma) > 40, 40$                                                           |                                  |
| $h \rightarrow ZZ^*$                                                           | see SM $H \rightarrow ZZ^*$<br>+ 96-96          |               |                                                                                  |                                  |
| $h, H \rightarrow ZZ$                                                          | see SM $H \rightarrow ZZ$<br>+ 96-96            |               |                                                                                  |                                  |
| $h, A, H \rightarrow \tau\tau \rightarrow \tau\text{-jet } \tau\text{-jet } X$ | N99/37                                          | L             | $E_T(\tau) > 60$                                                                 | $2\tau$                          |
| $h, A, H \rightarrow \tau\tau \rightarrow \ell^\pm \tau\text{-jet } X$         | 12.2.4, 93-98<br>93-103, 96-29<br>N97/2, N97/16 | L             | $p_T(\ell) > 10-40, \text{isol.}$<br>$E_T^{miss} > 20-30$<br>$E_T^{jet} > 40-80$ | $e, \mu_i, \tau$                 |
| $h, A, H \rightarrow \tau\tau \rightarrow e\mu X$                              | 12.2.4, 93-84<br>N98/19                         | L,H           | $p_T(e) > 20$<br>$p_T(\mu) > 20$                                                 | $e, \mu_i$<br>$e\mu_i$           |
| $t \rightarrow H^\pm b, \quad H^\pm \rightarrow \tau\nu$                       | 12.2.5, 92-48<br>94-233                         | L             | $p_T(\ell) > 20, \text{isol.}$<br>$p_T(\mu) > 7, \text{b-tag}$                   | $e, \mu_i$<br>$2\mu, e\mu, \tau$ |
| $h, A, H \rightarrow \mu\mu$                                                   | 12.2.6-7, 94-182<br>N98/39                      | L,H           | $p_T(\mu) > 10, 10$                                                              | $2\mu_i$                         |
| $h \rightarrow b\bar{b}$                                                       | CR98/5                                          | L             | $E_T^{jet} > 40, E_T^{miss} > 400$                                               | 4 jets, $E_T^{miss}$             |
| $A \rightarrow Zh \rightarrow \ell\ell b\bar{b}$                               | (12.2.8)<br>96-49                               | L             | $p_T(e) > 20, 20$<br>$p_T(\mu) > (5, 5) 10, 10$<br>$E_T^{jet} > 20$              | $2e, 2\mu_i$<br>$e\mu$           |
| $Wh, Zh, Hh \rightarrow (\ell)\ell b\bar{b}$                                   | 12.2.9                                          |               | W, Z, t                                                                          | $2e, 2\mu_i, e\mu$               |
| $H \rightarrow WW \rightarrow \ell^+\nu \ell^-\bar{\nu}$                       | see SM H                                        |               |                                                                                  |                                  |

See also general reports: 93-122, N97/57.

### Supersymmetric particles

If supersymmetry is indeed realized in nature, the zoo of new particle species will keep us busy for many years, discovering them one by one and studying their properties. The results of those studies will also have cosmological implications, as the lightest SUSY particle can constitute a significant fraction of the dark matter in the universe. The channels already studied by simulation in CMS are listed in Table 2.3. Complicated cascade decays will create many hard leptons, which are very useful for triggering. Squarks will produce numerous jets. Neutralinos and gravitinos might be detected through missing energy  $E_T^{miss}$ .

**Table 2.3:** Search for SUSY partners

| physics channel                                                                                | references                                                                       | $\mathcal{L}$ | offline cut (GeV/c)                                                  | trigger                                                   |
|------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------|---------------|----------------------------------------------------------------------|-----------------------------------------------------------|
| $\tilde{g}\tilde{g}, \tilde{q}\tilde{q} \rightarrow 1\text{-}4 \ell \chi_1^0 2\text{jets}$     | 95-90, 95-91, 94-318<br>96-22, 96-95, 96-103<br>N97/15, N97/16<br>N98/73, N99/18 | L,H           | $p_T(\ell) > 10\text{-}20$<br>$E_T^{miss} > 100$<br>$E_T^{jet} > 40$ | 2e, $2\mu_i$ , e $\mu$<br>$E_T^{miss}$<br>2 jets          |
| $\tilde{q}\tilde{q} \rightarrow 4 \text{jets}$                                                 | N97/67                                                                           | L,H           | $E_T^{jet} > 100, 100, 100, 100$                                     | 4 jets                                                    |
| $\tilde{q}\tilde{g} \rightarrow \chi_i^0 \rightarrow \chi_j^0 h$<br>$h \rightarrow b\bar{b}$   | N97/70                                                                           | L,H           | $E_T^{miss} > 100$<br>$E_T^{jet} > 20, 20, 20, 20$                   | $E_T^{miss}$<br>4 jets                                    |
| $\tilde{g}, \tilde{q}, \chi, h \rightarrow b\text{-jets}/\tau\text{-jets}$                     | N99/35                                                                           | L,H           | $E_T^{jet} > 50$                                                     | N jets, N $\tau$ 's                                       |
| $\tilde{\ell}\tilde{\ell} \rightarrow 2\text{-}3 \ell \chi_1^0$ 's                             | 96-59                                                                            | L,H           | $p_T(\ell) > 20$<br>$E_T^{miss} > 50$                                | 2e, $2\mu_i$ , e $\mu$                                    |
| $\tilde{\ell}\tilde{\ell} \rightarrow \chi_1^0 \chi_1^0$                                       | N98/40                                                                           | H             | $p_T(\ell) > 20, 20, E_T^{miss} > 50$                                | 2e, $2\mu_i$                                              |
| $\chi_1^0 \rightarrow 3\ell$                                                                   | N99/53                                                                           | H             | $p_T(\mu) > 10, p_T(e) > 20$<br>$E_T^{jet} > 50, 50$                 | 2e, $2\mu_i$ , e $\mu_i$<br>2 jets                        |
| $\chi_2^0 \rightarrow \ell^+ \ell^- \chi_1^0$<br>$\chi_2^0 \rightarrow \tau^+ \tau^- \chi_1^0$ | N98/85                                                                           | L,H           | $p_T(\ell) > 10, E_T^{miss} > 150$<br>$E_T^{jet} > 60, 60, 60$       | 2e, $2\mu_i$ , e $\mu$ , 2 $\tau$<br>3 jets, $E_T^{miss}$ |
| $\chi_2^0 \chi_1^\pm \rightarrow \ell \ell \chi_1^0 \ell' \nu \chi_1^0$                        | N97/7, N97/65<br>N97/69, N97/94                                                  | L,H           | $p_T(\ell) > 15$                                                     | 2e, $2\mu_i$ , e $\mu$                                    |
| $\chi_1^0 \chi_1^0 \rightarrow \tilde{G}\gamma \tilde{G}\gamma$                                | N97/79, CR99/19                                                                  | L,H           | $p_T(\gamma) > 40, 40$<br>$E_T^{miss} > 100$                         | 2e, $E_T^{miss}$                                          |

See also general reports: 93-125, 95-66, 96-58, 96-65, CR97/9, CR97/12, CR97/19, N98/6, CR98/13.

### Alternative models and exotica

The Higgs mechanism is not the only possible explanation of the masses of fundamental fermions. Alternative models often predict new particles, like additional gauge bosons  $W'$  and  $Z'$ . A few examples are given in Table 2.4. The new particles are expected to be as heavy as several hundred  $\text{GeV}/c^2$  and to produce very hard leptons, which are easy to trigger on.

**Table 2.4:** Search for exotic particles

| physics channel                                                         | references        | $\mathcal{L}$ | offline cut (GeV/c)                          | trigger                |
|-------------------------------------------------------------------------|-------------------|---------------|----------------------------------------------|------------------------|
| $VV$ scattering                                                         | 12.1.9, 94-276    | H             | $p_T(W, Z) > 300$                            | 2e, $2\mu_i$ , e $\mu$ |
| $W' \rightarrow \ell\nu$                                                | LOI 8.1.3         | H             | $p_T(\ell) > 100$                            | e, $\mu_i$             |
| $W' \rightarrow WZ \rightarrow \ell^\pm \nu \ell^\mp \ell^\pm \ell^\mp$ | LOI 8.1.3, 93-57  | H             | $p_T(\ell) > 100, 100, 100$                  | 2e, $2\mu_i$ , e $\mu$ |
| $Z' \rightarrow \ell\ell$                                               | LOI 8.1.3, 93-107 | L,H           | $p_T(\ell) > 20, 20$                         | 2e, $2\mu_i$           |
| leptoquarks LQ<br>(scalar, $\sim 1.5$ TeV)                              | CR96/3, N99/27    | H             | $p_T(\ell) > 40, 40$<br>$E_T^{jet} > 40, 40$ | 2e, $2\mu_i$           |
| compositeness: $Z \rightarrow \gamma\gamma\gamma$                       | 94-188            | H             | $p_T(\gamma) > 20, 20, 10$                   | 2e                     |
| compositeness: $\gamma^*/Z \rightarrow e^+ e^-$                         | N99/75            | L,H           | $p_T(e) > 25, 25$                            | 2e                     |
| technicolor $\rho_T, \omega_T$                                          |                   | H             |                                              | 2e, $2\mu_i$ , e $\mu$ |

### b-quark physics

b-quark physics appears to be the most challenging task for the L1 trigger. Leptons from b decays are very soft and their spectrum falls rapidly with  $p_T$ . Therefore the trigger thresholds should be as low as possible to preserve relatively high acceptance. This is clearly seen in Table 2.5. It is foreseen that b-physics studies will be possible only in the low luminosity phase of the LHC. The muon trigger plays a more important role in these studies than the electron one, because it can operate with lower  $p_T$  thresholds.

**Table 2.5:** Study of the b-quark physics

| physics channel                                                                                                                    | references                                                | $\mathcal{L}$ | offline cut (GeV/c)                                 | trigger                   |
|------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------|---------------|-----------------------------------------------------|---------------------------|
| $B_d^0 \rightarrow J/\psi K_s^0 \rightarrow \ell^+ \ell^- \pi^+ \pi^-$<br>$b \rightarrow \mu_{tag}$ or $b \rightarrow e_{tag}$     | 12.4.2, 12.4.4, 93-69<br>94-193, 96-105<br>96-116, 96-117 | L             | $p_T(\mu) > 2-4, 2-4, 0$<br>$p_T(e) > 5, 5, 2$      | $2\mu$<br>$e_b \mu$       |
| $B_d^0 \rightarrow J/\psi K_s^0 \rightarrow \ell^+ \ell^- \pi^+ \pi^-$<br>with self-tagging or b-jet tagging                       | 95-39<br>N00/8                                            | L             | $p_T(\mu) > 2-4, 2-4$<br>$p_T(e) > 5, 5$            | $2\mu, e_b \mu$<br>$2e_b$ |
| $B_d^0 \rightarrow J/\psi K_s^0 \rightarrow \ell^+ \ell^- \pi^+ \pi^-$<br>with $\Lambda$ tagging                                   | 94-189                                                    | L             | $p_T(\mu) > 2-4, 2-4$<br>$p_T(e) > 5, 5$            | $2\mu, e_b \mu$<br>$2e_b$ |
| $B^\pm \rightarrow J/\psi K^\pm, b \rightarrow \mu_{tag}$                                                                          | 12.4.5                                                    | L             | $p_T(\mu) > 2-4, 2-4, 0$                            | $2\mu$                    |
| $B_d^0 \rightarrow J/\psi K^{*0}, b \rightarrow \mu_{tag}$                                                                         | 94-237, 96-105                                            | L             | $p_T(\mu) > 2-4, 2-4, 0$                            | $2\mu$                    |
| $B_s^0 \rightarrow J/\psi \phi, b \rightarrow \mu_{tag}$                                                                           | N97/72, N99/25                                            | L             | $p_T(\mu) > 2-4, 2-4, 0$                            | $2\mu$                    |
| $B_d^0 \rightarrow \pi^+ \pi^-$<br>$b \rightarrow \mu_{tag}$ or $b \rightarrow e_{tag}$                                            | 12.4.3-4, 94-114<br>94-328                                | L             | $p_T(\mu) > 6.5$<br>$p_T(e) > 10$<br>$p_T(\pi) > 5$ | $\mu$<br>$e_b$            |
| $B_s^0 \rightarrow D_s^{(*)\pm} \mu X$                                                                                             | 12.4.5                                                    | L             | $p_T(\mu) > 10$                                     | $\mu$                     |
| $B_d^0 \rightarrow D^{*\pm} \mu X$                                                                                                 | 94-184, N98/82                                            | L             | $p_T(\mu) > 10$                                     | $\mu$                     |
| $B_s^0 / \bar{B}_s^0 \rightarrow D_s^\mp, D_s^\mp \rightarrow \phi \pi^\mp$<br>$\phi \rightarrow K^+ K^-, b \rightarrow \mu_{tag}$ | 12.4.6, 93-117<br>94-183, 94-184                          | L             | $p_T(\mu) > 10$                                     | $\mu$                     |
| $B_s^0 \rightarrow \mu^+ \mu^-$                                                                                                    | 12.4.7, 94-186, N99/39                                    | L             | $p_T(\mu) > 4.3, 4.3$                               | $2\mu$                    |
| $\Lambda_b \rightarrow J/\psi \Lambda$                                                                                             | 94-227                                                    | L             | $p_T(\mu) > 2-4, 2-4$                               | $2\mu$                    |
| $\Xi_b \rightarrow J/\psi \Xi$                                                                                                     |                                                           |               |                                                     |                           |
| $\Lambda_b \rightarrow \Lambda_c^+ \pi^- \rightarrow p K^+ \pi^- \pi^-$                                                            | 94-227                                                    | L             |                                                     |                           |
| $\Lambda_b \rightarrow \Lambda_c^+ \pi^- \rightarrow p K^0 \pi^-$                                                                  |                                                           |               |                                                     |                           |

See also general reports: 94-229, 95-10, 95-178, 96-139, CR96/2, CR96/5.

### t-quark physics

The LHC is a top quark factory. Even at the initial luminosity of  $10^{33} \text{ cm}^{-2} \text{s}^{-1}$  the  $t\bar{t}$  pairs will be produced copiously, at a rate of one per minute. The rates of muons from top events are shown in Fig. 2.4 as diamonds. One can see from Table 2.6 that triggering is rather easy. An interesting case is the last-but-one channel in the table. It offers the most precise measurement of the top mass, because of the  $J/\psi$  constraint. However, muons from  $J/\psi$  are very soft and the trigger threshold should be as low as possible. In fact this is the only channel which may require a three-lepton trigger.

**Table 2.6:** Study of the top quark

| physics channel                                                                                          | references         | $\mathcal{L}$ | offline cut (GeV/c)                           | trigger               |
|----------------------------------------------------------------------------------------------------------|--------------------|---------------|-----------------------------------------------|-----------------------|
| $t\bar{t} \rightarrow W_{\ell\nu}^{\pm} W_X^{\mp}$                                                       | LOI 8.1.4<br>92-34 | VL            | $p_T(\ell) > 50$<br>$E_T^{jet} > 50$          | e, $\mu_i$            |
| $t\bar{t} \rightarrow W_{\ell\nu}^{\pm} W_X^{\mp} b/\bar{b}_{\ell}$                                      |                    | L             | $p_T(\ell) > 40, 15$<br>$E_T^{jet} > 30$      | 2e, 2 $\mu$ , e $\mu$ |
| $t\bar{t} \rightarrow W_{\ell\nu}^{\pm} W_X^{\mp} b_{\ell} \bar{b}_{\ell}$                               | 93-73<br>93-118    | L             | $p_T(\ell) > 30, 4, 4$<br>$E_T^{jet} > 30$    | 2e, 2 $\mu$ , e $\mu$ |
| $t\bar{t} \rightarrow W_{\ell\nu}^{\pm} W_X^{\mp} b/\bar{b}_{J\psi \rightarrow \ell\ell}$                |                    | L,H           | $p_T(\ell) > 30, 4, 4$<br>$E_T^{jet} > 30$    | 2e, 2 $\mu$ , e $\mu$ |
| $t\bar{t} \rightarrow W_{\ell\nu}^{\pm} W_X^{\mp} \bar{b}_{\ell} b/\bar{b}_{J\psi \rightarrow \ell\ell}$ | N99/65             | H             | $p_T(\ell) > 15, 4, 4, 4$<br>$E_T^{jet} > 30$ | 3 $\ell$              |
| $\bar{b}t \rightarrow W_{\ell\nu} b$                                                                     | N99/48             | L             | $p_T(\ell) > 20$<br>$E_T^{jet} > 15$          | e, $\mu_i$<br>jet     |

### Minimum bias, QCD and Standard Model physics

This physics is important for two reasons. First, tests of the Standard Model are interesting in themselves. Any deviation, if observed, can give us a “window on new physics”. Basic triggers are listed in Table 2.7. Second, Standard Model physics is a background to any physics process we are looking for. It determines all trigger thresholds and rates.

**Table 2.7:** Study of minimum bias, QCD and Standard Model physics

| physics channel          | trigger                 |
|--------------------------|-------------------------|
| “soft physics”           | min. bias               |
| SM tests: WZ, W $\gamma$ | 2e, 2 $\mu_i$ , e $\mu$ |
| inclusive W              | e, $\mu_i$              |
| inclusive Z              | 2e, 2 $\mu_i$           |

Many of the expected new particles decay to W or Z bosons. Therefore the inclusive W and Z production can be considered as benchmark processes for the trigger. The rates of muons from W and Z events are shown in Fig. 2.4 as triangles. They should be compared to “minimum bias” rates, indicated by squares. By “minimum bias” muon rates we understand here the rate of muons created by decays of u, d, s, c, and b quarks. Several important observations can be made from Figs. 2.4 and 2.5. First, the  $p_T^{\text{cut}}$  dependence of the rate is very strong. Therefore, changing the  $p_T^{\text{cut}}$  can be a very effective tool for controlling the trigger rate. Second, the double muon rate is two orders of magnitude lower than the single muon rate for the same threshold. Thus, the trigger  $p_T$  threshold can be much lower for processes producing more than one muon. Third, the muon rate below  $p_T = 10$  GeV/c is dominated by “minimum bias” physics. All these observations have an important impact on the design of the muon trigger, which will be described in the next chapters.



**Fig. 2.4:** Single muon rates ( $|\eta| < 2.4$ ) defined as a number of events with at least one muon with  $p_T$  above a given threshold  $p_T^{\text{cut}}$  [2.7].



**Fig. 2.5:** Double muon rates ( $|\eta| < 2.4$ ) defined as a number of events with at least two muons with  $p_T$  above a given threshold  $p_T^{\text{cut}}$ . "MB 2mu" stands for events with both  $\mu$  coming from hadron decays. "MB mix" denotes events with second  $\mu$  coming from top, W, Z or Drell-Yan. [2.7]

### 2.1.4 Trigger Requirements

**Table 2.8:** Particles to be studied at LHC.

|                                              | light     | medium               | heavy                               |
|----------------------------------------------|-----------|----------------------|-------------------------------------|
| mass ( $\text{GeV}/c^2$ )                    | $<< 100$  | $\sim 100$           | $>> 100$                            |
| particle                                     | b-quark   | t, W, Z, light higgs | heavy higgs, Z', W', SUSY particles |
| luminosity ( $\text{cm}^{-2}\text{s}^{-1}$ ) | $10^{33}$ | $10^{33}, 10^{34}$   | $10^{34}$                           |

Particles discussed in the previous chapter can be divided into three classes, as shown in Table 2.8. Each class has different requirements for the trigger. It has been shown [2.8] that medium and heavy particles (see Table 2.8) can be effectively recognized by applying a logical *OR* of the following conditions:

- single  $l^\pm$  or  $\gamma$  with  $p_T > 60 \text{ GeV}/c$ ,
- two  $l^\pm$  or  $\gamma$  with  $p_T > 15 \text{ GeV}/c$ ,
- $E_T^{\text{miss}} > 150 \text{ GeV}$ .
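The logical *OR* above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Event` container and its field names are hypothetical, leptons and photons are treated interchangeably when counting objects, and only the three thresholds are taken from the list above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    """Hypothetical event summary; the field names are illustrative only."""
    lepton_photon_pt: List[float] = field(default_factory=list)  # GeV/c
    missing_et: float = 0.0                                       # GeV


def passes_medium_heavy_selection(event: Event) -> bool:
    """Logical OR of the three conditions quoted in the text."""
    n_above_60 = sum(pt > 60.0 for pt in event.lepton_photon_pt)
    n_above_15 = sum(pt > 15.0 for pt in event.lepton_photon_pt)
    single_object = n_above_60 >= 1       # one l or gamma with pT > 60 GeV/c
    double_object = n_above_15 >= 2       # two l or gamma with pT > 15 GeV/c
    large_met = event.missing_et > 150.0  # missing ET > 150 GeV
    return single_object or double_object or large_met


print(passes_medium_heavy_selection(Event([20.0, 16.0])))        # two objects above 15
print(passes_medium_heavy_selection(Event([20.0, 10.0], 100.0)))  # none of the conditions
```

An event is kept as soon as any one of the conditions holds, so the single-object and missing-$E_T$ conditions can catch events in which the second object falls outside the acceptance.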

The rate of processes selected by these criteria is dominated by standard physics background (Table 2.9). This does not include instrumental background and therefore the L1 Trigger output rate can be much higher. However, the instrumental background should be eliminated by higher trigger levels, and one can consider the rate of 100 Hz as a first estimate of needed mass storage (e.g. tape drives) capacity.

**Table 2.9:** Standard physics background at LHC for  $L = 10^{34} \text{ cm}^{-2}\text{s}^{-1}$ .

| condition                             | process                                                     | rate  |
|---------------------------------------|-------------------------------------------------------------|-------|
| 1 $\gamma$ of $E_T > 60 \text{ GeV}$  | $\text{jet} \rightarrow \pi^0 \rightarrow \gamma\gamma$     | 10 Hz |
| 2 $\gamma$ of $E_T > 15 \text{ GeV}$  | $\text{jets} \rightarrow \pi^0\text{'s} \rightarrow \gamma\gamma$ | 10 Hz |
| 1 $l$ of $p_T > 60 \text{ GeV}/c$     | $W \rightarrow l, \text{ jet} \rightarrow l$                | 10 Hz |
| 2 $l$ of $p_T > 15 \text{ GeV}/c$     | $Z \rightarrow ll$                                          | 20 Hz |
| $E_T^{\text{miss}} > 150 \text{ GeV}$ | QCD jets                                                    | 10 Hz |

Requirements for the L1 Trigger should be derived from the tables of the previous chapter. One can see that most of the interesting physics processes produce at least two trigger objects. Only a very few channels require single-object triggers *per se*. Those are:

- $h, A, H \rightarrow l^\pm \tau\text{-jet } X$
- $B_d^0 \rightarrow \pi^+ \pi^-$  with  $b \rightarrow \mu_{\text{tag}}$  or  $b \rightarrow e_{\text{tag}}$
- $\bar{B}_S^0 / B_S^0 \rightarrow D_S^\pm, D_S^\pm \rightarrow \phi \pi^\pm, \phi \rightarrow K^+ K^-, b \rightarrow \mu_{\text{tag}}$
- inclusive  $W$

In the first two channels one can still try to apply multi-object triggers looking at the  $\tau$ -jet or treating the  $\pi^+\pi^-$  pair as a kind of narrow jet.

The fact that multi-object triggers are of primary importance at LHC has very substantial implications for the principle of the trigger operation. Different combinations of objects may require different trigger thresholds. Therefore one should avoid any explicit cut on single objects at the level of the Muon or Calorimeter Trigger. The purpose of these triggers is to identify objects, estimate their  $p_T$  or  $E_T$  and send them to the Global Trigger. The Global Trigger is the only place where the objects are combined and the cuts are applied depending on a given combination.

Single-object triggers are used mainly to recover the multi-object events which were not recognized by the multi-object triggers because of incomplete acceptance. Therefore, the criteria on their thresholds are not very strict. The actual working point should be chosen as a trade-off between efficiency and rate. A reasonable upper limit is about 100 GeV/c. Beyond this point the efficiency for various heavy objects is significantly degraded (see Section 2.1.3). The useful lower limit for  $\mu/e/\gamma$  at  $L = 10^{34} \text{ cm}^{-2}\text{s}^{-1}$  is about 20 GeV/c. Below this value one cannot further improve the efficiency for objects like W, Z or heavier, whereas the rate is dominated by leptons from quark decays (except the top quark). At this point, the rate of every single object is of the order of a kHz (see e.g. Fig. 3.11). Adding all the channels together and leaving some room for instrumental background, one can expect a total L1 rate of the order of  $10^4$  Hz. Thus, in order to have some safety margin, the Higher Level Trigger (HLT) should be able to receive  $\sim 10^5$  Hz of events. The exact requirement chosen for the CMS startup is 75 kHz.

One can summarize this section with the following list of requirements:

- muon and calorimeter L1 recognize objects and estimate their  $p_T$  or  $E_T$ ;
- cuts are applied by the Global L1 Trigger;
- expected thresholds for photons, electrons and muons are as shown in Table 2.10;
- input of the HLT should be able to accept 75 kHz of events.

**Table 2.10:** Expected L1 thresholds in GeV/(c).

| luminosity                             | $e/\gamma$ | $2 e/\gamma$ | $e_b$     | $2 e_b$  | $\mu$     | $2 \mu$   |
|----------------------------------------|------------|--------------|-----------|----------|-----------|-----------|
| $10^{33} \text{ cm}^{-2}\text{s}^{-1}$ | 15-40      | $\sim 10$    | $\sim 10$ | $\sim 5$ | $\sim 10$ | $\sim 5$  |
| $10^{34} \text{ cm}^{-2}\text{s}^{-1}$ | 20-100     | 15-20        | —         | —        | 20-100    | $\sim 10$ |

## 2.1.5 Background

### Calorimeter Trigger background

Electron/photon trigger background is dominated by jets fragmenting into high  $p_T$  neutral pions. A large fraction of this background is suppressed by the H/E and isolation vetoes. However, given the large jet production rate at LHC energies, this background dominates over true electrons produced in heavy flavor decays and direct photon production. Lack of high precision in jet fragmentation models and parton distribution functions, particularly at LHC energies, results in a large uncertainty in this background. Therefore, we require multiple controls and programmability of isolation thresholds to tune the backgrounds online. The background for the  $\tau$  trigger is due to jets fragmenting primarily to just one or two particles that carry most of the jet  $p_T$ , yielding a very narrow jet. Uncertainties on these backgrounds are also large. Therefore, we adopt a large safety margin, requiring that the simulated rates should be a factor of 3 less than the maximum output rate of the trigger system.

The background for jet trigger is completely dominated by mismeasured jets. Due to the finite resolution of the calorimeter system, lower  $E_T$  jets get promoted to higher  $E_T$  causing a significant background. At high luminosities accidental overlap of jets produced in multiple interactions can also lead to lower  $E_T$  jets passing the jet thresholds. The background due to multiple interactions is more important for multi-jet triggers where various jets may come from different primary interactions. At higher levels of trigger this background can be reduced by requiring vertex reconstruction. However, at L1 both these backgrounds are irreducible.

The background for missing  $E_T$  trigger is caused by mismeasurement of missing  $E_T$  due to calorimeter energy resolution, limited acceptance of the calorimeter, geometrical problems, holes due to broken channels, bending of charged particles in high magnetic field, use of integer scale  $E_T$  and  $\phi$  look-up-tables in calculations, etc. In order to limit this background rate from QCD events the trigger energy scales and angular resolution are optimized.

### Muon Trigger background

It can be seen from Fig. 2.4 that the total rate of prompt muons in CMS exceeds  $10^6$  Hz. This is three orders of magnitude above the total L1 output rate limit. Therefore the most severe Muon Trigger background consists of real prompt muons with overestimated  $p_T$ . The second most dangerous background consists of muons from hadron decays. The fact that they often do not point to the vertex can be used to eliminate them, but on the other hand it can confuse the momentum measurement. Similar problems can be caused by beam halo muons. In principle cosmic muons could also contribute to the trigger rate, but in practice this contribution is negligible.

False triggers can be caused by hadronic punch-through giving track segments in muon stations. The single hit rate in muon chambers is dominated by thermal neutrons producing photons and electrons. Isolated single hits cannot affect the trigger, but a high rate of them can disturb trigger algorithms. A last, but not least, source of background is energetic muons emitting photons, hard  $\delta$ -electrons and  $e^+e^-$  pairs. Electromagnetic showers created in muon stations can spoil pattern recognition and momentum measurement.

All these background sources have to be taken into account when designing the trigger. They are discussed in detail in Section 8.4.

## 2.2 Calorimeter Trigger Requirements

The CMS Calorimeter Trigger has to fulfill the following requirements:

- Identify in the calorimeters the following objects:
  - the four most energetic isolated electron/photon candidates in  $|\eta| < 2.5$ ;
  - the four most energetic non-isolated electron/photon candidates in  $|\eta| < 2.5$ ;
  - the four most energetic jets in the central region,  $|\eta| < 3$ ;
  - the four most energetic jets in the forward region,  $3 < |\eta| < 5$ ;
  - the four most energetic tau-like jets in  $|\eta| < 2.5$ .
- Provide a measurement of the transverse energy and position information for the objects identified.
- Identify the bunch crossing from which each object originates.
- Count the number of jets fulfilling several programmable energy and position cuts.
- Measure the missing transverse energy and the total transverse energy in the calorimeters using the complete system, i.e.  $|\eta| < 5$ .
- Flag the quiet regions in the calorimeters and the regions with an energy deposit compatible with a minimum ionizing particle for use by the muon trigger.
- Compute the quantities listed above in pipeline mode with an input frequency of 40 MHz and with a latency between the collision time and the input of data to the Global Trigger smaller than 85 bunch crossings. No dead time is allowed.
- Create histograms for luminosity monitoring, accumulated over several crossings and accessible by the detector control system at intervals of a few seconds.
- Calorimeter Trigger  $E_T$  scales should be programmable such that the LSB values allow tuning of parameters for electron/photon triggers while keeping the maximum values high enough that the single tower saturation probability is very low.
- Calorimeter Trigger should provide the selectivity and the energy resolution necessary to keep the total output rate of the calorimeter triggers below 12.5 kHz at high luminosity, as defined in Section 1.2.3.

Calorimeter Trigger rate should be tunable by adjusting trigger cutoffs and programmable parameters. As a benchmark we require better than 90% efficiency for top decays to electron and jets and H ( $115 \text{ GeV}/c^2$ ) decays to two photons at high luminosity. In addition, for  $\tau$ , jet and  $E_T^{\text{miss}}$  triggers we require high efficiency for SUSY higgs masses of  $200 \text{ GeV}/c^2$  and sparticle masses of  $300 \text{ GeV}/c^2$  at high luminosity.

## 2.3 Muon Trigger Requirements

The basic tasks of the CMS Muon Trigger are:

- muon identification,
- transverse momentum measurement,
- bunch crossing identification.

It has to fulfill the following requirements.

**Geometrical coverage: up to  $|\eta|=2.4$ ,** in order to cover the entire area of the muon system.

**Latency:  $< 3.2 \mu\text{s}$ .** Total trigger processing, including  $2 \times 90 \text{ m}$  cables ( $0.9 \mu\text{s}$ ) to the control room, should stay within the length of the tracker pipelines equal to 128 bunch crossings. This implies that the trigger algorithms cannot use full information available from the detectors.
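The latency budget can be checked with simple arithmetic. The sketch below assumes the 25 ns bunch-crossing period (40 MHz) and a cable propagation delay of 5 ns/m. The latter is not stated in the text; it is the value that reproduces the quoted 0.9 µs for 2 × 90 m and should be read as an assumption.

```python
BUNCH_SPACING_NS = 25.0      # LHC bunch-crossing period (40 MHz)
PIPELINE_DEPTH_BX = 128      # tracker pipeline length quoted in the text
CABLE_LENGTH_M = 2 * 90.0    # to the control room and back
CABLE_DELAY_NS_PER_M = 5.0   # assumption: typical cable propagation delay

# Total time available before the tracker pipelines overflow
total_latency_us = PIPELINE_DEPTH_BX * BUNCH_SPACING_NS / 1000.0
# Time spent purely on signal transport
cable_delay_us = CABLE_LENGTH_M * CABLE_DELAY_NS_PER_M / 1000.0
# What remains for the trigger electronics and algorithms
processing_budget_us = total_latency_us - cable_delay_us

print(f"pipeline depth : {total_latency_us:.1f} us")
print(f"cable delay    : {cable_delay_us:.1f} us")
print(f"left for logic : {processing_budget_us:.1f} us")
```

Under these assumptions more than a quarter of the total latency is consumed by cables alone, which illustrates why the trigger algorithms cannot use the full detector information.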

**Trigger dead time: not allowed.** Every bunch crossing has to be processed in order to maintain high efficiency crucial for many physics channels with low cross section.

**Design output rate:**  $< 12.5 \text{ kHz}$  for luminosities  $< 10^{34} \text{ cm}^{-2}\text{s}^{-1}$ , as described in Section 1.2.3. About 8 kHz is assumed for the single muon trigger. This implies a rejection factor of  $\sim 1:10000$  at the highest luminosity.

**Background rejection: trigger rate due to background (both physics and instrumental) should not exceed the rate of prompt muons from heavy quark decays.** This is necessary to maintain the rejection factor stated above. The prompt muon rate is irreducible except for channels where the isolation criterion can be applied (see below).

**The fraction of ghosts must not exceed 0.5 %.** Ghosts are fake muon candidates usually created when different trigger elements receive hits or track segments of the same, real muon. Resulting false dimuon trigger rate should not exceed the rate of real muon pairs, which is about 0.5 % of single muon rate for the same  $p_T$  threshold.

**Low  $p_T$  reach: should be limited only by muon energy loss in the calorimeters.** It is equal to about 4 GeV/c in the barrel and it decreases with  $|\eta|$  down to  $\sim 2$  GeV/c. This is required mainly by b-quark physics at  $L = 10^{33} \text{ cm}^{-2}\text{s}^{-1}$ .

**The highest possible  $p_T$  cut: ~50-100 GeV/c.** The expected threshold needed to keep the single muon trigger rate below 10 kHz at  $L = 10^{34} \text{ cm}^{-2}\text{s}^{-1}$  is about 25 GeV/c. Uncertainty in the estimates of cross sections and background levels requires a large safety margin. Increasing the threshold from 25 GeV/c to 50-100 GeV/c reduces the rate by an order of magnitude.

**Isolation: transverse energy  $E_T$  deposited in each calorimeter region of  $\Delta\phi \times \Delta\eta = 0.35 \times 0.35$  around a muon is compared with a threshold.** This function is needed to suppress the rate of background and prompt muons from heavy quark decays when triggering on muons not accompanied by jets. This is particularly useful in channels like  $h, A, H \rightarrow \mu\mu$ ,  $h, A, H \rightarrow \tau\bar{\tau}$ ,  $t\bar{t} \rightarrow WW$  and gluino decays.

**Output to the Global Trigger: up to 4 highest  $p_T$  muons in each event.** In principle only 3 muons are necessary for the Global Trigger to perform single- and multi-object cuts, including the three-muon trigger. By delivering 4 muons we reduce the probability that a low  $p_T$  isolated muon will not be selected because of the presence of higher  $p_T$  non-isolated muons. In this way we also reduce the probability of accepting ghosts instead of real muons.
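The selection of the highest $p_T$ candidates can be illustrated with a short Python sketch. The `MuonCandidate` type, its fields and the function name are hypothetical, and a software sort stands in for whatever the hardware sorter actually does.

```python
import heapq
from typing import List, NamedTuple


class MuonCandidate(NamedTuple):
    """Hypothetical L1 muon candidate; the fields are illustrative only."""
    pt: float        # GeV/c
    eta: float
    isolated: bool


def select_for_global_trigger(candidates: List[MuonCandidate],
                              n_max: int = 4) -> List[MuonCandidate]:
    """Return up to n_max candidates with the highest pT, in descending pT order."""
    return heapq.nlargest(n_max, candidates, key=lambda m: m.pt)


muons = [
    MuonCandidate(40.0, 0.3, False),
    MuonCandidate(35.0, -1.1, False),
    MuonCandidate(30.0, 1.8, False),
    MuonCandidate(6.0, 0.9, True),   # low-pT isolated muon
]
print([m.pt for m in select_for_global_trigger(muons, n_max=4)])
print([m.pt for m in select_for_global_trigger(muons, n_max=3)])
```

In this example the low $p_T$ isolated muon is kept when four candidates are forwarded but lost when only three are, which is the effect described above.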

## 2.4 Trigger Efficiency Measurement

### 2.4.1 Electron/Photon and Muon Triggers

The basic tool for measuring the lepton trigger efficiencies is the sample of Z events decaying into leptons and triggered by one of the leptons ( $p_T > 30 \text{ GeV}/c$ ). The acceptance of this trigger for Z events is close to 100%. Analysis of the trigger data for the second lepton allows the single lepton trigger efficiency to be extracted. The same events allow the double lepton trigger efficiency to be measured.

At high luminosity ( $10^{34} \text{ cm}^{-2}\text{s}^{-1}$ ), the final rate of single leptons (isolated) with  $p_T > 30 \text{ GeV}/c$  is 100 Hz, mainly from W lepton decays. Retaining just the tracker and calorimeter data in cones around the leptons, the final data volume and rate is suitable for storage. This electron sample will also be used for the crystal calorimeter calibration.

At low luminosity ( $10^{33}\text{cm}^{-2}\text{s}^{-1}$ ) the rate of Z events decaying into electrons or muons is about 0.5 Hz per class. About one week of running is sufficient to accumulate  $10^5$  Z events per class, which allows the electron or muon trigger efficiency to be determined with a precision of 1-2% in each of one hundred detector regions.
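The quoted precision can be cross-checked with a simple estimate. The sketch below assumes that the $10^5$ Z events are spread evenly over the 100 regions, that the uncertainty is purely binomial, and that 90% of the probe leptons fire the trigger. The last number is an illustrative assumption, not a value from this document.

```python
import math


def efficiency_with_error(n_pass: int, n_total: int):
    """Efficiency from a probe sample with a simple binomial uncertainty."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    eff = n_pass / n_total
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err


n_z_events = 100_000                      # Z events per lepton class (from the text)
n_regions = 100                           # detector regions (from the text)
probes_per_region = n_z_events // n_regions

# Assumption for illustration only: 900 of the 1000 probes pass the trigger.
eff, err = efficiency_with_error(900, probes_per_region)
print(f"efficiency = {eff:.3f} +/- {err:.4f}")
```

With about 1000 probes per region the statistical uncertainty comes out at roughly 1%, which is compatible with the 1-2% quoted above.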

The main source of photons is  $\pi^0$  decays. In order to measure the L1 trigger efficiency for isolated photons we will select jet events with  $E_T > 30\text{ GeV}$  with the L1 jet trigger, using an adequate prescaling factor to reduce the rate to an acceptable level. A sample of isolated photons, adequate for L1 trigger efficiency measurements, is selected by the high-level triggers. At low luminosity, assuming a prescaling of 100 for the L1 jet trigger, we estimate that the final rate of isolated photons with  $E_T > 20\text{ GeV}$  is about 0.5 Hz.

### 2.4.2 Triggering of leptons inside jets

The method described in the previous section provides a measurement of the trigger efficiency for isolated leptons. The efficiencies for leptons from b decays could be different.

At low luminosity, where this measurement is required, we will use the events selected by the single muon trigger ( $p_T > 8\text{ GeV}/c$ ). This trigger has an L1 rate of a few kHz at  $10^{33}\text{cm}^{-2}\text{s}^{-1}$ . This rate is dominated by b decays. About 1-2% of these events have a second muon or electron with  $p_T > 8\text{ GeV}/c$  suitable for efficiency measurements. Loose HLT cuts will provide a sample suitable for off-line study of the trigger efficiency.

Another possibility is to use a sample which requires one jet above 30 GeV at Level 1. Selecting (prescaling) 1% of the triggers one gets an L1 rate of a few kHz. We estimate that about 1% of these events have a lepton with  $p_T > 8\text{ GeV}$ . Loose lepton selection criteria are used in the farm to select a sizable sample of events suitable for off-line trigger efficiency determination.

### 2.4.3 $\tau$ Trigger

The single lepton trigger mentioned in Section 2.4.1 will also be used to measure the efficiency of the  $\tau \rightarrow \text{hadrons}$  trigger. In this case we trigger on the leptonic decay of one of the taus from Z's and we measure the trigger efficiency for events where the other  $\tau$  decays hadronically. The fraction of  $Z \rightarrow \tau\tau$  events useful for this analysis is around 10%. About one week is needed at low luminosity to accumulate  $10^4$  of such events.

### 2.4.4 Jet Triggers

The minimum-bias L1 trigger prescaled by a large factor (order of  $10^5$ ) will be used to measure the efficiency of the low energy jet trigger. About 2-3% of these events have a jet with  $E_T > 30\text{ GeV}$ . Loose cuts are used in the farm to select an event sample for off-line analysis. In the same fashion, prescaled low energy threshold jet triggers will be used to measure the efficiency of high energy jet triggers.
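Prescaling, as used here and in the previous sections, can be pictured as a counter that accepts one trigger out of every N. The following Python sketch is only a conceptual illustration; the class name is hypothetical and the way prescaling is implemented in the trigger hardware is not described in this section.

```python
class Prescaler:
    """Accept one out of every `factor` triggers (simple counter-based prescale)."""

    def __init__(self, factor: int):
        if factor < 1:
            raise ValueError("prescale factor must be >= 1")
        self.factor = factor
        self._count = 0

    def accept(self) -> bool:
        """Return True for every `factor`-th call, False otherwise."""
        self._count += 1
        if self._count == self.factor:
            self._count = 0
            return True
        return False


# Example: a prescale of 100 applied to 10 000 raw triggers
prescaler = Prescaler(100)
accepted = sum(prescaler.accept() for _ in range(10_000))
print(accepted)
```

A counter-based prescale reduces the rate by exactly the chosen factor without biasing the event content, which is what makes the prescaled samples usable for efficiency measurements.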

### 2.4.5 Missing $E_T$ Trigger

This is a global rather than a local trigger. In consequence the statistics needed to measure the efficiency of this trigger is not an issue. We will use the single lepton (electron+muon) trigger with  $p_T > 30$  GeV/c to select a sample of  $W \rightarrow$ lepton events. At low luminosity these events are produced at a rate of  $\sim 15$  Hz. About 0.1% of them have missing  $E_T$  above 80 GeV and are adequate to measure the missing  $E_T$  trigger efficiency, so that two weeks are needed to accumulate around 5000 events. The trigger efficiency for higher thresholds can be obtained with top events captured by the L1 lepton triggers.

### 2.4.6 Technical Triggers

Trigger efficiency can also be evaluated using prescaled triggers. Since the Calorimeter and Muon Triggers do not apply any cut, but measure the  $E_T$  ( $p_T$ ) of recognized objects, the Global L1 Trigger can accept some fraction of events with very low thresholds. The Muon Trigger has the additional possibility of measuring the efficiency of the DT/CSC system while triggering with the RPC, and *vice versa*.

## 2.5 Requirements for Heavy Ion Runs

The CMS detector will also be used for heavy ion studies [2.9]. The requirements for the trigger are, however, quite different from the pp case. First of all, the LHC bunch structure is different [2.10]. The bunch spacing is 125 ns. The maximal luminosities and collision rates for available ion species are given in Table 2.11. They are calculated for two experiments (ALICE and CMS) taking data at the same time.

**Table 2.11:** Luminosities and collision rates for different ion species [2.11].

|                                              | pp        | O O                 | Ar Ar               | Kr Kr               | Sn Sn               | Pb Pb     |
|----------------------------------------------|-----------|---------------------|---------------------|---------------------|---------------------|-----------|
| A                                            | 1         | 16                  | 40                  | 84                  | 120                 | 208       |
| Z                                            | 1         | 8                   | 18                  | 36                  | 50                  | 82        |
| $\sigma$ [barn]                              | 0.055     | 1.5                 | 3.1                 | 4.5                 | 5.5                 | 7.6       |
| luminosity [ $\text{cm}^{-2}\text{s}^{-1}$ ] | $10^{34}$ | $3.1 \cdot 10^{31}$ | $1.0 \cdot 10^{30}$ | $6.6 \cdot 10^{28}$ | $1.7 \cdot 10^{28}$ | $10^{27}$ |
| av. collision rate [kHz]                     | 550 000   | 46 500              | 3100                | 300                 | 94                  | 7.6       |
| $1\mu$ trigger rate [kHz]                    | 190       | 120                 | 21                  | 5.7                 | 2.9                 | 0.5       |

The luminosity quoted for O-O collisions should be taken with caution. The average collision rate of 46.5 MHz is about 6 times higher than the bunch crossing frequency of 8 MHz. This means an average pileup of 6 O-O collisions per bunch crossing. This is not acceptable for heavy ion studies, and in practice a luminosity lower by an order of magnitude will be used.
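The collision rates of Table 2.11 and the O-O pileup follow from the product of cross section and luminosity. The sketch below reproduces them for the ion species, assuming that every 125 ns bunch slot is filled, so that the crossing frequency is simply the inverse of the bunch spacing.

```python
BARN_CM2 = 1e-24          # 1 barn expressed in cm^2
BUNCH_SPACING_S = 125e-9  # heavy-ion bunch spacing quoted in the text

# species: (cross section [barn], luminosity [cm^-2 s^-1]) from Table 2.11
ion_species = {
    "O O":   (1.5, 3.1e31),
    "Ar Ar": (3.1, 1.0e30),
    "Kr Kr": (4.5, 6.6e28),
    "Sn Sn": (5.5, 1.7e28),
    "Pb Pb": (7.6, 1.0e27),
}

# Assumption: all bunch slots are filled
crossing_rate_hz = 1.0 / BUNCH_SPACING_S


def collision_rate_hz(sigma_barn: float, luminosity: float) -> float:
    """Average collision rate = cross section x luminosity."""
    return sigma_barn * BARN_CM2 * luminosity


for name, (sigma, lumi) in ion_species.items():
    rate = collision_rate_hz(sigma, lumi)
    pileup = rate / crossing_rate_hz
    print(f"{name:6s} rate = {rate / 1e3:9.1f} kHz, mean pileup = {pileup:.3g}")
```

Only O-O gives an average of more than one collision per crossing; for the heavier species the mean pileup is below one.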

Particle density in nucleus-nucleus collisions is much higher than in the pp case. For central Pb-Pb collisions it may reach 8000 charged particles per rapidity unit<sup>1</sup>, to be compared with 5 for pp collisions. For the minimum bias Pb-Pb collision  $dN^\pm/dy$  is expected to be less than 2500.

Resulting data volumes for central Pb-Pb collisions have been estimated [2.12] and are given in Table 2.12. The total event size is expected to be about 1500 kBytes. Assuming a mass storage rate of 100 MBytes/s, one can write to tape  $\sim 70$  events per second.
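The storage estimate can be reproduced from the entries of Table 2.12. The sketch below assumes 1 MByte = 1000 kBytes and uses the rounded total of the table as the event size.

```python
# Sub-detector data volumes for central Pb-Pb events, in kBytes (Table 2.12)
volumes_kbytes = {
    "Pixel barrel": 300,
    "Outer Silicon Tracker": 930,
    "ECAL - 1 time slice": 173,
    "HCAL full": 28,
    "Muon System": 10,
}

STORAGE_BANDWIDTH_MBYTES_PER_S = 100.0  # mass storage rate assumed in the text
QUOTED_EVENT_SIZE_KBYTES = 1500.0       # rounded total from Table 2.12

summed_kbytes = sum(volumes_kbytes.values())
# Assumption: 1 MByte = 1000 kBytes
events_per_second = STORAGE_BANDWIDTH_MBYTES_PER_S * 1000.0 / QUOTED_EVENT_SIZE_KBYTES

print(f"sum of table entries     : {summed_kbytes} kBytes")
print(f"events written per second: {events_per_second:.0f}")
```

The individual entries add up to slightly less than the rounded total, and the resulting rate is in line with the quoted value of about 70 events per second.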



**Fig. 2.6:** Expected particle rates in LHC min. bias Pb-Pb events [2.13].  
The hadron rates are predicted by Monte Carlo or rescaled from CDF and LHC pp data. The muon rates are rescaled from the LHC pp data.

**Table 2.12:** Data volumes for Pb-Pb central events [2.12].

| subdetector                                          | data volume (kBytes) |
|------------------------------------------------------|----------------------|
| Pixel barrel                                         | 300                  |
| Outer Silicon Tracker ( $\lvert\eta\rvert < 1.5$ )   | 930                  |
| ECAL - 1 time slice                                  | 173                  |
| HCAL full                                            | 28                   |
| Muon System ( $\lvert\eta\rvert < 1.5$ )             | 10                   |
| Total                                                | $\sim 1500$          |

<sup>1</sup>. Recent studies suggest  $dN^\pm/dy=2000$  as an upper limit for central Pb-Pb collisions, but we keep  $dN^\pm/dy=8000$  for all predictions quoted in this document, thus having the safety factor of 4.

The entire data flow from the detector Front-Ends (FE) through the Front End Drivers (FED), the Readout Dual Port Memories (RDPM) and the Switch has also been evaluated [2.12]. The most critical element was found to be the bandwidth of the FE→FED and FED→RDPM connections for the Pixel Detector. It limits the L1 output rate to about 5 kHz for central Pb-Pb collisions.

Particle spectra for  $p_T > 1 \text{ GeV}/c$  can be described by a scaling formula

$$\sigma_{AA}^{hard} = A^{2 \cdot 0.95} \cdot \sigma_{pp}^{hard}$$
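To give a feeling for the size of this scaling, the factor $A^{2 \cdot 0.95}$ can be evaluated for the ion species of Table 2.11. The function name below is hypothetical; the exponent and the mass numbers are taken from the formula and the table.

```python
def hard_scaling_factor(mass_number: int, exponent: float = 0.95) -> float:
    """Ratio sigma_AA^hard / sigma_pp^hard = A**(2 * exponent)."""
    return float(mass_number) ** (2.0 * exponent)


# Mass numbers A from Table 2.11
for name, a in [("O", 16), ("Ar", 40), ("Kr", 84), ("Sn", 120), ("Pb", 208)]:
    print(f"{name:2s}  A = {a:3d}  scale factor = {hard_scaling_factor(a):.3g}")
```

Because the exponent is slightly below 2, the hard cross section grows a little more slowly than $A^2$.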

**Table 2.13:** Study of heavy ion physics.

| physics channel                               | references                              | $\mathcal{L}$ | offline cut (GeV/c)     | trigger |
|-----------------------------------------------|-----------------------------------------|---------------|-------------------------|---------|
| $Pb Pb \rightarrow \Upsilon \rightarrow \mu^+ \mu^-$ | 12.5.1, N97/89, N97/95<br>N99/4, N99/52 | –             | $p_T(\mu) > 2\text{-}4$ | $\mu$   |
| $Pb Pb \rightarrow \mu^+ \mu^-$               | N99/16, N99/17                          | –             | $p_T(\mu) > 3.5$        | $\mu$   |
| $A A \rightarrow \gamma\gamma$                | N98/9                                   | –             |                         | 2e      |
| $Pb Pb \rightarrow \gamma jet$                | N98/63, N99/16                          | –             |                         | e, jet  |
| $Pb Pb \rightarrow jet$                       | N98/25                                  | –             | $E_T^{jet} > 40$        | jet     |
| $Pb Pb \rightarrow jets$                      | 12.5.2, N98/25, N99/16                  | –             | $E_T^{jet} > 40$        | 2 jets  |

See also general reports: CR97/15, IN97/32, N98/61, CR99/15.



**Fig. 2.7:** Acceptance for  $\Upsilon \rightarrow \mu^+ \mu^-$  as a function of muon trigger threshold  $p_T^{cut}$ , normalised to  $p_T^{cut} = 4.5 \text{ GeV}$ .

The result of rescaling CDF and LHC pp data is compared to PYTHIA and HIJING Monte Carlo predictions in Fig. 2.6. In this document we use the rescaled LHC prediction, as it is a rather conservative estimate. The physics requirements for the trigger are summarized in Table 2.13. The most critical one is the lowest possible muon threshold. This is crucial for the study of quarkonia production, which is a probe of the quark-gluon plasma. It is illustrated in Fig. 2.7, where the relative  $\Upsilon$  acceptance is shown as a function of the muon threshold. This requirement means that the muon  $p_T$  reach should be limited only by energy loss in the calorimeters, which is about 4 GeV/c in the barrel and goes down to about 2 GeV/c in the endcap.

## References

- [2.1] T. Sjostrand et al., *High Energy Physics Event Generation with PYTHIA 6.1*, **hep-ph/0010017**, LU TP 00-30; T. Sjostrand, Computer Physics Commun. **101** (1997) 232.
- [2.2] ISAJET, F. Paige and S. Protopopescu, in *Supercollider Physics*, p.41, ed. D. Soper (World Scientific, 1986); H. Baer, F. Paige, S. Protopopescu and X. Tata, in *Proceedings of the Workshop on Physics at Current Accelerators and Supercolliders*, ed. J. Hewett, A. White and D. Zeppenfeld, (Argonne National Laboratory, 1993).
- [2.3] CMS Simulation Package CMSIM — Users' Guide and Reference Manual, <http://cmsdoc.cern.ch/cmsim/cmsim.html>
- [2.4] CMS Reconstruction Software: The ORCA Project, **CMS IN-1999/035**.
- [2.5] The Compact Muon Solenoid — Technical Proposal, **CERN/LHCC 94-38**.
- [2.6] The Compact Muon Solenoid — Letter Of Intent, **CERN/LHCC 92-3**.
- [2.7] N. Neumeister et al., *Monte Carlo simulation for High Level Trigger studies in single and di-muon topologies*, **CMS IN 2000/053**.
- [2.8] C. Laurencio, J. Varela, **CMS TN/95-025**.
- [2.9] *Heavy Ion Physics Programme in CMS*, CMS Note in preparation.
- [2.10] *The Large Hadron Collider - Conceptual Design*, **CERN/AC/95-05** (LHC).
- [2.11] D. Brandt, *The LHC with ions from a machine point of view*, talk given at the CMS Heavy Ion Meeting, St. Petersburg, June 2000.
- [2.12] A. Racz, et al., **CMS IN 2000/027**.
- [2.13] G. Wrochna, **CMS Note 1997/089**.

# 3 Calorimeter Trigger Introduction

## 3.1 CMS Calorimetry

### 3.1.1 Electromagnetic Calorimeter

One of the principal CMS design objectives is to construct a very high performance electromagnetic calorimeter (ECAL). A scintillating crystal calorimeter offers excellent performance for energy resolution since almost all of the energy of electrons and photons is deposited within the crystal volume. CMS has chosen lead tungstate crystals which have high density, a small Molière radius and a short radiation length allowing for a very compact calorimeter system.

The CMS electromagnetic calorimeter, shown in Fig. 3.1, consists of 76,832 lead-tungstate crystals equipped with avalanche photodiodes in the barrel (EB) or vacuum phototriodes in the endcap (EE) and associated electronics operating in a challenging environment: a magnetic field of 4T and a radiation dose of 1-2 kGy/year for LHC operation at maximum luminosity.

The barrel crystals have a front face of about  $22 \times 22 \text{ mm}^2$  — which matches well the Molière radius of 22 mm. To limit fluctuations on the longitudinal shower leakage of high-energy electrons and photons, the crystals were chosen with a total thickness of 26 radiation lengths — corresponding to a crystal length of about 23 cm. An R&D programme has shown that radiation affects neither the scintillation mechanism nor the uniformity of the light yield along the crystal. It only affects the transparency of the crystals through the formation of colour centers. This light loss will be monitored by a light-injection system.



**Fig. 3.1:** The CMS Electromagnetic Calorimeter.

In the barrel, the crystals are organized in supermodules containing 1700 crystals, 20 along  $\phi$  times 85 along  $\eta$ . Eighteen supermodules form one half-barrel cylinder. Truncated pyramid-shaped crystals are mounted in a geometry which is off-pointing with respect to the interaction vertex, with a 3° tilt both in  $\phi$  and in  $\eta$ .

The endcap part of the crystal calorimeter covers a pseudorapidity range from 1.48 to 3.0. The mechanical design is based on an off-pointing pseudo-projective geometry using tapered crystals of the same shape and dimensions (30x30 mm<sup>2</sup> rear face). The two endcaps contain 15,632 crystals in total.

The scintillation light from the crystals will be captured by an avalanche photodiode, amplified and digitized. The resulting data are transported off the detector via optical fibres to the readout and trigger systems in the counting room.

### 3.1.2 Hadronic Calorimeter

The Hadronic Calorimeter (HCAL) plays an essential role in the identification and measurement of quarks, gluons, and neutrinos by measuring the energy and the direction of jets and of missing transverse energy flow in events. Missing energy forms a crucial signature of new particles, like the supersymmetric partners of quarks and gluons. For good missing energy resolution, hermetic calorimetry coverage to  $|\eta|=5$  is required. The HCAL will also aid in the identification of electrons, photons and muons in conjunction with the tracker, the electromagnetic calorimeter, and muon systems.

The hadron barrel (HB) and hadron endcap (HE) calorimeters, shown in Fig. 3.2, are sampling calorimeters with 50 mm thick copper absorber plates which are interleaved with 4 mm thick scintillator sheets. Copper has been selected as the absorber material because of its density. The HB is constructed of two half-barrels each of length 4.3 m. The HE consists of two large structures, situated at each end of the barrel detector and within the region of high magnetic field. Because the barrel HCAL inside the coil is not sufficiently thick to contain all the energy of high energy showers, additional scintillation layers (HOB) are placed just outside the magnet coil. The full depth of the combined HB and HOB detectors is approximately 11 absorption lengths. Light emission from the tiles is in the blue-violet. This light is absorbed by the wave shifting fibers which fluoresce in the green. The waveshifted light is conveyed via clear fiber waveguides to hybrid photodiodes (HPDs).

There are two hadronic forward (HF) calorimeters, one located at each end of the CMS detector, which complete the HCAL coverage to  $|\eta|=5$ . The HF detector, situated in a harsh radiation field, is built of steel absorber plates and radiation-resistant quartz fibers, of selected lengths, which are inserted into the absorber plates. The energy of jets is measured from the Cherenkov light signals produced as charged particles pass through the quartz fibers. These signals result principally from the electromagnetic component of showers, which results in good directional information for jet reconstruction. Fibre optics convey the Cherenkov signals to photomultiplier tubes which are located in radiation shielded zones.

## 3.2 Calorimeter Requirements

**Fig. 3.2:** The CMS Hadronic Calorimeter.

The ECAL and HCAL detectors must function as designed in order for the calorimeter trigger system to fulfill its role. The calorimeter trigger system requires digitized $E_T$ values from all its ECAL crystals and HCAL towers for every 25 ns LHC cycle. For each trigger tower in the EB, EE, HB, HE and HF, the Trigger Primitives Generator sums the $E_T$ values from the constituent ECAL crystals or HCAL readout towers to obtain the trigger tower $E_T$ and attaches it to the correct bunch crossing. These two measurement processes are subject to various effects which can compromise their respective performance. These effects can:

- increase the constant term present in the resolution of the  $E_T$  measurement
- worsen the electronics noise term of the very front-end

Examples of effects entering in the first category are the imperfect knowledge of the calibration constant of each electronics channel, deviation from the optimal value of the phase adjustment between the maximum of the signal and the sampling clock and dispersion of the peaking times of the very front end electronics. Effects of algorithms used in the TPG to extract the primitives, the use of integer arithmetic and non-linear compression used in the coding of the results enter the second category.

Resolution degrading effects impair the performance of the calorimeter trigger by worsening the efficiency turn-on of the electron/photon trigger versus $p_T$. A precision of $\pm 2\%$ on the knowledge of the calibration constants is necessary. To limit the loss in resolution, we require less than $\pm 10\%$ dispersion around the mean value of the peaking time of the ECAL very front end electronic channels; this level of dispersion results in $\pm 1\%$ resolution loss for electromagnetic showers of $10 \text{ GeV}$ $E_T$ and above. We also require less than $\pm 2.5 \text{ ns}$ dispersion in the phase adjustment of the sampling clock with the signal maximum, as this already results in a loss of $+1.8\%, -2.1\%$ in resolution.

Noise degrading effects not only influence the resolution of any low amplitude signals, i.e., less than $1 \text{ GeV}$, but also degrade the ability of the TPG to identify the correct bunch crossing of these very low energy showers. This degrades the performance of the electromagnetic isolation criterion as performed by the regional trigger for electron and photon candidates. The same effect also influences the rejection power of the muon isolation and identification criteria. We require a noise level of $30 \text{ MeV}$ or less per ECAL crystal readout sample, to enable the detection of a signal with a $\sim 500 \text{ MeV}$ amplitude in a single strip of five ECAL crystals with $95\%$ efficiency. Raising this noise value to $60 \text{ MeV}$ per sample degrades the rejection power of the electromagnetic isolation. The use of integer arithmetic in computation is responsible for an additional worsening effect on the electronics noise term. This effect results in an effective noise term equal to 225 MeV, for a trigger tower with 25 channels and 30 MeV per readout sample noise.
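For scale, the quoted 225 MeV can be compared with a simple quadrature estimate. This is only a sketch: it assumes that the noise is uncorrelated between the 25 channels of a tower and that a single readout sample per channel contributes, neither of which is stated in the text above.

$$
\sigma_{\text{tower}} \approx \sqrt{N_{\text{ch}}}\;\sigma_{\text{sample}} = \sqrt{25} \times 30 \text{ MeV} = 150 \text{ MeV}
$$

Under these assumptions, the effective 225 MeV term would correspond to a factor of $225/150 = 1.5$ above the pure quadrature sum.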

## 3.3 Calorimeter Trigger Algorithms

The calorimeter trigger receives $E_T$ and fine grain profile information from the calorimeter electronics and finds isolated and non-isolated electrons or photons, $\tau$ candidates, jets and missing $E_T$. The geometry of the calorimeters and their mapping to the trigger electronics and algorithms have been chosen to provide the best performance, i.e., to meet the DAQ requirement for the output rate and to give high efficiencies for discovery physics, while keeping the system size and costs at a reasonable level. The calorimeter trigger algorithms are implemented in the Trigger Primitives Generator (TPG), the Regional Calorimeter Trigger (RCT) and the Global Calorimeter Trigger (GCT). A block diagram of these subsystems and of the other trigger subsystems that they directly interact with is shown in Fig. 3.3 and discussed in the following sections.



**Fig. 3.3:** The calorimeter trigger system overview.

### 3.3.1 Geometry and Definitions

#### Trigger Tower

The trigger tower ( $\eta, \phi$ ) dimension results from a compromise between the background rate of the electron/photon trigger, which increases with the cell size, and the number of trigger channels, which must be as small as possible for cost reasons. In total the CMS calorimeter trigger has 4176 towers, corresponding to 2448, 1584 and 144 towers respectively in the barrel, end-cap and forward calorimeters (Fig. 3.4).



**Fig. 3.4:** Layout of the calorimeter trigger towers in the r-z projection.

Each ECAL half-barrel is divided into 17 towers in $\eta$ and 72 towers in $\phi$, so that the calorimeter trigger tower in the barrel has dimensions $\Delta\eta \times \Delta\phi = 0.087 \times 0.087$. In the barrel the trigger tower is formed by 5x5 crystals.

The ECAL trigger towers in the barrel are divided in strips. Each trigger cell has 5  $\eta$ -strips (one crystal along  $\eta$  and five crystals along  $\phi$ ). The strip information allows for a finer analysis of the lateral energy spread of electromagnetic showers. The strips are arranged along the bending plane in order to collect in one or two adjacent strips almost all the energy of electrons with bremsstrahlung and converted photons (Fig. 3.5).

In the ECAL endcap, where the crystals are arranged in an x-y geometry, the trigger towers do not follow exact $(\eta, \phi)$ boundaries (Fig. 3.6). The trigger tower average $(\eta, \phi)$ boundaries are $\Delta\eta \times \Delta\phi = 0.087 \times 0.087$ up to $\eta \approx 2$. The $\eta$ dimension of trigger towers grows with $\eta$ as indicated in Fig. 3.4 and Table 3.1. The number of crystals per trigger tower varies between 25 at $\eta \approx 1.5$ and 10 at $\eta \approx 2.8$.



**Fig. 3.5:** Calorimeter trigger tower layout in one ECAL half barrel supermodule. The trigger towers are organized in calorimeter regions of 4x4 towers. Tower 17 is integrated with the endcap towers 18, 19 and 20 in a calorimeter trigger region.

In the barrel and in the endcap, the boundaries of ECAL and HCAL trigger towers follow each other. Each trigger tower corresponds to the $(\eta, \phi)$ size of an HCAL physical tower, except for $\eta > 1.74$ where the HCAL tower has twice the $\phi$ dimension of the trigger tower. In this region, the HCAL tower energy is divided equally between the two trigger towers that are contained in it.

The HCAL barrel trigger towers are formed by the sum of the first two longitudinal segments (the Outer HB is not included in the trigger). The endcap towers are formed by two or three segments. In the barrel-endcap transition region, barrel and endcap segments are added together (see Table 3.1).

The trigger segmentation of the forward hadron calorimeter (HF) is not required to have small $\phi$ binning, since this detector does not participate in the electron/photon trigger. However, we do need seamless coverage for jet and missing $E_T$ algorithms. Therefore we keep 18 HF $\phi$ divisions which exactly match the trigger boundaries of the 4x4 trigger tower regions in the HB and HE. As shown in Fig. 3.7, the HF readout towers are combined in $3\eta \times 2\phi$ groups to form coarser trigger towers. The resulting HF segmentation of $4\eta \times 18\phi$ is used in the jet and missing transverse energy trigger. The $\phi$ divisions are exactly four times the size of the HB/HE towers and the $\eta$ divisions are approximately the size of the outer HE divisions. The overlapping jet trigger extends seamlessly to $|\eta|=5$. Missing $E_T$ is computed using $\phi$ divisions of 0.348 for the entire $(\eta, \phi)$ plane.
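The HF grouping described above can be sketched in a few lines. This is an illustration only: the function name `group_hf_towers` and the `[eta][phi]` list layout are assumptions made for the example, not part of the hardware design.

```python
def group_hf_towers(readout_et, n_eta_group=3, n_phi_group=2):
    """Sum a [12 eta][36 phi] readout grid into [4 eta][18 phi] trigger towers.

    The [eta][phi] layout is an assumption of this sketch.
    """
    n_eta, n_phi = len(readout_et), len(readout_et[0])
    return [
        [
            sum(
                readout_et[ie + de][ip + dp]
                for de in range(n_eta_group)
                for dp in range(n_phi_group)
            )
            for ip in range(0, n_phi, n_phi_group)
        ]
        for ie in range(0, n_eta, n_eta_group)
    ]


# One side of one HF: 12 eta x 36 phi readout towers, 1 unit of E_T each.
readout = [[1.0] * 36 for _ in range(12)]
trigger_towers = group_hf_towers(readout)
```

Each trigger tower in this example collects six readout towers, giving the $4\eta \times 18\phi$ segmentation quoted in the text.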

#### Calorimeter Regions

The trigger towers are organized in calorimeter regions, each one formed by 4x4 trigger towers (Fig. 3.5). The HF towers 29-32 in Table 3.1 are themselves treated as regions and their $\Delta\phi$ division matches the 4x4 regions in the barrel and endcap. These calorimeter regions form the basis of the jet and energy triggers. The dimensions of the calorimeter region are well suited to the jet trigger algorithm, which is based on sliding windows of 3x3 calorimeter regions (12x12 trigger towers). The $\eta-\phi$ indexes of the calorimeter regions are used to identify the location of L1 calorimeter trigger objects (electron/photons and jets) in the upper stages of the trigger chain.

**Fig. 3.6:** Calorimeter trigger tower layout in the ECAL endcap.

**Fig. 3.7:** Calorimeter trigger tower layout in the HF (mapping of the two CMS HF calorimeters onto the trigger system HF crate). Readout segmentation: $36\phi \times 12\eta \times 2z \times 2F/B$. Trigger tower segmentation: $18\phi \times 4\eta \times 2F/B$.

**Table 3.1:** Characteristics of the Calorimeter Trigger towers. Towers 1-28 have  $\Delta\phi=0.087$ . HCAL towers 21-28 have  $\Delta\phi=0.174$  and are split into two trigger towers each of  $\Delta\phi=0.087$ . Towers 29-32 have  $\Delta\phi=0.348$ .

| Tower # in $\eta$ | $\Delta\eta$ | $\eta_{\text{max}}$ | ECAL crystals | HCAL long. segments     |
|-------------------|--------------|---------------------|---------------|-------------------------|
| 1-14              | 0.087        | $n \times 0.087$    | 5x5 barrel    | 2 (HB0,1)               |
| 15                | 0.087        | $n \times 0.087$    | 5x5 barrel    | 3 (HB0,1,2)             |
| 16                | 0.087        | $n \times 0.087$    | 5x5 barrel    | 3 (HB0,1,2) + 2 (HE0,1) |
| 17                | 0.087        | $n \times 0.087$    | 5x5 barrel    | 2 (HE0,1)               |
| 18-20             | 0.087        | $n \times 0.087$    | endcap        | 2 (HE0,1)               |
| 21                | 0.09         | 1.83                | endcap        | 2 (HE0,1), split        |
| 22                | 0.1          | 1.93                | endcap        | 2 (HE0,1), split        |
| 23                | 0.113        | 2.043               | endcap        | 3 (HE0,1,2), split      |
| 24                | 0.129        | 2.172               | endcap        | 3 (HE0,1,2), split      |
| 25                | 0.15         | 2.322               | endcap        | 3 (HE0,1,2), split      |
| 26                | 0.178        | 2.50                | endcap        | 3 (HE0,1,2), split      |
| 27                | 0.15         | 2.65                | endcap        | 3 (HE0,1,2), split      |
| 28                | 0.35         | 3.00                | endcap        | 3 (HE0,1,2), split      |
| 29                | 0.500        | 3.50                | -             | HF                      |
| 30                | 0.500        | 4.00                | -             | HF                      |
| 31                | 0.500        | 4.50                | -             | HF                      |
| 32                | 0.500        | 5.00                | -             | HF                      |
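As a cross-check of Table 3.1 against the tower totals quoted in Section 3.3.1, the following sketch accumulates the $\Delta\eta$ values into upper $\eta$ edges and counts the towers per section. The function names are illustrative; the numbers are transcribed from the table and the text.

```python
# Delta-eta values transcribed from Table 3.1; towers 1-20 all have 0.087.
DELTA_ETA = [0.087] * 20 + [0.09, 0.1, 0.113, 0.129, 0.15, 0.178, 0.15, 0.35] + [0.5] * 4


def eta_max_per_tower(delta_eta):
    """Cumulative upper |eta| edge of each tower, in tower-index order."""
    edges, eta = [], 0.0
    for d in delta_eta:
        eta += d
        edges.append(round(eta, 3))
    return edges


def tower_counts():
    """Count trigger towers per section, both detector halves included."""
    barrel = 17 * 72 * 2          # towers 1-17, 72 phi divisions
    endcap = (28 - 17) * 72 * 2   # towers 18-28, 72 phi divisions
    forward = 4 * 18 * 2          # towers 29-32, 18 phi divisions
    return barrel, endcap, forward


edges = eta_max_per_tower(DELTA_ETA)
barrel, endcap, forward = tower_counts()
```

The three counts reproduce the 2448, 1584 and 144 towers quoted for the barrel, endcap and forward calorimeters, and the accumulated edges end at $\eta = 3.0$ for tower 28 and $\eta = 5.0$ for tower 32.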

### 3.3.2 Trigger Primitives

The ECAL and HCAL trigger primitives are digital quantities computed from the 40 MHz digital samples of the detector pulses. These quantities are computed by TPG electronics integrated with the calorimeter readout systems and are transmitted through serial high speed links to the RCT.

#### Bunch Crossing Assignment

One of the most important tasks of the TPG is to assign a precise bunch crossing to detector pulses, which span several clock periods. The bunch crossing assignment uses a digital filtering technique applied to the energy samples, followed by a peak finder algorithm. Each sample of the filtered pulse is given by a weighted sum of five consecutive samples of the input pulse. The peak finder selects the samples of the filtered pulse that are larger than the two nearest neighbours. All other samples are set to zero.

The amplitude of the peak filtered sample is used as an estimator of the pulse energy. The position of the peak filtered sample in the data pipeline flow determines the timing. The filter coefficients are optimized to maximize the bunch crossing efficiency and the energy resolution.
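A minimal sketch of the filter and peak finder described above is given below. The five weights and the pulse shape are hypothetical values chosen for illustration; the optimized coefficients are not given in this section.

```python
def fir_filter(samples, weights):
    """Each output sample is a weighted sum of len(weights) consecutive inputs."""
    n = len(weights)
    return [
        sum(w * s for w, s in zip(weights, samples[i:i + n]))
        for i in range(len(samples) - n + 1)
    ]


def peak_finder(filtered):
    """Keep samples larger than both nearest neighbours; zero all others."""
    out = [0.0] * len(filtered)
    for i in range(1, len(filtered) - 1):
        if filtered[i] > filtered[i - 1] and filtered[i] > filtered[i + 1]:
            out[i] = filtered[i]
    return out


# Hypothetical weights and pulse shape, for illustration only.
WEIGHTS = [0.0, 0.3, 0.4, 0.3, 0.0]
pulse = [0, 0, 0, 2, 8, 10, 7, 3, 1, 0, 0, 0]

filtered = fir_filter(pulse, WEIGHTS)
peaks = peak_finder(filtered)
peak_positions = [i for i, v in enumerate(peaks) if v > 0]
```

With these inputs a single filtered sample survives. Its amplitude serves as the energy estimate and its position in the pipeline identifies the bunch crossing.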

#### Energy Sums

The transverse energy sum is computed for each calorimeter trigger tower. The sum is given on a 10-bit linear scale. The LSB is programmable in power-of-two multiples of the basic ADC least significant bit. In case of overflow the sum is set to the scale maximum.

Different energy scales can be used in different calorimeter regions, if needed. In particular, in the first level trigger, different scales in the ECAL barrel and endcap will make it possible to correct for the energy deposited in the endcap preshower. The relative scale between the ECAL and the HCAL is also adjustable in order to optimize the trigger energy resolution.

Before transmission to the RCT, the  $E_T$  sum in the trigger cell is coded in a programmable 8-bit compressed nonlinear scale, in order to minimize the trigger data flux to the regional trigger. The data compression deteriorates the trigger energy resolution by an amount smaller than 5%.

Prior to the energy summations, the energy content of each calorimeter channel is linearized (whenever necessary) and converted into transverse energy. The ECAL trigger cell $E_T$ is the sum of the $E_T$ of 5x5 crystals (barrel) and of a variable number of crystals in the endcap. The HCAL trigger cell $E_T$ is the sum of the $E_T$ of the longitudinal compartments inside the coil of the HCAL towers.
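The saturating 10-bit sum and the 8-bit nonlinear coding can be sketched as follows. The piecewise-linear compression table is a hypothetical example; the actual scale is programmable and is not specified in this section.

```python
LINEAR_MAX = (1 << 10) - 1   # maximum of the 10-bit linear scale (1023)


def tower_sum(channel_et_counts):
    """Sum channel E_T counts on the 10-bit scale, saturating on overflow."""
    return min(sum(channel_et_counts), LINEAR_MAX)


def build_compression_lut():
    """Hypothetical 10-bit to 8-bit table: fine steps at low E_T, coarse at high E_T."""
    lut = []
    for x in range(LINEAR_MAX + 1):
        if x < 128:
            code = x                       # 1 linear count per code
        elif x < 384:
            code = 128 + (x - 128) // 4    # 4 linear counts per code
        else:
            code = 192 + (x - 384) // 10   # 10 linear counts per code
        lut.append(min(code, 255))
    return lut


LUT = build_compression_lut()
compressed = LUT[tower_sum([10] * 25)]   # 25 crystals with 10 counts each
```

A lookup table is a natural fit here because any monotonic programmable scale can be loaded without changing the logic around it.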

#### ECAL Fine Grain Bit

The fine grain (FG) algorithm provides, for every ECAL trigger tower, information that reflects the lateral extension of the e.m. shower and is used to improve the rejection of background in the electron trigger. The FG veto bit is active when the highest energy adjacent strip pair has less than a programmable fraction $R$ (typically 90%) of the tower energy.

The strips are arranged along the $\phi$ direction, the direction in which the energy spreads because of the 4 tesla magnetic field. Pairs of adjacent $\eta$ strips are added to account for any leakage along $\eta$ within the trigger tower. Any leakage of energy outside of the trigger tower is ignored in setting the FG veto.

Electrons and photons (converted and non-converted), in the presence of noise and high luminosity pileup, have  $R < 0.90$  in 2% of the cases. The same criterion rejects the fraction of jets where two or more hadrons interact inside one trigger cell.

Depending on the tower energy range, two $R$ thresholds are used. In the low energy domain (up to 5-10 GeV), a higher threshold (typically 0.95) is used to support a low $p_T$ electron trigger for bottom-quark decays, which requires a moderate efficiency and a high rejection power. For tower energies below the noise threshold the FG veto is set to zero. A more detailed description of the fine grain algorithm will be given in Chapter 4.
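The FG veto logic for one barrel trigger tower can be sketched as below. The function name and argument layout are assumptions for illustration; the caller would pass the $R$ threshold appropriate to the tower energy range.

```python
def fg_veto(strip_et, r_threshold=0.90, noise_threshold=0.0):
    """Return True when the FG veto bit is active for one trigger tower.

    strip_et: E_T of the five strips of the tower, ordered along eta.
    """
    tower_et = sum(strip_et)
    if tower_et <= noise_threshold:
        return False   # veto is forced to zero below the noise threshold
    # Highest-energy pair of adjacent strips.
    best_pair = max(a + b for a, b in zip(strip_et, strip_et[1:]))
    return best_pair < r_threshold * tower_et


narrow = fg_veto([0.1, 9.0, 10.0, 0.2, 0.1])   # electron-like: two strips dominate
broad = fg_veto([4.0, 4.0, 4.0, 4.0, 4.0])     # jet-like: energy spread over all strips
```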

#### HCAL Fine Grain Bit

Because the energy for the trigger tower is sent in  $E_T$  scale rather than the  $E$  scale, the trigger scale is not sensitive to minimum ionizing particle energy deposit in the HCAL. Therefore, the fine grain bit is used to identify minimum ionizing particles, requiring the tower energy to be inside a programmable energy range for HCAL, within the trigger primitive generator.

#### Energy Thresholds

In order to improve the immunity to noise (electronics and pile-up), programmable thresholds can be applied to individual calorimeter channels before the trigger primitives calculations as well as to the trigger tower energy sum before transmission to the regional trigger.

### 3.3.3 Electron and Photon Triggers

The electron/photon trigger uses a 3x3 trigger tower sliding window technique which spans the complete  $\eta, \phi$  coverage of the CMS electromagnetic calorimeter. Two independent streams are considered, non-isolated and isolated electron/photons. The non-isolated electron/photon identification is based:

- on the recognition of a large energy deposit in one or two adjacent ECAL trigger cells;
- on the lateral shower profile (fine grain energy spread in central ECAL cell of 3x3 window);
- on the longitudinal shower profile, i.e., the ratio of $E_T$ deposits in the HCAL and ECAL portions of the calorimeter (H/E), in the central trigger cell of the 3x3 window.

The isolated electron/photons require two additional criteria based:

- on the e.m. isolation, i.e., a cut on ECAL  $E_T$  deposited in trigger towers surrounding the central cell of 3x3 window;
- on the hadronic isolation, i.e., FG and H/E vetoes on all 8 nearest neighbors in the 3x3 window.

A large safety margin in the rates is required for the CMS triggers given the various uncertainties that affect the trigger rate estimation. The implementation of longitudinal and lateral shower profile selection cuts, as well as programmable e.m. and hadronic isolation criteria, provides safety and flexibility for the calorimeter electron/photon trigger. The use of the energy threshold together with the shower profile criteria alone allows triggering on low $p_T$ electrons from b-decays.

#### Electron/Photon Algorithm

An overview of the electron/photon isolation algorithm is shown in Fig. 3.8. This algorithm involves only the eight nearest neighbours around the central hit trigger tower and is applied over the entire  $(\eta, \phi)$  plane. The electron/photon candidate  $E_T$  is determined by summing the  $E_T$  in the hit tower with the maximum  $E_T$  tower of its four broad side neighbours. This summed transverse energy provides a sharper efficiency turn-on with the true  $E_T$  of the particles.

The non-isolated candidate requires passing of two shower profile vetoes, the first of which is based on fine-grain ECAL crystal energy profile (FG Veto), as described in Section 3.3.2. The second is based on HCAL to ECAL energy comparison, e.g. H/E less than 5% (HAC veto).



**Fig. 3.8:** Electron/photon trigger algorithm

The isolated candidate must pass two additional vetoes. The first requires that all eight nearest neighbours pass the FG and HAC vetoes. The second requires at least one quiet corner, i.e., one of the five-tower corners has all towers below a programmable threshold, e.g., 1.5 GeV. Each candidate is characterized by the $(\eta, \phi)$ indexes of the calorimeter region where the hit tower is located.
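The selection steps above can be sketched for a single 3x3 window as follows. This is a simplified illustration, not the hardware logic: the function names are hypothetical, the quiet-corner test is assumed to use ECAL $E_T$, each five-tower corner is taken as one edge row plus one edge column of the window, and the H/E cut is applied without any minimum-energy condition.

```python
def hac_veto(ecal_et, hcal_et, max_ratio=0.05):
    """True when H/E exceeds the allowed ratio, i.e. the tower is vetoed."""
    return hcal_et > max_ratio * ecal_et


def egamma_candidate(ecal, hcal, fg, quiet_threshold=1.5):
    """Classify the central tower of a 3x3 window.

    ecal, hcal: 3x3 lists of tower E_T in GeV; fg: 3x3 list of FG veto bits.
    Returns (candidate_et, non_isolated, isolated).
    """
    c = 1
    # Candidate E_T: hit tower plus its largest broadside neighbour.
    broadside = [ecal[0][1], ecal[2][1], ecal[1][0], ecal[1][2]]
    cand_et = ecal[c][c] + max(broadside)

    # Non-isolated stream: the hit tower passes the FG and HAC vetoes.
    non_iso = not fg[c][c] and not hac_veto(ecal[c][c], hcal[c][c])

    # Isolation, part 1: all eight neighbours pass both vetoes.
    neighbours = [(i, j) for i in range(3) for j in range(3) if (i, j) != (c, c)]
    neighbours_ok = all(
        not fg[i][j] and not hac_veto(ecal[i][j], hcal[i][j]) for i, j in neighbours
    )

    # Isolation, part 2: at least one quiet five-tower corner
    # (assumed here: one edge row joined with one edge column).
    corners = [
        {(r, j) for j in range(3)} | {(i, k) for i in range(3)}
        for r in (0, 2)
        for k in (0, 2)
    ]
    quiet_corner = any(
        all(ecal[i][j] < quiet_threshold for i, j in corner) for corner in corners
    )

    return cand_et, non_iso, non_iso and neighbours_ok and quiet_corner


zeros = [[0.0] * 3 for _ in range(3)]
no_fg = [[False] * 3 for _ in range(3)]
electron = egamma_candidate([[0, 0, 0], [0, 30, 5], [0, 0, 0]], zeros, no_fg)
in_jet = egamma_candidate([[3, 3, 3], [3, 30, 3], [3, 3, 3]], zeros, no_fg)
```

In the first example the shower leaks into one neighbour but leaves a corner quiet, so the candidate is isolated. In the second, every corner contains towers above threshold and only the non-isolated stream accepts it.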

#### Single Electron/Photon Triggers

In each calorimeter region (4x4 trigger towers) the highest  $E_T$  non-isolated and isolated electron/photon candidates are separately found. The 16 candidates of both streams found in a regional trigger crate (corresponding to 16 calorimeter regions covering  $\Delta\eta \cdot \Delta\phi = 3.0 \times 0.7$ ) are further sorted by transverse energy. The four highest  $E_T$  candidates of both categories from each crate are transferred to the GCT where the top four candidates are retained for processing by the CMS global trigger.

The nominal electron/photon algorithm allows both non-isolated and isolated streams. The non-isolated stream uses only the hit tower information except for adding in any leakage energy from the maximum neighbour tower. This stream will be used at low luminosity to provide the B-electron trigger. The isolation and shower shape trigger cuts are programmable and can be adjusted to the running conditions. For example, at high luminosity the isolation cuts could be relaxed to take into account higher pile-up energies. The electron/photon trigger specification also includes the definition of the $\eta-\phi$ region where it is applicable. In particular, it is possible to define different trigger conditions (energy thresholds, isolation cuts) in different rapidity regions.

The typical energy threshold of the single electron/photon trigger is 15 GeV at low luminosity and 30 GeV at high luminosity. Due to the slope of the efficiency turn-on curve, 95% electron/photon efficiency is reached at 20 and 35 GeV, respectively. In order to measure the trigger efficiency curve of the standard electron/photon trigger, a low threshold (5-10 GeV) electron/photon stream with adequate prescaling will be used.

#### Multi Electron/Photon Triggers

Double, triple and quad electron/photon triggers can be defined. The requirements on the objects of a multi electron/photon trigger, namely the energy threshold, the cluster shape and isolation cuts and the  $(\eta, \phi)$  region, are set individually. Requirements on the  $(\eta, \phi)$  separation between objects can also be defined (see Chapter 15).

### 3.3.4 Jet and $\tau$ Triggers

The jet trigger uses the transverse energy sums (e.m.+had) computed in calorimeter regions (4x4 trigger towers), except in the HF region where it is the trigger tower itself. The input tower  $E_T$  is coded in an 8 bit linear scale with programmable resolution. Values exceeding the dynamic range are set to the maximum. The subsequent summation tree extends to a 10 bit linear scale with overflow detection. Simulation studies showed that a scale of 10 bits with  $LSB=1$  GeV gives adequate jet trigger performance.

The jet trigger uses a 3x3 calorimeter region sliding window technique which spans the complete  $(\eta, \phi)$  coverage of the CMS calorimeters (Fig. 3.9) seamlessly. The central region  $E_T$  is required to be higher than the eight neighbour region  $E_T$  values.

#### Jet and $\tau$ Definition

The jets and  $\tau$ s are characterized by the transverse energy  $E_T$  in 3x3 calorimeter regions. The summation spans 12x12 trigger towers in barrel and endcap or 3x3 larger HF towers in the HF. The  $\phi$  size of the jet window is the same everywhere. The  $\eta$  binning gets somewhat larger at high  $\eta$  due to the size of calorimeter and trigger tower segmentation. The jets are labelled by  $(\eta, \phi)$  indexes of the central calorimeter region.

For each calorimeter region a  $\tau$ -veto bit is set ON if there are more than two active ECAL or HCAL towers in the 4x4 region. A jet is defined as “ $\tau$ -like” if none of the 9 calorimeter region  $\tau$ -veto bits are ON.

The four highest energy central and forward jets, and central  $\tau$ s in the calorimeter are selected. Jets and  $\tau$ s occurring in a calorimeter region where an electron is identified are not considered. The selection of the four highest energy central and forward jets and of the four highest energy  $\tau$ s provides enough flexibility for the definition of combined triggers.

In addition, counters of the number of jets above programmable thresholds in various $\eta$ regions are provided to give the possibility of triggering on events with a large number of low energy jets. Jets in the forward and backward HF calorimeters are sorted and counted separately. This separation is a safety measure to prevent the more background-susceptible high $\eta$ region from masking central jets. Although the central and forward jets are sorted and tracked separately through the trigger system, the global trigger can use them seamlessly, as the same algorithm and resolutions are used for the entire $\eta-\phi$ plane.
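The sliding-window jet and $\tau$ definition above can be sketched as follows. The grid layout, the function name and the treatment of ties (a strict comparison, so equal neighbours suppress both candidates) are assumptions of this illustration. The window wraps around in $\phi$ and is clipped at the $\eta$ edges.

```python
def find_jets(region_et, tau_veto):
    """Slide a 3x3 window over a [n_eta][n_phi] grid of region E_T.

    Assumes n_phi >= 3. Returns (eta_index, phi_index, jet_et, is_tau_like) tuples.
    """
    n_eta, n_phi = len(region_et), len(region_et[0])
    jets = []
    for ie in range(n_eta):
        for ip in range(n_phi):
            centre = region_et[ie][ip]
            if centre <= 0:
                continue
            window = [
                (je, (ip + dp) % n_phi)
                for je in range(max(ie - 1, 0), min(ie + 2, n_eta))
                for dp in (-1, 0, 1)
            ]
            # The central region must exceed all of its neighbours.
            if any(region_et[je][jp] >= centre
                   for je, jp in window if (je, jp) != (ie, ip)):
                continue
            jet_et = sum(region_et[je][jp] for je, jp in window)
            tau_like = not any(tau_veto[je][jp] for je, jp in window)
            jets.append((ie, ip, jet_et, tau_like))
    return jets


# Illustrative 4 (eta) x 6 (phi) grid with one jet straddling the phi boundary.
grid = [[0.0] * 6 for _ in range(4)]
grid[1][0], grid[1][5], grid[2][1] = 50.0, 10.0, 5.0
veto = [[False] * 6 for _ in range(4)]

jets = find_jets(grid, veto)
top_four = sorted(jets, key=lambda j: j[2], reverse=True)[:4]
```

The final sort mirrors the selection of the four highest energy objects described in the text.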



**Fig. 3.9:** Jet trigger algorithm

#### Multi Jet and $\tau$ Triggers

Single, double, triple and quad jet ($\tau$) triggers are possible. The single jet ($\tau$) trigger is defined by the transverse energy threshold, the $(\eta, \phi)$ region of validity and, optionally, a prescaling factor. Prescaling will be used for low energy jet ($\tau$) triggers, which are necessary for efficiency measurements.

The multi jet ($\tau$) triggers are defined by the number of jets ($\tau$s) and their transverse energy thresholds, by a minimum separation in $(\eta, \phi)$, as well as by a prescaling factor. The global trigger accepts the definition, in parallel, of different multi jet ($\tau$) trigger conditions (see Chapter 15).

### 3.3.5 Energy Triggers

The  $E_T$  triggers use the transverse energy sums (e.m.+had) computed in calorimeter regions (4x4 trigger towers in barrel and endcap).  $E_x$  and  $E_y$  are computed from  $E_T$  using the coordinates of the calorimeter region center. The computation of missing transverse energy from the energy in calorimeter regions does not affect significantly the resolution for trigger purposes.

#### Missing $E_T$ Triggers

The missing $E_T$ is computed from the sums of the calorimeter regions $E_x$ and $E_y$. The sum extends up to the end of the forward hadronic calorimeter, i.e., $|\eta|=5$. The missing $E_T$ triggers are defined by a threshold value and by a prescaling factor. The global trigger accepts the definition, in parallel, of different missing $E_T$ trigger conditions.

The missing transverse energy trigger is implemented with a number of thresholds. Some of these thresholds are used in combination with other triggers. Other thresholds are used with a prescale and one threshold is used for a stand-alone trigger.

The missing  $E_T$  trigger will be used in combination with other triggers, namely jet triggers, in the search for SUSY events.

#### Total $E_T$ Trigger

The total $E_T$ is given by the sum of the calorimeter regions $E_T$. The sum extends up to the end of the forward calorimeter, i.e., $|\eta|=5$. The total $E_T$ triggers are defined by a threshold value and by a prescaling factor. The global trigger accepts the definition, in parallel, of different total $E_T$ trigger conditions.

The total energy trigger is implemented with a number of thresholds which are used both for trigger studies and for input to the luminosity monitor. Some of these thresholds are used in combination with other triggers. Other thresholds are used with a prescale and one threshold is used for a stand-alone trigger. The lower threshold  $E_T$  trigger also provides a good diagnostic for the calorimeter and its trigger.
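The total and missing $E_T$ sums can be sketched from a grid of region $E_T$ values as below. The function name and the `[eta][phi]` layout are illustrative assumptions, and floating-point arithmetic is used instead of the integer scales of the hardware. Each region contributes at the $\phi$ of its centre.

```python
import math


def energy_sums(region_et):
    """Return (total_et, missing_et) from a [n_eta][n_phi] grid of region E_T."""
    ex = ey = total = 0.0
    for row in region_et:
        n_phi = len(row)
        for ip, et in enumerate(row):
            phi = (ip + 0.5) * 2.0 * math.pi / n_phi   # phi of the region centre
            ex += et * math.cos(phi)
            ey += et * math.sin(phi)
            total += et
    # The missing E_T vector is -(ex, ey); only its magnitude is returned.
    return total, math.hypot(ex, ey)


# 18 phi divisions, matching the 0.348 phi bins quoted in the text.
grid = [[0.0] * 18 for _ in range(2)]
grid[0][3] = 40.0
total_one, missing_one = energy_sums(grid)

grid[0][3] = 0.0
grid[0][0] = grid[1][9] = 30.0          # two back-to-back deposits
total_pair, missing_pair = energy_sums(grid)
```

A single deposit gives equal total and missing $E_T$, while two equal back-to-back deposits balance and leave the missing $E_T$ at zero.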

**Table 3.2:** Calorimeter Trigger output data

| Objects                  | Energy bit assignment | Pattern bits | $\eta-\phi$ bit assignment | Total No bits |
|--------------------------|-----------------------|--------------|----------------------------|---------------|
| Isolated electrons (4)   | 6                     | 0            | $5\phi+4\eta$              | 60            |
| Non-isolated electrons (4) | 6                     | 0            | $5\phi+4\eta$              | 60            |
| Central jets (4)         | 6                     | 0            | $5\phi+4\eta$              | 60            |
| Forward jets (4)         | 6                     | 0            | $5\phi+4\eta$              | 60            |
| $\tau$ -jet (4)          | 6                     | 0            | $5\phi+4\eta$              | 60            |
| Jet counters (8)         |                       | 4            |                            | 32            |
| $E_T$ triggers (2)       | 13                    | 0            | $6\phi$                    | 32            |
| Quiet bits               |                       | 14x18        |                            | 252           |
| MIP bits                 |                       | 14x18        |                            | 252           |

### 3.3.6 Quiet and MIP Bits

For each calorimeter region (4x4 trigger towers) a Quiet and a MIP bit are computed. The Quiet bit is on if the transverse energy in the calorimeter region is below a programmable threshold. The MIP bit in a calorimeter region requires, on top of the Quiet bit condition, that at least one of the 16 trigger towers in the region has the HCAL Fine Grain bit on. The Quiet and MIP bits are used in the Global Muon Trigger (see Chapter 8).
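The two conditions can be written compactly as below. The function name and argument layout are illustrative only.

```python
def quiet_and_mip_bits(region_et, hcal_fg_bits, quiet_threshold):
    """Return (quiet, mip) for one calorimeter region.

    region_et: summed E_T of the 4x4 region.
    hcal_fg_bits: the 16 HCAL fine grain bits of the region's trigger towers.
    """
    quiet = region_et < quiet_threshold
    mip = quiet and any(hcal_fg_bits)
    return quiet, mip


muon_like = quiet_and_mip_bits(0.8, [False] * 15 + [True], quiet_threshold=2.0)
empty = quiet_and_mip_bits(0.2, [False] * 16, quiet_threshold=2.0)
busy = quiet_and_mip_bits(12.0, [True] * 16, quiet_threshold=2.0)
```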

### 3.3.7 Calorimeter Trigger Output

The calorimeter trigger produces for every beam crossing the data specified in Table 3.2. These data are transferred to the Global Trigger.
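The bit counts in Table 3.2 can be illustrated by packing one electron/photon or jet candidate into a 15-bit word: 6 energy bits, 5 $\phi$ bits and 4 $\eta$ bits. The field order within the word is an assumption made for this sketch; the table gives only the field widths.

```python
ET_BITS, PHI_BITS, ETA_BITS = 6, 5, 4   # field widths from Table 3.2


def pack_candidate(et_code, phi_index, eta_index):
    """Pack one candidate into 15 bits (field order is hypothetical)."""
    if not (0 <= et_code < 1 << ET_BITS
            and 0 <= phi_index < 1 << PHI_BITS
            and 0 <= eta_index < 1 << ETA_BITS):
        raise ValueError("field out of range")
    return (et_code << (PHI_BITS + ETA_BITS)) | (phi_index << ETA_BITS) | eta_index


def unpack_candidate(word):
    """Inverse of pack_candidate: returns (et_code, phi_index, eta_index)."""
    return (
        (word >> (PHI_BITS + ETA_BITS)) & ((1 << ET_BITS) - 1),
        (word >> ETA_BITS) & ((1 << PHI_BITS) - 1),
        word & ((1 << ETA_BITS) - 1),
    )


word = pack_candidate(63, 17, 13)
bits_per_object_type = 4 * (ET_BITS + PHI_BITS + ETA_BITS)
```

Four such candidates per object type give the 60 bits listed in each of the first five rows of the table.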

## 3.4 Algorithm Performance

Trigger performance simulation played a crucial role in the design and optimization of the calorimeter trigger hardware. Simulation of physics events was necessary to understand both rates and efficiencies of various calorimeter trigger objects. The first simulation results obtained using fast parameterized programs were used to design the algorithms. Later, detailed simulations were performed to refine them. The results reported here are from a third set that includes details of both detector and electronics behaviour. All of these results have been consistent with each other giving us confidence in the performance of algorithms that are being implemented in hardware.

### 3.4.1 Simulation Programs

Simulation for the results reported here was carried out using CMS standard tools. The PYTHIA or ISAJET programs were run to produce minimum bias, QCD jet, top, Standard Model Higgs, MSSM Higgs or SUSY sparticle event n-tuples. The saved stable particle n-tuples were then run through the CMSIM version 116 program to obtain simulated calorimeter data. The ORCA version 4 program, which includes signal development and trigger simulation, used the CMSIM output as its input and provided the signal and trigger response. In ORCA, the simulated ECAL crystal and HCAL tower data were used to form trigger primitives using the logic described here. The trigger simulation was done using integer scales with the appropriate bit resolutions and dynamic range implemented in the hardware. N-tuples generated from this trigger simulation of the QCD jet events are used to make integrated trigger rate plots versus the $E_T$ values for various trigger channels and combinations. The ORCA simulated trigger data and the PYTHIA Monte Carlo generator information are used together to obtain the trigger efficiencies as a function of generated trigger particle momenta.

### 3.4.2 Electron and Photon Trigger Efficiencies

The ORCA program forms trigger primitives from the calorimeter energy deposits obtained using the CMSIM 116 simulation. The EB trigger towers are formed using 5x5 crystal arrays. The EE crystals are combined into trigger towers based on the location of their centers within the trigger towers defined by the HCAL towers. Two or more longitudinal readout divisions of the HCAL towers are combined to form HCAL trigger towers. The FG veto bit for each EB trigger tower is formed using the algorithm described earlier. A simplified EE fine-grain profile within the odd-shaped crystal groupings in the endcap is obtained by comparing the highest energy crystal to the summed energy in the trigger tower.

Efficiency for triggering on electrons and photons in Higgs signal events with full pileup corresponding to high luminosity LHC operation is plotted versus the $p_T$ and $\eta$ of the electron or photon in Fig. 3.10. At this high luminosity, all cuts of the isolated electron/photon algorithm are used. The isolated electron/photon algorithm efficiency is over 95% at 5 GeV above the 30 GeV trigger $E_T$ cutoff. For $E_T$ greater than 55 GeV, the two isolation cuts of the electron/photon algorithm are not applied, which recovers any efficiency loss due to them. The candidates are still required to pass the non-isolated electron/photon algorithm, i.e., the FG and H/E vetoes.

**Fig. 3.10:** The efficiency of the isolated electron/photon trigger with trigger $E_T > 30 \text{ GeV}/c$ is plotted versus the generated $p_T$ and $\eta$ of electrons in Higgs decay events with full pileup corresponding to the high luminosity operation.

At low luminosities, only the non-isolated electron/photon algorithm is needed to sustain the QCD background rates. The efficiency of the non-isolated electron/photon algorithm is shown in Fig. 3.11. The non-isolated electron trigger will provide some efficiency for the semi-leptonic B meson decay events. Although the trigger cuts are rather high for the B decay electron spectrum, which peaks at a few GeV, this small efficiency for the tail end of that spectrum is expected to yield several hundred events per second. The High Level Triggers can select from this sample those events with separated vertices to provide a B meson trigger.

### 3.4.3 Electron and Photon Trigger Background Rates

The integrated QCD background event rate above the trigger $E_T$ cut for the isolated electron/photon trigger is plotted versus the $E_T$ cut for LHC operation at high luminosity, $10^{34} \text{ cm}^{-2} \text{ s}^{-1}$, in Fig. 3.12. For a 30 GeV $E_T$ cut, the simulated QCD background rate is 7.2 kHz. This rate is dominated by QCD jets fluctuating to high-$E_T$ $\pi^0$s.
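An integrated rate of this kind is the rate of all events whose highest trigger candidate lies at or above the cut. A minimal sketch, assuming each simulated event carries a weight in Hz (a hypothetical per-event normalisation, for example cross section times luminosity divided by the number of generated events):

```python
def integrated_rate(event_et, event_weight_hz, cuts):
    """Rate (Hz) of events whose highest candidate E_T is at or above each cut."""
    return [
        sum(w for et, w in zip(event_et, event_weight_hz) if et >= cut)
        for cut in cuts
    ]

# Toy steeply falling spectrum (hypothetical values, for illustration only)
event_et = [12, 25, 31, 40]              # highest candidate E_T per event, GeV
event_weight_hz = [1000, 100, 10, 1]     # rate represented by each event
rates = integrated_rate(event_et, event_weight_hz, cuts=[10, 30, 50])
```

Because the spectrum falls steeply, raising the cut from 10 to 30 GeV in this toy sample reduces the rate by two orders of magnitude.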

For low luminosity operation, a lower trigger $E_T$ cutoff, with only the non-isolated electron/photon algorithm, can be sustained, as illustrated in the right-hand plot in Fig. 3.12. This non-isolated electron algorithm, with only the FG and H/E vetoes, helps in triggering on B meson decays to electrons in the presence of nearby jets.



**Fig. 3.11:** The efficiency of non-isolated electron/photon trigger with trigger  $E_T > 20$  GeV/c is plotted versus generated  $p_T$  and  $\eta$  of electrons.



**Fig. 3.12:** The integrated QCD background rate above electron/photon trigger  $E_T$  cutoff is plotted versus the  $E_T$  cutoff for high and low luminosity operation of the LHC. Data for both isolated and non-isolated electrons are shown.

### 3.4.4 Jet Trigger Rate and Efficiency

**Fig. 3.13:** The jet trigger efficiency at high luminosity is plotted versus generator level jet $p_T$ for single, double, triple and quadruple jet $E_T$ cutoffs. The jet efficiency for a low $E_T$ cutoff of 50 GeV is plotted versus $\eta$.

The jet trigger algorithm sums $E_T$ in 3x3 windows of trigger regions and requires the central region to have higher $E_T$ than each of its neighbours. The entire calorimeter $\eta$-$\phi$ space is covered seamlessly by these 3x3 windows, which slide in steps of one region. The trigger regions are non-overlapping sums of 4x4 trigger towers in the barrel and endcap, so that a jet is a sum over 12x12 trigger towers. Each HF trigger tower is itself a trigger region. In this simulation the jets found by this algorithm are corrected for calibration variation as a function of $\eta$ using the LUTs of the RCT. The jet trigger efficiency turn-on versus the generator level jet $p_T$ for location-matched jets is shown in the left plot of Fig. 3.13 for single, double, triple and quadruple jet events. In the right plot of Fig. 3.13, the jet trigger efficiency for a low $E_T$ cutoff of 50 GeV is plotted versus $\eta$ for location-matched jets. This plot shows that the low $E_T$ jets, which by themselves have an unacceptably high rate, are available in the calorimeter trigger output for the Global Trigger to use in combination with other trigger objects.
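The sliding-window search can be sketched in a few lines. This is a simplified illustration, not the hardware logic: the grid size in the example is hypothetical, windows are truncated at the $\eta$ edges, $\phi$ wraps around, and a strict "greater than" comparison is used, so two equal neighbouring regions yield no jet (how the hardware resolves such ties is not covered here). The $\eta$-dependent calibration is omitted.

```python
def find_jets(region_et):
    """Find jets on a grid of trigger regions.

    region_et[ieta][iphi] is the E_T (GeV) of one trigger region, i.e. a
    4x4 block of trigger towers. A jet is reported for every 3x3 window
    whose central region has strictly higher E_T than all its neighbours.
    Returns (jet_et, ieta, iphi) tuples, highest jet E_T first.
    """
    n_eta = len(region_et)
    n_phi = len(region_et[0])
    jets = []
    for ieta in range(n_eta):
        for iphi in range(n_phi):
            centre = region_et[ieta][iphi]
            window_sum = 0.0
            is_max = centre > 0
            for deta in (-1, 0, 1):
                e = ieta + deta
                if not 0 <= e < n_eta:
                    continue  # window is truncated at the eta boundary
                for dphi in (-1, 0, 1):
                    p = (iphi + dphi) % n_phi  # phi is periodic
                    et = region_et[e][p]
                    window_sum += et
                    if (deta, dphi) != (0, 0) and et >= centre:
                        is_max = False
            if is_max:
                jets.append((window_sum, ieta, iphi))
    return sorted(jets, reverse=True)

# Toy 4 (eta) x 6 (phi) grid, hypothetical values
grid = [[0] * 6 for _ in range(4)]
grid[1][0] = 50
grid[1][5] = 20   # neighbour of (1, 0) through the phi wrap-around
grid[2][1] = 10
grid[3][3] = 30
jets = find_jets(grid)
```

In this toy grid the 50 GeV region absorbs its two neighbours into an 80 GeV jet, and the isolated 30 GeV region forms a second jet.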

The jet trigger integrated rate is plotted versus the corrected L1 jet $E_T$ for single, double, triple and quadruple jet events for high and low luminosities in Fig. 3.14. For the multi-jet triggers all the trigger jets are required to be above the jet $E_T$ cutoff. In all cases, the jet can be anywhere in $|\eta|<5$.

### 3.4.5 $\tau$ Trigger Rate and Efficiency

**Fig. 3.14:** The integrated jet trigger rate from the most profusely produced QCD events, for high and low luminosity LHC operation, is plotted versus the corrected jet $E_T$ cutoff for single, double, triple and quadruple jets in $|\eta| < 5$.

Jets in the central rapidity region are classified as $\tau$-jets if none of the nine regions in the 3x3 window has more than 2 active ECAL or HCAL towers. The $\tau$-jet candidates are sorted and the top 4 objects are available for triggering at the global trigger level. The simulated background rates for high and low luminosities after the $\tau$ veto are compared to jet rates in Fig. 3.15.
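The $\tau$ classification amounts to a simple test on the nine regions of a jet. A minimal sketch, in which the function name is hypothetical and the count of active towers per region is taken as given (the activity threshold that defines an "active" tower is not specified in this section):

```python
def is_tau_jet(active_towers_per_region, max_active=2):
    """True if no region of the 3x3 jet window has more than max_active
    active ECAL or HCAL towers."""
    if len(active_towers_per_region) != 9:
        raise ValueError("a jet window consists of exactly nine regions")
    return all(n <= max_active for n in active_towers_per_region)

narrow = is_tau_jet([0, 1, 0, 0, 2, 0, 0, 0, 0])   # collimated deposit
broad = is_tau_jet([1, 2, 1, 2, 5, 3, 0, 1, 1])    # QCD-like spread
```

The narrow pattern passes, while the broad one is vetoed because two of its regions contain more than two active towers.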



**Fig. 3.15:** Integrated QCD rates at high and low luminosity for  $\tau$  trigger compared to the jet trigger are plotted versus the trigger  $E_T$  cutoff.

Providing a dedicated trigger stream for  $\tau$ -jets is useful because efficiencies can be measured with better understanding of systematics.  $E_T$  cuts used in this simulation for single and double  $\tau$  triggers respectively are 150, 80 GeV for high luminosity and 80, 60 GeV for low luminosity.

### 3.4.6 Missing $E_T$ Trigger Rate and Efficiency



**Fig. 3.16:** The missing $E_T$ trigger efficiency computed using mSUGRA events at high and low luminosity is shown.

The missing transverse energy for the events is calculated using the same 4x4 trigger tower transverse energy sums used for the jet algorithm. These sums are further reduced to sums covering $|\eta|<5$ in 20 degree strips in $\phi$. These $\phi$-strip $E_T$ sums are converted to $E_x$ and $E_y$ with memory lookup tables. The $E_x$ and $E_y$ values are summed over the full $\eta$-$\phi$ space to obtain the missing $E_T$. The $E_T$ values over the full space are also added to obtain the total $E_T$. The missing $E_T$ trigger efficiency is plotted versus the missing $p_T$, calculated at the particle level within the trigger acceptance, $|\eta|<5$, in Fig. 3.16, for SUSY sparticle events at high and low luminosities. With the trigger cutoff at 150 GeV, the high luminosity efficiency turn-on saturates only at missing $p_T$ as high as 400 GeV. The turn-on is dominated by the energy resolution in these events, which have about 600 GeV total $E_T$ from the SUSY event itself. The effect of the additional $E_T$ in the event due to pileup is seen by comparing to the low luminosity plot. The missing transverse energy trigger is not adequate for SUSY events on its own. However, the SUSY event missing $E_T$ efficiency is supplemented by the multi-jet triggers described earlier.
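The calculation can be sketched as follows for the 18 strips of 20 degrees. The sketch evaluates the trigonometric factors directly, where the hardware uses lookup tables, and it assumes each strip is represented by the angle at its centre. That choice, the function name and the toy energies are assumptions made for this example.

```python
import math

N_PHI_STRIPS = 18          # 360 degrees / 20 degrees per strip
STRIP_WIDTH_DEG = 20.0

def missing_et(strip_et):
    """Return (missing E_T, missing E_T azimuth in radians, total E_T).

    strip_et holds the E_T sum (GeV) of each 20 degree phi strip,
    already summed over |eta| < 5.
    """
    if len(strip_et) != N_PHI_STRIPS:
        raise ValueError("expected one E_T sum per 20 degree phi strip")
    ex = ey = 0.0
    for i, et in enumerate(strip_et):
        # Assumption: each strip is represented by its central angle
        phi = math.radians(STRIP_WIDTH_DEG * (i + 0.5))
        ex += et * math.cos(phi)
        ey += et * math.sin(phi)
    # The missing E_T vector balances the observed vector sum
    return math.hypot(ex, ey), math.atan2(-ey, -ex), sum(strip_et)

# Toy event: 100 GeV in one strip, 60 GeV in the opposite strip
strips = [0.0] * N_PHI_STRIPS
strips[0] = 100.0
strips[9] = 60.0
met, met_phi, total_et = missing_et(strips)
```

The two back-to-back deposits leave a 40 GeV imbalance pointing towards the side with less energy, while the total $E_T$ is 160 GeV.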

The integrated QCD rates of the missing transverse energy trigger, for high and low luminosity operation of the LHC, are plotted in Fig. 3.17.

### 3.4.7 Sample Trigger Table

**Fig. 3.17:** The integrated QCD rates of the missing transverse energy trigger, for high and low luminosity operation of the LHC, are plotted.

In order to study the physics performance of the calorimeter trigger we select a representative set of $E_T$ cutoffs for various sub-triggers satisfying the target total rate of 12.5 kHz. For efficiency studies, we select the physics processes that place the most stringent requirements on the trigger system, i.e., the lowest masses of interest to CMS and channels involving the least number of trigger candidate types. When evaluating efficiencies, we do not include any effect of offline reconstruction inefficiencies or cuts. Therefore, these results are conservative. We perform these studies at two levels of luminosity, $10^{34} \text{ cm}^{-2} \text{ s}^{-1}$ (high) and $10^{33} \text{ cm}^{-2} \text{ s}^{-1}$ (low).

The trigger rate breakdown and the selected  $E_T$  cutoffs for various sub-triggers, at high and low luminosities, are shown in Table 3.3 and Table 3.4 respectively. The  $E_T$  cutoff selection is, of course, arbitrary, but these specific values were selected to emphasize the electron/photon triggers that enable the exploration of high  $p_T$  physics with the best signal to noise ratio. At low luminosity the electron, total and missing  $E_T$  triggers are given higher priority over the jet triggers in this selection. In reality combinations of jet and other triggers will dominate the trigger table.

For the efficiency studies several physics processes were selected. Top quark decays to an electron and jets set the most stringent requirement on the single electron trigger. Decays of W to single electrons are more difficult to trigger on. Although it is expected that the inclusive W electron sample is too large for archival and physics analysis at high luminosity LHC, it will be used for calibration of the calorimeters. Therefore, it is important to have good efficiency at level 1 for this channel. The Z boson decays to two electrons, and the Standard Model Higgs ($110 \text{ GeV}/c^2$) decays to two photons, place the most stringent requirements on the double electron/photon trigger.

**Table 3.3:** QCD background rate at high luminosity for a representative set of trigger cutoffs and corresponding 90% and 95% efficiency points.

| Trigger Type              | Trigger $E_T$ cutoff (GeV) | 95% Efficiency Threshold (GeV) | 90% Efficiency Threshold (GeV) | Individual Rate (kHz) |
|---------------------------|----------------------------|--------------------------------|--------------------------------|-----------------------|
| Electron                  | 30                         | 35                             | 32                             | 7.2                   |
| Dielectron                | 15                         | 20                             | 18                             | 0.6                   |
| Single $\tau$             | 150                        |                                |                                | 1.3                   |
| Double $\tau$             | 80                         |                                |                                | 2.5                   |
| Jet                       | 250                        | 285                            | 275                            | 0.4                   |
| Dijet                     | 200                        | 225                            | 215                            | 0.4                   |
| Trijet                    | 100                        | 125                            | 115                            | 0.7                   |
| Quadjet                   | 80                         | 105                            | 95                             | 0.2                   |
| $\tau +$ Electron         | 90 & 15                    |                                |                                | 1.4                   |
| Jet + Electron            | 150 & 15                   | 165 & 20                       | 155 & 18                       | 0.2                   |
| Missing $E_T$             | 150                        |                                | 350                            | 0.005                 |
| Electron + Missing $E_T$  | 15 & 100                   |                                | 18 & 250                       | 0.005                 |
| Jet + Missing $E_T$       | 80 & 100                   |                                | 95 & 250                       | 0.1                   |
| Sum $E_T$                 | 1000                       |                                | ~1500                          | 0.03                  |
| Non-isolated electron     | 55                         | 60                             | 58                             | 0.7                   |
| Non-isolated dielectron   | 25                         | 30                             | 28                             | 0.2                   |
| Total Rate (kHz)          |                            |                                |                                | 12.9                  |

The Standard Model Higgs production via the gluon fusion mechanism and the Minimal Supersymmetric Standard Model (MSSM) Higgs production via weak boson fusion were explored with low mass Higgs to place the most stringent requirements on the trigger. The Standard Model Higgs events were produced with decay photons and leptons within the tracker acceptance of $|\eta| < 2.5$. For the MSSM, the processes involving Higgs decays to $\tau$ pairs and to invisible high mass non-interacting particles are considered. The MSSM Higgs decay to $\tau$s, which in turn decayed into hadrons and a neutrino, used generator level $p_T$ and $\eta$ cuts on the decay particles excluding the neutrinos, $p_T > 45$ GeV/c and $|\eta| < 2.4$. The MSSM Higgs decays $H \rightarrow \tau\tau \rightarrow e, \text{jet}$ used the generator level cuts $p_T^{\text{electron}} > 14$ GeV/c, $|\eta^{\text{electron}}| < 2.4$, $p_T^{\tau\text{-jet}} > 30$ GeV/c and $|\eta^{\tau\text{-jet}}| < 2.4$. The weak boson fusion mechanism results in high $p_T$ forward jets, which have been used for triggering in the case of a Higgs boson that decays into invisible non-interacting particles. For the invisible Higgs decay, a generator level missing $E_T > 100$ GeV/c cut and two tag jets within $|\eta| < 5$ with $E_T > 40$ GeV/c are required. The L1 calorimeter trigger efficiencies presented in Table 3.5 and Table 3.6 are with respect to those events passing these generator level cuts.

The results also include the SUSY squark and gluino event triggering efficiency. These events are more difficult to evaluate due to a very large parameter space. We have confined our study to the minimal supergravity model, mSUGRA. The parameters of the mSUGRA model were set such that it resulted in an undetectable stable lightest supersymmetric particle (LSP) of mass $\sim 45$ GeV/c<sup>2</sup> and other unstable sparticles with masses in the 300 GeV/c<sup>2</sup> range. We trigger on the decay leptons and jets from the sparticles and the missing $E_T$ due to the LSPs.

The efficiencies for the above physics channels studied at the appropriate high or low luminosity LHC running are listed in Table 3.5 and Table 3.6. The selected trigger  $E_T$  cutoffs shown in Table 3.3 and Table 3.4 yield high efficiency for these representative physics processes while satisfying the bandwidth requirement.

**Table 3.4:** QCD background rate at low luminosity for a representative set of trigger cutoffs and corresponding 90% and 95% efficiency points.

| Trigger Type             | Trigger $E_T$ cutoff (GeV) | 95% Efficiency Threshold (GeV) | 90% Efficiency Threshold (GeV) | Individual Rate (kHz) |
|--------------------------|----------------------------|--------------------------------|--------------------------------|-----------------------|
| Electron                 | 20                         | 24                             | 22                             | 5.7                   |
| Dielectron               | 10                         | 14                             | 12                             | 2.7                   |
| Single $\tau$            | 80                         | 95                             | 85                             | 3.2                   |
| Double $\tau$            | 60                         | 75                             | 65                             | 1.5                   |
| Jet                      | 120                        | 150                            | 140                            | 1.2                   |
| Dijet                    | 90                         | 115                            | 105                            | 1.0                   |
| Trijet                   | 70                         | 95                             | 85                             | 0.3                   |
| Quadjet                  | 50                         | 75                             | 65                             | 0.3                   |
| $\tau +$ Electron        | 65 & 10                    | 80 & 14                        | 70 & 12                        | 3.5                   |
| Jet + Electron           | 100 & 10                   | 125 & 14                       | 115 & 12                       | 1.1                   |
| Missing $E_T$            | 100                        |                                | 275                            | 0.01                  |
| Electron + Missing $E_T$ | 10 & 50                    |                                | 12 & 175                       | 0.2                   |
| Jet + Missing $E_T$      | 50 & 50                    |                                | 65 & 175                       | 0.6                   |
| Sum $E_T$                | 500                        |                                | 1000                           | 0.02                  |
| Total Rate (kHz)         |                            |                                |                                | 12.24                 |
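The individual rates in Tables 3.3 and 3.4 add up to more than the quoted total rates. One plausible reading, which is an assumption and not stated explicitly in the text, is that the total counts each event once even when it satisfies several sub-triggers. The sketch below sums the tabulated individual rates and then shows the effect of such overlaps on a toy set of events (hypothetical trigger bits).

```python
# Individual rates (kHz) copied from Table 3.3 (high luminosity)
HIGH_LUMI_KHZ = [7.2, 0.6, 1.3, 2.5, 0.4, 0.4, 0.7, 0.2,
                 1.4, 0.2, 0.005, 0.005, 0.1, 0.03, 0.7, 0.2]
# Individual rates (kHz) copied from Table 3.4 (low luminosity)
LOW_LUMI_KHZ = [5.7, 2.7, 3.2, 1.5, 1.2, 1.0, 0.3, 0.3,
                3.5, 1.1, 0.01, 0.2, 0.6, 0.02]

high_sum = sum(HIGH_LUMI_KHZ)   # about 15.94, quoted total is 12.9
low_sum = sum(LOW_LUMI_KHZ)     # about 21.33, quoted total is 12.24

def rate_counts(events):
    """Return (sum of per-trigger counts, count of events firing any trigger)."""
    per_trigger = sum(len(bits) for bits in events)
    inclusive = sum(1 for bits in events if bits)
    return per_trigger, inclusive

# Toy events: the set of sub-triggers fired by each event
toy_events = [{"e1"}, {"e1", "j1"}, {"j1"}, set()]
summed, inclusive = rate_counts(toy_events)
```

In the toy sample the per-trigger counts add up to four, but only three events fire at least one sub-trigger, because one event fires both.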

## 3.5 Overall Structure

### 3.5.1 Calorimeter Trigger Subdivisions

The calorimeter trigger begins with trigger tower energy sums formed by the ECAL, HCAL and HF Trigger Primitive Generator (TPG) circuits from the individual calorimeter cell energies. For the ECAL, these energies are accompanied by a bit indicating the transverse extent of the electromagnetic energy deposit. For the HCAL, the energies are accompanied by a bit indicating the presence of minimum ionizing energy. The TPG information is transmitted over high speed copper links to the RCT, which finds candidate electrons, photons, $\tau$s, and jets. The RCT separately finds both isolated and non-isolated electron/photon candidates. The RCT transmits the candidates along with sums of transverse energy to the GCT. The GCT sorts the candidate electrons, photons, $\tau$s, and jets and forwards the top 4 of each type to the global trigger. The GCT also calculates the total transverse energy and total missing energy vector. It transmits this information to the global trigger as well. The RCT also transmits an $(\eta, \phi)$ grid of quiet regions to the Global Muon Trigger for computation of the muon isolation criterion.

**Table 3.5:** Efficiency of the calorimeter trigger at high luminosity. Efficiency contribution due to muons is not included for the $e, e, \mu, \mu$ channel. In calculating these efficiencies, only a selection of trigger objects labelled as electron (e1), dielectron (e2), single jet (j1), double jet (j2), etc. are used.

| Channel                                                     | Efficiency (%) | Triggers Used                      |
|-------------------------------------------------------------|----------------|------------------------------------|
| $H(200) \rightarrow \tau\tau \rightarrow \text{hadrons}$    | 60             | e1, $\tau$ 1, j1, e2, $\tau$ 2, j2 |
| $H(500) \rightarrow \tau\tau \rightarrow \text{hadrons}$    | 86             | e1, $\tau$ 1, j1, e2, $\tau$ 2, j2 |
| $H(170) \rightarrow 4 \text{ electrons}$                    | 99             | e1, e2                             |
| $H(110) \rightarrow 2 \text{ photons}$                      | 98             | e1, e2                             |
| $H(135) \rightarrow \tau\tau \rightarrow e, \text{ hadron}$ | 72             | e1, e2, $\tau$ 1, j1               |
| $H(200) \rightarrow \tau\tau \rightarrow e, \text{ hadron}$ | 74             | e1, e2, $\tau$ 1, j1               |
| $H(120) \rightarrow \text{Invisible (tag jets)}$            | 58             | j1, j2, missing $E_T$              |
| $H(120) \rightarrow ZZ^* \rightarrow e, e, \mu, \mu$        | 73             | e1, e2                             |
| $H(200) \rightarrow ZZ \rightarrow e, e, \text{ jets}$      | 95             | e1, e2, j1, j2                     |
| $t\bar{t} \rightarrow e, X$                                 | 82             | e1, j1, j2, j3, j4                 |
| $t\bar{t} \rightarrow e, H^+, X_1 \rightarrow e, \tau, X_2$ | 76             | e1, j1, j2, j3, j4                 |

### 3.5.2 Trigger Primitives

The Trigger Primitive Generation (TPG) sub-system performs the first computation steps in the synchronous and pipelined Calorimeter Trigger System, interfacing directly with the ECAL and HCAL front-end electronics systems, from which it receives detector data. The TPG computes the calorimeter trigger primitives and transmits the results to the Regional Trigger system. The TPG also interfaces directly to the Readout and Control sub-system, to which it delivers trigger primitive data to be recorded.

The basic function of the TPG sub-system is the computation of the calorimeter trigger primitives which includes:

- transformation of the input scale to transverse energy scale
- digital filtering of the detector signals to extract the transverse energy and the bunch time information
- computation of the trigger towers energy sums
- computation of the fine grain electron/photon identification from the ECAL data (FG Veto bit)
- computation of the muon identification bit from the HCAL data (MIP bit).

**Table 3.6:** Efficiency of the calorimeter trigger at low luminosity. In calculating these efficiencies, only a selection of trigger objects labelled as electron (e1), dielectron (e2), single jet (j1), double jet (j2), etc. are used.

| Channel                                                     | Efficiency (%) | Triggers Used                      |
|-------------------------------------------------------------|----------------|------------------------------------|
| $H(200) \rightarrow \tau\tau \rightarrow \text{hadrons}$    | 93             | e1, $\tau$1, j1, e2, $\tau$2, j2   |
| $H(500) \rightarrow \tau\tau \rightarrow \text{hadrons}$    | 99             | e1, $\tau$1, j1, e2, $\tau$2, j2   |
| $H(170) \rightarrow 4 \text{ electrons}$                    | 100            | e1, e2                             |
| $H(110) \rightarrow 2 \text{ photons}$                      | 99             | e1, e2                             |
| $H(135) \rightarrow \tau\tau \rightarrow e, \text{ hadron}$ | 96             | e1, e2, $\tau$1, j1                |
| $H(200) \rightarrow \tau\tau \rightarrow e, \text{ hadron}$ | 96             | e1, e2, $\tau$1, j1                |
| $H(120) \rightarrow \text{Invisible (tag jets)}$            | 96             | j1, j2, missing $E_T$              |
| $t\bar{t} \rightarrow e, X$                                 | 97             | e1, j1, j2, j3, j4                 |
| $t\bar{t} \rightarrow e, H^+, X_1 \rightarrow e, \tau, X_2$ | 94             | e1, j1, j2, j3, j4                 |
| SUSY squark and gluino production                           | 77             | e1, j1, j2, j3, j4                 |

The TPG sub-system is located in the CMS electronics room at the underground level. The TPG is housed in the same crates as the ECAL and HCAL readout electronics systems. The ECAL readout and trigger system will be housed in 60 crates in total. Each ECAL crate will serve 68 trigger towers, representing 1700 crystals in the barrel and a variable number of crystals in the endcap. Subsets of four trigger tower signals are processed by a single board receiving a maximum of 100 digital signals coming from the ECAL very-front-end. In the barrel case these four trigger towers are in the same $\eta$ slice. The four TPG outputs from one board are grouped in pairs and sent to the regional trigger over two serial links.
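The barrel figures quoted above follow from the 5x5 crystal trigger tower. A short check of the arithmetic is given below. The number of boards and links per crate is derived here from the quoted figures and is not stated in the text.

```python
CRYSTALS_PER_BARREL_TOWER = 5 * 5   # EB trigger towers are 5x5 crystal arrays
TOWERS_PER_CRATE = 68
TOWERS_PER_BOARD = 4
LINKS_PER_BOARD = 2                 # four outputs grouped in pairs

crystals_per_crate = TOWERS_PER_CRATE * CRYSTALS_PER_BARREL_TOWER
signals_per_board = TOWERS_PER_BOARD * CRYSTALS_PER_BARREL_TOWER

# Derived, not quoted in the text
boards_per_crate = TOWERS_PER_CRATE // TOWERS_PER_BOARD
links_per_crate = boards_per_crate * LINKS_PER_BOARD
```

The products reproduce the 1700 crystals per barrel crate and the maximum of 100 input signals per board.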

### 3.5.3 Regional Calorimeter Trigger

The Regional Calorimeter Trigger (RCT) system receives digital trigger sums from the front-end electronics system, which transmits energy on an eight bit compressed scale. The data for two trigger towers of the same calorimeter (EB, EE, HB, HE or HF), for the same crossing, are received from a single serial link in eight bits apiece, accompanied by five bits of error detection code and a fine-grain bit characterizing the energies summed into the trigger towers, i.e. isolated energy for EB and EE, or an energy deposit consistent with a minimum ionizing particle for HB and HE. Presently the fine-grain bit is undefined for the HF calorimeter.

The RCT uses 20 regional processor crates covering the full detector. Eighteen crates are dedicated to the barrel and two endcaps. These crates cover the region $|\eta| < 3$. One special crate covers both HF Calorimeters that extend missing $E_T$ and jet finding coverage to $|\eta| < 5$. The remaining crate collects regional information from these 19 trigger crates and clusters their regions to find jets and $\tau$s. It also continues the summation tree to provide sums of $E_T$ in various $\phi$ regions.

Each RCT crate transmits to the GCT processor its 4 highest-ranked isolated and non-isolated electrons. The cluster crate sends its 9x4 highest energy central and forward jets and $\tau$ candidates along with information about their location and the sum $E_T$ for the 18 $\phi$ regions covered by it. The GCT then forms $E_x$ and $E_y$ using look-up tables and sums the energies, separately sorts the electrons, jets and $\tau$s, and sends the top four calorimeter-wide candidates, as well as the total calorimeter missing and sum $E_T$, to the CMS global trigger. The muon quiet and MIP bits formed using the HB, HE information are passed to the global muon crates via the GCT.
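The final sorting stage can be pictured as merging the candidate lists from all crates and keeping the four of highest rank. The sketch below is a software illustration only. The `Candidate` record and its field names are hypothetical, and the hardware sorting network is not modelled.

```python
import heapq
from collections import namedtuple

# Hypothetical candidate record; the field names are illustrative only
Candidate = namedtuple("Candidate", ["rank", "crate", "region"])

def top_candidates(per_crate_candidates, n_keep=4):
    """Merge the per-crate candidate lists and keep the n_keep highest ranks."""
    merged = [c for crate_list in per_crate_candidates for c in crate_list]
    return heapq.nlargest(n_keep, merged, key=lambda c: c.rank)

# Toy input from three crates (hypothetical ranks)
crates = [
    [Candidate(40, 0, 3), Candidate(12, 0, 1)],
    [Candidate(55, 1, 6), Candidate(30, 1, 2)],
    [Candidate(33, 2, 0), Candidate(8, 2, 5)],
]
best = top_candidates(crates)
```

The same function would be applied separately to each object type, since the electrons, jets and $\tau$s are sorted independently.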

Eighteen crates of the RCT use three custom board designs which are dedicated to receiving and processing data from the barrel and endcap calorimeters. In these crates there are seven rear mounted Receiver cards, seven front mounted Electron Isolation cards, and one front mounted Jet Summary card, for a total of 15 processor cards per crate. These cards and an additional clock and control card are plugged into a custom backplane which provides point-to-point links between the cards. A VME bus is also provided to these cards using high density connectors in the top 3U section of the backplane. In addition there are two slots with standard VME backplane connectors for crate processor and monitoring cards. The 19th crate, covering the HF calorimeter, houses special cards that use portions of the circuitry of the Receiver and Jet Summary cards to drive the signals out for forming jets and $E_T$ sums. The 20th crate, the cluster crate, is similar to the 18 barrel and endcap crates but is fitted with a different backplane and a set of cluster processor cards which implement the jet and $\tau$ finding algorithms and the $E_T$ sums.

### 3.5.4 Global Calorimeter Trigger

The Global Calorimeter Trigger (GCT) is the final component in the Calorimeter Trigger chain. Its purpose is to implement the stages of the trigger algorithms which require information from the entire CMS calorimeter system. The GCT receives trigger object data from the RCT, performs several stages of data processing, and sends a reduced amount of information to the Global Trigger. The baseline GCT functions include:

- Final-stage sorting of  $e/\gamma$ , jet and  $\tau$  trigger objects according to rank
- Jet counting
- Calculation of total and missing transverse energy
- Luminosity monitoring using L1 trigger data.

The GCT is located in the CMS electronics cavern, USC55. The system is housed in two electronics crates, located within a single rack. The main GCT functions are implemented using two types of board. The data processing functions are performed by six Trigger Processor Modules (TPMs). An additional TPM is used for the interface to the RCT crate. Input data from the RCT are received, synchronised and reformatted by 15 Input Modules (IMs). System control and monitoring is performed by a cluster of embedded processors located on the TPMs. Each of the TPMs and IMs is implemented on a single 9U x 400mm printed circuit board using FPGA technology.

The GCT output to the Global Trigger consists of 4 sorted isolated electrons, non-isolated electrons,  $\tau$  candidates, central and forward jets, missing  $E_T$  value and direction, total  $E_T$  and jet counts.

## 3.6 System Robustness

The calorimeter trigger system is designed to be robust. Any unforeseen difficulties in background rates, beyond the factor of three safety margin used in the design, can be sustained by optimizing many tunable parameters in the system. For instance, look-up tables are provided at several stages and they can be programmed, if necessary, as a function of  $\eta$  to suppress pileup where it is a problem. The FG veto threshold, the H/E look-up table and the neighbour  $E_T$  isolation cut are all programmable as a function of  $\eta$ . Since the trigger candidate location and  $E_T$  information is available for global trigger, a further reduction in rate is possible by requiring separation between multiple trigger objects.

Any spurious data due to malfunctioning hardware can be suppressed by zeroing out the affected channels. Any bit errors in transmission on the serial data links are caught online using error detection Hamming codes and suppressed to avoid spurious triggers. Most electronic circuits are designed with boundary scan for in-situ and power-up testing. The inputs and outputs of each subsystem are available for readout upon triggers to enable online monitoring of the trigger electronics.
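To illustrate how a Hamming code catches a transmission error, the sketch below protects a 16-bit payload with five check bits, which is the smallest number of check bits that can cover it. The bit layout is the textbook one (check bits at positions 1, 2, 4, 8 and 16) and is an assumption made for this example. The actual assignment of data and check bits on the trigger links is not described in this section.

```python
PARITY_POSITIONS = (1, 2, 4, 8, 16)
CODE_LENGTH = 21   # 16 data bits + 5 check bits

def encode(data):
    """Return the codeword as a list of bits indexed 1..21 (index 0 unused)."""
    if not 0 <= data < (1 << 16):
        raise ValueError("payload must fit in 16 bits")
    word = [0] * (CODE_LENGTH + 1)
    data_positions = [p for p in range(1, CODE_LENGTH + 1)
                      if p not in PARITY_POSITIONS]
    for i, pos in enumerate(data_positions):
        word[pos] = (data >> i) & 1
    # Each check bit makes the parity of the positions it covers even
    for p in PARITY_POSITIONS:
        parity = 0
        for pos in range(1, CODE_LENGTH + 1):
            if pos & p and pos != p:
                parity ^= word[pos]
        word[p] = parity
    return word

def syndrome(word):
    """Zero for a valid codeword; the position of the flipped bit
    if exactly one bit is in error."""
    s = 0
    for pos in range(1, CODE_LENGTH + 1):
        if word[pos]:
            s ^= pos
    return s

sent = encode(0xBEEF)
received = list(sent)
received[13] ^= 1                  # single bit error in transmission
error_detected = syndrome(received) != 0
```

A nonzero syndrome flags the word as corrupted, so the trigger logic can suppress it rather than act on a spurious energy value.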

## References

- [3.1] W. Badgett et al., "CMS Calorimeter Level 1 Regional Trigger Electron Identification", CMS Note-1999/026.
- [3.2] S. Dasu, W. Badgett, M. Jaworski, J. Lackey and W. H. Smith, "CMS Level-1 Calorimeter Trigger Detailed Simulation", CMS Note-1998/027.
- [3.3] G.P. Heath et al., "The CMS Calorimeter Trigger", in Proceedings of 'Third Workshop on Electronics for LHC Experiments', London, 1997.
- [3.4] S. Dasu, W. H. Smith, "CMS Level-1 Jet Trigger Study", CMS TN-1996/066.
- [3.5] R. Nóbrega, J. Varela, "CMS electron/photon trigger - A simulation study with CMSIM data", CMS TN 96-21.
- [3.6] CMS Calorimeter Trigger Group, "Preliminary specifications of the baseline trigger algorithms", CMS-TN-96-10.
- [3.7] C. Lourenço, A. Nikitenko, J. Varela, "A low  $p_T$  1st level single electron trigger for beauty studies in CMS", CMS TN 95-197, 1995.
- [3.8] A. Nikitenko, J. Varela, "A study of the 1st level  $\tau$  trigger", CMS TN 95-195, 1995.
- [3.9] S. Dasu, J. Lackey, W. H. Smith, W. Temple, "CMS Level 1 Calorimeter Trigger Performance on Technical Proposal Physics", CMS TN-1995/183.
- [3.10] J. Varela, "Requirements for a fine grained calorimeter trigger", CMS TN-1995/143.
- [3.11] S. Dasu, J. Lackey, W. H. Smith, W. Temple, "New Algorithms for CMS Electron/Photon Trigger - Use of Fine Grain Calorimeter Data", CMS TN-1995/112.
- [3.12] C. Lourenço, J. Varela, "A fine granularity calorimeter trigger for CMS", CMS TN-1995/027.
- [3.13] S. Dasu, J. Lackey, W. H. Smith, W. Temple, "CMS Missing Transverse Energy Trigger Studies", CMS TN-1995/111.
- [3.14] S. Dasu, J. Lackey, W. H. Smith, W. Temple, "CMS Level 1 Calorimeter Trigger Performance Studies", CMS TN-1994/285.
- [3.15] S. Dasu, T. Gorski, J. Lackey, D. Panescu, W. H. Smith, W. Temple, "The level-1 calorimeter trigger for the CMS detector at LHC", CMS TN-1994/264.
- [3.16] Ph. Busson, J. Varela, "Calorimeter Trigger in CMS: Algorithm studies", CMS TN-1994/219.

# 4 Calorimeter Trigger Primitive Generation

The Trigger Primitive Generation sub-system (TPG hereafter) performs the first computation steps in the synchronous and pipelined Calorimeter Trigger System, interfacing directly with the ECAL and HCAL front-end electronics systems, from which it receives detector data. The TPG computes the calorimeter trigger primitives and transmits the results to the Regional Trigger system (Chapter 5). The TPG also interfaces directly to the Readout and Control sub-system (Chapter 7), to which it delivers trigger primitive data to be recorded. Whenever possible, the HCAL and ECAL will employ the same methods for synchronization.

## 4.1 Requirements

### 4.1.1 Functional Requirements

The basic function of the Trigger Primitive Generation sub-system is the computation of the calorimeter trigger primitives as specified in [4.9]. These computations include:

- transformation of the input scale to transverse energy scale
- digital filtering of the detector signals to extract the transverse energy and the bunch time information
- computation of the trigger tower energy sums
- computation of the fine grain electron/photon identification from the ECAL data (Fine Grain Veto bit)
- computation of a minimum ionizing energy flag in the HCAL tower to aid in muon identification (MIP bit)

In order to guarantee a proper operation of the trigger system in synchronous and pipeline mode, the TPG performs the following synchronization functions:

- synchronization of the trigger primitive data
- flagging of the trigger primitive data produced at the bunch 0 of the LHC cycle

The TPG transmits the trigger primitives data through high speed serial links to the regional trigger.

The TPG performs the following readout functions:

- trigger primitives data storage in pipeline memories with programmable length
- data formatting and storing in de-randomizer buffers after L1A
- interface with the Readout and Control sub-system

The TPG performs the following test functions:

- interconnections test with boundary scan
- internal pattern generation at data input, propagation at 40 MHz and storage at the output
- self-trigger generation

### 4.1.2 Performance Requirements

The TPG performs its computations in pipeline mode sustaining a maximal input rate of 40 MHz. The TPG outputs the trigger primitives to the Regional Trigger after a fixed delay with a maximal value of 47 bunch crossing periods after the occurrence of the proton-proton collision.

Limitations in the computations performed by the TPG (algorithm approximations, the relative timing of the sampling clock with respect to the signal maximum, and the limited number of bits used for the integer representation of the various parameters) degrade the resolution of the transverse energy measured in a trigger tower. In order to have sharp efficiency threshold curves, we require the relative resolution for a single electromagnetic shower, as measured in test beam (i.e. with no preshower in front of the crystal matrix), to be less than 3.5% for electrons and photons with a transverse energy of 10 GeV. This relative resolution shall not exceed 2.0% when the transverse energy of the electromagnetic shower is greater than 20 GeV.

The time deconvolution performed by the TPG shall assign the energy deposition of an electromagnetic shower to the bunch crossing in which the particle was created with an efficiency of 95% if its transverse energy is greater than 0.5 GeV. This ensures the correct measurement of the isolation energy around an electromagnetic cluster and allows the proper operation of the TPG synchronization (see Subsection 4.7.4).

### 4.1.3 Interface Requirements

The ECAL TPG receives digital signals from the very-front-end electronics located in the rear part of the ECAL detector. The light signal produced by the showers of impinging electrons and photons in the ECAL is converted to an electrical signal by an Avalanche PhotoDiode (APD). The electrical signal is shaped and distributed to a set of four amplifiers with different gains (1, 5, 9, 33) and four sample and holds, which are collectively named Floating Point PreAmplifiers (FPPA). At every LHC clock cycle, dedicated logic selects the non-saturating sample with the highest gain and sends the sample value to a 12-bit sampling ADC working at 40 MHz. The 12-bit word generated by the ADC (D-word hereafter) and a 2-bit word (G-word hereafter) encoding the gain chosen by the FPPA are complemented by 2 control bits. The 16 bits are multiplexed and the resulting serial signal is converted by an electrical to optical coupler connected to an optical fiber.

Figure 4.1 shows the schematic layout of the FPPA complex as well as the 14-bit code formed by the D-word and the G-word. Sets of 10 channels forming a mechanical sub-structure are connected to the TPG sub-system via optical fiber ribbons of 90 meters. A schematic view of the very-front-end electronics serving 10 crystals is shown in Figure 4.2.

**Fig. 4.1:** ECAL very-front-end electronics: Floating Point Pre-Amplifier complex

The data transfer between the ECAL and HCAL TPG and the L1 regional trigger crates will take place at 1.2 GBaud using the Vitesse 7216 serial link transceiver chip on both the transmission and reception ends. Both HCAL and ECAL TPGs will use the same interface. The connection between the TPG sub-systems and the regional trigger system uses a copper cable with a maximal length of 20 meters. Provision should be made at both ends to easily connect and disconnect the shielding to ground.

Each serial link will transmit twenty-four bits of serial data every 25 ns period. The format of the data, for every link is described in Subsection 4.8.1. The data on the serial links across the entire detectors, ECAL and HCAL, shall arrive together for each bunch crossing to within one bunch crossing interval. However within each bunch crossing interval, the relative phase of the serial links may not be aligned. The correction for this phase will be done on the regional trigger side.

The TTC system delivers to the TPG the 40 MHz clock as well as the LHC control signals. There are four independent partitions for the ECAL TTC sub-system (2 for the barrel and 2 for the endcap) and six partitions for the HCAL TTC sub-system (see Chapter 16).

**Fig. 4.2:** ECAL sub-module very-front-end electronics

The HCAL input signal is shown in Fig. 4.3. This analog signal is conditioned using a multi-range current splitter and gated integrator, the QIE (Q for charge, I for integrating, and E for range encoding) ASIC (see Fig. 4.4). The outputs of this ASIC are 2 bits of range information and a 5 bit analog level corresponding to the integrated charge on the encoded range. After conversion of the analog level by an ADC, the digitized result is in a seven bit pseudo floating-point format with 2 bits of range (or exponent) and 5 bits of charge (or mantissa). A separate digital ASIC is needed to control and synchronize the QIE channels and to transfer the results over a high speed link to the trigger and data acquisition electronics.

**Fig. 4.3:** HCAL input pulse

**Fig. 4.4:** HCAL QIE ASIC

The front end system operates continuously and synchronously with the accelerator radio frequency time structure, 25 ns between beam crossings. Operations are completely controlled by an external clock. A digital value for the energy deposited in every calorimeter channel for each 25 ns interval is transmitted from the front end electronics to the trigger and data acquisition electronics. This 40 MHz clock is provided by a sophisticated distribution and receiver system and includes a synchronization marker which occurs once per orbit of the beam. The marker occurs during the gap in the collider fill reserved for the beam abort function and is used in conjunction with a bunch counter to provide an important check of data validity.

A serial fieldbus is used for communication with and control of the front end systems. Monitor and alarm functions are provided for photodetector high voltages and currents, electronics low voltages and currents, synchronization validity, and temperature values. This fieldbus is also the pathway for exercising test and diagnostic functions, downloading control and parameter data, and selecting operational modes.

Signal processing functions required of the HCAL readout electronics chain during colliding beam operations can be summarized as follows:

- Analogue signal conditioning of photodetector responses.
- Digitization of conditioned analogue signals at the beam crossing rate of 40 MHz.
- Transmission of digitized values from the detector at 40 MHz.
- Linearization and conversion of front end results into deposited energy values at 40 MHz.
- Generation and transmission of filter-extracted first level trigger information at 40 MHz.
- Pipeline storage of linearized energy values during the first level trigger decision interval at 40 MHz.
- Buffering of linearized time samples at first-level trigger accept rate.

All of the 40 MHz signal processing operations at the very front end of the system are synchronous with accelerator operations and are phase locked to the beam crossings. The higher levels of the readout system operate at L1A rate and are decoupled from the synchronous, pipelined front ends by a set of derandomising buffers.

### 4.1.4 Testing Requirements

The TPG sub-system performs the following test functions:

- a) interconnections test with boundary scan
- b) internal pattern generation at data input, propagation at 40 MHz and storage at the output
- c) self-trigger generation

### 4.1.5 Upgradability or Flexibility Requirements

The algorithms that extract the Fine Grain Veto bit in each trigger tower are subject to future modifications. In order to allow these algorithms to evolve, care should be taken to make provision for future upgrades of the hardware related to this feature.

The flexibility and upgradability reside in the heavy use of FPGAs throughout both the ECAL and HCAL systems.

## 4.2 System Overview

The Trigger Primitive Generation sub-system is located in the CMS electronics room at the underground level. The TPG is housed in the same crates with the ECAL and HCAL readout electronics systems.

The ECAL readout and trigger system will be housed in 60 crates in total. Each ECAL crate will serve 68 trigger towers representing 1700 crystals in the barrel and a variable number of crystals in the endcap. The signals of subsets of four trigger towers are processed by a single board receiving a maximum of 100 digital signals from the ECAL very-front-end. In the barrel case these four trigger towers are in the same  $\eta$  slice. Each board receives signals from the TTC system, and a TTCrx chip recovers the clock as well as the trigger related information.

The four TPG outputs from one board are grouped in pairs and sent to the regional trigger over two serial links, as described in Subsection 4.1.3.

A schematic representation of the TPG sub-system in the ECAL readout and trigger system is shown in Figure 4.5. The figure shows the digital signals from the very front-end, the optical to electrical conversion with its demultiplexing stage, the trigger primitive generation and synchronization, and the serial link transmission to the regional trigger.

Figure 4.6 shows the functional description of the TPG hardware. From the left hand side of this figure we have:

- a) input signals from the 25 crystals after their demultiplexing.
- b) five linearizers (Subsection 4.3.6) and adders (Subsection 4.5.2) blocks of five channels each. The main function of each block is to build the strip signals which are used to extract the Fine Grain Veto bit (Subsection 4.6.1).
- c) five amplitude filters performing the filtering of the strip signals in order to extract the amplitude of these signals (Subsection 4.5.1).
- d) five peak finders flagging the bunch crossing which corresponds to the bunch collision responsible for the signals (Subsection 4.4.1).
- e) one single trigger cell processor combining information for the whole trigger tower. The total transverse energy (Section 4.5) and the Fine Grain Veto bit (Section 4.6) are computed using the filtered strip signals.

**Fig. 4.5:** The TPG part of the ECAL readout and trigger system shown for two trigger towers.

Histograms in the bottom part of the same figure show the combined effect of the amplitude and time filters on each individual strip signal.



**Fig. 4.6:** TPG functional diagram for one trigger tower

HCAL will have 24 VME crates of receiver and TPG cards. Figure 4.7 shows the HCAL TPG crates as envisioned. Data from the front end QIE devices are transmitted over the same fiber links as used in the ECAL, to HCAL Trigger and Receiver (HTR) cards. These cards use FPGAs to produce trigger primitive data which are transmitted to the L1 trigger over the same Vitesse links as used in ECAL (see Fig. 4.8). The HTR cards also buffer the data in a 40 MHz pipeline waiting for the L1 decision. L1A causes the appropriate data to be moved to derandomizing buffers before their subsequent transfer to the DCC.



**Fig. 4.7:** HCAL readout and trigger crate

## 4.3 Information from the Calorimeter

### 4.3.1 Trigger Tower Definition

Each trigger tower consists of two input channels: in the endcap and barrel regions the two channels consist of the electromagnetic (ECAL) and hadronic (HCAL) compartments. In all of the barrel and most of the endcap regions, the size of the trigger tower is  $\Delta\eta = 0.0870$  by  $\Delta\phi = 0.0873$ . At high  $\eta$  in the endcap region, the towers must become larger in both directions. Figure 3.4 indicates the nominal tower segmentation of one half of the CMS detector in the  $\eta$  direction.



**Fig. 4.8:** The TPG part of the HCAL readout and trigger system shown for two trigger towers.

For all ECAL barrel trigger towers, a total of 25 crystals are summed to form a single trigger tower ECAL channel. In the ECAL endcap region the trigger towers have a variable number of crystals, never exceeding 25, as already described in Subsection 3.3.1.

### 4.3.2 Data Formats for ECAL and HCAL

Each 25 ns the ECAL TPG receives from the ECAL front-end a 16-bit data word for each electronics channel. There are 76832 electronics channels in total. This 16-bit word is composed as follows:

- a 2-bit word G which encodes the gain selected by the FPPA
- a 12-bit word D which is the sampled and digitized voltage value delivered by the ADC
- 1 bit flagging the data types. A zero value for this bit indicates that the front-end is sending physics data (during normal operation mode) while a one value indicates that the front-end is sending monitoring data (during temperature and dark current measurement phase).
- 1 bit is reserved for future use

Together, the G and D digital values encode the energy deposited in the crystal. The full dynamic range is 1.5 TeV in the barrel detector and 3.0 TeV in the endcaps.
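The word composition listed above can be illustrated with a minimal sketch. The text does not specify the position of each field inside the 16-bit word, so the bit layout used here (D in bits 0-11, G in bits 12-13, the data-type flag in bit 14 and the reserved bit in bit 15) is purely an assumption, as are the names `EcalWord`, `pack_ecal_word` and `unpack_ecal_word`.

```python
from typing import NamedTuple

# ASSUMED bit layout (not given in the text):
#   bits 0-11: D (ADC value), bits 12-13: G (gain code),
#   bit 14: data-type flag (0 = physics, 1 = monitoring), bit 15: reserved.
D_MASK = 0xFFF
G_SHIFT, G_MASK = 12, 0x3
TYPE_SHIFT = 14
RESERVED_SHIFT = 15


class EcalWord(NamedTuple):
    d: int               # 12-bit digitized sample
    g: int               # 2-bit gain code selected by the FPPA
    is_monitoring: bool  # True when the front-end sends monitoring data
    reserved: int        # bit reserved for future use


def pack_ecal_word(d: int, g: int, is_monitoring: bool = False, reserved: int = 0) -> int:
    """Build a 16-bit front-end word from its fields (hypothetical layout)."""
    if not 0 <= d <= D_MASK or not 0 <= g <= G_MASK or reserved not in (0, 1):
        raise ValueError("field out of range")
    return (d | (g << G_SHIFT) | (int(is_monitoring) << TYPE_SHIFT)
            | (reserved << RESERVED_SHIFT))


def unpack_ecal_word(word: int) -> EcalWord:
    """Split a 16-bit front-end word into its fields (hypothetical layout)."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("expected a 16-bit word")
    return EcalWord(
        d=word & D_MASK,
        g=(word >> G_SHIFT) & G_MASK,
        is_monitoring=bool((word >> TYPE_SHIFT) & 1),
        reserved=(word >> RESERVED_SHIFT) & 1,
    )
```

With this layout a word flagged as monitoring data can be recognized from a single bit before any energy computation is attempted.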

This 16-bit word is complemented by 4 control bits in order to form a 20-bit word following the HP G-link protocol convention. The 20 bits are multiplexed and transported to the TPG via an 800 MHz serial link.

For the HCAL, there are approximately 13,000 physical towers, divided into front and back (relative to produced particles) towers. For most of the HCAL, trigger towers are constructed by combining these front and back towers into a single trigger tower. For the overlap regions (near  $\theta=53$  degrees) as many as 5 towers can be combined into a single trigger tower. The total number of trigger towers delivered to the trigger system is approximately 4,200. Every 25 ns the front-end sends a 7-bit pseudo floating point word, corresponding to 2 bits of range and 5 bits of significant digits, to the receiver crate. Transmission is also via 20-bit frames in encoded mode, where 16 bits are data and 4 bits are used for control, balancing and synchronization in the usual way. The 16 data bits comprise two 7-bit channels plus 2 bits of “capacitor id”, which identify the particular QIE sample and hold capacitor used (there are 4 to accommodate the 25 ns crossing). The dynamic range for the HCAL in the barrel and endcap is determined by the requirement that the detector respond reliably to minimum ionizing particles at the low end and to 3.5 TeV at the high end, equivalent to 15 bits of dynamic range. In the outer compartments, the full scale is 1 TeV since these regions are essentially “tail catchers”. This is equivalent to 14 bits of dynamic range. In the forward region, where 1 photoelectron constitutes 400 MeV, the dynamic range is set to 12 bits.

### 4.3.3 Synchronization of the Trigger Tower Channels

Differences in the length of the optical fibers transmitting the front-end data result in different arrival times for front-end data generated by the same proton-proton collision. This could prevent the TPG from properly computing the trigger primitives. The demultiplexed data are therefore aligned inside the TPG using programmable delays before the computational functions are performed. Differences of  $\pm 1$  bunch crossing period in arrival time from channel to channel can be corrected using registers.

For the HCAL, signals will arrive into asynchronous FIFOs which will have the derived HP G-link byte clock on the input and the TTC 40 MHz system clock on the output. This will ensure synchronization within the cards. Programmable delays, calculated *in situ* during commissioning, will determine card-to-card phase differences for transmission to the L1 trigger. However, since each HTR card will receive the same TTC clock, we expect these card-to-card phase differences to be minimal.

### 4.3.4 Suppression of Bad Channels

Each electronics channel identified as a permanently bad channel (dead channel in the front-end electronics or dead serial transmission link) is disabled by propagating a zero value for the corresponding channel in the TPG computation pipeline.

During the demultiplexing phase performed by the TPG at reception of the digital data from the ECAL or HCAL front-end, detection of a synchronization loss results in the disabling of the corresponding channel inside the TPG. Any synchronization loss is logged for monitoring by the control system.

### 4.3.5 Zeroing of Channels during Monitoring

During the reception of monitoring data (measurement of temperature or leakage current) from the ECAL front-end a zero value is propagated to the TPG system. ECAL laser pulses generated during the abort gap are not sent to the regional trigger system.

The HCAL front-end data, on the other hand, will simply be the result of the QIE operation, without regard to the source. Monitoring data will appear during run-taking only in the abort gap, and will be dealt with differently from real physics data. At this time, the HCAL group anticipates that this will be handled in firmware. Such monitoring data (from laser pulses, etc.) will not be sent to the regional trigger system.

### 4.3.6 Linearization of ECAL Data and Scale Transformation

From the 14-bit energy code formed by the 2-bit word G and the 12-bit word D, the TPG extracts the corresponding transverse energy. This value is given by  $E_T = \alpha(G)[D - \beta(G)]$ , where the multiplicative coefficient  $\alpha(G)$  is the gain factor (determined by the value of the digital code G) and the subtracted value  $\beta(G)$  is an offset that is also a function of the digital code G. The values of the multiplicative coefficient are chosen such that the conversion from the input energy scale to the transverse energy scale and the channel calibration are done at the same time as the linearization of the encoded 14-bit word.
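As an illustration of the linearization formula, the following sketch builds one set of coefficients per gain code and applies  $E_T = \alpha(G)[D - \beta(G)]$ . The mapping of the G code to the FPPA gains, the energy per ADC count, the pedestal value and the helper names are all assumptions made for the example; the actual constants would come from calibration.

```python
import math

FPPA_GAINS = (1, 5, 9, 33)  # gains quoted in Subsection 4.1.3; the code-to-gain order is ASSUMED


def make_coefficients(gev_per_count_gain1, sin_theta, calibration=1.0, pedestal=0.0):
    """Return (alpha, beta) tuples indexed by the gain code G.

    alpha folds together the gain correction, the channel calibration and the
    projection onto the transverse plane (sin(theta)); beta is the offset.
    All input values are hypothetical.
    """
    alpha = tuple(calibration * sin_theta * gev_per_count_gain1 / gain
                  for gain in FPPA_GAINS)
    beta = (pedestal,) * len(FPPA_GAINS)
    return alpha, beta


def linearize(d, g, alpha, beta):
    """Transverse energy for ADC value d and gain code g: alpha(G) * (D - beta(G))."""
    return alpha[g] * (d - beta[g])


# Example: 0.5 GeV per count at gain 1, crystal at theta = 90 degrees, pedestal of 200 counts.
alpha, beta = make_coefficients(0.5, math.sin(math.pi / 2), pedestal=200.0)
et_low_gain = linearize(202, 0, alpha, beta)   # 2 counts above pedestal at gain 1
et_high_gain = linearize(266, 3, alpha, beta)  # 66 counts above pedestal at gain 33
```

Both example samples correspond to the same transverse energy, which is the purpose of applying a gain-dependent coefficient.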

### 4.3.7 Linearization of HCAL Data and Scale Transformation

From the 7-bit pseudo floating point value (plus 2 cap id bits) supplied by the QIE, the HTR card will use a lookup table that will take the 9 bit number as address and produce a 16 bit energy. The front and back towers are added together and the bunch crossing is determined. This energy will then be associated with that bunch crossing, and a muon MIP window is applied to form the muon bit. The energy is then transformed to transverse energy via a lookup table, and sent to the L1 trigger system.

## 4.4 Bunch Crossing Assignment

The signal as it is delivered by the preamplifier of the ECAL front-end has a peaking time of the order of 70 ns with a FWHM equal to 115 ns for a total duration of approximately 15 periods of 25 ns. A mathematical model of the signal shape is represented in Figure 4.9.

The Calorimeter Trigger system expects time-deconvoluted signals from the TPG. This means that the effects of the time evolution of the signals have to be reversed. In principle, this time deconvolution makes it possible to assign a particular signal to the proton-proton collision responsible for the development of the shower in the ECAL detector.

Performing this deconvolution at the level of the TPG makes it possible, in particular, to register the LHC bunch time structure in a very simple way. This is of paramount importance for the ease of synchronization of the TPG data sent to the Regional Trigger.

In the case of the HCAL, the light signal from the scintillator has a time constant of approximately 11.3 ns. The impulse response of the photodetector and differences in the optical path further stretch the signal. By the time it arrives at the QIE for integration, the signal can be found in 3 separate time buckets. The amount in each time bucket is somewhat programmable through the front-end phase delay, optimized for pileup considerations (discussed below). A mathematical model of the signal as it arrives at the QIE is presented in Figure 4.10.

### 4.4.1 Principles of the Bunch Crossing IDentification

Deconvolution filters are very easy to synthesize when the signal shape is known. In the frequency domain the calorimeter signal is of low-pass type, and consequently the deconvolution filter must be of high-pass type. In this circumstance the high frequency electronics noise is magnified. This well known effect prevents a deconvolution filter from performing properly.

**Fig. 4.9:** Analytical model of the ampli-shaper signal output given by the ECAL very-front-end.

**Fig. 4.10:** HCAL pulse model

We propose to design a system using two complementary filters: one filter for optimal amplitude extraction in the presence of electronics noise and one filter looking for the maximum of the shaped signal. The time filtering function performed by this peak finder filter is referred to as the Bunch Crossing IDentification (BCID) function in the rest of this document, since this filter identifies the bunch crossing responsible for the appearance of the signals in the detectors.

### 4.4.2 Functional Description of the BCID

The BCID filter consists of a peak finder that takes as input the output values of the energy filter described in Subsection 4.5.1. The peak finder continuously examines a set of three consecutive input samples and makes two comparisons in parallel, between the middle sample value and each of the two others. It outputs 1 when the middle sample value is greater than both of the others; otherwise it outputs 0.
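The behaviour of the peak finder can be sketched in software as follows. This is only a behavioural model: the function name is hypothetical, the first and last samples are given a 0 flag by assumption (they have no neighbour on one side), and the one-clock delay that a hardware implementation needs before the later sample is available is not modelled.

```python
def peak_finder(samples):
    """Return one flag per sample: 1 if it is strictly greater than both neighbours."""
    flags = [0] * len(samples)
    for i in range(1, len(samples) - 1):
        # The two comparisons are made in parallel in hardware; here they are
        # simply combined with a logical AND.
        if samples[i] > samples[i - 1] and samples[i] > samples[i + 1]:
            flags[i] = 1
    return flags


# A filtered pulse peaking on the third sample.
bcid = peak_finder([0, 1, 3, 2, 0])
```

Because the comparison is strict, two equal consecutive maxima produce no flag in this sketch; how such ties are resolved in the hardware is not stated in the text.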

## 4.5 Energy Sums

Sums are performed at different levels of the TPG: building of the virtual strips signals in the ECAL case or summation of the front and back stacks in the HCAL case and computation of the total transverse energy in the trigger tower. This section will describe the principles used to extract the signal amplitude from a set of consecutive samples of the digital sample values of the electrical signal delivered by the front-end.

### 4.5.1 Principles of Energy Filtering

The amplitude filters are realized using a sliding weighted sum of five consecutive samples. The weight values are optimized for the signal shape shown in Figure 4.9 for the ECAL case and in Figure 4.10 for the HCAL case. They are computed so as to take into account the characteristics of the electronics noise, and in particular the autocorrelation of the noise values between two distant time samples.

Possible dynamic fluctuations of the baseline are also taken into account using the first weight for a “measurement” of this baseline.

Typical values for the ECAL weights are -0.96, -0.22, 0.62, 0.42, 0.14. Variations of the noise autocorrelation function induce variations on the values of the optimal weight. The most sensitive weight is the third one and its value can change by a factor 3 for extreme variations of the noise characteristics.
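A minimal floating-point sketch of such a sliding weighted sum, using the typical ECAL weights quoted above, is given below. The function name is hypothetical, the weights are applied from the earliest to the latest sample of each window (an assumption consistent with the first weight acting on the baseline), and the fixed-point arithmetic of the real filter is not modelled.

```python
ECAL_WEIGHTS = (-0.96, -0.22, 0.62, 0.42, 0.14)  # typical values quoted in the text


def amplitude_filter(samples, weights=ECAL_WEIGHTS):
    """Sliding weighted sum: one output per complete window of len(weights) samples."""
    n = len(weights)
    return [
        sum(w * s for w, s in zip(weights, samples[i:i + n]))
        for i in range(len(samples) - n + 1)
    ]


# A constant baseline is cancelled because the weights sum to zero.
flat = amplitude_filter([10.0] * 8)
```

Since the five typical weights add up to zero, any constant pedestal is removed from the filter output, which is how the baseline "measurement" enters the amplitude estimate.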

For ECAL, all amplitude filters will have 18-bit input buses and 18-bit output buses. This preserves the full dynamic range of the analog signal delivered by the APD and prevents any saturation effect. This is mandatory for the proper operation of the peak finders in performing the bunch crossing identification. Saturation effects in the signals would completely destroy the ability of the TPG to flag correctly the bunch crossing in which the signals originated.

For the HCAL, the front-end phasing will be adjusted so that the energy will appear in 3 buckets with the ratios: 0.47, 0.47, 0.06 for the earliest to latest. The values in the previous two buckets will be used as a “before” signal for subtraction. The algorithm will therefore be:

$$E_{\text{HCAL}} = -1.5\,(E_1+E_2) + (E_3+E_4+E_5)$$

where  $E_1$  and  $E_2$  are the early (“before”) signals,  $E_3$  and  $E_4$  each contain roughly 47% of the energy, and  $E_5$  contains roughly 6%.

### 4.5.2 ECAL Adder Tree Structure

Several adder trees are necessary to compute the different sums used by the TPG. These are:

- a) adder trees to compute the strip signals. All have five 18-bit input buses and 18-bit output buses for the reason explained previously. In case of overflow anywhere inside the tree, the output is set to the maximum value allowed for an 18-bit number.
- b) adders to compute the sum of two adjacent strip signals. These are 12-bit input and 12-bit output adders. In case of overflow, the output value is saturated to the maximum value allowed for a 12-bit number.
- c) adder trees to compute the total transverse energy. Computations are done with 12-bit values as input, and truncation to 10 bits is done before sending this value to the LUT performing the non-linear transformation (Subsection 4.5.3).
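The saturation behaviour common to these adders can be sketched as follows. The function name is hypothetical and the inputs are assumed to be non-negative integers; the sketch models only the overflow handling, not the tree structure or its timing.

```python
def saturating_sum(values, bits):
    """Add non-negative integers, clamping at the largest value representable on `bits` bits."""
    max_value = (1 << bits) - 1
    total = 0
    for value in values:
        # Clamp after every addition, so that an overflow at any stage
        # of the sum forces the final output to the maximum value.
        total = min(total + value, max_value)
    return total


strip_signal = saturating_sum([1000, 2000, 3000, 4000, 5000], bits=18)  # no overflow
strip_pair = saturating_sum([4000, 200], bits=12)                        # saturates
```

For non-negative inputs, clamping at each stage gives the same result as clamping the exact sum once, which is why a simple loop is sufficient here.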

### 4.5.3 Non Linear Transformation of the Energy

The total transverse energy as computed by the ECAL and HCAL TPGs is encoded in an 8-bit word using a non-linear scale. This function will be implemented using a LUT. Internally, the transverse energy as computed directly by the adder trees will be truncated to a 10-bit integer (in case of overflow the value will saturate at the 10-bit maximum). Examples of this non-linear transformation of the transverse energy scale are given in Figure 4.11.

## 4.6 Fine Structure Bits

One important task of the calorimeter TPGs is to analyze the fine structure of the energy deposit in the detector cells forming trigger towers. The algorithm is applied to the energy in individual cells inside trigger towers, taking profit of the fine granularity of the CMS calorimeters, in order to better characterize, at the trigger level, the nature of the particles interacting in the detector.

In the ECAL case, the transverse energy profile inside the tower is used to flag energy deposits that are not compatible with electrons or photons. In the HCAL case, the longitudinal energy profile is used to flag MIPs. The result of both algorithms are single bits, the Fine Structure Bits, that are transferred to the Regional Trigger together with the tower energy sums.

### 4.6.1 Algorithms to extract the Fine Structure Bits

In each ECAL trigger tower the lateral energy deposition is described with a single bit called the Fine Grain Veto bit.

**Fig. 4.11:** Curve showing the output of the LUT versus the transverse energy of the trigger tower for the 7-bit and 8-bit cases.

In the barrel a trigger tower is virtually divided into sub-regions called strips, which are sets of crystals aligned along the magnetic field bending direction. Each strip has 5 crystals and a trigger tower comprises 5 strips in total. In order to flag an electron or photon shower as seen by the ECAL, the algorithm relies mainly on the fact that the lateral extension of the shower is very narrow and the energy is concentrated in very few crystals (about 80% of the energy of an electron is contained in a single crystal). Due to bremsstrahlung for electrons or conversion for photons, the shower is split in the ECAL, and efficient collection of the total energy of the particle as generated at the vertex requires summing the energy of the emerging clusters along the bending direction. The spread along the other direction (due to the variable ECAL entry point) is taken into account by summing over two adjacent strips.

The computation is done using the following steps:

- a) compute the energy released in each pair of adjacent strips (there are 4 such pairs in a trigger tower)
- b) find the strip pair with the maximum energy
- c) compute the total energy released in the trigger tower
- d) compute R, the ratio of the maximum strip pair energy and the total energy
- e) compare the ratio R to a threshold
- f) set the Fine Grain Veto bit using the result of this comparison

Depending on the tower transverse energy range, two R thresholds are used. In the high energy domain (higher than 5-10 GeV) the threshold is optimized to guarantee an efficiency for electrons and photons above 95% (typical value is 0.90). In the low energy domain, a higher threshold (typically 0.95) is used to trigger on low  $p_T$  electrons from b-quark decays, which requires only moderate efficiency but a high rejection power.
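The barrel computation steps and the two-threshold scheme can be sketched as follows. The function names are hypothetical. The polarity of the bit (1 when R is below the threshold, i.e. when the deposit is not compatible with an electron or photon), the 7.5 GeV boundary between the two energy domains and the treatment of empty towers are assumptions made for the example; only the typical thresholds 0.90 and 0.95 come from the text.

```python
def strip_pair_ratio(strips):
    """R = (largest sum of two adjacent strips) / (total tower energy) for 5 strip energies."""
    if len(strips) != 5:
        raise ValueError("a barrel trigger tower has 5 strips")
    total = sum(strips)
    if total <= 0:
        return 0.0
    pairs = [strips[i] + strips[i + 1] for i in range(4)]  # the 4 adjacent pairs
    return max(pairs) / total


def fine_grain_veto(strips, et_boundary=7.5, thr_high_et=0.90, thr_low_et=0.95):
    """Return 1 if the lateral profile is NOT compatible with an electron or photon."""
    total = sum(strips)
    if total <= 0:
        return 0  # empty tower: nothing to veto (assumption)
    threshold = thr_high_et if total >= et_boundary else thr_low_et
    return int(strip_pair_ratio(strips) < threshold)


narrow = fine_grain_veto([0.0, 0.0, 9.5, 0.5, 0.0])  # compact shower
broad = fine_grain_veto([2.0, 2.0, 2.0, 2.0, 2.0])   # energy spread over the tower
```

A compact deposit contained in two adjacent strips gives R close to 1 and is not vetoed, while a uniform deposit gives R = 0.4 and sets the bit.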

Subregions equivalent to the strips of the barrel cannot be defined in the endcap trigger towers. Another approach has been developed but it relies also on the fact that an electromagnetic shower is always a compact object compared to a trigger tower.

As an example the following algorithm is described:

- a) find the crystal with the maximum energy inside the trigger tower
- b) find the first neighbor crystal (West, East, North and South) with the next maximum energy
- c) sum the two values of energy as found by the previous steps
- d) compute the total energy released in the tower
- e) compute the ratio R as in the barrel case
- f) compare this ratio with a threshold (preliminary studies showed that a value of 0.80 gives a good efficiency for flagging electromagnetic showers in the endcap)
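The endcap example algorithm can be sketched in the same spirit. Here a tower is represented as a dictionary mapping hypothetical integer crystal coordinates `(ix, iy)` to energies, which accommodates towers of irregular shape. Restricting the neighbour search to crystals of the same tower and setting the bit when R is below the threshold are assumptions; the 0.80 threshold is the preliminary value quoted above.

```python
def endcap_ratio(tower):
    """R = (hottest crystal + its hottest direct neighbour) / (total tower energy)."""
    total = sum(tower.values())
    if total <= 0:
        return 0.0
    # Step a): crystal with the maximum energy.
    (ix, iy), e_max = max(tower.items(), key=lambda item: item[1])
    # Step b): most energetic of the four direct neighbours (West, East, South, North).
    neighbours = [(ix - 1, iy), (ix + 1, iy), (ix, iy - 1), (ix, iy + 1)]
    e_next = max(tower.get(position, 0.0) for position in neighbours)
    # Steps c) to e): sum the two, then divide by the tower total.
    return (e_max + e_next) / total


def endcap_fine_grain_veto(tower, threshold=0.80):
    """Return 1 if the deposit is not compact enough to be an electromagnetic shower."""
    if sum(tower.values()) <= 0:
        return 0  # empty tower (assumption)
    return int(endcap_ratio(tower) < threshold)


compact = {(0, 0): 8.0, (0, 1): 1.0, (1, 0): 0.5, (5, 5): 0.5}
spread = {(0, 0): 1.0, (2, 2): 1.0, (4, 4): 1.0, (6, 6): 1.0}
```

In the `compact` example the two selected crystals carry 9 of the 10 energy units, so R = 0.9 and the bit is not set.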

### 4.6.2 Generation of ECAL Fine Grain Bit

As described in the previous subsection, the Fine Grain Veto bit algorithm requires the computation of the ratio of the signal of the strip pair with maximum signal ( $E_T^{\max}$ ) to the total signal in the trigger tower ( $E_T^{\text{tot}}$ ). This ratio is then compared to a threshold (T). As the division operation is time consuming, we replace the two previous steps by the equivalent mathematical operation: sign of ( $E_T^{\max} - T \times E_T^{\text{tot}}$ ).

In the actual implementation the two energies are truncated to 10 bits and the value of T is represented by a 7-bit number. Normalization is done by a simple right shift of seven places. In this way the threshold value can be changed in steps of 1/128.
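A minimal integer sketch of this division-free comparison is shown below. The function name is hypothetical; applying the right shift to the product of the threshold and the total energy, and treating equality as a pass, are assumptions about details the text leaves open.

```python
def ratio_passes_threshold(et_max, et_tot, t7):
    """Division-free test of et_max / et_tot >= t7 / 128.

    et_max, et_tot: energies already truncated to 10 bits (0..1023).
    t7: 7-bit threshold (0..127) in units of 1/128.
    """
    if not (0 <= et_max <= 1023 and 0 <= et_tot <= 1023 and 0 <= t7 <= 127):
        raise ValueError("inputs exceed their bit widths")
    scaled_total = (t7 * et_tot) >> 7  # T x E_tot, normalized by a 7-place right shift
    return et_max - scaled_total >= 0  # only the sign of the difference is needed


# 115/128 is the 7-bit value closest to the typical threshold of 0.90.
passes = ratio_passes_threshold(et_max=900, et_tot=1000, t7=115)
fails = ratio_passes_threshold(et_max=890, et_tot=1000, t7=115)
```

With a 7-bit threshold the largest programmable value is 127/128, so the typical thresholds of 0.90 and 0.95 are both representable to within one step of 1/128.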

### 4.6.3 Generation of HCAL feature bit

Before the energy is converted to transverse energy, but after the front/back tower summation, a muon window will be applied to assist in finding minimum ionizing muons in the HCAL. If the energy of the HCAL trigger tower is within this window, a feature bit will be set. The lower and upper bounds of the minimum ionization window are 1.5 and 2.5 GeV.
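The window test amounts to a pair of comparisons on the summed tower energy, as in the sketch below. The function name is hypothetical and the treatment of the window edges as inclusive is an assumption.

```python
def mip_bit(tower_energy_gev, low=1.5, high=2.5):
    """Return 1 if the summed (front + back) tower energy lies inside the MIP window."""
    return int(low <= tower_energy_gev <= high)


# Front and back compartments are summed before the window is applied.
bit = mip_bit(0.9 + 1.1)
```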

## 4.7 Synchronization and Latency

In this section we discuss the synchronization of the calorimeter trigger primitives and of the calorimeters readout chain up to the output of the L1 pipelines. It includes the synchronization of the calorimeters signals with the clock, the synchronization of the calorimeter channels at the input of the trigger primitive generators and the overall synchronization of the trigger primitives. Estimates of the latency from the proton-proton collision up to the input of the regional trigger are also given.

### 4.7.1 Synchronization Principles

The Calorimeter Trigger is a synchronous and pipelined system working at 40 MHz. In order to improve the trigger latency some sectors of the trigger chain work at multiple frequencies, 80 and 160 MHz. At the input of every processing stage the data has to be synchronized and should belong to the same bunch crossing.

The alignment of the data (trigger links and DAQ pipelines) is based on the identification of the LHC bunch structure (see Chapter 17). Histograms of the bunch crossing number for events with energy (per channel or group of channels) above a threshold are used for this purpose. Empty bins in the histogram should correspond to the gaps of the LHC beam structure [4.8].

The histograms are incremented at LHC frequency using dedicated synchronization circuits (SyncTx/Rx, see Section 4.7.6) in the ECAL and HCAL readout and trigger primitive boards. Each circuit receives the output of the TPG for two trigger towers and increments the histogram if one of the two towers is above a given energy threshold. The bunch crossing number is reset by the TTC fast command BC0, synchronous with the beginning of each LHC orbit. Masking all but one detector channel feeding a particular TPG allows measuring differences in time alignment on a channel basis. The histogramming energy threshold is applied after L1 filtering so that the histograms show sharp gap boundaries.

The content of the accumulator (histogram) is accessed by the crate CPU where the correlation function between data and the expected bunch profile is computed allowing the monitoring of the time alignment. Local misalignments (at board level) are compensated by a corresponding number of programmable steps in individual synchronization pipeline registers at the input stage of the readout and trigger boards. Global misalignments are compensated in the SyncTx/Rx circuits.
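The correlation step performed by the crate CPU can be sketched as a circular cross-correlation between the accumulated histogram and the expected bunch profile. This is only an illustration under stated assumptions: the function name is hypothetical, the expected profile is taken to be 1 for filled bunches and 0 for gaps, and a brute-force search over all shifts stands in for whatever method the crate software actually uses.

```python
def best_alignment_shift(histogram, expected_profile):
    """Return the delay (in bunch crossings) that best aligns histogram to profile.

    Both sequences have one entry per bunch crossing of the orbit. The score for a
    trial shift is the circular correlation of the shifted histogram with the
    expected filling pattern; the shift with the highest score is returned.
    """
    n = len(expected_profile)
    if len(histogram) != n:
        raise ValueError("histogram and expected profile must have equal length")
    best_shift, best_score = 0, None
    for shift in range(n):
        score = sum(
            histogram[(i + shift) % n] * expected_profile[i] for i in range(n)
        )
        if best_score is None or score > best_score:
            best_shift, best_score = shift, score
    return best_shift


# Toy orbit: 8 filled crossings followed by a 4-crossing gap.
expected = [1] * 8 + [0] * 4
# Observed histogram: same structure, delayed by 3 crossings.
observed = [5 * expected[(j - 3) % 12] for j in range(12)]
measured_delay = best_alignment_shift(observed, expected)
```

A non-zero result corresponds to the number of programmable steps that would have to be compensated in the synchronization pipeline registers.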

The threshold for histogram incrementing is set at 1 GeV, in order to guarantee efficient clock tick assignment by the L1 Filter. At this threshold the crystal occupancy is around  $10^{-5}$  for minimum bias collisions. Requiring 10 counts per bin in the histogram and assuming one collision per crossing ( $L=10^{33} \text{cm}^{-2}\text{s}^{-1}$ ) we estimate that the determination of the alignment constants for the ECAL channels will take about 2 hours of beam time.

The histogramming method can tolerate some level of noise from collisions during the gaps, although this increases the time needed to accumulate significant statistics.

### 4.7.2 Distribution of Clock, L1A, Reset and BC0

The distribution of the Clock, L1A, Reset and BC0 signals to the calorimeter readout and trigger primitive boards uses the TTC system (see Chapter 16). The ECAL TTC system is divided into four independent partitions, two barrel and two end-caps, each one driven by one TTCvi module [4.20]. The HCAL TTC system is divided into six partitions (two barrel, two end-caps and two forward). HCAL will use the same TTC distribution system as ECAL.

Optical splitting of the TTC signal is used to feed each calorimeter readout and trigger primitives crate. Electrical fan-out is used to feed the modules in each crate (Figure 4.12). One TTCrx chip [4.21] is used per board.

All TTC distribution fibers and cables in the counting room will have the same length, so that the path from a TTC distribution source to the TTCrx chips is identical. The length of the TTC distribution fibers is of the order of 10 m, and within each crate the TTC distribution cables are about 1 m long. The goal is to synchronize the L1A and BC0 commands arriving at all modules in the system to better than 1 ns.

In the ECAL case, the distribution of the clock and control signals to the Very Front-End (VFE) electronics is done per group of ten crystals using fiber pairs (one clock, one control) linking the ECAL readout and trigger modules (ROSE100) and the VFE. In the HCAL case, the TTC signals are distributed to the detector electronics by TTCrx radhard chips located in the very front-end.

The detector signals, after amplification, must be synchronized with the sampling 40 MHz clock. The phase of the clock relative to the detector signals has to be adjusted in order to provide optimal signal sampling. The readout pipeline-derandomizer in the calorimeter readout boards allows reading sequences of samples of programmable length (called time frame). A suitable pulse parameterization is fitted to the time frame for each individual channel to extract the phase between pulse maximum and the clock. Fine deskewing of the clock allows adjusting the phase so that one sampling occurs at pulse maximum.

**Fig. 4.12:** Layout of the TTC distribution in the crates.

The measurement is made with pulses of energy above a few GeV in order to be insensitive to noise. Based on the expected  $\pi^0$  rate, we estimate that at very low luminosity ( $10^{31} \text{cm}^{-2}\text{s}^{-1}$ ) the time needed to accumulate about 1000 pulses per crystal of transverse energy above 3 GeV is of the order of 5-6 hours.

A local trigger in each calorimeter upper-level readout and trigger board, based on the trigger towers energy sums, allows selecting the interesting events for readout by the crate CPU. The data is processed locally in the crate CPU, new clock deskewings are estimated, stored in the database and loaded in the electronics. During normal data taking, spying events collected by the crate CPU are used to monitor the phase stability.

The deskewing of the Clock, L1A and BC0 signals is programmable in the TTCrx circuits.

### 4.7.3 Synchronization of the Detector Links

In this section we discuss the synchronization of the detector links that carry data from the detector very front-end to the counting room. By link synchronization we mean the synchronization of the data frames, after deserialization, with the 40 MHz clock.

These links are used to transfer digital data from the detector to the counting room. The transfer is synchronous at 40 MHz. The links have to guarantee that after transmission the frames are correctly synchronized with the 40 MHz clock. Losses of link synchronization can occur and they must be identified and flagged by the receiver circuit. In order to recover synchronization the transmitter shall send a sequence of known patterns that are recognized by the receiver.

After de-serialization, the data frames are synchronized with the local clock and eventually delayed by a programmable number of clock cycles to achieve inter-channel alignment.

The synchronization protocol between the Serializer and Deserializer is a sub-set of the HP G-Link protocol, the Conditional Invert Master Transition protocol. The frame structure consists of 16 bits of data (D-Field) and 4 bits of control (C-Field), with the Master Transition always present at the same place in the frame.

Link synchronization is obtained during a special SYN operation phase. The deserializer receives fill frames FF1(a) FF1(b) sent by the serializer upon reception of a synchronization command (through the control link). The synchronization procedure is fully automated. It is initiated by the ROSE (for ECAL) or HTR (for HCAL) board controller upon reception of a VME command or a request from a deserializer that loses synchronization. On the ROSE boards, the synchronization procedure can be applied to groups of ten links while the remaining links work in normal mode. On the HTR card, the synchronization will be available on a link by link basis. For both systems, during the link synchronization procedure the deserializer sets a flag that accompanies the data in the readout pipeline.

The deserializer uses the control bits (C-Field) to recognize link synchronization losses. Two consecutive errors put the link in the out-of-sync state: the NSYN bit is asserted and a synchronization procedure is initiated. The ROSE and HTR board controllers count the number of sync losses per link. The content of the counters is checked periodically by the crate CPU. Links with too high a synchronization loss rate are disabled.
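The loss-of-synchronization logic described above can be modelled with a small state machine. The sketch below is illustrative only: the class and method names are hypothetical, and the assumption that a single good frame clears the consecutive-error count is not taken from the text.

```python
class LinkSyncMonitor:
    """Toy model of per-link sync-loss detection and counting."""

    def __init__(self, errors_to_lose_sync: int = 2):
        self.errors_to_lose_sync = errors_to_lose_sync  # two consecutive errors
        self.consecutive_errors = 0
        self.nsyn = False           # out-of-sync flag (NSYN bit)
        self.sync_loss_count = 0    # counter read periodically by the crate CPU

    def receive_frame(self, control_field_ok: bool) -> bool:
        """Process one frame and return the current NSYN state."""
        if self.nsyn:
            return True  # stay out of sync until the SYN procedure completes
        if control_field_ok:
            self.consecutive_errors = 0
        else:
            self.consecutive_errors += 1
            if self.consecutive_errors >= self.errors_to_lose_sync:
                self.nsyn = True
                self.sync_loss_count += 1
        return self.nsyn

    def resynchronized(self) -> None:
        """Called when the synchronization procedure has completed."""
        self.nsyn = False
        self.consecutive_errors = 0


monitor = LinkSyncMonitor()
# An isolated error is tolerated; two errors in a row trigger NSYN.
states = [monitor.receive_frame(ok) for ok in (False, True, False, False)]
```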

The serializer can be operated in a special mode where a link identifier is transmitted. This feature will be used to check the interconnections between the detector and the counting room.

#### 4.7.4 Synchronization of Trigger Primitives

The trigger primitives are generated in the same boards where the DAQ pipelines are located (calorimeter readout and trigger boards). Both share the same system of bunch crossing identification using beam data as described in Section 4.7.1.

Without beam, the synchronization of the trigger primitives is checked:

1) With test patterns generated synchronously at the input of the readout/trigger boards, upon reception of a TTC broadcast command, measuring the location of the test pattern in the trigger synchronization FIFO (see Section 4.7.6);

2) With laser pulses distributed synchronously to a given detector region, measuring the location of the laser pulse in the same way.

The output of the trigger primitive generator is itself stored in DAQ pipelines. Methods similar to those used to synchronize detector data are used to synchronize the trigger primitive pipelines (see Section 4.7.5).

The trigger primitives transmitted to the regional trigger are globally aligned at the output of the trigger synchronization circuits SyncTx/Rx (Section 4.7.6). An appropriate setting of the BC0 command guarantees that for every orbit the first word written in the synchronization FIFO in the circuit corresponds to data from bunch crossing zero. A common synchronized BC0 signal starts the FIFO reading at each new orbit synchronously in all trigger channels. This common signal is distributed by the TTC system.

The trigger primitives are transmitted to the regional trigger through copper serial links. With programmable periodicity the serializer enters Sync mode, sending synchronization patterns during the Abort Gap. When a loss of synchronization is identified by the deserializer, a bit is set which causes the corresponding trigger channel to be masked. Masking is also applied if the EDC decoder recognizes a frame error. Synchronization is recovered in the next re-sync cycle.

The Gap Flag is part of the 24-bit data frames that are sent every crossing to the Regional Trigger (see Section 4.8.1). This Flag is used by the Regional Trigger to check the proper alignment of the trigger primitives.

#### 4.7.5 Synchronization Setting-up, Monitoring and Recovery

The synchronization of the TTC distribution system is verified in order to guarantee that the fast commands, in particular BC0 and L1A, arrive synchronously to all readout and trigger modules in the ECAL and HCAL systems.

The synchronization between L1A and the pipeline data is first established at the level of the electronics chain using test patterns. This procedure is repeated for all channels, in order to check that the calorimeter trigger latency (counted from the input of the readout/trigger boards) is the same irrespective of the trigger geographical origin.

The next step is to measure the latency of the detector optical fibers using the laser monitoring system. The laser pulses maintain a constant phase relative to the clock within a few nanoseconds. A single laser pulse is distributed to a sub-section of the detector and in parallel the corresponding trigger signal is distributed by the TTC system. Analyzing the location of the laser pulse in the readout pipelines allows measuring differences between the propagation times on the detector fibers in that sub-section.

The synchronization procedures with beam follow three steps: setting the clock phase, alignment of all channels to the BC0 reference and synchronization of L1A with pipeline data [4.19].



**Fig. 4.13:** Programmable delays in the ECAL/HCAL readout and trigger system used for synchronization.

The programmable delays available to the ECAL/HCAL readout and trigger primitives are the following (Figure 4.13):

- a) Clock phase fine deskewing in the TTCrx chips (per group of 100 channels)
- b) Channel synchronization pipelines (per channel)
- c) Trigger synchronization FIFOs in the Sync chips (per trigger tower pair, see Section 4.7.6)
- d) Deskewing of L1A, BC0 and broadcast commands in the TTCrx chips (per group of 100 channels)
- e) Delay, relative to the LHC orbit signal, of BC0 and other fast commands in the TTCvi module (per partition)

The synchronization of the detector links is continuously monitored by the deserializer circuit. If the link de-synchronizes, the circuit sets a flag that accompanies the data in the readout pipeline. This flag stays up until synchronization is recovered, so that all data produced during that interval are flagged. In parallel the board controller is warned to initiate a synchronization procedure. A command is sent to the VFE which instructs the serializer to send synchronization patterns during a fixed interval of time. This procedure is expected to take less than 10  $\mu\text{s}$  and can be executed during data taking without the need for a full system reset.

The bunch crossing assignment is continuously monitored with the bunch profile histograms accumulated on the readout/trigger boards. The monitoring is done by groups of 50 channels in the ECAL and groups of 4 channels in the HCAL (two trigger towers). Histograms with about 1000 events per bin are transferred every 5-10 minutes to the crate CPU, which analyzes their content to check the bunch crossing assignment. For redundancy, similar histograms are built by the CPU using spying events. Assuming a 10 kHz spying rate and a coarser monitoring (one histogram per 100 channels) it will be possible to accumulate one such histogram (with ten times less statistics) per day ( $L=10^{33} \text{ cm}^{-2}\text{s}^{-1}$ ). The database stores the status of bunch crossing assignment with a period of 5-10 minutes. The recovery from bunch crossing misassignments implies an adjustment of the BC0 deskewing at the level of the TTCrx circuits concerned. This operation requires a system reset.

#### 4.7.6 The Trigger Synchronization Circuit

The synchronization of the trigger data is achieved with a dedicated circuit, the SyncTx/Rx circuit [4.11], placed between the trigger primitive generators (transmitter side) and the regional trigger (receiver side). The circuit is divided in two main blocks. The SyncTx block gets data from the TPG and is responsible for flagging the bunch zero data and for the accumulation of the bunch profile histogram. The SyncRx block contains the synchronization FIFO (64x24 bits), monitors the synchronization errors and sends data to the Regional Trigger. EDC encoding functions are also provided. The circuit includes a Control block which performs the I/O control functions and executes Boundary Scan commands as well as a Built-In Self-Test (BIST).

The input data bus has 24 bits, as does the output data bus. The Gap Flag output is set to zero during the gap and to one during the remaining orbit periods. The circuit is operated with two clocks, the TxClock synchronous with the input data, and the RxClock that is a common clock to all trigger primitive modules (same phase). The TTC command lines deliver the needed fast commands (BC0 Tx/Rx) to the circuit. The control lines are used for circuit programming and for readout of the accumulator or FIFO content. The circuit is also equipped with JTAG boundary scan lines.

The SyncTx/Rx circuit implements two operation modes: the normal Synchronization Mode and a special Test Mode. In Test Mode, several options are available, namely the generation of counter data or constant data, the FIFO filling mode and the transparent mode. The FIFO can work in Sync Mode, driven by the external BC0 commands, or in Pipeline Mode, with a pre-programmed delay. In Sync Mode, the delay is determined by the BC0 Tx/Rx commands and is available in the FIFO Latency Register. The behavior of the Synchronization FIFO is independent of the relative phase between RxClk and TxClk.

The accumulator is a RAM block of 3564 words of 10 bits each. When the BC0 command is received, the accumulator address is reset and the accumulator logic prepares to start operating in the next clock period. Every clock period, two masked 8-bit data words of the input data bus (tower energy values) are compared to a programmable threshold. If one of the energies is above the threshold, the content of the accumulator at the current address is incremented. Then the current accumulator address is updated. When the count at a given accumulator address reaches the maximum, further increments at that address are inhibited. Clear or Read accumulator commands can be issued at any moment without disturbing the main data path.

**Fig. 4.14:** Block Diagram of the SyncTx/Rx circuit.
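The accumulator behavior can be modelled in a few lines of Python. This is a sketch under stated assumptions, not the circuit itself: the class and method names are hypothetical, BC0 is taken to reset the address immediately (the one-clock start-up described in the text is ignored), and the address is wrapped with a modulo purely as a software safeguard.

```python
ORBIT_LENGTH = 3564          # one word per bunch crossing of the LHC orbit
MAX_COUNT = (1 << 10) - 1    # 10-bit words saturate at 1023


class BunchProfileAccumulator:
    """Toy model of the SyncTx bunch profile histogram."""

    def __init__(self, threshold: int, depth: int = ORBIT_LENGTH):
        self.threshold = threshold   # programmable energy threshold
        self.counts = [0] * depth
        self.address = 0

    def bc0(self) -> None:
        """BC0 command: reset the accumulator address."""
        self.address = 0

    def clock(self, tower0_energy: int, tower1_energy: int) -> None:
        """One clock period: compare both 8-bit tower energies to the threshold."""
        if tower0_energy > self.threshold or tower1_energy > self.threshold:
            # Further increments are inhibited once the word is full.
            if self.counts[self.address] < MAX_COUNT:
                self.counts[self.address] += 1
        self.address = (self.address + 1) % len(self.counts)

    def clear(self) -> None:
        """Clear command: zero all words without touching the address."""
        self.counts = [0] * len(self.counts)


# Three-crossing toy orbit repeated 1030 times: bins 0 and 2 saturate.
acc = BunchProfileAccumulator(threshold=4, depth=3)
for _ in range(1030):
    acc.bc0()
    acc.clock(10, 0)
    acc.clock(0, 0)
    acc.clock(4, 5)
```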

A prototype implementation of the SyncTx/Rx circuit was developed and successfully tested, using the XILINX XC4020 FPGA. All the functionality represented in Figure 4.14 was included. However, due to the limited capacity of the FPGA, the accumulator memory was limited to 297 words of 10 bits. The larger RAM size available in recent FPGAs will allow the accumulator size to be increased in the final version to 3564 words, corresponding to the full LHC orbit.

In order to increase the testability of the trigger system, we have included Built-In Self-Test (BIST) capability in the SyncTx/Rx circuit. The execution of the BIST uses the Boundary Scan test path. Boundary Scan is a special type of scan path with a register added at every I/O pin of the IC with the important benefit of allowing fault isolation at component level. BIST allows the test of the IC without the need to load complex data patterns and without the need to analyze individual circuit output. In BIST operation mode, the circuit automatically generates test patterns and compresses its outputs. These compressed outputs, called “signature”, are shifted out when the test ends and then compared with the fault-free circuit signature.

#### 4.7.7 Latency

The latency from the collision time to the input of the Regional Trigger receiver cards is estimated at 46 bunch crossings. The details of this latency are described in Table 4.1.

**Table 4.1:** Latency of the calorimeter trigger primitive generation.

| Item                                  | Latency in prototype (bunch crossings) | Latency in final version (bunch crossings) |
|---------------------------------------|-------------------------|--------------------------------|
| Time of flight to ECAL                | 0.5                     | 0.5                            |
| ECAL Very Front-End                   | 4.5                     | 4.5                            |
| Optical Link (90 m)                   | 18                      | 18                             |
| O/E and deserializer                  | 1                       | 1                              |
| Linear, BCID, Energy sums, Fine grain | 22                      | 15                             |
| SyncTx/Rx                             | 2                       | 2                              |
| Vitesse serializer                    | 1                       | 1                              |
| Copper Link (20 m)                    | 4                       | 4                              |
| Total                                 | 54                      | 46                             |
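As a cross-check of the final-version column, the short sketch below sums the individual contributions and converts the result to nanoseconds using the 25 ns crossing period. The dictionary and function names are hypothetical; the numbers are those of Table 4.1.

```python
BUNCH_CROSSING_NS = 25  # LHC crossing period

# Final-version latencies from Table 4.1, in bunch crossings.
FINAL_LATENCY_BX = {
    "Time of flight to ECAL": 0.5,
    "ECAL Very Front-End": 4.5,
    "Optical Link (90 m)": 18,
    "O/E and deserializer": 1,
    "Linear, BCID, Energy sums, Fine grain": 15,
    "SyncTx/Rx": 2,
    "Vitesse serializer": 1,
    "Copper Link (20 m)": 4,
}


def total_latency_bx(items: dict) -> float:
    """Sum the per-stage latencies, in bunch crossings."""
    return sum(items.values())


def bx_to_ns(bunch_crossings: float) -> float:
    """Convert a latency in bunch crossings to nanoseconds."""
    return bunch_crossings * BUNCH_CROSSING_NS


final_total_bx = total_latency_bx(FINAL_LATENCY_BX)
final_total_ns = bx_to_ns(final_total_bx)
```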

## 4.8 Interface to Regional Trigger

### 4.8.1 Data Frame Format

Twenty-four bits of serial link data are transmitted every 25 ns crossing period, per trigger link. The format of the data, for every link (two towers data per link) is:

- <7:0> Eight bit compressed transverse energy, tower 0 (lower  $\phi$ ), bit <0> is LSB
- <8> Tower characterization bit, tower 0
- <16:9> Eight bit compressed transverse energy, tower 1 (higher  $\phi$ ), bit <9> is LSB
- <17> Tower characterization bit, tower 1
- <22:18> Five bit inverted Hamming code
- <23> Gap flag (bunch crossing data valid flag)

The least significant byte is transmitted first in time over the serial link.

Towers 0 and 1 are adjacent in  $\phi$  at the same slice of  $\eta$ . Tower 0 is at lower detector  $\phi$  in the CMS coordinate system.

The tower characterization bit for the ECAL is the fine-grain veto used for electron identification. The tower characterization for HCAL is a minimum ionizing flag used for muon identification.

The five-bit inverted Hamming code  $c_{0..4}$  is computed from the data bits  $d_{0..17}$  and the gap flag  $d_{23}$  as follows:

$$\begin{aligned}
 c_0 &= d_{18} = \text{NOT } (d_0 \oplus d_1 \oplus d_3 \oplus d_4 \oplus d_6 \oplus d_8 \oplus d_{10} \oplus d_{11} \oplus d_{13} \oplus d_{15} \oplus d_{17}) \\
 c_1 &= d_{19} = \text{NOT } (d_0 \oplus d_2 \oplus d_3 \oplus d_5 \oplus d_6 \oplus d_9 \oplus d_{10} \oplus d_{12} \oplus d_{13} \oplus d_{16} \oplus d_{17}) \\
 c_2 &= d_{20} = \text{NOT } (d_1 \oplus d_2 \oplus d_3 \oplus d_7 \oplus d_8 \oplus d_9 \oplus d_{10} \oplus d_{14} \oplus d_{15} \oplus d_{16} \oplus d_{17}) \\
 c_3 &= d_{21} = \text{NOT } (d_4 \oplus d_5 \oplus d_6 \oplus d_7 \oplus d_8 \oplus d_9 \oplus d_{10} \oplus d_{23}) \\
 c_4 &= d_{22} = \text{NOT } (d_{11} \oplus d_{12} \oplus d_{13} \oplus d_{14} \oplus d_{15} \oplus d_{16} \oplus d_{17} \oplus d_{23})
 \end{aligned}$$

where the  $\oplus$  symbol is the exclusive or (XOR) operator. The net XOR is inverted so that when all input bits, including the gap flag, are zero, all the error detection code bits are one. The gap flag is included in the error code even though it is used only for synchronization purposes and is not considered data.
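The frame layout and the parity equations above can be exercised with a short Python sketch. The function names and argument checks are assumptions made for illustration; the bit positions and the parity input lists are copied from the format and equations given in this section.

```python
# Input bit positions of each code bit c0..c4, copied from the parity equations.
CODE_INPUTS = (
    (0, 1, 3, 4, 6, 8, 10, 11, 13, 15, 17),
    (0, 2, 3, 5, 6, 9, 10, 12, 13, 16, 17),
    (1, 2, 3, 7, 8, 9, 10, 14, 15, 16, 17),
    (4, 5, 6, 7, 8, 9, 10, 23),
    (11, 12, 13, 14, 15, 16, 17, 23),
)


def hamming_code(frame: int) -> int:
    """Compute the five inverted parity bits from bits 0..17 and 23 of a frame."""
    code = 0
    for k, inputs in enumerate(CODE_INPUTS):
        parity = 0
        for bit in inputs:
            parity ^= (frame >> bit) & 1
        code |= (parity ^ 1) << k  # NOT of the net XOR
    return code


def build_frame(et0: int, flag0: int, et1: int, flag1: int, gap_flag: int) -> int:
    """Pack two towers, their characterization bits and the gap flag into 24 bits."""
    if not (0 <= et0 < 256 and 0 <= et1 < 256):
        raise ValueError("compressed transverse energies must fit in 8 bits")
    if any(b not in (0, 1) for b in (flag0, flag1, gap_flag)):
        raise ValueError("flags must be 0 or 1")
    frame = et0 | (flag0 << 8) | (et1 << 9) | (flag1 << 17) | (gap_flag << 23)
    return frame | (hamming_code(frame) << 18)


def frame_is_consistent(frame: int) -> bool:
    """Receiver-side check: recompute the code and compare with bits 22..18."""
    return (frame >> 18) & 0x1F == hamming_code(frame)


example = build_frame(0x5A, 1, 0xC3, 0, 1)
```

With this construction any single flipped bit in the 24-bit frame leaves the received code inconsistent with the recomputed one, so it is detected by `frame_is_consistent`.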

The Gap flag is a level defined as follows:

- It is set to zero during the injection gap (the effective length of the gap is programmable and can be smaller than the machine gap in order to allow for tests during the gap)
- It is set to one during the rest of the LHC orbit.

Bunch crossing zero is defined to be the first crossing after the gap which has colliding beams. This flag will be used on the receiving side in the regional trigger to verify the synchronization of the data on the serial links on a per-crossing per-link basis. In case of mis-synchronization, a system error flag will be generated; the individual bits will not be retained in the dataflow after verification.

#### 4.8.2 ECAL Interface to Regional Trigger

The Synchronization and Link Board (SLB) is a transition board between the ECAL Readout and Trigger cards (ROSE100) on one side, and the Regional Trigger and the ECAL Data Concentrator Card (DCC) on the other [4.18]. The main purpose of the transition board is to house both the trigger and readout output links on a small ROSE100 daughter board. In this way the maintenance of the link components is independent of the main ROSE100 board and future upgrades of the link technology are possible at reasonable cost. The same principle is used for the input detector links in the ROSE100 board. The trigger synchronization circuits are included in the SLB because they provide a useful tool for independent testing of the trigger links. In the DAQ path, dedicated FIFOs are included for testing the links to the DCC.

In summary the functions of the SLB are the following:

- a) Synchronization of the trigger data and the selective readout (SR) flags;
- b) Serial transmission of the trigger data to the Regional Trigger;
- c) Transmission and reception of the selective readout flags (SR) to/from the DCC;
- d) Fast transmission of the ECAL readout data to the DCC.

The SLB is a 3U-size board connected to the RJ2 connector of VME64 crates. The communication with the ROSE motherboard is made through the user pins of the P2/J2 connectors. The communication with the DCC is made through point-to-point cable connections using the LVDS standard.

Figure 4.15 shows a block diagram of the SLB board. We can divide the SLB board in four main blocks, the SLB Controller, the Trigger path, the Selective Readout path and the DAQ path. Here we are mainly concerned with the Trigger path.

The SLB receives, from the ROSE motherboard, the Trigger Bus, the Selective Readout Input Flags, the DAQ Bus, the TTC bus, the TTC clock, the Serial Protocol Bus and the JTAG Bus.

The SLB transmits to the DCC the Selective Readout Input Flags and the DAQ Bus, and transmits to the Regional Trigger the trigger data. The SLB receives from the DCC the RxClock, the RxBc0 signal and the Selective Readout Output Flags. The SLB transmits to the ROSE motherboard the Selective Readout Output Flags. The DAQ Bus is accessible in a Spy Connector.

The SLB Controller controls all the operations to be performed on the SLB Board. It includes the TTC command decoder, the Serial Protocol slave and a Control and Configuration block. The TTC command decoder receives the TTC Broadcast Bus and decodes the TTC commands (Start, Stop, BC0 and Reset). The Serial Protocol is used for control communications with the ROSE motherboard.

Due to the large number of bits (48 bits + grounds) needed to transmit all the data between the two modules (the ROSE100 and the SLB), and in order to use only the pins available on the VME P2/J2 connector, Channel Link technology is used to convert the 48-bit parallel TTL data bus into a 9-pair LVDS Channel Link bus. This allows connecting both cards using a differential high-speed technology with good immunity to noise, while reducing the number of pins needed for interconnections.



**Fig. 4.15:** SLB Block Diagram.

### 4.8.3 HCAL Interface to Regional Trigger

The HCAL will use the same interface to the Regional Trigger as the ECAL.

### 4.8.4 The Trigger Link

The data transfer between the calorimeters TPG and the L1 Regional Trigger crates will take place at 1.2 Gbaud using the Vitesse 7216 serial link transceiver chip on both transmission and reception ends. The selection of the Vitesse chip is provisional, and may change depending on commercial developments.
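The quoted line rate can be related to the 24-bit frame of Section 4.8.1 with a short calculation. The payload figure follows directly from the frame size and the crossing frequency; the factor 10/8 assumes 8b/10b line coding on the serial link, which is not stated in the text and is used here only as a plausible reading of the 1.2 Gbaud figure:

$$24\ \text{bits} \times 40\ \text{MHz} = 960\ \text{Mbit/s}, \qquad 960\ \text{Mbit/s} \times \frac{10}{8} = 1.2\ \text{Gbaud}$$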

Both HCAL and ECAL TPGs will use the same interface. Note that the cable to be used has not been decided; provision should be made at both ends to easily connect or disconnect the shielding to ground.

## 4.9 Simulation Results

### 4.9.1 Software Tools

Dedicated Monte-Carlo programs have been developed to study the ECAL TPG performance. The first one simulates the showering of an electron or photon in a simplified geometry of the ECAL. It is able to reproduce the main characteristics of the electromagnetic showers inside crystals of lead tungstate. A model of the very-front-end has also been developed in order to simulate the principal functions of the very front-end (shaping, amplification, sampling as performed by the FPPA and digitization as performed by the ADC). Special care has been taken to reproduce the electronics noise generated by the very-front-end elements and in particular the effect of correlation of the values of the electronics noise at different sampling times which is due to the limited bandwidth of these devices. The output of this first program consists of sets of time frames of 32 bunch crossing depth for each crystal inside a trigger tower. Each value uses the same code (12-bit D-word and 2-bit G-word, see Subsection 4.1.3) as produced by the FPPA and the ADC.

A second program, describing the hardware of the TPG in great detail, uses as input the data generated by the first one. All computations are done in integer representation with the actual number of bits used in each stage of the TPG hardware. For the time being the simulation has been performed only for the barrel case.

### 4.9.2 Performance of the Bunch Crossing Assignment

We generated several sets of 500 events with the energy of the showering particle varying between 0.5 and 50 GeV. Intrinsic fluctuations due to the 2% photostatistics term and to lateral leakage lead to a resolution on the energy deposited in a trigger tower which is shown in Figure 4.16. This resolution curve is well described by a resulting stochastic term of 3% and a constant term of 0.55%.

Several types of internal structure of the TPG hardware were compared. The simplest one uses only one peak finder per trigger tower. The bunch crossing efficiency curve (open points) versus the energy of the incident particle is shown in Figure 4.17. This very simple internal structure is unable to detect particles with energy lower than 2.5 GeV with full efficiency. The case with one energy filter coupled to one peak finder per trigger tower already gives good performance (open squares) but does not reach full efficiency for showers with an energy of 0.5 GeV, which are mainly used in the isolation computation and in the online monitoring of synchronization as defined in Section 4.7.5.

**Fig. 4.16:** Intrinsic electron/photon shower energy resolution.

Finally, the proposed architecture as described in Sections 4.4 and 4.5 has the best performance and corresponds to the curve with full squares.

Effects of pileup on the bunch crossing identification of a fixed signal of 5 GeV were studied. The 5 GeV value was adopted because it corresponds to the equal sharing of a 10 GeV electron or photon shower between two adjacent trigger towers.

A pileup signal with variable energy in the range 0 to 7 GeV has been superimposed on this fixed signal with a separation of two bunch crossings (upper part of Figure 4.18). The bunch crossing identification efficiency for the fixed signal is the curve with squares. The figure shows that if the pileup signal amplitude is lower than 1.8 GeV, the TPG sub-system is able to detect the fixed signal with almost full efficiency in the right bunch crossing.

Increasing the pileup signal amplitude starts confusing the TPG sub-system which detects a false signal between the two superimposed signals until the pileup signal amplitude reaches a value of 5.5 GeV for which the TPG sub-system only detects the pileup signal.



**Fig. 4.17:** Efficiency curve of bunch crossing identification for different TPG filtering architectures versus particle energy

The effect of a pileup signal at a distance of only one bunch crossing is shown in the lower part of the same figure. This is the most difficult situation to be handled by the TPG. In this extreme case the bunch crossing identification works well until the pileup signal amplitude reaches 2.4 GeV.

The main conclusion of this study is that in all cases the bunch crossing assignment of a fixed signal with 5 GeV amplitude remains unaffected for pileup signal amplitudes lower than 1.5 GeV.

### 4.9.3 Transverse Energy Resolution and Linearity

The TPG architecture allowing the best performance in terms of bunch crossing identification efficiency has been evaluated also in terms of the resolution on the measurement of the transverse energy. In this architecture each strip signal is filtered by an amplitude filter and the resulting filtered signal is time deconvoluted by a peak finder. The transverse energy of the trigger tower is then computed using the five amplitude and time filtered strip signals. All the computations performed for the Fine Grain Veto bit are also performed using the same filtered strip signals.

For electron and photon showers the energy resolution measured by the TPG hardware has a predicted stochastic term of 3% and a constant term of 0.55%. The last term entering the resolution function is due to the electronics noise. It is 170 MeV in this study, where a 30 MeV equivalent electronics noise has been used for each individual crystal electronics channel. This value is slightly different from the ideal value of 150 MeV.
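The three terms can be combined in a small numerical sketch. Two assumptions are made here that the text does not state explicitly: the terms are added in quadrature in the usual calorimeter parameterization, and the ideal noise of 150 MeV is read as the uncorrelated sum of 25 channels (5 strips of 5 crystals) at 30 MeV each. The function names are hypothetical.

```python
import math


def tower_noise_mev(per_channel_noise_mev: float, n_channels: int) -> float:
    """Noise of a sum of uncorrelated channels (assumed quadrature addition)."""
    return per_channel_noise_mev * math.sqrt(n_channels)


def relative_resolution(
    energy_gev: float,
    stochastic: float = 0.03,    # 3% stochastic term
    constant: float = 0.0055,    # 0.55% constant term
    noise_gev: float = 0.170,    # 170 MeV noise term from the TPG simulation
) -> float:
    """sigma(E)/E assuming the three terms add in quadrature."""
    if energy_gev <= 0:
        raise ValueError("energy must be positive")
    return math.sqrt(
        (stochastic / math.sqrt(energy_gev)) ** 2
        + constant ** 2
        + (noise_gev / energy_gev) ** 2
    )


ideal_noise = tower_noise_mev(30.0, 25)   # 25 channels is an assumption
resolution_20_gev = relative_resolution(20.0)
```

Under these assumptions the relative resolution at 20 GeV comes out close to 1.2% and falls with increasing energy as the noise and stochastic terms shrink.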



**Fig. 4.18:** Bunch crossing identification efficiency of a 5 GeV signal versus pileup signal energy (distance of the two maxima : 2 bunch crossings in the top part and 1 bunch crossing in the bottom part of the figure).

Figure 4.19 shows two curves concerning this resolution on the energy measurement.

The curve with circles corresponds to an “ideal measurement” where effects of lateral leakage are not taken into account and where no electronics noise has been introduced. The curve with squares corresponds to the predicted curve for the proposed TPG architecture.

The measurement performed by the TPG hardware has a resolution of 1% for electrons or photons with energy greater than 20 GeV.

Non-linearities of the TPG response, due to truncations and to inefficiency in detecting low amplitude signals, are predicted by this Monte Carlo study. These non-linearities can be as high as 7% for a 5 GeV electron or photon shower, as shown in Figure 4.20. Negligible effects are predicted for transverse energies greater than 10 GeV.

### 4.9.4 Resolution on the Fine Grain Variable

Figure 4.21 shows the distribution of the ratio of the energy of the strip pair with the maximum amplitude to the total energy in the trigger tower. The shaded histogram was obtained from the values of these two quantities generated by the Monte Carlo program for incident particles of 10 GeV. The same variable as computed by the trigger cell processor of the TPG is superimposed (non-shaded histogram) on the same figure. The quantization effects due to the use of integer arithmetic are clearly visible for this variable.



**Fig. 4.19:** Resolution of  $e/\gamma$  shower energy versus the particle energy: the curve with dots is the resolution of the deposited energy; the curve with squares is the resolution of the TPG.

## 4.10 Prototypes and Tests

### 4.10.1 ECAL Results

A prototype of the trigger and digital processing electronics for the electromagnetic calorimeter of the CMS experiment, coupled to a prototype of the PbWO<sub>4</sub> crystal calorimeter, was tested during summer 1996 in the H4 beamline at the CERN SPS.

The main goals of the test reported here were to validate the concepts used in the digital signal processing on data from the high precision CMS crystal calorimeter, both for triggering and data acquisition purposes. A front-end trigger and readout system with the basic functionality needed to process the CMS electromagnetic calorimeter data was put into operation at the LHC clock frequency in a test beam environment. A matrix of 6 by 6 PbWO<sub>4</sub> crystals, corresponding to one barrel trigger tower, as foreseen in the CMS Technical Proposal, was equipped with prototypes of the FERMI readout system [4.1], consisting of a dynamic non-linear analog compressor, a 10-bit sampling ADC, a linearizer look-up table and a readout pipeline memory. In the trigger chain, individual channel samples were summed in strips of six channels and then filtered in order to extract the energy and timing information from the sampled pulses, using circuits provided by the FERMI collaboration [4.2]. The strip energy data were processed by a prototype of the TPG, which computes the trigger tower energy sum and the variable characterizing the transverse profile of the electromagnetic shower. The trigger data were stored in pipeline memories for readout by the data acquisition system. The system operated in synchronous and pipelined mode at 40 MHz clock frequency, and data were collected using electron beams with energies ranging from 15 to 150 GeV.

**Fig. 4.20:** Mean energy of  $e/\gamma$  showers as measured by the TPG versus the particle energy

The FERMI prototype readout electronics was implemented in 6U VME modules, each processing three channels. Each FERMI channel is composed of a compressor ASIC, a custom-designed 10 bit sampling ADC and a digital channel ASIC performing the linearization and memory functions.

The compressor performs a non-linear transformation, compressing the signal according to an approximately piecewise linear transfer function, producing on the output a signal in the 0-2 V range. This circuit forms a sum of the outputs from four linear amplifiers with gains of approximately 18.5, 1.3, 0.14 and 0.09 and upper cutoffs on the input voltage of 60, 470, 1200 and 2000 mV respectively. The compression functions of different circuits are similar but not identical, so that the functions have to be computed individually for all channels.

In this test, the three channels in a FERMI board are added sample by sample using a dedicated adder ASIC and the result is available at clock frequency at the board front panel. This three channel sum is the first step in the computation of the CMS calorimeter trigger primitives.

The outputs of two FERMI boards, corresponding to a strip of six crystals, are fed into the filter module. After summation of the two inputs, the time profile of the strip energy pulse is filtered in the Filter ASIC. The circuit performs the amplitude filtering as described in Subsection 4.5.1 on a window of six consecutive samples of the input pulse. In parallel, a peak finder selects the maximum sample of the amplitude filter output and inhibits the output of the other samples. The latency of the filter circuit is eight clock periods, including six clocks for data input. In Fig. 4.22 we show three different events collected with the filter programmed in different ways.

**Fig. 4.21:** Fine Grain Veto bit variable for 10 GeV particles.



**Fig. 4.22:** Three different events collected with the following filter programming: a) neutral filter, all filter coefficients identical to zero except one; b) filter performing pedestal subtraction and energy estimation; c) the same as b with peak selection.

The trigger primitive generator system (TPG) receives input from six filter modules, which correspond to the energy values in the six crystal strips. The inputs are coded as 10-bit integers with an LSB value of 250 MeV in order to cover a 255 GeV dynamic range.
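The relation between word width, LSB value and dynamic range can be checked with a short sketch. The function names are hypothetical, and the choice of truncation with saturation at full scale is an assumption of this example rather than a documented property of the prototype.

```python
LSB_GEV = 0.25  # least significant bit value quoted in the text
N_BITS = 10     # input word width quoted in the text


def full_scale_gev(n_bits=N_BITS, lsb_gev=LSB_GEV):
    """Largest energy representable by an unsigned n_bits word."""
    return (2**n_bits - 1) * lsb_gev


def encode(energy_gev, n_bits=N_BITS, lsb_gev=LSB_GEV):
    """Quantize an energy to an integer code.

    Truncation and saturation at full scale are assumptions of this sketch.
    """
    code = int(energy_gev / lsb_gev)
    return max(0, min(code, 2**n_bits - 1))


if __name__ == "__main__":
    print(full_scale_gev())  # full scale in GeV
    print(encode(50.0))      # code for a 50 GeV strip energy
```

With these numbers the full scale is 1023 × 0.25 GeV, consistent with the quoted dynamic range of about 255 GeV.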

The TPG is a set of three different VME 6U boards and a special board to connect them [4.5]: 1) the system board (SB); 2) the processor board (PB); and 3) the optical link board. In this test, only the first two boards have been used. The PB has been designed to emulate the whole functionality of the Trigger Primitive ASIC which is needed for the fine grain calorimeter trigger algorithm in CMS. The processing unit makes use of three L-Neuro 2.3 chips from Philips [4.7]. The SB controls the PB, stores in the readout FIFOs the input data as well as the results of the computations done by the PB, and handles the VME interface [4.10]. The readout FIFOs behave as a pipeline memory of programmable length (a length of 32 was used in the test). Self-test and boundary scan facilities were also implemented in the SB for test purposes.

The PB makes the summation of the six input signals and performs the fine grain algorithm as previously described.

After calibration runs, a first estimate of the decompression functions, ADC pedestals and calibration constants was made, so that look-up tables with values appropriate for the TPG on-line operation could be loaded. A simplified data analysis was used for this purpose, which nevertheless guaranteed a level of precision of a few percent, sufficient for the trigger measurements. Under these conditions, the trigger tests were performed in the situation typical at experiment setting-up, when the calibration and linearization constants are not yet well known.

Very high efficiency in the bunch crossing identification is one of the main concerns for proper operation of the trigger systems in the LHC experiments. In the tested system, this functionality is provided by the Filter ASIC, which converts the pulse samples spanning six clock periods into a single non-zero sample with a fixed phase relative to the beam crossing.

At LHC, the ADC clock is expected to be synchronous with the beam interaction with a precision better than 1 ns. This situation does not occur in the test beam, since the beam particle timing is random relative to the clock. In these conditions, the events were classified into different time jitter windows, and the time distribution of the filter output as a function of the jitter was studied. The time jitter was estimated from the derivative of the pulse shape at the maximum sample, a variable called  $\alpha$ . The variable  $\alpha$ , computed for the signal collected in the hit crystal, has a distribution almost flat in the interval -0.20 to 0.26 as shown in Figure 4.23, so that in first approximation an interval in  $\alpha$  of 0.01 corresponds to a jitter of 0.55 ns.

The results are presented in Fig. 4.24, which shows the distribution of the pipeline memory address where the filter output was recorded for different intervals of  $\alpha$ . The events selected in the  $\alpha$  interval between -0.02 and 0.02, which corresponds to a time jitter window of about 2 ns, have the filter output always at the same address. This means that, within the limited statistics available, a constant phase between the beam trigger signal and the filter output is maintained. When the jitter window becomes larger we start to observe a spread in the filter output timing.
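The conversion between $\alpha$ and time jitter used above can be reproduced with a small sketch. It assumes that the flat $\alpha$ distribution maps linearly onto one 25 ns clock period (the period of the 40 MHz clock); the function names are hypothetical.

```python
CLOCK_PERIOD_NS = 25.0               # one period of the 40 MHz clock
ALPHA_MIN, ALPHA_MAX = -0.20, 0.26   # flat range of alpha quoted in the text


def ns_per_alpha():
    """Time per unit of alpha, assuming a linear mapping over one period."""
    return CLOCK_PERIOD_NS / (ALPHA_MAX - ALPHA_MIN)


def jitter_window_ns(alpha_lo, alpha_hi):
    """Width in ns of the jitter window selected by an alpha interval."""
    return (alpha_hi - alpha_lo) * ns_per_alpha()


if __name__ == "__main__":
    print(f"0.01 in alpha ~ {0.01 * ns_per_alpha():.2f} ns")
    print(f"[-0.02, 0.02] ~ {jitter_window_ns(-0.02, 0.02):.2f} ns")
```

Under this linear assumption an $\alpha$ interval of 0.01 corresponds to roughly 0.54 ns, and the interval from -0.02 to 0.02 to roughly 2.2 ns, in line with the rounded values quoted in the text.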

The resolution of the trigger energy measurement is an important parameter since it determines the sharpness of the efficiency curves at trigger threshold. Figure 4.25 shows the resolution of the energy computed by the TPG system. For comparison, the trigger energy resolution obtained off-line with optimized decompression tables, calibration constants and filter coefficients is presented in the same plot, showing an improvement by a factor of two compared to the on-line results.



**Fig. 4.23:** Distribution of  $\alpha$  for the pulses measured in the hit crystal and for the events collected with electrons of 50 GeV

The Monte Carlo simulation results have shown that the efficiency of the fine grain algorithm is rather sensitive to the value of the threshold in the fine grain variable R (see Section 4.6.1), making the measurement of the R distribution for real electrons an important target for this test. The results are shown in Fig. 4.26. The measurement of R performed with beam data agrees reasonably with the Monte Carlo expectations. In any case the distribution sits well above the threshold value assumed in the algorithm simulations ( $R_{th}=0.89$ ) [4.6]. For this cut, the electron identification efficiency estimated with the test beam data is of the order of 99%.

**Fig. 4.24:** Time distribution of the filter output for different intervals of  $\alpha$ . The events selected in the  $\alpha$  interval between -0.02 and 0.02 (about one thousand) peak in the same clock period, showing that for a time jitter of the order of 2 ns, the efficiency of bunch crossing identification is larger than 99.9%

**Fig. 4.25:** Trigger energy resolution. The on-line results are well represented by a 9% stochastic term convoluted with a 2% constant term as shown in the plot.

## 4.11 Status and Schedule

### 4.11.1 ECAL Status and Schedule

Work for integration of the TPG functions is a continuous process which follows the progress of the technology of the field programmable gate array circuits available on the market.

A fully functional TPG system, currently under test in Lisboa and Palaiseau (see Fig. 4.5 for a layout of this system), is composed of two different boards:

- a TPG board processing 50 digital 16-bit signals
- an SLB board performing the synchronization of the Trigger Primitives and interfacing the TPG to the Calorimeter Regional Trigger

The final architecture of the calorimeter trigger primitive generator and readout system is now in development. Compared to the present prototype, a higher level of logic integration is foreseen making use of larger FPGAs.



**Fig. 4.26:** Distribution of the fine grain variable R.

The following milestones are foreseen for the ECAL TPG system:

- December 2001: final prototype of the TPG
- June 2002: start of pre-production
- January 2003: start of production
- January 2004: start of installation

### 4.11.2 HCAL Status and Schedule

The HCAL TPG system approach is to decrease the input density to the board, necessitating the construction of more boards. The gain realized is in the ability to use affordable FPGAs to do the entire processing job.

With the development of high density FPGAs with memories approaching Mbits, these HCAL boards become simpler to build. The engineering effort saved will of course be transferred to the programmable logic task; however, the engineering risk will decrease.

The project will proceed with the following milestones:

- December 2000-January 2001: demonstrator stage completed. This will consist of a demonstration of functionality in 6U or 9U VME boards, but will not consist of any demonstration of hardware implementation. The basic issues are synchronization, I/O, and connectivity.

- December 2001: prototype stage completed. This will consist of taking the demonstrator to a full hardware implementation.

- November 2002: pre-production stage. This will consist of iterating on the prototype to produce a final pre-production design of all cards.

- April 2003: end of production. These cards will be clones of the final pre-production design.

- December 2003: start of installation

## References

- [4.1] H. Alexanian et al., Nucl. Instr. Meth. A357 (1995) 306.
- [4.2] H. Alexanian et al., Nucl. Instr. Meth. A357 (1995) 318.
- [4.3] J. Varela, "Requirements for a fine grain calorimeter trigger", Proceedings of 'First Workshop on Electronics for LHC Experiments', Lisbon, 1995.
- [4.4] Ph. Busson, R. Nobrega, J. Varela, "A Compression Scheme for ECAL Trigger Primitives", CMS IN 1996/007.
- [4.5] Ph. Busson et al., Trigger Primitives Boards, CMS IN-1996/008.
- [4.6] R. Nobrega et al., The CMS e/ $\gamma$  trigger; simulation study with CMSIN data, CMS TN/96-021.
- [4.7] M. Duranton et al., L-Neuro 2.3: A VLSI for Image Processing by Neural Networks, Proceedings of MicroNeuro'96-Lausanne.
- [4.8] J. Varela, "A method for synchronization of the trigger data", CMS NOTE/1996-011.
- [4.9] CMS Calorimeter Trigger Group, "Preliminary specifications of the baseline trigger algorithms", CMS-TN-96-10.
- [4.10] A. Almeida et al., TPB System Board Technical Documentation, CMS IN-1997/013.
- [4.11] L. Berger, R. Nóbrega, J.C. da Silva, J. Varela, "Trigger synchronization circuits in CMS", proceedings of 'Electronics for LHC experiments', Oxford, Sept 97.
- [4.12] J.C. Silva, J. Varela, "Specifications of the prototype trigger synchronization Tx/Rx circuits", CMS IN 1997/009.
- [4.13] N. Leonardo, J. Varela, "Study of a Neural Approach for lower level e/ $\gamma$  calorimeter trigger in CMS", CMS NOTE 1998/081.
- [4.14] R. Benetta et al., "Beam tests of the trigger and digital processing electronics for the electromagnetic calorimeter of the CMS experiment", Nucl. Instrum. Meth. A 413 (1998) 31-42, CMS NOTE-1998/008.
- [4.15] Ph. Busson, "Digital Filtering for ECAL Trigger Primitives Generator", CMS NOTE-1999/020
- [4.16] T. Monteiro et al., "Recent developments in the CMS Calorimeter Trigger", Proceedings of Int. Conf. on Calorimetry in HEP (CALOR99) Lisbon (Portugal), 1999.
- [4.17] C. Beltrán Almeida, I. C. Teixeira, J. P. Teixeira, J. Augusto, M. Santos and N. Cardoso, J. Varela, "Testability Issues in the CMS ECAL Upper-Level Readout and Trigger System", Proceedings of 'Fifth Workshop on Electronics for LHC Experiments', Snowmass, Colorado, USA, 1999.

- [4.18] J. Varela, J. C. Silva, “Technical Specifications of the ECAL Trigger Synchronization and Link Board Prototype”, CMS IN-2000/005.
- [4.19] R. Benetta et al., “Synchronization of the ECAL data”, CMS IN-2000/006.
- [4.20] Ph. Farthouat, P. Gallno, “TTC-VMEbus Interface TTCvi”, RD12 Working Document.
- [4.21] J. Christiansen, A. Marchioro, P. Moreira, TTCrx Reference Manual, RD12 Working Document.



# 5 Regional Calorimeter Trigger

## 5.1 System Requirements

### 5.1.1 Physics Requirements

The L1 calorimeter trigger must provide triggers based on the presence of physics objects such as photons, electrons,  $\tau$ s and jets, as well as global sums of  $E_T$  and missing  $E_T$  (to find neutrinos). It must also provide additional information to the muon trigger system for isolation and minimum ionization signal identification. The  $E_T$  thresholds for each of these objects are required to be tunable so that the QCD background rates are tolerable and the efficiency for discovery physics is high, while providing a sufficient sample of control events. At high luminosity, some of these discovery modes place stringent requirements on the thresholds. Examples are Higgs ( $115 \text{ GeV}/c^2$ ) decays to two photons, charged Higgs production in association with top or W decays to electron, Higgs ( $\sim 200 \text{ GeV}/c^2$ ) decays to  $\tau$  pairs (single prong hadronic or electron decays), and SUSY sparticle ( $\sim 300 \text{ GeV}/c^2$ ) decays to multi-jet events. The required thresholds, with full efficiency, are about 50 and 25 GeV for single and double electrons, 100 and 50 GeV for single and double  $\tau$ s, and 200 and 100 GeV for single and double jets, respectively. At low luminosity reduced thresholds are needed to study lower mass particles. The trigger system must be designed such that the efficiency is measurable.

### 5.1.2 Data Acquisition Requirements

The CMS data acquisition system places a requirement on the output rate of the trigger system. The calorimeter trigger system is required to operate with a maximum output rate of 12.5 kHz while satisfying its physics requirements. In order to understand the trigger system in detail, the output of every trigger subsystem, including this one, is required to be read out by the DAQ system for every L1 trigger issued.

## 5.2 System Specifications

### 5.2.1 Input Specification

Input to the Regional Calorimeter Trigger (RCT) for each trigger tower of the CMS calorimeters consists of 8-bit non-linear  $E_T$  and a tower characterization bit. There are  $52 \eta \times 72 \phi$  trigger towers matching in size for both barrel and endcap electromagnetic (EB, EE) and hadronic (HB, HE) sections of the calorimeter. The trigger tower segmentation in the very forward calorimeter (HF) is  $2 (\text{F/B}) \times 4 \eta \times 18 \phi$ . The input signals are organized such that there are two towers' worth of data and error detection code bits per link, i.e., 18 bits of data and 5 bits of EDC. The 5-bit Hamming code generated from the 18 bits of data is sufficient to detect all single and double bit errors as well as many multiple bit errors. The present design uses transmitter and receiver links capable of handling 24 bits of information in 25 ns with a baud rate of 1.2 Gbaud. The link technology being considered for this design is based on the VSC7216 chip made by Vitesse, which has 4 full duplex channels capable of running between 1.2 and 1.3 Gbaud. We will run it at 1.2 Gbaud, transferring 24 bits of information per crossing per channel. Copper cables under 20 m length are used for these gigabit links between calorimeter front-end electronics and trigger crates. The data across all the ECAL and HCAL trigger towers are expected to be synchronized.
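The bit budget of one link can be verified with a short sketch. The check on the number of Hamming parity bits uses the standard bound $2^r \ge m + r + 1$. The 8b/10b line-encoding overhead used to relate 24 bits per 25 ns to 1.2 Gbaud is an assumption of this example, as are the function names.

```python
ET_BITS = 8                   # non-linear E_T per tower
FLAG_BITS = 1                 # tower characterization bit
TOWERS_PER_LINK = 2
EDC_BITS = 5                  # error detection code bits per link
LINK_BITS_PER_CROSSING = 24   # link capacity per 25 ns
CROSSING_NS = 25.0


def data_bits():
    """Data bits carried per link and per crossing."""
    return TOWERS_PER_LINK * (ET_BITS + FLAG_BITS)


def spare_bits():
    """Link bits left after data and error detection code."""
    return LINK_BITS_PER_CROSSING - data_bits() - EDC_BITS


def min_hamming_parity_bits(m):
    """Smallest r satisfying the Hamming bound 2**r >= m + r + 1."""
    r = 0
    while 2**r < m + r + 1:
        r += 1
    return r


def line_rate_gbaud(encoding_overhead=10 / 8):
    """Line rate in Gbaud, assuming 8b/10b encoding (an assumption here)."""
    return LINK_BITS_PER_CROSSING / CROSSING_NS * encoding_overhead


if __name__ == "__main__":
    print(data_bits(), spare_bits(), min_hamming_parity_bits(data_bits()))
    print(f"{line_rate_gbaud():.2f} Gbaud")
```

With 18 data bits the bound gives five parity bits, matching the 5-bit code described above, and leaves one of the 24 link bits unused in this accounting.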

### 5.2.2 Output Specification

The Regional Calorimeter Trigger output consists of lists of the top 4 candidates of various types, and energy sums. The candidate lists are provided separately for isolated and non-isolated electron/photons, central and forward jets, and  $\tau$ s (isolated narrow jets). Each of these candidates is specified by a 6-bit rank based on candidate  $E_T$  and by 4 or 5 bits of location information that uniquely identify the 4x4 trigger tower region to which the candidate belongs. Eighteen  $E_T$  sums covering  $|\eta| < 5$  and  $20^\circ \phi$  segments are reported. All of these data are sent to the Global Calorimeter Trigger using 80 MHz parallel differential ECL signals on 34 pair copper cables. In addition, muon quiet and MIP deposit identification bits are sent to the Global Muon Trigger system via the Global Calorimeter Trigger system.

### 5.2.3 Latency

The total latency of the level-1 trigger is set to be about 3.2  $\mu\text{s}$  or 128 crossings. The latency is expended in propagation of particles from the interaction (4 crossings), transmission of signals between the various subsystems that make up the trigger decision, and the trigger decision logic itself. A significant fraction, i.e., ~40% of the crossings, is expected to be spent just in the data transmission on cables. This estimate includes the optical link between calorimeter front-end and trigger primitives subsystems, the copper link between trigger primitives and regional trigger subsystems, the copper link between regional and Global Calorimeter Trigger subsystems, the copper link between global calorimeter and final trigger subsystems, and finally the link back to the front-end. This leaves ~70 crossings for processing in all five subsystems and contingency. About a third of this latency is allocated to the Regional Calorimeter Trigger processing, as it is used for the bulk of the data reduction process.
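The latency budget can be laid out as simple arithmetic. In this sketch "a third of this latency" is read as a third of the ~70-crossing processing budget; that reading, the rounding to whole crossings, and the function name are assumptions of the example.

```python
CROSSING_NS = 25.0      # bunch crossing interval
TOTAL_CROSSINGS = 128   # total level-1 latency quoted in the text
FLIGHT_CROSSINGS = 4    # particle propagation from the interaction
CABLE_FRACTION = 0.40   # approximate share spent on cables


def latency_budget():
    """Return the latency split in crossings (rounded, illustrative)."""
    cable = round(TOTAL_CROSSINGS * CABLE_FRACTION)
    processing = TOTAL_CROSSINGS - FLIGHT_CROSSINGS - cable
    # Assumed reading: one third of the processing budget goes to the RCT
    rct = round(processing / 3)
    return {
        "total_us": TOTAL_CROSSINGS * CROSSING_NS / 1000.0,
        "cable": cable,
        "processing": processing,
        "rct": rct,
    }


if __name__ == "__main__":
    for name, value in latency_budget().items():
        print(f"{name:>10}: {value}")
```

The result, about 51 crossings on cables and about 73 for processing and contingency, agrees with the approximate figures given above.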

## 5.3 System Overview

### 5.3.1 System Functionality

The Regional Calorimeter Trigger system [5.1], located in the CMS underground shielded counting room, receives digital trigger sums from the front-end electronics system in neighbouring racks. The data for two trigger towers of the same calorimeter (EB, EE, HB, HE or HF), for the same crossing, are sent on a single link in eight bits apiece for compressed non-linear  $E_T$  accompanied by five bits of error detection code and a fine grain bit. The fine grain bit characterizes the  $E_T$  profile of the constituents of the trigger tower that are summed together, i.e. isolated energy for EB, EE or an energy deposit consistent with minimum ionizing particle for HB, HE. Presently the fine-grain bit is undefined for the HF calorimeter.

The Regional Calorimeter Trigger system uses 20 regional processor crates covering the full detector. Eighteen crates are dedicated to the barrel and two endcaps. These crates cover the region  $|\eta| < 3$ . One special crate covers both HF Calorimeters that extend missing  $E_T$  and jet finding coverage seamlessly to  $|\eta| < 5$ . The remaining crate collects regional information from these 19 trigger crates and clusters their regions to find jets and  $\tau$ s. It also continues the summation tree to provide sums of  $E_T$  in various  $\phi$  regions.

Each RCT crate transmits to the Global Calorimeter Trigger processor its 4 highest-ranked isolated and non-isolated electrons. The cluster crate sends its 9x4 highest energy central and forward jets and  $\tau$  candidates along with information about their location, and summed  $E_T$  for 18  $\phi$  regions covered by it. The Global Calorimeter Trigger then forms  $E_x$  and  $E_y$  using look-up-tables and sums the energies, separately sorts the electrons, jets and  $\tau$ s, and sends the top four calorimeter-wide candidates, as well as the total calorimeter missing and sum  $E_T$  to the CMS global trigger. The muon isolation and identification bits formed using the HB, HE information are passed to the global muon crates via the Global Calorimeter Trigger system.
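The formation of $E_x$ and $E_y$ from the 18 $\phi$ sums can be sketched as follows. The precomputed cosine and sine lists stand in for the look-up tables; the use of bin centres as the representative $\phi$ of each 20° segment, the floating-point arithmetic, and all names are assumptions of this illustration, not a description of the hardware.

```python
import math

N_PHI = 18                       # number of 20-degree phi sums
PHI_WIDTH = 2 * math.pi / N_PHI

# Precomputed tables standing in for hardware look-up tables.
# Evaluating at bin centres is an assumption of this sketch.
COS_LUT = [math.cos((i + 0.5) * PHI_WIDTH) for i in range(N_PHI)]
SIN_LUT = [math.sin((i + 0.5) * PHI_WIDTH) for i in range(N_PHI)]


def energy_sums(et_by_phi):
    """Return (total E_T, missing E_T) from the 18 phi-segment sums."""
    if len(et_by_phi) != N_PHI:
        raise ValueError("expected one E_T sum per phi segment")
    ex = sum(et * c for et, c in zip(et_by_phi, COS_LUT))
    ey = sum(et * s for et, s in zip(et_by_phi, SIN_LUT))
    total = sum(et_by_phi)
    # The missing E_T vector is -(ex, ey); only its magnitude is returned
    missing = math.hypot(ex, ey)
    return total, missing


if __name__ == "__main__":
    balanced = [10.0] * N_PHI
    print(energy_sums(balanced))
    unbalanced = [0.0] * N_PHI
    unbalanced[3] = 50.0
    print(energy_sums(unbalanced))
```

A uniform deposit around $\phi$ gives a missing $E_T$ consistent with zero, while a single unbalanced segment gives a missing $E_T$ equal to its own $E_T$.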

Eighteen crates of the Regional Calorimeter Trigger use three custom board designs which are dedicated to receiving and processing data from the barrel and endcap calorimeters. In these crates there are 15 processor cards plugged into a custom backplane which provides point-to-point links between them. VME bus is also provided to these cards using high density connectors in the top 3U section of the backplane. In addition there are two slots with standard VME backplane connectors for crate processor and monitoring cards. The 19th crate, covering the HF calorimeter, houses the HF cards, which receive the HF signals and drive them to the cluster crate to form jets and  $E_T$  sums. The 20th, cluster, crate is similar to the 18 barrel and endcap crates but is fitted with a different backplane and a set of cluster processor cards which implement jet and  $\tau$  finding algorithms and  $E_T$  sums.

### 5.3.2 Calorimeter Trigger Tower Mapping

The barrel and endcap trigger towers, in the region  $-3 < \eta < 3$  and  $0 < \phi < 2\pi$ , are processed in 18 RCT crates. The mapping of the calorimeter to the trigger crates is shown in Fig. 5.1. Each crate covers a  $40^\circ \phi$  region and one half of the  $\eta$ . Each crate has 7 fully occupied cards covering fourteen 4x4 trigger tower regions. The neighbour data needed for seamless coverage of the electron isolation algorithm is obtained either on the crate backplane or by using inter-crate cable connections. The very forward HF calorimeter mapping is shown in Fig. 5.2. Nine HF Cards are used to receive the HF signals and to drive them to the cluster processor crate. The mapping for the cluster processor crate which serves the entire calorimeter is shown in Fig. 5.3. There is a one-to-one mapping between the 9 HF crate cards and 9 cluster processor crate cards.

### 5.3.3 Crate, Backplane and Cards

The 20 regional trigger crates will be located in racks with two crates per rack. The remaining rack front panel space will be occupied by fans, heat exchangers, and crate power supplies. Front panels will be used at all card locations to provide an enclosed environment for the chilled air.

The crate, shown schematically in Fig. 5.4, is based on standard Eurocard hardware. The height is 9U and the depth approximately 700 mm, as determined by the front and rear card insertion. The CMS rack dimension (900 mm deep) can handle the crate depth with some reserve for cabling, plumbing, and other services. The backplane is completely custom with a full 9U height. The top 3U is reserved for a 32 bit VME interface. The remaining 6U is used for the high speed data paths between individual cards. The front section of the crate is designed to accommodate 280 mm deep cards, leaving the major portion of the volume for 400 mm deep rear mounted cards.



**Fig. 5.1:** Calorimeter trigger tower mapping for the barrel and endcap region.



**Fig. 5.2:** HF trigger towers and their mapping to 9 HF trigger crate cards shown as differently shaded regions.

Eighteen regional trigger crates use three custom board designs which are dedicated to receiving and processing data from the barrel and endcap calorimeter. There are seven rear mounted Receiver cards, seven front mounted Electron Identification cards, and one front mounted Jet/Summary card for a total of 15 cards per crate dedicated to processing data from the calorimeter. In addition, there are several support cards. The first of these is a readout crate controller and communication module (ROC) selected by the DAQ group. The second is a crate environment monitor (CEM). Finally, the third card will be dedicated entirely to clock distribution and logging status for the cards in the data processing path. A similar design is adopted for the HF and Cluster crates. The cards used in those crates provide the different functionality required there.

**Fig. 5.3:** Cluster crate mapping

**Fig. 5.4:** Schematic view of a typical Regional Calorimeter Trigger crate.

### 5.3.4 Crate Power, Cooling

Power supplies will be mounted in a separate chassis above each crate. They will be located in the forward 280 mm of the volume in consideration of the lower heat load (per unit area) of the forward cards. It is desirable to put the supplies above the crates to place the cards closer to the heat exchangers. Testing is required to determine whether this up and down arrangement will be successful. On-board DC-DC converters are used for power distribution.



**Fig. 5.5:** Card spacing and data sharing on backplane.

The front and rear insertion of cards in the data processing section of the crate was chosen to allow greater separation between cards and to provide a more protected environment for the cables connected to the rear mounted Receiver cards. The increased separation promotes better cooling of the cards, and will enable a wider selection of front panel components. The staggering of the slots between front and rear cards, shown in Fig. 5.5, is as much a result of the style of connector selected as the fact that piggybacking of connectors is inappropriate in this situation. Almost half of the signals entering the Electron Identification board come from neighbouring Receiver cards. The spacing between cards, in the data processing area, is 35.6 mm front or rear, with a 17.8 mm stagger front to rear. Therefore, the seven Receiver cards, seven Electron Identification cards, Jet/Summary card, and Clock distribution card occupy 16 slots with a span of only 320.4 mm across the front of the crate. The remaining 86.1 mm will be allocated to a DAQ Readout card (ROC) (40.6 mm), Crate Environment Monitor (CEM) (20.3 mm), and a transition region for the change in form factor between the standard and non-standard VME.
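The slot arithmetic above can be reproduced with a short sketch. It assumes that the 320.4 mm span corresponds to nine front-side slots (seven Electron Identification cards, the Jet/Summary card and the Clock card) at the 35.6 mm pitch, with the seven Receiver cards interleaved at the rear; the placement of the Clock card at the front and the variable names are assumptions of this example.

```python
PITCH_MM = 35.6      # card spacing, front or rear
STAGGER_MM = 17.8    # front-to-rear stagger
REAR_CARDS = 7       # Receiver cards
FRONT_CARDS = 7 + 1 + 1   # EID + Jet/Summary + Clock (assumed front side)

REMAINING_MM = 86.1  # width left beside the data processing area
ROC_MM = 40.6        # DAQ Readout card
CEM_MM = 20.3        # Crate Environment Monitor


def processing_span_mm():
    """Span across the crate front taken by the data processing cards."""
    return FRONT_CARDS * PITCH_MM


def transition_region_mm():
    """Width left for the standard to non-standard VME transition."""
    return REMAINING_MM - ROC_MM - CEM_MM


if __name__ == "__main__":
    print(f"slots: {FRONT_CARDS + REAR_CARDS}")
    print(f"span: {processing_span_mm():.1f} mm")
    print(f"transition region: {transition_region_mm():.1f} mm")
```

Under these assumptions the stagger is exactly half the pitch, the 16 cards span 320.4 mm, and about 25 mm remain for the transition region.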

### 5.3.5 Clock and Control

The Local Timing, Trigger and Control card (LTTC) will interface to a single TTCrx ASIC located on a card in a separate crate. The signals relevant to the Regional Calorimeter Trigger are extracted from it and distributed synchronously to all processing crates. Within a crate, the Clock card will distribute the signals to all processor cards using differential point-to-point links on the custom backplane.

### 5.3.6 Backplane

The crate backplane is completely custom with a full 9U height. The top 3U is reserved for a 32 bit VME interface. The remaining 6U is used for the high speed data paths between individual cards. The backplane is a 425.2 mm x 397.0 mm multilayer printed circuit board ~3.4 mm thick. The other dimensions are fixed by the physical size of the crate. There are on board VME terminations, multiple studs for power supply connections, bypass capacitors, and mounting holes.

The design impedance is 50 ohms for the lower 2/3 of the backplane which contains all the trigger data paths. This impedance matches the AMP connector impedance and the impedance of the individual boards. The wiring in this section is all point-to-point, making for a fairly straightforward transmission line. Terminations for these lines are on the receiving cards. The multi-drop transmission line of the VME backplane in the upper 1/3 of the backplane has an impedance of 100 ohms – before holes and connectors are added. The effective impedance of this section will drop to a little less than 50 ohms after connectors and trace stubs on the individual boards are taken into account.

In order to ensure there are sufficient wiring channels and to maintain symmetry, five buried layers are used in the trigger data section. A picture of the board stack-up is given in Fig. 5.6. An alternating power plane/signal plane structure produces a buried stripline design and surface micro-strip traces. The stripline structure provides good control over crosstalk and general noise immunity. The outside signal layers will be used mostly in the VME section of the backplane. Signals in the trigger data portion are largely confined to, but not strictly limited to, the buried stripline layers.

### 5.3.7 VME

The VME specification is written for crates containing no more than 21 cards. In addition, the specification requires that VME signal stubs on individual boards be no greater than 5.08 cm in length. The present design for the trigger processor crates contains 20 cards with a VME interface. Reducing the VME interface from two connectors to one has increased the difficulty of staying within the 5.08 cm requirement. Special care has been taken throughout the design process to stay within the VME standards.



**Fig. 5.6:** A cross section of the Backplane showing the board stack-up.

### 5.3.8 Backplane Data Sharing

All signals in the trigger data portion of the backplane will be transmitted on point to point links at 160 MHz. This data rate was chosen because it offers the opportunity to compress the number of data lines on the backplane and in the pipeline data logic by a factor of four and because it should be realizable by available technology. All signals in this section, including clocks, are transmitted point to point. In the present design, all signals are differential.
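The factor-of-four reduction in data lines follows from sending four time slices per 40 MHz crossing on each 160 MHz line. The sketch below illustrates this with a hypothetical slicing scheme; the bit ordering (least significant slice first) and the function names are assumptions, not the backplane protocol.

```python
CROSSING_MHZ = 40   # bunch crossing rate


def lines_needed(bits_per_crossing, link_mhz):
    """Number of lines needed to carry a word each crossing."""
    factor = link_mhz // CROSSING_MHZ
    return -(-bits_per_crossing // factor)  # ceiling division


def serialize(word, n_bits, factor=4):
    """Split a word into `factor` time slices, least significant first."""
    lines = -(-n_bits // factor)
    mask = (1 << lines) - 1
    return [(word >> (i * lines)) & mask for i in range(factor)]


def deserialize(slices, n_bits):
    """Rebuild the word from equally sized time slices."""
    lines = -(-n_bits // len(slices))
    word = 0
    for i, piece in enumerate(slices):
        word |= piece << (i * lines)
    return word & ((1 << n_bits) - 1)


if __name__ == "__main__":
    word = 0b10110110
    slices = serialize(word, 8)
    print(lines_needed(8, 160), slices, deserialize(slices, 8) == word)
```

In this picture an 8-bit quantity needs two lines at 160 MHz instead of eight at 40 MHz.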

### 5.3.9 Inter-crate Data Sharing

Inter-crate data sharing, required for seamless coverage of the $(\eta, \phi)$ plane, is done using differential ECL signals transmitted at 80 MHz on copper cable. The data are sent from crate to crate in parallel format, after any deserialization, phase adjustment and linearization. No major effort for resynchronization of signals is expected to be needed.

### 5.3.10 Implementation of Algorithms

Both electromagnetic and hadronic calorimeter data are needed for electron/photon trigger algorithm implementation [5.2]. However, the EB and EE data are needed at 7-bit resolution to calculate the candidate energy, whereas the HB and HE data can be compressed, as they are only needed to veto candidates whose energy profile is not consistent with that of an electron/photon. Look-up tables on the Receiver cards are used to linearize the energies and to look up the H/E value, which is combined with the FG veto bit. The linearized ECAL $E_T$ and the veto bit data for each tower are driven to the electron identification card. The electron identification card implements the algorithm, which includes formation of the candidate electron energy and four identification criteria. These data are used to form ranked isolated and non-isolated electron/photon candidates. These ranked candidates from all electron identification cards in the system are sorted to obtain the top 4 candidates of each type separately and forwarded to the Global Calorimeter Trigger.

The jet and energy triggers [5.3] need data from all calorimeters. The  $\tau$  trigger uses the same data but in the central  $\eta$  region only. The input signal processing from the calorimeter is shared by the EM and jet logic up to the phase adjustment. After this stage separate look-up-tables are used to linearize the energy and count active towers per region. The Receiver card is also used to make 4x4 trigger tower region sums of  $E_T$  in both electromagnetic and hadronic calorimeters. The Jet/Summary card receives  $E_T$  and active tower counts from all Receiver cards in the crate, thresholds the active tower count for each region and stages all data out to the Cluster crate. The Cluster crate also receives  $E_T$  from HF towers. The Cluster crate makes overlapping 3x3 sums of these regions with the requirement that the central region is larger than its neighbours, spanning the entire  $(\eta, \phi)$  plane seamlessly, defining 12x12 sums. These sums are classified as  $\tau$  objects or central or forward jets, separately sorted to find top 4 candidates of each type, and forwarded to the Global Calorimeter Trigger subsystem. The sum of  $E_T$  in each 20 degree  $\phi$  strip is also formed, and staged to the Global Calorimeter Trigger subsystem for calculating missing  $E_T$ , total  $E_T$  and making luminosity monitor histograms.
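As an illustration of the clustering step described above, the following Python sketch forms overlapping 3x3 sums over a grid of 4x4-region $E_T$ values and keeps only the windows whose central region exceeds all of its neighbours. The function name `find_jets`, the strict comparison, the wrap-around in $\phi$, and the treatment of neighbours beyond the $\eta$ edges as absent are assumptions made for this example, not details taken from the hardware design.

```python
def find_jets(regions, n_keep=4):
    """Return up to n_keep (sum, eta_index, phi_index) tuples, largest first.

    regions[eta][phi] holds the E_T of one 4x4 trigger tower region.
    Assumptions of this sketch: phi is cyclic, eta is not, and the central
    region must be strictly larger than every neighbour.
    """
    n_eta = len(regions)
    n_phi = len(regions[0])
    jets = []
    for i in range(n_eta):
        for j in range(n_phi):
            centre = regions[i][j]
            total = 0
            is_local_max = True
            for di in (-1, 0, 1):
                ii = i + di
                if not 0 <= ii < n_eta:
                    continue  # no neighbour beyond the eta edge
                for dj in (-1, 0, 1):
                    jj = (j + dj) % n_phi  # wrap around in phi
                    value = regions[ii][jj]
                    total += value
                    if (di, dj) != (0, 0) and value >= centre:
                        is_local_max = False
            if is_local_max:
                jets.append((total, i, j))
    return sorted(jets, reverse=True)[:n_keep]


grid = [[0] * 6 for _ in range(4)]
grid[1][2], grid[1][3], grid[2][2] = 50, 10, 5
print(find_jets(grid))  # [(65, 1, 2)]
```

Requiring the centre to be a local maximum prevents the same energy deposit from being reported by several overlapping windows.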

## 5.4 Receiver Card

### 5.4.1 Overview

The Receiver card is the largest board in the crate. It is 9U by 400mm. The rear side of the card receives the calorimeter data on serial copper cables, and converts from serial to parallel format. The front side of the card contains circuitry to synchronize the incoming data with the local clock, and check for data transmission errors. There are also lookup tables and adder blocks on the front. The lookup tables translate the incoming information to transverse energy on several scales. They are also used to test for Quiet and Minimum Ionization thresholds for each trigger tower. The energy summation tree begins on these cards in order to reduce the amount of data forwarded on the backplane to the Jet/Summary card. Separate cable connectors and buffering are also provided for inter-crate sharing.

Each card is designed to receive 32 high speed copper links from the calorimeter readout electronics. Each link transmits two towers of either hadronic or electromagnetic information per crossing, for a total of 64 channels from 32 ECAL and 32 HCAL towers per card. The present design for the data uses a 24-bit frame including 18 bits of data and 5 bits of error detection code as described in Chap. 4. The data consist of 8 bits of energy on a compressed scale and one bit of fine-grain information per tower. The error code is sufficient to detect all single and double bit errors as well as many multiple bit errors. The error bits are necessary for error logging and to zero problem channels. The 24-bit word uses 8B/10B encoding, which implies a 1.2 Gbaud serial link. The cable length to the calorimeter electronics is estimated at 20 m.
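The quoted serial rate follows directly from the frame size and the 25 ns bunch crossing period. The short Python sketch below works through that arithmetic; the constant and function names are chosen for this example only.

```python
CROSSING_RATE_HZ = 40_000_000  # one bunch crossing every 25 ns
FRAME_BITS = 24                # payload bits per link per crossing
DATA_BITS = 18                 # 2 towers x (8 energy bits + 1 fine-grain bit)
EDC_BITS = 5                   # error detection code bits


def line_rate_baud(frame_bits=FRAME_BITS, crossing_rate_hz=CROSSING_RATE_HZ):
    """Serial line rate: 8B/10B sends 10 line bits for every 8 payload bits."""
    assert frame_bits % 8 == 0
    return frame_bits // 8 * 10 * crossing_rate_hz


def byte_rate_hz(frame_bits=FRAME_BITS, crossing_rate_hz=CROSSING_RATE_HZ):
    """Rate of decoded 8-bit words on the parallel side of the receiver."""
    return frame_bits // 8 * crossing_rate_hz


print(line_rate_baud())                    # 1200000000
print(byte_rate_hz())                      # 120000000
print(FRAME_BITS - DATA_BITS - EDC_BITS)   # 1
```

The last line shows that one bit of the 24-bit frame is not accounted for by the data and error code fields listed in this section.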

The rear side of the Receiver card has serial receivers based on the specifications of the Vitesse 7216 4-channel Interconnect Chip. The design provides for cable/connector equalization and the option of transformer isolation on daughter cards. The front side of the Receiver card contains the synchronization circuitry followed by the memory look-up tables, adder tree and backplane drivers. The outputs of the receivers are not only unsynchronized with the local clock but are also not necessarily aligned to the same bunch crossing. The phase alignment circuitry is contained on an ASIC (Phase ASIC). The Phase ASIC deskews the data, decodes the error detection codes and multiplexes the output at 160 MHz. The Phase ASIC also provides test vectors for board and system diagnostics.

**Fig. 5.7:** Receiver card views.

In order to achieve maximum utilization of board space, all the logic following and including the Phase ASIC is run at 160 MHz. There are also four Error Detection Codes (EDCs) associated with the four input channels of each Phase ASIC. After synchronization, each EDC is checked against the data. If an error is detected a single bit is set, one for each incoming channel, and appended to the original EDC code.

Lookup tables are required to translate the information coming from the calorimeter readout electronics, in compressed format, onto the several different scales used by the energy adder tree and the Electron Identification logic. The Hadronic and Electromagnetic energies are individually translated into eight bits of linear  $E_T$  with a resolution of approximately 1 GeV. These values are summed to provide total energy in  $4 \times 4$  trigger tower regions of the calorimeter. The summation is performed by an Adder ASIC. Thirty-two towers, in a  $4 \times 8$  array, are processed on each card. The transverse energy for each of the two  $4 \times 4$  trigger tower regions is independently summed and forwarded to the Jet/Summary card. These two 13 bit numbers will be multiplexed onto a single set of 13 differential pairs at 160 MHz.

The data required for the electron/photon finding algorithm are transferred from the Receiver cards to the Electron Identification cards at 160 MHz. In order to retain point-to-point transmission, data must be transmitted through separate drivers on separate backplane lines. Every Receiver card shares its data with at most 6 Electron Identification cards within the same crate. In addition each Receiver card sends some of its data off crate at 80 MHz to two or three neighbouring crates. Crate-to-crate communication is handled by special cables running between the Receiver cards. This distributes the inter-crate buffering among the eight Receiver cards in a crate, rather than attempting to put it all on one or two special cards at the ends of each crate.

### 5.4.2 Serial Links

The serial data arriving at the inputs to the Receiver Cards are generated in neighbouring Calorimeter Readout crates and received over 20 m of cable. We believe this length to be sufficient for the area containing the HCAL, ECAL, and Trigger racks. All cables will be cut to the same length. Equal length is required both for equal timing and to allow a single design for the equalization circuit on the receiving end of each cable.

The serial links are implemented using the VSC7216 chip made by Vitesse, which has 4 full duplex channels capable of running between 1.2 and 1.3 Gbaud. It is designed to support Gigabit Ethernet. The data are encoded 8B/10B on the chip. The serializer and deserializer are also included. The parallel data I/O is eight bits wide per channel. A 3.3 V bias is required for power. Inputs and outputs are low voltage TTL except for the serial data path, which is PECL. Power dissipation is approximately 3.3 W. The chip comes in a 160 pin PQFP package. We run this chip at 1.2 Gbaud, transferring 24 bits of information per crossing per channel.

The receiver chips are placed on a mezzanine card along with the cable connector and an equalization circuit. The mezzanine card provides a certain amount of power supply isolation and allows for the replacement of faulty receivers with minimal trauma to the Receiver Card itself. There is also less conflict between the routing requirements of the receiver circuitry and the dense high-speed ECL circuitry resident on the front side of the Receiver Card. Mounting the connectors for the serial link on the mezzanine card also avoids a collision between those connectors and the inter-crate data connectors on the front side of the main card. Surface mount board-to-board connectors with locating pins are used to mount the mezzanine cards to the Receiver card. Two of these connectors are dedicated for power from the Receiver card to the mezzanine card. A DC/DC converter on the back of the Receiver card provides the required 3.3V bias.

The Vitesse chip is capable of treating its four channels as a single word. Each channel has a deskewing buffer of 2 bit times. This buffer can compensate for different delays in cable routing and small phase drifts between transmitting and receiving ends. The four channels can also be treated independently. The data are transferred from the mezzanine card to the Phase ASIC mounted on the front side of the Receiver Card shown in Fig. 5.7.

### 5.4.3 Phase ASIC

The block diagram of the Phase ASIC is shown in Fig. 5.8. This ASIC receives four channels of two-tower data, which are multiplexed onto two output channels. Each of the four input channels comprises 8 bits of data, from the 10-bit/8-bit decode, and 3 bits of status. One of the three status bits is an error bit set by the receiver as a result of several illegal conditions that might occur. The remaining two status bits can either indicate the current operating mode of the receiver chip or qualify the error bit. Each channel's parallel output arrives at 120 MHz. Three frames, or 24 bits of data per channel, complete the data transfer for a single bunch crossing. The four channels, taken together, produce 44 bits of information every 8.33 ns. These 44 bits are clocked into the input stage of the Phase ASIC using the clock recovered from the VSC7216.



**Fig. 5.8:** Phase ASIC block diagram.

The input stage of the Phase ASIC is a 44-bit wide FIFO that is six frames deep. The FIFO can accommodate minor phase shifts between the transmitter and local clocks and is kept approximately half full. The FIFO is followed by a circuit which establishes the proper phase between the incoming data and the local bunch crossing clock. This circuit makes use of status information from the VSC7216 to set the final phase. Once properly phased, the data and error bits can be separated into 18 bits of data (two channels) and 5 bits of Hamming code.

The error detection Hamming code is recomputed from the data and compared with the received Hamming code bits. This Hamming code catches all single and double bit errors and most other multi-bit errors. The data leaves the Phase ASIC in two data channels, 9 bits apiece, and one 9 bit error channel. The error bits are made up of the transmitted EDC, a subset of the status bits from the VSC7216, and an overall error indicator. The status bits from the VSC7216 provide sufficient information to determine the state of the serial links at any point in time.
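The detection property claimed above can be illustrated with a textbook Hamming construction. The Python sketch below protects 18 data bits with 5 check bits by placing them in a 23-position codeword whose check bits sit at positions 1, 2, 4, 8 and 16. This layout is an assumption made for illustration. The actual bit assignment is the one defined in Chap. 4 and may differ.

```python
CHECK_POSITIONS = (1, 2, 4, 8, 16)
# The 18 remaining positions in 1..23 carry the data bits (assumed layout).
DATA_POSITIONS = tuple(p for p in range(1, 24) if p not in CHECK_POSITIONS)


def syndrome(codeword):
    """XOR of the (1-based) positions of all set bits in the codeword."""
    result = 0
    for position in range(1, 24):
        if (codeword >> (position - 1)) & 1:
            result ^= position
    return result


def encode(data):
    """Place 18 data bits and add 5 check bits so that the syndrome is zero."""
    if not 0 <= data < (1 << 18):
        raise ValueError("data must fit in 18 bits")
    codeword = 0
    for index, position in enumerate(DATA_POSITIONS):
        if (data >> index) & 1:
            codeword |= 1 << (position - 1)
    check = syndrome(codeword)
    for position in CHECK_POSITIONS:
        if check & position:
            codeword |= 1 << (position - 1)
    return codeword


def has_error(codeword):
    """Detection only: any non-zero syndrome flags the word as bad."""
    return syndrome(codeword) != 0


word = encode(0x2A5F3)
print(has_error(word))             # False
print(has_error(word ^ 0b101))     # True (two bits flipped)
```

Because every position has a distinct non-zero index, flipping one or two bits always leaves a non-zero syndrome, which is why a 5-bit code suffices when it is used for detection rather than correction.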

As we have four input channels, each handling two towers per crossing, the two output channels each produce four towers of information per crossing. The outputs are clocked at 160 MHz. This rate was chosen to match the processing circuitry on the rest of the board. By running the Receiver Card at 160 MHz we realize a substantial saving in the amount of logic required to process the data.

The last storage element of the Phase ASIC will be implemented as a loadable counter. During normal operation the counter will be loaded with data each 6.25 ns. During testing the counter can be reset and enabled to count synchronously with the rest of Phase ASIC outputs. The counter outputs will address the look-up tables just as detector data would. The combination of these counters and look-up tables can be used to provide any data pattern necessary to test the remainder of the Trigger Processor system. The error outputs will be idle during testing.

The Phase ASIC will have a JTAG controller and scan cells on all the outputs. Total signal pin count, including JTAG should be around 80. The technology for the Phase ASIC is expected to be the same Vitesse 0.6  $\mu\text{m}$  GaAs process as used for the Adder ASIC described below.

### 5.4.4 Link Error Handling

The data are zeroed on link errors so that loss of an individual link does not inhibit data taking. Broken links are expected to be resynchronised periodically. In addition, link error flags from the Phase ASICs are counted and the counts are readable over VME by the local crate processor for monitoring.

### 5.4.5 Look-up Tables

Look-up tables are required to translate the information coming from the calorimeter front end electronics, in compressed format, onto the several different scales used by the energy adder tree and the Electron Identification logic. The Hadronic and Electromagnetic energies are individually translated into eight bits of linear $E_T$ with a resolution of approximately one GeV. The processor on the Electron Identification (Electron ID) card requires electromagnetic and hadronic transverse energy with several dynamic ranges. Seven bits of electromagnetic transverse energy with a resolution of 0.5 GeV are needed for all but the corner towers of the 3x3 trigger tower area used in the electron algorithm. Additionally, a HAC/FG Veto bit, which characterizes the tower based on comparison of electromagnetic to hadronic $E_T$ and a fine-grain profile veto that accompanies each EM tower input, is formed using another look-up table. The corner towers for each 4 x 8 region are reduced to 3 bits with a resolution of 0.5 GeV. To guard against wrap-around when the 3-bit electromagnetic values and 4-bit hadronic values are generated, any value greater than the maximum of the bit range is set to the maximum three- or four-bit value. Spare bits in these memories are used for thresholding $E_T$ to determine the number of active towers. These towers are counted to later determine $\tau$-like energy deposits. A block diagram of these look-up tables is shown in Fig. 5.9.

With a 6.25 ns period the memories must have access times less than 3.0 ns in order to provide a margin for board propagation and setup and hold times. Data are downloaded to the memories and read back through the VME interface. This requires support circuitry located in the area of the memory chips. The data inputs to the memories can be tied in parallel for writing the chips, but all the address lines (128) need to be individually buffered. The buffering is located near the memory in order to maintain short board traces in the high-speed section of the logic. Reading the data out of the memories back to the VME interface requires buffering for all the data output lines. For example, this buffering is provided by an 8:1 multiplexer within each Adder ASIC.



**Fig. 5.9:** A block diagram of the Receiver Card look-up tables.

### 5.4.6 Energy Sums

The beginning of the energy summation tree is on the Receiver card. The transverse energy for each of the two  $4 \times 4$  trigger tower regions is independently summed and forwarded to the Jet/Summary card. Though the input values at the top of the adder tree have only 8 bits of range, the adder tree has been designed to handle a dynamic range of 10 bits for either positive or negative values. This implies an overflow at approximately 1000 GeV. The exact value will depend on the resolution required of the input transverse energy for other trigger functions. Any overflow (or underflow) generated as the result of an arithmetic operation (AOV) will stay in time with the data and be ORed with any other overflow that might have occurred in the same crossing. All values are handled as 11 bit 2's-complement numbers.

A second overflow condition can also occur. The value sent to the trigger processor from the calorimeter front end electronics may be at the highest possible count. The look-up tables will be programmed to output the hexadecimal value FFH for this particular input. This output code is recognized, at the top level of the adder tree, as indicating an input data trigger tower overflow (TOV). It is handled much like the arithmetic overflow in that it is ORed with other TOV overflows and passed down the tree in time with the data that generated it. If the overflow is caused by a hardware failure, the look-up table can be re-written to zero out the affected channels. The arithmetic and tower overflow bits are handled separately through to the end of the adder tree. TOV recognition can be disabled. These summations are made using an Adder ASIC.

### 5.4.7 Adder ASIC

The adder ASIC is implemented as a 4-stage pipeline with eight input operands and one output operand. There are only three stages of adder tree, but an extra level of storage has been added to ensure chip processing is isolated from the I/O. We have determined that the ASIC must work reliably at a clock period of 5.0 ns in order to ensure safe operation at an in-circuit period of 6.25 ns.

This ASIC uses 4-bit adder macro cells to implement 12-bit wide adders. Eleven bits are wired, left justified, to each operand of an adder. The LSB of each adder will be internally set to zero. The MSB is treated as a sign bit. Therefore, although each adder is constructed from three 4-bit adder cells, the width of the operand data paths has been limited to eleven bits. An Adder ASIC chip is designated as “master” if it is in the top rank of the adder tree and as “slave” if it is further down. Masters can generate Tower overflow (TOV), but slaves can only propagate TOV. Both masters and slaves can generate and propagate arithmetic overflow/underflow (AOV). These bits are appended to each input and output operand, making all operands 13 bits wide. TOV becomes the twelfth bit of the output result and AOV the thirteenth bit. The data outputs of the chip are forced to the hexadecimal value 3FFH when either an overflow or underflow occurs.

The top of the adder tree is composed of four 12-bit adders and includes the logic required to detect and propagate TOV and AOV. The TOV generate circuitry is a filter designed to detect the input code 3FFH. The AOV generate circuitry examines the sign bits of the input operands and the results operand, together with the carry out, to determine whether or not an overflow or underflow has occurred. All eight of the TOV bits are ORed together and all four of the AOV bits are ORed together to form two separate overflow bits that are forwarded with the data in the pipeline. Edge triggered registers are used to store the results for the next stage of the adder tree. A block diagram of the Adder ASIC is shown in Fig. 5.10.



**Fig. 5.10:** Block diagram of the Adder ASIC.

The second stage contains two more 12 bit adders and includes the logic needed to propagate TOV and to detect and propagate AOV. From this point on, TOV is forwarded down the pipeline from register to register. AOV is generated in the same manner as in the first stage and the resulting two bits are ORed with the AOV from the previous stage. Edge triggered registers are used to store the results for the next stage of the adder tree.

The third stage contains the final adder as well as a continuation of the TOV/AOV circuitry. The register at this level is the last storage element before the ASIC output. If either TOV or AOV has been detected, the output operand stored in this register has the value 3FFH. TOV and AOV are stored along with the operand. Adder Tree ASICs further down in the tree are designated “slaves” and are blocked from using the operand 3FFH to generate TOV. Thus we retain the identity of the tower overflow bits through the entire tree.

The last register is presented to one input of a 2:1 multiplexer before leaving the chip through the boundary scan cells and pads. The other side of the multiplexer is fed by an 8:1 multiplexer which passes any one of the eight input operands, less the two overflow bits, to the output of the ASIC. This feature was provided to minimize the external logic needed to read back the values of the lookup tables that feed the first stage of the adder tree logic.

The chip also contains boundary scan support described later. The Adder ASIC implementing all the above features was manufactured by Vitesse in 0.6 µm GaAs technology and is packaged in a 195-pin PGA. It requires approximately 4 W of power.
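The three-stage tree and its overflow handling, as described in this section, can be summarized in a behavioural Python sketch. It models arithmetic only, not pipeline timing, the read-back multiplexer or boundary scan. The function names, the argument layout and the use of 3FFH as the tower overflow input code are assumptions of this example.

```python
WIDTH = 11                          # operand width: sign bit + 10 magnitude bits
MAX_VALUE = (1 << (WIDTH - 1)) - 1  # 0x3FF, largest positive operand
MIN_VALUE = -(1 << (WIDTH - 1))     # most negative operand
FORCED_OUTPUT = 0x3FF               # output value when TOV or AOV is set
TOV_CODE = 0x3FF                    # input code flagged as tower overflow (assumed)


def add(a, b):
    """Add two operands; return (wrapped sum, overflow/underflow flag)."""
    total = a + b
    overflow = total > MAX_VALUE or total < MIN_VALUE
    # Wrap into the two's-complement range, as a fixed-width adder would.
    wrapped = (total - MIN_VALUE) % (1 << WIDTH) + MIN_VALUE
    return wrapped, overflow


def adder_asic(operands, tov_in=None, aov_in=None, master=True):
    """Sum eight operands in three stages; return (result, TOV, AOV)."""
    if len(operands) != 8:
        raise ValueError("the adder takes exactly eight operands")
    tov_in = tov_in or [False] * 8
    aov_in = aov_in or [False] * 8
    if master:
        tov = any(value == TOV_CODE for value in operands)  # masters generate TOV
    else:
        tov = any(tov_in)                                   # slaves only propagate
    aov = any(aov_in)
    level = list(operands)
    while len(level) > 1:                                   # 8 -> 4 -> 2 -> 1
        next_level = []
        for k in range(0, len(level), 2):
            partial, overflow = add(level[k], level[k + 1])
            next_level.append(partial)
            aov = aov or overflow
        level = next_level
    result = FORCED_OUTPUT if (tov or aov) else level[0]
    return result, tov, aov


print(adder_asic([100] * 8))  # (800, False, False)
print(adder_asic([200] * 8))  # (1023, False, True)
```

Keeping TOV and AOV as separate flags, and preventing slaves from regenerating TOV from a forced 3FFH operand, is what allows the origin of an overflow to be identified at the end of the tree.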

### 5.4.8 Backplane Drivers

Thirty-two towers, in a 4 x 8 array, are processed on each Electron Identification board. Data from twenty-eight neighbouring towers are required to determine isolation for towers on the edge of the 4 x 8 region. Data are transferred between the Receiver card and the Electron Identification card at 160 MHz. In order to retain point-to-point transmission, data going to a neighbouring Electron Identification board must be transmitted through separate drivers on separate backplane lines.

Every Receiver card shares its data with at most 6 Electron Identification cards within the same crate. In addition each Receiver card sends some of its data off crate at 80 MHz to a maximum of three neighbouring crates. Crate-to-crate communication is handled by special cables running between the Receiver cards. This distributes the inter-crate buffering among the eight Receiver cards in a crate rather than attempting to put it all on one or two special cards at the ends of each crate. The maximum amount of inter-crate information that can enter and leave a Receiver card is carried on 120 twisted-pair cables.

The order in which the data are transmitted to the Electron Identification card is important. Since connector pins are at a premium, it is necessary to ensure that all lines have useful data for each of the four 6.25 ns cycles making up a 25 ns crossing. Board space is also at a premium, so it is necessary to limit the amount of circuitry committed to staging the data for the backplane. The most efficient way to satisfy these goals simultaneously is to pay careful attention to the order of cabling between the detector front end electronics and the input channels of the Receiver card. This, in conjunction with the multiplexing of data in the Phase ASIC, handles most of the data staging. Some of the data transmitted between the Receiver card and Electron Identification (Electron ID) card come from neighbouring crates. These data have been delayed in time with respect to the data originating on the local Receiver card. The local data must be delayed to allow the shared data to “catch up”. This function is incorporated into the Boundary Scan ASIC, which includes the differential drivers for the backplane data and the JTAG boundary scan.

### 5.4.9 Boundary Scan ASIC

The Boundary Scan ASIC has several functions. Firstly, it provides control for board level boundary scan functions. Secondly, it provides drivers for sending data over the point-to-point links on the backplane and inter-crate cables. Thirdly, it provides simple algorithms needed for manipulating data, e.g., to reduce the corner tower data from 7 bits to 3 bits while ensuring that the setting of any upper bits in the input saturates the 3-bit scale. This ASIC is also implemented in Vitesse 0.6 $\mu\text{m}$ GaAs technology.
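The corner tower reduction mentioned above amounts to a saturating truncation. A minimal Python sketch follows, assuming that the three low-order bits are kept unchanged so that the least significant bit retains its weight. The function name and argument defaults are illustrative.

```python
def saturate_to_bits(value, in_bits=7, out_bits=3):
    """Reduce an in_bits-wide value to out_bits, saturating instead of wrapping."""
    if not 0 <= value < (1 << in_bits):
        raise ValueError("value does not fit in the input width")
    out_max = (1 << out_bits) - 1
    upper_bits_set = (value >> out_bits) != 0  # any bit above the kept range
    return out_max if upper_bits_set else value


print(saturate_to_bits(5))    # 5
print(saturate_to_bits(8))    # 7
print(saturate_to_bits(127))  # 7
```

Simply dropping the upper bits would map an input of 8 to 0, turning a large deposit into an apparently quiet tower. Saturation avoids this.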

## 5.5 Electron Identification Card

### 5.5.1 Overview

The electron isolation algorithm for each 4x8 trigger tower region is performed on a smaller 240 mm deep Electron Identification card. Data for thirty-two central towers and twenty-eight neighbouring towers are required to determine isolation for towers on the edge of the 4 x 8 region. This card receives linearized transverse energy on a 7-bit scale for ECAL and a 4-bit scale for HCAL, together with the ECAL fine-grain bit, from the corresponding Receiver card. It also receives neighbour tower data from up to three other Receiver cards in the crate. The neighbour crate data are transferred through the Receiver cards, where they undergo any realignment of phase. The algorithm which finds isolated and non-isolated electron/photon candidates is implemented in an ASIC. The candidates with the highest $E_T$ of both types in each of the two 4x4 regions covered by the card are transmitted to the Jet/Summary card.

### 5.5.2 Input

All data received by the Electron Identification card come from local Receiver cards in differential mode. Terminations for the lines will be on the cards rather than the backplane. Each card processes 32 towers of electromagnetic and hadronic information organized in a 4 x 8 array. Data are also required from the neighbouring 28 towers to determine isolation on the boundary of the 4 x 8 region. The 306 input lines required by the Electron Identification card are serviced by a 340-pin AMP stripline data connector. The additional pins beyond 306 will be used to forward the results to the Jet/Summary card and to input clock and control information. As in the case of the Receiver card, the top part of the Electron Identification card will use a 128-pin DIN connector to interface to a 32-bit VME bus.
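The figure of 28 neighbouring towers is the size of a one-tower-wide border around the 4 x 8 array. The short Python sketch below reproduces the count; the function name is chosen for this example.

```python
def neighbour_count(rows, cols, border=1):
    """Number of towers in a border of the given width around a rows x cols array."""
    return (rows + 2 * border) * (cols + 2 * border) - rows * cols


print(neighbour_count(4, 8))           # 28
print(4 * 8 + neighbour_count(4, 8))   # 60
```

Each card therefore needs data for 60 towers in total, of which 32 are its own.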

### 5.5.3 Electron Isolation ASIC

The Isolation ASIC, shown in Fig. 5.11, handles four electromagnetic energies on a 7 bit scale, along with the corresponding HAC/FG Veto bit, every 6.25 ns. These 8-bit inputs are designated as Ain, Bin, Cin, and Din in Fig. 5.11. The nearest neighbours are also included in the data flow. During the first cycle of every crossing the four neighbouring energies (TEA, TEB, TEC, TED) from the adjacent 4 x 4 region (top) are also strobed into the ASIC. The neighbours along either edge of the 4 x 4 region (LEin, REin) are also included, two at a time (left and right edges), during each 6.25 ns period. Finally, the last cycle strobes in the four neighbouring towers of the bottom edge (BEA, BEB, BEC, BED). Thus, in one bunch crossing time, a total of 36 towers are clocked into the Isolation ASIC.

Separate inputs are used to clock in the top and bottom neighbouring towers in order to avoid unfavourable routing or extra components on the board due to board level multiplexing. The top and bottom neighbouring edges require a total of $2 \times 4 \times 8 = 64$ input pins. The central region needs $4 \times 8 = 32$ input pins, and the left and right edges need $2 \times 8 = 16$ pins. The total number of data inputs is 112 pins, for 36 towers of information. In addition pins are allocated for three bits of EM threshold, reset, and clock inputs. All signal I/O, with the exception of the clock, is single-ended. The single maximum two tower sum for the full $4 \times 4$ region is obtained by comparing the four values output by the Find Max block to determine the largest of the four values in each 6.25 ns time frame and retaining the largest of four values generated over the four 6.25 ns cycles required to process the region.

**Fig. 5.11:** Electron Isolation ASIC block diagram.

The main data flow of the Isolation ASIC processes the data through three separate blocks. The purpose of the first of these, the Input Staging, is to receive the data at the time when it is available and change the time relationship to one suitable for the processing that follows. At the beginning of a crossing, the first row of the $4 \times 4$ array is available, along with the top edge. The signal Cycle 1 selects the Top Edge input on the right hand multiplexer. After the first 6.25 ns clock, the first row of registers contains one of the towers in the $4 \times 4$ array (a reference tower) along with its top neighbour. The leftmost register in the top rank is undefined at the beginning of the sequence. After a second clock cycle, the reference tower is in the middle register of the bottom rank of registers and its top neighbour is in the right hand register. The left-most register in the bottom rank contains the next successive reference tower, as does the middle register in the top rank. This value is the bottom nearest neighbour for the first reference tower. The sequence continues through to the cycle where the last reference tower in a column of 4 towers is clocked into the middle register in the bottom rank. During the same cycle the Bottom Edge data are available from the neighbouring card. They are clocked into the bottom left register during Cycle 1 at the beginning of the next sequence.

Once the pipeline has been filled, data will continue to be output from the Input Staging block four towers at a time, each with their corresponding top and bottom neighbour. The left and right neighbours are either the adjacent reference towers in the  $4 \times 4$  array or the left and right columns of data from neighbouring boards.

The Input Staging block places each reference tower and its neighbours in the same time frame. The remaining blocks in the chip can now handle the processing in parallel. The function of the Add/Compare block is to form four sums between a reference tower and its top, bottom, left and right neighbours. At the same time the sums are being formed, four compares are made to determine for each pair of towers whether the reference tower is larger than or equal to its neighbour (“equality check”). When a reference tower and its neighbour satisfy the “equality check” the sum of the pair is enabled to the Find Max block. When the sum is disabled, a value of zero is passed on to the next block. If the adder has overflowed the result is set to the maximum possible value, hexadecimal 3F.

The next-to-last stage in processing the electromagnetic information is the Find Max block. The four sums are presented, in parallel, to two comparators. The outputs of these comparators are used to select the maximum of each pair which are placed in intermediate storage. These two maxima are presented to a single comparator during the next clock cycle. The output of this comparator is the maximum two tower sum for an individual reference tower. The single maximum from the original four values is stored in a register. The HAC/FG Veto, neighbour HAC/FG Veto, and the neighbour EM Veto bits described below are stored with each of these sums. A final stage of logic sorts through all 16 maxima generated over a bunch crossing time and places that value, along with its Veto bits, on the outputs of the ASIC. The total latency for the electromagnetic data path is  $12 \times 6.25$  ns or 3.0 bunch crossing times.
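The Add/Compare and Find Max steps can be expressed compactly in software. The Python sketch below is a behavioural model only, with no pipelining and no veto bits. It assumes a 6x6 input grid (the $4 \times 4$ region plus a one-tower border) and treats overflow as the sum exceeding the saturation value 3F quoted in the text. Both choices, and all function names, are assumptions of this example.

```python
SUM_MAX = 0x3F  # saturation value quoted in the text


def enabled_sum(reference, neighbour, sum_max=SUM_MAX):
    """Two-tower sum, passed on only if the reference is >= its neighbour."""
    if reference < neighbour:
        return 0  # sum disabled: zero is passed to the Find Max block
    return min(reference + neighbour, sum_max)


def max_two_tower_sum(grid, row, col):
    """Largest enabled sum of a reference tower with its four nearest neighbours."""
    reference = grid[row][col]
    top = enabled_sum(reference, grid[row - 1][col])
    bottom = enabled_sum(reference, grid[row + 1][col])
    left = enabled_sum(reference, grid[row][col - 1])
    right = enabled_sum(reference, grid[row][col + 1])
    # Two comparators first, then a single comparator on their results.
    return max(max(top, bottom), max(left, right))


def region_maximum(grid):
    """Largest two-tower sum over the 16 reference towers of a 6x6 grid."""
    return max(max_two_tower_sum(grid, r, c)
               for r in range(1, 5) for c in range(1, 5))


grid = [[0] * 6 for _ in range(6)]
grid[2][2], grid[2][3] = 10, 6
print(region_maximum(grid))  # 16
```

The comparison between reference and neighbour ensures that a pair of adjacent towers contributes its sum only once, through the larger tower of the pair (or through both when they are equal).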

Three isolation criteria, the central tower HAC/FG Veto, the HAC/FG neighbour Veto and the EM Veto, are computed within the ASIC in parallel with the Find Max logic described above. A tower's HAC/FG Veto enters the ASIC in the same time frame as the 7 bit $E_T$ information. The same staging circuitry used to equalize arrival times of the neighbour $E_T$ information for a single reference tower is also used to put the HAC/FG Veto bits into the proper time sequence. Once all eight neighbours are timed in, an eight-way OR is performed to compute the HAC/FG neighbour Veto.

The EM neighbour Veto requires a little more processing than that used on the HAC/FG neighbour Veto. The  $E_T$  information has already been timed in by the staging circuitry. It is presented to a bank of comparators where each value is compared against the same three bit threshold. The single-bit results from each of the compares are passed on to a large OR-AND array. The ORs check that each group of five towers, centered on one corner of a reference tower, are all below threshold. The ANDs gather the four results from the ORs for each reference tower and form the AND. If any one corner of the five-tower ORs has a value of zero, the AND (EM Veto) will be zero.
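The OR-AND structure of the two neighbour vetoes can be sketched in Python as follows. The L-shaped grouping of five towers per corner (one full row and one full column of the 3x3 neighbourhood meeting at that corner), the use of a strict "greater than" comparison against the threshold, and the function names are assumptions of this example.

```python
# Offsets (d_row, d_col) of the five towers in each corner group, relative to
# the reference tower. The L-shaped grouping is an assumption of this sketch.
CORNER_GROUPS = (
    ((-1, -1), (-1, 0), (-1, 1), (0, -1), (1, -1)),  # top-left corner
    ((-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1)),    # top-right corner
    ((1, -1), (1, 0), (1, 1), (0, -1), (-1, -1)),    # bottom-left corner
    ((1, -1), (1, 0), (1, 1), (0, 1), (-1, 1)),      # bottom-right corner
)
NEIGHBOURS = tuple((dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                   if (dr, dc) != (0, 0))


def em_neighbour_veto(et, row, col, threshold):
    """True (veto) unless at least one corner group is entirely below threshold."""
    corner_results = []
    for group in CORNER_GROUPS:
        above = [et[row + dr][col + dc] > threshold for dr, dc in group]
        corner_results.append(any(above))  # OR over the five towers
    return all(corner_results)             # AND over the four corners


def hac_fg_neighbour_veto(veto_bits, row, col):
    """Eight-way OR of the HAC/FG Veto bits of the nearest neighbours."""
    return any(veto_bits[row + dr][col + dc] for dr, dc in NEIGHBOURS)


one_side = [[5, 0, 0], [0, 9, 0], [0, 0, 0]]
both_sides = [[5, 0, 0], [0, 9, 0], [0, 0, 5]]
print(em_neighbour_veto(one_side, 1, 1, threshold=2))    # False
print(em_neighbour_veto(both_sides, 1, 1, threshold=2))  # True
```

In the first case the bottom-right corner group is quiet, so the candidate is not vetoed. In the second, every corner group contains a tower above threshold.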

The results from the EM Veto logic are produced ahead of the two-tower sums and must be delayed within the ASIC by two clock cycles. This Veto, along with the HAC/FG neighbour veto and HAC/FG veto, is appended to the two tower  $E_T$  sum before it enters the Max tower circuitry. The result from the Max tower block has the correct Veto information associated with it and all 10 bits are placed on the outputs of the Electron Isolation ASIC.

We have implemented the same controller and instruction decoder for boundary scan as used in the Adder ASIC. All ASICs in the L1 trigger processor will have compatible boundary scan circuits.

### 5.5.4 Output

Isolated and non-isolated candidates, formed using the  $E_T$  values and three veto bits for each of the two 4x4 regions handled by the Electron Isolation ASIC are compressed to a 6-bit rank using a look-up table. These 6 bits, and a single position bit to distinguish the two regions handled by the card, are transferred to the Jet/Summary card at 160 MHz over the backplane.

## 5.6 Jet/Summary Card

### 5.6.1 Input

The Jet/Summary card receives 4x4 trigger tower energy sums, active trigger tower counts and minimum ionizing particle (MIP) identification bits from all Receiver cards in the crate, and isolated and non-isolated electron/photon candidate energies from all Electron Identification cards in the crate.

### 5.6.2 Electron/Photon Processing

The isolated and non-isolated electron/photon information from each of the 14 regions covered by the Jet/Summary card is received directly into SORT ASICs. After assigning the position bits the top-ranked four isolated and non-isolated electron/photon candidates are determined, keeping track of their 4-bit position information. The details of the sort logic are described below.

### 5.6.3 Sort ASIC

Sort ASICs are used for receiving electron data and sorting them by their rank bits at 160 MHz. This ASIC is also used simply to receive some data, e.g., 4x4  $E_T$  sums, without the sorting operation. Two ASICs are required to receive the data from the Electron Identification cards and four more are required to handle the  $E_T$  sums from the Receiver card. For the electron sorting, each pair of ASICs generates eight values. These eight values are multiplexed together in pairs and each fed into a third Sort ASIC to produce the final set of four largest values. The outputs from the third electron Sort ASIC are these four largest 6 bit values along with 4 bits each of positional information. These results are single-ended.

In the case of the sort of electron candidates the inputs are used in all four cycles. The sort reduces data so that only two cycles are needed for the output data. One cycle produces four Isolated candidates, the other produces four Non-isolated candidates. Two Sort ASICs produce eight values (Isolated or Non-isolated) in the same cycle at 80 MHz. These values are multiplexed at 160 MHz into a third Sort ASIC. The output from this last ASIC produces four sorted Isolated electron candidates and four Non-isolated electron candidates on two cycles every bunch crossing.

The Sort ASIC has differential inputs so that data may be received directly from the backplane. This reduces the amount of receiver logic on the Jet/Summary card and allows us to use the JTAG Boundary Scan built into the ASIC to do backplane testing and to set up test data for the board itself. In addition to the data inputs we need three bits of positional information to tag the location of the  $4 \times 4$  region providing each piece of data. These bits are differential as well so that they can be used as differential signal receivers.

The Sort ASIC is designed to find the four largest of eight 6-bit values. Fig. 5.12 is an illustration of the major functional blocks that make up the ASIC. Rather than try to design an ASIC that will handle eight 6-bit operands in parallel, it was decided to shift the data in, four operands at a time, over two 6.25 ns cycles. The electron candidates come from the Electron Identification card in two groups of two. The first group will be the Isolated electrons, and the second group will be the Non-isolated electrons. The two groups are separated in time within the ASIC with the results of a sort appearing every 12.5 ns. In the case of the  $E_T$  energy sums, the two regions are received from each Receiver card by the Sort ASIC in successive 6.25 ns cycles. The results from this sort appear at the output of the ASIC once every bunch crossing.

**Fig. 5.12:** The Sort ASIC block diagram.

The algorithm implemented within the Sort ASIC is based on a simple rotation of operands. The eight operands are divided into two groups of four. The operands are compared in pairs between the two groups, with the larger of the two taking over the position of the left-hand member of the pair. This comparison is performed in four stages with a rotation of compared pairs occurring between each stage. By the end of the fourth stage a sufficient number of comparisons have been made to ensure the four largest values are in the left-hand group. In order to save steps, and thus minimize the total latency, these four values are not placed in any rank order. The final four values, produced by the Calorimeter Global Trigger Processor, are ordered during the final sort.
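The rotation scheme can be modelled in a few lines of Python. This is a minimal sketch under stated assumptions: the exact pairing used in each stage of the ASIC is not given in the text, so a cyclic rotation of the right-hand group is assumed, and the function name `select_top_four` is hypothetical.

```python
def select_top_four(operands):
    """Return the four largest of eight values, in no particular order.

    Behavioural model of a four-stage compare-and-rotate selection.
    The cyclic pairing (i, (i + stage) % 4) is an assumption.
    """
    if len(operands) != 8:
        raise ValueError("expected exactly eight operands")
    left, right = list(operands[:4]), list(operands[4:])
    for stage in range(4):
        # Within a stage the four pairs are disjoint, so in hardware
        # these comparisons could all take place in parallel.
        for i in range(4):
            j = (i + stage) % 4
            if right[j] > left[i]:
                # The larger value takes over the left-hand position.
                left[i], right[j] = right[j], left[i]
    return left
```

Over the four stages every left-hand slot is compared once with every right-hand slot, which is why no value among the four largest can remain in the right-hand group. The returned values are unordered, matching the statement that rank ordering is deferred to the final sort.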

### 5.6.4 4x4 $E_T$ Sums

The Jet/Summary card receives 10-bit 4x4  $E_T$  sums and 1 overflow bit for each of the 14 regions covered by the crate. These data are multiplexed for transmission at 80 MHz to the Cluster Processor crate for finding jets and  $\tau$  candidates.

### 5.6.5 $\tau$ Veto Bit

The Jet/Summary card receives 2-bit ECAL and HCAL activity counts for each of the 14 regions covered by the crate. If the trigger tower activity counts from ECAL or HCAL are greater than two, the 4x4 region  $\tau$  veto bit is set ON. There is enough room on the card to implement this algorithm in discrete logic components. The logic will be used at least twice per crossing to determine  $\tau$  veto bits for all 14 regions handled by this card.
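As an illustration only, the veto condition stated above can be written as a small Python function. The name `tau_veto` and the range check are assumptions made for this sketch; the threshold of two is taken from the text.

```python
ACTIVITY_THRESHOLD = 2   # veto when either count is greater than two


def tau_veto(ecal_count, hcal_count):
    """Return True if the 4x4 region tau veto bit should be set ON."""
    for count in (ecal_count, hcal_count):
        if not 0 <= count <= 3:          # counts are 2-bit quantities
            raise ValueError("activity count must fit in 2 bits")
    return ecal_count > ACTIVITY_THRESHOLD or hcal_count > ACTIVITY_THRESHOLD
```

Since each count is only two bits wide, the condition is met only when a count takes its maximum value of three.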

### 5.6.6 MIP Bit

The HCAL trigger primitives generator from HB and HE regions sends on the serial links, in addition to tower  $E_T$ , one bit characterizing the energy profile. This bit is expected to be set ON for towers with  $E_T$  consistent with passage of a minimum ionizing particle through it. These fine-grain bits from each 4x4 region are ORed on the Receiver Card to obtain a single MIP identification bit. Two bits corresponding to the two 4x4 regions covered by each Receiver Card are forwarded to the Jet/Summary card.

### 5.6.7 Quiet Bit

In addition to the MIP identification bit, the Jet/Summary card determines quiet regions by thresholding 4x4 energy sums it receives from the Receiver Cards. There are two such bits corresponding to the fourteen 4x4 regions covered by the Jet/Summary card.

### 5.6.8 Output Processing

The Jet/Summary card sorts the electron/photon candidates and forwards the top 4 isolated and non-isolated electron/photon candidates to the Global Calorimeter Trigger crates on copper cables. The Jet/Summary card forwards the MIP and Quiet bits from the 14 regions handled by it to the Global Muon trigger via the Global Calorimeter Trigger.

## 5.7 HF Crate

The HF crate handles data from the HF calorimeter, which covers the regions  $-5 < \eta < -3$  and  $3 < \eta < 5$ . For this calorimeter there are  $2 (+z/-z) \times 4 (\eta) \times 18 (\phi)$  trigger towers, each corresponding approximately to the same size as the 4x4 region sums of the HE. The HF crate has 9 HF cards which combine a part of the functionality of the Receiver and Jet/Summary cards in the 18 normal crates. Each of the 9 HF cards handles a  $40^\circ$  segment in  $\phi$  and both ends of the calorimeter as shown in Fig. 5.2. These cards receive data on Vitesse VSC7216 based Cu serial links using 2 mezzanine cards, adjust their phase to be aligned with the rest of the calorimeter, and provide look-up tables to linearize the data just as in the regular Receiver card. The data are then staged out to the Cluster crate on two 34-pair connectors as on the Jet/Summary card. The linearized data from HF are on an 8-bit scale which is programmed in the same way as the jet  $E_T$  look-up tables on the regular Receiver card.

## 5.8 Cluster Crate

The eighteen regional trigger crates and the single HF crate send 4x4 trigger tower  $E_T$  sums to a single Cluster crate where 12x12 overlapping  $E_T$  sums are calculated to form jet and  $\tau$  candidates. The Cluster crate consists of 9 Cluster Processor cards, each receiving data from two regional crates and one HF crate on six 34-pair cables. The data from the two regional crates, which cover  $|\eta|<3$  and a  $40^\circ$  bin in  $\phi$ , consist of a 10-bit  $E_T$  and one  $\tau$  veto bit per 4x4 region and are received on 80 MHz parallel differential ECL links by each Cluster Processor card. The same technology is used to receive data from the HF crate covering  $-5<\eta<-3$  and  $3<\eta<5$ . In order to cover the  $\eta-\phi$  plane seamlessly these data are exchanged on a custom point-to-point backplane similar to the backplane of the regional trigger crates. The mapping shown in Fig. 5.3 is implemented. Adder ASICs are used to form  $E_T$  sums over 3x3 arrays of 4x4 regions (12x12 trigger towers) centred on each 4x4 region. The central 4x4 region  $E_T$  is required to be greater than the neighbours to the right and bottom, and to be greater than or equal to the neighbours on the left and top. Each Cluster Processor card produces 36 such 12x12  $E_T$  sums. These candidates are divided into central and forward jets. Each of the central 28 sums is classified as a  $\tau$  candidate if it is not vetoed by any of the nine 4x4 regions contained in it. The remaining candidates are classified as central jets.
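The jet-finding condition lends itself to a short Python illustration. The sketch below is a simplified model under several assumptions that are not taken from the text: the grid is indexed as `[row][column]` with columns increasing to the right and rows increasing towards the bottom, regions outside the grid count as zero (no wrap-around in $\phi$), and only the four neighbours named in the text are tested. The function name `find_jets` is hypothetical.

```python
def find_jets(regions):
    """Find 3x3-region jet candidates on a grid of 4x4-region E_T sums.

    regions: 2-D list [row][col] of non-negative E_T values.
    Returns a list of (row, col, summed_et) tuples.
    """
    n_rows, n_cols = len(regions), len(regions[0])

    def et(r, c):
        # Regions outside the grid are treated as empty (assumption).
        return regions[r][c] if 0 <= r < n_rows and 0 <= c < n_cols else 0

    jets = []
    for r in range(n_rows):
        for c in range(n_cols):
            centre = et(r, c)
            is_local_max = (
                centre > et(r, c + 1)        # strictly greater than right
                and centre > et(r + 1, c)    # strictly greater than bottom
                and centre >= et(r, c - 1)   # greater than or equal to left
                and centre >= et(r - 1, c)   # greater than or equal to top
            )
            if is_local_max:
                total = sum(et(r + dr, c + dc)
                            for dr in (-1, 0, 1) for dc in (-1, 0, 1))
                jets.append((r, c, total))
    return jets
```

The asymmetric use of strict and non-strict comparisons means that two adjacent regions with equal $E_T$ yield exactly one candidate rather than two or none, and that empty regions never produce a candidate.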

### 5.8.1 Jet and $\tau$ Sorting

The  $E_T$  sums of the central jet, forward jet and  $\tau$  candidates are compressed to a 6-bit rank using a look-up table and are associated with 5 bits defining their position on each Cluster Processor card. These candidates are separately sorted based on their rank while keeping track of their position bits to find the top four candidates of each category. Sort ASICs are used for this operation. The top four candidates of each type, from each Cluster Processor card, are staged to the Cluster Output card for sending to the Global Calorimeter trigger. The Global Calorimeter trigger system continues the sort tree to obtain the top four candidates over the entire  $\eta-\phi$  plane. Additionally, it thresholds these candidates to count the numbers of jets above programmable thresholds in various  $\eta$  regions to provide triggers for events with more than four jets.

### 5.8.2 Missing and Total $E_T$ sums

Four  $E_T$  sums per Cluster Processor card, corresponding to two  $20^\circ$   $\phi$  bins served by it covering  $-5 < \eta < 0$  and  $0 < \eta < 5$  separately, are calculated using Adder ASICs. The resulting 10-bit  $E_T$  and overflow bit for each of these four strips are passed to the Cluster Output card for sending to the Global Calorimeter Trigger system, where they are used for calculating both missing and total  $E_T$  sums and luminosity monitoring.

### 5.8.3 Connection to Global Calorimeter Trigger

Data are sent to the Global Calorimeter Trigger from 18 regional crate Jet/Summary cards on two 34-pair cables in parallel format at 80 MHz. In order to provide a low-cost and reliable solution that works on cable runs up to 20 m in length, we use ECL differential signals in parallel. These data include the ranks and positions of the top 4 isolated and non-isolated electron/photon candidates, amounting to 10 bits per candidate, and the muon quiet and MIP bits for the fourteen 4x4 regions handled by the card. The muon data are simply passed on to the Global Muon Trigger by the Global Calorimeter system. Twenty-four such 34-pair cables at 80 MHz are used to transfer 36 each of the central, forward and  $\tau$  jet objects, and 36  $E_T$  sums, each an 11-bit datum, from the Cluster crate to the Global Calorimeter Trigger subsystem.
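A quick consistency check of the Cluster crate cable count can be made in Python. This is a back-of-the-envelope sketch that assumes each differential pair carries one bit per 80 MHz transfer, i.e. two bits per 25 ns bunch crossing; the constant and function names are hypothetical.

```python
PAIRS_PER_CABLE = 34
TRANSFERS_PER_CROSSING = 2   # 80 MHz transfers within one 25 ns crossing


def cable_capacity_bits(n_cables):
    """Bits per bunch crossing carried by n_cables 34-pair cables."""
    return n_cables * PAIRS_PER_CABLE * TRANSFERS_PER_CROSSING


# Cluster crate to GCT: 36 each of central, forward and tau jet objects
# plus 36 E_T sums, each an 11-bit datum.
cluster_payload_bits = (3 * 36 + 36) * 11
cluster_capacity_bits = cable_capacity_bits(24)
```

Under these assumptions the payload is 1584 bits per crossing against a capacity of 1632 bits, so twenty-four cables are sufficient with a small margin.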

### 5.8.4 Error Detection

### 5.8.5 System Test Errors

On system power-up an automatic test of each card is envisioned. The outcome of the test can interrupt the crate processor. A power-up reset test and more detailed diagnostics using the boundary scan technology can be initiated by the crate processor. The results of the check and other status flags are expected to be indicated visually on the front of the card using colour-coded LEDs. It is envisioned that the system can operate with reduced functionality if a portion of it has errors.

### 5.8.6 Boundary Scan

All ASICs and some commercial components used in the Regional Calorimeter Trigger system contain boundary scan support. The ASIC boundary scan implementation, along with a proper board level implementation, should provide full testing capability of the ASIC while it is in circuit. The boundary cells can also be used to verify circuit integrity (shorts, opens, and stuck at one/zero) at the board level. IEEE standard 1149.1 has been strictly adhered to in order to ensure compatibility with other ASIC and board level boundary scan controllers. The full JTAG controller and a major subset of the commands have been implemented. All inputs and outputs, with the exception of the five boundary scan control signals, have scan I/O cells.

An outline of the features under consideration for board level boundary scan is given here. The Phase ASIC has boundary scan on its outputs only. However, this should be sufficient to provide JTAG control over most of the circuitry on the card. The Phase ASIC can be used to set up data across all the inputs to the look-up tables. The Adder ASIC can capture the resulting data on its inputs or pass it through to the downstream logic. The Boundary Scan ASIC has scan cells on both its inputs and outputs. The inputs can be used to capture test data set by either the Phase or Adder ASICs. The outputs can be used to set up test data for the backplane. Scan registers in ASICs on the Electron Identification card and Jet/Summary card can capture the data directly off the backplane. The Boundary Scan ASIC also provides the differential drivers for the backplane data and delay registers for data alignment. The delay is programmable from zero to seven 6.25 ns cycles.

The board level boundary scan controller will have a hard wired program that is entered on power-up or by command through the VME interface. This program will use vectors stored in a PROM on the board to perform a minimal test on intra-board circuit integrity and make simple tests of data paths within the ASICs. The vectors will check continuity, stuck-on ones and stuck-on zeroes, and shorts for those nets in the data processing path. The results of the applied vectors will be read back into the boundary scan controller via the scan loop and compared with the expected results, also stored in PROM.

The boundary scan controller interprets commands sent through the VME controller to perform the individual JTAG instructions implemented in the ASICs, captures the resulting data, and sends it back to the crate controller on request. Several boundary scan loops will be implemented rather than a single long loop in order to minimize the time required for tests of specific sections of the logic.

### 5.8.7 Run Time Errors

#### Synchronization Errors

Synchronization errors result in zeroing of data so that the rest of the trigger system continues to function. The error itself can interrupt the crate processor. It is also counted to accumulate diagnostic information over a power-up period. The counters can be read by the crate processor to keep tallies by link over extended periods. The synchronization is expected to be recovered automatically during LHC abort gaps when the resynchronization signal is distributed by the TTC system.

#### Data Errors

Data errors are found using the Hamming code analysis in the Phase ASIC. Again any errors result in zeroing of data bits to allow continued functionality of the rest of the subsystem. The zeroing of data ensures that no spurious triggers are issued. Such triggers can overwhelm the data acquisition system and must be avoided even at the cost of incurring some dead regions. Error counting, and monitoring of the error counters from VME, will be built in. Any streaming errors from a particular link can be identified and corrective action can be taken at an appropriate level in the calorimeter readout. The trigger system can function as a very fast monitor of any detector difficulties.

## 5.9 Latency

The latency of the Regional Calorimeter Trigger logic, divided amongst several stages, is estimated to be about 20 crossings as shown in Table 5.1. Since the logic is mostly clocked at 160 MHz, the operations are composed of 6.25 ns cycles, four of which add up to a single 25 ns crossing time. After the Phase ASIC on the Receiver card, the jet and electron/photon data streams flow through independent stages of processing. The electron/photon stream incurs a 20 crossing latency before its data are available at the output of the Jet/Summary card. The jet, muon MIP and quiet bit data are available after a latency of 11 crossings. The jet and  $\tau$  cluster formation in the Cluster crate is expected to take an additional 9 crossings, including the cable between the Jet/Summary card and the Cluster crate. The total latency of the system, not including the cables to and from the Regional Calorimeter Trigger, is 20 crossings. It is expected that the 20 m cables which carry data in and out of the RCT system add an additional latency of 4 crossings each.

**Table 5.1:** Regional Calorimeter Trigger latency.

| Description                                       | Latency (# of 25 ns crossings) |
|---------------------------------------------------|--------------------------------|
| Electron/Photon output at Jet/Summary card        | 20                             |
| Jet and muon output at Jet/Summary card           | 11                             |
| Cable to Cluster crate and jet, $\tau$ clustering | 9                              |
| Total Regional Calorimeter Trigger                | 20                             |
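The latency budget in Table 5.1 can be cross-checked with a few lines of Python. The sketch assumes that the total is set by the slower of the two parallel streams (electron/photon, and jet plus clustering), which is consistent with the numbers in the table; all names are hypothetical.

```python
CROSSING_NS = 25.0

EGAMMA_AT_JET_SUMMARY = 20   # crossings, electron/photon stream
JET_AT_JET_SUMMARY = 11      # crossings, jet, MIP and quiet bit data
CLUSTER_STAGE = 9            # crossings, cable to Cluster crate and clustering
CABLE_CROSSINGS = 4          # crossings per 20 m input or output cable


def rct_latency_crossings(include_cables=False):
    """Total RCT latency, taken as the slower of the two parallel paths."""
    egamma_path = EGAMMA_AT_JET_SUMMARY
    jet_path = JET_AT_JET_SUMMARY + CLUSTER_STAGE
    total = max(egamma_path, jet_path)
    if include_cables:
        total += 2 * CABLE_CROSSINGS   # one cable in, one cable out
    return total


def crossings_to_ns(crossings):
    return crossings * CROSSING_NS
```

Both paths come out at 20 crossings (500 ns); adding the input and output cables gives 28 crossings, or 700 ns.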

## 5.10 Prototypes and Tests

### 5.10.1 Overview

The goals of prototype development [5.4],[5.5] for this trigger system are manifold. Our strategy was to build, as far as possible, full-scale prototypes so that system issues are confronted up front. Particular attention is paid to including all sections of the trigger processing so that nominal component density and power are seen. This also enables accurate estimation of the latency incurred at each step of processing.

### 5.10.2 Crate

A full-size crate housing various prototype cards has been built using standard VME components. Pictures of rear and front views of the prototype crate holding a prototype backplane and cards are shown in Fig. 5.13. The cards are mounted on both ends of the crate and a central backplane is used to share data using point-to-point links.

### 5.10.3 Backplane

To prove that high speed and high signal density can be handled, we have built a full sized backplane with point-to-point 160 MHz links and VME control. The backplane is a monolithic printed circuit board with front and back card connectors. The top 3U of the backplane holds 4 row (128-pin) DIN connectors, capable of full 32 bit VME. The first two slots of the backplane use three row (96-pin) DIN connectors in the P1 and P2 positions with the standard VME pinout. Thus, a standard VME module can be inserted in each of the first two stations. The form factor conversion to the remaining slots is performed on the custom backplane. The bottom 6U of the backplane, in the data processing section of the crate, utilizes a single high-speed controlled-impedance connector for both front and rear insertion. The design is based around a 340 pin connector, by AMP Inc., to handle the high volume of data transmitted from the Receiver cards to the Electron Identification and Jet/Summary Cards. There are 1419 differential 160 MHz point-to-point links on the backplane between the various cards. The backplane is constructed with five ground and power planes and five signal layers with the differential pairs held to the same layer.



**Fig. 5.13:** Rear and front views of prototype Regional Calorimeter Trigger crate showing the custom backplane, a receiver card, a clock card and an electron identification card.

### 5.10.4 Clock and Control Card

We built a clock and control card, shown in Fig. 5.14, to provide the necessary signals to drive other cards in the crate. The signals on even the longest links on this backplane preserve their characteristics well. Results from testing clock signals on the backplane indicated rise and fall times of 0.8 ns from 20% to 80% height with reasonable signal levels even when measured at the farthest card slot. This performance meets the requirements of 160 MHz operation of the backplane.



**Fig. 5.14:** Picture of prototype Clock and control card.

### 5.10.5 Receiver Card

The Receiver card is by far the most complicated card in this subsystem. It involves receiving high speed signals, ensuring that their phase across the whole system is synchronized, reformatting the data using look-up tables for both electron/photon and jet data streams, pre-processing the data to reduce the backplane IO and staging the data to other cards within and outside the crate. During the design of the Receiver card the input link technology was changed from optical links to copper links. We have decided to make the system somewhat independent of link technology by designing the links on mezzanine plug-in cards. The phase adjustment is crucially dependent on the link technology. Given that the input requirements were not fully specified at that time we have not built in this portion of the Receiver card in the initial prototype. However, we left the back side of the Receiver card, where most of this functionality resides, free in the prototype. The front side of the prototype receiver card shown in Fig. 5.15 implemented all other functionality and also served as the Adder ASIC test card. The three Adder ASICs and all look-up tables needed in the design are all included. The look-up tables provided the test data for the Adder ASICs. Board level boundary scan is also implemented. The data staging circuitry both for backplane and crate-to-crate communication is included. The discrete component circuitry for data sharing ended up dominating the Receiver card layout. Therefore, we redesigned the Boundary Scan control, Electron Isolation and Sort ASICs to include the data driving/receiving capability.



**Fig. 5.15:** Picture of prototype Receiver card.

### 5.10.6 Adder ASIC

The Adder ASIC prototype implements the full functionality, i.e., sums 8 signed 10-bit operands to a single signed 13-bit result at 160 MHz in four 6.25 ns clock-steps. These ASICs, built by Vitesse in 0.6  $\mu\text{m}$  H-GaAs technology, chosen for its speed and ECL output capability, have been tested to work at 200 MHz, well above our specifications. The Adder ASIC consists of approximately 11,000 cells and uses 4 W. The tests of the Adder ASIC on the receiver card were successful so that we deemed this design of the Adder as final and have procured production quantities.
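The 13-bit result width follows from the operand count and width, as the following Python sketch shows. It assumes two's complement representation, which the text does not state explicitly; the function names are hypothetical.

```python
def signed_range(bits):
    """Range of a two's complement integer of the given width (assumed format)."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def sum_range(n_operands, operand_bits):
    """Extreme values of a sum of n_operands signed operands."""
    lo, hi = signed_range(operand_bits)
    return n_operands * lo, n_operands * hi


def bits_needed(lo, hi):
    """Smallest signed width that can hold every value in [lo, hi]."""
    bits = 1
    while True:
        range_lo, range_hi = signed_range(bits)
        if range_lo <= lo and hi <= range_hi:
            return bits
        bits += 1
```

Eight 10-bit operands sum to values between -4096 and 4088, which requires exactly 13 signed bits, so the adder output cannot overflow.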

### 5.10.7 Electron ID Card

A prototype of the Electron ID card has been made. The primary purpose of this card was to test dataflow between the Receiver card and the Electron ID card. As such, this prototype included only the backplane data receiver and capture into on-board memory. This memory could be read out using VME. This card could also be plugged into the Jet/Summary card slot in the crate for verification of the lines used by that card. The card did not actually implement the electron isolation algorithm, as it is included in the ASIC that was still under design at that time. This card was used for bulk dataflow tests within the crate. These tests have been successful, validating our 160 MHz point-to-point backplane link based design for the system.

### 5.10.8 Receiver Mezzanine Cards

The receiver card uses mezzanine cards mounted on its rear side to receive its input from calorimeter front-end crates. We have designed these mezzanine cards with serial link chips (Vitesse 7214 in this prototype) and associated signal equalization support. These circuits and an independent test card are in fabrication. We plan to use these cards to test the feasibility of receiving data on 20 m copper cables at 1.2 GHz.

### 5.10.9 Serial Link Test Card

The serial link test card provides for mounting of a transmitter and a receiver mezzanine card. These cards are shown in Fig. 5.16. The test card also includes memory for driving a programmed pattern of signals through the serial links and capturing it. The full-rate transmitted and received data can be compared on the fly to count any errors. This test card provides the features necessary to validate the link technology and compute the bit error rate. The receiver mezzanine cards used on this crate have the same form factor as those to be used on the Receiver card. The initial test of serial links shows good signal transmission at 1.2 GHz, as seen in the “eye” pattern in Figure 5.17, which shows the signal on one side of a differential pair with random input, triggered on the 120 MHz signal that clocks in input data. A Belden 9182 copper twinax cable with foamed dielectric was used with an optimized signal equalization circuit. The data transitions are clean as seen by the clear “eye”, indicating good data transmission quality. The bit error rate for all four channels is below  $10^{-13}$  when measured using constant, random or other patterns.

### 5.10.10 Prototype Summary

The success of our Adder ASIC prompted us to select the same technology for making other ASICs in our system. We have made designs of the Phase, Electron Identification, Boundary Scan/driver and Sort ASICs and have agreement with Vitesse to produce them.

We are also carrying over the knowledge gained in making the prototypes discussed here in designing the final backplane and other cards. We are incorporating the refinements made to the electron finding algorithm that resulted in some changes in the dataflow. The Receiver card prototype prompted us to move most of the discrete logic, used on that card to stage data to EID and Jet cards, into the Boundary scan ASIC. Another change we made is to use on-card DC-DC converters to distribute needed power at appropriate voltages.

**Fig. 5.16:** Pictures of Serial link test card, Transmitter and Receiver mezzanine cards showing the Vitesse 7214 link chips.

## 5.11 Status and Schedule

The schedule and milestones for the Regional Calorimeter Trigger construction are shown in Fig. 5.18 and Fig. 5.19. The schedule is driven by the fact that the CMS trigger system must be ready before any data taking period can begin in 2005. After passing the early milestones, which involved validating the design with prototypes, the project has moved on to the phase of completing the design of the ASICs and validating the choice of key commercial components used in the design. Early manufacture of the ASICs is advantageous for both the budget and the card manufacturing schedule. An agreement with Vitesse Semiconductor to manufacture all of the ASICs used in the RCT has been signed. ASIC designs are now complete and prototypes are being manufactured. Input gigabit serial link validation is also critical for finalizing the card designs and is being pursued. Full functionality prototype cards implementing the final algorithms discussed above are being designed to include the ASICs and serial link. The ASIC and card designs will be fully validated by the year 2002 when the final manufacturing phase begins. The system will be completely installed by the end of 2004 to begin commissioning for the CMS operations in 2005.

**Fig. 5.17:** The signal on one side of a differential pair of a serial link transmitting random data at 1.2 GHz, triggered on the 120 MHz clock that feeds in parallel input data.

**Fig. 5.18:** Schedule for RCT construction.

**Fig. 5.19:** Milestones for RCT construction.

## References

- [5.1] J. Lackey et al., *CMS Calorimeter Level 1 Regional Trigger Conceptual Design*, CMS NOTE-1998/074.
- [5.2] W. Badgett et al., *CMS Calorimeter Level 1 Regional Trigger Electron Identification*, CMS NOTE-1999/026.
- [5.3] S. Dasu et al., *CMS Calorimeter Level 1 Tau, Jet and Missing  $E_T$  Trigger*, CMS NOTE in preparation.
- [5.4] S. Dasu et al., *CMS Regional Calorimeter Trigger Prototypes*, 5th Workshop on Electronics for LHC Experiments, Sep. 1999, Snowmass, Colorado, USA.
- [5.5] W. H. Smith et al., *CMS Regional Calorimeter Trigger High Speed ASICs*, 6th Workshop on Electronics for LHC Experiments, Sep. 2000, Krakow, Poland.

# 6 Global Calorimeter Trigger

## 6.1 Introduction

The Global Calorimeter Trigger (GCT) is the final component in the Calorimeter Trigger chain. Its purpose is to implement the stages of the trigger algorithms which require information from the entire CMS calorimeter system. The GCT receives trigger object data from the Regional Calorimeter Trigger (RCT), performs several stages of data processing, and sends a reduced amount of information to the Global Trigger (GT). The GCT design is described in [6.1]. The baseline functions, shown in Figure 6.1, include:

- Final-stage sorting of e/ $\gamma$ , jet and  $\tau$  jet trigger objects according to rank
- Jet counting
- Calculation of total and missing transverse energy
- Luminosity monitoring using L1 trigger data.

The detailed specifications for each of these functions are given in the next section. The GCT implementation is then described, along with the interfaces to other trigger system components and the strategies for system control, setup and test. Finally, the GCT prototyping programme and project status and schedule are discussed.



**Fig. 6.1:** Global Calorimeter Trigger block diagram.

## 6.2 System Specification

In this section, the GCT algorithms are specified, and the functional requirements for data capture and system control are given. This description corresponds to the baseline GCT system. However, the GCT is flexible by design, and extensions to the system functionality are expected in the future; an example of one such extension is provided in Section 6.2.7.

The trigger object data entering and leaving the baseline GCT system at each bunch-crossing are summarised in Table 6.1.

### 6.2.1 Trigger Object Sort

The principal function of the GCT is to reduce the number of trigger object candidates that need to be considered by the GT. This is achieved by sorting the trigger objects identified by the RCT according to a computed rank, and forwarding to the GT only a fixed number of objects with the highest rank. The rank of each object is in general based on its  $E_T$ , but may be modified by other factors such as isolation status.

The RCT identifies objects of five different classes:  $e/\gamma$ , isolated  $e/\gamma$ , central jets, forward jets and  $\tau$ -jets. For the  $e/\gamma$  and isolated  $e/\gamma$  classes, the four highest-rank objects of each type from each barrel / endcap trigger region are sent to the GCT. For central jets and  $\tau$ -jets, trigger regions in the barrel / endcap are grouped into pairs to give region pairs covering 40 degrees in  $\phi$ . For forward jets, a ‘forward trigger region’ also covers 40 degrees. The four highest rank objects of each type from each jet region are sent. The rank ordering of each incoming set of four objects is not specified by the RCT; that is, the objects are sent to the GCT in random order. The GCT sorts incoming objects separately within each of the five classes, and forwards four from each class to the GT. All sorting is performed strictly according to object rank; in the rare case of two or more objects of a class having identical rank, a simple priority algorithm is applied, based on object positions.
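The final-stage sort for one object class can be sketched in Python as follows. This is an illustrative model only: the text states that ties are resolved by a simple position-based priority but does not define it, so the ordering by region index and then by position code used here is an assumption, as is the function name `gct_sort`.

```python
def gct_sort(region_objects, n_out=4):
    """Select the n_out highest-rank objects of one class.

    region_objects: one list per trigger region (or region pair), each
    holding (rank, position) tuples in arbitrary order.
    Returns (rank, region, position) tuples, highest rank first.
    """
    tagged = [
        (rank, region, position)
        for region, objects in enumerate(region_objects)
        for rank, position in objects
    ]
    # Highest rank first; equal ranks are ordered by region and then by
    # position (hypothetical priority rule, not taken from the text).
    tagged.sort(key=lambda obj: (-obj[0], obj[1], obj[2]))
    return tagged[:n_out]
```

For example, with two regions each sending four objects, the four survivors are chosen purely by rank, and two objects of rank 63 are ordered by the assumed position priority.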

The information received by the GCT for each trigger object consists of a rank and a location in  $(\eta, \phi)$  space. The position information identifies the trigger subregion in which each object lies. Only the relative position within each trigger region or region pair is specified; each region consists of a maximum of  $7 \times 2$  subregions, and so the location can be represented in four bits for  $e/\gamma$  objects and five bits for jets. The location information for each object is recoded and expanded during GCT processing to reflect the absolute position within the global  $(\eta, \phi)$  space; this requires nine bits per object at the GCT output. The datum and sense of the final  $\phi$ -coordinate match those defined in Chapter 1. The rank is encoded in six bits, and is not altered by the GCT.
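The word sizes quoted above (six rank bits plus four, five or nine position bits) can be illustrated with a small packing helper. The bit ordering chosen here, with the rank in the least significant bits, is an assumption for the example and is not taken from the interface specification.

```python
RANK_BITS = 6  # rank width stated in the text


def pack(rank, position, position_bits):
    """Pack rank and position into one word (rank in the low bits, assumed)."""
    if not 0 <= rank < (1 << RANK_BITS):
        raise ValueError("rank does not fit in six bits")
    if not 0 <= position < (1 << position_bits):
        raise ValueError("position does not fit in the given width")
    return (position << RANK_BITS) | rank


def unpack(word, position_bits):
    """Inverse of pack(): return (rank, position)."""
    rank = word & ((1 << RANK_BITS) - 1)
    position = (word >> RANK_BITS) & ((1 << position_bits) - 1)
    return rank, position


# 10-bit e/gamma input word: one of 7 x 2 = 14 subregions fits in four bits.
egamma_in = pack(rank=63, position=13, position_bits=4)
# 15-bit output word: nine bits of absolute position.
gt_out = pack(rank=63, position=300, position_bits=9)
```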

### 6.2.2 Jet Count

In order to improve the trigger efficiency for rare multi-jet events, a jet multiplicity trigger is implemented alongside the main jet algorithms. The GCT determines the number of central, forward and  $\tau$ -jet objects fulfilling each of several sets of rank and position criteria. The input for this calculation is the complete set of central, forward and  $\tau$ -jet objects sent to the GCT. The jet counts are sent to the GT.

**Table 6.1:** Trigger object data input and output for the GCT system. Quantities in parentheses denote output data.

| Source / Destination          | Data Type                                                                | Multiplicity | Total Bits per Bunch-Crossing  |
|-------------------------------|--------------------------------------------------------------------------|--------------|--------------------------------|
| Sort                          |                                                                          |              |                                |
| 18 regions                    | e/ $\gamma$ , isolated e/ $\gamma$ objects                               | $2 \times 4$ | $18 \times 8 \times 10 = 1440$ |
| 9 jet regions                 | Central jet, forward jet, $\tau$ -jet objects                            | $3 \times 4$ | $9 \times 12 \times 11 = 1188$ |
| Output to GT                  | e/ $\gamma$ , iso. e/ $\gamma$ , cent. jet, fwd jet, $\tau$ -jet objects | $5 \times 4$ | ( $20 \times 15 = 300$ )       |
| Jet count                     |                                                                          |              |                                |
| Output to GT                  | Jet count                                                                | 8            | ( $8 \times 4 = 32$ )          |
| Energy Sum                    |                                                                          |              |                                |
| 18 B/E and HF trigger regions | Half-region $E_T$ sums                                                   | 2            | $18 \times 2 \times 11 = 396$  |
| Output to GT                  | Total $E_T$                                                              | 1            | (13)                           |
|                               | $E_T^{\text{miss}}$ magnitude                                            | 1            | (13)                           |
|                               | $E_T^{\text{miss}}$ $\phi$ angle                                         | 1            | (6)                            |
| Total                         |                                                                          |              |                                |
| Input data from RCT           |                                                                          |              | 3024                           |
| Output data to GT             |                                                                          |              | (364)                          |

Eight separate jet counts are generated. Each count may have associated maximum and minimum cuts on rank,  $\eta$  and  $|\eta|$  position. The input and output jet counts are encoded in four bits; an overflow in the number of jets is recorded as the maximum value.
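A single jet count with its cuts and four-bit saturation might be modelled as below. The `JetCountCut` class, its default limits and the use of floating-point  $\eta$  values are assumptions for illustration; the hardware works on encoded subregion positions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JetCountCut:
    """Hypothetical cut set for one of the eight jet counts."""
    min_rank: int = 0
    max_rank: int = 63
    min_eta: float = -5.0
    max_eta: float = 5.0
    min_abs_eta: float = 0.0
    max_abs_eta: float = 5.0

    def accepts(self, rank, eta):
        return (self.min_rank <= rank <= self.max_rank
                and self.min_eta <= eta <= self.max_eta
                and self.min_abs_eta <= abs(eta) <= self.max_abs_eta)


def jet_count(jets, cut, n_bits=4):
    """Count jets passing the cut, saturating at the maximum n_bits value."""
    limit = (1 << n_bits) - 1
    n_pass = sum(1 for rank, eta in jets if cut.accepts(rank, eta))
    return min(n_pass, limit)


jets = [(40, 0.5), (20, -2.0), (50, 3.5), (10, 1.0)]  # (rank, eta) pairs
central_hard = JetCountCut(min_rank=30, max_abs_eta=3.0)
n_central_hard = jet_count(jets, central_hard)
```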

### 6.2.3 Global Energy Calculation

The GCT is required to calculate the total scalar transverse energy,  $E_T$ , and the direction and magnitude of the missing transverse energy vector,  $E_T^{\text{miss}}$ , for each bunch-crossing. Transverse energy sums are calculated by the RCT for each trigger subregion in the barrel / endcap and in the HF. The GCT uses these data to calculate the total scalar  $E_T$ , and combines them with programmable information on the mean  $\phi$  angle of each scalar energy integration region to obtain  $E_T^{\text{miss}}$ . The final energy sums are sent to the GT.

In order to achieve adequate missing energy resolution, it is necessary to calculate  $E_T^{\text{miss}}$  using scalar transverse energies integrated over no more than  $20^\circ$  around  $\phi$ . The RCT therefore calculates and sends to the GCT the total transverse energy sum for the two  $\phi$ -halves of each trigger region separately, each half containing  $7 \times 1$  subregions in  $(\eta, \phi)$ . In the HF, the RCT sums transverse energy separately for each of 18 sectors, each  $6 \times 1$  trigger towers in  $(\eta, \phi)$ ; the corresponding energy sums for the barrel / endcap and HF subdetectors are added in the RCT, and 36 separate energy sums are sent to the GCT, each corresponding to an area covering 5 units of  $\eta$  by 20 degrees in  $\phi$ . For each scalar transverse energy sum received by the GCT, the orthogonal components ( $E_x, E_y$ ) are calculated, and  $E_T, E_x$  and  $E_y$  are summed for the whole detector. Finally, the total ( $E_x, E_y$ ) are used to calculate the magnitude and direction of  $E_T^{\text{miss}}$ .

Each transverse energy sum sent by the RCT is represented on a linear 10-bit scale, and is accompanied by an overflow bit. If the overflow bit for any input energy is set, the overflow condition is propagated through the  $E_T^{\text{miss}}$  calculation, and a corresponding flag bit is set in the output sum to the GT. The global  $E_T$  calculated by the GCT is represented on a 12-bit scale, plus the overflow bit; the energy scale may be programmed as linear or compressed. The  $E_T^{\text{miss}}$  magnitude is likewise on a programmable 12-bit scale plus overflow, and the  $E_T^{\text{miss}}$   $\phi$  angle is represented in six bits. The datum and sense of the  $\phi$  angle match those defined in Chapter 1, allowing a straightforward comparison between the missing energy vector and muon candidate directions in the GT.
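The global energy calculation described in this section can be summarised by the following floating-point sketch. It assumes that the missing transverse energy vector is the negative of the vector sum of the region energies, and that each region is represented by its programmed mean  $\phi$  in degrees; the function name and the tuple layout are hypothetical, and the fixed-point scales and output encoding are not modelled.

```python
import math


def global_energy(region_sums):
    """Combine per-region scalar E_T sums into total E_T and missing E_T.

    region_sums: iterable of (et, phi_deg, overflow) tuples, where phi_deg
    is the mean phi of the integration region (assumed to be in degrees).
    Returns (total_et, met, met_phi_deg, overflow).
    """
    total_et = ex = ey = 0.0
    overflow = False
    for et, phi_deg, region_overflow in region_sums:
        phi = math.radians(phi_deg)
        total_et += et
        ex += et * math.cos(phi)
        ey += et * math.sin(phi)
        # Any input overflow is propagated to the output flag.
        overflow = overflow or region_overflow
    met = math.hypot(ex, ey)
    # Missing E_T points opposite to the summed energy flow (assumed convention).
    met_phi_deg = math.degrees(math.atan2(-ey, -ex)) % 360.0
    return total_et, met, met_phi_deg, overflow


sums = [(30.0, 90.0, False), (10.0, 270.0, False)]
total_et, met, met_phi_deg, overflow = global_energy(sums)
```

In this example the two deposits are back to back, so the scalar sum is the full 40 units while the missing energy is only the 20-unit imbalance, pointing away from the larger deposit.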

### 6.2.4 Luminosity Monitoring

Accurate knowledge of the LHC luminosity at the CMS interaction point is necessary in order to measure physics cross-sections. Absolute luminosity measurements will be performed using specialised detectors, with the resulting data evaluated offline. However, it is also important to perform continuous online monitoring of luminosity in order to provide rapid feedback to the CMS and LHC operators, and to monitor the luminosity for each LHC bunch-pair individually. Since the GCT receives data from the whole calorimeter system for every bunch-crossing, it is possible to provide online luminosity monitoring on a bunch-by-bunch basis.

In principle, the rate of any distinguishable physics signal can be used as a measure of relative luminosity. In practice, it is necessary to use signals with a rate that is low enough to count, but high enough to provide a rapid statistical measurement, and which are reasonably free of contamination by background and pile-up effects. The exact choice of channels is under study, but the measurements are likely to be based on rates of high- $p_T$  jets and/or global energy flows. It is therefore probable that the luminosity subsystem of the GCT will make direct use of the jet counts described in Section 6.2.2, along with summary data from the global energy calculation subsystem.

The calculated bunch luminosities are sent to the CMS detector control system at regular intervals. The goal is to provide an updated luminosity estimate for each LHC bunch every few seconds during normal running, with a relative precision of better than 10% for measurements integrated on this time scale [6.2].

### 6.2.5 Trigger Data Capture

The GCT is required to provide information to the CMS DAQ system for each event passing the L1 trigger. This information will be used for online performance monitoring and offline analysis of trigger efficiency. The DAQ and higher-level trigger systems may also make use of event summary data from the L1 trigger in order to identify regions of interest.

All input and output data associated with the GCT are made available to the DAQ system, along with selected intermediate data. This allows the performance of the GCT to be monitored and communication errors with other components of the L1 trigger system to be identified. The fraction of the available data which is sent to the DAQ system is programmable, as more detailed information will be required during setup and diagnostic periods than during normal running.

The interface between the GCT and DAQ systems is through the calorimeter Trigger Readout Crate (TRC) described in Chapter 7. The GCT captures information from the trigger data stream and buffers it for the duration of the remaining L1 latency. Upon receipt of an L1 accept signal (L1A), the data are passed into a derandomiser, grouped into a single event block, and sent to the TRC. Sufficient capacity is provided by the various buffers and by the GCT to TRC data transmission channel to allow, for example, the storage of all GCT input/output data corresponding to 4 bunch-crossings before and after each bunch-crossing passing the L1 trigger, at the maximum L1A rate.
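The capture scheme can be illustrated with a simple software model of a pipeline buffer feeding a derandomiser. The `CaptureBuffer` class and its parameters are hypothetical; in particular, the model assumes that an L1A arriving at a given crossing refers to the crossing a fixed `latency` earlier, and that this latency is at least as long as the post-trigger window.

```python
from collections import deque


class CaptureBuffer:
    """Toy model of per-crossing data capture with an L1A-driven readout."""

    def __init__(self, latency, window=4):
        if latency < window:
            raise ValueError("latency must cover the post-trigger window")
        self.latency = latency
        self.window = window
        # Deep enough to hold `window` crossings before the triggered one.
        self.pipeline = deque(maxlen=latency + window + 1)
        self.derandomiser = deque()

    def clock(self, bx, data, l1a=False):
        """Advance one bunch-crossing; an L1A refers to crossing bx - latency."""
        self.pipeline.append((bx, data))
        if l1a:
            centre = bx - self.latency
            block = [(b, d) for b, d in self.pipeline
                     if abs(b - centre) <= self.window]
            self.derandomiser.append(block)


buf = CaptureBuffer(latency=10)
for bx in range(100):
    buf.clock(bx, data=2 * bx, l1a=(bx == 50))
event = buf.derandomiser.popleft()
```

With these settings the event block contains nine crossings, centred on crossing 40.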

### 6.2.6 Control, Test and Monitoring

Automated control, test and monitoring of the GCT system is an important requirement. It must be possible to perform system setup, reset, test and reconfiguration without physical access at any time during LHC running.

The GCT is capable of performing rigorous self-test on a chip, board, and system level whilst *in situ*, and without manual intervention. Tests of almost every system component may be performed using simulated physics data generated by software, previously captured input data or real input data from the RCT. The GCT can also fulfil a variety of test functions in cooperation with other trigger subsystems, in order to test synchronisation and communication link performance.

Continuous online monitoring of GCT operation is performed by capturing data from the normal trigger datastream. The system is controlled by software capable of diagnosing problems using the monitoring data, and alerting the supervising trigger control system to take corrective action (e.g. halt, system reset, buffer flush) when necessary. Fast status signals are also derived directly from the GCT processor logic in order to detect conditions such as synchronisation loss and buffer overflow, and sent to the trigger control system via dedicated hardware links. The GCT provides data to the TRC system to allow monitoring in conjunction with the rest of the calorimeter trigger (see Section 6.2.5).

High-level control of the GCT system will be coordinated through the trigger control system; the interface to this system will be implemented at a software level (see Chapter 7).

### 6.2.7 GCT Extensions

The GCT is designed to be sufficiently flexible to accommodate future extensions. It will be built from a small number of generic module designs, allowing the implementation of additional functions by simply installing extra modules of the same type.

One such extension which has been studied in detail is the possibility of performing the sliding window jet cluster algorithm using the GCT hardware. In this option, the GCT receives 14 subregion energies from the Jet/Summary Card in each RCT barrel / endcap crate (see Chapter 5), along with the  $\tau$  feature bits. The subregion energy data from the HF is also input. A clustering algorithm based on a  $3 \times 3$ -subregion sliding window is employed to find jets over the full range  $|\eta| < 5$ . The jets found by this procedure are then sorted as in the baseline design in three streams: central jets, forward jets and  $\tau$ -jets. The energy summation needed for the  $E_T$  and  $E_T^{\text{miss}}$  calculation is performed simultaneously.

To allow the jet processing to be performed in the GCT, the generic hardware module designs must allow some sharing of data between different processing elements. This is not a requirement for the implementation of the baseline functionality, but significantly increases the scope for further upgrades. Such features can be implemented easily within the architecture described in the following sections. More details on the jet cluster implementation in the GCT can be found in [6.3]. The number of modules required to perform the jet processing is similar to that required for the baseline GCT. The baseline system and the cluster processing together occupy three crates of electronics in a single rack, with some slots free for future possible upgrades.

## 6.3 System Overview

In this and the following sections, the implementation of the baseline GCT system is described. The overall layout of the system is presented first, followed by the detailed design of each of the components in later sections.

The design goals for the GCT implementation are:

- A high degree of modularity, to simplify the design, test and maintenance of both hardware and software
- Flexibility and room for expansion
- Comprehensive self-test and monitoring capability
- Low latency
- Reasonable cost, through use of commodity components.

In order to fulfil these goals, the hardware design makes use of two key technologies. Firstly, most processing functions are performed using fast  $0.18\mu\text{m}$  FPGAs. This approach results in a significant reduction in system cost, since small production runs of a number of different ASICs would otherwise be required. The use of programmable logic also allows a large degree of flexibility and modularity, since identical hardware may be used for a variety of processing functions. Secondly, all data transfer within and out of the system is performed using high-speed serial links built around a low cost off-the-shelf chipset. The elimination of large numbers of parallel connections on cables and backplanes enables a reduction in complexity and an increase in system density.

The GCT system layout in the baseline configuration is shown in Figure 6.2. The GCT is located within the CMS underground counting room USC55. The system is housed in two electronics crates, located within a single rack. The main GCT functions are implemented using two types of board. The data processing functions are performed by seven Trigger Processor Modules (TPMs). An additional TPM is used for data collection and the interface to the TRC system. Input data from the RCT are received, synchronised and reformatted by 15 Input Modules (IMs). System setup, control, test and monitoring are performed by a cluster of embedded processors located on the TPMs. Each TPM and IM is implemented on a single  $9\text{U} \times 400\text{mm}$  printed circuit board.

**Fig. 6.2:** Layout of the GCT system within USC55.

Input data are transmitted from the RCT to the GCT as 80 Mbit/s parallel differential ECL signals, carried on 60 34-pair twisted-pair cables. Each IM receives data from up to four such cables. The input signals are sampled at 320 MHz and synchronised with the 80 MHz GCT system clock using variable-length FIFO buffers. The data are then serialised, and sent on 1.68 Gbit/s twisted-pair links to the TPMs. The IMs are all identical at the hardware level, but are programmed to provide various mappings of parallel data onto serial links as required by the different GCT subsystems. Clock and synchronisation signals are distributed to each IM optically through the TTC system [6.4]. Control and setup functions for the IMs are carried out over a serial backplane bus.

The TPMs also share a common hardware design. Trigger data are sent to, from, and between TPMs using identical fast serial links. Each board has up to 18 input channels, divided into banks of six. Each input channel has associated logic and deep memory buffers for data capture and test data injection. Data from each group of six input channels form the input to a ‘Stage A’ algorithm processor, and three such Stage A blocks on each TPM in turn feed a single ‘Stage B’ processor. The final output from the TPM is sent over up to six output channels, each with associated data capture memory. A further output channel on each board is used for DAQ purposes, and is controlled by dedicated readout logic. Each TPM contains an embedded CPU which is responsible for board control, setup, test and monitoring. Communication between CPUs, and with the trigger control system, is via standard 10 Mbit/s Ethernet. Clock and synchronisation signals are distributed optically to each TPM through the TTC system.

**Table 6.2:** GCT module allocation.

| Subsystem                 | IM  | TPM |
|---------------------------|-----|-----|
| e/ $\gamma$ sort          | 4.5 | 1   |
| Isolated e/ $\gamma$ sort | 4.5 | 1   |
| Central jet sort          | 1.5 | 1   |
| Forward jet sort          | 1.5 | 1   |
| $\tau$ -jet sort          | 1.5 | 1   |
| Global energy sum         | 1.5 | 1   |
| Jet count                 | 0   | 1   |
| Luminosity                | 0   |     |
| TRC interface             | 0   | 1   |
| Total                     | 15  | 8   |

The number of IMs and TPMs required by each GCT subsystem is shown in Table 6.2. Six of the TPMs directly process data received from the RCT via the IMs. Another TPM is devoted to calculation of jet counts and measurement of luminosity using output data from the jet sort and global energy sum logic. An eighth TPM collects and concentrates data from the others and sends it via the TRC system to the DAQ.

The GCT IMs are also used to distribute HCAL feature data from the RCT directly to the Global Muon Trigger (GMT); this data does not pass through any TPM. The feature data are sent to the GMT in the same format, and on the same type of serial link, as the output data to the GT.

The overall latency of the GCT system is estimated at 15.5 bunch-crossings, including four bunch-crossings of contingency. This estimate excludes input and output cable delays.

## 6.4 Implementation: Interfaces

### 6.4.1 Regional Calorimeter Trigger Interface

#### Data Input System

The RCT sends 3024 bits of calorimeter trigger data to the GCT per bunch-crossing, plus 504 bits of data for the GMT. These are accompanied by extra bits used for synchronisation between the two systems. The RCT system design dictates that data must be transmitted over parallel copper cables using differential ECL signalling. The distance between the RCT and GCT systems will be around 20 m, giving a practical upper limit of 80 Mbit/s on the bit rate per differential pair. Data are sent over 60 34-pair twisted-pair cables, with up to six trigger objects or  $E_T$  sums transferred on each cable per bunch-crossing. A full technical specification for the RCT-GCT interface is given elsewhere [6.5].

The use of parallel data transmission requires a large number of electrical connections between the RCT and GCT systems. This places an effective lower limit on the physical size of the GCT, both in terms of front-panel space required for connectors and board space required for receiver circuitry. Distributing input and algorithm processing functions across many boards would lead to a large number of inter-board links and a corresponding increase in latency and system complexity. A dedicated input system is therefore used to reformat the parallel input data onto short-haul serial links, and thus allow each GCT subsystem to be accommodated on a single TPM. This approach also has the advantage that input and algorithm processing functions are effectively decoupled, allowing the use of a ‘generic’ TPM with fixed I/O capability for all algorithm functions.

#### Synchronisation

The mapping of data bits from the parallel input cables to the serial output links varies between IMs, and data from several sources within the RCT may need to be combined on a single serial link. All input data must therefore be synchronised to the GCT clock and to the correct bunch-crossing before serialisation.

Signals are transmitted from each RCT output board with less than 1 ns combined skew and timing jitter. However, several effects contribute to timing variation of signals arriving at the GCT. These include:

- Clock phase differences between the RCT and GCT, and between RCT crates
- Cable length tolerances and pair-to-pair skew within cables
- Data-dependent deterministic jitter
- Intentional latency differences between different parts of the RCT system.

In order to ensure the robustness of the RCT-GCT interface over a wide range of timing conditions, the GCT input system is designed to receive data of arbitrary phase on each input line. There are therefore four main requirements for the synchronisation system. Firstly, the incoming signals on each differential input must be sampled with the optimum phase in order to minimise bit errors during data capture. Secondly, the data stream for each input bit must be synchronised to the local GCT clock. Thirdly, all input data must be globally aligned to the same 80 MHz cycle within the same bunch-crossing. Fourthly, bunch-crossing identification must be correctly carried out for all input data; that is, the GCT must know when data corresponding to any given LHC bunch number is entering the system.

The organisation of the GCT input synchronisation logic is shown in Figure 6.3. Each incoming signal is sampled at 320 MHz, and the resulting bit stream stored in a FIFO buffer. This is followed by another FIFO clocked at 80 MHz. The length of each 320 MHz FIFO is set such that the best-placed sample is clocked into the next stage. The 80 MHz FIFOs have their length individually adjusted so that data from the same bunch-crossing is globally aligned across all IM outputs. The serial cables from the IMs to the TPMs are of uniform length, and the phases of the IM clocks are adjusted such that the timing of transitions at the deserialiser outputs on each TPM is compatible with the local board clock. The cables carrying e/γ and isolated e/γ objects from the RCT also contain a ‘BC0’ flag signal which is asserted once per LHC orbit. This bit is included in the serial data stream from the relevant IMs, and is used during system setup, in conjunction with the TTC BC0 flag bit, to correctly align the bunch counter logic on all TPMs.

**Fig. 6.3:** Organisation of the GCT input synchronisation logic

The adjustment of FIFO lengths and IM clock phase may be carried out automatically under software control whenever requested. During timing setup, test patterns are sent from the RCT in place of normal trigger data. The control software systematically varies delays at different points in the signal chain in order to locate the transitions in the incoming signals, and thus identify the optimum timing setup. The initial ‘timing-in’ of the TPMs and IMs with one another is carried out manually. Clock and synchronisation signals are distributed around the GCT system through a passive optical fan-out, and so the TTC clock phase between boards is not expected to vary significantly after system installation.
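The principle of locating signal transitions in order to choose the sampling phase can be sketched as follows. With 320 MHz sampling of an 80 Mbit/s signal there are four samples per bit; the model below counts transitions at each of the four phases of a received test pattern and selects a sample away from the bit edge. The function names, the alternating test pattern and the choice of the sample immediately following the first post-edge sample are assumptions for illustration, and real signals would also show jitter that spreads transitions over neighbouring phases.

```python
def best_sample_phase(samples, oversampling=4):
    """Return the sampling phase (0..oversampling-1) furthest from bit edges.

    Transitions are histogrammed by phase; the phase with the most
    transitions marks the first sample after the bit edge, and the next
    sample (one of the two central samples for 4x oversampling) is chosen.
    """
    transitions = [0] * oversampling
    for i in range(1, len(samples)):
        if samples[i] != samples[i - 1]:
            transitions[i % oversampling] += 1
    edge_phase = max(range(oversampling), key=transitions.__getitem__)
    return (edge_phase + 1) % oversampling


def oversample(bits, offset, oversampling=4):
    """Hypothetical test helper: repeat each bit and rotate by `offset` samples."""
    stream = [b for b in bits for _ in range(oversampling)]
    return stream[-offset:] + stream[:-offset] if offset else stream


pattern = [0, 1] * 8  # alternating test pattern
phase_aligned = best_sample_phase(oversample(pattern, offset=0))
phase_shifted = best_sample_phase(oversample(pattern, offset=3))
```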

#### Input Module Design

The input module block diagram and conceptual layout are shown in Figure 6.4. The IM is implemented in a standard 9U × 400mm Eurocard format. Each IM terminates up to four input cables, which connect at the rear. The connectors are 68-pin SCSI-2 type, chosen for their high pin density and mechanical robustness. All incoming differential lines are parallel terminated, and the ECL signals are converted to differential LVPECL by discrete line receivers. Even though the differential transmission scheme can tolerate up to 1 V or so of common mode offset, it is possible that problems may occur due to ground level differences between the RCT and GCT systems or due to excessive common-mode noise caused by ground loops. It is foreseen that the IMs will couple the connector shields to the GCT chassis ground via a parallel resistive-capacitive coupling designed to minimise low-frequency ground noise; this arrangement may be changed to a direct DC coupling via jumper selection on the IMs, or the coupling may be removed altogether. A similar grounding arrangement is used at the RCT end of the cables.

**Fig. 6.4:** GCT Input Module block diagram and layout.

All synchronisation logic is implemented within FPGAs; the device currently under consideration for this task is the Xilinx Virtex XCV100E [6.6]. An IM contains four such FPGAs, each handling the 34 signals from one cable. The FPGAs contain the synchronising FIFOs, control registers, and a pattern generator for testing the serial links. One of the FPGAs also contains additional logic for interfacing to the board control bus and the TTCrx ASIC. Each FPGA drives two serialiser chips, and the serial signals leave from the front of the board.

The serial links are implemented using the National Semiconductor DS90CR217/218A Channel Link chipset [6.7], which is capable of transmitting at 1.68 Gbit/s over distances of up to 3 m; the distance from IM output to TPM input is foreseen to be around 1 m. The parallel data are presented to the serialiser chip and emerge from the deserialiser chip as 21 80 Mbit/s streams. The Channel Link chipset offers the advantages of low cost, low power consumption and a simple control interface. The parallel input data are transmitted using 7-to-1 time multiplexing as three LVDS signals of 560 Mbit/s each; a fourth LVDS signal is used to transmit a synchronised clock. This transmission format allows the use of commodity four-pair STP cable and RJ45 connectors, both designed for Ethernet transmission. A single serial link is therefore capable of carrying four ten-bit e/ $\gamma$  objects per bunch-crossing from one parallel input cable, or two links may be used to carry six 11-bit jets or local  $E_T$  sums. The mapping of parallel input bits to each serial link may be selected by programming FPGA registers; this allows the same FPGA programs to be used for every IM.
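The link capacity figures quoted above follow from simple arithmetic, reproduced here as a check. The 40 MHz bunch-crossing rate is the nominal LHC value; the helper `links_needed` is a hypothetical name, and it considers trigger payload only, ignoring any synchronisation or spare bits.

```python
PARALLEL_STREAMS = 21      # parallel bits per Channel Link
STREAM_RATE_MBIT = 80      # rate of each parallel stream
BX_RATE_MHZ = 40           # nominal LHC bunch-crossing rate
LVDS_DATA_PAIRS = 3        # data pairs; a fourth pair carries the clock
MUX_FACTOR = 7             # 7-to-1 time multiplexing on each data pair

link_rate_mbit = PARALLEL_STREAMS * STREAM_RATE_MBIT    # 1680 Mbit/s
lvds_pair_rate_mbit = link_rate_mbit // LVDS_DATA_PAIRS  # 560 Mbit/s
bits_per_bx = link_rate_mbit // BX_RATE_MHZ              # 42 bits


def links_needed(n_objects, bits_per_object):
    """Number of serial links needed for the payload of one parallel cable."""
    payload = n_objects * bits_per_object
    return (payload + bits_per_bx - 1) // bits_per_bx


egamma_links = links_needed(4, 10)   # four ten-bit e/gamma objects
jet_links = links_needed(6, 11)      # six 11-bit jets or E_T sums
```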

Clock and synchronisation signals are distributed to the IMs through the CMS-wide TTC system. Each IM contains a TTCrx ASIC and optical receiver, and the TTC signals are distributed via a passive optical fanout. The programmable deskew facility of the TTCrx is used to synchronise the IMs with one another to within 500 ps, and to globally adjust the phase of the serial link clocks to suit the TPMs. The IM and TPM clocks are therefore not generally phase-synchronous. Clock signals are distributed around the board at 40 MHz in LVPECL format; all higher-frequency clocks, including the 80 MHz LVTTL clock signals needed for the serialisers, are generated within the FPGAs.

Each IM has a JTAG interface which is used for downloading and verifying the FPGA programs, and a low-speed serial control interface which is used for programming the FPGA and TTCrx registers. These signals are bussed along a passive crate backplane and enter the IMs through a connector at the bottom rear of each board. The IMs are addressed geographically, the address being programmed by a set of identity pins on the backplane connector; for JTAG, board addressing is achieved through use of a discrete control chip. The JTAG and control buses are both asynchronous with respect to the 40 MHz system clock, and are distributed via multi-drop Bus LVDS.

The IMs are supplied with regulated power at ±5 V through the backplane connector. In order to minimise the number of supply voltages which must be distributed, each IM contains converters to generate the additional 1.8 V, 2.5 V and 3.3 V supplies required by the board chipset. The backplane bus contains an asynchronous reset line which is asserted by the main crate power supply at startup.

#### Input Crate Layout

The baseline GCT input system requires 15 IMs. The modules are housed in a standard 21-slot 9U crate. An IM is identical to a TPM in terms of physical size and power / control / TTC interface; IMs and TPMs are thus interchangeable and may be mixed in the same crate. However, it is anticipated that the IMs will be housed in a dedicated ‘input crate’ positioned above a corresponding ‘processor crate’ within the GCT rack. The system is designed with capacity for future expansion, and so each crate is constructed to service a full complement of 21 cards if necessary. The IMs are cooled by forced vertical airflow. A passive backplane runs the length of the crate, and distributes power and bussed signals. The backplane signal lines are controlled via a twisted-pair cable from the processor crate; the cable connects via a rear header at one end of the backplane, and a bus terminator connects to an identical header at the other. Regulated power is supplied to the backplane from a PSU mounted at the rear of the crate.

The RCT system is located on the floor above the GCT within USC55 [6.8]. In order to minimise the length of the parallel input cables, they are passed directly through the intervening concrete deck to drop down vertically at the rear of the GCT rack. The cables are then routed to the rear of the input crate. Since the input cables are of heavy gauge, appropriate support hardware is provided at the rear of the input crate to relieve strain on the connectors. The serial cables leave from the front of each IM, and are restrained during the short run to the processor crate in order to prevent undue vibrational stress on the connectors.

A possible assignment of data cables to IMs is given in the RCT-GCT interface specification document [6.5]. In the proposed scheme, the total number of parallel cables entering the input crate is 60, and there are 84 serial links used for data transmission to the GCT TPMs, plus a further 36 for the connection to the GMT.
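The cable and link totals quoted above can be cross-checked against the module allocation in Table 6.2. The sketch assumes that every IM terminates its full complement of four cables, that each e/γ cable maps onto one serial link and each jet or energy-sum cable onto two (as described for the Channel Link payload), and that each e/γ and isolated e/γ cable feeds one additional link to the GMT. These per-cable assignments are assumptions made for the check, not taken from the interface specification.

```python
CABLES_PER_IM = 4

# name: (number of IMs from Table 6.2, serial links to the TPM per cable)
SUBSYSTEMS = {
    "e/gamma sort": (4.5, 1),
    "isolated e/gamma sort": (4.5, 1),
    "central jet sort": (1.5, 2),
    "forward jet sort": (1.5, 2),
    "tau-jet sort": (1.5, 2),
    "global energy sum": (1.5, 2),
}

cables = {name: int(ims * CABLES_PER_IM)
          for name, (ims, _) in SUBSYSTEMS.items()}
tpm_links = {name: cables[name] * per_cable
             for name, (_, per_cable) in SUBSYSTEMS.items()}

total_ims = sum(ims for ims, _ in SUBSYSTEMS.values())
total_cables = sum(cables.values())
total_tpm_links = sum(tpm_links.values())
# HCAL feature data travel on the e/gamma and isolated e/gamma cables.
gmt_links = cables["e/gamma sort"] + cables["isolated e/gamma sort"]
```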

### 6.4.2 Global Trigger Interface

The GCT sends 364 bits of trigger data to the GT per bunch-crossing. These are accompanied by further bits used for synchronisation and test purposes. The GCT and GT systems will be housed in adjacent racks, which allows the use of short-haul serial links for data transmission. The same Channel Link chipset is used for the GT interface as for the IM-TPM links, but the data rate is halved to 840 Mbit/s (corresponding to 40 Mbit/s parallel data rate). The maximum link distance is thereby extended to around 6 m. Similar grounding precautions to those used for the RCT interface are observed.

The GCT-GT interface makes use of 24 serial links; each of the 20 final sorted trigger objects is carried on one link, and four further channels are used to transmit the global energy sums and jet counts. Further details of the interface are given in Chapter 15, and elsewhere [6.9]. The bits on each serial link which are not required for trigger data are used for parity bits, a continuous bunch-crossing counter for synchronisation verification, and a flag bit to indicate whether test data or real trigger data are being sent.

Responsibility for data synchronisation at the GCT-GT interface lies with the GT system. In order to facilitate automatic synchronisation and test of the interface, the GCT sends predefined test patterns in the place of trigger data when requested by the trigger control system. All output data from the GCT are bunch-crossing aligned, regardless of any latency differences within TPMs.

### 6.4.3 Global Muon Trigger Interface

The GCT receives 28 bits of HCAL feature data from the RCT for each barrel / endcap trigger region. These data are carried using spare capacity on the e/$\gamma$ and isolated e/$\gamma$ object cables. The data are synchronised and reformatted by the IMs in an identical way to the GCT input data, with the exception that the incoming data are forwarded directly to the GMT and are not bunch-crossing aligned with data destined for the TPMs.

The feature data are sent directly to the GMT over serial links, without passing through any TPM. The GMT is physically integrated with the GT (see Chapter 14), and the transmission distance will therefore be identical to that for the GT links. The physical transmission scheme and data format are identical to those used for the GT interface, though the GT and GMT output data are not synchronised. The GMT interface requires 36 serial output links.

#### 6.4.4 TRC Interface

The amount of data transferred from the GCT to the TRC (and thence to the CMS DAQ) may be varied according to the level of monitoring or debugging required. The TRC interface is designed to cope with a ‘worst-case’ scenario of recording of all input and output data associated with the baseline GCT, for 4 bunch-crossings before and after every L1A, at the maximum rate of 100 kHz. This corresponds to an aggregate data transfer rate of around 4 Gbit/s, or 100 bits per bunch-crossing.

The DAQ data stream is assembled by the DAQ concentrator TPM using data accumulated from all other TPMs. The output from the concentrator module is carried on the same type of serial link as from the other TPMs, and four of the six available output links are used. The TRC system is likely to be located several tens of metres from the GCT, so it is not possible to use Channel Link directly for the interface. Instead, optical links are employed to transmit the data to a Trigger Concentrator Module (TCM) within the TRC using the S-LINK protocol [6.10].

The S-LINK interface module is implemented on a single 9U × 400mm board within the processor crate. Each S-LINK channel is 16 bits wide, and runs at a parallel data rate of 40 MHz. Eight channels are therefore required to accommodate the full DAQ data stream. It is foreseen that commercial S-LINK transmitter daughterboards will be used to implement the main functionality of the TRC interface.
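As a rough cross-check of the channel count, the capacity of one S-LINK channel can be compared with the worst-case DAQ rate quoted above. This is only a sketch: the 4 Gbit/s figure is approximate in the text, and the helper names are hypothetical.

```python
import math

# Illustrative S-LINK capacity check. The width and clock are taken from
# the text; WORST_CASE_MBIT_S is the approximate "around 4 Gbit/s" figure.
SLINK_WIDTH_BITS = 16
SLINK_CLOCK_MHZ = 40
WORST_CASE_MBIT_S = 4000


def slink_channel_rate_mbit_s():
    """Parallel throughput of one S-LINK channel in Mbit/s."""
    return SLINK_WIDTH_BITS * SLINK_CLOCK_MHZ


def min_channels(required_mbit_s):
    """Smallest whole number of channels that covers the required rate."""
    return math.ceil(required_mbit_s / slink_channel_rate_mbit_s())


per_channel = slink_channel_rate_mbit_s()
minimum = min_channels(WORST_CASE_MBIT_S)
capacity_of_eight = 8 * per_channel
```

With these numbers one channel carries 640 Mbit/s, so seven channels would only just cover the approximate worst case, while the eight channels specified provide 5.12 Gbit/s. How the design arrives at exactly eight is not derived in this sketch.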

#### 6.4.5 TTC Interface

The GCT relies upon the CMS-wide TTC system to provide timing, synchronisation and some high-level control signals. Specifically, the 40 MHz clock and BC0 signals are used by all modules in the system for synchronisation purposes. The L1A signal is used by the TPMs to control the DAQ interface on each board. The broadcast command facility is used to provide a hard reset signal for the TPM processors in case of serious software failure, but this facility is not expected to be used under normal circumstances. The GCT system will correctly interpret any CMS-wide or trigger-wide standard broadcast commands, and take appropriate action.

The physical interface to the TTC system is an optical receiver on each IM and TPM. The GCT rack contains a passive optical fanout in order to provide signals to every board, and the signal distribution from the TTC transmitter to the GCT system is via a single fibre.

### 6.5 Implementation: Processing

#### 6.5.1 Trigger Processor Module

##### Introduction

The purpose of the TPM is to provide a generic processing platform on which a variety of trigger algorithms may be implemented. All TPMs share the same hardware design; in addition to the large reduction in design effort this approach entails, it also allows the GCT system functionality to be easily extended, modified or upgraded, reduces the quantity of spare components required, and makes the development of system software more straightforward.



**Fig. 6.5:** GCT Trigger Processor Module block diagram and layout. Clock and synchronous control lines are omitted for clarity.

The TPM block diagram and conceptual board layout are shown in Figure 6.5. Each module contains several distinct blocks of logic, each of which is discussed in the following sections. All trigger processing functions are implemented in FPGAs, with external memories used as buffers and lookup tables. The bussed data lines between FPGAs are implemented in a single-ended low-swing logic standard (e.g. SSTL), and run at 80 Mbit/s. The board clock is distributed as 40 MHz differential LVPECL through a low-skew fanout tree. Higher-speed clocks are generated within the FPGAs, and used to clock the associated external memories and serialiser chips where necessary. Four other synchronous signals (reset, BC0, L1A, sync) are multiplexed together into a single 160 Mbit/s signal, and distributed through a fanout tree in a similar way to the main clock.

The boards are constructed in standard 9U × 400mm format, and are housed in a 21-slot crate (the ‘processor crate’). Since the baseline GCT functionality requires only eight TPMs, there is capacity for extension of the system; the GCT implementation is able to scale up to 19 TPMs if necessary. No separate crate controller board is used. Regulated power at 5V and 12V is supplied via a connector at the bottom rear of each board, the connector also serving to provide mechanical support and location. The boards contain converters to generate the 1.8V, 2.5V and 3.3V supplies which are also required.

##### Trigger Data Input/Output

Trigger data enter, leave, and travel between TPMs through ‘data ports’ at the front and rear of the boards. Each data port is implemented as a high-speed serial Channel Link connection running at up to 1.68 Gbit/s (i.e. 80 MHz parallel data rate). Shielded 8-way RJ45 connectors are used due to their good electrical performance, proven reliability, and adequate connection density. Each TPM has 24 main data ports, grouped into banks of six; three banks are positioned at the front of the board, and one at the top rear. The front-mounted banks contain dedicated input ports, running at 1.68 Gbit/s. Each port within the rear bank may be configured separately as an input or output, by manually inserting or removing a passive termination network. The serial outputs may transmit at 1.68 Gbit/s or 840 Mbit/s; all data transfer within the GCT system takes place at full speed, whereas the serial links to the GT run at half speed. A further full-speed output port at the bottom-front of each board is used for data transfer to the DAQ concentrator TPM.

Each bank of data ports has an associated block of I/O logic. This is implemented in two FPGAs, each handling three ports; a possible device for this role is the Xilinx XCV300E [6.6]. For input ports, data emerging from the Channel Link deserialisers are registered and reformatted by the I/O logic for transmission to the algorithm logic. For output ports, data from the algorithm logic are reformatted, demultiplexed to 40 MHz if necessary, and clocked into the Channel Link serialisers. The output logic includes programmable-length FIFOs to bunch-crossing align the data from all GCT outputs, if necessary due to latency differences between subsystems. The I/O FPGAs are each connected to the algorithm logic by 64-bit wide data paths. Clock and control signals for the serialiser and deserialiser chips are generated within the FPGAs.

During normal TPM operation, all incoming and outgoing data are captured in deep buffers implemented using external dual-port memories (DPMs). These memories may act as both latency buffers for the DAQ data, and as spy / playback buffers for monitoring and test purposes. Both ports of each DPM block are connected to the associated I/O FPGA so that the contents may be read out for DAQ or diagnostic purposes without interrupting the main trigger functions or data capture.

##### Algorithm Processing

The algorithm logic on each TPM is arranged as a tree, with processing performed in two sequential stages. All logic is fully pipelined. The three front input banks are each connected to a separate Stage A processing block through two 64-bit data buses. The three A blocks each implement the first part of the relevant trigger algorithm in parallel, and forward a reduced amount of data to a single Stage B block; the data bus from each A block is 64 bits wide. The B block is connected directly to the rear bank I/O FPGAs, through two 64-bit buses. The directions of the data flow between the A and B blocks, and between the B block and rear I/O bank, are programmable, allowing maximum flexibility in the way the TPM resources may be used.

Most small lookup table (LUT) and FIFO memory arrays required for algorithm processing are implemented using FPGA internal resources. Where a large LUT is required, however, it cannot be efficiently implemented in an FPGA. For example, one of the GCT baseline algorithms involves the calculation of missing energy from input energy sums (see Section 6.5.4). An external 2 Mbit 80 MHz synchronous SRAM is coupled to the B block to fulfil this function.

Each algorithm block is implemented using a single Xilinx Virtex FPGA [6.6]. The device under consideration for both the A and B blocks is the XCV1600E. Details of the mapping of the different algorithms onto TPM resources are given in Section 6.5.4.

##### TRC Interface

Each TPM is capable of capturing all input and output data for DAQ purposes. The usual mode of operation is to send to the TRC system the data corresponding to each bunch-crossing with an associated L1A. Other modes are available, however; for instance, data for several bunch-crossings before and after the L1A may be stored, or data for only a fraction of the L1As, or for a certain class of trigger types. The amount of data captured may be restricted if necessary; for instance, only one subsystem, or one detector region, may be of interest for some monitoring purposes.

Each I/O FPGA stores a copy of all incoming or outgoing data into a DPM. The DPM is organised as a circular buffer, and holds the data for the duration of the remaining L1 latency. When an L1A signal is received, the corresponding data location is placed in a queue, and the data are read out in turn via the second port of the DPM. The readout path is 16 bits wide, and is implemented as a chain of point-to-point buses between the I/O FPGAs. Each FPGA outputs its own DAQ data, and that which it receives via the incoming DAQ bus, over several cycles, and eventually all DAQ data arrive at the board control FPGA, which is the last in the chain. This device is connected to a dedicated data port, which is used only for DAQ output. Since the readout bandwidth is much smaller than that of the main I/O paths, it takes many cycles to read out the data resulting from one L1A; the DPMs therefore also act as derandomiser buffers. The memory depth is sufficient so that, given the standard CMS trigger rules (see Chapter 16) and a 100 kHz maximum L1A rate, no overflow will occur if up to nine bunch-crossings' worth of data are read out per L1A. The sequence of events on the DAQ data path after each L1A is entirely deterministic, and neither flow control nor a readout controller is required. Likewise, the DAQ data are sent to the concentrator TPM, and thence to the TRC, in ‘push mode’, and no handshaking or flow control is required at either of these interfaces.
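The behaviour of the circular latency buffer and its L1A queue can be illustrated with a small software model. This is a minimal sketch rather than a description of the firmware: the class name, the `depth` and `latency` parameters, and the one-word-per-crossing simplification are all assumptions made for illustration.

```python
from collections import deque


class LatencyBuffer:
    """Toy model of a DPM used as a circular latency buffer (hypothetical)."""

    def __init__(self, depth, latency):
        if latency >= depth:
            raise ValueError("buffer must be deeper than the L1A latency")
        self.mem = [None] * depth
        self.depth = depth
        self.latency = latency      # crossings between data and its L1A
        self.write_ptr = 0
        self.pending = deque()      # queued addresses awaiting readout

    def clock(self, word, l1a=False):
        """Store this crossing's word; an L1A refers to `latency` crossings ago."""
        self.mem[self.write_ptr] = word
        if l1a:
            self.pending.append((self.write_ptr - self.latency) % self.depth)
        self.write_ptr = (self.write_ptr + 1) % self.depth

    def read_next(self):
        """Read one queued word via the 'second port'; None if nothing is queued."""
        if not self.pending:
            return None
        return self.mem[self.pending.popleft()]


buf = LatencyBuffer(depth=16, latency=3)
for bx in range(10):
    buf.clock(word=bx, l1a=(bx == 7))   # L1A at crossing 7 selects crossing 4
selected = buf.read_next()
```

In this model the write side never stalls, which mirrors the requirement that DAQ readout must not interrupt data capture; the queued address must of course be read before the write pointer wraps around and overwrites it.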

The DAQ data from every TPM are collected by the data concentrator TPM, and sent to the TRC system. The operation of the concentrator TPM is described in Section 6.5.4.

##### Control and Monitoring

Each TPM contains an embedded microprocessor, housed on a daughterboard, which performs setup, test and control functions. The CPU also continuously monitors the operation of the board, and takes appropriate action in case of errors. The TPM supports a variety of online and offline test and monitoring functions, using the DAQ DPMs as spy/playback buffers to inject and capture trigger data. Test and control signals are distributed around the board on a JTAG bus and a point-to-point serial control bus.

The embedded controller under consideration is the 486Core from CompuLab [6.11]. This self-contained module consists of an 80486-compatible CPU and all necessary support hardware, including 32MB of flash memory and an Ethernet interface. The board is software compatible with a standard ISA-bus PC, and runs the Linux operating system. The controller boots from flash memory, and contains sufficient software to program and operate the TPM in stand-alone mode without reliance on an external host. Under normal circumstances, however, high-level system control would be the responsibility of the trigger control system (see Chapter 16). Each TPM interfaces to the outside world via the Ethernet interface; the TPMs connect to the appropriate local Ethernet segment via a standard commercial hub in the GCT rack. The embedded controller requires power and reset signals from the TPM, but contains its own clock source. The TPM contains a watchdog / power monitor chip which resets the controller at system power-up, if the onboard software crashes, if a manual board reset switch is operated, or if a hard reset command is sent through the TTC system.

The controller interfaces to the TPM fast logic through a control FPGA, which is responsible for the following functions:

- Interface between the asynchronous CPU bus and the main TPM logic
- Capture of data blocks from the TPM in FIFO buffers for access over the slow CPU interface
- Serial control bus driver
- TTCRx interface
- Multiplexing and distribution of fast synchronous control signals (BC0, L1A, sync, reset)
- DAQ bus to serial DAQ output interface
- Other ‘glue’ functions.

The control FPGA appears to the CPU as a set of registers in the ISA bus I/O space. The registers allow access via semaphores and FIFO buffers to the data paths on the TPM. The DAQ bus terminates at the control FPGA, and blocks of data may be captured from this bus and read back to the CPU at a slower rate. The DAQ bus is also used for reading and writing spy and playback data to and from the DPMs; accesses of this type are automatically interleaved with DAQ readout cycles, with the latter taking priority. Data sampling is performed by page swapping in the DPMs used for the circular latency buffers, thus capturing all previous data in the buffer. Access to the control and status registers in the TPM FPGAs is via the serial control bus. In order to remove the need for arbitration logic, both the control and DAQ buses are implemented as a ring of unidirectional point-to-point links; the DAQ bus visits only I/O FPGAs, whereas the control bus is accessible by all devices. A single board-wide JTAG bus is used to program and verify all FPGAs, and access the TTCRx registers. The JTAG bus is directly controlled by the CPU, since it is used to program the control FPGA, and is also accessible from an external connector.

There is a general-purpose I/O connector at the rear of each TPM, which provides several differential LVDS and single-ended LVTTL inputs and outputs. This connector is used on two of the TPMs to control the input crate backplanes via cable links. It may also be used for board debugging and test, as the connector signal functions are determined only by the control FPGA program. The TPMs also have connectors placed at various points on the data paths for connection of high-density logic analyser probes.

#### 6.5.2 Control, Setup and Test

Each TPM is under the control of custom software, running on the embedded CPU. This software is designed to be as simple and robust as possible. The software will be written in a modern object-oriented high-level language, such as C++ or Java, for ease of maintenance. The choice of a PC-compatible embedded processor running Linux allows straightforward software development and cross-compilation on a separate PC host, without reliance on the expensive and proprietary software tools required for some embedded operating systems and platforms. The danger of hardware or software obsolescence is further reduced by the modular nature of the embedded CPU board, which may be upgraded or replaced during the lifetime of the system if necessary.

The control software implements the minimum set of low- and mid-level functions necessary to program and monitor a TPM. High-level control, monitoring and test functions are the responsibility of the trigger control system. The onboard software communicates with the main trigger control software via Ethernet, using a communications library based on the standard TCP/IP protocol. The software instances running on different TPMs do not communicate with one another directly. To allow stand-alone tests of a TPM without use of an external host, the control software may be instructed by manual operation of a switch to perform a standard set of board diagnostics and report the results in go / no-go fashion.

A variety of test and monitoring capabilities are built into the TPM, some of which are available during normal running, and some of which require alternative programs to be loaded into the FPGAs. JTAG is used for low speed physical board and chip-level test. High-speed physical tests for diagnosis of board-level problems are carried out by loading a set of test programs into the FPGAs and using the normal monitoring facilities or a logic analyser to capture the board state. Offline functional test of the algorithm logic may be carried out without reprogramming the FPGAs by playing back simulated data from the DAQ DPMs, and capturing the output data. Online monitoring of trigger function is performed by simultaneously sampling blocks of data from the inputs and outputs of the board, and comparing with a software simulation of the algorithms. It is anticipated that offline tests will be performed automatically at regular intervals (e.g. during LHC fills), whereas online monitoring is performed continuously during running, as frequently as the processing capacity of the board CPU allows.

Tests of the IM FPGAs, TPM I/O FPGAs and serial links are performed offline by sending known data patterns on the IM-TPM links. This operation may be performed without reprogramming the IM FPGAs.

The high-level control and synchronisation of test procedures between parts of the trigger hardware is the responsibility of the trigger control system. The TPM control software receives instructions from the trigger control system and acts upon them by performing diagnostic tests, reporting error rates or fault conditions, or providing test data to another element of the trigger system.

#### 6.5.3 Processor Crate Layout

The processor crate is a standard 21-slot 9U × 400mm subrack, with rear-mounted power supply. The TPMs are cooled by forced vertical airflow, with the unused front panel space closed off for cooling and EMC reasons. In the baseline GCT system, the processor crate contains eight TPMs and a single TRC interface board. Five TPMs are devoted to the sort subsystems for the five object classes. Global energy summation requires a further TPM. The jet count and luminosity monitor functions are combined on a single TPM, and the DAQ data concentrator requires an eighth module.

All input links from the IMs enter the processor crate at the front panel. The links to the GT leave from the back. The links from the six processing TPMs to the DAQ concentrator are routed between modules at the front of the crate, and the links to the jet count / luminosity TPM are routed at the rear. All inter-module serial cables are less than 4 ns long in order to maintain adequate timing margins at the receivers.

#### 6.5.4 Algorithm Implementation

In this section, the mapping of the various GCT algorithms onto TPM resources is described, and details are given of the data flow through and between TPMs.



**Fig. 6.6:** Organisation of GCT sort logic.

##### Trigger Object Sort

The GCT is required to perform a separate sort operation for each of the five categories of trigger object identified by the RCT. Five TPMs are used, one for each object type. For the e/$\gamma$ and isolated e/$\gamma$ object types, the number of input objects to the sort operation is 72; four from each of 18 trigger regions. All 18 front data ports of each TPM are used for input, with four objects carried on each serial link. For the central, forward and $\tau$-jet sort, the number of input objects is 36; four from each of nine region pairs. However, the number of bits required to send the object data for four jets exceeds the capacity of a single data port; the trigger object data are therefore packed such that six jets are received on a pair of ports. Twelve input ports of each TPM are used in total, arranged as four per input bank, such that each input bank handles the jets from three region pairs. Each sort operation is required to output the four overall-highest rank objects in rank order.

The sort operation is performed in three stages (see Figure 6.6). The sort algorithms operate on groups of four objects; conceptually, the objects of each group are presented to the sort logic one at a time, and are considered in turn. To simplify the main sort logic, it is required that each input group have its member objects prearranged in rank order, such that the object with highest rank is presented first. The first stage of processing is therefore a ‘presort’ operation, which takes groups of four objects (e.g. the e/$\gamma$ objects from one trigger region, or jets from one region-pair) and sorts them by rank. No objects are discarded at this stage. The second stage of logic takes up to six presorted groups of four objects as input, and identifies the four overall highest rank objects; the nature of the sort algorithm ensures that the four surviving objects are arranged in rank order at the output. The third stage is similar to the second; it takes as input the three object groups from the second stage logic, and outputs the four overall highest-rank objects. The first two stages of logic are implemented within the TPM Stage A algorithm blocks, and the third within the Stage B block.
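The data flow through the three stages can be expressed as a short functional model. The sketch below represents each trigger object by its rank alone and uses a software merge in place of the pipelined hardware; the function names and the bank layout are illustrative assumptions.

```python
import heapq
from itertools import islice


def presort(group):
    """Stage 1: order one group of four ranks, highest first."""
    return sorted(group, reverse=True)


def select_top_four(presorted_groups):
    """Stages 2 and 3: four highest ranks from several presorted groups."""
    merged = heapq.merge(*presorted_groups, reverse=True)
    return list(islice(merged, 4))


def sort_tpm(banks):
    """Functional model of one sort TPM.

    banks: three input banks, each a list of up to six groups of four ranks.
    Presort and second-stage sort run per bank (Stage A); the third stage
    combines the three bank results (Stage B).
    """
    stage_a = [select_top_four([presort(g) for g in bank]) for bank in banks]
    return select_top_four(stage_a)


# 72 example ranks: 3 banks x 6 groups x 4 objects (arbitrary test data).
ranks = [(i * 37) % 64 for i in range(72)]
groups = [ranks[i:i + 4] for i in range(0, 72, 4)]
banks = [groups[0:6], groups[6:12], groups[12:18]]
top_four = sort_tpm(banks)
```

Because each stage emits its four survivors in rank order, the output of one stage is a valid presorted input group for the next, which is what allows the second- and third-stage logic to share a design.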



**Fig. 6.7:** GCT second- and third-stage sort logic.

The implementation of the second and third stage sort logic within the algorithm FPGAs is illustrated in Figure 6.7. The core of the logic is a ‘find four of six’ block. This operates only on the highest-rank object from each input group. The logic finds the highest rank four of the six input objects, using 15 six-bit comparators to perform a parallel compare of all possible two-rank combinations. Combinatorial logic is then used to provide one-hot outputs which drive a set of multiplexers to select both the overall highest rank object for immediate output, and nine other objects for further consideration. This set of nine objects is guaranteed to contain those of the 2nd, 3rd and 4th highest rank, and a ‘find three of nine’ block is then used to identify these and select them for output. The algorithm is implemented in fully pipelined logic, and has a fundamental latency of five clock cycles. The presort has a latency of two clock cycles. It is foreseen that the algorithm clock speed will be set at 80 MHz; the overall latency of the entire sort operation is therefore six bunch-crossings. Further details of the sort algorithm implementation are given elsewhere [6.12].
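The selection of the nine candidates can be made concrete with a behavioural model. The following sketch reproduces the 15 pairwise comparisons on the group leaders and then picks the nine objects that can still reach the top four; the tie-breaking rule (lower input index wins) and the use of a software sort for the ‘find three of nine’ step are assumptions, not details taken from the hardware design.

```python
from itertools import combinations


def find_four_of_six(groups):
    """Behavioural model of the second/third-stage sort core (hypothetical).

    groups: six lists of four ranks, each already in descending order.
    Returns the four highest ranks overall, in descending order.
    """
    heads = [g[0] for g in groups]

    # 15 comparators: every pair of group leaders is compared once.
    wins = [0] * 6
    for i, j in combinations(range(6), 2):
        if heads[i] >= heads[j]:        # ASSUMPTION: lower index wins ties
            wins[i] += 1
        else:
            wins[j] += 1

    # With a strict tie-break the win counts are a permutation of 0..5.
    order = sorted(range(6), key=lambda k: -wins[k])
    first, second, third, fourth = order[:4]

    best = groups[first][0]             # overall highest rank, output at once
    candidates = (groups[first][1:4]    # three remaining from the best group
                  + groups[second][0:3]
                  + groups[third][0:2]
                  + groups[fourth][0:1])  # nine objects in total

    # 'Find three of nine', modelled here with a plain sort.
    return [best] + sorted(candidates, reverse=True)[:3]


example = [[9, 5, 3, 1], [63, 2, 2, 0], [43, 42, 41, 40],
           [9, 9, 9, 9], [0, 0, 0, 0], [62, 61, 60, 1]]
result = find_four_of_six(example)
```

The nine candidates are sufficient because an object placed k-th within the group whose leader is ranked m-th among the leaders has at least m + k - 2 objects above it, so it can only reach the overall top four if m + k is at most five.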

The four output objects from each sort TPM are sent directly to the GT on four of the rear data ports. Extra logic is included alongside the sort algorithm on the central, forward and $\tau$-jet TPMs to count the number of input jets passing up to eight different sets of programmable rank and position cuts. The eight resulting four-bit counts are sent from each of these boards to the jet count / luminosity TPM on a further rear data port.

##### Global Energy Calculation

The calculation of $E_T$ and $E_T^{\text{miss}}$ from local scalar transverse energy sums is performed using a single TPM. The RCT calculates a local $E_T$ sum for regions covering 5 units of $\eta$ by 20 degrees in $\phi$, giving a total of 36 $E_T$ sums as input to the calculation. The 11-bit $E_T$ words are packed in a similar way to the jet objects, such that six energies are carried on a pair of serial links; 12 input ports of the TPM are used.

The global energy calculation logic is shown in Figure 6.8. At the first stage, two orthogonal components of transverse energy ($E_x$, $E_y$) are calculated for each input sum. To save logic, this operation is performed using fixed-point constant multipliers rather than LUTs; since the energy calculation process is always significantly faster than the sort, there is no requirement for the lowest possible latency here. This operation is carried out within the Stage A algorithm block, and the total $E_T$, $E_x$ and $E_y$ for each input bank are passed to Stage B. Within the Stage B block, the overall total $E_T$, $E_x$ and $E_y$ are calculated using further adders.
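A fixed-point constant multiplier of the kind described can be modelled as an integer multiplication by a pre-scaled cosine or sine. The sketch below is illustrative only: the number of fraction bits, the choice of sector centres at 10° + 20°·i, and all names are assumptions.

```python
import math

FRACTION_BITS = 10   # ASSUMPTION: precision of the fixed-point constants
N_PHI = 18           # 20-degree sectors in phi

# One pre-computed integer constant per phi sector (sector centres assumed).
COS = [round(math.cos(math.radians(20 * i + 10)) * (1 << FRACTION_BITS))
       for i in range(N_PHI)]
SIN = [round(math.sin(math.radians(20 * i + 10)) * (1 << FRACTION_BITS))
       for i in range(N_PHI)]


def energy_components(sums):
    """Accumulate total E_T and scaled E_x, E_y from (phi_index, et) pairs.

    E_x and E_y are returned still multiplied by 2**FRACTION_BITS, so that
    no precision is lost before the final stage.
    """
    et_total = ex = ey = 0
    for phi_index, et in sums:
        et_total += et
        ex += et * COS[phi_index]
        ey += et * SIN[phi_index]
    return et_total, ex, ey


# 36 equal sums spread uniformly in phi should balance exactly.
uniform = energy_components([(i % N_PHI, 100) for i in range(36)])
# A single 1000-count deposit in the first sector.
single = energy_components([(0, 1000)])
```

Keeping the products at full width and rescaling only once at the end mirrors the statement that the number of bits is expanded through the adder tree.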

**Fig. 6.8:** Organisation of GCT energy sum logic.

The magnitude and direction of $E_T^{\text{miss}}$ are then computed using a large LUT implemented in external SRAM. Since currently available fast SRAM devices are limited in size to 4 Mbit or less, it is not possible to use the full precision of the $E_x$ and $E_y$ totals when calculating the LUT address. However, the physical symmetries within the $E_T^{\text{miss}}$ calculation may be exploited in order to circumvent this problem. Firstly, the sign information in $E_x$ and $E_y$ is not required, since they are added in quadrature to obtain the magnitude of $E_T^{\text{miss}}$; the correct quadrant for the vector direction may be calculated from the $E_x$ and $E_y$ sign bits directly, without use of extra LUT space. Secondly, the scale of the magnitude of $E_T^{\text{miss}}$ depends linearly on that of the input energies, whereas the angle does not scale at all. It is therefore possible to divide both $E_x$ and $E_y$ by a uniform factor, perform the $E_T^{\text{miss}}$ calculation using a reduced number of bits, and re-multiply the resultant magnitude of $E_T^{\text{miss}}$ by the same factor, without losing significant relative precision in the final result. This is implemented by using the highest non-zero nine bits of the larger of $E_x$ or $E_y$ to calculate the LUT address, along with the matching nine bits of the smaller of $E_x$ or $E_y$; this is appropriate, since the magnitude calculation is dominated by the larger of the components. Simulation shows that this energy scale compression procedure maintains the required $E_T^{\text{miss}}$ resolution (see Figure 6.9). A 16-bit wide SRAM is used to simultaneously provide 12 bits of $E_T^{\text{miss}}$ magnitude (before rescaling) and four bits of direction, expanded to six before output by addition of the quadrant information.
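The energy scale compression can be sketched in software as a common right-shift applied to both components before the lookup. In the model below the LUT itself is replaced by a direct calculation, only the magnitude is treated, and the address layout is a guess; none of these details should be read as the actual firmware.

```python
import math

KEEP_BITS = 9   # bits of each component used to form the LUT address


def compress(ex, ey, keep_bits=KEEP_BITS):
    """Drop signs, then shift both components by the same amount so that the
    larger one fits in keep_bits. Returns (cx, cy, shift)."""
    ax, ay = abs(ex), abs(ey)
    shift = max(max(ax, ay).bit_length() - keep_bits, 0)
    return ax >> shift, ay >> shift, shift


def lut_address(cx, cy, keep_bits=KEEP_BITS):
    """ASSUMPTION: address formed by concatenating the two 9-bit values."""
    return (cx << keep_bits) | cy


def etmiss_magnitude(ex, ey):
    """Magnitude via the compressed components, rescaled afterwards."""
    cx, cy, shift = compress(ex, ey)
    magnitude = round(math.hypot(cx, cy))   # stand-in for the LUT contents
    return magnitude << shift


approx = etmiss_magnitude(100000, -30000)
address = lut_address(*compress(100000, -30000)[:2])
```

For this example the compressed result differs from the exact magnitude (about 104403) by roughly 0.2%. An 18-bit address into a 16-bit wide memory corresponds to 4 Mbit, which matches the SRAM size limit quoted in this paragraph.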

The number of bits used for all energy sums is expanded throughout the adder tree, such that no precision is lost until the final stage of calculation. The final total  $E_T$  sum is sent to the GT on a twelve-bit linear scale; the dynamic range / precision trade-off of the energy scale is programmable. The final  $E_T^{\text{miss}}$  scale is programmable in a similar way. The energy sums are sent directly to the GT using two rear data ports. The global energy sum logic is capable of counting the number of local  $E_T$  sums above a set of eight programmable thresholds, in an analogous way to the jet counts. The resulting four-bit counts are sent to the jet count / luminosity TPM on a further rear data port.

The latency of the global energy calculation is expected to be two clock cycles for the multipliers, one cycle apiece for the six stages of addition, and three cycles for the  $E_T^{\text{miss}}$  calculation. At an 80 MHz clock rate, this implies an overall latency of 5.5 bunch-crossings.



**Fig. 6.9:** Degradation of  $E_T^{\text{miss}}$  resolution caused by energy scale compression.

##### Jet Counting / Luminosity Monitor

The jet count and luminosity algorithms share a single TPM, since they are both less demanding in terms of logic resources than the sort and energy sum algorithms, and they make use of similar input data. The jet count / luminosity logic takes as input eight four-bit counts from each of the central, forward and $\tau$-jet TPMs and from the energy sum TPM. The connections from these TPMs to the jet count / luminosity TPM are made at the rear of the processor crate; four of the rear data ports are configured as inputs for this purpose.

The jet count logic simply sums the counts from the central, forward and $\tau$-jet TPMs, and sends up to eight resulting four-bit counts to the GT. The jet count data are transmitted using two rear data ports.

The luminosity logic makes use of the jet counts and energy threshold counts to form a rolling estimate of relative luminosity on a bunch-by-bunch basis. The energy thresholds and jet cuts to be used for this purpose are under study. The rolling totals of events passing the cuts are stored in circular buffers, with the totals for each colliding LHC bunch-pair updated in turn at the appropriate time. The total collision count since reset for each bunch-pair is also stored in a similar way. The circular buffers are implemented as DPMs, so that their contents may be examined from time to time by the TPM CPU without interrupting the rolling update of event totals. The CPU processes the data, and passes the luminosity estimates on to the CMS detector control or DAQ systems over Ethernet.
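The per-bunch bookkeeping can be pictured as two arrays indexed by bunch slot, one for crossings passing the cuts and one for all crossings since reset. The sketch below is an assumption-laden illustration: the class name, the use of a simple pass fraction as the relative estimate, and the single pass/fail flag per crossing are not taken from the design, where the cuts are still under study.

```python
LHC_BUNCH_SLOTS = 3564   # 25 ns bunch slots per LHC orbit


class BunchLuminosityCounters:
    """Toy model of the per-bunch circular buffers (hypothetical)."""

    def __init__(self, n_slots=LHC_BUNCH_SLOTS):
        self.n_slots = n_slots
        self.passed = [0] * n_slots      # crossings passing the cuts
        self.crossings = [0] * n_slots   # all crossings since reset

    def update(self, bx_number, passes_cut):
        """Called once per bunch-crossing; each slot is visited in turn."""
        slot = bx_number % self.n_slots
        self.crossings[slot] += 1
        if passes_cut:
            self.passed[slot] += 1

    def snapshot(self):
        """What the CPU might read out: pass fraction per bunch slot."""
        return [p / c if c else 0.0
                for p, c in zip(self.passed, self.crossings)]


# Small example with a 4-slot "orbit" so the result is easy to follow.
counters = BunchLuminosityCounters(n_slots=4)
for bx in range(12):
    counters.update(bx, passes_cut=(bx % 4 == 1 and bx < 8))
fractions = counters.snapshot()
```

Reading a snapshot does not modify the counters, in the same spirit as the dual-port arrangement that lets the CPU inspect the buffers without interrupting the rolling update.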

The majority of the jet count / luminosity logic is implemented in the Stage B algorithm block. The circular buffers for luminosity totals are implemented using memory resources inside one or more of the Stage A blocks. The latency of the jet count operation is designed to be low enough that the jet counts will be sent to the GT synchronously with the sorted jets from the same bunch-crossing; this requires the jet count information to bypass most of the sort logic on the central, forward and $\tau$-jet TPMs.

##### DAQ Concentrator / TRC Interface

A single TPM in the GCT processor crate is devoted to the collection of DAQ and monitoring data from the other TPMs, and the logical interface to the TRC. The physical TCM S-LINK interface is carried on a dedicated interface board of a different type. The dedicated DAQ outputs from all TPMs are connected to front data ports of the DAQ concentrator. In the baseline design, seven front data ports are used, though the DAQ concentrator is specified to cope with all 18 inputs being used. The output is sent through four of the rear data ports to the S-LINK interface board.

The purpose of the DAQ concentrator logic is to aggregate data from all input channels, to provide sufficient buffering for derandomisation of the data at the maximum specified L1A rate, and to package all data from one bunch-crossing into a single contiguous block in memory. Each block is then formatted before transmission by changing the word length from 21 to 16 bits, and by adding a block header containing the event number and a checksum. The data are then sent via the S-LINK interface board to a TCM card. An ‘idle’ signal is sent to the TCM when no event data are available. The initial derandomisation of the data arriving at each input port is performed using the DAQ buffer DPMs attached to the I/O FPGAs, with the FPGAs acting as memory controllers. The event blocks are assembled in DPM internal to the Stage B algorithm FPGA, which contains the ‘event builder’ logic. The Stage A algorithm blocks are essentially unused.
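The change of word length from 21 to 16 bits amounts to treating the incoming words as a continuous bit stream and cutting it at new boundaries. A minimal sketch follows; the zero-padding of the final word, the header layout (two event-number words followed by the checksum) and the additive checksum are hypothetical choices made only so that the example is complete.

```python
def repack(words, in_bits=21, out_bits=16):
    """Re-slice a stream of in_bits-wide words into out_bits-wide words."""
    acc = 0          # bit accumulator
    nbits = 0        # number of valid bits currently held in acc
    out = []
    for w in words:
        acc = (acc << in_bits) | (w & ((1 << in_bits) - 1))
        nbits += in_bits
        while nbits >= out_bits:
            nbits -= out_bits
            out.append((acc >> nbits) & ((1 << out_bits) - 1))
        acc &= (1 << nbits) - 1          # keep only the unsent remainder
    if nbits:                            # ASSUMPTION: zero-pad the last word
        out.append((acc << (out_bits - nbits)) & ((1 << out_bits) - 1))
    return out


def format_block(event_number, words):
    """Build one event block: header (event number, checksum) plus payload."""
    payload = repack(words)
    checksum = sum(payload) & 0xFFFF     # hypothetical checksum definition
    header = [(event_number >> 16) & 0xFFFF, event_number & 0xFFFF, checksum]
    return header + payload


block = format_block(0x00010002, [0x100001])
```

Sixteen 21-bit words occupy exactly twenty-one 16-bit words, so blocks whose length is a multiple of sixteen input words need no padding in this scheme.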

#### 6.5.5 Details of latency calculation

The major contributions to the GCT system latency are summarised in Table 6.3. The overall latency is estimated to be 15.5 bunch-crossings, including four bunch-crossings of contingency. This latency is referred to the time at which the last piece of data relating to a bunch-crossing is received at the GCT system, since all data for a single bunch-crossing must be synchronised before transmission to the TPMs. The latency total does not include cable propagation delays at either the input or output. The estimates for the algorithm blocks refer to the sort operation, as this has the longest latency.

Most contributions to the latency are fixed by the system design. However, a contingency of two bunch-crossings is allowed for each algorithm block, taking into account the possibility that unforeseen stages of pipelining will be required, or that inter-chip transmission on the TPMs will have to take place at 40 MHz rather than 80 MHz. Both these situations are regarded as unlikely to occur; indeed, improvements in FPGA technology may allow the algorithm logic to run at 120 or 160 MHz clock rate, resulting in a shortening of the overall latency.

The latency for transmission of data to the GMT, referred to the time at which the data arrives at the GCT, is expected to be two bunch-crossings, not including cable delays. The GMT data are not synchronised with the rest of the data for a given bunch-crossing by the GCT; synchronisation between the calorimeter and muon trigger systems is performed by the GT.

**Table 6.3:** GCT system latency contributions

| Element                 | Latency (BC) | Notes                                       |
|-------------------------|--------------|---------------------------------------------|
| IM synchronising buffer | 1            |                                             |
| IM to TPM link          | 1            | Includes 7.5ns cable delay                  |
| TPM input I/O logic     | 1            |                                             |
| Stage A algorithm block | 3.5 (+2)     | For sort algorithm;<br>includes contingency |
| Stage B algorithm block | 2.5 (+2)     |                                             |
| TPM output I/O logic    | 2            |                                             |
| TPM to GT link          | 0.5          | Serialiser chip only                        |
| <b>Total:</b>           | <b>15.5</b>  |                                             |
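
As a cross-check of Table 6.3, the contributions can be summed directly. The sketch below is illustrative only; the 25 ns bunch-crossing period is the nominal LHC value (40 MHz) and is an assumption in the sense that it is not quoted in the table itself.

```python
# (element, base latency, contingency), all in bunch-crossings, from Table 6.3
CONTRIBUTIONS = [
    ("IM synchronising buffer", 1.0, 0.0),
    ("IM to TPM link", 1.0, 0.0),
    ("TPM input I/O logic", 1.0, 0.0),
    ("Stage A algorithm block", 3.5, 2.0),
    ("Stage B algorithm block", 2.5, 2.0),
    ("TPM output I/O logic", 2.0, 0.0),
    ("TPM to GT link", 0.5, 0.0),
]
BC_NS = 25.0  # assumed bunch-crossing period in ns (40 MHz)

base = sum(b for _, b, _ in CONTRIBUTIONS)
contingency = sum(c for _, _, c in CONTRIBUTIONS)
total = base + contingency

print(f"base {base} BC + contingency {contingency} BC = {total} BC "
      f"({total * BC_NS} ns)")
```

The sum reproduces the quoted total of 15.5 bunch-crossings, of which four are contingency.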

## 6.6 Prototypes and Tests

### 6.6.1 FPGA Processing Tests

The GCT design depends upon the ability to carry out a variety of algorithmic functions in fully pipelined logic using FPGAs. In order to demonstrate the feasibility of this approach, a programme of prototyping has been established, which will culminate in the construction of a complete prototype TPM in 2001.

A series of tests were performed up to 1999 on a prototype Sort ASIC which was a key element in a previous system design [6.13]. These tests were successful, showing that the ASIC functioned as designed at full system speed. An ASIC-based approach has therefore been shown to be feasible, albeit non-optimal.

In order to gain experience with the Xilinx Virtex family of FPGAs [6.6], a first FPGA test platform was constructed in 1999 (see Figure 6.10). This was based upon a previous board designed to test the Sort ASIC. An XCV300-5 FPGA was fed with pseudorandom data from a set of hardwired ECL linear-feedback shift registers (LFSRs). Due to the limited logic capacity of the XCV300, the FPGA was initially configured to perform only the GCT second stage sort operation (see Section 6.5.4), identifying the four highest-rank trigger objects from a set of 24. The trigger object output from the FPGA was monitored using a deep memory logic analyser. Since the input data to the sort logic did not consist of presorted object groups, the output objects were in fact not necessarily the four highest rank. However, by comparing the FPGA input and output data against a software simulation, it was possible to verify the correct operation of the algorithm. The FPGA performed correctly up to a clock speed of 85 MHz, compared to the design speed of 80 MHz, demonstrating the feasibility of an FPGA-based sort operation. Compatible FPGA devices with an order of magnitude more logic capacity and several times the intrinsic speed are now available. The test platform continues to be used to test other elements of the GCT algorithms, and to gain further experience with FPGA logic design techniques.



**Fig. 6.10:** FPGA test platform in use.
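
The comparison of FPGA output against a software simulation can be illustrated with a small reference model. This is a sketch, not the simulation used for the tests: the LFSR taps, the seed and the 6-bit rank width are assumptions, and an ideal sort is used rather than a bit-accurate model of the second-stage logic, which expects presorted object groups.

```python
import heapq
from itertools import islice
from typing import Iterator, List, Sequence

N_OBJECTS = 24   # objects entering the second-stage sort (from the text)
N_SELECTED = 4   # objects kept by the sort (from the text)
RANK_BITS = 6    # assumed rank width, for illustration only


def lfsr16(seed: int = 0xACE1) -> Iterator[int]:
    """16-bit Fibonacci LFSR with taps 16, 14, 13, 11 (assumed taps)."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("an all-zero LFSR state never changes")
    while True:
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (bit << 15)
        yield state


def pseudorandom_ranks(n: int = N_OBJECTS, seed: int = 0xACE1) -> List[int]:
    """Take the low RANK_BITS bits of successive LFSR states as object ranks."""
    mask = (1 << RANK_BITS) - 1
    return [value & mask for value in islice(lfsr16(seed), n)]


def reference_sort(ranks: Sequence[int]) -> List[int]:
    """Ideal sort: the N_SELECTED highest ranks, highest first."""
    return heapq.nlargest(N_SELECTED, ranks)


def matches_reference(device_output: Sequence[int], ranks: Sequence[int]) -> bool:
    """Compare captured device output with the software reference."""
    return list(device_output) == reference_sort(ranks)


ranks = pseudorandom_ranks()
best = reference_sort(ranks)
```

In a real test the `device_output` argument would hold the objects captured by the logic analyser for the same input pattern.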

A second FPGA Technology Evaluation Platform (TEP) is currently under construction. This new module is far more advanced in design than the first test platform, incorporating many of the approaches proposed for the final TPMs. A block diagram of the TEP is shown in Figure 6.11. The platform contains two Virtex-E algorithm FPGAs, which may range in size from XCV600E up to XCV2000E. Each FPGA has a large external DPM which may contain memory buffers or an LUT. This arrangement will allow the implementation of any full Stage A or Stage B algorithm block in a single FPGA, and enable the testing of the Stage A and Stage B blocks together. A control FPGA is also provided, performing roughly the same functions as the control FPGA in the TPM design: TTCrx interface, synchronous control, serial control bus, and so on. An interface for attachment of an embedded processor board via the control FPGA is provided. Clock and timing signals are received optically via the TTCrx ASIC, housed on a CERN standard TTCrx carrier board. The track termination on the board may be arranged to allow the use of several different logic standards for communication between the algorithm FPGAs.

High-speed data input and output for the TEP will be provided by a pair of CMC-sized daughter boards. These connect to the algorithm FPGAs via wide data buses, and are designed to implement the same functionality as a single input or output I/O FPGA. The daughter boards contain three Channel Link serialiser or deserialiser devices, DPMs and an XCV300E FPGA.

It is foreseen that the TEP will be used to implement the full range of algorithms required in the GCT, and to demonstrate the feasibility of the proposed extended version of the GCT system. When used with Channel Link I/O daughter boards, the TEP constitutes a complete ‘slice’ of a TPM, containing only one I/O FPGA and one Stage A block. When coupled to a second identical TEP board via serial links, the jet count and DAQ concentrator functionality may be tested. The test programme to be carried out using the TEPs will constitute a thorough ‘proof of principle’ test of the TPM design approach before construction of the first full prototype. The test platform will also enable the process of system software creation to start in earnest. The TEP is expected to be ready for first testing in December 2000.

**Fig. 6.11:** Block diagram of the GCT FPGA technology evaluation platform.



**Fig. 6.12:** Block diagram of the GCT link test board

### 6.6.2 Data Link Tests

The GCT design involves the requirement to transfer large amounts of data over relatively long distances with extremely low error rates. There are two main link technologies foreseen for the GCT system: parallel differential ECL transmission at 80 Mbit/s per pair for the input data from the RCT, and serial LVDS links running at up to 1.68 Gbit/s for data transfer around the system and to the GMT and GT. A series of tests on these link technologies have been carried out, with the aim of proving the feasibility of the transmission schemes, and of measuring the error rates. It is also planned to prototype the optical links to the TRC.

A link test board was built in 1999 to perform a first set of tests on both parallel and serial transmission schemes (see Figures 6.12 and 6.13). The board contains a Xilinx XC4000XLA series FPGA, which acts as data source and sink, and provides control functions. Test data patterns may be played back from a deep FIFO memory attached to the FPGA, or provided by a pseudorandom data source within the FPGA. The board contains a programmable clock generator in order to allow testing at a range of data rates, and a programmable-skew clock distribution tree to allow the effects of clock skew between system components to be investigated.

**Fig. 6.13:** GCT link test board.

For parallel ECL tests, 22-bit data patterns from the FPGA are sent through a set of differential ECL output registers. The signals travel down the cable/connector combination under test, and are received by a set of ECL line receivers and synchronising registers. The input registers are clocked simultaneously, and so low-skew cable and careful manual synchronisation are required in order to achieve the optimum timing setup. The received data patterns are compared to those transmitted, and the number of transmission errors is counted. A variety of cables have been tested in conjunction with the SCSI-2 style connectors proposed for the final GCT system. Good results have been obtained with Universal SCSI cable from Madison Cable, which is a high-quality 28AWG solid-core twisted-pair cable with overall shield. Other less expensive cables were found not to be satisfactory. An eye diagram illustrating the received signal quality using 25 m of the Universal SCSI cable and an 80 Mbit/s data rate per pair is shown in Figure 6.14a; it can be seen that both the edge jitter and received noise are low, resulting in a low bit-error rate (BER). The BER for this setup has been measured to be less than $10^{-12}$ for pseudorandom data.

Serial link tests were carried out in a similar way, using the DS90CR217/218A Channel Link chipset from National Semiconductor. The combination of Alcatel STP600 low-skew category 6 cable and standard category 5 shielded RJ45 connectors was found to perform well at a link distance of 2 m. An eye diagram of the received signal quality for a 2 m link running at 560 MHz (80 MHz parallel data rate) is shown in Figure 6.14b; the BER as a function of link distance under various conditions, and using different cable/connector combinations, is under study, but has been measured to be less than $10^{-14}$ for pseudorandom data using the cable/connector combination described above.

**Fig. 6.14:** (a) Eye diagram for one half of a differential ECL signal received over 25 m of Universal SCSI cable. (b) Eye diagram for one half of a differential LVDS signal received over 2 m of Alcatel STP600 cable.

The link test board uses a very simple input synchronisation method for received parallel data. In order to test the more robust scheme proposed for the final IMs, which can handle arbitrarily-phased data on any input pair, a second iteration of the link test board is being designed. This will be less flexible than the original, having the same functionality as one-quarter of a final IM; that is, it will accept a single parallel input cable, synchronise the data with the board clock using a Virtex-E FPGA, and retransmit it on one or more serial outputs. The board will be used in conjunction with the existing link test system to verify the correct functioning of the synchronisation scheme, and will also be used to test the automatic synchronisation under software control. The second iteration link test board is expected to be available for testing in Spring 2001.

The TEP and I/O daughter boards, together with the second iteration link test board, constitute a complete functional slice through the GCT hardware, using the same approaches and technologies as those proposed for the final system. This system will be used in integration tests with the RCT and GT in 2001.

## 6.7 Status and Schedule

Work with the prototype modules currently under design will continue through to the second half of 2001. Initially the TEP will be used to develop the Virtex-E designs needed to perform all parts of the GCT baseline functionality. These tests will develop into the functional slice tests discussed above, and then to integration tests with other trigger subsystems. Pre-production prototypes of the TPM and IM will be designed and produced in 2001-02. Production of the full GCT will take place in 2003 and the system will be installed and commissioned in the first half of 2004. The following milestones are foreseen for the GCT system:

- May 2001: Demonstrate TPM algorithm functionality using Virtex-E technology.
- May 2001: Finalise requirements specification for GCT.
- November 2001: Integration tests of Vertical Slice demonstrator with RCT and GT prototypes.
- November 2002: Demonstrate test results for pre-production TPM and IM.

- January 2004: Finish production of the final modules. Start of installation and commissioning.

## References

- [6.1] J.J. Brooke *et al.*, "An FPGA-based implementation of the CMS Global Calorimeter Trigger", in Proceedings of the 6th Workshop on Electronics for LHC Experiments, Cracow, CERN-LHCC-2000-041, pp. 363-367, or available at <http://lebwshop.home.cern.ch/lebwshop/LEB00_Book/Trigger/newbold.pdf>; also J.J. Brooke *et al.*, "Trigger processing using reconfigurable logic in the CMS calorimeter trigger", in Proceedings of the 8th Pisa Meeting on Advanced Detectors, to be published in Nucl. Instr. Meth. A.
- [6.2] D.G. Cussans *et al.*, "Online luminosity monitoring using the Global Calorimeter Trigger", CMS Note, in preparation.
- [6.3] G.P. Heath *et al.*, "Implementation of Jet Cluster processing in the Global Calorimeter Trigger", CMS Note, in preparation.
- [6.4] B.G. Taylor, "TTC Distribution for LHC Detectors", IEEE Trans. Nucl. Sci, **45**,3, (1998), pp. 821-828; M. Ashton *et al.*, "1999 Status Report on the RD-12 Project", CERN-LHCC-2000-002, 3 January 2000; and J. Christiansen *et al.*, "TTCrx Reference Manual Version 3.0", October 1999.
- [6.5] RCT-GCT interface specification, in preparation.
- [6.6] For information on the Xilinx Virtex-E series of FPGAs, see <http://www.xilinx.com/products/virtex/ss_vir.htm>
- [6.7] Information on Channel Link transmitter and receiver chips can be found from <http://www.national.com/appinfo/lvds/>
- [6.8] The rack allocation is shown in <http://cmsdoc.cern.ch/~wsmith/USC55_racks.html>
- [6.9] GCT-GT interface specification, in preparation.
- [6.10] <http://hsi.web.cern.ch/HSI/s-link>
- [6.11] <http://www.compulab.co.il/486core.htm>
- [6.12] J.J. Brooke *et al.*, "Sort processing in the Global Calorimeter Trigger", CMS Note, in preparation.
- [6.13] D.S. Bailey *et al.*, "The Global Calorimeter Trigger for CMS", in Proceedings of the 4th Workshop on Electronics for LHC Experiments (LEB 98), Rome, CERN-LHCC-98-036, pp. 327-330.

# 7 Calorimeter Trigger Readout and Control

## 7.1 Requirements

From the point of view of the CMS Data Acquisition system, the Calorimeter Trigger is seen as a sub-detector that produces data (trigger data) and that has to be controlled and monitored. The CMS Calorimeter Trigger Readout and Control is the subsystem that allows a user to control, read out and monitor all the Calorimeter Trigger system equipment, or a part of it. When in standalone mode, the system is also able to test that equipment whenever the user finds it necessary. The Calorimeter Trigger Control Software is based on the CARDS framework (see Section 16.8).

### 7.1.1 Requirements on Trigger Data Readout

The trigger data is composed of the calorimeter trigger primitives, the regional trigger intermediate data (electron and jet candidates and energy sums per trigger crate) and the final global calorimeter trigger data (top four electrons and jets and calorimeter energy sums). These data originate, respectively, in the ECAL and HCAL readout and trigger modules (Chapter 4), the regional trigger cards (Chapter 5) and the global calorimeter trigger modules (Chapter 6).

Trigger data pipelines associated with de-randomizer buffers are available in each of these modules to store the data during the L1 latency and to capture the event data when an L1A is received. In each module, the trigger data fragment is encapsulated in a standard event block and identified by the TTC event and bunch numbers.

In normal data-taking mode, the calorimeter trigger readout system is responsible for sending the regional and global calorimeter trigger data to the DAQ. It is the responsibility of the ECAL and HCAL readout systems to send the trigger primitives to the DAQ. The calorimeter trigger readout system will collect the event fragments, check the event fragment synchronization, encapsulate the fragments in a calorimeter trigger event block and send the event to the DAQ system. Buffer overflow and other error conditions (mis-alignments, data errors, hardware errors) will be checked and feedback will be given to the global Trigger Control. The estimated calorimeter trigger data rate is about 4 Gbit/s for a 100 kHz L1A rate.
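
The collection, synchronization check and encapsulation steps can be sketched as follows. This is a schematic illustration only: the `Fragment` fields follow the text (TTC event and bunch numbers), but the class names, the source labels and the 6-byte block header are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Fragment:
    source: str        # originating module, e.g. an RT crate (hypothetical label)
    event_number: int  # TTC event number
    bunch_number: int  # TTC bunch-crossing number
    payload: bytes


class SyncError(Exception):
    """Raised when fragments of one event carry different TTC identifiers."""


def build_event(fragments: List[Fragment]) -> bytes:
    """Check fragment synchronization and wrap the payloads in one event block."""
    if not fragments:
        raise ValueError("no fragments to build")
    ref = fragments[0]
    for frag in fragments[1:]:
        if (frag.event_number, frag.bunch_number) != (ref.event_number,
                                                      ref.bunch_number):
            raise SyncError(f"{frag.source} is out of step with {ref.source}")
    # Hypothetical header: 4-byte event number followed by 2-byte bunch number.
    header = ref.event_number.to_bytes(4, "big") + ref.bunch_number.to_bytes(2, "big")
    return header + b"".join(frag.payload for frag in fragments)


event = build_event([
    Fragment("RT00", 42, 100, b"\x01\x02"),
    Fragment("GCT", 42, 100, b"\x03"),
])
```

A `SyncError` raised here is the kind of mis-alignment condition that would be reported to the global Trigger Control.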

The on-line monitoring of the calorimeter trigger operation will be done at the subsystem level using spying events. For this purpose, the readout system will be able to collect from the ECAL and HCAL readout subsystems copies of the trigger primitives data. Based on this data, the trigger monitoring CPUs will emulate the trigger algorithms event by event and will compare the results with the data collected from the regional and global trigger modules. Because the trigger system is purely digital, this method provides very effective monitoring of its operation. Summaries of the calorimeter trigger monitoring are provided on request to the global Run Control and Detector Control System.

The spying rate will be of the order of 1 kHz in order to allow the monitoring of error rates larger than 1 % per trigger tower in one hour of running. This requirement implies a transfer rate of the trigger primitives data of the order of 1 Gbit/s.

In test mode, the system will be able to load pre-defined patterns in memories located at different stages of the trigger pipeline that will be pushed through the system at the LHC clock frequency. The system is then able to read calorimeter trigger data captured in the readout pipelines for diagnostic purposes.

During calorimeter trigger test and setting up, some local storage capability will be required for off-line analysis of the calorimeter trigger data. We estimate the need for about 100 Gbytes of disk storage capacity.

### 7.1.2 Requirements on Calorimeter Trigger Control

The Calorimeter Trigger Control should comply with the general requirements of the Trigger Control Software defined in Chapter 16. In this section we develop these requirements in view of the specific needs of the calorimeter trigger.

The Calorimeter Trigger Control will allow for a simple integration with the global CMS Run Control, providing a number of services through well defined software interfaces which can be invoked by the CMS Run Control. These services include trigger configuration and initialization, run control commands, status reports, monitoring reports and error reports. The same services will be available through the graphical user interface when the trigger is operated in standalone mode.

The Calorimeter Trigger, the ECAL and the HCAL readout systems need to exchange information for test purposes. Therefore, the software should support the exchange of information with other sub-detectors.

Although the Calorimeter Trigger will be operated globally during data taking, the Calorimeter Trigger Control allows the partial operation of the trigger system in case some of the components are available, while others are not. Different experts should be able to work independently with different parts of the trigger system without disturbing each other. Therefore, the software should be based on a modular distributed control system.

Moreover, during a normal data taking period, faulty hardware elements have to be isolated from the trigger system. In these circumstances, an expert can repair the faulty module without disrupting the functionality of the Calorimeter Trigger system as a whole. If necessary, the expert will be able to test, in standalone mode, those parts of the equipment autonomously from the rest of the system.

During data taking, the control software has to monitor the calorimeter trigger operation continuously. Each crate controller will use spying data to perform limited monitoring tasks (e.g. channel occupancy distributions). In parallel the crate controller monitors hardware counters and status registers and responds to hardware errors. Complete calorimeter trigger events will be built in a central readout crate before being sent to the DAQ. Samples of these events will be used by the crate processor or will be transferred to a monitoring station to perform more complete monitoring tasks. Summaries of the monitoring activities are reported centrally.

Trigger hardware testing is considered to be an important component of the software activity. The test software will be integrated in a coherent way in the whole software package, providing a hierarchy of trigger hardware tests through appropriate graphical user interfaces.

A database management system, accessible to all processors in the system, will store all the relevant parameters for the calorimeter trigger operation.

## 7.2 System Overview

### 7.2.1 Operational Environment

The user will be able to pilot the entire Calorimeter Trigger either from the global Run Control, or in standalone mode, isolated from the other data acquisition subsystems.

After logging into the system, the user receives a snapshot of the current system status. The user will be able to select the level at which to operate: Trigger Primitives Generator, Regional Trigger, Global Calorimeter Trigger or Trigger Readout Crate. At each of these levels the user can choose a specific configuration, in other words a partition, to test, control, read out and, if required, monitor. The user will be able to describe these partitions in terms of elementary partitions (crates), relationships between these elementary partitions, and between these and particular equipment modules. The user should then define the run conditions (run mode, trigger, etc.) under which to operate, in case the default settings are not applicable. Once the equipment to be operated and the run conditions have been specified, the user will be able to initialize the equipment and start its operation.

During the run, the system will be monitoring its own activity, displaying information characterizing its global behavior. After a Stop, the user can look at a statistical summary of the run activity. At the same time, other users can log into the system to monitor the activity of already assigned equipment, or they can operate that equipment which has not yet been allocated.

During the setting-up periods, a user will need to cross check the behavior of the Calorimeter Trigger with the behavior of the Calorimeters. Likewise, the ECAL and HCAL teams may need the Calorimeter Trigger to generate a specific trigger signal. Since these activities can coincide in time, a given user will be able, for example, to test or operate a part of the Calorimeter Trigger, while other users are operating other parts of the trigger in a different detector region.

From the physicist's point of view, the debugging of the Calorimeter Trigger will be more efficient if done by rapidity regions, or physical space sectors. In this scenario one can expect up to six different users, at the same time, operating an equal number of partitions (two barrel, two endcap and two forward partitions).
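
The partition model described in this section (partitions built from elementary partitions, i.e. crates, with several users working concurrently on equipment that is not already allocated) can be sketched as a simple data structure. The class names, method names and crate labels below are hypothetical and serve only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Partition:
    name: str
    crates: FrozenSet[str]  # elementary partitions making up this partition


class PartitionRegistry:
    """Tracks which user currently holds each crate (illustrative only)."""

    def __init__(self) -> None:
        self._owner_of_crate: Dict[str, str] = {}

    def allocate(self, user: str, partition: Partition) -> bool:
        """Grant the partition only if none of its crates is already in use."""
        if any(crate in self._owner_of_crate for crate in partition.crates):
            return False
        for crate in partition.crates:
            self._owner_of_crate[crate] = user
        return True

    def release(self, user: str) -> None:
        """Free every crate held by the given user."""
        self._owner_of_crate = {
            crate: owner
            for crate, owner in self._owner_of_crate.items()
            if owner != user
        }


registry = PartitionRegistry()
barrel_plus = Partition("barrel+", frozenset({"RT00", "RT01"}))
endcap_plus = Partition("endcap+", frozenset({"RT02"}))
granted = [registry.allocate("user_a", barrel_plus),
           registry.allocate("user_b", endcap_plus)]
```

Two users holding disjoint partitions, as in the six-partition scenario above, are both granted access; a request that overlaps an allocated crate is refused until that crate is released.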

When testing, the user will be provided with a suite of test functions, appropriate to the level which he has chosen to operate. At the end of each test the user can then look at a summary of the test results.

### 7.2.2 Readout Hardware

The architecture of the calorimeter trigger readout system is shown in Figure 7.1. The calorimeter trigger data is concentrated in a Readout Crate, located in the underground Counting Room, before being transferred to the central DAQ at the surface. The calorimeter trigger event size (fixed) is about 8 kbytes (regional and global data). The event is organized in four fragments of about 2 kbytes, each one transmitted by a dedicated DAQ link, as required by the CMS Data Acquisition specifications.

The Readout Crate receives one serial link per regional trigger crate and one serial link from the global calorimeter trigger. Data from these links is concentrated in three Trigger Concentrator Modules (TCM), each one equipped with a DAQ output link.

A copy of the trigger primitives data is received by the Readout Crate for test and monitoring purposes. The volume of this data is estimated at 83 kbytes (5 bunch crossings per event). These data will be transmitted either through dedicated serial links (one per calorimeter readout crate) or through the local area network. In order to sustain a 1 kHz spying rate, the required bandwidth is 670 Mbit/s (see Section 7.3.1).

The TCM modules in the Readout Crate are equipped with spying buffers allowing the crate CPU to access complete calorimeter trigger events for autonomous test and monitoring. Each TCM is equipped with a TTCrx chip for clock and L1A reception.



**Fig. 7.1:** Architecture of the calorimeter trigger readout system.

### 7.2.3 Control Software

The Calorimeter Trigger Control subsystem context is summarized in Figure 7.2.



**Fig. 7.2:** The Calorimeter Trigger Control subsystem context diagram.

The system interacts through well defined software interfaces with a number of external systems. The ECAL and HCAL Control Systems provide calorimeter data on request for test and monitoring purposes. On the other hand, the ECAL and HCAL Control systems are able to request from the calorimeter trigger controller a specific trigger setting needed for test purposes. The interaction with the Muon Trigger Controller allows the exchange of event data for monitoring or test purposes, as well as the handling of requests from the muon system to change the calorimeter trigger settings relative to quiet and MIP bits (see Chapter 3). The connection with the Global Trigger systems is intended to pass requests for specific global trigger configurations that may be needed to test the Calorimeter Trigger partition. Like any other detector system, the Calorimeter Trigger should have the ability to request a partition of the Event Builder and Filter Farm when working in partition mode.

A number of different applications will compose the control system:

- an *administration application*, for user administration;
- an *inventory application*, allowing users to manipulate and maintain equipment, elementary partitions and partitions, and providing a suitable interface with the repository database;
- a *configuration tool*, allowing the user to select the part of the Calorimeter Trigger to be operated;
- a *control framework*, supporting concurrent user access and capable of working interactively or in server mode on the selected configuration;
- an interactive *test and simulation framework*, to verify the functionality of a given piece of hardware equipment or software component;
- a *monitoring framework*, supporting multi-user access to supervise the Calorimeter Trigger;
- an *accounting application*, to monitor the Calorimeter Trigger System occupation.

The Calorimeter Trigger Control software architecture is shown in Figure 7.3. All the applications have a client process, one or more server processes and a database server process. Each client process is driven by a graphical user interface. It should be noted that the *control framework* can be driven by another (external) detector control system, like the *CMS Run Control*, whereas the *test and simulation framework* can only be driven by an interactive client. Details on the system architecture will be given in Section 7.5.



**Fig. 7.3:** The Calorimeter Trigger Control software architecture.

## 7.3 Calorimeter Trigger Data

### 7.3.1 Data Description and Rates

The calorimeter trigger event data is composed of:

- a) ECAL and HCAL trigger primitives (per trigger tower);
- b) RT electron and jet candidates (per RT crate);
- c) RT energy sums (per RT crate);
- d) GCT top four electron and jet candidates;
- e) GCT energy sums;
- f) Quiet and MIP bits.

The details of the calorimeter trigger data are given in Table 7.1. For each event, a programmable number of samples (of the order of five) around the trigger beam crossing has to be collected for synchronization monitoring purposes. The total calorimeter trigger event size is 91 kbytes. Excluding the trigger primitives, the event size is 8.5 kbytes.

The ECAL or HCAL trigger primitives data rate (per group of 68 towers, equivalent to one ECAL crate) is of the order of 0.6 Gbit/s for a 100 kHz L1A rate. The total trigger primitives rate (ECAL and HCAL) for spying events (1 kHz) is of the order of 670 Mbit/s. This rate is probably at the limit of what can be handled by the fast LANs expected in a few years. For this reason we keep open the option of having dedicated point-to-point links between the calorimeter readout crates and the trigger readout crate.
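
The arithmetic behind these figures can be reproduced with a few lines. The sketch below uses the tower counts and bytes per item listed in Table 7.1 and assumes five samples per event and 1 kbyte = 1000 bytes; the function names are illustrative.

```python
def event_size_bytes(bytes_per_item: int, n_items: int, samples: int = 1) -> int:
    """Size of one data type in an event, for a given number of samples."""
    return bytes_per_item * n_items * samples


def rate_mbit_s(size_bytes: int, trigger_rate_hz: float) -> float:
    """Data rate in Mbit/s for events of size_bytes at trigger_rate_hz."""
    return size_bytes * 8 * trigger_rate_hz / 1e6


ECAL_TOWERS = 4032     # from Table 7.1
HCAL_TOWERS = 4248     # from Table 7.1
BYTES_PER_TOWER = 2    # from Table 7.1
SAMPLES = 5            # "of the order of five" samples per event
L1A_RATE_HZ = 100e3
SPY_RATE_HZ = 1e3

primitives_bytes = event_size_bytes(BYTES_PER_TOWER,
                                    ECAL_TOWERS + HCAL_TOWERS, SAMPLES)
spy_rate = rate_mbit_s(primitives_bytes, SPY_RATE_HZ)
group_rate = rate_mbit_s(event_size_bytes(BYTES_PER_TOWER, 68, SAMPLES),
                         L1A_RATE_HZ)

print(f"trigger primitives per event: {primitives_bytes / 1000:.1f} kbytes")
print(f"spy rate at 1 kHz:            {spy_rate:.1f} Mbit/s")
print(f"68-tower group at 100 kHz:    {group_rate:.0f} Mbit/s")
```

With these inputs the trigger primitives occupy about 83 kbytes per event and the spying rate comes out close to the 670 Mbit/s quoted above, while a 68-tower group at the full L1A rate gives roughly 0.5 to 0.6 Gbit/s.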

**Table 7.1:** Calorimeter trigger event data. The Total Data Rate is computed assuming 100 kHz L1 trigger rate. The Total Spy Rate assumes a rate of spying events of 1 kHz.

| Data Type                  | Data Item      | Bytes/Item | # Items | Event Size<br>(1 sample)<br>(kbyte) | Event Size<br>(5 samples)<br>(kbyte) | Total Data Rate<br>(1 sample)<br>(Mbit/s) | Total Data Rate<br>(5 samples)<br>(Mbit/s) | Total Spy Rate<br>(1 sample)<br>(Mbit/s) | Total Spy Rate<br>(5 samples)<br>(Mbit/s) |
|----------------------------|----------------|------------|---------|-------------------------------------|--------------------------------------|-------------------------------------------|--------------------------------------------|------------------------------------------|-------------------------------------------|
| ECAL Trigger Primitives    | Tower Energy   | 2          | 4032    | 8.064                               | 40.32                                | (68 towers)<br>110                        | (68 towers)<br>548                         | 65                                       | 323                                       |
| HCAL Trigger Primitives    | Tower Energy   | 2          | 4248    | 8.496                               | 42.48                                | (68 towers)<br>115                        | (68 towers)<br>576                         | 68                                       | 340                                       |
| Regional Trigger Crate     | 4x4 Region Et  | 2          | 16      | 0.032                               | 0.16                                 | 25.6                                      | 128                                        | 0.256                                    | 1.28                                      |
|                            | Elect Et       | 2          | 8       | 0.016                               | 0.08                                 | 12.8                                      | 64                                         | 0.128                                    | 0.64                                      |
|                            | Et sums        | 2          | 2       | 0.004                               | 0.02                                 | 3.2                                       | 16                                         | 0.032                                    | 0.16                                      |
|                            | Quiet/MIP bits | 2          | 2       | 0.004                               | 0.02                                 | 3.2                                       | 16                                         | 0.032                                    | 0.16                                      |
|                            | Total          |            |         | 0.056                               | 0.28                                 | 44.8                                      | 224                                        | 0.448                                    | 2.24                                      |
| Regional Trigger Total     | Crate Event    | 60         | 18      | 1.08                                | 5.4                                  | 864                                       | 4320                                       | 8.64                                     | 43.2                                      |
| Global Calorimeter Trigger |                |            |         |                                     |                                      |                                           |                                            |                                          |                                           |
| Input data                 | Electrons      | 2          | 144     | 0.288                               | 1.44                                 | 230.4                                     | 1152                                       |                                          |                                           |
|                            | Jets           | 2          | 72      | 0.144                               | 0.72                                 | 115.2                                     | 576                                        |                                          |                                           |
|                            | Counters       | 1          | 4       | 0.004                               | 0.02                                 | 3.2                                       | 16                                         |                                          |                                           |
|                            | Energy         | 2          | 54      | 0.108                               | 0.54                                 | 86.4                                      | 432                                        |                                          |                                           |
|                            | Total          |            |         | 0.544                               | 2.72                                 | 435.2                                     | 2176                                       | 0.4352                                   | 2.176                                     |
| Output Data                | Elect & jets   | 2          | 16      | 0.032                               | 0.16                                 | 25.6                                      | 128                                        |                                          |                                           |
|                            | Counters       | 1          | 8       | 0.008                               | 0.04                                 | 6.4                                       | 32                                         |                                          |                                           |
|                            | Energy         | 2          | 2       | 0.004                               | 0.02                                 | 3.2                                       | 16                                         |                                          |                                           |
|                            | Total          |            |         | 0.044                               | 0.22                                 | 35.2                                      | 176                                        | 0.0352                                   | 0.176                                     |
| Total                      |                |            |         | 0.588                               | 2.94                                 | 470.4                                     | 2352                                       | 0.4704                                   | 2.352                                     |

## 7.4 Calorimeter Trigger Readout

### 7.4.1 Readout of Trigger Primitives

The front-end readout of the trigger primitives follows the same logical model as the readout of detector data. The calorimeter trigger primitives are stored, during the L1 trigger latency, in pipeline memories located in the calorimeter readout and trigger boards, and are moved into de-randomizer buffers at each L1 Accept (see Chapter 4). The trigger primitives are then assembled by the board readout controller, merged with the detector data and transferred to the crate data concentrator. As an example, Figure 7.4 shows the data structure of one event collected in the ECAL readout/trigger board (ROSE100 board).



**Fig. 7.4:** Event block in the ECAL readout/trigger board.

The calorimeter Data Concentrator Cards (DCC) are a common collection point for data from a set of readout cards in one crate. Data is transferred upon receipt of a CMS-wide trigger signal. Local event building and monitoring are performed. Specific to ECAL, the data volume is reduced by Selective Readout of regions flagged by energy deposition. A functional block diagram of the DCC is shown in Figure 7.5.

The data path from the readout boards to the DCC uses the Channel Link component from National Semiconductor. Channel Link implements point-to-point LVDS four-pair connections with an effective bandwidth of 1.12 Gbit/s. The DCC receives event fragments from the readout modules, assembles the fragments in DCC-events (crate-events), checks the data integrity and transfers the DCC-events to the ECAL-DAQ and to the Trigger Readout Crate. A fraction of the events is also made available to the crate processor on a sampling basis (spying).

**Fig. 7.5:** Functional block diagram of the DCC

A block diagram of the ECAL DCC readout architecture is shown in Figure 7.6. Data transferred from the readout modules is stored in the input FIFOs (iFIFO). Data transfer is controlled by the Input Handlers, one for each iFIFO. Event fragments are then checked, assembled and stored in the output FIFOs by the Event Builder. Different output FIFOs store ECAL data and trigger primitive data.



**Fig. 7.6:** Block diagram of the DCC readout architecture.

The Event Builder is permanently waiting for the L1A signal and L1A event identification, and manages its own List of L1A Events Waiting. The Event Builder activity continues as long as there are events in this list. The TPG data is broadcast over a medium speed link to the Trigger Readout Crate. The ECAL data is transferred through a standard DAQ interface to the DAQ link.

The strategy used for buffering is shown schematically in Figure 7.7. This scheme is predicated on the complete push architecture used from the front-end up to the DCC output. At each stage, data is pushed downstream as fast as possible. To smooth out fluctuations, buffering is done at each layer of assembly. Because of statistical fluctuations in trigger rate and event size, it is always possible to overflow a buffer, whatever its depth, as long as the buffer output is bandwidth limited. Rather than implementing a complex “backpressure” scheme, a simpler method is used: entire events may be thrown away if necessary, provided that this happens very infrequently and that the information that this section of data was dropped for that specific event is not lost.

As seen in Figure 7.7, the simple logic for this technique is driven by monitoring the almost-full flag of a standard commercial FIFO. A FIFO twice as deep as necessary for nominal operation (including reasonable contingency) is chosen. If the almost-full flag comes on, because of an exceptional data fluctuation or a hardware error downstream, action is taken on an L1A basis. The flag is polled only at the beginning of an L1A event. If it is on, the data is not written. Instead, an error word indicating that the data for L1 Accept number n was dropped is written to the FIFO. This process is repeated at the beginning of the receipt of data from each triggered event. Recovery requires no specific action.
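The drop-on-almost-full logic can be illustrated with a small software model. This is only a sketch: the class name, the threshold value and the tuple format of the stored words are hypothetical, and a real implementation lives in hardware around a commercial FIFO.

```python
from collections import deque


class DropOnAlmostFullFifo:
    """Toy model of a FIFO whose almost-full flag is polled once per L1A event."""

    def __init__(self, almost_full_at):
        # Hypothetical threshold (in words) at which the almost-full flag turns on.
        self.almost_full_at = almost_full_at
        self.words = deque()

    def almost_full(self):
        return len(self.words) >= self.almost_full_at

    def write_event(self, l1a_number, payload):
        """Write one event fragment, or a single error word if the flag is on.

        The flag is checked only here, at the beginning of the event.
        Returns True if the data was written, False if it was dropped.
        """
        if self.almost_full():
            # Keep the information that data for this L1A was dropped.
            self.words.append(("ERROR_DROPPED", l1a_number))
            return False
        for word in payload:
            self.words.append(("DATA", l1a_number, word))
        return True

    def read_word(self):
        """Downstream side: remove and return the oldest word, or None if empty."""
        return self.words.popleft() if self.words else None
```

Once the downstream side has drained enough words, the next event is written normally, which mirrors the statement that recovery requires no specific action.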

Reliable operation of the readout requires continual data quality monitoring. Mechanisms for this may be implemented at a number of levels. The primary elements considered currently are the monitoring of word-counts and check-sums on the data streams as they are processed, and the use of spying data, comparing small sub-sets of the data as seen on the DCC with data spied upon at the source.
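As an illustration of word-count and check-sum monitoring, the sketch below attaches a trailer to a fragment at the source and recomputes it downstream. The trailer layout and the choice of a 16-bit XOR check-sum are assumptions made for the example; the text does not specify the algorithm.

```python
from functools import reduce


def make_trailer(words):
    """Return (word_count, checksum) for a list of 16-bit words.

    The XOR check-sum is an assumed, illustrative choice.
    """
    checksum = reduce(lambda acc, w: acc ^ (w & 0xFFFF), words, 0)
    return len(words), checksum


def check_fragment(words, trailer):
    """Recompute the trailer downstream and list the checks that fail."""
    count, checksum = make_trailer(words)
    errors = []
    if count != trailer[0]:
        errors.append("word-count")
    if checksum != trailer[1]:
        errors.append("checksum")
    return errors
```

An empty list means that the fragment passed both checks. A flipped bit shows up as a check-sum error, and a truncated fragment typically fails the word-count check as well.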

The hardware status of the readout modules is reported to the Trigger Control System (see Chapter 16) through the Fast Monitoring Network.



**Fig. 7.7:** Schematic view of the data buffering strategy.

### 7.4.2 Readout of Regional Trigger Data

The trigger data computed in the Regional Calorimeter Trigger and transferred to the Global Calorimeter Trigger is moved into data acquisition buffers located in the GCT. However, for test and monitoring purposes it is highly desirable to have the possibility of storing the same data in the Regional Trigger itself. We will adopt this approach provided enough space is available in the regional trigger cards (see Chapter 5) for the data acquisition buffers.

The storage of the regional trigger data, defined in Section 7.3, follows the same logical model as the readout of detector data. The regional trigger data are stored in pipeline memories during the L1 trigger latency, and are moved into de-randomizer buffers at each L1 Accept. The data in the de-randomizer buffer is pushed through a Channel Link connection into the Trigger Readout Crate (Figure 7.8). The event size is about 0.25 kbyte/crate and the data rate per link is of the order of 200 Mbit/s.
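The quoted event size and link rate can be checked against the Regional Trigger Crate rows of the data-volume table. The sketch below takes the table entries as a word size in bytes and a word count, and assumes an L1 Accept rate of 100 kHz and 5 data samples per event; the function names are illustrative.

```python
def event_size_bytes(n_words, bytes_per_word, n_samples=1):
    """Size of one event fragment in bytes."""
    return n_words * bytes_per_word * n_samples


def data_rate_mbit_s(size_bytes, l1_rate_hz=100_000):
    """Sustained data rate in Mbit/s for a given fragment size (L1 rate assumed)."""
    return size_bytes * 8 * l1_rate_hz / 1e6


# Regional Trigger Crate rows: name -> (number of words, bytes per word)
crate_rows = {
    "4x4 Region Et": (16, 2),
    "Elect Et": (8, 2),
    "Et sums": (2, 2),
    "Quiet/MIP bits": (2, 2),
}

size_1 = sum(event_size_bytes(n, b) for n, b in crate_rows.values())  # one sample
size_5 = 5 * size_1                                                  # five samples
rate_5 = data_rate_mbit_s(size_5)

print(f"{size_1} bytes (1 sample), {size_5} bytes (5 samples), {rate_5:.0f} Mbit/s")
```

Under these assumptions a crate produces 280 bytes per event, or 224 Mbit/s, which matches the table totals and the rounded figures quoted in the text.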

In order to give data access to the regional trigger crate CPU, the data flowing from the Jet/Summary Card can be sampled in a mezzanine PMC on the crate processor (Figure 7.8).



**Fig. 7.8:** Regional trigger readout architecture

### 7.4.3 Readout of Global Calorimeter Trigger Data

As pointed out before, the Global Calorimeter Trigger system stores both its input and output data. GCT events of about 3 kbytes (5 data samples per event) are then transferred to the Trigger Readout Crate through 8 optical links, with an integrated bandwidth of 4.8 Gbit/s. The details of the data readout in the Global Calorimeter Trigger are given in Chapter 6.
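As a rough consistency check, and assuming an L1 Accept rate of 100 kHz, the GCT event size of 2.94 kbyte (5 samples) listed in the data-volume table corresponds to

$$
R_{\text{GCT}} = 2.94\ \text{kbyte} \times 8\ \text{bit/byte} \times 100\ \text{kHz} = 2.352\ \text{Gbit/s},
\qquad
\frac{R_{\text{GCT}}}{4.8\ \text{Gbit/s}} \approx 0.49 .
$$

Under this assumption the 8 optical links (0.6 Gbit/s each if the bandwidth is shared equally) would run at about half of their integrated capacity.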

### 7.4.4 Calorimeter Trigger Readout Crate

Taking a systemic view of the CMS DAQ and Trigger as seen in Figure 7.1, it is very beneficial for monitoring purposes to have all Trigger related information localized in one section of the DAQ system. The penalty otherwise is a huge overhead of horizontal traffic on the LAN backbone, whose bandwidth could be much better used. This architecture is completely in line with the philosophy that the Trigger should be read out exactly as another sub-detector.

The calorimeter Trigger Readout Crate (TRC) centralizes the data collected from the Trigger Primitives Generators, the Regional Calorimeter Trigger and the Global Calorimeter Trigger. The crate is equipped with a controller CPU and 15 Trigger Concentrator Modules (TCM). The TCMs all have the same architecture but are configurable according to the needs of the different data channels. The trigger data concentrator design follows the same principles used in the calorimeter DCC. In fact, the TCM will be a simplified version of the calorimeter DCC.

The basic architecture of the TCM is represented in Figure 7.9. It has 10 input channels and two output channels. The input channels will have different rates according to the data sources, which can be located at different distances from the trigger readout crate. It is therefore desirable to be able to choose the physical link as a function of the data source type. However, in order to keep uniformity we plan to use the same link interface for all inputs. This functionality is provided today by the S-link, for example.

The output channels will have a standard DAQ interface to the DAQ link. In order to provide flexibility, the mapping of input to output channels can be configured. On the other hand, the TCMs receiving the trigger primitives will not be equipped with the output link, since these data are transported to the central DAQ by the calorimeter readout systems. The trigger primitives are available to the crate CPU via the VME backplane.



**Fig. 7.9:** Trigger Concentrator Module architecture

## 7.5 Calorimeter Trigger Control

As we mentioned in Section 7.2.3, each Calorimeter Trigger Control application has a client process, one or more server processes and a database server process. Each client process is driven by a graphical user interface.

The *administration* and the *inventory applications* each have such a client, which communicates with a specific server, located on another machine, using a lightweight standards-based protocol. The data from these client processes, after being processed by the corresponding server process, is transmitted to the database server process, possibly located on a third machine.

The *accounting application* server runs permanently on a dedicated machine, monitoring the CMS Calorimeter Trigger occupation and status. Depending on the peak message volume and rate, this process can also function as an alarm logger.

The *control, test and simulation, and monitoring frameworks* interact with elements of different subsystems. In the calorimeter trigger control software we consider the following subsystems: a Back-end subsystem, a Front-end subsystem, a Regional Trigger subsystem, a Global Trigger subsystem, a Trigger Readout subsystem and a Calorimeter Trigger TTC subsystem. The software subsystems reflect the organization of the readout hardware.

### 7.5.1 Calorimeter Trigger Back-end Subsystem

The Back-end subsystem (Figure 7.10) is where all the control logic is located. It concentrates the *administration* and the *inventory servers*, as well as the *control*, the *test* and *simulation* and the *monitoring service providers*. It can be seen as the system window to the rest of the world. Ideally there should be several machines allocated to the Back-end subsystem to guarantee a proper load balancing and a suitable run-time response to the client applications.

The *administration server* receives user registration service requests and after processing retrieves/sends the required information from/to the database server process, located on a dedicated machine. In the same way, the *inventory server* provides the user interface with the repository database, handling all the requests concerning registration, modification or deletion of:

- equipment (*devices*) information; or
- components (e.g. *elementary partitions* and/or *partitions*) information.



**Fig. 7.10:** Back-end control software architecture

Whenever a *control client* (or a *test* and *simulation client* or a *monitoring client*) process is launched, the Back-end subsystem hands over a *control service provider* (respectively, a *test* and *simulation service provider* or a *monitoring service provider*).

The *control (test) service provider* has a configuration and setup server, a configuration and setup database read/write client, a control (test) server and as many crate control (test) clients as the elementary partitions included in the selected configuration.

The *control (test) client* submits configuration requests (or partition booking requests), by means of a partition's name, for example. It can also submit setup (parameter change) requests. These requests are handled by the configuration and setup server, which, after some processing, delivers them to the configuration and setup database client. The latter stores these changes in the repository database through the database server process. From this moment on, the selected configuration is locked (booked) by its user. Once a configuration has been agreed upon, a control (test) server is created, together with as many crate control (test) clients as there are elementary partitions (crates) in the selected configuration. Additionally, a Calorimeter Trigger TTC subsystem client is created.

The *control (test) client* provides control commands (test directives) to the control (test) server, which are propagated to its subordinate crate control (test) clients. At the end of each action the control (test) server returns a completion status to the *control (test) client*.
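The command propagation described above can be sketched as follows. The class names, the crate names and the `"OK"`/`"FAILED"` status values are hypothetical; the sketch only shows a server fanning a command out to one crate control client per elementary partition and returning a single completion status.

```python
class CrateControlClient:
    """Stand-in for one crate control client (one per elementary partition)."""

    def __init__(self, crate_name):
        self.crate_name = crate_name
        self.fail_on = set()   # commands this toy crate rejects (for illustration)
        self.history = []

    def execute(self, command):
        """Forward a command to the crate; return True on success."""
        self.history.append(command)
        return command not in self.fail_on


class ControlServer:
    """Propagates each command to all subordinate crate control clients."""

    def __init__(self, crate_names):
        self.clients = [CrateControlClient(name) for name in crate_names]

    def handle(self, command):
        """Run the command on every crate and return (status, per-crate results)."""
        results = {c.crate_name: c.execute(command) for c in self.clients}
        status = "OK" if all(results.values()) else "FAILED"
        return status, results
```

A control client would call `handle("Start")` and receive the completion status once every crate has reported back.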

The *monitoring service provider* behaves in a similar way. The difference is in the configuration and setup database client. The user still selects a configuration to monitor. Nonetheless, this configuration is not locked (booked) to a particular user.

### 7.5.2 Front-end Calorimeters Subsystem

The Front-end Calorimeters subsystem is a grouping of Front-end Calorimeter Elementary Partitions. Its external behavior is defined by the external behavior of each of its elements.

By definition, a Front-end Calorimeter Elementary Partition is a set composed of a Crate Controller and a list of one or more Front-end equipment modules (or *devices*)<sup>1</sup>. As an example, an ECAL crate can be seen as an Elementary Partition. Nevertheless, the list of Front-end equipment modules can vary. It is determined by the selected user's configuration. A valid list of Front-end equipment modules would be a Data Concentrator and/or one or more ECAL (HCAL) devices (e.g. for the ECAL these are the *ROSE100* boards).

All the Elementary Partitions have the same interfaces, in principle. The interfaces are grouped in functional categories: configuration and setup, control, monitoring and test of an Elementary Partition. The configuration and setup interface is realized by a (configuration and setup) database client, which communicates with the database server system. The other interfaces are realized by a control server, a monitoring server and a test server, respectively. These servers communicate with the corresponding Back-end subsystem clients. Additionally, the ECAL (HCAL) Front-end Calorimeter crate control and monitoring servers can be driven by the ECAL (HCAL) Detector Control System.

---

<sup>1</sup>. The smallest Front-End subsystem elementary partition is defined by a region of 4 Trigger Towers which corresponds to 100 crystal cells.

On creation, the crate control server creates a configuration and setup client and a monitoring server. Once the Back-end crate control client issues a *Configure* (or *Load*) request to this Elementary Partition control server, the configuration and setup client is contacted. It downloads this elementary partition's configuration (device list) and setup parameters (device setup parameters) from the database server system, signalling the server when done. The crate control server is then ready to take further control commands (test directives) from the Back-end crate control client. Each server is responsible for delegating the actions requested by its clients to the interfaces of the devices in the list mentioned earlier.

### 7.5.3 Calorimeter Regional Trigger Subsystem

Likewise, the Regional Calorimeter Trigger subsystem is a grouping of Regional Calorimeter Trigger Elementary Partitions (crates). By definition, a Regional Calorimeter Trigger Elementary Partition is a set composed of a Crate Controller and a number of regional trigger modules. Correspondingly, the Regional Calorimeter Trigger interfaces are: configuration and setup, control, monitoring and test of an Elementary Partition.

The configuration and setup interface is realized by a (configuration and setup) database client, which communicates with the database server system. The remaining interfaces are realized by servers, which communicate with the corresponding Back-end subsystem clients. These components behave in the same way as the Front-end Calorimeter Elementary Partition components. As before, each server is responsible for delegating the actions requested by its clients to the interfaces of the trigger modules.

### 7.5.4 Calorimeter Global Trigger Subsystem

The Global Calorimeter Trigger subsystem is a grouping of Global Calorimeter Trigger Elementary Partitions. Each module in the Global Calorimeter Trigger crate is equipped with its own CPU, and can be seen as an elementary partition. A Global Calorimeter Trigger Partition is a set of module Controllers and the associated trigger electronics.

The Global Calorimeter Trigger has the same set of interfaces described before. In the same way, the configuration and setup interface is realized by a database client and the remaining interfaces are realized by servers that communicate with the corresponding Back-end clients. Again, these components behave in the same way as the Front-end Calorimeter Elementary Partition components.

### 7.5.5 Calorimeter Trigger Readout Subsystem

The Calorimeter Trigger Readout subsystem is confined to the Calorimeter Trigger Readout crate. Its external behavior is defined by the external behavior of this elementary partition which is composed of a Crate Controller and a number of Trigger Concentrator Modules.

The Calorimeter Trigger Readout subsystem has the same set of interfaces described before: configuration and setup, control, monitoring and test. The difference here is that the interface implementations concern only the TCMs.

### 7.5.6 Calorimeter Trigger TTC Subsystem

The Calorimeter Trigger TTC subsystem is the entity that bridges the Calorimeter Trigger with the Trigger Control System (see Chapter 16). It can also be seen as a grouping, in this case of *TTC Timing Zones*. Each *Timing Zone* can act much like a *Calorimeter Trigger Partition*. The number of possible Timing Zones will be determined by the needs of the Calorimeter Trigger System and of the ECAL and HCAL detectors. Each *Timing Zone* will ultimately be mapped to a TTCvi (TTC-VMEbus interface module).

The TTC Calorimeter Trigger subsystem interfaces are grouped in only two functional categories: the configuration and setup of a Timing Zone, and the control of a Timing Zone. The configuration and setup is realized by a (configuration and setup) database client, which communicates with the database server system.

The Timing Zone control interface is realized by a TTC control server, which communicates with a Back-end client. When a Back-end control (or test) client is created, a TTC control server is assigned to it and a configuration and setup client is created. Once the Back-end control client issues a *Configure* (or *Load*) request to this TTC control server, the server translates the user-selected configuration from partitions and elementary partitions to the timing zones to which those elements belong. The timing zone names are submitted to the configuration and setup client, which fetches the corresponding information (the TTCvi device setup parameters) from the database server for downloading. The TTC control server is responsible, when applicable, for delegating the actions requested by its clients to the interfaces of the TTCvi devices belonging to the selected timing zones. In addition, after a *Start* command, the server is responsible for requesting the Global Trigger to deliver a trigger type (or trigger mode), according to the user selection, to the chosen timing zone(s).
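The translation from a user-selected configuration to timing zones can be sketched as a simple lookup. The two mapping tables, the partition and crate names, and the function name are hypothetical; in the system described here this information would come from the repository database.

```python
def timing_zones_for(selection, partition_to_elements, element_to_zone):
    """Return the timing zones covering a selection, without duplicates.

    `selection` may name partitions or elementary partitions; a name that is
    not a known partition is treated as an elementary partition.
    """
    zones = []
    for name in selection:
        elements = partition_to_elements.get(name, [name])
        for element in elements:
            zone = element_to_zone[element]
            if zone not in zones:
                zones.append(zone)
    return zones


# Hypothetical configuration data, for illustration only.
partition_to_elements = {"ECAL-barrel-test": ["ecal-crate-0", "ecal-crate-1"]}
element_to_zone = {
    "ecal-crate-0": "TZ-ECAL-A",
    "ecal-crate-1": "TZ-ECAL-A",
    "rct-crate-3": "TZ-RCT",
}
```

For example, `timing_zones_for(["ECAL-barrel-test", "rct-crate-3"], partition_to_elements, element_to_zone)` yields the two zones whose TTCvi setup parameters would then be requested from the database.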

## 7.6 Interfaces

### 7.6.1 Hardware Interfaces

#### Readout Interface to DAQ

The interface to DAQ is located in the TCM modules in the Calorimeter Trigger Readout Crate. The TCMs collecting data from the Regional and Global Calorimeter Trigger will be equipped with a standard DAQ mezzanine board. This board contains the readout engine that reads data from the TCM final buffer (pull model) and the fast optical link to transmit the data to the DAQ Readout Units. Four Readout Units are foreseen for the Calorimeter Trigger data.

#### TTC Interface

The Calorimeter Trigger receives fast control from the central TCS (Chapter 16) through a single TTC partition. The TTC partition is composed of a TTC VME interface (TTCvi module), a TTC transmitter module (TTCEx module) and the passive optical fanout.

#### Fast Monitoring Interface

The Calorimeter Trigger will send to the central TCS the fast monitoring signals Ready, Busy, Error, Warning Overflow and Out of Sync, according to the specifications defined in Chapter 16.

### 7.6.2 Software Interfaces

#### Run Control Interface

The Calorimeter Trigger Control system can operate as a server of the CMS Experiment Run Control (in non-interactive mode). In these circumstances, it will be able to receive configuration directives (or partition booking requests) from such a client, by means of a partition's name. It will also be able to receive setup parameters for the chosen configuration (e.g. device setup parameters). All of this is done through the Back-end subsystem's configuration and setup server.

The Calorimeter Trigger Control system will comply with the CMS run control protocol. It will be able to receive control commands (*Initiate, Start, Pause, Resume, Stop, Reset*), to which it will provide the corresponding feedback (completion status).
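The command and completion-status exchange can be pictured as a small state machine. Only the six command names come from the text; the state names, the allowed transitions and the `"OK"`/`"REJECTED"` status values are assumptions made for illustration and are not taken from the CMS run control protocol.

```python
# Assumed transition table: (current state, command) -> next state.
TRANSITIONS = {
    ("Idle", "Initiate"): "Ready",
    ("Ready", "Start"): "Running",
    ("Running", "Pause"): "Paused",
    ("Paused", "Resume"): "Running",
    ("Running", "Stop"): "Ready",
    ("Paused", "Stop"): "Ready",
}


class RunControlServer:
    """Toy server that accepts run control commands and reports a status."""

    def __init__(self):
        self.state = "Idle"

    def handle(self, command):
        """Apply a command and return a completion status string."""
        if command == "Reset":
            # Assumed behaviour: Reset is accepted from any state.
            self.state = "Idle"
            return "OK"
        next_state = TRANSITIONS.get((self.state, command))
        if next_state is None:
            return "REJECTED"
        self.state = next_state
        return "OK"
```

A command that is not valid in the current state leaves the state unchanged and is reported back to the client as rejected.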

#### ECAL and HCAL Control Systems Interface

When operating in interactive standalone mode during the test phases of the calorimeter trigger, the Calorimeter Trigger Control can control the calorimeter front-ends for configuration purposes. In any case (regardless of the operation mode), the Calorimeter Trigger Control should be able to collect samples of calorimeter data, on user request.

#### Farm System Interface

When operating in standalone mode for debugging purposes, the system can request the allocation of dedicated resources from the Event Builder and Farm System. In any case (regardless of the operation mode), for monitoring purposes, the system should be able to collect data samples from the Farm System, on user request.

#### Global Trigger Interface

The Calorimeter Trigger Control is able to request the central Trigger Control to deliver a trigger of a particular type to the Calorimeter Trigger partition, when in standalone mode.

#### Muon Trigger Interface

Regardless of the operation mode, the Calorimeter Trigger Control should be able to collect data samples from the Muon Trigger system, or from a subset of the Muon Trigger System, on user request, for monitoring purposes.

### 7.6.3 User Interface

To reduce the time required to train people on the use of the Calorimeter Trigger Control, all the applications' user interfaces should have a common *look & feel*. Thus, the use of the CMS SCADA framework in this context may be a considerable advantage. All the applications' user interfaces will provide graphical operations and textual data entry, and receive both textual and graphical output in support of each transaction and function.

The exceptions are the *administration* and the *inventory application* user interfaces, which will be heavily (if not exclusively) based on textual forms for data submission and reception.

The remaining application user interfaces will have at their core a common graphical representation (view) of the system with which the user can interact. The interaction capabilities of this view will depend solely on the context (application) from which it is being used.

## 7.7 Prototype and Tests

### 7.7.1 Data Concentrator Card

A prototype of the ECAL data readout in the front-end crates was built and is under test. The prototype system includes a ROSE board equipped with front-end pipelines for 50 input channels and a prototype of the SLB card (see Chapter 4). The communication between the ROSE/SLB board and a DCC emulator using Channel Link was successfully tested (Figure 7.11).

The effort is now concentrated on the design of a DCC module. We plan to have a first prototype in 2001.



**Fig. 7.11:** Test bench of the ECAL trigger primitives readout and synchronization.

### 7.7.2 Readout PMC

A prototype of a generic readout PMC was developed (Figure 7.12). This prototype receives data from Channel Link or from Firewire connections and interfaces through PCI with the host CPU. Channel Link was found more appropriate for our application and was selected.



**Fig. 7.12:** Readout PMC prototype.

### 7.7.3 Boundary Scan Controller

A prototype of a VME-based JTAG controller (hardware/software) was developed, enabling the application of a boundary scan test to the subsystem crates during idle time or when a maintenance operation is required [7.4].

The boundary scan controller board (Figure 7.13) receives VME bus commands, generated by the VME crate controller, to download and configure the test procedure, and drives the IEEE 1149.1 test signals in order to apply a boundary scan test to a single board inserted in the VME backplane.



**Fig. 7.13:** Block diagram of the JTAG crate controller

### 7.7.4 Control Software

The development of the calorimeter trigger control software has already started. After the identification of the user requirements and the definition of the system architecture, the first prototype applications were developed. These include in particular the front-end ECAL software (ROSE software) and the TTC software.

## 7.8 Status and Schedule

The final prototype of the calorimeter trigger readout hardware should be fully tested at the end of 2002. Production will be organized in 2003. Installation and commissioning will be done in 2004.

The software development will proceed in several phases. As the requirements are not yet totally clear, a first “exploratory prototype” phase is anticipated, which will most certainly end in a mock-up version of the subsystem. A first standalone release with reduced functionality will follow the mock-up version. Only after this will the final product be released, for integrated operation with other CMS data acquisition subsystems. The first release is meant to help the Calorimeter Trigger group to test their equipment.

## References

- [7.1] C. Tully, J. Varela, J.C. Silva, G. Varner, Study of the ECAL Data Concentrator, CMS IN-1999/012
- [7.2] C. Beltrán Almeida, I. C. Teixeira, J. P. Teixeira, J. Augusto, M. Santos and N. Cardoso, J. Varela, Testability Issues in the CMS Ecal Upper-Level Readout and Trigger System, Proceedings of 'Fifth Workshop on Electronics for LHC Experiments', Snowmass, Colorado, USA, 1999.
- [7.3] T. Monteiro, Ph. Busson, W. Lustermann, T. Monteiro, J. C. Silva, C. Tully, J. Varela, Selective Readout in the CMS ECAL, in Proceedings of 'Fifth Workshop on Electronics for LHC Experiments', Snowmass, Colorado, USA, 1999.
- [7.4] N. Cardoso, C. Beltran, J. C. Silva, A System Level Boundary Scan Controller Board for VME Applications, in Proceedings of IEEE European Test Workshop (ETW 2000), June 2000, Lisbon, published by IEEE Computer Society and submitted to Journal of Electronic Testing.

# 8 Muon Trigger Introduction

## 8.1 Muon Detector

The CMS detector is a general purpose detector specifically optimized for muon measurement, which is performed by Drift Tubes (DT) located outside of the magnet coil in the barrel region and Cathode Strip Chambers (CSC) in the forward region. The CMS Muon System is also equipped with Resistive Plate Chambers (RPC) dedicated to triggering. The positions of these three detectors are shown schematically in Fig. 8.1.



**Fig. 8.1:** Longitudinal cut of the CMS Muon System.

### 8.1.1 Drift Tubes

The layout of the barrel muon system [8.1] is shown in Fig. 8.2. It is composed of four muon stations interleaved with the iron of the yoke to make full use of the magnetic return flux ( $\sim 1.8$  T). This makes possible a standalone muon momentum measurement, which is essential for the trigger and useful for the off-line matching of the reconstructed muon track to its image inside the inner tracker. The main drawback of a scheme with muon chambers packed close to the iron is the presence of a significant electromagnetic background, due to showering in the iron induced by muon bremsstrahlung, that complicates the track reconstruction. Together with the unavoidable cracks introduced for the supports and the cabling of large detectors, this background is the most important reason for the choice of a highly redundant design.

The redundancy is ensured by both the number of stations and the number of detector layers in each station. One station contains one or two RPC modules and one DT module. Each DT module is composed of three superlayers (SL), each one split into four layers of staggered drift tubes as shown in Fig. 8.3: two SLs measure the coordinate in the bending plane ( $\phi$  view) and one measures the coordinate in the longitudinal plane ( $\theta$  view).



**Fig. 8.2:** Transverse view of the CMS detector.



**Fig. 8.3:** Cross section of a barrel muon chamber.

The inner  $\phi$  view SL is separated from the other ones by a 20 cm thick aluminum honeycomb plate. The plate supplies the module with the required stiffness, lets an electromagnetic shower generated inside the iron open up, allowing a better measurement in the SL away from the iron, and provides a lever arm in the bending plane that is useful for triggering purposes.

The outermost station differs from the inner ones, since it does not have a SL in the  $\theta$  view. This SL is replaced by a honeycomb plate providing the same lever arm between the two  $\phi$  view SLs. Its potential contribution to the muon  $\eta$  assignment was found to be marginal.

The barrel chambers are drift chambers that are required to provide a final resolution of 100  $\mu\text{m}$  per station and to be linear in the time to space conversion, i.e. to have a drift velocity constant along the whole drift path. Most of the past studies were therefore devoted to careful evaluation of the drift cell layout in order to meet these requirements. Several tests were done on a muon beam using various chamber prototypes [8.2][8.3]. The results of the tests fully matched the requirements.

### 8.1.2 Cathode Strip Chambers

Endcap muon stations are equipped with Cathode Strip Chambers (Fig. 8.4). CSCs are multiwire proportional chambers with segmented cathode readout. A high-precision coordinate along the wire is obtained by interpolation of the charges induced on several adjacent cathode strips. In CMS the strip width varies from 3.2 to 16 mm. The resulting resolution is in the range between 80  $\mu\text{m}$  and 450  $\mu\text{m}$  for one layer. For trigger purposes, however, a resolution of half a strip width is sufficient.

CMS chambers have a trapezoidal shape. One chamber consists of six detecting layers. The layers are separated by 16 mm thick polycarbonate plastic honeycomb panels, which make the chamber stiff and provide the lever arm necessary to measure the angle of the tracks. In each layer the strips run radially. In angular units the strip width  $\Delta\phi$  varies from 2.0 to 4.3 mrad and the length  $\Delta\eta$  from 0.35 to 0.60  $\eta$ -units. The combined off-line resolution of six layers approaches 50  $\mu\text{m}$ .

The wires are perpendicular to the strips, except in ME1/1 where the wires are tilted by 25°. This is to compensate for the Lorentz effect in the high magnetic field (almost 4 T) to which the chamber is exposed. The wires have a diameter of 50  $\mu\text{m}$  and are spaced by 2.5 or 3.175 mm. Groups of 5-17 wires are read out together, providing a spatial resolution of  $\Delta R = 16\text{-}54$  mm, i.e.  $\Delta\eta = 0.01\text{-}0.04$ .

### 8.1.3 Resistive Plate Chambers

RPC chambers are located both in the barrel and in the endcaps (Fig. 8.1). Each muon station is equipped with one RPC plane, except the two innermost barrel stations 1 and 2, which contain two RPC planes (Fig. 8.3). This is because low momentum muons ( $p_T < 5\text{-}6$  GeV/c) cannot reach the outer stations; a special low  $p_T$  trigger is foreseen for them. Such additional planes are not necessary in the endcaps, where the same  $p_T$  corresponds to a much higher total momentum. Thus, the low  $p_T$  reach of the CMS muon trigger is about 4 GeV/c in the barrel and 2.0-3.5 GeV/c in the endcaps. The RPC readout is segmented into strips 1-4 cm wide and 30-130 cm long. In the barrel the strips run parallel to the beam, whereas in the endcaps they are radial. In angular units each strip covers  $\Delta\phi=(5/16)^\circ$  and  $\Delta\eta\approx 0.1$ .



**Fig. 8.4:** Layout of the CMS Endcap.

### 8.1.4 Magnetic Field

Measurement of the muon transverse momentum  $p_T$  is based on track bending in the magnetic field. The magnetic field in the CMS detector is created by a long superconducting solenoid (Fig. 8.5). Since the dominant component of the  $\mathbf{B}$  field is along the beam direction, tracks are primarily bent in the  $(r, \phi)$  plane (perpendicular to the beam direction). Thus, tracks in the  $(r, z)$  projection are approximately straight lines, i.e. they keep an almost constant  $\eta$  value along the path. The presence of a radial field component  $\mathbf{B}_r$ , especially in the forward part of the detector, slightly modifies this picture. A track bent by the  $\mathbf{p}_T \times \mathbf{B}_z$  force acquires a tangential component  $p_\phi$ . The  $\mathbf{p}_\phi \times \mathbf{B}_r$  term then produces the  $z$  component of the Lorentz force. As a result, the track's  $\eta$  changes along its path. This deflection in  $\eta$  is rather small because the  $p_T$  component is small in comparison to the total  $p$  value. Even for the softest tracks reaching the muon stations, the change in  $\eta$  typically does not exceed 0.15  $\eta$ -unit. Thus, in order to measure the transverse momentum of the track, it is enough to observe the dominant bending in the  $(r, \phi)$  plane.



**Fig. 8.5:** CMS magnetic field map.

## 8.2 Muon Trigger Overall Structure

The First Level Muon Trigger of CMS uses all three kinds of muon detectors: Drift Tubes (DT), Cathode Strip Chambers (CSC) and Resistive Plate Chambers (RPC). The excellent spatial precision of DT and CSC ensures a sharp momentum threshold. Their multilayer structure makes effective background rejection possible. RPC are dedicated trigger detectors. Their superior time resolution ensures unambiguous bunch crossing identification. Their high granularity makes it possible to work in a high-rate environment. Time information and both spatial coordinates of a detected particle are carried by the same signal, which eliminates the ambiguities typical for wire detectors.

The complementary features of the muon chambers (DT/CSC) and the dedicated trigger detectors (RPC) allow us to build two trigger subsystems which deliver independent information about detected particles to the Global Muon Trigger. The advantages of having two such subsystems are numerous. The muon chambers and the dedicated trigger detectors deliver different information about particle tracks. They behave differently in difficult cases and they respond in different ways to various backgrounds. DT, with their long drift time (~400 ns), and CSC, with charge weighting, are more vulnerable to muon radiation, to which RPC are much less sensitive. In the DT/CSC case, a background hit or track segment can eliminate the right one and cause some inefficiency. The opposite holds for the RPC trigger, which processes all hits simultaneously; this may lead to some rate increase. An accidental coincidence of three or four background hits can be recognized by the RPC trigger as a real muon. This is very unlikely for DT/CSC as they look for a coincidence of several planes in each station.

Properly combining the information from both systems results in high efficiency and powerful background rejection. Two extreme cases of such combinations would be the logical *OR*, which is optimized for efficiency, and the logical *AND*, optimized for background rejection. However, neither of these operations makes full use of the complementary functions of the muon trigger components, and a more sophisticated algorithm will be used. This is possible because both the muon chambers and the dedicated trigger detectors deliver information about the quality of detected muon candidates: number of hits in coincidence, matching between different chambers, etc. In general, a muon candidate is accepted in two cases:

- it is seen by both RPC and DT/CSC subsystem (regardless of quality),
- it is seen by only one subsystem with high quality.
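The acceptance rule in the list above can be sketched in a few lines. This is only an illustration of the logic, not the Global Muon Trigger implementation: the `Candidate` record, the integer quality scale and the `HIGH_QUALITY` threshold are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

HIGH_QUALITY = 2  # hypothetical threshold; the real quality coding is not given here


@dataclass
class Candidate:
    pt: float     # transverse momentum estimate in GeV/c
    quality: int  # larger means better (assumed convention)


def accept(rpc: Optional[Candidate], dtcsc: Optional[Candidate]) -> bool:
    """Return True if the muon candidate should be kept."""
    if rpc is not None and dtcsc is not None:
        return True  # seen by both subsystems: keep regardless of quality
    single = rpc if rpc is not None else dtcsc
    # Seen by one subsystem only: keep it only if its quality is high.
    return single is not None and single.quality >= HIGH_QUALITY
```

With this rule a low-quality candidate survives only when the other subsystem confirms it, which sits between the pure *OR* and pure *AND* combinations.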

Another important advantage of the two-component system is the possibility of cross-checks and cross-calibration. Trigger data from the two components collected by the DAQ can be compared online. This enables the quick discovery of possible problems and allows immediate action. When studying cross sections, asymmetries, etc., it is very important to know the trigger efficiency and acceptance. Usually this is done by running with thresholds much lower than the measurement range. The two-component system offers a unique ability to measure these quantities in a more unbiased way.

The muon trigger system consists of the following items:

- Drift Tube (DT) Trigger
- Cathode Strip Chamber (CSC) Trigger
- Pattern Comparator Trigger (PACT) based on Resistive Plate Chambers (RPC)
- Global Muon Trigger



**Fig. 8.6:** Muon Trigger data flow.

The functional relations between the components are shown in Fig. 8.6. Data are exchanged between DT and CSC in the overlap region ( $0.8 < |\eta| < 1.2$ ). In this way the Barrel Track Finder covers  $|\eta| < 1.0$ , whereas the Endcap Track Finder covers  $1.0 < |\eta| < 2.4$ . Optionally, coarse RPC data can be sent to the CSC trigger in order to help resolve spatial and temporal ambiguities in multimuon events.

The RPC trigger works on a grid of  $\Delta\eta \times \Delta\phi \approx 0.1 \times 2.5^\circ$ , which determines its two-muon resolution. The DT and CSC triggers do not work on a fixed grid. The  $\eta$  and  $\phi$  coordinates are calculated with a precision of 0.05  $\eta$ -unit and  $2.5^\circ$  respectively.

## 8.3 Algorithms and Implementation



**Fig. 8.7:** Muon Trigger block diagram.

The logical structure of the Muon Trigger is shown in Fig. 8.7. DT and CSC electronics first process the information from each chamber locally. Therefore they are called *local triggers*. As a result one vector (position and angle) per muon per station is delivered. Vectors from different stations are collected by the Track Finder (TF) which combines them to form a muon track and assigns a transverse momentum value. TF plays the role of a *regional trigger*. Up to 4 best (highest  $p_T$  and quality) muon candidates from each system are selected and sent to the Global Muon Trigger.

In the case of the RPC there is no local processing apart from synchronisation and cluster reduction. Hits from all stations are collected by the PACT logic. If they are aligned along a possible muon track, a  $p_T$  value is assigned and the information is sent to the Muon Sorter. The Muon Sorter selects the 4 highest  $p_T$  muons from the barrel and 4 from the endcaps and sends them to the Global Muon Trigger. The Global Muon Trigger compares the information from TF (DT/CSC) and PACT (RPC). So-called quiet bits delivered by the Calorimeter Trigger are used to form an isolated muon trigger. The 4 highest  $p_T$  muons in the whole event are then transmitted to the Global Trigger. Finally, transverse momentum thresholds are applied by the Global Trigger for all trigger conditions.

### 8.3.1 Drift Tube Trigger

The drift chambers deliver data for track reconstruction and triggering on different data paths. The local trigger is based on two SL in the  $\phi$  view of the muon station. The trigger logical blocks are shown in Fig. 8.8.



**Fig. 8.8:** Block scheme of the local trigger of a drift chamber.

The trigger front-end device, i.e. the one directly interfaced to the wire front-end readout electronics, is called the Bunch and Track Identifier (BTI). It is used in both the  $\phi$  view and the  $\theta$  view and performs a rough muon track fit in one station, measuring the position and direction of trigger candidate tracks with at least three hits in different planes of a SL. The algorithm fits a straight line within a programmable angular acceptance. Since it accepts three-point tracks, the device still works even if the drift time of one tube is missing, due to inefficiency, or wrong, due to the emission of a  $\delta$ -ray: there are still three useful cells giving the minimum requested information. It is also insensitive to all uncorrelated single hits. In the  $\theta$ -view only tracks pointing to the vertex are selected. The BTI uses an internal resolution of 0.7 mm for its calculations, but the resolutions on the output parameters are  $\sim 1.4$  mm on the impact position and  $\sim 60$  mrad on the track direction.

This device performs the bunch crossing assignment of every found muon track candidate. The algorithm used in the device is a generalization of the mean-timer method [8.4].
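For orientation, the basic (non-generalized) mean-timer relation can be illustrated as follows. For three consecutive layers staggered by half a cell and a constant drift velocity, the drift times of a straight track that stays on one side of the wires satisfy $(t_1 + t_3)/2 + t_2 = T_{max}$, independent of the track position and angle. Because all measured times share the same unknown offset (the crossing time), that offset can be solved for. The sketch below assumes $T_{max} = 400$ ns (the approximate drift time quoted in Section 8.2) and the 25 ns LHC bunch spacing; the function names are illustrative, and the generalized algorithm actually used in the BTI is not reproduced here.

```python
T_MAX_NS = 400.0  # assumed maximum drift time
BX_NS = 25.0      # LHC bunch spacing


def crossing_time(m1, m2, m3, t_max=T_MAX_NS):
    """Solve the mean-timer relation for the common time offset.

    m1, m2, m3 are the hit times (ns) in three consecutive staggered layers,
    each equal to the unknown crossing time plus the drift time in that layer.
    """
    return ((m1 + m3) / 2.0 + m2 - t_max) / 2.0


def bunch_crossing(m1, m2, m3):
    """Index of the bunch crossing nearest to the reconstructed time."""
    return round(crossing_time(m1, m2, m3) / BX_NS)
```

For example, drift times of 100, 250 and 200 ns on top of a crossing at 75 ns give measured times of 175, 325 and 275 ns, from which `crossing_time` recovers 75 ns, i.e. bunch crossing 3.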

Since this method must foresee alignment tolerances and needs to accept alignments of only three hits, the algorithm can generate false triggers. Hence in the bending plane a system composed of a Track Correlator (TRACO) and a chamber Trigger Server (TS) is used to filter the information of the two  $\phi$  SLs of a chamber in order to lower the trigger noise. The TRACO/TS block selects, at every cycle, at most two tracks among the trigger candidates: those with the smallest angular distance (i.e. higher  $p_T$ ) with respect to the radial direction to the vertex.

In particular, the TRACO improves the angular resolution of the muon candidate track to  $\sim 10$  mrad using the larger lever arm available, and converts the triggering variables to  $p_T$ -related quantities (the position in the detector as the angle  $\phi$ , and the bending angle  $\phi_B$ ). The TS governs the selection of the two tracks and therefore determines the dimuon detection efficiency.

The TS outputs at most two track segments in cells of size  $\Delta\phi \sim 1.5$  mrad and  $\Delta\phi_B \sim 12$  mrad. This cell defines the minimal de facto separation between two segments necessary for their identification, although two identical objects are allowed on output. The  $\eta$  segmentation is variable along the detector, being at fixed  $z$  values: there are 40 pseudorapidity cells in the range  $|\eta| \leq 1.2$ .

Track segments found in each station are then transmitted to a regional trigger system called the Drift Tube Track Finder (DTTF). The task of the Track Finder is to connect track segments delivered by the stations into a full track and assign a transverse momentum value to the finally resolved muon track [8.5][8.6][8.7]. The system is divided into sectors, each of them covering  $30^\circ$  in the  $\phi$  angle. The Sector Processors are organized in twelve wedges along the  $\eta$  coordinate. The Sector Processors in the outermost barrel wheels also receive track segment data from the Cathode Strip Chamber system, in order to handle track finding in the overlap region. Each Sector Processor is logically divided into three functional units: the Extrapolator Unit (EU), the Track Assembler (TA) and the Assignment Unit (AU), as shown in Fig. 8.9.

The Extrapolator Unit attempts to match track segment pairs from distinct stations [8.5]. Using the spatial coordinate  $\phi$  and the bending angle of the source segment, an extrapolated hit coordinate may be calculated. The match is considered successful if a target segment is found at the extrapolated coordinate, within a certain tolerance (see Fig. 8.9). Memory-based look-up tables are used in the calculation of the extrapolated hit coordinate and tolerance values. Since tracks may cross detector sector boundaries, the Extrapolator Unit can use the track segments of the neighboring sector as targets in the extrapolations [8.8]. The two best extrapolations per source are forwarded to the Track Assembler.
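The pairwise matching step can be sketched as follows. This is a simplified illustration under stated assumptions: coordinates are plain integers, the two look-up tables are stand-in dictionaries indexed by the bending angle with made-up contents, and "best" is taken to mean closest to the extrapolated coordinate, which the text does not specify.

```python
def extrapolate_match(source, targets, offset_lut, tolerance_lut):
    """Return up to two target segments compatible with the source segment.

    source: (phi, phi_b) of the source segment, in integer units
    targets: phi values of the segments found in the target station
    offset_lut, tolerance_lut: dicts indexed by phi_b, standing in for the
    memory-based look-up tables (hypothetical contents)
    """
    phi, phi_b = source
    phi_extrap = phi + offset_lut[phi_b]  # extrapolated hit coordinate
    window = tolerance_lut[phi_b]         # extrapolation window half-width
    matches = [t for t in targets if abs(t - phi_extrap) <= window]
    # Keep the two closest targets (assumed definition of "best").
    return sorted(matches, key=lambda t: abs(t - phi_extrap))[:2]
```

Using tables rather than arithmetic for the offset and window reflects the look-up-table approach described above: both quantities depend on the bending angle of the source segment.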

The Track Assembler attempts to find at most two tracks in a detector sector with the highest rank, i.e. exhibiting the highest number of matching track segments and the highest extrapolation quality. This task is performed in three steps. First, the Track Segment Linker joins the track segment pairs formed by the Extrapolator Unit into full tracks, which are then forwarded to the Track Selector unit. Next, the Track Selector, which contains cancellation logic that reduces the number of duplicated tracks, selects the two highest-rank tracks. As a last step, the Address Assignment sub-unit extracts the corresponding track segment data from the data pipeline and forwards them to the Assignment Unit.

Once the track segment data are available to the Assignment Unit, memory based look up tables are used to determine the transverse momentum, the  $\phi$  and  $\eta$  coordinates, and a track quality. The transverse momentum is assigned using the difference in the spatial  $\phi$  coordinate of the two innermost track segments. The  $\phi$  coordinate is defined as the spatial coordinate of the track segment in the second muon station.



**Fig. 8.9:** Principle of the track finder algorithm (3-step scheme). On the left side, the pairwise matching algorithm is described. An extrapolated hit coordinate is calculated using the  $\phi_1$  coordinate and the bending angle of the source segment. The match is considered successful if a target segment is found at the extrapolated coordinate, inside a certain extrapolation window.

A preliminary coarse  $\eta$  assignment is derived from the place the track crossed detector wheel boundaries. A dedicated board has been designed for a finer  $\eta$  measurement; this measurement is derived by using the data from central Super Layers of the three innermost muon stations, which give a measurement of the  $z$ -coordinate of the track segments [8.9]. The  $\eta$  track finder board tries to match tracks along the  $\eta$  coordinate. At the Assignment step, the  $\eta$  track finder board attempts to match the found candidates with the  $\phi$  Track Finder candidates. If the matching is successful, the fine  $\eta$  measurement is assigned to the Track Finder candidate, otherwise the coarse measurement is assigned.

Each Sector Processor forwards the two best ranking candidates to the Wedge Sorter, which selects the two track candidates with the highest transverse momentum. Each of the twelve Wedge Sorters sends the muon candidates to the Muon Sorter, which reduces the number of split tracks by checking the candidates of neighboring wedges. The four highest momentum tracks are selected and then forwarded to the Global Muon Trigger for the final decision.

### 8.3.2 CSC Trigger

A block diagram of the CSC Trigger electronics is shown in Fig. 8.10. The task of the Cathode Strip Chamber (CSC) Track-Finder is to reconstruct tracks in the CSC endcap muon system and to measure the transverse momentum ( $p_T$ ), pseudo-rapidity ( $\eta$ ), and azimuthal angle ( $\phi$ ) of each muon. A  $p_T$  resolution of about 25% is necessary to have sufficient rate reduction at L1 with a reasonable threshold. The measurement of  $p_T$  by the CSC trigger uses spatial information from up to three stations to achieve a precision similar to that of the DT Track-Finder despite the reduced magnetic bending in the endcap. The CSC muon trigger finds muon candidates in the presence of large background rates from: low-energy photon interactions, where the photons originate from neutron-induced nuclear reactions; decay muons, especially at the lowest momenta and in the first muon station; punch-through pions, particularly in the first muon station; primary muons having low energy, and bremsstrahlung showers from the high-momentum muons themselves.



**Fig. 8.10:** Block diagram of the CSC Trigger.

At large rapidities, high backgrounds are expected from punch through pions, primary muons, secondary muons, and neutron-induced gamma rays. The high-rapidity muons also have higher momentum corresponding to a particular  $p_T$  and hence radiate more bremsstrahlung. The CSC Local Trigger provides high rejection power against these backgrounds by finding muon segments, also referred to as Local Charged Tracks (LCTs), in the 6-layer endcap muon CSC chambers. Muon segments are first found separately by anode and cathode electronics (see Fig. 8.11) and then time correlated, providing precision measurement of the bend coordinate position and angle, approximate measurement of the non-bend angle coordinate, and identification of the correct muon bunch crossing with high probability.



**Fig. 8.11:** Principle of the CSC Local Trigger.

The primary purpose of the CSC anode trigger electronics is to determine the exact muon bunch crossing with high efficiency. Within the CSC chambers, anode wires are hard-wired together or ‘ganged’ at the readout end in groups of 10-15 wires in order to reduce channel count. Anode signals are fed into amplifier/constant-fraction discriminators. Since the drift time can be longer than 50 ns, a multi-layer coincidence technique in the anode “Local Charged Track” (LCT) pattern circuitry is used to identify a muon pattern and find the bunch crossing. For each spatial pattern of anode hits, a low coincidence level, typically 2 layers, is used to establish timing, whereas a high coincidence level, typically 4 layers, is used to establish the existence of a muon track.
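The two-level coincidence can be illustrated with a short sketch. It is not the LCT circuitry: the representation of hits as one bunch-crossing number per layer, the length of the coincidence `window`, and the choice of the first crossing that satisfies both levels are assumptions made for the example.

```python
def anode_lct(hit_bx, low=2, high=4, window=3):
    """Return the bunch crossing assigned to an anode pattern, or None.

    hit_bx: dict mapping layer number -> bunch crossing of the hit in that layer
    low: number of layers needed to establish the timing
    high: number of layers needed to establish the existence of a track
    window: number of crossings over which hits are combined (assumed)
    """
    times = sorted(hit_bx.values())
    if not times:
        return None
    for bx in range(times[0], times[-1] + 1):
        # Layers whose hit has arrived by this crossing and is still active.
        active = sum(1 for b in times if bx - window < b <= bx)
        if active < low:
            continue
        # Timing is established; now require enough layers around it.
        total = sum(1 for b in times if bx - window < b < bx + window)
        if total >= high:
            return bx
    return None
```

The low level fixes the time early, when only the first hits have arrived, while the high level protects against accepting a pair of uncorrelated hits as a track.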

The primary purpose of the CSC cathode trigger electronics is to measure the  $\phi$  coordinate precisely to allow a good muon momentum measurement up to high momentum. The charge collected on an anode wire produces an opposite-sign signal on several strips, and precision track measurement is obtained by charge digitization and precise interpolation of the cathode strip charges. A simpler and more robust method used by the CSC trigger achieves localization of the muon track to one-half of a strip width in each cathode layer. This is done with a 16-channel “comparator” ASIC that takes amplified and shaped signals as input and compares the charges on all adjacent and next-to-adjacent strips. If a strip charge is found to be larger than those on its neighbors, a hit is assigned to the strip. Simultaneous comparison of left versus right neighbor strip charges allows assignment of the hit to the right or left side of the central strip, effectively doubling the resolution. The six layers are then brought into coincidence in LCT pattern circuitry to establish the position of the muon to an RMS accuracy of 0.15 strip widths. Strip widths range from 6 to 16 mm.
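The half-strip localization can be sketched in software as below. This is a simplified model of the comparator idea, not of the ASIC: it compares each strip only with its two adjacent neighbors (the text also mentions next-to-adjacent strips), ignores the edge strips, and uses a half-strip numbering convention chosen for the example.

```python
def half_strip_hits(charges, threshold=0.0):
    """Return half-strip indices of local charge maxima in one cathode layer.

    charges: list of strip charges in arbitrary units
    Half-strip index 2*i is the left half of strip i and 2*i + 1 its right
    half (assumed convention).
    """
    hits = []
    for i in range(1, len(charges) - 1):
        left, centre, right = charges[i - 1], charges[i], charges[i + 1]
        if centre > threshold and centre > left and centre >= right:
            # The larger neighbour tells on which side of the strip the track passed.
            hits.append(2 * i + (1 if right > left else 0))
    return hits
```

For the charges `[0, 2, 10, 6, 0]` the maximum is on strip 2 and the right neighbor carries more charge than the left one, so the hit is assigned to half-strip 5.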

Cathode and anode segments are brought into coincidence and sent to CSC Track Finder electronics which links the segments from the endcap muon stations. Each Track Finder unit finds muon tracks in a  $60^\circ$  sector. Because of the limited bending in the endcap region, information is not shared across sector boundaries. Each CSC Track Finder can find up to three muon candidates. A CSC muon sorter module selects the four best CSC muon candidates and sends them to the Global Muon Trigger.

### 8.3.3 RPC Trigger

The algorithm behind the RPC-based Pattern Comparator Trigger has been described in detail elsewhere [8.10]. Here, we recall its salient features and make some general comments on the practical implementation, which is described in detail in Chapter 13.



**Fig. 8.12:** RPC Trigger principle.

#### The algorithm

PACT is based on the spatial and time coincidence of hits in four RPC muon stations (see Fig. 8.12). We shall call such a coincidence a candidate track or a hit pattern. Because of energy loss fluctuations and multiple scattering there are many possible hit patterns in the RPC muon stations for a muon track of defined transverse momentum emitted in a certain direction. Therefore, the PACT should recognize many spatial patterns of hits for a muon of a given transverse momentum. In order to trigger on a particular hit pattern left by a muon in the RPCs, the PACT electronics performs two functions: it requires a time coincidence of hits in at least 3 out of 4 muon stations (a so-called 3/4 candidate; a 4/4 candidate is better) along a certain road, and it assigns a  $p_T$  value. The coincidence gives the bunch crossing assignment for a candidate track. The candidate track is formed by a pattern of hits that matches one of the many possible patterns pre-defined for muons with defined transverse momenta. The  $p_T$  value is thus given. The pre-defined patterns of hits have to be mutually exclusive, i.e. a pattern should have a unique transverse momentum assignment. The patterns are divided into classes with a transverse momentum value assigned to each of them. PACT is a threshold trigger; it gives a momentum code if an actual hit pattern is straighter than any of the pre-defined patterns with a lower momentum code. The patterns depend on the direction of a muon, i.e. on  $\phi$  and  $\eta$ .
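The pattern comparison can be sketched as follows. The pattern list, the strip numbering and the use of the number of fired planes as the quality are assumptions for illustration; the real patterns and their momentum codes are not given in this section.

```python
def pac_match(hits, patterns):
    """Return (pt_code, n_planes) of the best matching pattern, or None.

    hits: list of four sets, the fired strip numbers in each RPC station
    patterns: list of (pt_code, (s1, s2, s3, s4)) pre-defined strip patterns
    (hypothetical contents)
    """
    best = None
    for pt_code, strips in patterns:
        n = sum(1 for station, s in enumerate(strips) if s in hits[station])
        if n < 3:
            continue  # require at least a 3-out-of-4 coincidence
        # Prefer 4/4 over 3/4 candidates, then the higher momentum code.
        if best is None or (n, pt_code) > (best[1], best[0]):
            best = (pt_code, n)
    return best
```

Ranking by quality first and momentum second means that a 4/4 match with a lower momentum code is preferred over a 3/4 match with a higher one.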

The number of patterns for a low transverse momentum muon track might be huge. Fortunately, for these strongly bent tracks we do not need the full granularity of the RPCs and can reduce the number of patterns by suitably OR-ing RPC strips at the PACT input. In fact, one can adapt the granularity to different  $p_T$  values, using the full granularity for straight hit patterns and progressively OR-ed ones for more curved hit patterns. This is the idea of the dynamic cone on which the current PAttern Comparator (PAC) processor, the kernel of PACT, is based. This concept has the advantage of a moderate number of patterns per track (80 in the current design) and a manageable number of input strips to a PAC.

#### Segmentation of PACT

Ideally, the RPC geometry should be projective in  $\phi$ ; the strip widths vary from 10 mm at high  $|\eta|$  and low radii to about 40 mm at the outer radius of CMS. In reality, the RPC azimuthal symmetry is broken (see e.g. Fig. 8.2) in a different way for every station in the barrel. This leads to the dependence of the pre-defined patterns on  $\phi$  mentioned above. The logical segmentation is defined uniquely in the so-called reference station, which for most of the detector is Muon Station 2. The segmentation in  $\phi$  is based on the basic PACT logical unit, called a segment. Every segment subtends  $2.5^\circ$  in  $\phi$ , i.e. 8 RPC strips in the reference station, and is serviced by one PAC processor. In non-reference stations more strips are connected, forming a cone. The cones from neighbouring segments overlap. While the assignment of strips in the reference station to a segment and its PAC is unique, the strips in non-reference stations are connected to several segments, hence the need to split the RPC strip signals to several destinations.
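The numbers quoted above are mutually consistent, as the following small check shows. The strip width is the value from Section 8.1.3; the zero-based strip numbering in `segment_of` is an assumption made for the example.

```python
STRIP_DEG = 5.0 / 16.0   # angular width of one RPC strip, in degrees
STRIPS_PER_SEGMENT = 8   # strips per segment in the reference station


def segment_of(strip):
    """Map a reference-station strip number (0-based) to (segment, offset)."""
    return divmod(strip, STRIPS_PER_SEGMENT)


segment_deg = STRIP_DEG * STRIPS_PER_SEGMENT     # width of one segment in degrees
segments_per_tower = round(360.0 / segment_deg)  # segments around the full circle
```

Eight strips of $(5/16)^\circ$ give a $2.5^\circ$ segment, and 144 such segments cover the full circle, which matches the figure of 144 PACs per trigger tower.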

The PACT is segmented in  $\eta$  into 33 towers, each covering roughly  $\Delta\eta \approx 0.1$ . This does not coincide with the physical segmentation of the RPCs. Hence the need for OR-ing the strip signals from different chambers, and for overlapping cones in  $\eta$  for segment processors. Each segment, i.e. each PAC processor, finds at most one candidate track. Internally, there may be several candidates found by a PAC inside one segment. The best candidate (i.e. the one with the best quality and then the highest transverse momentum) is selected and its momentum code, muon charge and quality bits are sent to the PAC output.

#### Ghosts, ghost-busting and sorting

One of the requirements for the RPC Muon Trigger is to deliver the 4 highest transverse momentum muon candidates in the barrel and 4 in the two endcaps. This means that we need to sort the list of candidates for barrel and endcaps to find the highest ones. Before any sorting, however, we have to reduce the number of ghosts. There are cracks in the geometrical coverage of the RPCs due to gaps between the CMS barrel wheels as well as to gaps between RPCs in a given muon station. In order to maximize the efficiency, the PACT accepts not only 4/4 but also 3/4 candidate tracks and uses overlapping cones as inputs to the PACs. This causes the appearance of ghosts, i.e. candidate tracks in neighbouring segments in  $\phi$  and/or  $\eta$ . Simulation shows that, typically, ghosts in  $\phi$  are 3/4 candidates with higher  $p_T$  than the real 4/4 candidate tracks. They could significantly increase the trigger rate and we have to suppress them before sorting. There are two ghost-busting (GB) engines implemented in PACT: one for GB in  $\phi$ , just after the layer of PAC processors, and a second for GB in  $\eta$ , after sorting in one trigger tower (144 PACs). After the last GB in  $\eta$ , a final sorting is done before producing up to 4+4 final candidates in barrel and endcaps.
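Ghost-busting in  $\phi$  can be sketched as a comparison between neighbouring segments. The suppression rule used here (drop a candidate if an adjacent segment holds a strictly better quality, then momentum code) is an assumption motivated by the observation that ghosts are typically 3/4 candidates next to a real 4/4 candidate; the actual GB logic is not specified in this section.

```python
def ghost_bust_phi(cands, n_segments=144):
    """Suppress candidates that lose against a neighbouring segment in phi.

    cands: dict mapping segment number -> (quality, pt_code), with quality 4
    for a 4/4 candidate and 3 for a 3/4 candidate (assumed coding)
    n_segments: number of segments around the circle (wrap-around applied)
    """
    kept = {}
    for seg, cand in cands.items():
        neighbours = [cands.get((seg + d) % n_segments) for d in (-1, 1)]
        # Keep the candidate unless a neighbour is strictly better.
        if all(nb is None or nb <= cand for nb in neighbours):
            kept[seg] = cand
    return kept
```

Comparing quality before momentum is what removes a high- $p_T$  3/4 ghost in favour of the real 4/4 track; two exactly equal neighbours are both kept in this simplified version.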

### 8.3.4 Global Muon Trigger

The choice to have two separate muon systems was one of the determining ideas in the concept of the CMS detector. The basic design goals were excellent efficiency and control of trigger rates as well as redundancy. The Global Muon Trigger achieves the first two goals by choosing muon parameters from the most suitable system. Redundancy is achieved by the fact that the GMT can also use only one system in case of deficiency of the other. Special attention is given to ghost suppression, i.e. to the correct identification of identical muons seen twice.

The Regional Muon Trigger reconstructs muon candidates in both the barrel and the endcap regions out of hits or track segments found at the muon stations. For RPCs this is the Pattern Comparator Trigger (PACT) covering the entire  $\eta$ -region. The DTs and CSCs have separate track finders. The GMT receives the best four barrel DT and the best four endcap CSC muons and combines them with the 4+4 muons sent by the RPC PACT. It performs a matching based on the proximity of the candidates in  $(\eta, \phi)$  space. If two muons are matched, their parameters are combined to give optimum precision. If a muon candidate cannot be confirmed by the complementary system, quality criteria can be applied to decide whether to forward it. The muon candidates are ranked based on their transverse momentum, quality and to some extent pseudorapidity, and the best four muon candidates in the entire CMS detector are sent to the Global Trigger.
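The proximity matching can be sketched as below. The window sizes, the use of radians for  $\phi$  and the greedy first-match strategy are placeholders chosen for the example, not the GMT matching logic.

```python
import math


def delta_phi(a, b):
    """Smallest signed difference between two azimuthal angles in radians."""
    return (a - b + math.pi) % (2.0 * math.pi) - math.pi


def match(rpc, dtcsc, max_deta=0.1, max_dphi=0.05):
    """Pair candidates from the two systems that are close in (eta, phi).

    rpc, dtcsc: lists of (eta, phi) tuples
    Returns a list of (rpc_index, dtcsc_index) pairs; each DT/CSC candidate
    is used at most once.
    """
    pairs, used = [], set()
    for i, (eta_r, phi_r) in enumerate(rpc):
        for j, (eta_d, phi_d) in enumerate(dtcsc):
            if j in used:
                continue
            close_eta = abs(eta_r - eta_d) <= max_deta
            close_phi = abs(delta_phi(phi_r, phi_d)) <= max_dphi
            if close_eta and close_phi:
                pairs.append((i, j))
                used.add(j)
                break
    return pairs
```

The wrap-around in `delta_phi` matters for candidates on either side of  $\phi = \pm\pi$ , which would otherwise appear far apart.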

The Global Muon Trigger also receives information from the calorimeters. The Regional Calorimeter Trigger sends two bits based on energy measurements representing isolation and compatibility with a minimum ionizing particle in  $\Delta\eta \times \Delta\phi = 0.35 \times 0.35$  trigger regions. The GMT extrapolates the muon tracks back to the calorimeter trigger towers and appends the corresponding ISO (isolation) and MIP (minimum ionizing particle) bits to the track data consisting of  $p_T$ , sign of the charge,  $\eta$ ,  $\phi$  and quality.

The GMT hardware consists of two GMT logic boards located in the Global Trigger crate. There is no separate crate for the GMT. Before the calculations can be done the incoming data from the different muon detectors and the calorimeter bits have to be aligned in time to each other and to the LHC bunch crossing structure. For the calorimeter MIP and ISO bits this is done on separate Pipeline Synchronizing Buffer modules also located in the Global Trigger crate which send the information to the GMT logic boards via the backplane. The four output muons are sent to the Global Trigger via the backplane.

## 8.4 Summary of Algorithm Performance

### 8.4.1 Simulation Tools

All the results presented in this section were obtained with the PYTHIA [8.11] event generator. Trigger rates were calculated using large samples containing about  $10^7$  “minimum bias” single-muon and di-muon events and  $4 \cdot 10^6$  W, Z and Drell-Yan events. Details are described in [8.12].

Particle passage through the detector was simulated with the CMSIM [8.13] program, based on the GEANT 3 package [8.14]. The only exception is the neutron background simulation, which was performed with a stand-alone version of FLUKA [8.15] with proper simulation of thermal neutrons. In order to properly simulate punch-through effects the hadronic interactions were switched on. Muons were enabled to produce delta-rays, bremsstrahlung photons and  $e^+e^-$  pairs.

The detector response was simulated with the ORCA [8.16] program, unless otherwise stated. It contains a detailed description of the trigger hardware mapped into an object-oriented structure in such a way that each board (in some cases a chip) is represented by a C++ object. The algorithms are simulated using bit-wise arithmetic.

### 8.4.2 Background

The task of the Muon Trigger is difficult because of the presence of severe background. In fact this is the major challenge of the design. There are three main sources of background:

- proton-proton interactions themselves,
- beam losses because of the limited LHC aperture ( $p$ -nucleus collisions with an energy of 7 TeV in the laboratory system),
- cosmic rays.

These sources produce various effects in the detectors. We group them into four classes depending on how they can influence the trigger:

**track**: a set of aligned track segments from several muon stations,

**track segment**: a set of aligned hits within one muon station,

**correlated hit**: a hit caused by a genuine muon or its secondaries,

**uncorrelated hit**: a hit caused by a phenomenon not related to a given muon.

Dominant sources of each class of background are given in Table 8.1.

**Table 8.1:** Background classification.

| detected objects  | caused by       | dominant source                                          |
|-------------------|-----------------|----------------------------------------------------------|
| tracks            | low $p_T$ muons | b- and c-quark decays, $\pi$ and K decays                |
| track segments    | hadrons         | punch-through and back-splashes                          |
| correlated hits   | electrons       | muon bremsstrahlung, $\delta$ -rays, $e^+e^-$ production |
| uncorrelated hits | electrons       | thermal neutrons $\rightarrow \gamma \rightarrow e$      |

#### Muons

Muons come from several sources:

1. Proton-proton interactions
  - (a) decays of heavy objects like W, Z, top, Higgs, etc.,
  - (b) b- and c-quark decays,
  - (c) decays of hadrons composed of u, d and s quarks (mainly  $\pi$  and K),
  - (d) punch-through of hadronic showers.
2. Beam losses because of the limited LHC aperture (sometimes called *beam halo muons*).
3. Cosmic rays.

The most difficult background is genuine low  $p_T$  muons. It is difficult to distinguish soft punch-through muons from other charged particles produced in hadronic showers. Therefore, we will discuss them together with punch-through hadrons.

Muons of type 1a and 1b together are often called **prompt muons**, because they are produced very close to the pp vertex. The lifetime of the longest-living b-mesons, expressed as  $c\tau$ , is no longer than  $500 \mu\text{m}$ . Even taking into account relativistic time dilation, most of the particles containing a b or c quark will decay within 1 cm of the vertex. The decays of b- and c-quarks dominate the rate of prompt muons (see Fig. 2.4, where b and c decays are included in the “minimum bias” set).
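
The statement about decay distances can be illustrated with the exponential decay law: the mean decay length is  $\beta\gamma\, c\tau$ , so the fraction of particles decaying within a distance  $d$  is  $1 - e^{-d/(\beta\gamma c\tau)}$ . The sketch below evaluates this for the  $c\tau$  bound quoted above; the boost values and the function name are illustrative assumptions, not taken from the simulation.

```python
import math

def fraction_decayed(distance_m, c_tau_m, beta_gamma):
    """Fraction of particles that decay within distance_m of the vertex."""
    mean_decay_length_m = beta_gamma * c_tau_m
    return 1.0 - math.exp(-distance_m / mean_decay_length_m)

C_TAU_M = 500e-6  # upper bound on c*tau quoted in the text, in metres

# Illustrative boosts (assumed values, not from the text).
for beta_gamma in (1, 5, 10, 20):
    frac = fraction_decayed(0.01, C_TAU_M, beta_gamma)
    print(f"beta*gamma = {beta_gamma:2d}: {frac:.3f} decay within 1 cm")
```

For example, a boost of  $\beta\gamma = 10$  gives a mean decay length of 5 mm, so about 86 % of such particles decay within 1 cm.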

Prompt muons dominate the overall muon rate except for very low  $p_T$  ( $< 5 \text{ GeV}/c$ ) where **muons from  $\pi$  and K decays** become more important, as shown in Fig. 8.13.



**Fig. 8.13:** Rates of muons entering the Muon System.

A typical rate of **cosmic muons** at ground level is about  $200 \text{ Hz}/\text{m}^2$ . At the location of the detector one can expect a reduction by a factor of approximately one hundred with respect to the ground-level cosmic rate. Hence the local rate is  $\sim 2 \text{ Hz}/\text{m}^2$ , which is completely negligible compared to other sources. Since the detector cross section is roughly  $22 \times 15 \text{ m}^2$ , the total rate of cosmic muons crossing it is about 700 Hz. Only very few of the cosmic muons will have a chance to give a trigger, because this requires the track to point to the vertex. Most of them will create only very low  $p_T$ , poor quality track segments.
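
The arithmetic behind this estimate can be checked in a few lines. The numbers are those quoted above; the variable names are only illustrative.

```python
# Rough cosmic-muon rate estimate using the figures quoted in the text.
ground_rate_hz_per_m2 = 200.0    # typical rate at ground level
reduction_factor = 100.0         # expected reduction at the detector
detector_area_m2 = 22.0 * 15.0   # approximate detector cross section

local_rate_hz_per_m2 = ground_rate_hz_per_m2 / reduction_factor
total_rate_hz = local_rate_hz_per_m2 * detector_area_m2

print(f"local rate: {local_rate_hz_per_m2:.1f} Hz/m^2")
print(f"total rate: {total_rate_hz:.0f} Hz")
```

The product comes out at 660 Hz, which is the "about 700 Hz" given in the text.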

**Beam halo muons.** The limited aperture of the LHC causes some beam losses. Particles deviating from the beam center will interact with machine elements, producing many secondaries. The most dangerous of them are energetic muons because of their ability to penetrate matter. They will enter the experimental hall and traverse the detector almost parallel to the beam. They have a very small probability of causing a trigger because they do not point to the vertex. In the endcap chambers, however, they might be seen as local track segments. The estimated rate of halo muons varies from  $10^{-2}$  Hz/cm $^2$  to 1 Hz/cm $^2$  depending on the distance from the beam.

The **total rate of muon hits** is shown in Fig. 8.14 as a solid line. It is a sum of all contributions discussed above.



**Fig. 8.14:** Hit rates in muon chambers due to muons (solid line), hadronic punchthrough/backsplashes (open circles) and thermal neutrons (full circles).

#### Hadrons

**Punch-through.** The thickness of the CMS calorimeters varies from 11 to 15 interaction lengths  $\lambda$ . This is enough to contain most hadronic showers caused by energetic hadrons. However, in some cases particles from the tails of the showers may enter the muon detectors. This phenomenon was extensively studied experimentally in the early stage of the LHC R&D in a dedicated experiment, RD5 [8.17]. The results were used to tune Monte Carlo simulations.

When a highly energetic hadron hits one of the forward detector elements, some products of the hadronic shower can be emitted at large angles and travel towards the muon chambers. This effect is often called a **backsplash**, although it is very similar in nature to punch-through and the division is somewhat artificial. The total rate of charged hadrons due to punch-through and backsplashes of particles from pp interactions is shown in Fig. 8.14 as open circles.

#### Uncorrelated electrons from neutrons ("neutron background")

The final products of hadronic showers are thermal neutrons. They cannot cause hits in detectors by themselves. However, they can be captured by nuclei, which then emit photons on de-excitation. Such a photon can in turn create an  $e^+e^-$  pair, eventually causing hits in detectors. If the capture happens in iron or hydrogen the photons have an energy of 2-8 MeV, which may result in electrons penetrating several chamber layers. For rough estimates, one can assume that the flux of electrons causing hits is  $\sim 100$  times lower than the photon flux, which is in turn  $\sim 10$  times lower than the flux of neutrons. The thermal neutrons behave like a gas filling the entire experimental hall. They can travel long distances even in dense matter, and it is therefore difficult to shield against them.

There is another mechanism through which neutrons can produce detector hits. An elastic neutron-proton collision can give some kinetic energy to the proton, which can then be registered by the detector. It has been shown [8.18], however, that the hit rate due to this effect is negligible compared to the  $n \rightarrow \gamma \rightarrow e$  mechanism. The total rate of hits due to neutrons is shown in Fig. 8.14 as full circles.
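
Taken together, the two rough factors quoted above mean that the electron hit flux is about a thousand times lower than the thermal neutron flux. A minimal sketch of this order-of-magnitude estimate follows; the function and parameter names are hypothetical, and the result is no more precise than the factors themselves.

```python
def electron_hit_flux(neutron_flux,
                      photons_per_neutron=1 / 10,
                      electrons_per_photon=1 / 100):
    """Order-of-magnitude electron hit flux, in the same units as neutron_flux."""
    return neutron_flux * photons_per_neutron * electrons_per_photon

# Example with an arbitrary input flux (illustrative number only).
print(electron_hit_flux(1.0e5))
```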

#### Electrons correlated with muons ("muon radiation")

A muon traversing matter can lose energy through four processes:

- ionization (including delta ray production),
- bremsstrahlung, i.e. photon emission,
- direct  $e^+e^-$  pair production,
- nuclear interactions.

The probability of the last one can be neglected compared to the others. The probability of the first three processes is given in Fig. 8.15 as a function of muon momentum and energy of the secondaries. The most probable effect is the emission of a soft delta ray. The probability of hard ( $> 1$  GeV/c) electron or photon emission is relatively low, but such a particle usually develops an entire electromagnetic shower, which can seriously disturb the muon measurement. The probability of such a shower in one of the four muon stations is about a few percent for muons above 100 GeV/c and exceeds 10 % for 1 TeV/c muons.



**Fig. 8.15:** Production of electrons by muons traversing iron [8.14].

### 8.4.3 Muon isolation

Muons from decays of heavy objects ( $W$  and  $Z$  bosons, SUSY Higgs bosons  $A$ ,  $H$  and  $h$ , etc.) are usually isolated, whereas those from quark decays are accompanied by jets. One can improve the signal-to-background ratio by vetoing muons surrounded by a significant energy deposit in the calorimeters. Muons can be tagged as “isolated” or “nonisolated” by the Global Muon Trigger using quiet bits from the Calorimeter Trigger. One quiet bit is associated with a calorimeter region of  $\Delta\eta \times \Delta\phi = 0.35 \times 0.35$ . This is usually too small to contain the full energy of a jet. Therefore isolation algorithms based on larger areas were also studied:  $1\times 2$ ,  $2\times 2$  and  $3\times 3$  regions. A muon was considered isolated if the entire area of the sliding window centered on the muon was quiet. Two signal examples were chosen: an inclusive  $W \rightarrow \mu$  sample and an  $A \rightarrow \mu\mu$  sample. Background was represented by  $b\bar{b}$  pairs. The simulation was done with Pythia/CMSIM, with and without pileup in order to study the low and high luminosity cases [8.19]. Only the barrel ( $|\eta| < 1$ ) was considered.
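
The sliding-window criterion can be sketched as follows. This is only an illustration under stated assumptions, not the hardware or ORCA implementation: a quiet bit is taken to be set when the region  $E_T$  is below the threshold, the grid is indexed as `[eta][phi]` with  $\phi$  wrapping around, and for even window sizes (which cannot be centered exactly) the window is arbitrarily extended towards higher indices. All names and the example energies are hypothetical.

```python
def quiet_bits(et_grid, threshold):
    """One quiet bit per region: True when the region E_T is below threshold."""
    return [[et < threshold for et in row] for row in et_grid]

def is_isolated(quiet, i_eta, i_phi, n_eta, n_phi):
    """True if every region in an n_eta x n_phi window around the muon is quiet.

    Assumptions of this sketch: phi wraps around, regions outside the
    eta range are ignored, and even-sized windows extend one extra
    region towards higher indices.
    """
    n_rows, n_cols = len(quiet), len(quiet[0])
    eta_start = i_eta - (n_eta - 1) // 2
    phi_start = i_phi - (n_phi - 1) // 2
    for d_eta in range(n_eta):
        eta = eta_start + d_eta
        if not 0 <= eta < n_rows:
            continue  # outside the grid in eta: nothing to check
        for d_phi in range(n_phi):
            phi = (phi_start + d_phi) % n_cols
            if not quiet[eta][phi]:
                return False
    return True

# Illustrative region energies in GeV (rows: eta index, columns: phi index).
et = [
    [1.0, 0.5, 0.2, 0.0, 0.3, 0.1],
    [0.4, 0.8, 1.5, 0.6, 0.2, 0.0],
    [0.2, 0.3, 0.9, 9.0, 0.1, 0.4],
    [0.0, 0.1, 0.2, 0.5, 0.3, 0.2],
]
quiet = quiet_bits(et, threshold=6.0)

# A muon extrapolated to region (eta=1, phi=2), next to a 9 GeV deposit.
for n_eta, n_phi in ((1, 1), (1, 2), (2, 2), (3, 3)):
    print(f"{n_eta}x{n_phi}:", is_isolated(quiet, 1, 2, n_eta, n_phi))
```

In this example the muon passes the  $1\times 1$  and  $1\times 2$  criteria but fails  $2\times 2$  and  $3\times 3$ , because only the larger windows include the neighbouring region with the large deposit.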

The efficiency of the single isolated muon trigger for signal and background is shown in Fig. 8.16 as a function of the quiet bit threshold  $E_T$ . On the basis of this figure the  $E_T$  threshold was set to ensure 90 % efficiency for muons from  $A$  Higgs decays. This corresponds to 95 % or higher efficiency for muons from  $W$ . Selected  $E_T$  values are given in Table 8.2. The efficiency for background muons is shown in Fig. 8.17 as a function of muon  $p_T$  threshold.

It is seen that the isolation is not very effective for low  $p_T$  muons. This is because they are usually accompanied by soft, wide jets with energy per region less than the isolation threshold. The isolation becomes effective for muons with  $p_T > 15$  GeV/c. At low luminosity  $2\times 2$  and  $3\times 3$  algorithms give equally good results. At high luminosity the  $2\times 2$  algorithm is the best. More details can be found in Ref. [8.19].



**Fig. 8.16:** Fraction of accepted events as a function of energy threshold at high luminosity. The upper and lower curves with error bars represent the  $W$  and  $A^0$  signal respectively. The four histograms in each plot correspond to muons from b-quark decays with  $p_T > 5$  (the uppermost), 15, 20 and 25  $\text{GeV}/c$ .

**Table 8.2:** Muon isolation thresholds  $E_T$  for different algorithms.

| algorithm | $\Delta\eta \times \Delta\phi$ | $E_T$ for $10^{33} \text{ cm}^{-2} \text{ s}^{-1}$ | $E_T$ for $10^{34} \text{ cm}^{-2} \text{ s}^{-1}$ |
|-----------|--------------------------------|----------------------------------------------------|----------------------------------------------------|
| 1x1       | $0.35 \times 0.35$             | 6 GeV                                              | 7 GeV                                              |
| 1x2       | $0.35 \times 0.70$             | 6 GeV                                              | 7 GeV                                              |
| 2x2       | $0.70 \times 0.70$             | 6 GeV                                              | 7 GeV                                              |
| 3x3       | $1.05 \times 1.05$             | 7 GeV                                              | 8 GeV                                              |



**Fig. 8.17:** Single isolated muon trigger efficiency for background ( $b\bar{b}$ ) muons.

### 8.4.4 Trigger efficiency and rates

The Muon Trigger performance was tested using the tools described in Section 8.4.1. The simulated efficiency is shown in Fig. 8.18 as a function of  $\eta$ . It is seen to be well above 95 %.



**Fig. 8.18:** Muon trigger efficiency vs  $\eta$ .

Trigger rates coming from various physics channels are shown in Fig. 8.19. Output rate can be adjusted to a desired value by changing the  $p_T$  threshold. Proposed values are given in Table 8.3.



**Fig. 8.19:** Single muon trigger rate vs  $p_T$  threshold.

**Table 8.3:** Muon trigger thresholds and rates.

| luminosity        | $10^{33} \text{ cm}^{-2} \text{ s}^{-1}$ |            |         | $10^{34} \text{ cm}^{-2} \text{ s}^{-1}$ |            |         |
|-------------------|------------------------------------------|------------|---------|------------------------------------------|------------|---------|
| trigger           | 1 $\mu$                                  | 1 $\mu$ +X | 2 $\mu$ | 1 $\mu$                                  | 1 $\mu$ +X | 2 $\mu$ |
| threshold [GeV/c] | 10                                       | 4          | 3, 3    | 25                                       | 5          | 8, 5    |
| rate [kHz]        | 8.7                                      | 2.2        | 1.5     | 8.1                                      | 2.4        | 2.8     |

In order to maximize the physics potential of the detector, the trigger thresholds should be lower than the corresponding offline cuts. From Table 8.4 one can see that this criterion is fulfilled with a large safety margin. The only exception is b-quark physics, because the rate of b-quark production is  $\sim 5$  MHz and one can afford to take only a small fraction of events. However, one can devote some fraction of the bandwidth to exclusive channels and thus exploit the lowest possible  $p_T$  thresholds. For other physics the  $1\mu$  trigger threshold is even lower than the  $1\mu+X$  offline cut. This ensures very high efficiency for such triggers.

**Table 8.4:** Offline cuts on muons [GeV/c]

| luminosity | $10^{33} \text{ cm}^{-2} \text{ s}^{-1}$ |          |          | $10^{34} \text{ cm}^{-2} \text{ s}^{-1}$ |          |          |
|------------|------------------------------------------|----------|----------|------------------------------------------|----------|----------|
| trigger    | $1\mu$                                   | $1\mu+X$ | $2\mu$   | $1\mu$                                   | $1\mu+X$ | $2\mu$   |
| SM higgs   | —                                        | 20       | 10, 5    | —                                        | 20       | 20, 10   |
| SUSY higgs | 10                                       | 7        | 5, 5     | —                                        | 20       | 10, 10   |
| sparticles | —                                        | 10       | 10, 10   | —                                        | 20       | 15, 15   |
| exotica    | —                                        | 100      | 20, 20   | 100                                      | 100      | 20, 20   |
| top        | 50                                       | 50       | 30, 4    | 50                                       | 30       | 15, 4, 4 |
| beauty     | 10                                       | 2-4      | 2-4, 2-4 | —                                        | —        | —        |

## 8.5 Muon Trigger for Heavy Ion Runs

### 8.5.1 Low Momentum Threshold

A low  $p_T$  threshold is crucial for quarkonia detection. Its value should be limited only by the muon energy loss in the calorimeters. The minimal value of the trigger threshold  $p_T^{\min}$  is plotted in Fig. 8.20 as a function of  $|\eta|$ . Because of Landau fluctuations of the energy lost by muons, different  $p_T^{\min}$  values are obtained for different required efficiencies  $\epsilon$ . For comparison the total momentum  $p^{\min}$  is also plotted. The error bars on the plots correspond to an uncertainty in the absorber distribution of one nuclear interaction length  $\lambda$ . In the barrel one can achieve  $p_T^{\min} \approx 4 \text{ GeV}/c$  for  $\epsilon = 90\%$  or  $p_T^{\min} \approx 3.5 \text{ GeV}/c$  for  $\epsilon = 80\%$ , where  $\epsilon$  is the single muon trigger efficiency. In the endcap it decreases below 2  $\text{GeV}/c$ . This allows us to explore central  $\Upsilon$ ,  $\Upsilon'$ ,  $\Upsilon'' \rightarrow \mu^+\mu^-$  production with good statistics at all  $p_T(\Upsilon)$ , down to  $p_T(\Upsilon)=0$ .

Quarkonia decay into muon pairs. The efficiency for detecting both muons at L1 is rather low at low  $p_T$ . Triggering on either of the two muons would increase the quarkonia efficiency significantly, provided the single muon L1 rate at the minimal threshold is acceptable. This is illustrated in Fig. 8.21.
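
The gain from requiring only one muon can be estimated with a simple sketch. It assumes that both muons have the same single muon efficiency  $\epsilon$  and are detected independently, which is a simplification since the real efficiency depends on  $p_T$  and  $\eta$ ; the function name is hypothetical.

```python
def pair_trigger_efficiencies(eps):
    """Return (both muons seen, at least one muon seen) for single-muon efficiency eps."""
    both = eps * eps
    at_least_one = 1.0 - (1.0 - eps) ** 2
    return both, at_least_one

for eps in (0.8, 0.9):
    both, at_least_one = pair_trigger_efficiencies(eps)
    print(f"eps = {eps:.1f}: both = {both:.2f}, at least one = {at_least_one:.2f}")
```

Under these assumptions, a single muon efficiency of 90 % gives about 81 % for a two-muon requirement but about 99 % when either muon suffices.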

### 8.5.2 Trigger Rates

In the case of heavy ion collisions the muon trigger rate is dominated by three major contributions:

- prompt muons (mainly from c- and b-quark decays),
- muons from hadron decays (mainly  $\pi$  and K),
- hadronic punch-through.



**Fig. 8.20:** Minimal muon trigger thresholds  $p_T^{\min}$  and  $p^{\min}$  for various required efficiencies as a function of muon pseudorapidity.



**Fig. 8.21:** Trigger efficiency in the case of 1- and 2-muon events ( $|\eta| < 1.5$ )

A detailed simulation has been performed [8.20] in order to evaluate those rates in the case of minimum bias Pb-Pb collisions. Results are shown in Fig. 8.22 for the nominal luminosity of  $10^{27} \text{ cm}^{-2} \text{ s}^{-1}$ . In the pseudorapidity range of  $|\eta| < 1.5$ , useful for quarkonia study, one can expect a single muon trigger rate of  $\approx 500$  Hz with almost equal contributions from prompt muons (c- and b-quark decays) and from hadronic punch-through plus decays. This is well below the 5 kHz limit imposed by the DAQ bandwidth (Sec. 2.5). It allows us to run requesting a single muon at the first level trigger, which ensures high efficiency for  $\Upsilon \rightarrow \mu^+ \mu^-$ . The second muon could be requested at L2, which would reduce the rate to 60 Hz, already acceptable for mass storage. This strategy is summarized in Fig. 8.23.



**Fig. 8.22:** Expected muon trigger rates for minimum bias Pb-Pb events.

The strategy described above works well for Pb-Pb collisions and it may work (with some modifications) in the Sn-Sn case. For lighter ions, however, the single muon rates are much higher (see Table 8.5) and one has to require two muons already at the first level. The price for this is an efficiency for low  $p_T$  muon pairs of 80% or even lower. Fortunately this is compensated by much higher luminosities, which ensure sufficiently high statistics in spite of the low efficiency.

**Table 8.5:** Single muon trigger rates for different ion species [8.20].

|                                               | pp        | O O                 | Ar Ar               | Kr Kr               | Sn Sn               | Pb Pb     |
|-----------------------------------------------|-----------|---------------------|---------------------|---------------------|---------------------|-----------|
| luminosity [ $\text{cm}^{-2} \text{s}^{-1}$ ] | $10^{34}$ | $3.1 \cdot 10^{31}$ | $1.0 \cdot 10^{30}$ | $6.6 \cdot 10^{28}$ | $1.7 \cdot 10^{28}$ | $10^{27}$ |
| 1μ trigger rate [kHz]                         | 190       | 120                 | 21                  | 5.7                 | 2.9                 | 0.5       |
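
As a rough first-pass illustration of this strategy, the heavy ion rates of Table 8.5 can be compared with the 5 kHz DAQ limit quoted for Pb-Pb. Applying the same limit unchanged to all ion species is an assumption of this sketch, and the function name is hypothetical.

```python
# Single-muon trigger rates from Table 8.5, in kHz.
single_muon_rate_khz = {
    "O O": 120.0,
    "Ar Ar": 21.0,
    "Kr Kr": 5.7,
    "Sn Sn": 2.9,
    "Pb Pb": 0.5,
}
DAQ_LIMIT_KHZ = 5.0  # limit quoted in the text; assumed here for every species

def l1_requirement(rate_khz, limit_khz=DAQ_LIMIT_KHZ):
    """First-pass choice of the L1 muon requirement from the single-muon rate."""
    return "single muon" if rate_khz < limit_khz else "two muons"

for species, rate in single_muon_rate_khz.items():
    print(f"{species:6s} {rate:6.1f} kHz -> {l1_requirement(rate)}")
```

This simple cut reproduces the qualitative picture in the text: only Pb-Pb and Sn-Sn stay below the limit, and Sn-Sn has much less margin, consistent with the remark that it may need some modifications.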



**Fig. 8.23:** Trigger and physics rates for minimum bias Pb-Pb events.

## 8.6 System robustness

Designing LHC detectors is a challenge in many respects. The energy of collisions is an order of magnitude higher than at previous accelerators (Tevatron). Therefore there is significant uncertainty in the extrapolation of cross sections. The collision frequency is 4 times higher (HERA). The high clock frequency might be a source of crosstalk. The 25 ns bunch spacing means that signals from different bunch crossings will be mixed in some detectors. The luminosity is  $\sim 10$  times higher (Tevatron). This will cause a pileup of 10-20 pp interactions in a single bunch crossing. Radiation levels will also be much higher than in previous experiments. Large particle multiplicities imply fine detector granularity and hence a huge number of channels. The high energy of the particles requires deep calorimeters and large tracking detectors with strong magnetic fields. Strong radiation, together with the large size of the detectors and the huge amount of surrounding electronics, makes access to the detector very limited. Some parts can be accessed only once a year for a short period. The situation becomes similar to experiments in space, where there is no possibility of repair after the launch. On the other hand, the huge amount of diverse electronics distributed over a large area makes the LHC experiments unlike any other device. All these challenges require very careful design in order to make the system robust.

Operation of the system can be distorted by two kinds of causes:

- bottlenecks within the system,
- malfunctioning of system elements.

A bottleneck could be a limited bandwidth, a finite buffer size, a finite two-hit or two-track resolution, etc. The size and frequency of errors caused by a bottleneck depend mainly on the data stream itself, which is determined by

- signal physics (luminosity, correlations),
- background physics (in the case of muon detector it is punch-through, thermal neutrons, muon radiation, muons from the accelerator, etc.),
- state of the detector (pulse shapes, noise levels, timing, efficiencies, etc.).

Malfunctioning of system elements can be caused by external disturbances, like:

- electromagnetic disturbances (noise, crosstalk, RF, bad grounding, faulty power supply, etc.),
- radiation (radiation damage, single event upsets),
- temperature, humidity,
- mechanical stresses.

All those causes can produce different kinds of faults:

- transient (the data are distorted as long as the disturbance exists),
- permanent (the data are distorted until the system is reset),
- destructive (the element still works, but with lower performance),
- fatal (the element is faulty forever).

In order to limit the impact of all kinds of failures each system should have the following features:

- redundancy,
- immunity to single, local failures (e.g. a single link should not serve  $> 1\%$  of a detector),
- possibility of (possibly remote) exchange of a faulty element with a spare one,
- possibility of changing the algorithm (e.g. masking out a noisy channel or requiring a more stringent coincidence).

All those precautions are widely used in the CMS Muon Trigger System. First of all it consists of two complementary subsystems, DT/CSC and RPC, which can work independently. Within each subsystem the redundancy is provided by using four muon stations. In each station there are 8-12 DT layers or 6 CSC layers. A brief summary of possible system bottlenecks and the tools to handle them is given in Tables 8.6 - 8.8. The second column of each table gives the location of each object abbreviated as follows: **D** - on the detector, **P** - on the periphery of the iron yoke, **C** - in the Counting Room. From the number of objects of each type one can judge what fraction of the detector is covered by a single object and how narrow the bottleneck due to the bandwidth limit shown in the fourth column is. The fifth column contains the tools implemented to handle different kinds of noise, including physics background and artefacts of the algorithms (e.g. a single track seen by two processors). In addition to the tools listed in the tables, every channel can be masked out at the input of each object. The last column tells what kind of redundancy is implemented in order to make the system immune to failure of the given object. This is usually provided by majority coincidences of M out of N elements, denoted “M/N” for short. For example, a dead channel introduces a local inefficiency in one layer, but the trigger is still efficient because of the majority coincidence of 3/4 layers. The last column does not concern the Counting Room electronics, where a replacement can be done easily.
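
The “M/N” majority logic and its effect on efficiency can be sketched as follows. The efficiency formula assumes that the N elements fire independently with the same per-element efficiency, and the value of 95 % used in the example is a hypothetical number chosen only for illustration.

```python
from math import comb

def majority(hits, m):
    """M/N majority coincidence: True if at least m of the inputs fired."""
    return sum(bool(h) for h in hits) >= m

def majority_efficiency(p, m, n):
    """Probability that at least m of n independent elements fire,
    each with efficiency p (binomial sum)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m, n + 1))

# One dead layer out of four: a 3/4 coincidence still fires.
print(majority([1, 1, 0, 1], m=3))
# Two missing layers: the 3/4 coincidence fails.
print(majority([1, 0, 0, 1], m=3))

# Hypothetical per-layer efficiency of 95 %.
print(f"3/4: {majority_efficiency(0.95, 3, 4):.4f}")
print(f"4/4: {majority_efficiency(0.95, 4, 4):.4f}")
```

With these illustrative numbers the 3/4 requirement is about 98.6 % efficient, compared with about 81.5 % if all four layers were required.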

**Table 8.6:** Drift Tube Trigger chain

| object           | loc | quantity | bandwidth                 | noise handling                       | failure handling |
|------------------|-----|----------|---------------------------|--------------------------------------|------------------|
| channel          | D   | 200 000  | 1 hit / bx                | discrim. threshold                   | 3/4 layers       |
| BTI              | D   | 50 000   | 1 track / bx              | fit tolerance                        | 1/2 superlayers  |
| TRACO            | D   | 4 400    | 2 tracks / 2 bx           | matching tolerance                   | 2/4 stations     |
| Trigger Server   | D   | 250      | 2 tracks / 2 bx           | sorting priority                     | 2/3 TSM chips    |
| Sector Collector | P   | 60       | 2 tracks / 2 bx / station | output bandwidth = input bandwidth   | several links    |
| Track Assembler  | C   | 72       | 2 tracks / 2 bx           | matching tolerance, sorting priority |                  |
| Wedge Sorter     | C   | 12       | 4 tracks / bx             | quality threshold                    |                  |
| Barrel Sorter    | C   | 1        | 4 tracks / bx             | quality threshold                    |                  |
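
To judge what fraction of the system is affected by the loss of a single object, one can take the reciprocal of the quantities in Table 8.6. The sketch below assumes that objects of one type share the DT system equally, which is only an approximation; the names are illustrative.

```python
# Quantities from Table 8.6 (Drift Tube trigger chain).
dt_quantities = {
    "channel": 200_000,
    "BTI": 50_000,
    "TRACO": 4_400,
    "Trigger Server": 250,
    "Sector Collector": 60,
    "Track Assembler": 72,
    "Wedge Sorter": 12,
    "Barrel Sorter": 1,
}

def coverage_fraction(quantity):
    """Fraction of the system served by one object, assuming equal shares."""
    return 1.0 / quantity

for name, quantity in dt_quantities.items():
    percent = 100.0 * coverage_fraction(quantity)
    print(f"{name:17s} {percent:9.4f} % of the DT system")
```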

All CMS muon trigger subsystems use discriminated detector signals, therefore the first line of defence against noise is obviously the discriminator threshold. In the case of Drift Tubes possible inefficiency due to dead or masked out channels is handled by the BTI making a majority coincidence of hits in 3 out of 4 layers. The tolerance on this coincidence is programmable, so it can be tightened in case of high noise. Similarly the TRACO has a programmable tolerance on matching tracks from two superlayers, but single (unmatched) tracks can also be accepted. Information about the quality of coincidences in BTI and TRACO is sent to the Trigger Server, which selects the 2 highest rank tracks per chamber. Since the rank is a programmable function of qualities and estimated  $p_T$ , it provides a powerful tool for background suppression. Because one Trigger Server covers the entire chamber, its potential failure could have a relatively large impact on the system. Therefore it is made of 3 identical chips arranged in such a way that any two can serve the entire chamber. The last critical element is the Sector Collector, providing the optical transmission to the Counting Room. It does not create any bandwidth bottleneck, so the noise is not an issue. However, its possible failure could affect an entire sector and therefore it is located in a more accessible place. Double background suppression is done at the Track Assembler. First, the tolerance on matching track segments from different stations is programmable. Second, the priority for selecting the 2 best tracks per  $30^\circ$  sector is a programmable function of  $p_T$  and matching quality. Also the final stages of sorting are based on programmable rank. In addition, low quality tracks can be removed from sorting.
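
The idea of selecting by a programmable rank rather than by fixed cuts can be sketched as follows. The encodings of  $p_T$  and quality, the class and the rank functions are all hypothetical; the sketch only shows how swapping the rank function or the quality threshold changes the selection without touching the sorting logic.

```python
from dataclasses import dataclass

@dataclass
class TrackCandidate:
    pt_code: int   # coarse p_T estimate, larger = harder (hypothetical encoding)
    quality: int   # coincidence quality, larger = better (hypothetical encoding)

def quality_first(track):
    """One possible rank: quality first, p_T as tie-breaker."""
    return (track.quality, track.pt_code)

def pt_first(track):
    """Alternative rank: p_T first, quality as tie-breaker."""
    return (track.pt_code, track.quality)

def select_best(tracks, n_best=2, rank=quality_first, min_quality=0):
    """Drop tracks below min_quality, then keep the n_best highest-rank ones."""
    kept = [t for t in tracks if t.quality >= min_quality]
    return sorted(kept, key=rank, reverse=True)[:n_best]

candidates = [
    TrackCandidate(pt_code=5, quality=1),
    TrackCandidate(pt_code=20, quality=3),
    TrackCandidate(pt_code=12, quality=3),
    TrackCandidate(pt_code=30, quality=0),
]

print(select_best(candidates))                               # quality-driven choice
print(select_best(candidates, rank=pt_first))                # p_T-driven choice
print(select_best(candidates, rank=pt_first, min_quality=1)) # low quality removed first
```

In the second call the high  $p_T$  but quality-0 candidate is selected; in the third it is removed before sorting, which mirrors the option of excluding low quality tracks.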

In the case of CSC the redundancy is provided by majority coincidences of 2 out of 3 (or 4) muon stations and 4 out of 6 layers in each station. The Local Charged Tracks (LCT) are formed as patterns of hits in a given chamber. One can reduce the background in a well controlled way by restricting the set of allowed patterns. If it is still too high one can accept only 5 out of 6 or 6 out of 6 coincidences. Further rate reduction is performed by the Muon Port Card. It selects the 2(3) highest quality tracks per  $20^\circ$  ( $60^\circ$ ) sector in ME1 (2,3,4). Again, the selection criteria are programmable. Track Finder processing is similar to that of DT. Track segments from different stations are matched to each other with programmable tolerance. The three highest rank candidates per sector are selected. Again, the rank is a programmable function of  $p_T$  and quality. This rank is also used for final sorting.

**Table 8.7:** CSC Trigger chain

| object                   | loc    | quantity            | bandwidth                      | noise handling                          | failure handling                   |
|--------------------------|--------|---------------------|--------------------------------|-----------------------------------------|------------------------------------|
| anode wire group         | D      | 160 000             | 1 hit / bx                     | discrim. threshold                      | 4/6 layers                         |
| cathode strip            | D      | 200 000             | 1 hit / bx                     | discrim. threshold                      | 4/6 layers                         |
| Anode LCT                | D      | 432                 | 2 tracks / bx                  | patterns, quality                       | 2/3(4) stations                    |
| Cathode LCT              | P      | 432                 | 2 tracks / bx                  | patterns, quality                       | 2/3(4) stations                    |
| Port Card ME1<br>ME2,3,4 | P<br>P | 36<br>$3 \times 12$ | 2 tracks / bx<br>3 tracks / bx | sorting priority<br>sorting priority    | 2/3(4) stations<br>2/3(4) stations |
| Sector Processor         | C      | 12                  | 3 tracks / bx                  | matching tolerance,<br>sorting priority |                                    |
| Muon Sorter              | C      | 1                   | 4 tracks / bx                  | sorting priority                        |                                    |

The case of the RPC Trigger is somewhat different, as it does not have the intermediate step of forming tracks in single stations. On the other hand, its very good time resolution permits some background reduction by narrowing the input gate. The next step already uses the data from all stations. The tracks are formed by comparing observed hits with preprogrammed patterns. As in the case of CSC, careful pattern selection and quality tagging are powerful tools for background suppression. The final sorting is similar to that of DT and CSC.
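
Pattern comparison with a majority requirement can be sketched as follows. The patterns, strip numbers and names are invented for illustration and do not correspond to the real pattern set; the number of matched stations is used as a simple quality tag.

```python
# Hypothetical patterns: one strip number per station (4 stations).
PATTERNS = {
    "straight": (10, 10, 10, 10),
    "bent_left": (10, 9, 8, 6),
    "bent_right": (10, 11, 12, 14),
}

def match_patterns(fired_strips, patterns=PATTERNS, min_stations=3):
    """Return (pattern name, stations matched) for the best pattern, or None.

    fired_strips is a list of sets, one per station, holding the strips
    that fired. A pattern is accepted when at least min_stations of its
    strips fired.
    """
    best = None
    for name, strips in patterns.items():
        n_matched = sum(strip in fired for strip, fired in zip(strips, fired_strips))
        if n_matched >= min_stations and (best is None or n_matched > best[1]):
            best = (name, n_matched)
    return best

# A bent track with one inefficient station still matches with 3/4.
hits = [{10}, {9}, set(), {6}]
print(match_patterns(hits))
# Requiring 4/4 (a more stringent coincidence) rejects the same hits.
print(match_patterns(hits, min_stations=4))
# A single noise hit matches nothing.
print(match_patterns([{10}, set(), set(), set()]))
```

Restricting `PATTERNS` to fewer entries or raising `min_stations` corresponds to the background-reduction handles described in the text.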

The last, but not least, line of defence against background is at the Global Muon Trigger. Its programmable logic permits very flexible use of quality information. The purity of the data can be greatly improved by matching muon candidates from the DT/CSC and RPC subsystems. The overall efficiency can be increased by accepting candidates seen only in a single system. Accepting only high quality singles ensures that the purity is not compromised significantly. Details of the algorithm can be adjusted to the current situation (detector status, background rate) in each detector region.
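
The matching logic described above can be sketched as follows. This is a simplified illustration and not the GMT algorithm itself: the matching windows, the quality scale and all names are assumptions, and no attempt is made to merge the parameters of matched candidates.

```python
from dataclasses import dataclass
import math

@dataclass
class Candidate:
    eta: float
    phi: float     # radians
    quality: int   # larger = better (hypothetical scale)

def delta_phi(a, b):
    """Smallest absolute difference between two azimuthal angles."""
    d = abs(a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)

def is_matched(cand, others, max_deta, max_dphi):
    """True if any candidate in others lies within the matching window."""
    return any(
        abs(cand.eta - o.eta) <= max_deta and delta_phi(cand.phi, o.phi) <= max_dphi
        for o in others
    )

def accept(dtcsc, rpc, max_deta=0.1, max_dphi=0.1, min_single_quality=3):
    """Keep matched candidates, plus unmatched ones of sufficiently high quality."""
    accepted = []
    for c in dtcsc:
        if is_matched(c, rpc, max_deta, max_dphi) or c.quality >= min_single_quality:
            accepted.append(c)
    for c in rpc:
        # A matched RPC candidate is already represented by its DT/CSC partner.
        if not is_matched(c, dtcsc, max_deta, max_dphi) and c.quality >= min_single_quality:
            accepted.append(c)
    return accepted

dtcsc = [Candidate(0.50, 1.00, 1), Candidate(-0.80, 2.50, 1), Candidate(1.20, 0.10, 3)]
rpc = [Candidate(0.52, 1.03, 2), Candidate(0.00, 6.20, 0)]

for c in accept(dtcsc, rpc):
    print(c)
```

In this example the first DT/CSC candidate is kept because it is confirmed by the RPC, the third is kept as a high quality single, and the low quality unmatched candidates from either system are dropped. Changing `min_single_quality` per detector region would correspond to adjusting the algorithm to local conditions.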

**Table 8.8:** RPC Trigger chain

| object             | loc | quantity | bandwidth                      | noise handling                       | failure handling |
|--------------------|-----|----------|--------------------------------|--------------------------------------|------------------|
| RPC strip          | D   | 200 000  | 1 hit / 4 bx                   | discrim. threshold,<br>gate width    | 3/4 stations     |
| Master Link Board  | P   | 732      | 8 groups ×<br>12 strips / 8 bx |                                      | 3/4 stations     |
| Pattern Comparator | C   | 4752     | 1 track / bx                   | patterns, quality                    |                  |
| Trigger Board      | C   | 396      | 4 tracks / bx                  | sorting priority                     |                  |
| Trigger Crate      | C   | 33       | 4 tracks / bx                  | sorting priority                     |                  |
| Barrel Sorter      | C   | 1        | 4 tracks / bx                  | sorting priority                     |                  |
| Endcap Sorter      | C   | 1        | 4 tracks / bx                  | sorting priority                     |                  |

The key ingredients implemented all the way through the trigger chain are quality information and programmable selection criteria (patterns, ranks, etc.). Wherever possible explicit cuts are avoided in favor of sorting and selection. This makes it possible to tune the algorithms to a very wide range of conditions. The high redundancy and flexibility of the algorithms are what make the CMS Muon Trigger a very robust system.

## References

- [8.1] CMS Muon Technical Design Report, **CERN/LHCC 97-32**, 1997.
- [8.2] M. Benettoni et al., Nucl. Instr. and Meth. **A 410** (1998) 133.
- [8.3] M. Aguilar-Benitez et al., Nucl. Instr. and Meth. **A 416** (1998) 243.
- [8.4] F. Gasparini et al., Nucl. Instr. and Meth. **A 336** (1993) 91.
- [8.5] A. Kluge, T. Wildschek, **CMS Note 1997/091**.
- [8.6] A. Kluge, T. Wildschek, **CMS Note 1997/092**.
- [8.7] A. Kluge, T. Wildschek, **CMS Note 1997/093**.
- [8.8] G. M. Dallavalle et al, **CMS Note 1998/042**.
- [8.9] M. Kloimwieder, **CMS Note 1999/054**.
- [8.10] M. Andlinger et al., Nucl. Instr. and Meth. **A 370**, 389, (1996).

- [8.11] T. Sjostrand et al., *High Energy Physics Event Generation with PYTHIA 6.1*, **hep-ph/0010017**, LU TP 00-30; T. Sjostrand, Computer Physics Commun. **101** (1997) 232.
- [8.12] N. Neumeister et al., *Monte Carlo simulation for High Level Trigger studies in single and di-muon topologies*, **CMS IN 2000/053**.
- [8.13] CMS Simulation Package CMSIM — Users' Guide and Reference Manual, <http://cmsdoc.cern.ch/cmsim/cmsim.html>
- [8.14] *GEANT - Detector Description and Simulation Tool*, CERN Program Library Long Writeup **W5013**.
- [8.15] P.A. Aarnio et al., *FLUKA86 user's guide*, CERN TIS-RP/168 (1986);  
P.A. Aarnio et al., *Enhancements to the FLUKA86 program (FLUKA87)*, CERN-TIS-RP/190 (1987);  
A. Fassò et al., *FLUKA: present status and future developments*, Proc. IV Int. Conf. on Calorimetry in High Energy Physics, La Biodola, Sept 20-25, 1993, Ed. A. Menzione and A. Scribano, World Scientific, p. 493 (1993);  
A. Fassò et al., *FLUKA: performances and applications in the intermediate energy range*, Specialists' Meeting on Shielding Aspects of Accelerators, Targets and Irradiation Facilities, Arlington, Texas, April 28-29, 1994.
- [8.16] CMS Reconstruction Software: The ORCA Project, **CMS IN-1999/035**.
- [8.17] *Status Report of the RD5 Experiment*, **CERN-DRDC/93-49**;  
C. Albajar et al., Z. Phys. **C 69** (1996) 415;  
C. Albajar et al., Nucl. Instr. and Meth. **A 386** (1997) 421.
- [8.18] M. Huhtinen, *Optimization of the CMS forward shielding*, **CMS Note 2000/068**.
- [8.19] C. Albajar and G. Wrochna, *Isolated Muon Trigger*, **CMS Note 2000/067**.
- [8.20] G. Wrochna, **CMS Note 1997/089**.

# 9 Drift Tube Local Trigger

## 9.1 Requirements

The muon trigger provides the identification of the muon and a good measurement of its curvature that enables a sharp cut on muon momentum for rate reduction. These tasks are separated in the drift tubes trigger: a local algorithm provides the muon track segments (trigger primitives) inside each station and a regional algorithm links these segments extracting the vector parameters of each identified muon. The trigger requirements are so stringent that the barrel muon detector layout was designed around the trigger.

The local trigger must immediately resolve time ambiguities and is therefore required to identify the parent bunch crossing of the muon. Each track segment is thus uniquely assigned to an LHC bunch crossing as soon as it is found. This process must be dead-time free in order to avoid any event loss. In addition, the trigger dead areas must be negligible, which leads to a redundant design.

The bunch crossing alignment relies on good determination of the drift times and the precise positioning of the wires:

- the analog front end must have a time resolution better than 2 ns and the discriminated signal input to the trigger chain must have a minimal duration of 5 ns;
- the wires inside a quadruplet must be positioned within 100  $\mu\text{m}$ ;
- shifts between quadruplets must be contained within 500  $\mu\text{m}$ , although a global correction can be applied;
- the chambers must be aligned within a few millimeters, matching the alignment requirement due to reconstruction of 300  $\mu\text{m}$ .

Alignment corrections will be included in the look-up tables used for the conversion of the trigger data from the local reference system to the CMS one.

The existence of punchthrough muons and the need to find dimuons require the ability to find more than one track inside a single station. The two muons usually have different momenta and directions: hence, if they are very close inside one station, they are separated in the other stations. A zone a few centimeters wide that is insensitive to dimuons is therefore allowed inside each station.

The momentum cut becomes more effective as the resolution available at the trigger primitive level improves. The running algorithms should therefore push their resolution as close as possible to the resolution of the muon chambers.

The implemented algorithm must be flexible and designed to reduce unexpected background and noise. Therefore coincidences, thresholds and quality codes should be available to easily modify its flow according to different experimental conditions.



**Fig. 9.1:** Overview of one barrel muon station electronics layout.

## 9.2 System Overview

The block scheme of the first level drift tubes muon trigger primitive generator [9.1][9.2][9.3] inside each muon station is shown in Fig. 9.1.

Each muon chamber is instrumented in the transverse ( $r,\phi$ ) plane (hereafter  $\phi$  view) and in the longitudinal ( $r,\theta$ ) plane (hereafter  $\theta$  view).

The front-end trigger device is called Bunch and Track Identifier (BTI): it performs a rough track reconstruction within each superlayer (SL) and uniquely assigns the parent bunch crossing of the candidate track. The device has been built and prototypes were recently tested.

The BTI is followed by a Track Correlator (TRACO), which associates portions of tracks in the same chamber by combining groups of BTIs of the  $\phi$  view. The TRACO enhances the angular resolution and produces a quality hierarchy of the triggers.

TRACO trigger data are transmitted to the chamber Trigger Server (TS). The TS is in fact composed of a set of devices (Track Sorter Slave (TSS), Track Sorter Master (TSM) and Trigger Server Theta (TST)) whose purpose is to perform track selection in a multitrack environment. The TS of the  $\phi$  view selects two tracks (looking for the lowest bending angle) among all tracks transmitted by the TRACOs; the TS of the  $\theta$  view sends the wired-or of the BTI trigger outputs to the TRACOs for trigger qualification purposes and codes the triggers in a 16-bit string giving all the tracks pointing to the vertex with a position resolution of 16 cm and a quality marker.

**Fig. 9.2:** DTBX first station Mini-Crate location.

Data from the four muon stations in each CMS sector are conveyed using LVDS links towards a Sector Collector (SC) that codes the trigger information (track position, bending angle, quality bits) transmitting it to the Regional Muon Trigger using optical links.

### 9.2.1 Drift Tube Local Trigger Layout

Chamber trigger and readout electronics is lodged in Mini-Crates (MC) mounted on the front side of the chamber, inside the C profile surrounding the honeycomb layer. Fig. 9.2 shows the MC located inside the chamber with front panel removed.

Low and high voltage power supplies are lodged on the balconies in crates powering half a wheel each. The MC trigger and readout data in each sector are sent to the Sector Collector (SC) unit, which is integrated in the MC of the fourth station. Chamber trigger and readout electronics is arranged in 128-channel units; only the first and fourth stations need an additional 32-channel unit to better match the number of chamber channels.

Inside the MC a Drift Tubes Chamber Control Board (DTCCB) takes care of electronics setup, monitoring and of many control and test functions. One TTC receiver chip is lodged on the DTCCB for clock and broadcast signals distribution. Each DTCCB is connected via a dedicated optical fiber to the Drift Tubes Control Master (DTCM) crate in the counting room (see Chapter 9.7). As a back-up channel, a copper cable connects in parallel all MCs belonging to each half wheel to a Drift Tubes Wheel Control Board (DTWCB) sitting on the balconies. Each DTWCB is connected to the DTCM via dedicated optical fibers. MCs are connected to front-ends of both  $\phi$  and  $\theta$  SLs and to alignment and RPC electronics for remote controls via two dedicated I2C buses. MCs are cooled by cold water flowing in tubes extruded in the MC aluminum profile along the full length.



**Fig. 9.3:** Layout of PHITRB128, the 128 channels board for DTBX transverse plane trigger

### 9.2.2 Trigger Boards

Trigger electronics is grouped in 128-channel units. An additional 32-channel unit is needed in the first and fourth stations to match the channel number. There are three different types of boards. Two are for the  $\phi$  view: a 128-channel unit called PHITRB128 and a 32-channel one called PHITRB32. PHITRB128 contains 32 BTIs assembled in 8 multi-chip modules (BTIM) connected to 4 TRACOs and one TSS (Fig. 9.3), while PHITRB32 contains 2 BTIMs, one TRACO and one TSS (Fig. 9.4).

Only a 128-channel unit called THETATRB containing 32 BTIs assembled on 8 BTIMs exists for the  $\theta$  view (Fig. 9.5).

Front-end signals are received through the faced Readout Board, where LVDS to CMOS translators are located, while trigger data are sent to the TS via a fine-pitch 40 lines flat cable. Fine pitch lateral connectors are used for neighboring unit communications of detector signals and of chamber control signals coming from the central Server and Control Unit (SCU).

Trigger Board power supply is internally stabilized to 3.3V by a low drop regulator. The internal regulator has power supply protection circuitry with over and under voltage tolerance up to  $\pm 40\text{V}$  and overcurrent limit with a “retriggerable fuse” function useful for latch-up protection.

Each Trigger Board can be shut off independently thanks to an “intelligent” on board regulator and input signal isolation logic. Local power supply measurement and control is accomplished by the Control Board that takes care of device turn-on and turn-off sequences.



**Fig. 9.4:** Sketch of PHITRB32 layout, the 32 channels board for DTBX transverse plane trigger



**Fig. 9.5:** Layout of THETATRB, the 128 channels board for DTBX longitudinal plane trigger



**Fig. 9.6:** Sketch of the layout of the Server Board in the SCU of DTBX Mini-Crate electronics.

Clock is distributed using Pseudo-ECL signals on twisted pair cables. One clock input with low skew is foreseen per Trigger Board. A low-skew clock distribution tree, consisting of PLLs and multiple drivers, is implemented in each Trigger Board. The maximum allowed skew from TTC output to ASIC input is about 1ns.

Trigger Boards can be tested at many levels using a JTAG bus and on-board self-test features. The Control Board can address a single Trigger Board for JTAG access, up to a maximum of 16 units, in order to reduce the component chain. A production assembly test is foreseen using boundary scan and on-board functional tests using event emulation capabilities integrated in BTI test logic. Once the Trigger board is programmed in Test Mode, emulation data, consisting of the delays in 12.5 ns steps of event hits, are downloaded to all BTIs via JTAG. A trigger command issued via JTAG and internally synchronized with the 40 MHz clock starts the test sequence. Trigger data can be latched at Trigger Board output or read out via JTAG, downloading the content of snap registers integrated in all ASICs. This kind of functional test can be repeated for assembled MCs allowing a full test of trigger functionality without any external test pattern generator.

### 9.2.3 Server Board

There is one Server Board per MC: the Track Sorter Master consisting of three chips is lodged on it. The Server Board is part of the Server and Control Unit.

This unit is powered with three separate 3.3 V lines, one per ASIC, in order to maintain the highest possible redundancy. The power supply lines are protected against overcurrents by precise current monitors with fast shut-off capability.

Server Board clock phase is adjustable with respect to the Trigger Board clocks to take care of signals synchronization. The Server Board layout is shown in Fig. 9.6.

## 9.3 Bunch and Track Identifier

The BTI is directly interfaced to the front-end of the muon chamber system. It generates a trigger when the hits produced in the group of drift tubes crossed by the muon are aligned. The coincidence of these hits happens at a fixed time after the muon has traversed the array of drift tubes, which allows bunch crossing identification. The BTI can extract the full track information (position and direction).

### 9.3.1 Working Principle

The Bunch and Track Identifier is the implementation of a trigger device based on the generalized mean-timer method [9.4]. It was explicitly developed to extend the technique to groups of four layers of staggered drift tubes, aiming at the identification of tracks giving signals in at least three out of the four measurement planes.

This method relies on the fact that the particle path is a straight line and the wire positions along the path (the measurement points) are equidistant. Therefore, considering the drift times of any three adjacent planes of staggered tubes (e.g. cells in layers A, B and C of Fig. 9.7), the relation

$$T_{MAX} = \frac{T_A + 2T_B + T_C}{2}$$

where  $T_{MAX}$  is the maximum drift time to the wire, holds independently of the track impact point and angle of incidence. Actually the BTI digitizes the time  $T_s$  after particle detection at 80 MHz frequency and at every clock count computes the apparent drift time  $T = T_{MAX} - T_s$ , where  $T_{MAX}$  is a programmable parameter depending on drift velocity. This calculation gives the real drift times only at the time  $T_{MAX}$  after particle crossing. Therefore the digitized times have values satisfying the relation only at that clock count, while the relation does not hold true at any other time.

This constant time difference, between the particle crossing and the detection of the relation validity, allows the identification of the parent bunch crossing.

In fact it occurs that at the time  $T_{MAX}$  after muon crossing the drift times are aligned: i.e. the hits form an image of the muon track, thus allowing the extraction of the full track information (track impact position and direction).
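The behaviour of the mean-timer relation can be illustrated numerically. The following is a minimal sketch, not the BTI implementation: it assumes a hypothetical 21 mm maximum drift distance (half-cell width) and a perfectly linear drift, while the 56 μm/ns drift velocity and the 13 mm plane spacing are the values quoted in Section 9.3.2. All function names are illustrative.

```python
# Minimal sketch of the generalized mean-timer relation for three staggered layers.
# Assumptions (not from the text): linear drift and a 21 mm maximum drift distance.
DRIFT_VELOCITY = 0.056   # mm/ns, i.e. 56 um/ns
HALF_CELL = 21.0         # mm, assumed maximum drift distance
LAYER_SPACING = 13.0     # mm, distance between wire planes
T_MAX = HALF_CELL / DRIFT_VELOCITY  # maximum drift time in ns


def drift_times(x0, tan_psi):
    """Drift times (ns) in layers A, B, C for a straight track.

    x0 is the track position (mm) in layer B. The wires of layers A and C sit
    at x = 0, the wire of the staggered layer B at x = HALF_CELL. The track is
    assumed to stay between x = 0 and x = HALF_CELL in all three layers.
    """
    wires = (0.0, HALF_CELL, 0.0)
    heights = (-LAYER_SPACING, 0.0, LAYER_SPACING)
    return tuple(abs(x0 + y * tan_psi - w) / DRIFT_VELOCITY
                 for y, w in zip(heights, wires))


def mean_timer(t_a, t_b, t_c):
    """Right-hand side of the mean-timer relation."""
    return (t_a + 2.0 * t_b + t_c) / 2.0


def apparent_times(true_times, t_since_crossing):
    """Apparent drift times T = T_MAX - T_s, with T_s the time since each hit."""
    return tuple(T_MAX - (t_since_crossing - t) for t in true_times)


if __name__ == "__main__":
    times = drift_times(x0=8.0, tan_psi=0.2)
    print("true drift times:", mean_timer(*times), "vs T_MAX =", T_MAX)
    # The relation is satisfied by the apparent times only at T_MAX after crossing.
    for t in (T_MAX - 12.5, T_MAX, T_MAX + 12.5):
        print(t, mean_timer(*apparent_times(times, t)))
```

In this toy model the mean-timer value computed from the apparent times moves away from T_MAX by twice the time offset, so a single 12.5 ns sampling step is already enough to break the equality.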

Extending the method to four layers implies that the bunch crossing identification is possible even if the drift time of a tube is missing, due to inefficiency, or wrong, due to the emission of a  $\delta$ -ray masking the good hit, since there are still three useful cells giving the minimum requested information. The mean-timer method is also insensitive to all uncorrelated single hits and is therefore well suited to a high radiation environment.

### 9.3.2 Algorithm Description

Each BTI is connected to nine wires allocated as shown in Fig. 9.7. Each SL is equipped with one BTI every four wires, so adjacent BTIs overlap by five wires, i.e. the next BTI starts from the cell labeled 5 in Fig. 9.7. This overlap ensures the redundancy needed to limit the inefficiency in case of a BTI failure.

The evaluated parameters are the position, computed in the SL centre, and the angular  $k$ -parameter  $k = h \tan \psi$ , with  $\psi$  being the angle of the track with respect to the normal to the chamber plane in the transverse projection and  $h = 13\text{mm}$  being the distance between the wire planes.

The actual BTI candidate track finding algorithm evaluates several track pattern hypotheses in parallel: a pattern is identified by a sequence of wire numbers and labels stating whether the track crosses the tube on the right or on the left of the given wire (e.g. in Fig. 9.7 the track corresponds to the pattern 5L3R6L4R). Any given pattern includes six couples of planes (AB, BC, CD, AC, BD, AD), each one providing a measurement of the position (through an *x-equation*) and of the  $k$ -parameter (through a *k-equation*) of the track.

The equations are computed at every cycle using the hit arrival time with 12.5 ns resolution. At any clock cycle the value of a *k-equation* corresponds to a rough measurement of the track direction at that cycle and is time dependent. Therefore each couple included in a pattern gives its own measurement of the track direction at every clock cycle: the hits are aligned when, after applying a couple dependent proportional factor, the values of the k-parameter computed for each couple are equal.

**Fig. 9.7:** BTI geometric layout showing the channels allocation and important parameters.

Hence at every clock cycle the whole set of *k-equations* is computed and a BTI trigger is generated if at least three of the six k-parameters associated to any of the patterns are in coincidence. The coincidence of the *k-equations* values is verified within a programmable tolerance window. This tolerance is defined according to the resolution of each couple that in turn depends on the distance between the wires and was chosen to allow a maximum cell linearity error equivalent to 25ns. The coincidence allows the bunch crossing identification owing to the time-dependence of the *k-equations* value.

If there is a coincidence of all the six k-parameters, the trigger corresponds to the alignment of four hits and it is marked as High Quality Trigger (HTRG), while in any other case, with a minimum of three coincident k-parameters, it is due to the alignment of only three hits and it is marked Low Quality Trigger (LTRG). The angular resolution is track pattern dependent and is generally worse for LTRGs.

If several track patterns give a response, the HTRG is chosen as the triggering track pattern. If there is more than one HTRG or the triggers are all LTRGs, the first one, in an arbitrarily defined order, is selected.
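The quality assignment and the choice among competing patterns described above can be sketched as follows. This is an illustrative model only: it assumes that the couple dependent proportional factors have already been applied to the k-values, it uses a single tolerance for all couples (the chip uses programmable, couple dependent tolerances), and the function names and data layout are hypothetical.

```python
def classify_pattern(k_values, tolerance):
    """Return 'H', 'L' or None for the six k-equation values of one pattern.

    'H' if all six values coincide within the tolerance, 'L' if at least
    three do, None otherwise. Values are assumed already normalized.
    """
    if len(k_values) != 6:
        raise ValueError("a pattern provides six k-equation values")
    ordered = sorted(k_values)
    best = 0
    start = 0
    # Sliding window over the sorted values: largest group whose spread
    # does not exceed the tolerance.
    for end in range(len(ordered)):
        while ordered[end] - ordered[start] > tolerance:
            start += 1
        best = max(best, end - start + 1)
    if best == 6:
        return "H"
    if best >= 3:
        return "L"
    return None


def select_trigger(patterns, tolerance):
    """Pick one triggering pattern from an ordered list of (name, k_values).

    A high quality pattern is preferred; among equal qualities the first one
    in the (arbitrarily defined) list order wins.
    """
    results = [(name, classify_pattern(ks, tolerance)) for name, ks in patterns]
    for wanted in ("H", "L"):
        for name, quality in results:
            if quality == wanted:
                return name, quality
    return None


if __name__ == "__main__":
    patterns = [
        ("pattern_1", [10, 10, 11, 30, 40, 50]),  # three coincident values
        ("pattern_2", [5, 5, 5, 5, 5, 5]),        # six coincident values
    ]
    print(select_trigger(patterns, tolerance=1))
```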

The request of the alignment of any three hits is a substantial source of background, since it introduces effects that create false triggers. There is a probability that the alignment of four hits at some clock step produces the alignment of only three of them at the step just before or after the HTRG signal, thus generating *ghost* LTRG candidate tracks. There is also some probability that a random LTRG occurs at any clock step with an arbitrary k-parameter because of the left-right ambiguity, which doubles the possible choices for every hit. Finally, the  $\delta$ -rays produced inside the cell give a wrong time measurement, enhancing the probability of an out-of-time trigger generated using the wrong measurement.

The noise reduction of the former kind of *ghosts* is obtained by issuing the LTRG signal only if no HTRG is generated at the neighbouring steps: this mechanism is called Low Trigger Suppression (LTS). The noise reduction of the latter kind of *ghosts* is obtained by acting on tolerances in the association phase of the following stages of the trigger. These algorithms do not add any latency to the BTI flow.

The impact position of the muon does not enter the track selection algorithm: it is computed only at the end of the process and only for the selected triggering pattern.

Position and angular resolution depend on the drift velocity and on the sampling frequency of the device. The drift velocity without magnetic field and for the gas mixture foreseen in the drift chambers (Ar/CO<sub>2</sub>-85/15) is 56 $\mu$m/ns and the sampling frequency is 80 MHz. Under these conditions the angle is measured with a least count better than 60 mrad and the position is measured with a least count of 1.4 mm.
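As a rough cross-check of these numbers, the drift distance covered in one sampling period can be computed from the quoted drift velocity and sampling frequency. The sketch below is only an order-of-magnitude estimate: interpreting the angular least count as one drift step over the 13 mm plane spacing (small-angle approximation) is an assumption, and the exact derivation of the 1.4 mm position least count, which equals twice the drift step, is not spelled out here.

```python
# Cross-check of the BTI least counts (illustrative arithmetic only).
DRIFT_VELOCITY_UM_PER_NS = 56.0   # from the text
SAMPLING_FREQUENCY_MHZ = 80.0     # from the text
PLANE_SPACING_MM = 13.0           # h, distance between wire planes

clock_period_ns = 1000.0 / SAMPLING_FREQUENCY_MHZ
drift_step_mm = DRIFT_VELOCITY_UM_PER_NS * clock_period_ns / 1000.0

# Assumption: one drift step over the plane spacing, small-angle approximation.
angle_step_mrad = 1000.0 * drift_step_mm / PLANE_SPACING_MM

if __name__ == "__main__":
    print(f"clock period   : {clock_period_ns:.1f} ns")
    print(f"drift step     : {drift_step_mm:.2f} mm")
    print(f"angle step     : {angle_step_mrad:.1f} mrad")
    print(f"2 x drift step : {2 * drift_step_mm:.2f} mm")
```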

With the present geometric parameters of the chamber the BTI equations are fully covering the angular range up to  $\Psi_{MAX} = \pm 45^\circ$ .

### 9.3.3 Hardware Implementation

The Bunch and Track Identifier was implemented in an ASIC, the best compromise between cost and performance. A block scheme of the BTI chip is shown in Fig. 9.8.

The *Input Shapers* block is the interface to the 9 discriminated wire signals coming from the analog front-ends. An input latch is triggered by the rising edge of the signal, accepting a minimum pulse width of 3ns. Outputs are stretched to a programmable duration; during this time the input shaper is not retriggerable in order to reject high frequency double pulses. This parameter is close to  $T_{MAX}$  and must be set according to drift velocity to minimize tube dead time.

In the *Pattern Logic* blocks all track patterns are evaluated in parallel. For each wire couple the Equations Counters compute the k-parameter and the position of the crossing track according to the reference system introduced in Fig. 9.7.

In the *Pattern Comparator* block the parent bunch crossing is identified by looking for a match of the k-parameters computed by the six equation counters relative to the four crossed tubes. The output is the matching value K, the position X and the quality bit H/L. The quality bit allows the next device in the trigger chain to distinguish between clean tracks (alignment of four hits, i.e. six coincident k-parameters) and potentially wrong triggers (three out of four aligned hits, i.e. at least three coincident k-parameters), among which most of the *ghost triggers* fall.

If several track patterns give a trigger at the same time, only one is selected by the *Priority Logic*, preferring a high quality track and using a default encoded order in case of multiple triggers with the same quality.

The BTIs send their data to the TRACOs, which associate the track segments and perform noise rejection. Each BTI of the chamber outer SL is interfaced to three TRACOs. To reduce the probability of generating multiple triggers from a single track by transmitting three copies of it, an angular filter was included and programmed to send to each TRACO only those tracks that could be fully contained in it. The trigger strobe is split into three signals, each activated only if the track k-parameter is within the programmable window of the corresponding TRACO. The quality bit line and the k-parameter/position bus are common to the three TRACOs. The BTI trigger output bus consists of the three trigger strobes, the quality bit and the 6-bit data bus where k-parameter and position are multiplexed at the clock speed.

In order to reduce the number of *ghost triggers*, the cancellation of LTRGs in the range (-1 bunch crossing, +8 bunch crossings) around a high quality trigger (Low Trigger Suppression) is a very efficient solution. The LTS logic block is inexpensive in terms of latency because it was inserted in the *Output Filter* pipeline, in parallel with the window comparator logic.
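The effect of the Low Trigger Suppression can be sketched in software as a filter over a sequence of BTI triggers. This is a behavioural illustration, not the pipelined hardware logic; treating both ends of the (-1, +8) range as inclusive and representing triggers as a mapping from bunch crossing number to quality are assumptions, and the function name is hypothetical.

```python
def low_trigger_suppression(triggers, before=1, after=8):
    """Remove LTRGs lying close to an HTRG.

    triggers maps a bunch crossing number to 'H' or 'L'. An LTRG at bunch
    crossing bx is cancelled if some HTRG at bx_h satisfies
    bx_h - before <= bx <= bx_h + after (inclusive range assumed).
    """
    high_bxs = [bx for bx, quality in triggers.items() if quality == "H"]
    kept = {}
    for bx, quality in triggers.items():
        near_high = any(h - before <= bx <= h + after for h in high_bxs)
        if quality == "L" and near_high:
            continue  # ghost candidate: suppressed
        kept[bx] = quality
    return kept


if __name__ == "__main__":
    raw = {8: "L", 9: "L", 10: "H", 11: "L", 18: "L", 19: "L"}
    print(low_trigger_suppression(raw))
```

In this example the LTRGs at bunch crossings 9, 11 and 18 fall inside the suppression range of the HTRG at bunch crossing 10, while those at 8 and 19 survive.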

The *Control Logic* block contains a JTAG interface and a bidirectional parallel interface.

Details on the integrated circuit design and characteristics are reported in [9.5].

**Fig. 9.8:** BTI block scheme.

## 9.4 Track Correlator

The BTI is followed in the electronics chain by a Track Correlator (TRACO) [9.6], which associates portions of tracks in the same chamber by relating predefined groups of BTIs. The TRACO interconnects the two SLs of the  $\phi$  view. It receives the information from the BTI devices connected to it and tries to find the pair of BTI track segments that best fits a single track, linking the inner layer candidates to the outer layer ones.

The introduction of this device is necessary since the BTI is intrinsically a noisy device and therefore a local preselection and a quality certification of the BTI triggers are required. Furthermore, the number of BTIs per chamber is of the order of a few hundred and it is not possible to connect all the channels together to perform a preselection at chamber level.

### 9.4.1 Algorithm Description

The number of BTIs connected to a TRACO is limited by the size of the chip and is determined by the acceptance requirement. The current design connects four BTIs of the inner SL to twelve BTIs of the outer SL allocated as shown in Fig. 9.9, ensuring full coverage up to  $\Psi_{MAX}$ .

The algorithm starts selecting, among all the candidates in the inner SL and the outer SL independently, the best track segment, according to preferences given to the trigger quality (H/L) and to the track proximity to the radial direction to the vertex (i.e. its  $p_T$ ).

Then it computes the k-parameter and the position of a correlated track candidate. The compatibility between the k-parameters of the selected track segments in the inner and outer SLs and the correlated track is checked against a programmable tolerance.

The internal parameters computed for the correlated tracks are:

$$\begin{cases} k_{COR} = \frac{D}{2} \tan \psi = x_{inner} - x_{outer} \\ x_{COR} = \frac{(x_{inner} + x_{outer})}{2} \end{cases}$$
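The correlation step can be sketched with floating point numbers as follows. This is a simplified illustration rather than the TRACO logic: the numerical value of D is hypothetical, the rescaling of the BTI k-parameters by (D/2)/h before comparison is one possible way to implement the compatibility check, and a single tolerance is assumed for both superlayers.

```python
H_MM = 13.0    # distance between wire planes inside a superlayer (from the text)
D_MM = 235.0   # hypothetical value of the constant D in the formula above


def correlate(x_inner, k_inner, x_outer, k_outer, tolerance):
    """Return (k_cor, x_cor, correlated) for one inner/outer segment pair.

    Positions are in mm; k_inner and k_outer are BTI k-parameters
    (h * tan(psi)) in mm. The correlated flag is True when both BTI
    k-parameters, rescaled to correlator units, agree with k_cor within
    the tolerance (assumed criterion).
    """
    k_cor = x_inner - x_outer
    x_cor = (x_inner + x_outer) / 2.0
    scale = (D_MM / 2.0) / H_MM  # converts h*tan(psi) into (D/2)*tan(psi)
    correlated = all(abs(k * scale - k_cor) <= tolerance
                     for k in (k_inner, k_outer))
    return k_cor, x_cor, correlated


if __name__ == "__main__":
    # Track with tan(psi) = 0.1: k_BTI = 1.3 mm, expected k_cor = 11.75 mm.
    print(correlate(111.75, 1.3, 100.0, 1.3, tolerance=1.0))
    # Outer segment with an incompatible direction: correlation fails.
    print(correlate(111.75, 1.3, 100.0, 3.0, tolerance=1.0))
```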



**Fig. 9.9:** Track Correlator layout.



**Fig. 9.10:** Definition of TRACO output parameters.

Owing to the long lever arm between the two SLs, the angular resolution of a correlated track candidate is 10 mrad for the nominal drift velocity, thus significantly improving the BTI precision, while the resolution on the position remains unchanged.

These parameters are converted, using programmable look-up tables, to the chamber reference system: position is transformed to radial angle  $\phi$  and k-parameter to bending angle  $\phi_b$  as defined in Fig. 9.10. The chosen track is forwarded to the chamber TS, for further selection.

If the correlation fails the correlator forwards an uncorrelated track following a preference list that includes the parent SL (IN/OUT) and the quality bit (H/L) of the two tracks selected for correlation.

If no correlation is possible since there is no candidate in one SL, the uncorrelated track is anyway forwarded.

The track is output on a bus, using 10 bits for the bending angle and 12 bits for the radial angle, and is accompanied by three quality bits identifying HH, HL, LL, H<sub>i</sub>, H<sub>o</sub>, L<sub>i</sub>, L<sub>o</sub> track candidates (H/L denotes the trigger quality and i/o the inner/outer SL; see Table 9.1).

A further preference selection can be activated to connect the trigger generated in the  $\phi$  view to the triggers generated from the BTIs in the  $\theta$  view. In particular, since the noise generated from the BTI algorithm is of LTRG quality, a programmable coincidence between the two views is foreseen to certify the uncorrelated LTRGs.

In order to allow the identification of two muons inside the same correlator, the same algorithm is applied twice to the data received from the BTI. Therefore sometimes a second track is forwarded to the chamber TS. The preferences for the choice of the First Track and of the Second Track are programmable completely independently, although we believe that the same criteria should apply.

A further selection is needed when more than one TRACO inside a chamber gives a trigger. The communication between the TRACOs and the chamber TS that allows this decision uses Preview information, in order to minimize the time needed for the calculations of the whole trigger chain. A copy (called Preview) of one of the candidates chosen for correlation is sent to the TS according to the programmed H/L and IN/OUT selection flags, before starting any correlation calculation. The TS selection is based on the quality of the Preview (given by the BTI resolution) of the various candidates.

### 9.4.2 Hardware Implementation

The block diagram of the TRACO operations is given in Fig. 9.11. In the following paragraphs we shall describe the TRACO algorithm referring to the flow of this diagram.

There are four *data flows* inside the TRACO: two track calculation flows and two track Preview flows.

In order to allow the identification of two muons inside the same correlator, the TRACO algorithm is applied twice to the data received from the BTI. Therefore inside the TRACO there are two parallel flows delayed by one cycle: the first path computes a First Track, choosing among all the BTI candidates, while the delayed path computes a Second Track from all unused candidates. The preferences for the choice of the First Track and of the Second Track, described in detail later, are programmable completely independently, although in principle we believe that the same criteria should apply.

A further selection is needed when more than one TRACO inside a chamber gives a trigger. The communication between the TRACOs and the chamber TS that allows this decision uses a dedicated Preview data bus for each track, in order to minimize the time needed for the calculations of the whole trigger chain. A copy of the k-parameter of one of the candidates chosen for correlation is sent to the TS according to the programmed H/L and IN/OUT selection flags. The TS selection is based on the quality of the Preview of the various candidates. The Preview data are coded in 9 bits: five bits for the modulus of the k-parameter, one bit for the track quality (H/L), one bit identifying First/Second track, one bit identifying Inner/Outer layer and one bit identifying Correlated/Uncorrelated track candidate.
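The 9-bit Preview word can be illustrated with a small packing routine. The field widths follow the description above, but the bit positions and the polarity of each flag are assumptions made only for this sketch, and the function names are hypothetical.

```python
def pack_preview(k_modulus, high_quality, second_track, outer_layer, correlated):
    """Pack the Preview fields into a 9-bit integer.

    Assumed layout (not from the text): bits 0-4 hold |k|, bit 5 is the H/L
    quality, bit 6 First/Second track, bit 7 Inner/Outer layer,
    bit 8 Correlated/Uncorrelated.
    """
    if not 0 <= k_modulus < 32:
        raise ValueError("the k modulus must fit in five bits")
    word = k_modulus
    word |= int(bool(high_quality)) << 5
    word |= int(bool(second_track)) << 6
    word |= int(bool(outer_layer)) << 7
    word |= int(bool(correlated)) << 8
    return word


def unpack_preview(word):
    """Inverse of pack_preview, returning the fields as a dictionary."""
    return {
        "k_modulus": word & 0x1F,
        "high_quality": bool((word >> 5) & 1),
        "second_track": bool((word >> 6) & 1),
        "outer_layer": bool((word >> 7) & 1),
        "correlated": bool((word >> 8) & 1),
    }


if __name__ == "__main__":
    word = pack_preview(17, high_quality=True, second_track=False,
                        outer_layer=True, correlated=False)
    print(f"{word:09b}", unpack_preview(word))
```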

The *Input Register* (16 × 8 bits) receives and latches the data values and the qualification flags from the BTI chip. The TRACO collects the inputs from 16 BTIs (four from the inner layer and twelve from the outer layer).

The input data bus from each BTI contains the k-parameter and the position in the BTI coordinate system, multiplexed at 80 MHz on the same lines (6 bits wide). Two extra flags are provided: the trigger quality (H/L) and the strobe.



**Fig. 9.11:** TRACO block scheme.

The *Angle and Position Converter* module receives the 6-bit k-parameter input word from the BTI and converts it into local radial coordinates ( $k_{\text{local}} = k_{\text{BTI}} - \text{RAD} - \text{offset}$ ). The RAD parameter is a 6-bit loaded value that depends on the geographical position of the correlator, i.e. the k-parameter of its center. The offset is a programmed value, dependent on drift velocity, that the BTI needs for its calculations. The converted angle is used for internal calculations and sent on the Preview bus to the TS for further track selection.

Each BTI position is converted to TRACO position, offsetting by the appropriate geographical value. An additional SL shift parameter is provided to correct for possible construction misalignments of the two SLs.

The *Sorter* module receives the converted angle and selects the candidate with the smallest angle, i.e. the angle closest to the local radial direction. There is one sorter for the four inner BTIs and another for the twelve outer BTIs, hence the choice is made twice, independently, on the two  $\phi$  SLs. The sorting operation can be programmed to select the biggest angle instead of the smallest one, and/or to give preference to candidates tagged with the HTRG quality flag.

Two other sorters for the Second Track path exist.
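The angle conversion and the sorter selection can be sketched as below. The sketch is illustrative: candidates are represented as hypothetical (k_local, quality) tuples, and the rule that the HTRG preference, when enabled, takes precedence over the angle comparison is an assumption about how the two programmable options combine.

```python
def to_local_k(k_bti, rad, offset):
    """Convert a BTI k-parameter to local radial coordinates."""
    return k_bti - rad - offset


def sort_candidates(candidates, prefer_high=True, select_largest=False):
    """Select one candidate from a list of (k_local, quality) tuples.

    By default the candidate with the smallest |k_local| wins, i.e. the one
    closest to the local radial direction. With prefer_high, 'H' candidates
    are ranked before 'L' ones (assumed to take precedence over the angle).
    With select_largest, the largest |k_local| is chosen instead.
    """
    if not candidates:
        return None

    def rank(candidate):
        k_local, quality = candidate
        angle = abs(k_local)
        angle_key = -angle if select_largest else angle
        quality_key = 0 if (prefer_high and quality == "H") else 1
        return (quality_key, angle_key)

    return min(candidates, key=rank)


if __name__ == "__main__":
    inner = [(to_local_k(40, 32, 3), "L"), (-2, "L"), (8, "H")]
    print(sort_candidates(inner))                     # HTRG preferred
    print(sort_candidates(inner, prefer_high=False))  # smallest angle only
```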

The *Calculator and Comparator* module computes the k-parameter and the position of the correlated candidate. It transforms the inner and outer k-parameters of the two independently selected track segments into the correlator coordinate system and computes the correlated track parameters. The angular resolution of a correlated track candidate is 10 mrad for the nominal drift velocity, thus improving the BTI value, while the resolution on the position is unchanged.

A second step compares the three candidates (single inner, single outer, correlated) to set the correlated flag. If the correlated track fits inside the programmed acceptance window this flag is raised.

The *Priority Selector and Preview Selector* module selects one of the candidates according to the programmed preferences.

If the correlation was successful the priority selector chooses the correlated candidate and forwards its parameters to the further stages. If the correlation fails the correlator creates an uncorrelated track following a preference list that includes the parent superlayer (IN/OUT) and the quality bit (H/L) of the two candidate tracks. If no correlation is possible since there is no candidate in one Superlayer, the existing uncorrelated track is still accepted.

The trigger quality H/L and the IN/OUT preference selections are chosen according to the following programming masks: high level trigger mask (HTMSK), low level trigger mask (LTMSK), Superlayer mask (SLMSK).

The First Track priority selector or the Second Track priority selector treats the best inner candidate, the best outer candidate and the correlated candidate considering the trigger quality and all the masks. An equivalent priority selector is implemented for the Preview path. The Preview priority selector treats the best inner and best outer candidates, but only considers LTMSK and SLMSK.

A further preference selection can be activated to connect the trigger generated in the $\phi$ view to the triggers generated from the BTIs in the $\theta$ view. When this preference is active, a programmable coincidence between the two views is used to validate the uncorrelated triggers.

**Table 9.1:** Codes for TRACO output track quality identifier.

| Description                                                   | Symbol         | Code |
|---------------------------------------------------------------|----------------|------|
| HTRG on inner and outer layer                                 | HH             | 6    |
| HTRG on inner or outer layer and LTRG on inner or outer layer | HL             | 5    |
| LTRG on inner and outer layer                                 | LL             | 4    |
| HTRG on outer layer                                           | H <sub>o</sub> | 3    |
| HTRG on inner layer                                           | H <sub>i</sub> | 2    |
| LTRG on outer layer                                           | L <sub>o</sub> | 1    |
| LTRG on inner layer                                           | L <sub>i</sub> | 0    |
| Null track                                                    |                | 7    |

In particular, since the noise generated from the BTI algorithm is concentrated in the LTRGs, this coincidence is requested by default for the LTRGs and it is optional for the HTRGs.

The priority selector sends only one candidate to the output bus and generates a three-bit quality code, as shown in Table 9.1.
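The encoding of Table 9.1 can be written down as a small lookup function. This is only a sketch: the function name and the `'H'`/`'L'`/`None` argument convention are assumptions made for illustration, with both arguments given for a correlated track and only one for an uncorrelated track.

```python
from typing import Optional


def quality_code(inner: Optional[str] = None, outer: Optional[str] = None) -> int:
    """Return the 3-bit TRACO quality code of Table 9.1.

    inner / outer: 'H' (HTRG), 'L' (LTRG) or None if that superlayer
    does not contribute to the output track.
    """
    for flag in (inner, outer):
        if flag not in (None, "H", "L"):
            raise ValueError(f"unexpected trigger flag: {flag!r}")

    if inner and outer:                 # correlated track
        flags = {inner, outer}
        if flags == {"H"}:
            return 6                    # HH
        if flags == {"L"}:
            return 4                    # LL
        return 5                        # HL
    if outer:                           # uncorrelated, outer layer only
        return 3 if outer == "H" else 1
    if inner:                           # uncorrelated, inner layer only
        return 2 if inner == "H" else 0
    return 7                            # null track
```

All eight values fit in the three quality bits carried on the output bus.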

Two candidates are needed to fit a correlated track, one from the inner superlayer and another from the outer one. If the correlated track does not satisfy the acceptance value, one of the track candidates selected to try the correlation is forwarded as the First Track choice and the other can be reused for the Second Track calculations. This task is performed in the *Recycling unused candidates* module. This feature can be software disabled.

The two selected tracks are output on the same bus at consecutive bunch crossings. It is therefore possible that a Second Track from bunch crossing $n$ is computed at the same time as a First Track from bunch crossing $n+1$. A First Track always has priority on the output bus and therefore takes the place of the Second Track from bunch crossing $n$. The *Mixer* block performs this choice and activates a flag if an overlap occurs.

Inside the *Coordinate converter and bending angle calculation* block the internally calculated position and k-parameter are converted to the chamber reference system: position is transformed to radial angle $\phi$ and k-parameter to bending angle $\phi_b$ as defined in Fig. 9.10.

This task is performed with direct access to two programmable look-up tables. The first look-up table is used for conversion of the local correlator position coded in 9 bits to the track radial angle  $\phi$  coded in 12 bits. The second performs the conversion from the k-parameter coded in 10 bits to the angle  $\psi$  coded in 10 bits. A further block performs the computation of the bending angle  $\phi_b = \phi - \psi$ .

Some filtering functions are performed in the *Quality Filter* block to select the output value driven to the chamber server. These functions include an uncorrelated Low Trigger Suppression and a programmable tolerance window for the bending angle output value. The filters will be discussed in detail in Chapter 9.11.2.

The selected track is output on a bus, using 10 bits for the bending angle and 12 bits for the radial angle and it is accompanied by three quality bits identifying HH, HL, LL,  $H_i$ ,  $H_o$ ,  $L_i$ ,  $L_o$  track candidates.

The data output bus provides one track at each clock cycle, with up to two tracks per bunch crossing at consecutive clock cycles. The latency for the First Track is five cycles, while Second Tracks are output after six cycles.

## 9.5 Trigger Server

The TS [9.7] has to select the two best trigger candidates among the track segments selected by all TRACOs in a muon station and send them to the Sector Collector, from where they are forwarded to the Regional Muon Trigger.

The TS has to fulfill the following requirements:

- since a lot of interesting physics has two close muons that can hit the same muon station, some emphasis should be put on the efficiency and purity of both selected segments. The selection should be based on both the bending angle and the quality of the track segment and selection priorities should be configurable.
- it has to reject fakes generated by TRACOs using a configurable fake rejection algorithm.
- the processing time must be independent of the number of TRACOs in the station.
- the TS should be able to treat pile-up events.
- since only one TS is mounted on each station, it represents the bottleneck of the on-chamber trigger devices and therefore it should have built-in redundancy.

The TS is composed of two subsystems: one for the transverse view (TS$\phi$) and the other for the longitudinal view (TS$\theta$). The TS$\theta$ has to detect triggers produced by the 64 BTIs which equip the SL in the $\theta$ view. This information is sent to the TRACOs of the $\phi$ view and can be used there as a validation of a trigger in order to reduce background. In addition, a pattern of the track segments found in the longitudinal view has to be sent to the Regional Muon Trigger [9.8]. The TS$\theta$ consists of groups of ORs of BTI hits.

The number of TRACOs in a station can be quite large: as many as 25 for the largest station. Each TRACO transmits its two best tracks to the TS$\phi$ serially in two consecutive bxs, ordered by quality. In order to minimize the latency of the TRACO-TS$\phi$ system, the TRACOs send Previews of track segments. While the TS$\phi$ makes its selection (in pipeline), the TRACOs compute the full track parameters (absolute coordinates with higher precision). The TS$\phi$ then serially reads the full track parameters of the two best tracks from the corresponding TRACOs, and sends them to the Sector Collector. With this mechanism 2 bxs are gained and the total latency of the TRACO-TS$\phi$ system is limited to 6 bxs.

We define *bunch1* and *bunch2* as, respectively, the first and the second bunch of tracks arriving from the TRACOs connected to the TS$\phi$. The sorting algorithm would be simple if it just selected, independently, the best track of *bunch1* and the best one of *bunch2*. However, it is not assured that the best track of *bunch2* represents the second-best track among all *bunch1* and *bunch2* tracks. The TS$\phi$ must find the truly second-best track among all *bunch1* and *bunch2* tracks. To achieve this, the TS$\phi$ selects, among the tracks of *bunch1*, the first-best-track (FBT) and the second-best-track (SBT). On the following bx, the search for the best is done among the *bunch2* tracks and the SBT of the previous bx (*carry*). The sorting algorithm is therefore applied in pipeline at each bunch. In this way the truly two best tracks among all the possible tracks are found. For normal triggers, the TS$\phi$ is able to sort the two best tracks within two bxs. In case of pile-up triggers, the TS$\phi$ is able to provide to the Sector Collector at least the FBT data resulting from the sorting of *bunch1*. Two close muons are likely to produce two track segments in *bunch1* from different TRACOs in the same station, which are correctly picked up by the TS$\phi$ algorithm through the *carry*.
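The role of the *carry* can be made concrete with a short Python sketch. It assumes, for illustration only, that every track is reduced to an integer sort key for which a smaller value means a better track; the function names are hypothetical and the real device works in pipeline on hardware words.

```python
from typing import Iterable, Optional, Tuple


def best_two(words: Iterable[int]) -> Tuple[Optional[int], Optional[int]]:
    """Return (minimum, next-to-minimum); None marks a missing entry."""
    ordered = sorted(words)
    first = ordered[0] if ordered else None
    second = ordered[1] if len(ordered) > 1 else None
    return first, second


def sort_two_bunches(bunch1: Iterable[int], bunch2: Iterable[int],
                     use_carry: bool = True) -> Tuple[Optional[int], Optional[int]]:
    """Two-cycle selection: FBT from bunch1, SBT from bunch2 plus the carry."""
    fbt, carry = best_two(bunch1)        # first cycle: best track and carry
    pool = list(bunch2)
    if use_carry and carry is not None:  # second cycle: carry competes with bunch2
        pool.append(carry)
    sbt, _ = best_two(pool)
    return fbt, sbt
```

For `bunch1 = [3, 1, 8]` and `bunch2 = [5, 9]` the sketch returns `(1, 3)`: the second-best track overall comes from *bunch1* and would be lost (5 would be chosen instead) if the carry were disabled.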

The TS $\phi$  logic diagram is shown in Fig. 9.12. The selection algorithm uses a two layer cascade of processing units. This architecture was chosen in order to minimize the number of logic cells within a unit and the amount of I/O between blocks[9.7]. In each unit, a parallel minimum and next-to-minimum search is performed over a small group of input words, using 2 by 2 fast 9-bits comparators. The fully parallel approach guarantees a fixed time response, independent of the number of TRACOs in a station. Each unit of the first layer (TSS: Track Sorter Slave) processes up to four data words, while the second layer unit (TSM: Track Sorter Master) processes up to seven data words.

The TS $\theta$  is formed by two identical units (TST), which form the OR of groups of BTIs: the information about the presence of a trigger in the  $\theta$  view is sent to the TRACOs via the TSSs and can be used as trigger validation in the  $\phi$  view. Besides a pattern of trigger hits is sent to the TSM and forwarded to the Sector Collector.

As shown in Fig. 9.12, the hardware partitioning of the system matches with the logical blocks. Each TSS device is mounted on a board (PHITRB128) which contains 4 TRACOs and 32 BTIs. The TSM system is composed of three devices mounted on a separate board (Server Board) which receives the output of at most 7 TSSs (for the largest chamber). The two TST devices are mounted on two separate boards (THETATRB).

The control and monitoring of the system is possible through a JTAG serial line which links the TSM, the TSSs, the TRACOs and BTIs. In case of failure of this link a backup solution is provided: a Parallel Interface which uses the lines normally dedicated to data transmission.

### 9.5.1 Track Sorter Slave

The main tasks performed by the TSS [9.9] are the sorting of the Preview data coming from the four TRACOs placed on the same board and the suppression of the noise generated by them.

The Preview consists of a 9-bit word: 4 bits are reserved for the quality of the track (First/Second Track choice, H/L trigger, Correlation, Inner/Outer SL), the other 5 bits are used for the bending angle. The best track is the one with the best quality and the smallest angle (which means higher transverse momentum). If the quality bits are correctly coded, the search for the best track is a search for the minimum. In one bx the TSS is able to activate a select line addressing the TRACO which sent the best Preview: the TRACO will send the corresponding full track parameters to the TSM for further processing. At the same time the best Preview is also sent to the TSM for the second stage processing.
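The statement that a suitably coded Preview turns the search for the best track into a search for the minimum can be illustrated as follows. The bit layout used here (a 4-bit quality rank in the most significant bits, with 0 as the best quality, followed by the 5-bit angle) is an assumption made for this sketch; the actual coding of the four quality bits is not specified in this section.

```python
from typing import Sequence, Tuple

QUALITY_BITS = 4   # assumed: quality rank, 0 = best
ANGLE_BITS = 5     # bending angle magnitude


def pack_preview(quality_rank: int, angle: int) -> int:
    """Build a 9-bit preview word with the quality rank in the upper bits."""
    if not 0 <= quality_rank < (1 << QUALITY_BITS):
        raise ValueError("quality rank does not fit in 4 bits")
    if not 0 <= angle < (1 << ANGLE_BITS):
        raise ValueError("angle does not fit in 5 bits")
    return (quality_rank << ANGLE_BITS) | angle


def unpack_preview(word: int) -> Tuple[int, int]:
    """Split a preview word back into (quality_rank, angle)."""
    return word >> ANGLE_BITS, word & ((1 << ANGLE_BITS) - 1)


def best_preview(words: Sequence[int]) -> int:
    """Index of the best preview: a plain minimum search over the words."""
    return min(range(len(words)), key=words.__getitem__)
```

With this layout any word of better quality compares lower than any word of worse quality, whatever the angles, and the angle only decides among words of equal quality.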



**Fig. 9.12:** Trigger Server architecture.

There are two kinds of ghost segments that the TSS is able to recognize:

- Due to the geometry of the TRACO acceptance, a TRACO adjacent to the one that sends the best segment can send a copy of the same track, which is necessarily an Outer segment. This kind of ghost can be cancelled by removing *carry* tracks of type Outer in the TRACO adjacent to the one that sent the best track.
- If a TRACO cannot correlate the two track segments in the Inner and the Outer SLs belonging to the same track, it sends the Inner segment as a First Track and the Outer one as a Second Track. This ghost can be removed by requiring that an Outer Second Track sent by the same TRACO which gave the best track in the previous bx, is not valid.

The final suppression of ghosts of the first type, which involves nearby TRACOs belonging to different boards, can only be done by the TSM.
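The two ghost conditions handled inside one TSS can be summarised in a small predicate. This is a hedged sketch: the function name, the `kind` labels and the use of a board-local TRACO index (0 to 3) are illustrative assumptions, not the register-level logic of the device.

```python
def is_tss_ghost(traco: int, is_outer: bool, kind: str, best_traco: int) -> bool:
    """Decide whether a candidate is a ghost of the best track.

    traco      -- board-local index of the TRACO that sent the candidate
    is_outer   -- True if the candidate is an Outer-SL segment
    kind       -- 'carry' (track kept from the first cycle) or
                  'second' (Second Track received in the following bx)
    best_traco -- index of the TRACO that sent the best track
    """
    if kind not in ("carry", "second"):
        raise ValueError(f"unknown candidate kind: {kind!r}")
    if not is_outer:
        return False                          # only Outer segments can be ghosts
    if kind == "carry":
        return abs(traco - best_traco) == 1   # copy seen by an adjacent TRACO
    return traco == best_traco                # failed correlation in the same TRACO
```

Adjacent TRACOs that sit on different boards are not covered by this check, which is why the remaining cases are left to the TSM.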

From the point of view of controls and monitoring, the TSS contains the JTAG controller that can be addressed by the Controller from the Control Board: all TRACOs and BTIs on the corresponding board are serially linked to the TSS.

The design of the device was done using the VHDL language, which guaranteed an easy portability on different hardware technologies.

### Algorithm Description

The functionality of each TSS is performed in two consecutive cycles (one cycle per bx), called *sort1* and *sort2*. The *sort1* processing status is recognized when at least one TRACO gives a non-null track of *bunch1* type, while the *sort2* status simply corresponds to the cycle following *sort1*. The *sort2* status can be aborted in case of pile-up triggers. In the *sort1* cycle, each TSS unit analyses four Preview data words and transmits the minimum to the TSM unit in the second layer, while the next-to-minimum is stored locally and carried over to the *sort2* cycle. At the same time a local select is output to enable transmission of the full data from the selected TRACO to the TSM. In the *sort2* cycle, each TSS unit analyses the four input words of *bunch2* together with the *carry* word of the *sort1* cycle. If the *carry* is the best trigger candidate in the *sort2* cycle, a post-select line is used to inform the TRACO that it has to transmit to the TSM its best track segment of the previous cycle.

Using the internal configuration registers, it is possible to steer the sorting algorithm:

- each of the input Previews from TRACOs can be masked in case of noisy channels (it is also possible to mask only the tracks with a certain quality, for example only LTRGs).
- during normal operation the priorities used in the sorting are, in order of decreasing importance: correlation, quality of the trigger, position of the trigger (i.e. IN/OUT SL), angular deviation with respect to the radial direction. It is possible to swap the priority order of the quality bits.
- normally the carry track from *sort1* cycle is used in *sort2* cycle: it is possible to disable this mechanism.
- it is also possible to disable the algorithms for ghost suppression.

### Hardware Implementation

The TSS functionalities are implemented in a single 120-pin ASIC chip built in CMOS 0.5 $\mu\text{m}$ technology. Because of the severe speed requirements (sorting of two out of four 9-bit words with carry in 1 bx) it is not possible to use programmable ASIC devices as is done for the TSM. The main building blocks of the TSS, described hereafter, are shown in Fig. 9.13.

The *Sorting Core* is the processing unit. The four 9-bit preview words from the TRACOs are filtered: they can be masked individually or depending on their quality, and the priority of their quality for the sorting can be changed. They then enter a battery of ten two-word comparators together with the *carry* track of the previous cycle. The First Best Track is sent to the TSM while the Second Best Track is kept as *carry*. At the same time the select lines to the TRACOs are activated.
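Ten comparators is exactly the number of distinct pairs among five words (the four previews plus the *carry*). The following Python sketch shows one way a minimum and next-to-minimum can be derived from all pairwise comparisons at once; the loss-counting scheme and the tie-break on the input index are assumptions made for illustration, not a description of the ASIC logic.

```python
from itertools import combinations
from typing import Optional, Sequence, Tuple


def parallel_best_two(words: Sequence[int]) -> Tuple[int, Optional[int], int]:
    """Return (index of minimum, index of next-to-minimum, comparisons used).

    Every pair of words is compared once. A word that loses no comparison
    is the minimum; a word that loses exactly one is the next-to-minimum.
    Equal words are resolved in favour of the lower input index.
    """
    n = len(words)
    if n == 0:
        raise ValueError("at least one word is required")
    losses = [0] * n
    n_comparisons = 0
    for i, j in combinations(range(n), 2):   # all pairs, conceptually in parallel
        n_comparisons += 1
        if words[j] < words[i]:
            losses[i] += 1
        else:
            losses[j] += 1
    first = losses.index(0)
    second = losses.index(1) if n > 1 else None
    return first, second, n_comparisons
```

For five input words the function performs ten comparisons, and since the comparisons do not depend on each other they can all be evaluated in the same clock cycle.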

The TSS contains seven 8-bit *Configuration Registers* which are used to control the chip. They can be read and written through the JTAG or the Parallel Interface. With these registers it is possible to steer the sorting core and the filtering of input previews.

The *JTAG controller* is the control unit for the standard JTAG circuitry. Besides the internal loop on the I/O registers (Boundary Scan Registers) there are 3 user-defined serial lines: for the Configuration registers, for the Test registers and for the Snap registers.

The *Parallel Interface Controller* unit takes over the control of the bidirectional lines used for the Parallel Interface. After entering program mode the sorting core is isolated. Eight bits of the preview word to the TSM are used for communication with the higher level of the Local Trigger architecture. Eight bits of each of the TRACO previews are used for communication with lower level devices. The direction of this bus is controlled according to a protocol for read and write operations. Data can be read or written in the configuration, test and snap registers if the addressed device is the TSS itself, while in the other cases the bus acts as a bypass.

The *Test-in registers* can be used in order to check the functioning of sorting core in offline mode. Patterns of preview words can be loaded into the registers by JTAG or Parallel interface. Then the patterns can be injected into the sorting core at 40 MHz.

The *Snap registers* monitor the input and the output of the sorting core. After the snap registers receive a reset they wait for the first trigger condition (for example the presence of a high-quality track among the input previews), after which the I/O are latched into flip-flops. These registers can be read during runtime through the JTAG.

### 9.5.2 Track Sorter Master

The TSM unit [9.10] in the second layer analyses up to seven Preview words from the TSSs. There is one TSM per muon station. It behaves similarly to a TSS unit of the first processing layer, but its processing begins two bxs later. The information handshake between the TS$\phi$ system and the TRACO devices allows data from up to fourteen tracks to be stored in the TSM unit. The select output signals from the TSM, corresponding to the FBT in the first processing cycle and to the SBT in the second cycle, are used to enable the transmission of full track data to the Sector Collector for two out of the fourteen possible candidates stored in the TSM unit.

The TSM system has two different logic components (Fig. 9.14): a sorter block (TSMS) which performs the sorting on the Previews from the TSSs and a data multiplexing block (TSMD) which outputs the full data from TRACOs, corresponding to the selection done in the TSMS.

System robustness considerations, discussed further on, suggest splitting the TSMD into two hardware blocks, each one looking after half of a DTBX chamber and with the same functionality and internal architecture. The TSMS, TSMD0 and TSMD1 are all placed on the Server Board.



**Fig. 9.13:** Main blocks of TSS design.



**Fig. 9.14:** The TSM system can be configured through both JTAG (a) and Parallel Interface (b).

The TSM also receives information from the  $\theta$  view. The hit pattern received from the TS $\theta$  is synchronized with  $\phi$  view track data and forwarded to the Sector Collector.

### Algorithm Description

The TSM is the last element of the TS system and of the muon on-chamber trigger electronics. Therefore a very important requirement in the TSM design is robustness. From the point of view of functionality, the main constraint comes from dimuon physics: this means capability of the system to maintain good efficiency also in case of hardware failures. In order to achieve it the system has been segmented in blocks with partially redundant functionalities. Care has been taken that the segmentation does not deteriorate the expected performance, in particular the latency assignment.

First, the TSM is split into three parts: a TSMS block and two TSMD blocks (TSMD0, TSMD1). The TSMS inputs are the Preview data from the TSSs. The inputs of each TSMD are the data of the selected track candidates for one half chamber.

The TSM can be configured in two processing modes. In the default processing mode the TSMS sorts on the TSS Preview data and issues select signals that the TSMDs use to choose between the track candidates coming from TRACOs. The TSMS can select two tracks in TSMD0 or two tracks in TSMD1 or one track each. In the back-up processing mode TSMS processing is bypassed: each TSMD sorts the best track candidate among the data of a half chamber and outputs one track. The default processing implements the full performance and guarantees that dimuons are found with uniform efficiency along the chamber.

In case of failure of one TSMD, the Previews of the corresponding half chamber are disabled in TSMS sorting, so that full efficiency is maintained in the remaining half chamber. A similar technique is used in case of damage of the data lines from TRACOs.

The back-up processing is activated in case of TSMS failure or in case of damage in the Preview lines. It guarantees full efficiency for single muons and for open dimuon pairs (one track in each half-chamber).

### Hardware Implementation

Robustness is the focal point in designing the TSM system, since TSM operates both as last functional element of the track-segment sorting tree of a DTBX chamber and as nodal point for distribution of the monitoring and configuring information to the various elements of the chamber local trigger.

Functionality is distributed over three ICs (TSMD0,TSMD1,TSMS) (Fig. 9.12) that can be configured in different processing modes, providing high redundancy in case of hardware failure. Each one has independent power lines. Three separate lines from the chamber Controller are used to provide enable signals (nPWRenD0, nPWRenD1, nPWRenSort) for the power switches. When one is powered down, all I/O lines to the chip are disconnected via switches driven with the power enable signals from the Controller.

Access to the configuration registers is possible in two ways: through a configurable JTAG net for boundary scan and through the DTBX ad-hoc configuration protocol, hereafter called Parallel Interface. Fig. 9.14(a) shows the JTAG path through the three chips: the path adapts to run through the chips that are powered on, using switches controlled via the power enable lines. Fig. 9.14(b) shows how the Parallel Interface bus lines are distributed through the TSM system. Each individual chip has both a common TSM address and its own address. The Parallel Interface commands are forwarded to the trigger boards only through the TSMS, which gives access to only one trigger board at a time. In case of TSMS failure the trigger boards can be configured via their individual JTAG nets. The Parallel Interface uses the same lines to propagate the data, whose direction is reversed during a Parallel Interface write operation.

**Fig. 9.15:** Time sequence of TSM operations when in Default Processing Mode.

The TSM configuration can also be changed from the chamber Controller acting directly on the power enable signals: TSMS receives the power enable state of TSMD0 and TSMD1 and can change its processing mode to select two tracks from the same TSMD, i.e. from half a chamber, when the other TSMD is powered off. Similarly each TSMD receives the power enable state of both TSMS and the other TSMD, and it switches to the back-up processing mode when the TSMS is not powered. The system can still run in the limit scenario of only one TSMD block with undamaged connections. Three power fault signals are generated and reported to the Controller when an overcurrent condition is detected in the corresponding power net.
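The dependence of the processing mode on the power state of the three chips can be summarised in a few lines of Python. This is a sketch under stated assumptions: the function name, the boolean "powered" arguments and the returned dictionary are illustrative, and the actual enable signals (nPWRenD0, nPWRenD1, nPWRenSort) may have a different polarity.

```python
from typing import Dict, List, Union


def tsm_configuration(tsms_on: bool, tsmd0_on: bool,
                      tsmd1_on: bool) -> Dict[str, Union[str, int, List[str]]]:
    """Derive the TSM processing mode from the power state of its chips."""
    active = [name for name, on in (("TSMD0", tsmd0_on), ("TSMD1", tsmd1_on)) if on]
    if not active:
        # No data multiplexer is powered: nothing can be output.
        return {"mode": "off", "tsmds": [], "max_tracks": 0}
    if tsms_on:
        # Default mode: the TSMS selects two tracks, possibly both from
        # the only TSMD that is still powered.
        return {"mode": "default", "tsmds": active, "max_tracks": 2}
    # Back-up mode: the TSMS is bypassed, each powered TSMD outputs one track.
    return {"mode": "backup", "tsmds": active, "max_tracks": len(active)}
```

The limit scenario mentioned above corresponds to `tsm_configuration(False, True, False)`, which yields the back-up mode with a single output track.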

There are two main processing modes. The sequence of processing operations performed in Default Mode is illustrated in Fig. 9.15.



**Fig. 9.16:** Time sequence of TSM operations when in Back-up Processing Mode.

Two tracks, one per clock cycle, are presented on output on the same bus. The track selection is performed by the TSMS which applies the second stage of the sorting algorithm on the reduced data already processed by the TSS during the first sorting stage. The addresses of the selected tracks are used to drive the multiplexers of the data pipelined in the TSMDs and to enable the three-state output buffers of the TSMDs.

The sequence of processing operations performed in Back-up Mode is illustrated in Fig. 9.16.

In this mode the TSMS is bypassed and each TSMD sorts the best track in half chamber data. The selection is performed on the full-track data from the TRACOs selected by the TSSs during the first stage of sorting. The quality (3 bits) of the selected track in TSMD1 is compared to the one in TSMD0 to enable the corresponding three-state buffer on output. One track is output per clock cycle.

Besides performing the second stage of the sorting algorithm begun in the TSSs, the track selection procedure also applies data masking, ghosts and fakes rejection, in a way consistent with the TRACO-TSS system. In the TSM, ghost rejection is expanded to a new kind of ghosts: a single good track that produces two segments, one each in two neighbouring TRACOs, which then forward their segments to two different, although contiguous, TSSs. Both segments appear at the TSM input. Since each TSS serves four TRACOs and this particular kind of ghosts can only appear in two contiguous TRACOs, each TSS forwards to the TSM two bits giving the relative address of the selected TRACO. The TRACO with address 00 of the TSS<sub>i</sub> is adjacent to the TRACO with address 11 of the TSS<sub>i-1</sub>. If such adjacent segments are found, one is cancelled if it is an OUTER segment, i.e. a segment only observed in the outer SuperLayer: no duplication of inner SuperLayer segments is possible in the TRACOs by construction.
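The adjacency test across TSS boundaries follows directly from the two-bit relative address. A minimal Python sketch, assuming that TSSs and TRACOs are numbered consecutively along the chamber and that the segment already selected as best is the one kept (both are assumptions made for illustration):

```python
TRACOS_PER_TSS = 4


def global_traco_index(tss: int, addr: int) -> int:
    """Chamber-wide TRACO index from the TSS number and the 2-bit relative address."""
    if not 0 <= addr < TRACOS_PER_TSS:
        raise ValueError("relative TRACO address must fit in two bits")
    return tss * TRACOS_PER_TSS + addr


def crosses_tss_boundary(tss_a: int, addr_a: int, tss_b: int, addr_b: int) -> bool:
    """True if the two TRACOs are adjacent but served by different TSSs."""
    adjacent = abs(global_traco_index(tss_a, addr_a)
                   - global_traco_index(tss_b, addr_b)) == 1
    return adjacent and tss_a != tss_b


def is_tsm_ghost(best: tuple, other: tuple) -> bool:
    """best / other: (tss, addr, is_outer). The other segment is cancelled
    only if it is an OUTER segment from the TRACO adjacent to the best one."""
    tss_b, addr_b, _ = best
    tss_o, addr_o, other_is_outer = other
    return other_is_outer and crosses_tss_boundary(tss_b, addr_b, tss_o, addr_o)
```

The only pairs that satisfy `crosses_tss_boundary` are address 00 of TSS<sub>i</sub> and address 11 of TSS<sub>i-1</sub>, as stated in the text.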

The TSMD0, TSMD1, TSMS logic is implemented using three identical Actel A54SX32 chips, which, tested for space applications, have shown good tolerance to high radiation doses up to 10 to 50 krads and high thresholds for Single Event Effects. They belong to a new generation of FPGAs also called pASICs (programmable ASICs) based on the 0.35 μm silicon antifuse technology: once programmed the chip configuration becomes permanent, making them effectively ASICs.

### 9.5.3 Trigger Server in the Longitudinal View

The Theta Trigger Server has to group the information from the 64 BTIs in the θ-layer and send an appropriate pattern to the Regional Trigger. Studies on the Regional Muon Trigger [9.8] suggest that the minimum pattern resolution, without loss of efficiency in the Track Finder reconstruction, corresponds to the space covered by eight BTIs.

The TSθ receives 2 bits from each BTI (trigger strobe and H/L quality) and, for both bits, it performs a logic OR over groups of 8 BTIs. The output is formed by 2 bits for each group: in total 8 bits for the trigger pattern plus 8 bits for the corresponding quality. These signals are sent to the Server Board, where they are synchronized with the track segments found in the φ view and sent to the Sector Collector, which transmits them to the Regional Muon Trigger.

Moreover TSθ sends a 2-bit signal (64 BTI wired OR + 64 BTI Quality wired OR) to the TSSs, where they are forwarded to the TRACOs to help in noise reduction.
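The grouping performed by the TSθ amounts to a set of OR operations, sketched below in Python. The function name and the list-of-booleans representation are assumptions made for illustration; the hardware performs the same reduction on the BTI signal lines.

```python
from typing import List, Sequence, Tuple


def ts_theta_pattern(trig: Sequence[bool], qual: Sequence[bool],
                     group_size: int = 8) -> Tuple[List[bool], List[bool], bool, bool]:
    """OR the BTI trigger and quality bits in groups.

    Returns (position pattern, quality pattern, global trigger OR, global quality OR).
    """
    if len(trig) != len(qual):
        raise ValueError("trigger and quality inputs must have the same length")
    if len(trig) % group_size:
        raise ValueError("number of BTIs must be a multiple of the group size")

    starts = range(0, len(trig), group_size)
    position = [any(trig[s:s + group_size]) for s in starts]
    quality = [any(qual[s:s + group_size]) for s in starts]
    # The two global ORs correspond to the 2-bit signal sent to the TSSs.
    return position, quality, any(trig), any(qual)
```

For 64 BTIs and groups of 8 this gives the 8 + 8 bit pattern forwarded to the Server Board, plus the two global bits used for noise reduction in the TRACOs.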

The TSθ is made of two identical devices (TST). Each device is located on a THETATRB and connected to 32 BTIs. The TSTs are built using commercial ICs.

### 9.5.4 Timing of the Drift Tube Local Trigger System

The timing of the information handshake between BTI-TST-TRACO-TSS-TSM is summarized in Fig. 9.17.

The sorting in the TSS is based on the TRACO previews (prv). The TRACO with the best track is addressed by the TSS through a select line (sel) and sends the full track parameters (trk) to the TSMD. The second stage sorting in the TSMS is also based on previews sorted by the TSS. The two best tracks among the ones stored in the TSMD are selected by the TSMS (sel) and sent to the Sector Collector in two consecutive bunch crossings by transmitters (Tx). The information about the presence of triggers in the θ view (θ trg) is prepared by the TST and, through the TSS, forwarded to the TRACOs. The θ trigger hit pattern (θ pattern) is not used in the sorting and is forwarded to the Sector Collector.

**Fig. 9.17:** TRACO-TS timing diagram. See text for explanation.

## 9.6 Sector Collector

The Sector Collector Board (SCB) is lodged in the mini-crate of the fourth station of each sector. All high speed optical links either for readout or for trigger are placed on the SCB. Readout data, coming from each Readout Board on serial medium speed links on twisted pairs, are grouped and formatted to form sector readout packets and sent, via a high speed optical link, to the DAQ.

Trigger data, coming from each Server Board on high speed LVDS serial links, are grouped to form sector trigger packets and sent, via high speed optical links, to the Regional Muon Trigger (Fig. 9.18).

The link connecting each SB to the SCB consists of a 48-bit LVDS Channel Link<sup>®</sup> Serializer and Deserializer connected by a variable-length flat cable with 10 twisted pairs. The 48 device inputs are latched at 40 MHz and serialized on 8 twisted pairs at 240 Mbit/s each; one twisted pair is reserved for clock transmission. The different cable lengths (2 m to 4 m) compensate, within 2 ns, for the different time of flight of particles coming from the vertex and crossing the sector stations. Sector trigger data amount to 164 bits and are transmitted to the Regional Muon Trigger using two PAROLI (Siemens) optical links, one for trigger $\phi$ view information and one for $\theta$ view information.
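The quoted serial rate follows from the input width and the clock frequency, as the following short Python check illustrates. The function names and default arguments are hypothetical; only the numbers are taken from the text.

```python
def bits_per_pair_per_clock(n_inputs: int = 48, data_pairs: int = 8) -> int:
    """Number of bits each data pair must carry within one 40 MHz clock period."""
    if n_inputs % data_pairs:
        raise ValueError("inputs do not divide evenly over the data pairs")
    return n_inputs // data_pairs


def per_pair_rate_mbit(n_inputs: int = 48, clock_mhz: float = 40.0,
                       data_pairs: int = 8) -> float:
    """Serial rate per twisted pair in Mbit/s."""
    return n_inputs * clock_mhz / data_pairs
```

With the default values each pair carries 6 bits per clock period, i.e. 240 Mbit/s per pair and 1920 Mbit/s for the whole link.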



**Fig. 9.18:** Sketch of SCB connections to Mini-Crate electronics.

**Table 9.2:** Summary of standard data forwarded to Regional Muon Trigger from each muon chamber station

|               | Quantity                      | # of bits |
|---------------|-------------------------------|-----------|
| <b>φ view</b> | φ                             | 12        |
|               | φ <sub>b</sub>                | 10        |
|               | Quality                       | 3         |
|               | I/II track flag               | 1         |
|               | Overlap                       | 1         |
|               | Associated θ view information | 2         |
|               | Calibration flag              | 1         |
| <b>θ view</b> | Position                      | 8         |
|               | Quality                       | 8         |
|               | Associated φ view information | 2         |
|               | Synchronization control       | 2         |

A new Sector Collector layout is under study, aiming to move the unit out of the MC electronics and onto the balconies. This option should relax the constraints on device dimensions, power consumption and reliability of the trigger optical links, thanks to the larger available space and the easier access for maintenance. For this new setup, the trigger cables from the DTCCBs to the SCB are much longer and a new transmitter based on LVDS serializers is under test. The possibility of replacing the PAROLI optical link with a custom one using a VCSEL array is being evaluated.

The seven Collector Boards of each half wheel are hosted in a crate together with a Drift Tubes Wheel Control Board (DTWCB). The DTWCB has two serial interfaces: one (called DTCI) is intended for test purposes only and one (called DTC1) is used for Detector Control access. Another connector (called DTC2) is provided for the termination of the RS-485 link connecting all the DTCCBs of the half wheel. The DTCI is a standard RS-232 interface running at 9600 baud. The DTC1 is an asynchronous serial communication interface with an optical connection for full-duplex communication between the DTCM crate, located in the counting room, and the balcony crate electronics. A total of 10 optical links is needed to control the electronics lodged in the balcony crates.

### 9.6.1 Data Exchange with Regional Muon Trigger

The data will be packed inside the SC and sent to the Regional Muon Trigger on two optical links per sector: one for the φ view information, multiplexing First and Second Track data, and one for the θ view information.

The details of the data sent to the Regional Muon Trigger from each station are collected in Table 9.2. The data listed there are the ones sent from a standard station. In practice there will be some differences between stations: $\theta$ view layers are not available in station 4, and the $\phi_b$ information from station 3 is not needed by the Regional Muon Trigger.

Taking these particular cases into account, the link in the $\phi$ view will send 110 bits per sector and the link of the $\theta$ view will send 60 bits per sector.
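The quoted totals can be reproduced from Table 9.2 with a few lines of Python. The accounting below is one that matches the numbers in the text, under the assumption that station 3 simply omits the $\phi_b$ field and station 4 contributes nothing to the $\theta$ link; the dictionary keys and function names are hypothetical labels.

```python
# Field widths per station, copied from Table 9.2.
PHI_FIELDS = {"phi": 12, "phi_b": 10, "quality": 3, "track_flag": 1,
              "overlap": 1, "theta_info": 2, "calibration": 1}
THETA_FIELDS = {"position": 8, "quality": 8, "phi_info": 2, "sync": 2}


def phi_link_bits(stations=(1, 2, 3, 4), no_phi_b=(3,)) -> int:
    """Bits on the phi-view link per sector, for one track."""
    total = 0
    for station in stations:
        total += sum(PHI_FIELDS.values())
        if station in no_phi_b:
            total -= PHI_FIELDS["phi_b"]   # phi_b not needed from this station
    return total


def theta_link_bits(stations=(1, 2, 3)) -> int:
    """Bits on the theta-view link per sector (no theta layers in station 4)."""
    return len(stations) * sum(THETA_FIELDS.values())
```

A standard station contributes 30 bits in the $\phi$ view and 20 bits in the $\theta$ view, so the sketch gives 4 × 30 − 10 = 110 bits and 3 × 20 = 60 bits respectively.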

## 9.7 Chamber Electronics Control System

### 9.7.1 Drift Tubes Control Interface

The Drift Tubes Control Master (DTCM), directly interfaced to the Detector Control System, interacts with each Control Board via a dedicated optical asynchronous serial bus called DTC1 (Fig. 9.19). A fiber pair per chamber is needed to communicate with the corresponding DTCCB. For budget reasons the DTC1 lines are grouped in cables of four fibers. Patch panels are foreseen both close to the drift chambers and at the DTCM crate in the counting room. Patch panels consist of small boxes that, on the chamber side, are mounted on the iron between stations 1 and 2 and between stations 3 and 4 to service the respective chambers.

**Fig. 9.19:** Layout of DTBX Control System.

**Fig. 9.20:** Sketch of Drift Tubes Control Unit layout, the Control Board of DTBX mini-crate electronics.

An additional RS-485 connection, called DTC2, is foreseen as a detector control backup in case of failure of the optical one. The DTC2 line connects in parallel all chambers in each half wheel and is serviced by a DTWCB sitting on the balconies. The ten DTWCBs are connected with dedicated optical fibers to the DTCM crate, which consists of 17 Receiver Boards interfaced to a host PC via an MXI bus. The connection between the DTCM and the Detector Control System is made via an Ethernet interface.

### 9.7.2 Control Board

The Control Board, as part of the Drift Tubes Control Unit, performs the control and monitoring functions of the mini-crate electronics: a pictorial view of the chamber Control Board is shown in Fig. 9.20. Part of the control electronics is lodged on the Server Board of Fig. 9.6. A block diagram of the control system is shown in Fig. 9.21.



**Fig. 9.21:** Chamber control system block scheme

An on board microprocessor has access, either via JTAG or via a parallel interface, to all ASIC configuration registers. The chamber TTC receiver, used as clock and broadcast commands source, is located on the DTCCB. DTCCB functionality can be grouped in different units: clock distribution, test pulse, front-end analog signals, detector control interfaces, temperature measurement, power supply distribution and microprocessor unit ( $\mu$ PU).

Clock distribution logic consists of three blocks of Pseudo-ECL buffers delivering dedicated low skew clock lines for Trigger Boards, Readout Boards and Server Board. Each block can be independently enabled or disabled by the  $\mu$ PU. Clock cables consist of twisted pairs cut precisely in order to guarantee the same sampling clock phase at BTI inputs for all the three Superlayers.

The Test Pulse unit is able to inject a fixed and repeatable amount of charge into the front-end inputs on reception of a defined command. Furthermore, by programming the unit it is possible to simulate a track normal to the Superlayer and crossing it at any position with 10 mm resolution, allowing a precise characterization of TDC and BTI-TRACO performance. In order to reduce the amount of calibration data, charge injection is masked either at the front-end output or at the Readout Board input, limiting the number of hit channels to 4 per Trigger and Readout Board. A description of the calibration sequence can be found in [9.11].

Front-end chips, called Multiple Amplifier and Discriminator (MAD), integrate all channel analog processing functions: amplification, discrimination and signal cable driving. MAD chips add new features to the front-end electronics: input channels can be individually enabled at the shaper stage, and an integrated temperature probe can be used for monitoring purposes.

Front-end electronics monitoring performs temperature and power supply measurements. Two temperature channels are provided with maximum and medium temperatures measured in MAD chips by dedicated cells. For external temperature measurements a large number of chips can be connected to the Control Board temperature connector, for a maximum cable length of 300 m.

Control Board access is guaranteed by means of three serial interfaces, one intended for test purposes only (called DTCI) and two for the detector control, as main (called DTC1) and back-up (called DTC2) accesses.

Three I2C buses dedicated to front-end, RPC and alignment electronics programming are provided.

Readout and Trigger ASICs are programmed via JTAG. As already mentioned, every Trigger and Readout Board has a JTAG chain used for unit setting and testing. Boards can be accessed one at a time by means of chain addressing. Four lines are used to address a maximum of 16 units. During normal operation the ASICs in the mini-crate can be accessed for monitoring purposes without any interference with trigger and readout activity. A back-up channel has been provided for device programming to guarantee the capability to configure all trigger devices. Using the trigger data path backwards, all ASICs can be addressed by the Trigger Sorter Master once the $\mu$PU has taken control of the bus. A dedicated parallel interface between the $\mu$PU and the TSM has been implemented for this purpose.
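The chain addressing described above can be illustrated with a minimal sketch. It only shows why four address lines allow at most 16 units; the function names and the bit ordering (least significant line first) are assumptions made for illustration and are not taken from the Control Board design.

```python
from typing import Tuple

ADDRESS_LINES = 4  # from the text: four lines address up to 16 units


def max_units(n_lines: int = ADDRESS_LINES) -> int:
    """Number of distinct boards selectable with n_lines binary address lines."""
    return 2 ** n_lines


def address_line_levels(board_index: int, n_lines: int = ADDRESS_LINES) -> Tuple[int, ...]:
    """Logic levels on the address lines that select one board.

    Hypothetical convention: line 0 carries the least significant bit.
    """
    if not 0 <= board_index < max_units(n_lines):
        raise ValueError("board index outside the addressable range")
    return tuple((board_index >> bit) & 1 for bit in range(n_lines))


if __name__ == "__main__":
    print(max_units())             # number of addressable units
    print(address_line_levels(5))  # levels selecting board 5
```

With this convention each Trigger or Readout Board corresponds to one 4-bit pattern, so one JTAG chain is selected at a time.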

At power supply turn-on the  $\mu$ PU on every Control Board starts executing the reset sequence. The boot program, sitting in EPROM, is executed as soon as power voltage stabilizes at nominal value. This program takes care of first hardware diagnostics, checking FLASH and SRAM functionality and power supply bus-bar voltages. After the self-check the result is sent via detector control link and the  $\mu$ PU enters a sleep state waiting for remote commands. At this level it is possible to turn on and off any block, to repeat the boot sequence and to access FLASH and SRAM memories for program downloading and execution. Standard program execution foresees the copy of code from FLASH to SRAM and FLASH memory power off for code source protection from SEE. Program execution can take place either in FLASH or in SRAM memory blocks; a separate power supply line allows an independent use of both. On top of program execution a progressive power on sequence is foreseen to allow a full check of mini-crate electronics integrity.

A TTC receiver per mini-crate accomplishes clock and broadcast signals distribution over the barrel drift tubes detector. Every TTC receiver chip delivers two clock lines with programmable fine de-skewing: one is distributed to all Trigger and Readout Boards while the second one is used by Server and Control Boards. TTC programming is done via an I2C interface.

## 9.8 Synchronization and Latency

### 9.8.1 Synchronization Procedure

#### Description of the Muon Barrel Signal Path

The first issue of a synchronization procedure [9.11] is the definition of the basic block whose phase adjustment is done by hardware construction, assuring the synchronization with a careful compensation of the known delays. This compensation can be achieved in hardware only if there are short electrical connections between the components of the trigger/readout chain, and it requires knowledge of all the contributions to the delays.



**Fig. 9.22:** Time of flight and cable delays to the Sector Collector in the Central wheel.

All the local trigger and readout electronics is lodged on the chamber itself; it is therefore reasonable to consider every chamber an intrinsically synchronous block, equipped with one Trigger Timing and Control Receiver (TTCrx).

The internal timing distribution is equalized using cable connections of adequate length and the maximum skew between internal parts is foreseen to be below 1ns. The trigger and readout data are sent to the Sector Collector (SC) lodged in station 4 using differential signals on twisted pair cables. Trigger cable lengths are used to compensate the difference in time of flight of the four stations of the sector in order to synchronize the full sector data at the trigger optical link input. Fig. 9.22 gives the time of flight and cable delays for each station within the same sector; once stations are synchronized with the beam the data skew at the SC input is within 3ns (see Table 9.3).

The TTCrx of each station provides two clocks with independent phase adjustment: one is used as the sampling clock while the other is the data transmission clock (see Fig. 9.23). The first one is distributed, using equalized cables, to the chamber trigger and readout front-end ASICs. Its phase has to be set to compensate for the time of flight so as to assure the highest trigger efficiency averaged over the chamber. The transmission clock phase has to compensate for the internal difference (less than 5 ns) of signal propagation in order to align the data at the Trigger Server for transmission to the Sector Collector. The time difference between the two clocks is a constant value determined by the hardware layout and set before chamber installation. This difference will be measured on the bench: hence only the sampling clock phase needs to be evaluated online.

**Fig. 9.23:** Block diagram of the signal path in the muon electronics.

#### Sampling Clock Synchronization

This is the first procedure to run because both the chamber trigger efficiency and the coherence of trigger data transmitted by the Sector Collector unit are strongly dependent on sampling/transmission clock phase.

This procedure is executed on the monitor CPUs in the control room in parallel for all the chambers. A special single station LV1A is generated by the drift tubes trigger system every time there is a very High Quality Trigger (HH or HL or LH) inside any station. Since many triggers will be available at the same time, they can hide each other: the procedure is therefore repeated disabling the chambers already aligned until they are all synchronized.

**Table 9.3:** Average delay and maximum spread due to muon time of flight and signal propagation along the wire at the electronics lodging position for each muon chamber.

| Chamber | Average delay<br>(ns) | Spread<br>(ns) |
|---------|-----------------------|----------------|
| MB/2/1  | 27.3                  | 1.2            |
| MB/2/2  | 29.3                  | 1.5            |
| MB/2/3  | 31.9                  | 1.8            |
| MB/2/4  | 34.6                  | 2.0            |
|         |                       |                |
| MB/1/1  | 21.6                  | 2.3            |
| MB/1/2  | 24.1                  | 2.6            |
| MB/1/3  | 27.3                  | 2.9            |
| MB/1/4  | 30.5                  | 3.1            |
|         |                       |                |
| MB/0/1  | 19.0                  | 4.3            |
| MB/0/2  | 21.9                  | 4.4            |
| MB/0/3  | 25.4                  | 4.4            |
| MB/0/4  | 28.8                  | 4.4            |



**Fig. 9.24:** Distribution of the quantity  $T_1+2T_2+T_3$  for different synchronization offsets and minimal BTI acceptance.

We take advantage of the synchronous nature of the trigger electronics. A trigger can be generated only at 40 MHz frequency, hence in the case of a bad synchronization a certain fraction of the triggers will be assigned to the time slots close to the right one.

For every trigger the TDC data are read out from the FIFO and, since they are triggered by the BTI itself, they carry the offset introduced by the actual time slot assignment. Once the triggering BTI is identified, the drift times of any three consecutive layers are used to compute the quantity $T_1+2T_2+T_3$; histograms of this quantity are accumulated. The value depends on the trigger latency and, indirectly, on the sampling clock phase: if the sampling clock is out of phase, the BTI is not able to identify the track crossing time uniquely and the trigger output is distributed over two neighbouring cycles; as a consequence the distribution shows two distinct peaks.

**Fig. 9.25:** Root mean square of the distribution of the quantity $T_1+2T_2+T_3$ for a simulated event sample as a function of synchronization offset for different BTI alignment tolerances.

Fig. 9.24 shows the distribution of this quantity for a set of simulated data in the case of minimal acceptance, chosen to enhance the effect. The oscillation between two-peak and one-peak histograms is quite evident. It is not practical to look for this behaviour by eye in order to establish the best synchronization time; an automatic algorithm is therefore needed. The easiest indicator is the r.m.s. of the distribution for each synchronization time setting. The value of this quantity is plotted in Fig. 9.25 for the same sample and the same configurations considered before: it shows a rather evident minimum at the correct synchronization time. The r.m.s. method is safer than an equivalent rate counting method since it does not depend on a luminosity drop during the time required to run the procedure while varying the synchronization time (roughly 30 minutes at maximum luminosity for each iteration). This algorithm is sensitive only if the track incidence angle is limited to below $20^\circ$: higher angles cause the algorithm to fail.

Hence the algorithm will store the histograms and find the right clock phase looking for the minimum of the root mean square value of the quantity  $T_1+2T_2+T_3$ . In order to have the highest possible resolution in this procedure the BTIs must be programmed to trigger with minimal acceptance, i.e. with half the standard alignment acceptance. With this setting the BTIs show a consistent efficiency drop when the clock phase is not correct. The TTCrx fine phase adjustment is changed with 3ns steps repeating the procedure and interpolating the results. This procedure is dependent neither on trigger rate nor on beam pattern.
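The phase search described above can be sketched as follows. This is an illustration under stated assumptions, not the monitoring code itself: the function names, the use of a three-point parabolic interpolation around the minimum, and the toy input values are all hypothetical. The input is, for each tested TTCrx phase setting, the list of measured $T_1+2T_2+T_3$ values.

```python
import math
from typing import Dict, Sequence


def rms_spread(values: Sequence[float]) -> float:
    """Root mean square deviation of the values around their mean."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / n)


def best_phase(scan: Dict[float, Sequence[float]]) -> float:
    """Phase setting (ns) that minimizes the r.m.s. of T1 + 2*T2 + T3.

    `scan` maps each tested phase to the values collected at that phase.
    Assumption: phases are equally spaced, so a parabola through the minimum
    and its two neighbours can be used to interpolate between steps.
    """
    phases = sorted(scan)
    spreads = [rms_spread(scan[p]) for p in phases]
    i = min(range(len(phases)), key=spreads.__getitem__)
    if 0 < i < len(phases) - 1:
        y0, y1, y2 = spreads[i - 1], spreads[i], spreads[i + 1]
        curvature = y0 - 2.0 * y1 + y2
        if curvature > 0:
            step = phases[i + 1] - phases[i]
            return phases[i] + 0.5 * step * (y0 - y2) / curvature
    return phases[i]


if __name__ == "__main__":
    # Toy scan in 3 ns steps; the numbers are invented for illustration only.
    toy_scan = {
        0.0: [748.0, 772.0],
        3.0: [755.0, 765.0],
        6.0: [758.0, 762.0],
        9.0: [757.0, 763.0],
    }
    print(best_phase(toy_scan))
```

Because only the width of the distribution enters, the result does not depend on how many triggers were collected at each phase, which mirrors the rate independence stated in the text.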

Once the sampling clock phase is determined for each chamber, there may still be a difference between the various chambers quantized in steps of 25ns, when the data are sent to the SC.

#### TTC Links Synchronization

Once all the chambers are “in phase” with the beam it is possible to adjust the different TTCrx delays looking at the time distribution of local trigger events. Every time a local High Quality Trigger is detected its arrival time with respect to the BC0 is histogrammed. Once filled, the histogram is cross-correlated with the expected LHC beam pattern.
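A minimal sketch of the cross-correlation step is given below. It assumes that the arrival-time histogram and the expected beam pattern are both binned in bunch crossings over one full orbit, and that the correlation is circular; the function name and the short toy pattern are hypothetical.

```python
from typing import Sequence


def best_bx_offset(histogram: Sequence[int], pattern: Sequence[int]) -> int:
    """Offset (in bunch crossings) that best aligns the histogram with the pattern.

    histogram[k]: number of local High Quality Triggers seen k crossings after BC0.
    pattern[k]:   1 if slot k of the expected beam pattern is filled, else 0.
    The returned shift s maximizes sum_k histogram[(k + s) % n] * pattern[k].
    """
    n = len(pattern)
    if len(histogram) != n:
        raise ValueError("histogram and pattern must cover the same number of slots")

    def score(shift: int) -> int:
        return sum(histogram[(k + shift) % n] * pattern[k] for k in range(n))

    return max(range(n), key=score)


if __name__ == "__main__":
    # Toy 12-slot orbit with an irregular filling scheme (illustration only).
    toy_pattern = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0]
    toy_hist = [0] * 12
    for k, filled in enumerate(toy_pattern):
        toy_hist[(k + 5) % 12] = 10 * filled  # triggers arrive 5 crossings late
    print(best_bx_offset(toy_hist, toy_pattern))
```

The method relies on the filling scheme being irregular (gaps between bunch trains); a perfectly periodic pattern would give several equally good offsets.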

#### Trigger Links Synchronization

The trigger links are located in the SC unit: the transmission clock phase is fixed by hardware in order to be synchronized with the Server Board transmission clock. The different trigger link delays are compensated using FIFOs at the receiver end. Once TTCrx have been synchronized the Test Pulse System (TPS) can be used to generate trigger data at the same time in all the chambers. The Regional Trigger should act on the trigger link receiver FIFOs in order to get the test pulse trigger data at the same time from all the SCs.
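The FIFO-based compensation can be pictured with a short sketch: if the arrival time of the test pulse data from each link is known in bunch crossings, every receiver FIFO is given the extra depth needed to match the slowest link. The function name, the link labels and the delay values are hypothetical.

```python
from typing import Dict


def fifo_depths(arrival_bx: Dict[str, int]) -> Dict[str, int]:
    """Extra FIFO depth (in bunch crossings) per link so that all links line up.

    arrival_bx maps a link name to the crossing in which its test pulse
    trigger data reach the receiver. The slowest link gets depth 0.
    """
    latest = max(arrival_bx.values())
    return {link: latest - bx for link, bx in arrival_bx.items()}


if __name__ == "__main__":
    measured = {"sector_1": 18, "sector_2": 20, "sector_3": 19}  # invented values
    print(fifo_depths(measured))
```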

### 9.8.2 Latency Determination

Chamber trigger latency can be divided into three contributions: analog signal propagation, trigger computation and digital signal propagation as shown in Table 9.4.

**Table 9.4:** Drift Tube Chambers Trigger Latency

|                           | Time (ns) | bx cycles |
|---------------------------|-----------|-----------|
| Time of flight (4 to 10m) | 33        |           |
| Cell drift time           | 380       |           |
| Wire signal               | 5         |           |
| Front-end electronics     | 3         |           |
| Cables front-end          | 20        |           |
| Total                     | 441       | 18        |
| BTI                       |           | 4         |
| TRACO-TSS-TSM             |           | 9         |
| Chamber link              |           | 1         |
| Sector Collector          |           | 2         |
| Sector link               |           | 20        |
| Total                     |           | 54        |

Analog signal propagation consists of the particle time of flight from the interaction vertex to the detector sensitive volume and the detector signal propagation up to the trigger front-end input.

Trigger computation is divided into BTI, TRACO, TSS and TSM parts, including the synchronization time needed for sampling the asynchronous detector signals. TRACO and TS computations overlap, reducing the sum of the respective latencies to 6 clock cycles: the actual computation time of each device is 6 bx for TRACO, 6 bx for TSS and 3 bx for TSM.

Digital signal propagation includes data transmission from the Server Board to the Sector Collector and from the Sector Collector to the Regional Muon Trigger. Since synchronization of the trigger links is accomplished in the Track Finder, using programmable-depth FIFOs, the sector link lengths are kept to a minimum and the quoted latency is to be understood as a maximum.
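The totals in Table 9.4 can be cross-checked with a few lines of arithmetic. The sketch below uses the table entries as given; the only assumption is that the analog total is rounded up to a whole number of 25 ns bunch crossings, which reproduces the 18 bx quoted in the table.

```python
import math
from typing import Dict, Tuple

BX_NS = 25.0  # bunch crossing period at 40 MHz

# Entries copied from Table 9.4.
ANALOG_NS = {
    "time of flight": 33,
    "cell drift time": 380,
    "wire signal": 5,
    "front-end electronics": 3,
    "front-end cables": 20,
}
DIGITAL_BX = {
    "BTI": 4,
    "TRACO-TSS-TSM": 9,
    "chamber link": 1,
    "Sector Collector": 2,
    "sector link": 20,
}


def latency_budget(analog_ns: Dict[str, int], digital_bx: Dict[str, int]) -> Tuple[int, int, int]:
    """Return (analog total in ns, analog total in bx, overall total in bx)."""
    analog_total_ns = sum(analog_ns.values())
    analog_total_bx = math.ceil(analog_total_ns / BX_NS)  # assumed rounding up
    return analog_total_ns, analog_total_bx, analog_total_bx + sum(digital_bx.values())


if __name__ == "__main__":
    print(latency_budget(ANALOG_NS, DIGITAL_BX))
```

The analog part (441 ns, i.e. 18 bx) plus the 36 bx of digital processing and transmission gives the 54 bx total of the table.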

## 9.9 Prototypes

### 9.9.1 BTI Prototype and Test Bench Performance

The first BTI prototype [9.12] [9.13] [9.14] was designed using FPGA technology, while the other prototypes were produced as custom ASIC chips.

Although using the biggest available FPGA at the time (XILINX XC4013, 6 ns grade) the implementation of the algorithm encountered serious space limitations, forcing us to downgrade the requirements.

We could build only a device programmable for a fixed incidence angle, including only a part of the wire couples, and we could not output any information on position. Three FPGA prototypes were tested on a chamber prototype. Efficiency and uniformity of the response were checked in order to identify possible algorithm defects. As an example, the average efficiency for different incident muon directions that could be extracted from the data is given in Fig. 9.26. The efficiency is compared with the expectation of the Monte Carlo used to tune the prototype design and to define the sets of couples. The good agreement between Monte Carlo and FPGA behaviour is a sign of the good quality of the trigger design, but the difficulty of fitting the needed logic blocks inside the FPGA demanded an ASIC device.

The ASIC prototype chip [9.15] was designed using the Standard Cell technique in a CMOS 0.5 $\mu$ m technology with three layers of metal. The total area was 19mm<sup>2</sup> for 60kgates and prototypes were packaged either in a ceramic 68 pins LCC or in a 68 pins plastic TQFP. The BTI, when powered at 3.3V, dissipates 200mW at 80MHz operating frequency. To get the best efficiency to noise ratio in the operating background conditions a careful setup is needed of programmable parameters like angular acceptance, time filters, drift velocity and wire dead time. The latency is equal to the maximum drift time (dependent on the programmed drift velocity) plus 100ns (8 clock cycles) for chip calculations.

BTIs can be programmed either via a JTAG (IEEE Std. 1149.1-1990) port or via a parallel interface using the trigger data bus for back-propagating parameters from TRACO and TS.

The JTAG port can be used either to check chip interconnections during PCB verification or to program internal registers. A built-in feature of this interface is the capability to monitor chip activity without interfering with the trigger function. This can be done by accessing (via a SAMPLE instruction) the snap registers connected to the input signals and to the output bus.

**Fig. 9.26:** Comparison of the measured efficiency versus the track inclination for the FPGA prototype and the BTI model with LTS.

The parallel interface, much faster than the JTAG one, can access all the BTI internal registers and is intended as a backup solution for BTI setup. Using the trigger bus, the downstream devices can access the internal registers of the connected BTIs.

System testability and visibility are good owing to built-in self test logic, snap registers insertion and boundary scan implementation.

The BTI includes BIST and emulation capabilities for testing purposes. In spite of the large input domain, chip validation can be performed using a set of 64k vectors only, thanks to the BIST circuitry.

Trigger diagnostics is possible using the BTI emulation mode. First the timing data of events to be tested are downloaded in the involved BTIs; then chips are programmed in emulation mode. When the specific trigger command is issued, the event is emulated at full speed and the relative trigger data can be found stored in the snap registers or can be directly read out with the programmed latency at the chip output.

The emulation mode was used to perform bench tests before using the BTI on a chamber prototype. The benchmark consisted of a sample of 40k hits generated using the full GEANT simulation of single muons in front of a BTI, in order to have a realistic spectrum of input data.

The efficiency as a function of the angle of incidence is reported in Fig. 9.27, showing that the response is flat up to 45° and then falls rapidly towards 55°, matching our design expectations.



**Fig. 9.27:** Bunch crossing detection efficiency for a set of ASIC BTIs. The LTS mechanism is not active. Note that the abscissa scale is not linear.

### 9.9.2 TRACO Prototype and Test Bench Performance

The TRACO prototype has been designed as a standard-cell ASIC using ATMEL 0.5 $\mu$m CMOS technology with three metal layers. It failed the tests and is being resubmitted.

### 9.9.3 TSS Prototype and Test Bench Performance

The first preliminary studies of the sorting algorithm were done using an FPGA device which revealed the necessity to use an ASIC in order to fulfil the speed requirements.

A full performance prototype was built at the end of 1997 using 0.7  $\mu$ m CMOS technology (two metal layers) [9.16]. The sorting is performed among four 10-bit words and the chip has all the functionalities necessary for the integration with the TRACOs and BTIs. The area of the chip is about 23 mm<sup>2</sup> and is determined by the 104 I/O pads. Ten prototypes were packaged in ceramic 120 pin QFP. The TSS, powered at 5V, dissipates about 150 mW at 40 MHz.

The chip prototype was mounted on a test board and tested in standalone mode. Patterns were generated with a VME 40 MHz pattern generator, the Pattern Unit [9.17] [9.18], and injected into the TSS prototype using flat cables. The output of the chip was read back by the Pattern Unit. Sequences of test patterns were prepared with a software emulator implementing the same functionality as the hardware, and the output of the chip was cross-checked against the expected result. In this way it was possible to test all the functionalities of the TSS. The chip worked up to 60 MHz. The final TSS chip will be produced using 0.5 $\mu$m CMOS technology with three metal layers and a significant improvement of the performance is expected.

### 9.9.4 TSM Prototype and Test Bench Performance

A prototype of the TSM system was built in 1999. The TSMS, TSMD0 and TSMD1 logic for the sorting functionality was implemented using three identical Quicklogic pASIC chips QL3025, speed grade 3, 208 pins. They were hosted on a prototype Server Board, with dimensions enlarged to fit in a VME 6U crate, which provided connections for the control signals and I/O (404 bits), and test points on all internal connections between the three chips. PREVIEWs and full-track data were generated at 40 MHz with three Pattern Unit [9.17] VME boards, mimicking seven Trigger Boards connected to the TSM system. Control signals were provided through a fourth Pattern Unit board mimicking the DT chamber Controller. A Pattern Unit with special adaptors with LVDS receivers was used to read the TSM output. In this prototype the two output data words, i.e. the two selected tracks in the chamber, were transmitted on two separate data buses in parallel in one clock cycle.

The system was tested using a set of about 1200 specific patterns, which, developed as case studies during the system design and simulation, test all the different aspects of the TSM sorting functionality. The TSM prototype successfully passed all the tests:

- (i) the functionality of the individual chips was validated;
- (ii) the propagation times to the chips, from the TSMS to the TSMDs, and from the TSMDs to the LVDS output drivers were studied in detail and validated;
- (iii) finally the system functionality was validated.

The TSMS sorter chip was also implemented and tested using an Actel A54SX16, which showed better performance than the Quicklogic chip.

## 9.10 Test and Simulation Results

The drift tube local trigger will be implemented using special purpose devices. It was essential to have a simulation of these devices interfaced to the general CMS simulation in order to test the expected performance of the hardware and verify the effectiveness of the algorithm. Thus a very detailed simulation of the L1 muon trigger was developed inside the CMSIM framework and is continuously maintained and updated to reflect hardware changes. Recently all the code was successfully ported to object-oriented (OO) technology within the ORCA program [9.19].

The geometry of the muon chamber layout always reflects the latest available mechanical design, including dead areas. The digitization is performed using parametrized functions (depending on impact position, crossing angle and B field components) obtained from detailed studies with the drift cell description program GARFIELD, whose predictions were verified in the configurations available at the test beam. The simulation includes all the known backgrounds (bremsstrahlung, $\delta$-ray production, ...), excluding pile-up from events in other bunch crossings. While the CMSIM simulation uses floating point calculations, although taking care of integer bit rounding effects, the ORCA simulation models the exact integer bit operations. Simulation results reported here refer either to CMSIM version 117 or to ORCA version 3.

Several tests were done either on bench or in test beam areas or under cosmic rays on the available trigger electronics during the past years. The BTI prototype was in fact completely tested in several environmental conditions.

In the following paragraphs we will show the test results of the BTI prototypes and the behaviour expected from simulation of the other devices in the drift tubes trigger chain.

### 9.10.1 Bunch and Track Identifier

The BTI ASIC tests were performed at the CERN Gamma Irradiation Facility (GIF) in August 1998 exposing a full size chamber to a muon beam with energies in the range 80-100 GeV and during winter 1998-1999 in the Legnaro INFN National Laboratories, exposing it to cosmic rays.

Two SLs were equipped with a card carrying eight BTI ASICs. The BTI is a synchronous device and therefore its performance is strictly related to the phase of the clock with respect to the particle crossing time. An 80 MHz FIFO unit [9.17] [9.18] was used to identify the 25 ns time slot assigned to the BTI trigger: the BTI computed parameters corresponding to any time slot were stored within the FIFO. The BTI data and the drift times were recorded using 960 MHz TDCs with multihit capability [9.20].

The recorded drift times were used to reconstruct beam tracks using a linear least-squares fit. Only hits with drift times below 400 ns were considered. The space-time conversion was linear, using an average drift velocity of 56 $\mu\text{m}/\text{ns}$, and all three- or four-point tracks with a $\chi^2$ fit probability better than 0.1% were selected.
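The reconstruction steps just described (drift time cut, linear space-time conversion, straight-line least-squares fit) can be sketched as follows. The drift velocity and the 400 ns cut are those quoted in the text; the function names, the layer coordinates in the usage example and the neglect of the left-right ambiguity of the drift cell are simplifying assumptions of this illustration. The $\chi^2$ probability selection is not reproduced.

```python
from typing import List, Sequence, Tuple

DRIFT_VELOCITY_UM_PER_NS = 56.0  # average drift velocity quoted in the text
MAX_DRIFT_TIME_NS = 400.0        # hits at or above this drift time are dropped


def select_hits(drift_times_ns: Sequence[float]) -> List[float]:
    """Keep only hits with drift time below the 400 ns cut."""
    return [t for t in drift_times_ns if t < MAX_DRIFT_TIME_NS]


def drift_distance_mm(drift_time_ns: float) -> float:
    """Linear space-time relation: distance from the wire in mm."""
    return drift_time_ns * DRIFT_VELOCITY_UM_PER_NS / 1000.0


def fit_line(points: Sequence[Tuple[float, float]]) -> Tuple[float, float]:
    """Unweighted least-squares fit x = slope * z + intercept to (z, x) points."""
    n = len(points)
    sz = sum(z for z, _ in points)
    sx = sum(x for _, x in points)
    szz = sum(z * z for z, _ in points)
    szx = sum(z * x for z, x in points)
    denom = n * szz - sz * sz
    if n < 2 or denom == 0:
        raise ValueError("need at least two hits in different layers")
    slope = (n * szx - sz * sx) / denom
    intercept = (sx - slope * sz) / n
    return slope, intercept


if __name__ == "__main__":
    print(drift_distance_mm(250.0))
    # Hypothetical layer heights z (mm) and hit positions x (mm).
    print(fit_line([(0.0, 1.0), (13.0, 2.3), (26.0, 3.6)]))
```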

The time difference between the event trigger and the BTI internal clock is computed using a TDC channel with a 50ns jitter. This difference defines the synchronization time of any event and can be used to select the events finding their phase with respect to the BTI clock. The performance of the BTI as a function of the synchronization time is shown in Fig. 9.28 for the HTRG signals in the standard configuration and in the minimum acceptance configuration acting on the alignment tolerance. As expected the dependence of the fraction of HTRGs on the synchronization time is a periodic function.

This dependence is much more important for the minimal acceptance than for the standard acceptance, because of the stricter requirement on alignment of the hits.



**Fig. 9.28:** High Quality trigger time slot assignment as a function of the synchronization time.



**Fig. 9.29:** Correlation between BTI position parameter and track fitted position. All triggers in the right time slot were included.

We verified that the probability of missing an HTRG in the  $\pm 4\text{ns}$  range around the distribution center introduces only a 1% systematic effect in the evaluation of the HTRG fraction.

The analyzed sample satisfies the following criteria:

- 3- or 4-point fitted track, choosing the best $\chi^2$ track with $\text{Prob}(\chi^2) > 0.1\%$
- fitted position falling within acceptance of the installed BTIs
- synchronization time within  $\pm 4\text{ns}$  from the central value of the central time slot

#### Position Studies

The BTI provides a position measurement in a reference frame with the origin on its left side (see Fig. 9.7).

The BTI position can be converted into an absolute position: Fig. 9.29 shows the correlation between the BTI computed position and the fitted track position for triggers issued only in the right time slot.



**Fig. 9.30:** Position resolution for different trigger qualities.

The difference between the computed position and the fitted position is given in Fig. 9.30: the width of the distribution provides the BTI position resolution.

In the BTI configuration used, the internal calculations are done with a 0.7 mm least count and the trigger is output with a 1.4 mm least count. The distribution should therefore have a width $\sigma \sim 1.4 / \sqrt{12} \sim 0.4\text{ mm}$ in the hypothetical case that all the events fall in one bin.

The resolution exhibits a slightly larger width, but it agrees well with expectations: HTRGs should have a better resolution than LTRGs, because of the larger lever arm for the calculation.

#### Direction Studies

The performance check of the BTI angular calculations was done using cosmic rays. The BTI does not compute the track direction directly, but determines a k-parameter related to the angle of incidence. The correlation between the measured k-parameter and the fitted track direction is shown in Fig. 9.31. Although the angle itself is never used internally, for the sake of clarity it is better to compare the measured and fitted angles, converting the computed k-parameter into the angle $\psi$. Their difference is shown in Fig. 9.32 for different trigger qualities.

Once again the result matches expectations ($\sigma \sim 60 / \sqrt{12} \sim 17\text{ mrad} \sim 1\text{ degree}$).
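Both expected resolutions, for position (1.4 mm least count) and for angle (60 mrad least count), follow from the standard deviation of a uniform distribution of width equal to the least count, $w/\sqrt{12}$. The short check below reproduces the quoted numbers; the function name is illustrative.

```python
import math


def quantization_sigma(least_count: float) -> float:
    """Standard deviation of a uniform distribution one least count wide."""
    return least_count / math.sqrt(12.0)


if __name__ == "__main__":
    sigma_pos_mm = quantization_sigma(1.4)      # position output least count, mm
    sigma_ang_mrad = quantization_sigma(60.0)   # angular least count, mrad
    sigma_ang_deg = math.degrees(sigma_ang_mrad / 1000.0)
    print(round(sigma_pos_mm, 2), round(sigma_ang_mrad, 1), round(sigma_ang_deg, 2))
```

These are lower bounds set by the output granularity alone; the measured widths are expected to be somewhat larger, as noted for the position resolution.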

#### BTI Efficiency

The BTI triggers are assigned to 25 ns time slots. The time slot assignment determines the parent bunch crossing through the mean-timer principle. If the trigger is found in the expected time slot the bunch crossing is correctly identified, otherwise it is wrongly assigned. Hence the BTI efficiency is defined as the probability per event to have a trigger in the right time slot.
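The efficiency definition above translates directly into a counting rule. The sketch below assumes that, for each selected event, the set of time slots in which the BTI issued a trigger is available; the data layout and the function names are hypothetical.

```python
from typing import Sequence, Set


def bti_efficiency(trigger_slots: Sequence[Set[int]], expected_slot: int) -> float:
    """Fraction of events with a trigger in the expected 25 ns time slot."""
    if not trigger_slots:
        raise ValueError("no events selected")
    good = sum(1 for slots in trigger_slots if expected_slot in slots)
    return good / len(trigger_slots)


def out_of_time_probability(trigger_slots: Sequence[Set[int]], expected_slot: int) -> float:
    """Fraction of events with at least one trigger outside the expected slot."""
    if not trigger_slots:
        raise ValueError("no events selected")
    bad = sum(1 for slots in trigger_slots if any(s != expected_slot for s in slots))
    return bad / len(trigger_slots)


if __name__ == "__main__":
    # Four toy events; slot 0 is the expected one, -1 the slot just before it.
    events = [{0}, {0, -1}, {-1}, set()]
    print(bti_efficiency(events, 0), out_of_time_probability(events, 0))
```

An event can contribute to both quantities, since a BTI may issue a trigger in the right slot and a spurious one in a neighbouring slot.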



**Fig. 9.31:** Correlation between the computed k-parameter and the fitted angle.  
All triggers in the right time slot were included.



**Fig. 9.32:** Angular resolution for different trigger qualities.



**Fig. 9.33:** Efficiency of the BTI at normal incidence as a function of the crossing point inside the drift cell.

The uniformity of the BTI response along the cell is shown in Fig. 9.33: the BTI efficiency as a function of the particle impact point at normal incidence is flat. This fact implies that the drift cells design has indeed met the basic linearity requirements.

However, the space-time linearity required to assure a correct performance of the BTI is not achievable over the whole angular range. The BTI is still able to trigger correctly as long as the deviation from linearity is below 25 ns, but at very large angles the deviations are much bigger.

Fig. 9.34 and Fig. 9.35 show respectively the behaviour of the BTI trigger efficiency and the probability of out of time triggers in selected  $5^\circ$  angular intervals, separately for the full sample and the four hits sample.

The BTI response is almost flat up to about an angle of incidence of  $30^\circ$ . Beyond this value the BTI starts to fail in finding alignments of four hits at the right time and simultaneously the probability of finding an out of time HTRG increases. The failure is dramatic for the full sample and rather more limited for the sample of fitted four hits tracks. This fact confirms that the HTRG



**Fig. 9.34:** Relative fraction of HTRGs (full symbols) and LTRGs (open symbols) in the BTI as a function of the muon angle of incidence. The circles refer to the subsample of fitted four points tracks, while the squares refer to the full selected sample. The efficiency of the BTI is given by the sum of HTRGs and LTRGs and is 100% everywhere except at normal incidence due to the presence of the walls of the drift tubes.

loss is not due to errors in the BTI design but to the unavoidable linearity degradation of the cell with increasing incidence angle.

The out-of-time triggers are mostly issued in the time slot just before the right one. Therefore, when the LTS mechanism is activated, a fraction of the good LTRGs will be cancelled. Unfortunately LTRGs will be dominant at very large angles. It is nevertheless important to recall that the chamber was the first full-size prototype, and improvements in its performance should be expected in the final version.

Another interesting check is the efficiency in the different configurations that were used for the BTI. The results for normally incident muons are summarized in Table 9.5. The relative fraction of LTRGs and HTRGs is consistent with the previously measured probability of $\delta$-ray generation in the interaction of the muon with matter [9.4].

It is clear that the BTI inefficiency is negligible, but it is also evident that the LTS mechanism, activated for all configurations except the first one, is rejecting a small amount of good triggers.

Summing the HTRG and LTRG fractions, the overall trigger rate is roughly constant in all cases. Therefore in the minimum acceptance configuration there is just a different population, with HTRGs becoming LTRGs, rather than the creation of inefficiencies.



**Fig. 9.35:** Probability per event of out of time HTRGs in the same BTI at different time slots as a function of the muon angle of incidence.

The result for the maximum acceptance configuration shows that the acceptance requirement of the standard configuration is sensible, since no real efficiency gain is obtained by relaxing the alignment tolerance.

### BTI Efficiency in Magnetic Field

Data were taken in magnetic field in several configurations [9.14]. Unfortunately for most of these configurations the BTI prototype data were not available due to system faults, but the drift-times were correctly recorded and therefore the analysis of the trigger performance using the full scale BTI model could be successfully performed. Thus, in this section, we will only quote the results of the software model.

**Table 9.5:** Efficiency figures for the tested configurations.

| BTI acceptance | LTS | HTRG fraction | LTRG fraction | Inefficiency |
|----------------|-----|---------------|---------------|--------------|
| Standard       | off | 84.0%         | 15.6%         | 0.3%         |
| Standard       | on  | 85.1%         | 13.6%         | 1.3%         |
| Minimum        | on  | 70.7%         | 28.2%         | 1.1%         |
| Maximum        | on  | 84.8%         | 13.8%         | 1.4%         |
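The statement that the overall trigger rate is roughly constant can be verified directly from the entries of Table 9.5. The short sketch below simply sums the HTRG and LTRG fractions for each row; the data structure and function name are illustrative.

```python
# Rows of Table 9.5: (acceptance, LTS, HTRG %, LTRG %, inefficiency %)
table_9_5 = [
    ("Standard", "off", 84.0, 15.6, 0.3),
    ("Standard", "on", 85.1, 13.6, 1.3),
    ("Minimum", "on", 70.7, 28.2, 1.1),
    ("Maximum", "on", 84.8, 13.8, 1.4),
]


def total_trigger_fraction(row):
    """Sum of the HTRG and LTRG fractions of one table row, in percent."""
    return row[2] + row[3]


for row in table_9_5:
    total = total_trigger_fraction(row)
    print(f"{row[0]:8s} LTS {row[1]:3s}: {total:.1f}% of events triggered")
```

The totals lie between 98.6% and 99.6%, so the spread between configurations is about one percent, while the HTRG fraction alone changes by more than fourteen percentage points in the minimum acceptance configuration.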



**Fig. 9.36:** Definition of the magnetic field components.

The magnetic field components are defined in Fig. 9.36 with respect to the chamber layout:  $B_n$  is the component perpendicular to the chamber,  $B_w$  is the component parallel to the wires and  $B_E$  is the component parallel to the electric field.

The global effect of a magnetic field is an elongation of the electron drift path to the anode, resulting in a longer maximum drift time. It also introduces deviations from linearity of the space-time relationship, that are quite important for  $B_w \neq 0$ .

Data were taken with the chamber normal to the beam for several values of $B_n$, $B_w$ or $B_E$ separately, keeping the other field components at zero.

When the chamber was inclined in the magnetic field the situation was more complex: in the case of vertical wires the magnetic field had the two components  $B_w = B \sin \psi$  and  $B_n = B \cos \psi$ , while in the case of horizontal wires the two components were  $B_E = B \sin \psi$  and  $B_n = B \cos \psi$ .
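The decomposition quoted above can be sketched as a small helper. The function name, the argument names and the example field value are assumptions introduced for illustration; only the two trigonometric relations come from the text.

```python
import math


def field_components(b_tesla, psi_deg, wires):
    """Return (B_n, B_w, B_E) for a chamber inclined by psi in a field of magnitude B.

    wires: "vertical"   -> the component B*sin(psi) lies along the wires (B_w)
           "horizontal" -> the component B*sin(psi) lies along the electric field (B_E)
    """
    psi = math.radians(psi_deg)
    b_n = b_tesla * math.cos(psi)
    b_sin = b_tesla * math.sin(psi)
    if wires == "vertical":
        return b_n, b_sin, 0.0
    if wires == "horizontal":
        return b_n, 0.0, b_sin
    raise ValueError("wires must be 'vertical' or 'horizontal'")


# Hypothetical example: a 1 T field with the chamber inclined by 30 degrees.
print(field_components(1.0, 30.0, "vertical"))
print(field_components(1.0, 30.0, "horizontal"))
```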

The trigger efficiencies for the situation  $B_w \neq 0$ ,  $B_E = B_n = 0$  are shown in Fig. 9.37 for several track inclinations.

The efficiency shows a marked dependence on the field sign for inclined muons. This effect is explained by the distortion of the drift lines introduced by the magnetic field. The major effect is the tilting of the drift lines by the Lorentz angle in such a way that they no longer appear



**Fig. 9.37:** Efficiency of the BTI software model as a function of  $B_w$  for several track inclinations. The lines are drawn just to guide the eye.

**Table 9.6:** Efficiency of BTI model with  $B_E \neq 0$  at normal incidence.

| $B_E(T)$   | 0.0   | 0.5   | 1.5   |
|------------|-------|-------|-------|
| $B_n(T)$   | 0.0   | 0.0   | 0.0   |
| $B_w(T)$   | 0.0   | 0.0   | 0.0   |
| Efficiency | 0.943 | 0.936 | 0.887 |

**Table 9.7:** Efficiency of BTI model with  $B_n \neq 0$  at normal incidence.

| $B_E(T)$   | 0.0   | 0.0   | 0.0   |
|------------|-------|-------|-------|
| $B_n(T)$   | 0.0   | 0.5   | 1.0   |
| $B_w(T)$   | 0.0   | 0.0   | 0.0   |
| Efficiency | 0.943 | 0.946 | 0.483 |

normal to the beam at  $\psi = 0^\circ$ . Therefore the beam inclination with respect to the drift lines for one field sign approaches the normal and for the other sign is larger than the chamber nominal inclination. Of course the other sign of the chamber angle with respect to the muon beam direction should cause the symmetric behaviour.

The efficiencies measured for the simple magnetic field configurations are reported in Table 9.6 and Table 9.7, while for the inclined beam situations the data are reported in Ref. [9.14].

In CMS the BTI is expected to work on chambers in which the field will generally be well below 0.2 T. Only at some corner chambers is the field expected to be highly inhomogeneous, with components $B_n$ varying from 0 T to 0.8 T and $B_w$ or $B_E$ varying from 0 T to 0.3 T.

Looking at the results obtained, we see that the effect is negligible for a field with components $B_n < 0.5$ T and $B_w$ or $B_E < 0.2$ T. The only CMS region where the magnetic field exceeds these values is the far corner of the first muon station. Since this region is fully covered by the forward chambers we do not expect any trigger loss.

### Performance of the BTI in the $\theta$ view

The $\theta$ view SL looks at the tracks in the non-bending plane projection. Therefore the muon tracks will be distributed about the line connecting to the production vertex, due to the effect of multiple scattering. Each BTI in this view will see tracks coming from the vertex at a fixed angle that depends on its geometrical position. The BTIs will therefore be programmed to forward only tracks reconstructed within a predefined angular acceptance.



**Fig. 9.38:** Efficiency of the  $\theta$  view trigger as a function of pseudorapidity for a sample of muons of  $17\text{ GeV}/c < p_T < 20\text{ GeV}/c$

The BTIs close to the forward walls will experience the worst conditions because of the stray magnetic field; the first corner station in particular will be the most affected.

All these effects were included in the simulation program, and Fig. 9.38 shows the efficiency of the $\theta$ view BTI in the first corner station. The effect of the linearity degradation at large angle combined with the magnetic field distortion is evident. A reasonable global efficiency is nevertheless reached, although at the far end the fraction of HTRGs is quite poor.

### 9.10.2 Track Correlator and Trigger Server

The TRACO and TS devices interact at several stages, making selections on track quality and basing the choice on the sorting mechanisms of both devices. In particular, the TRACO uses ghost suppression algorithms based on TS $\theta$ results.

It is quite difficult to isolate the contribution of a single device, since the settings chosen for one always have at least a small effect on the data flow through the other. The simulation results [9.6] are therefore presented at the TS output for particular settings of both devices and must be considered as global results, unless explicitly specified.

However, it should be kept in mind that the TRACO and the TS have very different impacts on the quoted performance: while the contributions to single muon efficiency and noise are more or less equally divided, resolution and acceptance cuts are mainly connected to the TRACO algorithms, and dimuon efficiency is more related to the TS design.

### Efficiency

The efficiency of the TRACO-TS system to find a track candidate when a single muon track traverses the detector was studied using two samples:

- the first sample consisted of 10000 single muons of  $p_T = 100 \text{ GeV}/c$ , generated inside the iron in front of one correlator with incident angle  $-60^\circ \leq \psi \leq 60^\circ$



**Fig. 9.39:** Difference between the generated and the computed angle for different quality of the triggers. Notice the different scale for the histograms at right bx and the histograms at wrong bx.



**Fig. 9.40:** TRACO efficiency as a function of the incident angle (Notice that the abscissa scale is not linear).

- the second sample consisted of 100000 single muons generated from the interaction vertex with $3.5 \text{ GeV}/c \leq p_T \leq 300 \text{ GeV}/c$.

Fig. 9.39 shows the difference between the computed incidence angle and the actual incidence angle of the muon for different track quality and, only for the uncorrelated triggers, for cases where the determined bunch crossing is the correct or the wrong one.

It is evident from the distributions that the resolution of uncorrelated triggers is worse than that of correlated triggers. In addition, the angle calculated for uncorrelated LTRGs that misidentify the bunch crossing covers a wide range.

Fig. 9.40 shows the efficiency of the TRACO as a function of the angle of incidence: the TRACO has a flat probability ( $\sim 80\%$ ) to correlate within its  $45^\circ$  acceptance range, while outside this range only uncorrelated triggers are available. We do not expect anyway to trigger on tracks with angle larger than  $45^\circ$ .

The performance as a function of the muon momentum was evaluated at the Trigger Server output for three major configurations of the TRACO, which differ in the acceptance of uncorrelated LTRGs.

These configurations were:

- *L accepted on any θ trigger*: uncorrelated LTRGs are accepted only if they are confirmed by a BTI trigger of any quality in the  $\theta$  view. This is supposed to be the standard configuration for data taking
- *L accepted on HTRGθ trigger*: uncorrelated LTRGs are accepted only if they are confirmed by a BTI trigger of HTRG quality in the  $\theta$  view



**Fig. 9.41:** Local drift tubes trigger efficiency as a function of transverse momentum for the major configurations.

- *L not accepted*: uncorrelated LTRGs are not accepted
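The three configurations listed above amount to a simple acceptance rule for uncorrelated LTRGs. The sketch below expresses that rule in code; the enum, the function name and the string encoding of the $\theta$ view trigger quality are hypothetical and are not taken from the TRACO design.

```python
from enum import Enum


class LAcceptance(Enum):
    ANY_THETA = "L accepted on any theta trigger"    # standard configuration
    HTRG_THETA = "L accepted on HTRG theta trigger"
    NONE = "L not accepted"


def accept_uncorrelated_ltrg(config, theta_quality):
    """Decide whether an uncorrelated LTRG is kept.

    theta_quality: "H" or "L" for the quality of the confirming theta-view
    BTI trigger, or None when no theta-view trigger is present.
    """
    if config is LAcceptance.ANY_THETA:
        return theta_quality in ("H", "L")
    if config is LAcceptance.HTRG_THETA:
        return theta_quality == "H"
    return False  # LAcceptance.NONE: uncorrelated LTRGs are always rejected


for config in LAcceptance:
    decisions = {q: accept_uncorrelated_ltrg(config, q) for q in ("H", "L", None)}
    print(config.name, decisions)
```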

The performance of the four muon stations, once corrected for station acceptance, is the same: therefore we give results only for station 1.

Fig. 9.41 shows the efficiency, corrected for the geometrical acceptance of the muon chambers, for the three selected conditions. Before activating the complete suppression of uncorrelated LTRGs there is clearly room for an intermediate step that causes a negligible efficiency loss. But even if we are forced to reject uncorrelated LTRGs, the efficiency remains in a sensible range.

In the standard configuration the fraction of events giving triggers only at the correct bunch crossing is slowly decreasing from  $\sim 66\%$  at 5 GeV/c to  $\sim 60\%$  at 300 GeV/c, while events not triggering at the right bunch crossing are in fact giving triggers only at wrong bunch crossings.

The relative fraction of triggers at the correct bunch crossing divided by track quality is shown in Fig. 9.42 in the standard configuration, showing a roughly constant value.

Since the sample consisted of single muons, the noise markers are the fraction of two tracks per event selected by the TRACO and the quantity of out of time triggers.

Fig. 9.43 reports the fraction of Second Tracks generated after applying the standard TRACO/TS selection mechanisms and Fig. 9.44 shows the average fraction per generated event of out of time triggers.

Therefore the uncorrelated LTRG suppression, despite generating a large inefficiency, has a limited impact on the noise reduction.



**Fig. 9.42:** Relative fraction of triggers divided by quality as a function of transverse momentum.



**Fig. 9.43:** Fraction of Second Tracks per event as a function of transverse momentum for the major configurations.



**Fig. 9.44:** Fraction of out of time triggers as a function of transverse momentum for the major configurations

Since the  $\theta$  view BTI efficiency is decreasing with pseudorapidity we could expect effects on the performance of the local trigger. The  $\theta$  view filter is anyway acting only on uncorrelated LTRGs and the impact is quite low, as can be deduced from Fig. 9.45.



**Fig. 9.45:** Efficiency of the drift tubes local trigger as a function of pseudorapidity.

## 9.11 Noise Reduction

The design of the trigger devices was done with the purpose of providing a robust and efficient system. Unfortunately the way to meet these requirements introduces a certain number of redundancies in the system causing an important fraction of false or duplicated triggers [9.3] [9.6].

### 9.11.1 Noise Generation Mechanisms

The BTI trigger algorithm can in principle work with only three layers of staggered tubes. The drawback of a three-layer algorithm is that an inefficiency or a bad measurement in any of the cells becomes an inefficiency or a wrong trigger. Introducing a fourth layer, with the minimal requirement of an alignment of three out of four hits, maximizes the efficiency and minimizes the wrong measurements. However, some spurious alignments of three hits can occur at any bunch crossing, depending on the actual track crossing position and direction.
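The efficiency gain from the fourth layer can be illustrated with a simple binomial estimate. The sketch below assumes independent cells with a common single-cell efficiency $\varepsilon$, which is a simplification, and the efficiency values used in the loop are hypothetical, not measured numbers.

```python
def eff_three_of_three(eps):
    """Probability that all three layers give a good hit."""
    return eps ** 3


def eff_at_least_three_of_four(eps):
    """Probability that at least three of four layers give a good hit."""
    # All four good, plus the four ways of having exactly one bad layer.
    return eps ** 4 + 4 * eps ** 3 * (1 - eps)


# Hypothetical single-cell efficiencies, for illustration only.
for eps in (0.90, 0.95, 0.99):
    print(
        f"eps = {eps:.2f}: "
        f"3/3 -> {eff_three_of_three(eps):.4f}, "
        f">=3/4 -> {eff_at_least_three_of_four(eps):.4f}"
    )
```

With these assumptions a 95% single-cell efficiency gives about 86% for a three-layer requirement and about 99% for a three-out-of-four requirement, which shows why the redundancy is worth the additional spurious alignments discussed in this section.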

Most of the bad alignments are generated by the unavoidable left-right ambiguity, even at a distance of several bunch crossings from the alignment of the four hits.

An example of the mechanism is shown in Fig. 9.46: a real track orthogonal to the chamber is displayed and the hit positions are marked with small circles on the track line. The BTI, looking for alignments of at least three hits, is able to find the alignment corresponding to the real



**Fig. 9.46:** Illustration of the ghost generation mechanism inside BTI.



**Fig. 9.47:** Illustration of the irreducible redundancies between overlapped BTIs.

track, but two other tracks are also detected. These tracks, called ghost tracks, correspond to alignments of a mixture of real hits and their mirror images. Indeed the BTI, supposing that wire 2 is inefficient and that the signal of wire 4 comes from the right side of the tube, finds a false alignment at time $\Delta t_1$ after the right bunch crossing. In the same way, supposing that wire 5 is inefficient, the BTI finds another ghost track, formed from the signals of wires 2 and 4 and the mirror image of the signal from wire 3, at time $\Delta t_2$ before the right bunch crossing.

This effect occurs inside the BTI at different bunch crossings and therefore generates temporal noise: we call the ghosts generated by this mechanism *noise of type I*.

Furthermore, in order to be fully efficient, the trigger system provides some overlap between adjacent devices: each BTI overlaps its neighbours by five cells, and BTIs in the outer Superlayer are always assigned to three consecutive TRACOs.

The overlap between BTIs is foreseen in order to minimize the impact of the loss of one device on the trigger efficiency, since the remaining one can be programmed to partially cover the dead area switching on some redundant patterns.

It is not possible to define a set of completely non-redundant patterns and therefore some of them are available in two consecutive BTIs: in fact there are five redundant patterns generating LTRGs on the devices close to the one generating the HTRG at the same bunch crossing.

In Fig. 9.47 we see a case where a valid HTRG pattern in one BTI is also seen as a valid LTRG pattern in the adjacent one.

Therefore the TRACO will have the chance to make a choice between candidate tracks in adjacent BTIs that are images of the same track, carrying exactly the same information. This is



**Fig. 9.48:** Illustration of the double track selection due to overlapped TRACOs. The solid lines are the acceptance window of the $i$-th TRACO and the dashed lines are the acceptance window of the $(i+1)$-th TRACO. The diagram on the right draws the acceptance windows on the same origin to show their intersection (shaded region). A muon falling in this intersection region is assigned to both TRACOs.

not a problem for the First Track sorting, since both are equivalent, but the TRACO may end up forwarding the same track twice, with a chance of losing other available candidates. This is a duplication of the trigger at the same bunch crossing and generates spatial false triggers called *noise of type II*.

There is another situation (*noise of type III*) that generates spatial noise. The BTIs in the outer sorter are assigned to three consecutive TRACOs, belonging to the left, the central or the right group of the outer SL. The BTI data are sent to each TRACO through a dedicated port. Each port is programmed with a different angular acceptance window, depending on whether it communicates with the left, central or right group. These tolerance windows partially overlap: therefore a candidate track falling in the intersection zone is forwarded from the BTI to more than one TRACO, as shown in Fig. 9.48. Hence, as in the case of adjacent BTIs, adjacent TRACOs can forward the same track twice to the Trigger Server. Again this may introduce a bias in the Second Track selection at the TS level.
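The duplication caused by overlapping port windows can be sketched as follows. The window limits in this example are purely hypothetical placeholders chosen so that the windows overlap; the real programmed values are not given in this section, and the names are illustrative.

```python
# Hypothetical angular acceptance windows (degrees) of the three BTI output
# ports. The values are placeholders and NOT the programmed CMS settings.
PORT_WINDOWS = {
    "left": (-45.0, 10.0),
    "central": (-20.0, 20.0),
    "right": (-10.0, 45.0),
}


def receiving_ports(angle_deg, windows=PORT_WINDOWS):
    """Ports (and hence TRACOs) that receive a candidate with the given angle."""
    return [name for name, (lo, hi) in windows.items() if lo <= angle_deg <= hi]


# A track inside the overlap region is forwarded to more than one TRACO
# (type III noise); a track outside the overlap reaches a single TRACO.
print(receiving_ports(0.0))
print(receiving_ports(30.0))
```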

### 9.11.2 Noise Reduction Methods

We have seen that there is a temporal noise, due to left-right ambiguity (*type I*), that generates ghost tracks at wrong bunch crossings and spatial noise caused either by redundancy of the BTI equations (*type II*) or by the overlap of the BTI acceptance ports (*type III*), generating copies of the same track.

Some filters have been provided to reduce the overall importance of these effects.



**Fig. 9.49:** Time distribution of BTI output triggers, for different trigger qualities and configurations.

The BTI can only act on temporal noise. The mechanism provided to perform this task is the Low Trigger Suppression (LTS) algorithm: the LTRGs occurring in the range ($-1$ bx, $+8$ bx) around a HTRG are cancelled.
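A minimal software sketch of this suppression rule is given below. It assumes that the window limits are inclusive and that triggers are represented as (bunch crossing, quality) pairs; both choices, like the function name and the toy data, are assumptions made for illustration and do not describe the hardware implementation.

```python
def low_trigger_suppression(triggers, before=1, after=8):
    """Cancel LTRGs lying within [-before, +after] bunch crossings of any HTRG.

    triggers: list of (bx, quality) tuples, with quality "H" or "L".
    Returns the surviving triggers in their original order.
    """
    htrg_bxs = [bx for bx, quality in triggers if quality == "H"]
    kept = []
    for bx, quality in triggers:
        near_htrg = any(-before <= bx - h <= after for h in htrg_bxs)
        if quality == "L" and near_htrg:
            continue  # suppressed by a nearby HTRG
        kept.append((bx, quality))
    return kept


# Hypothetical toy sequence with one HTRG at bx 10.
triggers = [(9, "L"), (10, "H"), (12, "L"), (18, "L"), (19, "L"), (7, "L")]
print(low_trigger_suppression(triggers))
```

In this toy sequence the LTRGs at bx 9, 12 and 18 fall inside the window around the HTRG and are removed, while those at bx 7 and 19 survive.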

Fig. 9.49 shows the time distribution of BTI triggers for HTRGs and LTRGs without filtering and for LTRGs, once the Low Trigger Suppression is activated.

There is a low probability to issue an HTRG at the wrong time slot, while there is a large number of wrong LTRGs. The average fraction of wrong triggers per event in the different configurations is given in Table 9.8. The noise reduction caused by the LTS mechanism is evident.

Most of the wrong HTRGs are due to noise hits: one of the drift times is different than expected and the alignment occurs around the right angle and position, but at a wrong time, and only a LTRG is issued at the right time. This effect is causing a small efficiency drop when the LTS mechanism is activated, since the right LTRG is cancelled by the nearby wrong HTRG.

Fig. 9.50 shows the time distribution of the triggers after TS processing in the standard configuration, separately for each trigger quality. Correlated triggers containing at least one HTRG segment are quite clean and single uncorrelated HTRGs are reasonably clean. The bunch crossing identification is instead bad for correlated LL and single uncorrelated LTRGs.

Clearly some noise filtering must be provided to reduce this kind of trigger to an acceptable level. Some of the implemented filters are optional, since the importance of the reduction will only become clear in the field.

**Table 9.8:** Average fraction of out of time triggers in the BTI at normal incidence.

| BTI acceptance | LTS | %H out of time | %L out of time |
|----------------|-----|----------------|----------------|
| Standard       | off | 3.0%           | 351.2%         |
| Standard       | on  | 3.1%           | 148.2%         |
| Minimum        | on  | 1.1%           | 175.6%         |
| Maximum        | on  | 4.1%           | 165.3%         |
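The size of the LTS effect can be read off Table 9.8 with a short calculation. The sketch below uses only the two standard-acceptance entries for out-of-time LTRGs; note that fractions above 100% simply mean more than one wrong LTRG per event on average.

```python
# Out-of-time LTRG fractions from Table 9.8, standard BTI acceptance (percent).
l_out_of_time_pct = {"LTS off": 351.2, "LTS on": 148.2}

# Average number of wrong LTRGs per event.
per_event_off = l_out_of_time_pct["LTS off"] / 100.0
per_event_on = l_out_of_time_pct["LTS on"] / 100.0

# Relative reduction obtained by switching the LTS on.
reduction = 1.0 - per_event_on / per_event_off

print(f"wrong LTRGs per event: {per_event_off:.2f} -> {per_event_on:.2f}")
print(f"relative reduction: {100 * reduction:.1f}%")
```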



**Fig. 9.50:** Time distribution of TRACO triggers, divided by trigger quality.

In order to reduce the *type I noise* we introduced a LTS mechanism inside the TRACO: the low quality tracks ($L_L, L_o, L_i$) are cancelled if a HTRG occurred within the neighbouring bunch crossings. It is possible to suppress triggers at bx from +1 to -4 with respect to any HTRG without adding any latency.

*Noise of type II and III* can affect only the Second Track selection. Sending the same track twice can be avoided using a geometrical suppression filter. If a HTRG was selected in the First Track sorting operation, all the LTRGs in the neighbouring BTIs are removed from the Second Track sorting list. This filter, always active, acts on *type II noise*. A similar procedure, inhibiting Second Track $L_o$ selection, can be applied to neighbouring TRACOs inside the chamber Trigger Server to remove *type III noise*.
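The geometrical suppression for *type II noise* can be sketched in a few lines. This is an illustration under stated assumptions: candidates are represented as (BTI index, quality) pairs, each BTI contributes at most one candidate, and "neighbouring" is taken to mean a BTI index differing by exactly one. None of these choices is specified in the text.

```python
def second_track_candidates(candidates, first_track):
    """Build the Second Track sorting list after the First Track is chosen.

    candidates:  list of (bti_index, quality) tuples, quality "H" or "L".
    first_track: the (bti_index, quality) tuple selected as First Track.
    """
    first_bti, first_quality = first_track
    remaining = []
    for bti, quality in candidates:
        if (bti, quality) == first_track:
            continue  # already forwarded as First Track
        is_neighbour = abs(bti - first_bti) == 1  # assumed definition
        if first_quality == "H" and quality == "L" and is_neighbour:
            continue  # likely a copy of the same track in an adjacent BTI
        remaining.append((bti, quality))
    return remaining


# Hypothetical toy list: an HTRG in BTI 4 with LTRG copies in BTIs 3 and 5,
# plus an unrelated LTRG in BTI 9.
candidates = [(4, "H"), (3, "L"), (5, "L"), (9, "L")]
print(second_track_candidates(candidates, first_track=(4, "H")))
```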

The effect of the application of the filters for *type I and II noise* is shown in Table 9.9, compared to a sample of muons at 100 GeV/c in the standard TRACO configuration: the LTS acts on out of time triggers, while the LTRGs suppression on adjacent BTIs acts on the two tracks fraction.

There is another possible cut to be applied to clean the TRACO output: a programmable tolerance window is implemented for the bending angle. The bending angle for some low

**Table 9.9:** Effect of type I and type II noise filters. The type III filter is always active.

|                                     | Standard | Type I filter on | Type II filter off |
|-------------------------------------|----------|------------------|--------------------|
| Two tracks fraction                 | 6.5%     | -                | 19.4%              |
| Out of time uncorrelated L fraction | 53.5%    | 38.7%            | -                  |



**Fig. 9.51:** Average bending angle at the different muon measurement stations for some low transverse momentum muons

momentum muons at all the measurement stations is shown in Fig. 9.51. Indeed there is a large spread of the average bending angle values at stations 1 and 2, while the bending angle is close to zero at stations 3 and 4.

We cannot safely apply any cut in the first and second station, while a tolerance on the bending angle can be used for station 3 and/or 4.

Table 9.10 shows the effect of the bending angle cut on efficiency and noise (i.e. the fraction of out-of-time triggers) for $p_T = 8$ GeV/c muon tracks. The cut is an 8-bit value to be downloaded to the TRACO.

**Table 9.10:** Efficiency and noise for different bending angle acceptance windows in the outer stations.

| $\Phi_b$ cut (degrees) | Station 3 efficiency (%) | Station 3 noise (%) | Station 4 efficiency (%) | Station 4 noise (%) |
|------------------------|--------------------------|---------------------|--------------------------|---------------------|
| 51.6                      | 98.0           | 98.0      | 98.0           | 92.6      |
| 43.5                      | 97.3           | 78.0      | 97.2           | 75.0      |
| 32.3                      | 96.7           | 65.0      | 96.8           | 65.0      |
| 17.5                      | 96.4           | 48.9      | 95.1           | 50.7      |
| 9.0                       | 94.7           | 34.2      | 76.2           | 30.3      |
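The trade-off in Table 9.10 can be explored by selecting, for each station, the tightest tabulated cut that still satisfies a minimum efficiency. The 96% requirement used below is a hypothetical example, not a design value, and the function name is illustrative.

```python
# Table 9.10: bending angle cut (degrees) -> (efficiency %, noise %) per station.
table_9_10 = {
    3: {51.6: (98.0, 98.0), 43.5: (97.3, 78.0), 32.3: (96.7, 65.0),
        17.5: (96.4, 48.9), 9.0: (94.7, 34.2)},
    4: {51.6: (98.0, 92.6), 43.5: (97.2, 75.0), 32.3: (96.8, 65.0),
        17.5: (95.1, 50.7), 9.0: (76.2, 30.3)},
}


def tightest_cut(station, min_efficiency):
    """Smallest tabulated cut whose efficiency is at least min_efficiency."""
    allowed = [cut for cut, (eff, _noise) in table_9_10[station].items()
               if eff >= min_efficiency]
    return min(allowed) if allowed else None


# Hypothetical requirement: keep at least 96% efficiency.
for station in (3, 4):
    cut = tightest_cut(station, min_efficiency=96.0)
    eff, noise = table_9_10[station][cut]
    print(f"station {station}: cut {cut} deg, efficiency {eff}%, noise {noise}%")
```

Under this example requirement station 3 tolerates a tighter window (17.5 degrees) than station 4 (32.3 degrees), reflecting the faster efficiency drop at station 4 in the table.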



**Fig. 9.52:** (a) Efficiency of the TS to deliver two track segments in a muon station at the correct bx, when two 100 GeV muons crossed the station, as a function of the distance between the two muons. (b) Probability of the correct identification of both muons by the TS system, as a function of their distance.

### 9.11.3 Dimuons Detection Efficiency

The noise filter algorithms described above were developed to reduce the rate of double tracks at the output of the drift-tube local trigger for single incident tracks, and therefore they only slightly affect the single muon trigger efficiency. On the other hand, the noise suppression is expected to have a larger impact on the efficiency to correctly detect muon pairs when both muons cross the same station, since at most two track segments are delivered at the output of the TS system.

Ideally the trigger system should maximize the efficiency to detect muon pairs and the purity altogether, i.e. the two output track segments should correspond to the two different muons.

The performance of the system on dimuons was studied at the output of the TS with the default algorithm (i.e. with *type I, II and III noise filters on*), for muon pairs crossing a single muon station. Fig. 9.52(a) shows the efficiency of the TS to deliver two track segments in a muon station at the correct bx, when two 100 GeV muons crossed the station, as a function of the distance between the two muons. The small drop in efficiency occurs when the two tracks are closer than the distance of two physical TRACO units, and is due to the noise suppression algorithms, which in this case can reject good track candidates.

In Fig. 9.52(b) the probability of correct identification of both muons by the TS system is shown as a function of their distance. The purity is maximal when the two tracks are separated by more than two TRACOs. The drop of the probability of correct identification at small track separation is due to the noise, since a ghost track can be selected as the second track candidate.

## 9.12 Radiation Tests

Since the barrel muon chambers are shielded by the iron yoke against charged low energy particles, the background flux will be dominated by neutrons. These are produced all around the cavern by interactions of the beam halo with the devices on the LHC beam line and with the detector itself, and they thermalize by interacting with the cavern.



**Fig. 9.53:** Expected neutral particle fluence through the innermost (MB1) and the outermost (MB4) muon barrel stations.

Extensive simulation studies [9.21] were done to estimate the rate of background particles at all positions inside CMS. The results of the simulation are shown in Fig. 9.53, where we immediately see that the flux at the barrel muon chamber positions is quite low compared to the other detectors. The energy spectra of neutrons and gammas were determined. The $\gamma$-ray background is about the same in all the stations, while the neutron background decreases linearly with energy and ends naturally around 100 MeV in the outermost station and at a few hundred MeV in the innermost one, which suffers from high energy neutrons leaking through the CMS calorimeters and coil.

Preliminary tests were performed to find the effects of these backgrounds on the performance of the local drift tubes trigger.

### 9.12.1 Gamma Irradiation Studies

The high rate gamma background generates noise hits inside the chamber, disturbing the correct muon track measurement. The full size DTBX chamber prototype, equipped with BTIs, was installed in the CERN GIF test area, where a 15 Ci $^{137}\text{Cs}$ source radiating 0.66 MeV photons is installed.

**Table 9.11:** Comparison between BTI performance without radiation and maximum radiation at normal beam incidence.

|                                    | %HTRG | %LTRG | Inefficiency | %H out of time | %L out of time |
|------------------------------------|-------|-------|--------------|----------------|----------------|
| No Radiation                       | 84.0% | 15.6% | 0.3%         | 3.0%           | 350%           |
| Radiation<br>10 Hz/cm <sup>2</sup> | 83.0% | 16.6% | 0.5%         | 2.5%           | 800%           |

It was therefore possible to test the performance of the BTI in a radiation environment [9.15]. Data were taken using several filters in front of the source reaching a single hit rate of  $10 \text{ Hz/cm}^2$  on the chamber: this rate is the maximum expected one at LHC in the barrel muon detector due to radiation background.

The data were compared to those without radiation looking for differences in the efficiency, the noise probability and the position resolution. Results for efficiency and noise are collected in Table 9.11.

The position resolution is reported in Fig. 9.54. No significant effect is seen, apart from a large increase of out-of-time LTRGs and a slight deterioration of the LTRG position resolution. The noise filters provided in the trigger chain are expected to cope easily with this background.

Therefore the BTI sensitivity to random single hits is low and limited to the generation of LTRG noise. Many filtering algorithms are already provided in the drift tubes local trigger design to reduce this kind of background.



**Fig. 9.54:** BTI position resolution in a gamma radiation environment of  $10 \text{ Hz/cm}^2$  single hit rate.

### 9.12.2 Neutron Irradiation Studies

The radiation dose absorbed after ten years of operation at LHC is expected to be low enough to avoid significant permanent radiation damage. On the other hand, the electronics will not be accessible, since most of it is located within the cavern, mounted on the chambers.

We therefore expect that the reliability of these electronics will mostly depend on the probability of occurrence of rare Single Event Effects (SEE) induced by the interaction of ionizing particles with the silicon chips.

The most likely SEE is the Single Event Upset (SEU), which is detected as a modification of the memory state. All memory devices (SRAMs, DRAMs, FLASH memories, microprocessors, DSPs, logic programmable state machines, etc.) are subject to such rare events, which are caused by a large energy deposition inside a sensitive node of the device. Occasionally this energy deposit can cause a device latch-up: in this case the effect is destructive and the SEE cannot be recovered.

Low energy neutrons are copiously produced in nuclear laboratories by scattering accelerated protons or deuterons on targets of low atomic mass.

The INFN nuclear laboratory of Legnaro has a 7 MV Van de Graaff accelerator. We used the reaction ${}^9\text{Be}(\text{d},\text{n}){}^{10}\text{B}$ on a thick beryllium target to generate fast neutrons, while thermal neutrons were generated with the same reaction, moderating the neutrons by enclosing the Be target in a heavy water tank surrounded by very thick graphite walls. The tested devices were the first prototype of the detector control board, the readout front-end board, the front-end trigger device and a prototype trigger server board.

The boards were irradiated with thermal neutrons in four data taking periods. Since the neutron flux inside the graphite is modified by the inserted boards, we had to measure the actual neutron flux on each device by measuring the activation of Indium and Cadmium-Indium targets placed just in front of the integrated circuits.

The only device experiencing SEU was SRAM#1. Fig. 9.55 shows the plot of the SEU numbers versus the integrated neutron dose for all the test periods. The slope of the average line fitted in this plot is a measurement of the SEU cross section of the device. We can only quote a 90% confidence level upper limit of the SEU cross section for all the other tested integrated circuits. Results of the thermal neutron runs are summarized in Table 9.12. The error in SRAM#1 SEU cross section evaluation is essentially systematic: we quote the spread in the calculation between the four different data taking periods. The Mean Time Between Failures is computed for the whole barrel muon detector, considering the number of pieces of each chip used in the electronics layout.
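The two quantities discussed above, the cross section as a slope and the mean time between failures, can be sketched numerically. The example below is an assumption-laden illustration: it fits a straight line through the origin (the fit actually used for Fig. 9.55 is not specified here), it takes the failure rate of the whole detector as cross section times flux times number of devices, and all input numbers are hypothetical toy values rather than the measured data.

```python
def seu_cross_section(fluences, seu_counts):
    """Least-squares slope of a line through the origin: counts = sigma * fluence."""
    numerator = sum(f * n for f, n in zip(fluences, seu_counts))
    denominator = sum(f * f for f in fluences)
    return numerator / denominator


def mtbf_hours(sigma_cm2, flux_per_cm2_s, n_devices):
    """Mean time between upsets for n_devices identical chips in a uniform flux."""
    rate_per_second = sigma_cm2 * flux_per_cm2_s * n_devices
    return 1.0 / rate_per_second / 3600.0


# Hypothetical toy points (NOT the data of Fig. 9.55).
fluences = [1.0e10, 2.0e10, 4.0e10, 6.0e10]  # integrated fluence, n/cm^2
counts = [10, 20, 40, 60]                    # cumulative number of SEUs

sigma = seu_cross_section(fluences, counts)
print(f"fitted SEU cross section: {sigma:.2e} cm^2")

# Hypothetical flux and device count, chosen only to show the scaling.
print(f"MTBF: {mtbf_hours(sigma, 1.0e3, 1000):.2f} hours")
```

The mean time between failures scales inversely with the number of devices, which is why the same per-device cross section limit leads to different entries in the last column of Table 9.12.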

There is some evidence reported in the literature that the SEU cross section for fast neutrons depends on the neutron energy. The neutrons produced by deuteron interaction with a thick Beryllium target are not monochromatic, but measurements are nonetheless useful to give an indication of the existence of fast-neutron-induced SEUs. It is, however, quite difficult to obtain the absolute SEU cross section as a function of the neutron energy.

As expected after the thermal neutron test, we had a large number of SEU from SRAM#1. The final fast neutron dose was one order of magnitude larger than the thermal neutrons



**Fig. 9.55:** SEU progressive number on SRAM#1 versus integrated neutron flux in the thermal neutron run. The line is the measured SEU cross section evaluation.

**Table 9.12:** SEU cross sections and estimates of the mean time between failures in the full barrel muon detector due to thermal neutron interactions. The limits are 90% C.L.

| Component           | Total rate<br>$n/cm^2$ | Device SEU<br>cross section<br>$cm^2$ | Mean time between failures in<br>the full detector<br>hh:mm |
|---------------------|------------------------|---------------------------------------|-------------------------------------------------------------|
| LD Regulator        | $6.87 \times 10^{10}$  | $< 1.38 \times 10^{-10}$              | > 64:19                                                     |
| $\mu$ P             | $6.87 \times 10^{10}$  | $< 1.38 \times 10^{-10}$              | > 385:56                                                    |
| FLASH               | $6.87 \times 10^{10}$  | $< 1.38 \times 10^{-10}$              | > 385:56                                                    |
| SRAM#1              | $6.87 \times 10^{10}$  | $(1.13 \pm 0.2) \times 10^{-9}$       | 23:34                                                       |
| SRAM#2              | $6.87 \times 10^{10}$  | $< 1.38 \times 10^{-10}$              | > 192:58                                                    |
| EPROM               | $6.87 \times 10^{10}$  | $< 1.38 \times 10^{-10}$              | > 385:56                                                    |
| Optical transceiver | $6.87 \times 10^{10}$  | $< 1.38 \times 10^{-10}$              | > 385:56                                                    |
| ASIC TSS            | $2.36 \times 10^{10}$  | $< 1.38 \times 10^{-10}$              | > 33:09                                                     |
| BTI                 | $5.69 \times 10^{10}$  | $< 1.38 \times 10^{-10}$              | > 1:35                                                      |

**Table 9.13:** Fast neutrons induced SEU estimates and upper limits. The quoted cross section assumes that any neutron in the spectrum has equal probability to cause a SEU. The limits are 90% C.L.

| Component           | Total rate<br>$n/cm^2$ | Device SEU<br>cross section<br>$cm^2$ | Mean time between<br>failures in the full<br>detector<br>hh:mm |
|---------------------|------------------------|---------------------------------------|----------------------------------------------------------------|
| LD Regulator        | $9.69 \times 10^{11}$  | $< 9.79 \times 10^{-12}$              | > 907:42                                                       |
| $\mu$ P             | $9.71 \times 10^{11}$  | $< 9.77 \times 10^{-12}$              | > 5457:34                                                      |
| FLASH               | $9.28 \times 10^{11}$  | $< 1.02 \times 10^{-11}$              | > 5214:56                                                      |
| SRAM#1              | $5.74 \times 10^{11}$  | $(3.76 \pm 1.31) \times 10^{-10}$     | 70:54                                                          |
| SRAM#2              | $1.16 \times 10^{12}$  | $(2.29 \pm 1.55) \times 10^{-12}$     | 14561:46                                                       |
| EPROM               | $8.46 \times 10^{11}$  | $< 1.12 \times 10^{-11}$              | > 4753:40                                                      |
| Optical transceiver | $9.50 \times 10^{11}$  | $< 9.99 \times 10^{-12}$              | > 5336:52                                                      |
| ASIC TSS            | $1.44 \times 10^{12}$  | $< 6.61 \times 10^{-12}$              | > 2016:25                                                      |
| BTI                 | $1.03 \times 10^{12}$  | $< 9.18 \times 10^{-12}$              | > 29:02                                                        |

one: owing to this fact we could also observe SEUs on SRAM#2. Although this RAM is of the same type and belongs to the same lot, we obtained a SEU cross section two orders of magnitude lower than for SRAM#1 (4 SEU after $1.52 \times 10^{12}$ n/cm<sup>2</sup> against 135 SEU after $0.75 \times 10^{12}$ n/cm<sup>2</sup>). Thus we observed the effect, reported in the existing literature, of large variations between samples of the same device.

Results are collected in Table 9.13: the quoted numbers presume that neutrons of any energy have the same probability to cause a SEU, i.e. we allowed neither a threshold nor any energy dependence in the SEU cross section.

After the complete series of tests, each device had received a dose greater than $10^{12}$ n/cm<sup>2</sup>, equivalent to the expected dose after more than ten years of operation at LHC. We therefore checked the status of each device after irradiation, in order to see whether the neutrons had produced any permanent damage. The only device showing a measurable deterioration was the Trigger Server ASIC (TSS), whose standby current had increased by 10% with respect to the value measured before the tests. The final chip will be made in 0.5 $\mu$m technology and we expect a significant performance improvement. In addition, none of the devices underwent a latch-up SEE, but the test neutron energy (< 11 MeV) could be too low to release enough energy.

A more detailed analysis can be found in [9.22].

## 9.13 Status and Schedule

The drift tubes local trigger schedule is shown in Fig. 9.56.

The status of each important item is as follows:

- BTI: 200 prototypes of full performance chip fully tested on bench and on beam
- TRACO: 50 prototypes of full performance chip being tested; new submission planned for 10/00
- TSS: I prototype tested on bench; full performance chip being designed
- TSM: I prototype being designed
- BTIM: I and II prototypes tested; III prototype submitted
- CB: I and II prototypes tested; full performance board being tested
- PHITRB128: full performance board being tested
- PHITRB32: full performance board expected by 01/01
- THETATRB: full performance board under test
- SB: I prototype board being tested
- SCB: I and II prototypes of Trigger optical link tested; board expected by end 2001
- DTCM: I prototype expected by 03/01; full performance board expected by 12/01
- Minicrate: design and aluminium profile extrusion done

**Fig. 9.56:** Barrel muons local trigger schedule.

Responsibility sharing between involved institutes is given in Table 9.14.

**Table 9.14:** Drift tubes local trigger items responsibility

| Item      | Institute      |
|-----------|----------------|
| BTI       | Padova         |
| TRACO     | Padova         |
| TSS       | Bologna        |
| TSM       | Bologna        |
| BTIM      | Padova         |
| CB        | Padova         |
| PHITRB128 | Padova         |
| PHITRB32  | Padova         |
| THETATRB  | Padova         |
| SB        | Padova/Bologna |
| SCB       | Padova         |
| DTCM      | Padova         |
| Minicrate | Padova/CIEMAT  |

## References

- [9.1] M. De Giorgi et al, CMS TN/95-01.
- [9.2] M. De Giorgi et al, Proceedings of the First Workshop on Electronics for LHC experiments, CERN LHCC 95-96 (1995) 222.
- [9.3] M. De Giorgi et al, Proceedings of the Fourth Workshop on Electronics for LHC experiments, CERN LHCC 98-36 (1998) 285.
- [9.4] F. Gasparini et al, Nucl. Instr. and Meth A 336 (1993) 91.
- [9.5] L. Castellani et al., “BTI User Note”, <http://wwweda.pd.infn.it/~rmartin/dtbx/documents/BTI_ref.ps>.
- [9.6] R. Martinelli et al., CMS Note/99-007.
- [9.7] I. D’Antone et al, CMS TN/96-078.
- [9.8] M. Kloimwieder, “Improving the $\eta$-Assignment of the DTBX Based First Level Regional Muon Trigger”, CMS Note 1999/054.
- [9.9] CMS Note 2000/Draft, The Track-Sorter-Slave in the DTBX trigger.
- [9.10] CMS Note 2000/Draft, The Track-Sorter-Master in the DTBX trigger.
- [9.11] R. Cirio et al., CMS IN 2000/031.
- [9.12] A. Cavicchi et al, CMS TN 96-002.

- [9.13] M. De Giorgi et al, Proceedings of the Second Workshop on Electronics for LHC experiments, CERN LHCC 96-96 (1996) 22.
- [9.14] M. De Giorgi et al, Nucl. Instr. and Meth A 398 (1997) 203.
- [9.15] M. De Giorgi et al, Nucl. Instr. and Meth A 438 (1999) 302.
- [9.16] G.M. Dallavalle et al, Proceedings of the Fourth Workshop on Electronics for LHC experiments, CERN LHCC 98-36 (1998) 291.
- [9.17] G.M. Dallavalle et al, Proceedings of the Fourth Workshop on Electronics for LHC experiments, CERN LHCC 98-36 (1998) 294.
- [9.18] CMS IN 2000/Draft, Pattern Unit for high throughput device testing
- [9.19] C. Grandi, Proceedings of the International Conference on Computing in High Energy Physics and Nuclear Physics (CHEP2000), Padova, Italy, page 220.
- [9.20] M. Passaseo et al., Nucl. Instr. and Meth A 367 (1999) 418.
- [9.21] CMS, The Muon Project, Technical Design Report, CERN/LHCC 97-32.
- [9.22] S. Agosteo et al, CMS Note 2000/024



# 10 Drift Tube Track-Finder

## 10.1 Requirements

The task of the Drift Tube Track-Finder system (DTTF) is to find muon tracks in the barrel region originating from the interaction point and to measure their transverse momentum and their location in $\phi$ and $\eta$. Using the trigger primitive data delivered by the Drift Tube Chamber system (DTBX) and by the Cathode Strip Chamber system (CSC), the DTTF reconstructs muon candidates by joining together track segments caused by the same track. After the track-finding process, the DTTF system assigns a transverse momentum measurement, a $\phi$ and $\eta$ coordinate measurement and a quality word to the muon track candidates. Finally, the Track-Finder system selects the four highest transverse momentum tracks in the detector barrel and forwards them to the Global Muon Trigger system.

The performance of the DTTF system is directly related to the performance of the Muon Chambers and to the relative alignment of the chambers in the Muon Detector system. The requirements on the detector systems are reported in Section 9.1.

## 10.2 System Overview

The Track-Finder system is partitioned into 72 sectors, each of them covering $30^\circ$ in the $\phi$ angle. The segmentation along the $\eta$ coordinate follows the CMS Muon barrel detector partitioning in wheels. Each sector covers a solid angle containing four Muon Barrel stations, MB1, MB2, MB3 and MB4, and it is equipped with a Sector Processor board. The task of each Sector Processor is to find up to two muon track candidates inside its own sector, using the track segment data from the DTBX chambers. In order to handle the overlap region between the barrel and the endcaps of the Muon detector, the Sector Processors of the outermost wheels also receive track segment data from the Cathode Strip Chamber system. Fig. 10.1 shows the longitudinal view of the CMS muon system with the boundary division between the DTTF and the CSC Track-Finder, which identifies the barrel-endcap overlap region.

The Sector Processors are organized in twelve wedges along the  $\eta$  coordinate. The central wheel is logically split in two and equipped with two Sector Processors per sector: one Sector Processor covers the solid angle in the positive z-direction and the other one covers the solid angle in the negative direction. This partitioning is applied in order to reduce the number of interconnections and make the Sector Processor internal structure simpler. The logical layout is shown in Fig. 10.2.

Each wedge is equipped with a Wedge Sorter board, which collects the track candidates from the six Sector Processors in the wedge. Each Wedge Sorter applies a cancellation scheme in order to reduce the number of tracks found twice by neighbouring Sector Processors and to cancel out split tracks. Every Wedge Sorter then selects the two highest transverse momentum tracks among the remaining candidates. The Barrel Sorter board collects the muon candidates from the twelve Wedge Sorters, and performs a selection on the candidates in order to reduce the number of tracks split between two neighbouring wedges. Among all the surviving candidates, it selects the four muon



**Fig. 10.1:** Longitudinal view of the CMS muon system. The DTBX chamber naming scheme used in the Chapter (MB1 to MB4, ME1 from the CSC system) and the boundary division at  $|\eta|<1.04$  between the DTTF and the CSC Track-Finders are shown.

candidates with the highest transverse momentum and it forwards them to the Global Muon Trigger.
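The two-stage selection described above can be sketched in a few lines of Python. This is only an illustration of the sorting step: the `MuonCandidate` type and the function names are hypothetical, the candidates are ranked by their transverse-momentum code alone, and the cancellation of duplicated or split tracks that the Wedge Sorter and Barrel Sorter perform before sorting is deliberately left out.

```python
from heapq import nlargest
from typing import List, NamedTuple


class MuonCandidate(NamedTuple):
    """Hypothetical container for one track candidate."""
    wedge: int    # wedge index, 0..11
    pt_code: int  # transverse-momentum code (larger means higher pT)


def wedge_sort(candidates: List[MuonCandidate]) -> List[MuonCandidate]:
    """Keep the two highest-pT candidates found in one wedge."""
    return nlargest(2, candidates, key=lambda c: c.pt_code)


def barrel_sort(wedge_outputs: List[List[MuonCandidate]]) -> List[MuonCandidate]:
    """Keep the four highest-pT candidates over all wedge outputs."""
    merged = [c for wedge in wedge_outputs for c in wedge]
    return nlargest(4, merged, key=lambda c: c.pt_code)


# Toy input: three wedges with a few candidates each.
wedges = [
    [MuonCandidate(0, 5), MuonCandidate(0, 17), MuonCandidate(0, 9)],
    [MuonCandidate(1, 30)],
    [MuonCandidate(2, 12), MuonCandidate(2, 3)],
]
best_four = barrel_sort([wedge_sort(w) for w in wedges])
```

With twelve wedges delivering at most two candidates each, the Barrel Sorter stage in this sketch never sees more than 24 candidates.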

## 10.3 Subsystem Interfacing

The DTTF system gets input data from the DTBX chamber system and the Cathode Strip Chamber system. It outputs its results to the Global Muon Trigger unit. A subset of the DTBX input data from the barrel-endcap overlap region is forwarded towards the CSC Track-Finder.

The DTTF has a control interface in the form of VME crate controllers that are connected to the controlling workstations. The DTTF system also uses the TTC system for clock and control.

### 10.3.1 Input Data

#### DTBX $\phi$ Track Segment Data

The main input data stream of the DTTF system consists of the DTBX $\phi$ station track segments that come from the Drift Tube Trigger Sector Collector units. The Drift Tube trigger system sends all data bits of one track segment in each bx. This can be the track segment that belongs to the bunch crossing determined by the standard latency scheme, or it can be a track segment of the previous bx, if there were two of them, and the bx of the standard latency did not result in any



**Fig. 10.2:** Logical segmentation and block diagram of the Track-Finder system. Six Sector Processors, each covering four muon stations MB1, MB2, MB3 and MB4, are organized in wedges along the  $\eta$ -coordinate. Each sector in the central wheel is logically split in two and equipped with two Sector Processors.

segment. The $\phi$ station track segments are composed of 12 bit $\phi$ position data (2's complement), 10 bit $\phi_b$ bend angle data (2's complement), 3 bit quality, 1 bit second track segment tag, 1 bit calibration tag and 1 bit suppression info. Two bits are reserved and can be used for synchronization information. As the MB3 stations do not deliver $\phi_b$ values, this sums up to 110 bits in every bx. This data set is sent to the DTTF using optical links. The input of the DTTF is designed to receive these links.
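The 110-bit total can be checked with a short calculation. The sketch below uses only the field widths quoted above; the dictionary keys and the helper name `segment_bits` are illustrative labels, not names taken from the design.

```python
# Field widths (in bits) of one phi track segment, as quoted in the text.
PHI_SEGMENT_FIELDS = {
    "phi": 12,              # position, 2's complement
    "phi_b": 10,            # bend angle, 2's complement
    "quality": 3,
    "second_ts_tag": 1,     # second track segment tag
    "calibration_tag": 1,
    "suppression": 1,
    "reserved": 2,          # may carry synchronization information
}

# MB3 does not deliver a bend angle; the other stations do.
STATION_HAS_PHI_B = {"MB1": True, "MB2": True, "MB3": False, "MB4": True}


def segment_bits(has_phi_b: bool) -> int:
    """Number of bits sent per bx for one station's track segment."""
    return sum(
        width
        for name, width in PHI_SEGMENT_FIELDS.items()
        if has_phi_b or name != "phi_b"
    )


total_bits_per_bx = sum(segment_bits(flag) for flag in STATION_HAS_PHI_B.values())
```

Three stations contribute 30 bits each and MB3 contributes 20 bits, which reproduces the quoted 110 bits per bunch crossing.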

#### DTBX $\eta$ Station Data

The $\eta$ track segment data is sent through the Sector Collector as a bit vector hit map of the $\eta$ stations. Each $\eta$ station sends 8 hit bits for the z coordinate and 8 quality bits. This sums up to 48 bits for each sector, ending up with 240 $\eta$ bits for every wedge. Each $\eta$ Track-Finder processor collects all $\eta$ track segment bits belonging to one wedge; these data bits are then forwarded to the Wedge Sorter, using the same type of optical links as used for the $\phi$ track segment data.
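As a consistency check, the quoted totals fix how many $\eta$ stations and sectors enter the sum. The short sketch below derives these multiplicities from the numbers in the text rather than assuming them; the variable names are illustrative.

```python
HIT_BITS = 8        # hit bits per eta station (z coordinate)
QUALITY_BITS = 8    # quality bits per eta station
BITS_PER_SECTOR = 48    # total quoted in the text
BITS_PER_WEDGE = 240    # total quoted in the text

bits_per_station = HIT_BITS + QUALITY_BITS

# Multiplicities implied by the quoted totals (derived, not stated in the text).
eta_stations_per_sector = BITS_PER_SECTOR // bits_per_station
sectors_per_wedge = BITS_PER_WEDGE // BITS_PER_SECTOR

# Both divisions must be exact for the totals to be self-consistent.
assert BITS_PER_SECTOR % bits_per_station == 0
assert BITS_PER_WEDGE % BITS_PER_SECTOR == 0
```

The totals are therefore consistent with three $\eta$ stations per sector and five sectors contributing to each wedge.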

#### CSC Data

The data exchange of the barrel-endcap overlap region includes two track segments per logical sub-sector from the first wheel of the CSC system for each bx. These track segments are



**Fig. 10.3:** DTTF connection scheme for one wedge.

composed of 12 bit $\phi$, 3 bit quality, 1 bit $\eta$-tag and two of the bunch crossing counter's least significant bits. This sums up to 36 bits in each bx, which are sent via Channel Link connections.

### 10.3.2 Output Data

#### Overlap Output to the CSC

The DTTF system also sends information from the two outer wheels to the CSC Track-Finder. This contains one track segment in each bx from each chamber in the 1st station. The following bits are contained: 12 bit $\phi$ (2's complement), 5 bit $\phi_b$ (2's complement), 3 bit quality, 1 bit second track segment tag, 1 bit calibration tag and two of the bunch crossing counter's least significant bits. This means 24 bits in each bx, which will be sent using Channel Link connections.

#### System Output

The DTTF processors send the parameter bits of at most two found muons to the Wedge Sorter, which sends the same bits of the best two muons to the Barrel Sorter. The Barrel Sorter forwards the four best muons found in the barrel region to the Global Muon Trigger unit.

These muons are described by the following bits: 5 bit  $p_T$ , 8 bit  $\phi$ , 6 bit  $\eta$ , 1 bit charge, 3 bit quality, and 1 bit for the  $\eta$  assignment quality, resulting in 24 bits for each muon. The internal connection to the Wedge Sorter uses the backplane, while the Wedge Sorter outputs will be sent to the Barrel Sorter using Channel Links. The Barrel Sorter sends its results to the Global Muon Trigger unit using Channel Links as well.
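The 24-bit muon word can be illustrated with a small pack/unpack routine. The field widths are those listed above; the order of the fields inside the word, the treatment of every field as a raw unsigned bit pattern, and the names `pack_muon` and `unpack_muon` are assumptions made only for this sketch.

```python
from typing import Dict

# (field name, width in bits); widths from the text, ordering assumed.
MUON_FIELDS = [
    ("pt", 5),
    ("phi", 8),
    ("eta", 6),
    ("charge", 1),
    ("quality", 3),
    ("eta_quality", 1),
]
MUON_WORD_BITS = sum(width for _, width in MUON_FIELDS)


def pack_muon(**values: int) -> int:
    """Pack the field values into one integer, first field in the lowest bits."""
    word = 0
    shift = 0
    for name, width in MUON_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word


def unpack_muon(word: int) -> Dict[str, int]:
    """Inverse of pack_muon."""
    values = {}
    for name, width in MUON_FIELDS:
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values
```

A round trip such as `unpack_muon(pack_muon(pt=17, phi=200, eta=33, charge=1, quality=5, eta_quality=1))` returns the original field values, and any packed word fits in 24 bits.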

### 10.3.3 Internal Data Exchange

The DTTF system also needs internal data exchange. The input data of the $\phi$ Track-Finder is also sent to its next-wheel and sideways neighbours in $\phi$. The next-wheel neighbours are in the same crate, so the data exchange can use the backplane, whereas the sideways neighbours are in different crates. The data for the sideways neighbours are sent using interconnection cables with Channel Links. Only the track segments of the 2nd, 3rd and 4th stations are exchanged; the 1st station data do not need to be forwarded. Table 10.1 summarizes the input/output connections of the DTTF system.

**Table 10.1:** Summary of I/O connections per processor board.

| Name                        | Direction | Type          | Bits/bx | Lines |
|-----------------------------|-----------|---------------|---------|-------|
| Barrel $\phi$ inputs        | IN        | Optical       | 110     | 11    |
| Barrel $\eta$ inputs        | IN        | Optical       | 240     | 30    |
| Sideways neighbours inputs  | IN        | Channel Link  | 224     | 80    |
| CSC overlap inputs          | IN        | Channel Link  | 32      | 16    |
| Wedge Sorter outputs        | OUT       | Channel Link  | 62      | 24    |
| Barrel Sorter outputs       | OUT       | Parallel LVDS | 124     | 48    |
| Sideways neighbours outputs | OUT       | Channel Link  | 224     | 80    |
| CSC overlap outputs         | OUT       | Channel Link  | 24      | 10    |
| TTC Interface               | Special   |               |         |       |
| VME Interface               | Special   |               |         |       |

## 10.4 Track-Finder Algorithm

### 10.4.1 The Barrel DT Track-Finder Algorithm

The Track-Finder algorithm can be described as a three-step process, as shown in Fig. 10.4. In the first step the Extrapolation Unit of the Sector Processor tries to match track segment pairs of distinct muon stations, using a pairwise matching method. This is performed by extrapolating to the next station from a track segment using the spatial and angular measurement $\phi$ and $\phi_b$ of the track segment (see Fig. 10.5). The matched pairs are then forwarded to the Track Assembler Unit which links the segment pairs to full tracks. The Assignment Unit performs the last step, assigning the track parameters to the candidates.



**Fig. 10.4:** Principle of the Track-Finder algorithm (3-step scheme).

#### The Extrapolation Unit (EU)

The basic principle of the EU is to attempt to match together track segments caused by the same track. This is done by using a pairwise matching based on the principle of extrapolation. Using the spatial coordinate $\phi_{source}$ and the angular measurement $\phi_{b,source}$ of the source track segment, an extrapolated hit coordinate $\phi_{ext}$ in another chamber may be calculated, according to Eq. (10.1). The $\phi_{deviation}$ is parametrized in terms of the bending angle of the source track segment. If a target track segment is found at the extrapolated coordinate within a certain extrapolation threshold $threshold_{ext}$, the match is considered successful, as shown in Fig. 10.5.

$$\phi_{ext} = \phi_{source} + \phi_{deviation}(\phi_{b,source}) \quad (10.1)$$

$$threshold_{ext} \geq |\phi_{ext} - \phi_{target}|$$
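Eq. (10.1) and the threshold condition translate directly into a look-up and a comparison. The following Python sketch assumes that both $\phi_{deviation}$ and $threshold_{ext}$ are read from tables addressed by the source bending angle; the table contents used here are toy values chosen only to make the example run, and the function names are hypothetical.

```python
from typing import Dict


def extrapolate(phi_source: int, phi_b_source: int,
                deviation_lut: Dict[int, int]) -> int:
    """Eq. (10.1): phi_ext = phi_source + phi_deviation(phi_b_source)."""
    return phi_source + deviation_lut[phi_b_source]


def is_match(phi_source: int, phi_b_source: int, phi_target: int,
             deviation_lut: Dict[int, int],
             threshold_lut: Dict[int, int]) -> bool:
    """True if the target lies inside the extrapolation window."""
    phi_ext = extrapolate(phi_source, phi_b_source, deviation_lut)
    return abs(phi_ext - phi_target) <= threshold_lut[phi_b_source]


# Toy look-up tables (NOT the real parametrization): the deviation grows
# linearly with the bend angle and the window widens for large bends.
toy_deviation = {b: 2 * b for b in range(-8, 8)}
toy_threshold = {b: 3 + abs(b) // 4 for b in range(-8, 8)}
```

With these toy tables, a source segment at `phi_source=100` with `phi_b_source=4` is extrapolated to 108 with a window of 4 units, so a target at 109 matches and a target at 120 does not.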

Fig. 10.6 shows the relation between the bending angle  $\phi_b$  in the source station and the



**Fig. 10.5:** Left side: A track segment consists of the spatial coordinate  $\phi$  and the bending angle  $\phi_b$ . Right side: Basic concept of the pairwise matching. If a track segment is found to be within the extrapolation window given by  $\phi_{ext}$  and  $threshold_{ext}$ , the extrapolation is considered successful.

relative deflection of the particle track between target and source station for several Muon Station pairs. The graphs show unambiguous relationships $\phi_b = f(\Delta\phi)$ between the source bending angles and the relative deviations, proving the feasibility of extrapolation between these station pairs. The same condition can be found for all other station pairs except for those extrapolating from station three, in which no unambiguous relationship can be found. Moreover, for small bending angles no prediction can be made at all. This effect is caused by the zero crossing of the bending angle exactly at Muon Station 3, caused by muon energy loss. However, since all other extrapolations are feasible, these problems can be circumvented by extrapolating from Muon Station 4 to Muon Station 3. Simulations of trigger acceptance show that it is necessary to accept tracks with at least two out of four track segments [10.1][10.2][10.3]. Thus for the matching process six station pairings are necessary: 1-2, 1-3, 1-4, 2-3, 2-4 and 4-3. The threshold values for the extrapolation calculations are stored in memory based look-up tables addressed by the bending angle value of the source station. A set of look-up tables is provided for each possible station pairing.

In Section 9.12.3 the time distribution of the track segments delivered by the Trigger Server has been described. This effect can have an impact on the design of the Track-Finder, since the out-of-time track segments caused by the same muon could be joined together to form additional track candidates. In order to reduce this effect as much as possible, track segments are used as the source of an extrapolation only if they are defined at least as HTRG<sup>1</sup> uncorrelated track segments. LTRG uncorrelated track segments are still accepted as possible target segments. This



**Fig. 10.6:** Relationship between bending angle measurement  $\phi_b$  in the source station and the deflection of a muon track  $\phi_{target} - \phi_{source}$ , for different Muon Station pairs.

requirement reduces the impact of the out-of-time track segments, reducing the number of additional found tracks, and it does not show sizable effects on the overall Track-Finder efficiency (see Section 10.9). In the hardware implementation the possibility to use all kinds of segments as sources for extrapolations is left as an option.

Due to the bending effect in the magnetic field a muon can cross sector boundaries in  $\phi$ . The maximum deflection for strongly bent muons is found to be below the dimension of two sectors. Thus the Extrapolation Unit in one Sector Processor needs to involve the track segments of the neighbouring  $\phi$ -sectors. The non-projective geometry of the chambers with respect to the

---

<sup>1.</sup> See Section 9.4.1 for the definition of HTRG and LTRG uncorrelated track segment.

muon tracks in the $(r,z)$ view requires the neighbouring wheel to be examined as well. The muon track in the $(r,z)$ projection is almost a straight line; checking the wheel contrary to the flight direction in $z$ is not necessary. Moreover, a muon originating from the interaction point does not cross more than one wheel boundary. These considerations lead to the conclusion that it is sufficient to look at the corresponding sector and its directly adjacent neighbours when performing the extrapolations. Thus each Extrapolation Unit examines six detector $(\eta,\phi)$-segments, its own sector plus five neighbours (see Fig. 10.7). Since it is not necessary to use the track segments from the neighbouring MB1 stations, each Sector Processor receives only the track segment data from the neighbours' MB2, MB3 and MB4 stations.



**Fig. 10.7:** Five neighbouring sectors must be evaluated in the extrapolation process. Each Sector Processor receives track segment data from the neighbouring  $(\eta,\phi)$  sectors.

In each Muon Station the Trigger Server board delivers up to two track segments. The Extrapolation Unit attempts to match each source segment to twelve segments in the next stations (twelve = two track segments times six neighbouring chambers).

Since six station pairings are necessary, in total twelve source track segments exist per sector. Thus the Sector Processor incorporates twelve separate Extrapolation Units, each of them extrapolating one source track segment to another station and comparing the extrapolated value to twelve target track segments. As a consequence 144 comparisons are carried out in parallel in one Sector Processor. The results of each comparison are forwarded to the Quality Sorter Unit.
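The count of 144 parallel comparisons follows from the numbers given in the text, as the short enumeration below shows. The constant names are illustrative; the pairings, the two segments per station and the six examined $(\eta,\phi)$ segments are taken from the description above.

```python
from itertools import product

# Station pairings (source, target) used for the extrapolations.
STATION_PAIRINGS = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (4, 3)]
SEGMENTS_PER_STATION = 2   # the Trigger Server delivers up to two per station
EXAMINED_SECTORS = 6       # own sector plus five neighbours

# One Extrapolation Unit per (pairing, source segment).
extrapolation_units = list(product(STATION_PAIRINGS, range(SEGMENTS_PER_STATION)))

# Each unit compares against every target segment in the examined sectors.
targets_per_unit = SEGMENTS_PER_STATION * EXAMINED_SECTORS

comparisons = len(extrapolation_units) * targets_per_unit
```

The enumeration gives twelve Extrapolation Units, twelve targets per unit and hence 144 comparisons per Sector Processor.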

#### Quality Sorter Unit

The Quality Sorter Unit selects the two best extrapolations per source track segment and outputs the relative address of the target track segment.

The selection criteria are based on the quality bits of the target track segments and the relative location of the target track segment with respect to the source track segment.

The main task of the Quality Sorter Unit is to reduce the data stream from the Extrapolation Units to the track assembler. The Extrapolation Units deliver the extrapolation result and the extrapolation quality for each of the twelve possible track segment pairs, but the probability of finding more than two successful extrapolations originating from one single track segment is negligibly small. Keeping only the two best extrapolations therefore simplifies the track segment assembly.

#### The Track Assembler

The task of the track assembler is to find the two tracks in a sector exhibiting the highest number of matching track segments. It selects the two highest ranking tracks from all the formed ones and outputs the relative addresses of the matching track segments. The Track Segment Router extracts the corresponding track segment data from the data pipeline using the relative addresses of the track segments. In order to reduce the number of Input/Output connections from the EUs to the Track Linker unit, only the EUs belonging to a single Sector Processor and the EUs in the next wheel neighbour Sector Processor are connected to the Track Linker unit [10.4].

The Track Linker finds tracks by setting up conditions for their existence. For instance a 1-2-3-4 track can be built if there is an existing 1-2 extrapolation, from its target in the Muon Station MB2 there is another existing 2-3 extrapolation, and from its target in Station MB3 there is an existing 3-4 extrapolation. For the existence of the 1-2-3-4 track the target of the 3-4 extrapolation has no importance: the track extends to the Station MB4 if there is at least one existing 3-4 extrapolation. Thus this target needs no further check, which makes the hardware implementation of the algorithm easier.

The Track Linker sets up additional conditions for finding a track. For tracks that are formed using more than two track segments, the validity of the intermediate extrapolations is also checked. This means, for instance, that for the existence of a 1-2-3 track not only valid EU12 and EU23 extrapolations are needed, but also a valid EU13 extrapolation. Furthermore, the algorithm checks whether the last segment of extrapolation EU13 is the target segment of the previous EU23 extrapolation. This condition increases the protection against noise and delta-rays in case of longer tracks.

Fig. 10.8 shows how an extrapolation from MB1 to MB2 (EU12) can be joined to an extrapolation from MB2 to MB3 (EU23) if they share the same track segment in MB2. In order to validate this joining, it is also required that an extrapolation EU13 exists which shares its source segment with EU12 and its target segment with EU23. A track consisting of three track segments is then formed.
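The linking condition for a 1-2-3 track can be written as a simple join over the successful extrapolations. In this Python sketch each Extrapolation Unit result is represented as a set of (source segment, target segment) address pairs; this representation, the segment labels and the function name `link_t123` are assumptions made for illustration and do not describe the hardware data format.

```python
from typing import List, Set, Tuple

Pair = Tuple[str, str]  # (source segment address, target segment address)


def link_t123(eu12: Set[Pair], eu23: Set[Pair],
              eu13: Set[Pair]) -> List[Tuple[str, str, str]]:
    """Return all (MB1, MB2, MB3) segment triples forming a valid 1-2-3 track.

    A triple is accepted if EU12 and EU23 share the MB2 segment and a
    validating EU13 extrapolation connects the first and last segments.
    """
    tracks = []
    for s1, s2 in eu12:
        for s2_source, s3 in eu23:
            if s2_source == s2 and (s1, s3) in eu13:
                tracks.append((s1, s2, s3))
    return sorted(tracks)


# Toy example: two chains exist, but only one is validated by EU13.
eu12 = {("a1", "b1"), ("a2", "b2")}
eu23 = {("b1", "c1"), ("b2", "c2")}
eu13 = {("a1", "c1")}
tracks = link_t123(eu12, eu23, eu13)
```

In the toy example the chain a2-b2-c2 is rejected because no EU13 extrapolation links a2 to c2, which is the kind of protection against accidental combinations described above.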



**Fig. 10.8:** Illustration of the basic steps of the Track Linker algorithm. Two pairs of track segments are joined together if the last track segment of the first pair is the starting segment of the second pair. In order to validate the linking, a successful extrapolation from the first to the last segment in the resulting track is required.

The Track Segment Linker attempts to find a track candidate for each track class originating from one single track segment. There are eleven classes: T1234, T123, T124, T134, T234, T12, T13, T14, T23, T24, T34 (the digits denote the stations belonging to the track).

After these selections 22 possible candidates (two start track segments times eleven track classes) are forwarded to the Track Selector unit. These candidates are ordered in terms of number and position of track segments used in the candidate building. The Track Selector selects the highest ranking track candidate among the 22 provided. At the end of the process the first muon track is found.

A Cancellation sub-unit discards all the candidates, out of the 22 provided ones, which represent sub-patterns of the first found track. After this, a second muon track is selected from the remaining candidates, following the same procedure.
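The select, cancel and select-again sequence can be sketched as follows. Two assumptions are made loudly here: the ranking is taken to be the order in which the eleven track classes are listed above, and a "sub-pattern" is interpreted as a candidate whose track segments are all contained in the first found track. The `Candidate` type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

# Assumed ranking: position in the class list given in the text.
TRACK_CLASSES = ["T1234", "T123", "T124", "T134", "T234",
                 "T12", "T13", "T14", "T23", "T24", "T34"]


@dataclass(frozen=True)
class Candidate:
    track_class: str
    segments: Dict[int, str]  # station number -> track segment address


def is_sub_pattern(candidate: Candidate, track: Candidate) -> bool:
    """True if every segment of `candidate` also belongs to `track`."""
    return candidate.segments.items() <= track.segments.items()


def select_two(candidates: List[Candidate]) -> List[Candidate]:
    """Pick the best candidate, cancel its sub-patterns, pick the next best."""
    ranked = sorted(candidates, key=lambda c: TRACK_CLASSES.index(c.track_class))
    if not ranked:
        return []
    first = ranked[0]
    remaining = [c for c in ranked[1:] if not is_sub_pattern(c, first)]
    return [first] + remaining[:1]


# Toy input: a 1-2-3 track, two of its sub-patterns and an unrelated pair.
toy = [
    Candidate("T12", {1: "a", 2: "b"}),
    Candidate("T24", {2: "x", 4: "y"}),
    Candidate("T123", {1: "a", 2: "b", 3: "c"}),
    Candidate("T23", {2: "b", 3: "c"}),
]
selected = select_two(toy)
```

In the toy input the T12 and T23 candidates are cancelled as sub-patterns of the T123 track, so the second selected track is the independent T24 candidate.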

The Address Assignment sub-unit outputs the addresses of the track segments belonging to the two found tracks to the Assignment Units.

#### Assignment Units (AU)

The track segment addresses forwarded by the Track Linker are used to extract the physical parameters of the track segments from the data pipeline. Once the track segment data are available to the Assignment Units, memory based look-up tables are used to determine the transverse momenta of the particles. The momentum and the location of the tracks, as well as quality information about the track finding, are output.

$p_T$-assignment unit: the transverse momentum is assigned using the difference in the spatial $\phi$ coordinate of the two innermost track segments. Fig. 10.9 shows the difference in the measured azimuthal position as a function of the transverse momentum. For some station pairings the relationship between the difference in $\phi$ coordinates and the transverse momentum is not unique. This is due to low-momentum particles which are back-bent in the transverse plane. In such cases the difference in $\phi$ coordinates tends to be as low as in the case of high-momentum particles. However, the bending angle of the innermost station track segment can still distinguish between low-momentum and high-momentum particles. The relationship is parametrized using two sets of functions, one set for low transverse momentum track candidates and one set for high transverse momentum track candidates. The bending angle of the innermost station track segment selects which function has to be used in the transverse momentum assignment. Memory based look-up tables are actually used in the hardware for the transverse momentum assignment, parametrizing the two sets of functions. The charge of the particle is assigned using the $\phi_b$ sign of the innermost track segment used to reconstruct the candidate, i.e. checking the most significant bit of the track segment $\phi_b$ data.
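The two ingredients of this assignment, the sign bit of $\phi_b$ and the choice between the low-$p_T$ and high-$p_T$ tables, can be sketched in Python. The 10-bit two's complement width of $\phi_b$ is the one quoted in Section 10.3.1; the cut `bend_cut` on the bending angle, the table contents and the function names are hypothetical, and the mapping of the sign bit to a positive or negative charge is not specified here, so only the raw bit is returned.

```python
from typing import Dict

PHI_B_BITS = 10  # width of the phi_b field (2's complement)


def phi_b_sign_bit(raw: int) -> int:
    """Most significant bit of the raw phi_b word; used for the charge."""
    return (raw >> (PHI_B_BITS - 1)) & 1


def phi_b_signed(raw: int) -> int:
    """Decode the raw 10-bit two's complement word to a signed integer."""
    return raw - (1 << PHI_B_BITS) if phi_b_sign_bit(raw) else raw


def assign_pt(delta_phi: int, phi_b_raw: int,
              low_pt_lut: Dict[int, int], high_pt_lut: Dict[int, int],
              bend_cut: int) -> int:
    """Select the table with the bending angle, then look up the pT code.

    A large bending angle indicates a low-momentum candidate; the value
    of `bend_cut` is a placeholder, not a number from the design.
    """
    lut = low_pt_lut if abs(phi_b_signed(phi_b_raw)) > bend_cut else high_pt_lut
    return lut[abs(delta_phi)]


# Toy tables (NOT the real parametrization), indexed by |delta phi|.
toy_low = {0: 3, 1: 3, 2: 4}
toy_high = {0: 31, 1: 28, 2: 25}
```

With these toy tables a candidate with a small $\phi$ difference is assigned a low $p_T$ code when its bending angle is large and a high code when its bending angle is small, which resolves the ambiguity described above.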

The transverse momentum is measured with a resolution of 5 bits ( $0 < p_T < 140 \text{ GeV}/c$ ); one additional bit indicates the charge of the particle. Table 14.1 shows the 5 bit coding of the transverse momentum assignment. The  $p_T$  scale is defined at 90% efficiency: a cut at a given  $p_T$  threshold implies that 90% of the reconstructed candidates which have a transverse momentum greater than or equal to the threshold value are accepted.
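The selection between the two look up tables and the charge test can be sketched as follows. This is a minimal illustration under stated assumptions, not the hardware implementation: the table contents, the cut `PHI_B_CUT`, the two's-complement encoding of  $\phi_b$  and all function names are hypothetical. Only the 10 bit  $\phi_b$  width, the most-significant-bit test and the 5 bit  $p_T$  code are taken from the text.

```python
PHI_B_BITS = 10    # width of the phi_b word (from the text)
PT_CODE_BITS = 5   # width of the p_T code (from the text)
PHI_B_CUT = 64     # hypothetical |phi_b| cut selecting the low-p_T table


def sign_bit(phi_b_word):
    """Return the most significant bit of the raw phi_b word."""
    return (phi_b_word >> (PHI_B_BITS - 1)) & 1


def phi_b_magnitude(phi_b_word):
    """Magnitude of phi_b, assuming a two's-complement encoding."""
    if sign_bit(phi_b_word):
        return (1 << PHI_B_BITS) - phi_b_word
    return phi_b_word


def assign_pt(delta_phi, phi_b_word, lut_low, lut_high):
    """Return (pt_code, sign_bit) for one track candidate.

    A large bending angle in the innermost station selects the
    low-p_T table, otherwise the high-p_T table is used.
    """
    lut = lut_low if phi_b_magnitude(phi_b_word) > PHI_B_CUT else lut_high
    pt_code = lut[abs(delta_phi)] & ((1 << PT_CODE_BITS) - 1)
    return pt_code, sign_bit(phi_b_word)


# Hypothetical table contents, for illustration only.
lut_high = [max(1, 31 - d) for d in range(256)]
lut_low = [min(5, 1 + d // 64) for d in range(256)]
```

With these placeholder tables, the same small  $\phi$  difference yields a high  $p_T$  code for a nearly straight innermost segment and a low code for a strongly bent one, which is the ambiguity the two-table scheme is meant to resolve.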

$\phi$ -assignment unit (PHIAU): the assigned  $\phi$  coordinate values correspond to the spatial coordinate of the track segment in the second muon station. If the track candidates do not have a track segment in the second muon station, an extrapolation towards the second muon station position is performed, using memory based look up tables. The measurement of the  $\phi$  coordinate is given with a precision of 8 bits.

$\eta$ -assignment: the  $\eta$  parameter is assigned by a specifically designed board. The algorithm is described in Section 10.4.3.

Quality Assignment Unit: it assigns a quality code for each track. Three bits are used in the quality assignment, according to the track class of the muon candidate (see Table 10.2). Track classes are ordered according to number of participating track segments and to the momentum resolution, which is better when assigned using the inner muon station track segments.
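The mapping of Table 10.2 can be written as a simple lookup. The sketch below is illustrative only; the function name and the representation of a track as a list of station numbers are assumptions made here.

```python
# Quality codes per track class, transcribed from Table 10.2.
QUALITY_CODE = {
    "T1234": 7,
    "T123": 6, "T124": 6,
    "T134": 5,
    "T234": 4,
    "T12": 3, "T13": 3, "T14": 3,
    "T23": 2, "T24": 2,
    "T34": 1,
}


def quality_code(stations):
    """Return the 3 bit quality code for the stations contributing to a track.

    Anything that is not a valid track class maps to 0 (null track).
    """
    name = "T" + "".join(str(s) for s in sorted(set(stations)))
    return QUALITY_CODE.get(name, 0)
```

For example, `quality_code([4, 2])` gives 2, the code shared by the T23 and T24 classes.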

### 10.4.2 Barrel-Endcap Overlap Region Handling

The barrel-endcap overlap region is the region in pseudorapidity in which muon tracks can cause track segments in the Drift Tube Chambers of the barrel as well as in the Cathode Strip Chambers of the endcap [10.5]. It roughly ranges from  $\eta=0.8$  to  $\eta=1.2$ . In contrast to the barrel, the overlap region is characterized by a strongly non-uniform magnetic field, which causes the bending of tracks in the  $r\phi$  projection to vary with pseudorapidity.

Track finding in this region is handled partly by the DTTF and partly by the CSC Track-Finder. Each Track-Finder receives track segments from some of the other Track-Finder's chambers and handles its part of the overlap region in its characteristic way. The CSC Track-Finder attempts to link track segments by three-dimensional extrapolations (road finding). This is possible as each track segment from the CSC system contains a  $\phi$  and an  $\eta$  coordinate. The inclusion of track segments from the DTBX chambers into this scheme is described in Chapter 12.

**Fig. 10.9:** Difference of the measured azimuthal positions as a function of the transverse momentum  $p_T$  for all muon station pairings.

**Table 10.2:** Quality bit code assignment according to the candidate track class.  
The digits in the track class names denote the stations belonging to the track.

| Quality bit code | 7     | 6            | 5    | 4    | 3                 | 2          | 1   | 0          |
|------------------|-------|--------------|------|------|-------------------|------------|-----|------------|
| Track Classes    | T1234 | T123<br>T124 | T134 | T234 | T12<br>T13<br>T14 | T23<br>T24 | T34 | Null Track |

The DTBX chambers, on the other hand, deliver separate track segments for the  $\phi$  and  $\eta$  coordinates. The correlation between these segments is ambiguous if there is more than one  $\phi$  or more than one  $\eta$  track segment in a DTBX chamber. The DTTF uses only the  $\phi$  track segments and attempts to link them by two-dimensional extrapolation in the  $r\phi$  projection. CSC track segments are incorporated into this scheme as additional targets.

#### Specifications of the Outermost Wheel Sector Processors

The Sector Processors in the outermost wheels (Overlap Sector Processors) differ in a few aspects from the Sector Processors in the other wheels, in order to accept track segments from the CSC system in the barrel-endcap overlap region. Each Overlap Sector Processor receives up to 2 track segments from each logical sub-sector of the first wheel of the CSC system. These segments are treated like track segments of stations from a virtual neighbouring barrel wheel.

Each Overlap Sector Processor performs its extrapolations using different memory based look up tables depending on whether the target segment is in its own wheel or in the virtual next wheel (CSC system). The track linking stage in the Overlap Sector Processors is identical to the one in the other Sector Processors. The  $p_T$ -assignment unit works in the same way as for the other Sector Processors, except that different look up tables are used for tracks with track segments in the CSC system.

#### Achieving a Clean $\eta$ -Boundary

It is a requirement for the Global Muon Trigger that no tracks are duplicated by the Track-Finders, i.e. no single track should be reported by both the DTTF and the CSC Track-Finder. A boundary in  $|\eta|$  (approximately at  $|\eta| = 1.04$ , see Fig. 10.1) is defined, up to which only the DTTF system is allowed to report muons and above which only the CSC system is allowed to report muons. The scheme relies on the fact that the  $\eta$ -coordinate of the track segments in the CSC system is known. In either Track-Finder this  $\eta$ -coordinate is used as an unambiguous  $\eta$ -measurement of a track to decide whether the track will be reported.

The CSC system forwards track segments to the DTTF system up to a certain limit in pseudorapidity which is already inside the CSC Track-Finder's part of the barrel-endcap overlap region (e.g.  $|\eta| = 1.2$ ). If the segments lie on the CSC side of the defined boundary they are tagged by setting a tag-bit. The DTTF system uses all CSC track segments to assemble tracks and later discards tracks if they include a tagged CSC segment as these tracks will be reported by the CSC Track-Finder. This way the DTTF system and the CSC system both base their  $\eta$ -boundaries on the same measurement of  $\eta$  and a duplication of candidates can be minimized.
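The tagging scheme can be illustrated with a short sketch. The class and function names are hypothetical, and the two constants simply restate the approximate boundary and the example forwarding limit quoted above; the actual values and data formats are defined by the hardware.

```python
from dataclasses import dataclass

ETA_BOUNDARY = 1.04       # approximate DTTF/CSC boundary quoted in the text
ETA_FORWARD_LIMIT = 1.2   # example forwarding limit quoted in the text


@dataclass
class CscSegment:
    eta: float
    tagged: bool = False


def forward_to_dttf(segments):
    """CSC side: forward segments up to the limit and tag those that lie
    on the CSC side of the boundary."""
    forwarded = []
    for seg in segments:
        if abs(seg.eta) <= ETA_FORWARD_LIMIT:
            forwarded.append(
                CscSegment(seg.eta, tagged=abs(seg.eta) > ETA_BOUNDARY)
            )
    return forwarded


def dttf_reports(track_csc_segments):
    """DTTF side: a track is reported only if none of its CSC segments
    carries the tag bit."""
    return not any(seg.tagged for seg in track_csc_segments)
```

Because both decisions derive from the same CSC  $\eta$  value, a track that the DTTF drops is, by construction, one that lies in the region assigned to the CSC Track-Finder.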

### 10.4.3 $\eta$ Track-Finder Algorithm

The principle of the  $\eta$  Track-Finder [10.6] is to use the data of the middle superlayer (SL) of the first three DTBX stations in order to find tracks in the  $(r,z)$  plane. The middle SL measures the  $z$ -coordinate of a passing muon, which is used as input for the  $\eta$  Track-Finder. For each station this is defined as an eight bit word for the hit position in  $z$  and an additional eight bit word for the quality of the corresponding hit information, as described in Section 9.3.

The  $\eta$  Track-Finder is a stand-alone system. It reconstructs basic tracks using hits in the  $(r,z)$  plane which lie on a straight line pointing back to the interaction vertex. This task is realized by a pattern matching method. A list of possible patterns has been generated using simulated data samples. Each pattern has an  $\eta$  value assigned; if a set of hits matches a particular pattern, a track is found and the pattern's  $\eta$  value is assigned to the track.

The found  $\eta$  tracks have to be matched with the tracks found in the bending plane. In order to perform this task, the  $\eta$  Track-Finder receives coded track address data from the  $(r,\phi)$  Track-Finder containing the necessary wheel, station and sector information of all  $r/\phi$  track segments (TS) used to reconstruct each track found in the  $(r,\phi)$  plane.

In the simulation, different matching schemes have been studied in order to verify the impact of the quality of the  $\eta$  track and of the minimum required consistency between the tracks of the  $(r,\phi)$  and  $\eta$  Track-Finders. All different combinations of the  $\eta$ -TS quality (no TS, LTRG or HTRG) in the three stations result in 26 different quality numbers (Table 10.3). During the matching, the tracks found by both Track-Finders are compared; setting a minimal level for the track consistency (i.e. one, two or three TSs are found in the same chamber) triples the number of different matching schemes. Therefore, up to  $26 \times 3$  different matching schemes can be exploited, and their efficiencies and resolutions can be compared (see results in Section 10.9.3).

**Table 10.3:** Definition of all possible  $\eta$  track quality classes and numbers. They are grouped together by their quality significance.  
(0 ... no TS, 1 ... LTRG, 2 ... HTRG)

|       | Station |   |   |       | Station |   |   |       | Station |   |   |
|-------|---------|---|---|-------|---------|---|---|-------|---------|---|---|
| Qu. # | 1       | 2 | 3 | Qu. # | 1       | 2 | 3 | Qu. # | 1       | 2 | 3 |
| 1     | 0       | 0 | 1 | 10    | 1       | 1 | 1 | 19    | 2       | 1 | 1 |
| 2     | 0       | 1 | 0 | 11    | 0       | 1 | 2 | 20    | 0       | 2 | 2 |
| 3     | 1       | 0 | 0 | 12    | 0       | 2 | 1 | 21    | 2       | 0 | 2 |
| 4     | 0       | 1 | 1 | 13    | 1       | 0 | 2 | 22    | 2       | 2 | 0 |
| 5     | 1       | 0 | 1 | 14    | 1       | 2 | 0 | 23    | 1       | 2 | 2 |
| 6     | 1       | 1 | 0 | 15    | 2       | 0 | 1 | 24    | 2       | 1 | 2 |
| 7     | 0       | 0 | 2 | 16    | 2       | 1 | 0 | 25    | 2       | 2 | 1 |
| 8     | 0       | 2 | 0 | 17    | 1       | 1 | 2 | 26    | 2       | 2 | 2 |
| 9     | 2       | 0 | 0 | 18    | 1       | 2 | 1 |       |         |   |   |
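The count of 26 quality classes in Table 10.3, and the resulting number of matching schemes, can be checked by enumeration. The sketch below only counts the combinations; it does not reproduce the "Qu. #" ordering of the table, and the variable names are chosen here for illustration.

```python
from itertools import product

NO_TS, LTRG, HTRG = 0, 1, 2   # per-station codes as in Table 10.3

# Every assignment of a code to stations 1-3, excluding "no TS anywhere".
classes = [c for c in product((NO_TS, LTRG, HTRG), repeat=3) if any(c)]

# Minimal consistency level: one, two or three TSs in the same chamber.
consistency_levels = (1, 2, 3)
schemes = [(c, n) for c in classes for n in consistency_levels]

print(len(classes), len(schemes))
```

The enumeration gives  $3^3 - 1 = 26$  classes and  $26 \times 3 = 78$  schemes.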

Thus, it is possible to set a minimum threshold for track identification and track matching. The choice of one of the possible schemes defines the matching probability and the relative  $\eta$  resolution, as shown in Fig. 10.31 and in Fig. 10.33. This option allows the matching and track identification scheme to be changed very quickly and offers a flexible  $\eta$  assignment method.

If the two tracks can be matched, the  $\eta$  value calculated by the  $\eta$  Track-Finder is assigned to the  $(r,\phi)$  track candidate. Otherwise, if matching is not possible, a coarse  $\eta$  value is assigned to the  $(r,\phi)$  track candidate by checking the position where the track passed a wheel boundary. It is therefore possible to assign an  $\eta$  value to each track found in the bending plane, ending up with two different resolutions, as shown in Fig. 10.31. An additional bit is forwarded up to the Global Muon Trigger, indicating whether the assigned  $\eta$  value corresponds to the coarse or to the improved assignment method.

## 10.5 Track-Finder Hardware Implementation

The DTTF processors receive the track segment information from the trigger servers and deliver the parameters of the found muon tracks to the Wedge Sorter [10.7].

The DTTF processors are organized into crates where one crate contains six processors of one wedge. A wedge is a subset of those muon stations that have the same  $\phi$  coordinates along the CMS detector's z-axis. Thus one wedge contains five sectors belonging to the five different barrel wheels. In the hardware organization this requires six Sector Processors, as there are two processors serving the central wheel, separately looking for muon tracks going to the positive and negative z-directions. One of these central wheel processors is set up to find all muons that don't leave the wheel and those that leave in the positive z-direction, while the other one finds only those muons that leave the wheel in the negative z-direction.

The DTTF processors find muons that have a track segment in the lowest found station in the wheel they serve. They also follow up muons in the wheel behind, if these muons cross their own wheel. As the DTTF processors have no information about the previous wheel, it is possible that a muon track will be found twice, by two processors belonging to two consecutive wheels. In this case the muon track in the next wheel will be cancelled out.

One wedge crate, as shown in Fig. 10.3, will contain the following boards:

1. Crate controller - VME interface
2. General DTTF processor boards
3. Last wheel DTTF processor boards with overlap connection facility
4.  $\eta$  Track-Finder processor board
5. Wedge Sorter board
6. TTC clock receiver and distribution board

Depending on the space required by the external connections, one 9Ux400 mm crate can include one or two wedge units. One crate controller for each crate is also necessary, and probably one TTC clock receiver board too.

The crates containing the DTTF components are mounted in racks that are located inside the counter room, separated from the CMS detector by a concrete wall. This also means that no radiation considerations need to be taken into account when designing the hardware. The DTTF receives the TS data word from the trigger server via high speed optical links. The output of the DTTF goes to the Muon Sorter using a simple cable connection, as the distance is less than 10 meters.

### 10.5.1 DTTF Processor

The DTTF processor functional units are shown in Fig. 10.10.



**Fig. 10.10:** DT Track-Finder Processor block diagram.

#### The Sector Receiver Unit

The task of the sector receiver unit is to receive and synchronize the track segments coming from the Trigger Server boards. After synchronization it also sends these track segments to the previous wheel neighbours and to the sideways neighbours. Its block diagram is shown in Fig. 10.11.

The optical links are connected to the link receivers. These are located on mezzanine cards plugged into connectors on the sector processor board. The optical link connectors are mounted on the mezzanine cards, and openings in the motherboard front panel allow a direct connection to them, similarly to the arrangement standardized for PMC (PCI Mezzanine Card). The electrical connection between the mezzanine boards and the motherboard takes the form of parallel low-voltage TTL signals; a clock signal derived from the optical link clock serves as strobe for the parallel signal lines.

The external output to the neighbours is a Channel Link connection, as described in Section 10.3. On the other side the sector receiver also contains Channel Link receivers to get the track segments from the neighbours. These feed also FIFOs in order to deserialize the 2nd track segment of an event that is delivered from the neighbours in the consecutive bx.

#### The Extrapolation Unit

**Fig. 10.11:** Sector Receiver Unit block diagram.

The Extrapolation Unit gets the  $\phi$  and  $\phi_b$  data from the Sector Receiver and performs the extrapolations to find out whether the targets are inside the extrapolation window determined by the position ( $\phi$ ) and bending angle ( $\phi_b$ ) of the source. Its block diagram is shown in Fig. 10.12. The upper and lower limits of the extrapolation windows are stored in lookup tables (LUTs) in the memory area of the extrapolator FPGA chips. The extrapolator uses only the 8 most significant bits of both the 12 bit  $\phi$  and the 10 bit  $\phi_b$ , as the required accuracy of the extrapolation does not need a higher resolution. The extrapolation process takes two bx (50 ns).

With the exception of the 4-3 extrapolators, an extrapolation has one source and 12 possible targets: two target track segments in the same sector, two in each of the two sideways neighbours, and the corresponding six (two in each of the three sectors) in the next wheel. If the target track segments come from the sideways neighbours, the extrapolator adds an offset value to the  $\phi$  values of the left side neighbour and subtracts this offset from the  $\phi$  values of the right side neighbour, which transforms these  $\phi$  locations into the sector's own coordinate system. The value of this offset in the 8 bit space is 133, but this can be changed according to the detector's real geometric arrangement. Between stations #3 and #4 the extrapolation is performed in the opposite direction, as the  $\phi_b$  value of station #3 does not allow extrapolation from there. This also means that for this case separate LUTs are necessary for the extrapolations that use the sideways neighbour track segments as sources. These extrapolations have only two possible targets.
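A single extrapolation test can be sketched as below. Several points are assumptions made for illustration and are not stated in the text: the window is applied to the  $\phi$  difference between target and source, the words are treated as unsigned, the limit tables hold plain signed integers, and all names are hypothetical. The bit widths, the LUT addressing by  $\phi_b$  and the default offset of 133 come from the text.

```python
PHI_BITS, PHI_B_BITS, USED_BITS = 12, 10, 8   # widths from the text
NEIGHBOUR_OFFSET = 133                        # default offset in the 8 bit space


def msb8(value, width):
    """Keep only the 8 most significant bits of a `width`-bit word."""
    return (value >> (width - USED_BITS)) & 0xFF


def to_own_frame(phi8, side):
    """Transform a neighbour's phi into the sector's own coordinate system.

    side is 'own', 'left' or 'right'.
    """
    if side == "left":
        return phi8 + NEIGHBOUR_OFFSET
    if side == "right":
        return phi8 - NEIGHBOUR_OFFSET
    return phi8


def extrapolation_ok(src_phi, src_phi_b, tgt_phi, tgt_side, lut_low, lut_high):
    """True if the target lies inside the window selected by the source phi_b."""
    phi_s = msb8(src_phi, PHI_BITS)
    address = msb8(src_phi_b, PHI_B_BITS)   # the LUTs are addressed by phi_b
    phi_t = to_own_frame(msb8(tgt_phi, PHI_BITS), tgt_side)
    delta = phi_t - phi_s
    return lut_low[address] <= delta <= lut_high[address]
```

In hardware, one such comparison would be evaluated for each of the possible targets of a source in parallel, with one bit per target written to the extrapolation result table.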

Taking into account that two independent LUTs are necessary for the upper and the lower limits, the total number of LUTs is 52. As each LUT is addressed by 8  $\phi_b$  bits and delivers an 8 bit limit value, it requires  $256 \times 8$  bit = 2048 bit memory space. The total memory space required by all LUTs is  $52 \times 2048$  = 106,496 bits. The output of the Extrapolation Unit contains twelve 12 bit extrapolation result tables and six 6 bit extrapolation result tables. The number of output lines is hence 180.
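The figures quoted above follow directly from the stated LUT count and word sizes, as this short check shows (the variable names are illustrative).

```python
N_LUTS = 52        # total number of limit LUTs (from the text)
ADDRESS_BITS = 8   # each LUT is addressed by 8 phi_b bits
WORD_BITS = 8      # each LUT entry is an 8 bit limit value

bits_per_lut = (1 << ADDRESS_BITS) * WORD_BITS   # 256 entries x 8 bits
total_lut_bits = N_LUTS * bits_per_lut

# Twelve 12 bit tables plus six 6 bit tables of extrapolation results.
output_lines = 12 * 12 + 6 * 6

print(bits_per_lut, total_lut_bits, output_lines)
```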

The VHDL implementation of the extrapolator used an arrangement where 3 chips were applied. Chip A performed the extrapolations 1-2 and 2-4, Chip B the extrapolations 1-3 and 2-3 and Chip C the extrapolations 1-4 and 4-3. This arrangement minimizes the number of inputs and outputs of the chips.



**Fig. 10.12:** Extrapolation Unit block diagram.

#### The Quality Sorter Unit

The Quality Sorter Unit selects from the extrapolator tables the two extrapolations with the highest input TS quality. Its block diagram is shown in Fig. 10.13. As each extrapolation table contains extrapolations from the same source, the quality of the source has no impact on the quality sorting; only the target quality matters.

The Quality Sorter is fed by the track segment quality bits of stations #2, #3 and #4. Station #1 is not needed, as it is not the target of any extrapolation. The quality tables are masked with the extrapolation tables, so that the qualities of targets whose track segment was not part of a successful extrapolation are removed from the quality sorting. As the next step, the Quality Sorter finds the highest quality target by consecutive pairwise comparisons, as shown in Fig. 10.13.

When the highest quality target is found, it is erased from the quality table and the sort procedure will be repeated to find the second highest quality candidate. The output of the Quality Sorter is a similar table as the extrapolator table, but in this table only the bits of the two highest quality extrapolations are left "1".
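The mask, sort, erase and repeat procedure can be sketched in software as follows. This is an illustrative model only: the function names, the use of -1 as the "masked" marker and the tie-breaking rule (the lower index wins) are assumptions, not properties of the hardware.

```python
def best_index(qualities):
    """Index of the highest value, found by consecutive pairwise comparisons.

    On a tie the entry with the lower index is kept.
    """
    candidates = list(range(len(qualities)))
    while len(candidates) > 1:
        winners = []
        for i in range(0, len(candidates) - 1, 2):
            a, b = candidates[i], candidates[i + 1]
            winners.append(a if qualities[a] >= qualities[b] else b)
        if len(candidates) % 2:          # odd entry passes to the next round
            winners.append(candidates[-1])
        candidates = winners
    return candidates[0]


def quality_sort(extrap_bits, target_quality):
    """Return a table in which only the two best successful extrapolations
    are left at 1."""
    if not extrap_bits:
        return []
    # Mask: targets without a successful extrapolation cannot win.
    masked = [q if ok else -1 for ok, q in zip(extrap_bits, target_quality)]
    result = [0] * len(extrap_bits)
    for _ in range(2):
        i = best_index(masked)
        if masked[i] < 0:                # nothing left to select
            break
        result[i] = 1
        masked[i] = -1                   # erase the winner, then repeat
    return result
```

For example, `quality_sort([1, 0, 1, 1, 0, 1], [3, 7, 5, 5, 6, 2])` ignores the unmatched targets with qualities 7 and 6 and keeps the two matched targets of quality 5.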



**Fig. 10.13:** Quality Sorter Unit block diagram.

#### Track Assembler Unit

The most sophisticated part of the DTTF hardware is the Track Assembler unit. Its functional block structure is seen in Fig. 10.14.

The task of the Track Assembler is to find those track segments that can be connected by a series of consecutive valid extrapolations and thus form a possible muon track. The Track Assembler uses the twelve 12 bit extrapolation result tables and six 6 bit extrapolation result tables from the Quality Sorter as input, i.e. 180 input bits. Its outputs are the track segment addresses of both found tracks: four 4 bit addresses for each, 32 bits in total. These output bits are forwarded to the track segment selection sub-unit of the Parameter Pipeline, which uses them to select the physical parameters of the chosen TS. The addresses of stations #2 and #3 are also sent to the Wedge Sorter in order to perform the fake pair cancellation. A subset of these addresses is also sent to the  $\eta$  Track-Finder to allow the  $\eta$  matching there.

The Track Assembler finds the tracks by setting up conditions for their existence. For tracks with more than two track segments (i.e. more than one valid extrapolation), the validity of the intermediate extrapolations is also checked. For example, a 1-2-3 track requires valid 1-2, 2-3 and 1-3 extrapolations. This condition increases the protection against background and ghosts in the case of longer tracks. In order to limit the number of condition equations, they do not contain a condition for the very last track segment of a track (see Fig. 10.15). They only check whether at least one extrapolation towards a last track segment was successful, as this is a proof of its existence.
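The structure of a condition equation can be illustrated in a strongly simplified picture with a single track segment per station, in which the special treatment of the last segment does not arise. The function name and the representation of successful extrapolations as a set of station pairs are assumptions made for this sketch.

```python
from itertools import combinations


def track_condition(stations, valid_pairs):
    """True if every pair of stations in the candidate is connected by a
    valid extrapolation.

    valid_pairs holds (lower, higher) station pairs whose extrapolation
    succeeded; the direction of the extrapolation is ignored here.
    """
    stations = sorted(stations)
    return len(stations) >= 2 and all(
        pair in valid_pairs for pair in combinations(stations, 2)
    )


valid = {(1, 2), (2, 3), (1, 3)}
print(track_condition([1, 2, 3], valid))             # all three links present
print(track_condition([1, 2, 3], {(1, 2), (2, 3)}))  # 1-3 link missing
```

Requiring the 1-3 link in addition to 1-2 and 2-3 is what rejects chains of individually plausible but mutually inconsistent segments.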



**Fig. 10.14:** Track Assembler Unit block diagram.



**Fig. 10.15:** Assembling of a track with several track segment candidates in the last station. There is no condition set up for the track segment location in the last station, but only a check whether at least one valid extrapolation towards this Station exists. The target track segment address is output.

The result of each equation is a single bit showing if all the conditions are TRUE for a given track. The results of the 68 equations are written into a priority table with 68 entries. The table is ordered from the highest priority track candidate down to the lowest priority one. The highest priority track candidates are those forming a 1-2-3-4 track. They are organized into two groups of 12 members. The first group contains all possible tracks that start with the first track segment in station #1, the second group contains those starting with the second track segment in station #1. There are groups with four members for the tracks composed of 3 track segments, in priority order 1-2-3, 1-2-4, 1-3-4, 2-3-4. The lowest priority table entries are reserved for the track candidates containing only two track segments: 1-2, 1-3, 1-4, 2-3, 2-4 and 3-4.

The priority encoder sub-unit finds the highest priority entry among the 68 possible candidates. In order to realize this sub-unit in a synthesizable structure the priority encoder is split into groups that follow the entry grouping by track categories. Two 12 bit group encoders find the highest priority entry among the 1-2-3-4 group entries. Twelve 4 bit encoders find the highest priority entry among the 3 track segment groups. The two track segment candidates are not handled by the group encoders. The results of the two 12 bit encoders and the twelve 4 bit encoders together with the non grouped two track segment candidates form the 22 bit input table for the global priority encoder. This sub-unit finds the highest priority entry among the group encoder outputs and the two track segment candidates. If the highest priority entry was found by one of the group encoders the result found there will be extracted to be used as track candidate number. If no candidate was found by the group encoders the track can be one of the two track segment candidates. Its number is directly decoded by the global priority encoder.

The last extrapolations of a track candidate do not participate in the priority equations, but they have to be compared in order to avoid last segments that actually point to different targets. This is done by the last segment encoder, which is part of the address grouping sub-unit. Using the output value of the priority encoder sub-unit, the first address assignment sub-unit allocates the track segment addresses that contributed to the found track. The last address of the found track comes from the last segment encoder sub-unit, while the other addresses are directly decoded from the priority value. The addresses and the type code of the found track are put in a pipeline, where they wait until the track assembler finds the second muon track.

The 68 bit priority table contains not only the entries of the tracks, but also those of their sub-tracks. In order not to find one of these sub-tracks as the second muon track, all entries of sub-tracks of the first found muon track have to be cancelled from the priority table. The cancellation procedure creates a second priority table without these sub-tracks. The cancellation is performed using a cancellation table, which is addressed by the first found muon track. The table's 56 bit output is used as a mask to delete the sub-track entries from the existing 68 bit priority table. The resulting 56 bit second priority table is used to find the second muon track in the same manner as for the first one.
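The interplay of priority encoding and cancellation can be sketched at the level of track classes. This is a deliberately reduced model: it uses 11 entries (one per station combination) instead of the 68 entries of the hardware table, ignores track segment addresses, and computes the cancellation mask on the fly rather than reading it from a table. All names are hypothetical.

```python
# Track classes in priority order, as listed in the text.
PRIORITY = ["1234", "123", "124", "134", "234",
            "12", "13", "14", "23", "24", "34"]


def priority_encode(table):
    """Return the index of the first (highest priority) set entry, or None."""
    for i, bit in enumerate(table):
        if bit:
            return i
    return None


def cancellation_mask(first):
    """Mask clearing the first track and every candidate whose stations
    form a subset of it (its sub-tracks)."""
    first_stations = set(PRIORITY[first])
    return [0 if set(c) <= first_stations else 1 for c in PRIORITY]


def find_two_tracks(table):
    """Return the class names of the first and second track (or None)."""
    first = priority_encode(table)
    if first is None:
        return None, None
    mask = cancellation_mask(first)
    second_table = [b & m for b, m in zip(table, mask)]
    second = priority_encode(second_table)
    return PRIORITY[first], (PRIORITY[second] if second is not None else None)
```

With entries set for 1-2-3, 1-2, 1-3, 2-3 and 3-4, the first track is 1-2-3; its sub-tracks 1-2, 1-3 and 2-3 are masked, so the second search returns 3-4.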

The second priority encoder sub-unit works in the same way as the first one, but it uses the 56 bits of the second priority table as input. Correspondingly it lacks the first 12 bit group encoder and its second global priority encoder has only 21 entries. The second address assignment sub-unit works in the same way as the first one. It gets the track segment addresses and the last address from the pipeline, that stores these data during the time of track finding.

#### The Pipeline and Selection Unit

The Pipeline and Selection Unit stores the input track segment data to be used for parameter assignment. All data needed later for the assignment are kept in a pipeline and will be forwarded with the bx clock as shown in Fig. 10.16.

This includes the  $\phi$  and  $\phi_b$  bits of the own sector and all neighbours, but not the quality bits. The output of the pipeline delivers the track segment parameters of those tracks that were found by the Track Assembler. As this solution requires a large number of I/O pins, the pipeline and selection unit is divided into three chips. One of them serves as pipeline and selection of the station #1 and #3 track segments, the second one does the same job for the station #2 track segments and the third for the station #4 track segments.

**Fig. 10.16:** Pipeline and selection unit block diagram.

#### The Parameter Assignment Unit

The Parameter Assignment Unit calculates the  $p_T$  and  $\phi$  values from the track segment input data of the found tracks and forwards them to the Wedge Sorter. The parameter assignment unit is built from two identical chips, each performing the parameter assignment for one found track. Both chips contain two separate sub-units, one for  $\phi$  assignment and one for  $p_T$  assignment.

The  $\phi$  assignment sub-unit determines the  $\phi$  (position) value at the level of station #2. Its block diagram is shown in Fig. 10.17. If the found track contains a track segment in station #2, its 12 bit input  $\phi$  value is truncated to an 8 bit output value by dropping the four least significant bits.

If the found track does not contain a track segment in station #2 but contains one in station #1, a 1-2 extrapolation table is used to calculate the position at the station #2 level. In the case of 3-4 tracks, where neither station #2 nor station #1 hits are available, a 4-2 extrapolation table is used. These extrapolation tables deliver a relative position, which is added to the source position to get the absolute position at the level of station #2.
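The three cases of the  $\phi$  assignment can be summarized in a short sketch. The text does not specify what addresses the extrapolation tables, so the `lut_address` argument, like the function name and the dictionary representation of a track, is a placeholder introduced here.

```python
def assign_phi(phi, lut_address, lut_1_to_2, lut_4_to_2):
    """Return the 8 bit phi at the level of station #2.

    phi maps station number -> 12 bit phi of the track's segment there.
    lut_address is a hypothetical table address (not specified in the text).
    """
    if 2 in phi:
        return phi[2] >> 4                      # drop the 4 least significant bits
    if 1 in phi:
        source = phi[1] >> 4
        return source + lut_1_to_2[lut_address]  # relative shift from station #1
    source = phi[4] >> 4                         # only 3-4 tracks reach this point
    return source + lut_4_to_2[lut_address]      # relative shift from station #4
```

The truncation `>> 4` corresponds to keeping the 8 most significant of the 12 input bits.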

**Fig. 10.17:**  $\phi$  Assignment Sub-Unit block diagram.

The  $p_T$  values are calculated using the difference of the input track segment  $\phi$  values in the two stations having the lowest number, as shown in Fig. 10.18. Separate LUTs are therefore used for the 1-2, 1-3, 1-4, 2-3, 2-4 and 3-4 station pairs. In order to solve the problems caused by the ambiguities between the  $\phi$  difference and  $p_T$ , the  $p_T$  assignment uses different LUTs for the high- $p_T$  and low- $p_T$  cases. The value of  $\phi_b$  determines which LUT is used.

**Fig. 10.18:**  $p_T$  Assignment Sub-Unit block diagram.

#### Design Partitioning into FPGA Chips

Based on the hardware-level simulations the chips described in Table 10.4 are foreseen for the DTTF processor.

**Table 10.4:** DTTF chip sizes.

| Chip                 | Chips per Board | Inputs  | Outputs | Memory bits | Logic Cells |
|----------------------|-----------------|---------|---------|-------------|-------------|
| Extrapolation Unit   | 3               | 258     | 60      | 24576       | ~2300       |
| Quality Sorter       | 1               | 290     | 180     | -           | ~3700       |
| Track Linker         | 1               | 182     | 40      | -           | ~3900       |
| Pipeline & Selection | 3               | 193-249 | 40-68   | -           | ~2300-3000  |
| Assignment           | 2               | 82      | 16      | 77824       | ~380        |

### 10.5.2 Barrel-Endcap Overlap Region Handling

The DTTF processors of the -2nd and +2nd wheels (outermost wheels) differ from the basic DTTF processor, as these units have to receive and process the CSC overlap track segments. The connection scheme is shown in Fig. 10.19.

The DTTF system treats the track segments coming from the CSC similarly to those of the next wheel neighbours. Because of the different scaling inside the CSC, these DTTFs use different extrapolation LUTs. This also means that these DTTF processor boards contain extra chips comprising these special extrapolators. The track linking happens in the same way as in the basic DTTF processor, but for the parameter assignment special LUTs are again necessary.

The  $\phi_b$  (bending angle) data is not included in the direction from the CSC system towards the DT system, and the opposite direction forwards only 5 bit  $\phi_b$  values due to the CSC Track-Finder properties. Pseudorapidity information is not forwarded in either direction, except for a 1 bit  $\eta$ -tag received for each track segment from the CSC system.

As the CSC overlap connection transmits and receives track segments to and from the CSC system, it has to be synchronized with it. As the detailed bx budget for the CSC part is not finalized yet, no calculations can be made concerning the location and amount of the delays needed. The synchronization between both systems will happen in the synchronization FIFOs of the sector receiver unit. In order to find timing errors easily, both directions forward the 2 least significant bits of the internal bx counter.

The CSC system sends two TSs per logical sub-sector from its ME1 Stations. The DT Track-Finder sends one TS in each bx, a possible second TS of the previous bx is tagged.



**Fig. 10.19:** Barrel-endcap overlap connections' scheme.

### 10.5.3 The $\eta$ Track-Finder Implementation

In each wedge there is one single  $\eta$  Track-Finder, which receives track segments from the TSS of stations MB1, MB2 and MB3 in all five wheels. The track segments are sent in the form of two eight bit words, one for the quality and one for the position. This sums up to  $5 \times 3 \times 16 = 240$  bits for each bx. This information is forwarded using two optical links for each station, ending up with 30 optical links for the whole detector.

The  $\eta$  Track-Finder optical link receivers will be mounted on daughter boards (mezzanine cards) that will be plugged onto the Sector Processor board. Suitable cutouts in the front panel allow the optical link connectors to be mounted on the mezzanine boards. The interface between the receivers and the inputs of the processor will be a single ended 3.3 V CMOS parallel interface with the clock derived from the input synchronization. This also means that this clock is not phase synchronized with the board's own clock.

The  $\eta$  Track-Finder can start 5 bx before the  $r/\phi$  Track-Finder, i.e. compared to the  $r/\phi$  Track-Finder the input is received earlier. From then onwards the two cards will work in parallel (track finding, sorting, etc.). Just before the  $p_T$  assignment the address information of the found tracks is sent to the Wedge Sorter and also to the  $\eta$  Track-Finder card. In the last two bx the matching will be performed and all values will be then forwarded to the Wedge Sorter at the same time.

The algorithm of the  $\eta$  Track-Finder will be implemented in an FPGA design. It consists of two major parts, the Track-Finder Unit and the Matching Unit (see Fig. 10.20).



**Fig. 10.20:**  $\eta$  Track-Finder block diagram.

#### The Track-Finder Unit

A pipeline is used for the track finding and sorting. The pattern matching is implemented in the form of look-up tables and uses, together with the sorting, at most 12 bx of latency.

#### The Matching Unit

The matching is also done in a pipelined fashion using look-up tables and requires at most 2 bunch crossings. In Table 10.5 the number and the format of the inputs and outputs are summarized.

## 10.6 Muon Sorter

Successive to the stages of track finding and linking is a muon sorting stage in which the four “best-muons” are chosen out of 144 candidates and are forwarded to the Global Muon Trigger (GMT) system. Best-muon here means the one with higher quality and higher transverse momentum, once fake and ghost tracks are removed.

Fake rejection requires checking the consistency of track information internal to the muon Drift Tubes selection system and possibly the consistency with information from other muon detectors and calorimeters in CMS. For this purpose, the track sorting mechanism is organized in two consecutive layers.

**Table 10.5:** Input and output of the  $\eta$  Track-Finder board.

|                                         | Input                   |                                 |        |
|-----------------------------------------|-------------------------|---------------------------------|--------|
|                                         | Information             | Format                          | # Bits |
| <b>DT - TSS</b>                         | $\theta$ layer hit data | 8 bit position<br>8 bit quality | 240    |
| <b>r/$\phi$ Track-Finder</b> | found track address     | 3 bit word                      | 18     |
|                                         | Output                  |                                 |        |
| <b>Wedge Sorter</b>                     | assigned $\eta$ values  |                                 | 72     |

In the first sorting layer, the two best-muon candidates in each of twelve azimuthal wedges of the CMS barrel are selected. Selecting the two best candidates in a narrow wedge is sensible from the physics point of view, and the organization in azimuthal wedges makes the association with other CMS detector parts easy.

A Wedge Sorter (DTWS) receives its inputs from the Track-Finder processors (DTTF) covering the same 30 degree azimuthal sector in the five barrel-wheels. Each of the six DTTFs (there are two TFs in the central wheel) provides two candidates.

In the second sorting layer (the Barrel Sorter (DTBS)), the 24 remaining candidates are analyzed and the four best-muons are forwarded to the GMT.

In both layers a fake suppression mechanism can be configured to maintain the fake rate at a tolerable value. For muons with transverse momentum larger than 5 GeV/c in the barrel region, it has been calculated that the total physics rate of events with final state dimuons is about 1% of the single muon event rate: a 1% probability of creating a fake second track in a single muon event (for instance by track splitting at the sector boundaries) makes a fake dimuon rate comparable to the true dimuon rate.
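The rate argument above can be written out explicitly. The symbols are introduced here for illustration only: $R_{\mu}$ is the single muon event rate, $R_{\mu\mu}^{\text{true}}$ and $R_{\mu\mu}^{\text{fake}}$ are the true and fake dimuon rates, and $P_{\text{fake}}$ is the probability of creating a fake second track in a single muon event.

$$R_{\mu\mu}^{\text{true}} \approx 0.01\, R_{\mu}, \qquad R_{\mu\mu}^{\text{fake}} \approx P_{\text{fake}}\, R_{\mu} \quad\Rightarrow\quad \frac{R_{\mu\mu}^{\text{fake}}}{R_{\mu\mu}^{\text{true}}} \approx \frac{P_{\text{fake}}}{0.01}$$

For $P_{\text{fake}} = 1\%$ the ratio is about one, so the fake suppression has to keep $P_{\text{fake}}$ well below 1% for the dimuon sample to remain dominated by true dimuons.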

### 10.6.1 Wedge Sorter

The DT Wedge Sorter (DTWS, Fig. 10.21) consists of three functional blocks: a Fake Track Tagger (FTT), a Mask and Sort Logic (MSL), and a Track Pipeline and Multiplexer (TPM).

The FTT filters out fake tracks. Tracks reconstructed across two consecutive wheels can be found in both wheels by two different TFs. Input to the DTWS is organized by consecutive wheels, and tracks coming from consecutive wheels are checked for use of the same segment(s) in muon station three, station four, or both. To this end, each TF forwards to the DTWS the 4 bit addresses of the segments in stations 3 and 4. The track with worse quality or, at equal quality, the track found by the larger $\eta$ TF is flagged as duplicate. Short tracks from tracks split across neighbouring wheels are recognized by looking at the track $\phi$ information in candidates from consecutive wheels: again the track with lower quality or, at equal quality, the one found by the larger $\eta$ TF is flagged as duplicate. Ghost tracks can arise from accidental alignment of low quality DT segments, such as ghosts at the wrong bx; a filter against low quality candidates can be configured and activated. The FTT output consists of twelve disable signals that are delivered to the sorting block.
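The duplicate-tagging rule of the FTT can be illustrated with a minimal software sketch. It is not the hardware implementation: the `Candidate` fields, the convention that a larger `quality` value is better, and the use of a single `tf_index` (TFs of a wedge numbered by increasing $\eta$) for both the "consecutive wheels" test and the "larger $\eta$ TF" tie-break are assumptions made for this example. The $\phi$-based short-track check and the low quality filter are not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    tf_index: int             # hypothetical: TF position in the wedge, ordered by increasing eta
    quality: int              # assumption: larger value means better quality
    addr_st3: Optional[int]   # 4 bit segment address in station 3 (None if not used)
    addr_st4: Optional[int]   # 4 bit segment address in station 4 (None if not used)


def shares_segment(a: Candidate, b: Candidate) -> bool:
    """True if both candidates use the same segment in station 3 and/or station 4."""
    same_st3 = a.addr_st3 is not None and a.addr_st3 == b.addr_st3
    same_st4 = a.addr_st4 is not None and a.addr_st4 == b.addr_st4
    return same_st3 or same_st4


def tag_duplicates(cands: List[Candidate]) -> List[bool]:
    """Return one disable flag per candidate (True = flagged as duplicate)."""
    disable = [False] * len(cands)
    for i in range(len(cands)):
        for j in range(i + 1, len(cands)):
            a, b = cands[i], cands[j]
            # Only candidates found by TFs of consecutive wheels are compared.
            if abs(a.tf_index - b.tf_index) != 1:
                continue
            if not shares_segment(a, b):
                continue
            if a.quality != b.quality:
                # The candidate with the worse quality is the duplicate.
                loser = i if a.quality < b.quality else j
            else:
                # Equal quality: the candidate from the larger-eta TF is the duplicate.
                loser = i if a.tf_index > b.tf_index else j
            disable[loser] = True
    return disable
```

With two equal-quality candidates from neighbouring TFs that share the station 3 segment, the one from the TF with the larger `tf_index` is disabled.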

The MSL finds the two best muons of the wedge. Disable signals from the FTT or from configurable registers are used to prevent participation of single tracks or single DTTF processors in the sorting operation. The track selection is based first on higher quality and then on larger transverse momentum. The 12 word comparison uses the result of 66 two-word comparators running in parallel. The track quality is decoded and priority assigned before the comparison stage according to configurable options. The MSL output consists of 12+12 select signals for the best and the next-to-best muon.
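The figure of 66 comparators follows from comparing every pair of the 12 input words once: $12 \cdot 11 / 2 = 66$. The sketch below shows one way such a parallel comparison can yield the two select patterns: each word counts how many pairwise comparisons it loses, and the words with zero and one loss are the best and next-to-best. The key format `(quality, pt)`, the tie-break in favour of the lower input index and the function name are assumptions for illustration, not the documented MSL logic.

```python
from itertools import combinations
from typing import List, Tuple


def mask_and_sort(words: List[Tuple[int, int]],
                  disable: List[bool]) -> Tuple[List[bool], List[bool], int]:
    """words: (quality, pt) per input; disable: one flag per input.

    Returns the select pattern for the best word, the select pattern for the
    next-to-best word, and the number of two-word comparisons used.
    """
    n = len(words)
    # A disabled word gets a key that loses against every valid word.
    keys = [(-1, -1) if d else w for w, d in zip(words, disable)]
    losses = [0] * n
    n_comparators = 0
    for i, j in combinations(range(n), 2):
        n_comparators += 1
        # Quality is compared first, then pt; ties favour the lower index.
        if keys[i] >= keys[j]:
            losses[j] += 1
        else:
            losses[i] += 1
    # The comparison is a strict total order, so each loss count occurs once.
    select_best = [l == 0 and not d for l, d in zip(losses, disable)]
    select_next = [l == 1 and not d for l, d in zip(losses, disable)]
    return select_best, select_next, n_comparators
```

Because all pairwise results are independent of each other, they can be evaluated simultaneously, which is what makes this scheme attractive for a fixed-latency hardware sorter.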

In the TPM the full track data for the two best-muons are delivered. The TF track data are pipelined and multiplexed, with the select signals from the MSL driving the multiplexers. The pipeline duration can be configured. In the current design it is foreseen that the TF delivers the track segment addresses to the DTWS one clock cycle earlier than the full track data, so that the FTT logic is performed while the TF calculates the final parameters of the tracks. With this scheme the DTWS contributes two bx to the total latency.

There are twelve DTWS VME boards located in the TF crates. Inputs are from the crate back plane and outputs on the front panel. Output signals are LVDS Channel Link.



**Fig. 10.21:** Wedge Sorter block diagram.

### 10.6.2 Barrel Sorter Board

The DT Barrel Sorter (DTBS) design closely follows the DTWS scheme. Tracks reconstructed across two adjacent wedges can be found by two DTWS: comparison of the 4 bit addresses of segments in stations 3 and 4 is used to flag the lower quality track as duplicate and to generate a disable signal for use in the sorting block.

The operation of sorting the four best tracks is performed in two cycles: the two best are selected in the first cycle and stored; the two best of the remaining tracks are selected in the second cycle. Select signals are generated to drive the multiplexers for outputting the full track data.
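The two-cycle selection can be sketched in software as follows. This is an illustration under assumptions: candidates are represented by a hypothetical `(quality, pt_code)` key where larger is better, `None` marks a disabled or empty input, and ties keep the lower input index.

```python
from typing import List, Optional, Tuple

Key = Tuple[int, int]  # (quality, pt_code): assumption, larger is better for both


def select_four_best(keys: List[Optional[Key]]) -> List[int]:
    """Return the input indices of the four best candidates, in rank order."""
    remaining = [i for i, k in enumerate(keys) if k is not None]
    selected: List[int] = []
    for _cycle in range(2):
        # One cycle selects the two best of the candidates still available.
        best_two = sorted(remaining, key=lambda i: keys[i], reverse=True)[:2]
        selected.extend(best_two)
        # The selected pair is stored and removed before the second cycle.
        remaining = [i for i in remaining if i not in best_two]
    return selected
```

The returned indices play the role of the select signals that drive the output multiplexers.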

There is one DTBS board located in one of the DTTF crates. Input and output signals are LVDS Channel Link. The DTBS contributes four bx to the total latency.

## 10.7 Synchronization and Latency

### 10.7.1 TTC Interface

The on-line control functions that require exact timing and the synchronization steps are performed by the TTC interface. Every crate has a TTC Receiver Board that distributes the 40 MHz clock and other timing signals. Since the TTC system allows broadcast and individual commands to be sent, these will also be forwarded. For the timing signals, a clock receiver will be placed on each board in connection with the on-board synchronization chips.

### 10.7.2 Synchronization Procedure and Control

The DTTF participates in the DT muon detector synchronization on three different levels. The first level is used to synchronize the different muon stations of the detector to each other. A rough synchronization will be done based on the TTC system. The fine tuning uses real muon data delivered by the stations: the DTTF can read out the track segments through the local DAQ channel before performing the extrapolation and linking. The speed of this procedure is limited by the local DAQ readout bandwidth, but this type of synchronization should only be needed as part of maintenance.

The second synchronization level serves to compensate for the different cable delays. This is done by the DTTF Sector Receiver Unit, which is described in Section 10.5.1. The goal of this stage is that all DTTF processors in the system work on the same event at a given time. This process uses calibration bits that are generated by the trigger server with a predefined delay in the event structure gap. The two least significant bits of the bx counter are also sent as verification. In order to avoid malfunctions caused by clock jitter, the input read circuits can be directed to use either the positive or the negative edge of the clock. This re-synchronization happens in every gap, also during data taking. Every occurrence of a synchronization error of this type is logged and reported to the detector control.
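The verification with the two least significant bits of the bx counter can be pictured with the following sketch. The function names are hypothetical and the way the hardware actually uses the bits is not specified here; the example only shows what a 2 bit check can and cannot detect.

```python
def bx_lsb(bx_counter: int) -> int:
    """Two least significant bits of a bunch crossing counter."""
    return bx_counter & 0b11


def is_aligned(local_bx: int, received_lsb: int) -> bool:
    """True if the received 2 bit tag matches the local bx counter."""
    return bx_lsb(local_bx) == (received_lsb & 0b11)


def offset_mod4(local_bx: int, received_lsb: int) -> int:
    """Misalignment between local counter and received tag, modulo 4 bx."""
    return (local_bx - received_lsb) & 0b11
```

A 2 bit tag detects shifts of 1, 2 or 3 bx, but a slip of exactly 4 bx (or a multiple of it) is invisible to this check, which is consistent with its role as a verification on top of the calibration bits.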

The DTTF also contributes to the muon data readout synchronization. To perform this task the DTTF is able to generate pseudo-triggers based on a single track segment. In order to keep these triggers unique and recognizable, the muon stations should be commanded to generate only one such trigger in a given time slot, so that the sorter units up to the Global Trigger will not discard these synchronization triggers. In addition, the pseudo-trigger pattern generates a muon of high quality, so that a noise output cannot cause the sorter units to discard the pseudo-trigger.

### 10.7.3 Latency Budget

Table 10.6 shows the detailed latency budget, separately for the DTTF and $\eta$ TF parts. As shown, these units perform their tasks in parallel. After the DTTF track assembling, the address information is forwarded to both the DTTF parameter assignment unit and the $\eta$ TF matching unit. At this stage both Track Finders must be synchronized.

The latency of the Sector Receiver also includes the time needed to distribute the Track Segments to the neighbours. The relatively long latency of the Track Assembler is required by the complicated logic functions performed there.

**Table 10.6:** Latency budget.

| DTTF Stage           | bx | Cumulative bx | $\eta$ TF Stage | bx | Cumulative bx |
|----------------------|----|---------------|-----------------|----|---------------|
| Input                |    | 0             | Input           |    | 0             |
| Sector Receiver      | 3  | 3             | $\eta$ TF       | 16 | 16            |
| Extrapolation Unit   | 2  | 5             |                 |    |               |
| Quality Sorter       | 2  | 7             |                 |    |               |
| Track Assembler      | 9  | 16            |                 |    |               |
| Parameter Assignment | 2  | 18            | $\eta$ Match    | 2  | 18            |
| Wedge Sorter         | 2  | 20            |                 |    |               |
| Global Muon Sorter   | 4  | 24            |                 |    |               |
| Link to Muon Trigger | 3  | 27            |                 |    |               |
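The cumulative column of Table 10.6 is a running sum of the per-stage latencies. The short sketch below reproduces it and converts the total to nanoseconds, using the 25 ns bunch crossing period that corresponds to the 40 MHz clock. The variable and function names are chosen for this example only.

```python
BX_NS = 25.0  # one bunch crossing at the 40 MHz clock

DTTF_STAGES = [
    ("Sector Receiver", 3),
    ("Extrapolation Unit", 2),
    ("Quality Sorter", 2),
    ("Track Assembler", 9),
    ("Parameter Assignment", 2),
    ("Wedge Sorter", 2),
    ("Global Muon Sorter", 4),
    ("Link to Muon Trigger", 3),
]

ETA_TF_STAGES = [
    ("eta TF", 16),
    ("eta Match", 2),
]


def cumulative(stages):
    """Return (stage name, cumulative latency in bx) for each stage."""
    total = 0
    result = []
    for name, bx in stages:
        total += bx
        result.append((name, total))
    return result


if __name__ == "__main__":
    for name, total in cumulative(DTTF_STAGES):
        print(f"{name:22s} {total:3d} bx  {total * BX_NS:6.0f} ns")
```

Both branches reach 18 bx at the point where the Parameter Assignment and the $\eta$ matching finish, and the full chain adds up to 27 bx, i.e. 675 ns.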

## 10.8 Subsystem Controls

### 10.8.1 Board-level Control Hardware

The monitoring and control hardware consists of two facilities that are included on all boards of the DTTF system: a VME bus interface and a local JTAG chain connected to all chips on the board. They provide permanent access to all chips, making it possible to access the inputs, outputs and internal registers of every chip, both for writing (control) and for reading status (monitoring). The monitoring and control software ensures that these accesses do not disturb the main functions of the chips.

### 10.8.2 Board-Level Monitoring Solutions

#### VME Interface

All crates of the DTTF hardware contain an A24/D16 VME bus on their backplane. This establishes the connections between the control workstations and all boards. This VME bus is controlled by a crate controller, which is a VME controller with bus-master facility. It forwards the data accesses from the workstation and the interrupts towards the workstation. There will be no other bus-masters in the VME system.

**Fig. 10.22:** Board Level JTAG chain.

#### JTAG Chain

All boards in the DTTF crates will have a VME interface, but in general the internal functions of the boards will not be directly accessible via the VME bus, except for a limited set of control and status registers. The boards contain a VME-JTAG controller behind the VME interface. This controller maps the VME accesses into JTAG accesses. Each board contains one JTAG chain, and all chips on the board will be accessible via JTAG. The daughter boards of the optical links are connected to the JTAG chain through path linkers. This keeps the chain functional even if the link boards are unplugged.

The JTAG interface is used to program the chips with configuration accesses, and the same accesses are used to download the LUTs, as the LUTs are parts of the chips' internal structure. In offline mode it is possible to access the chips' ports in order to perform connectivity and functional tests as part of the maintenance.

Monitoring is performed by reading out the internal registers or output pins of the chips. In online mode the signature analyzer chip allows the last data words of the internal data path to be read out without disturbing the regular functions or their timing. The signature analyzer can be programmed to sample the data path from or until a given bx, or it can search for preprogrammed data words. The JTAG chain is accessed in such a way that its speed is determined only by the speed of the VME bus.

### 10.8.3 DTTF General Control System

The monitoring and control system consists of workstations that are connected to the VME controllers of the DTTF crates. A general software framework with a window-based user interface allows the supervising personnel to access the different functions. The framework also sets up access structures that allow users with different access rights to act on the different control levels. The access control works dynamically depending on the actual running state of the system: most maintenance and programming accesses are prohibited during on-line activities.

The high level modules of the monitoring and control software are written in Java in order to allow easy remote access. Based on the Java sandbox principles, this also ensures that the security measures cannot be circumvented by exploiting the openness of the control network.

#### Basic Modules

The lowest level of the monitoring and control software is formed by the basic modules. They act as hardware access servers for the higher level programs. As these modules establish the connection and hardware mapping, they are written in C. They forward the VME commands and data transfers through the VME controller. They also map the interrupt requests of the boards into workstation hardware interrupts and handle them. The mapping between the JTAG functions and the software requests also happens here.

#### Startup Modules

After establishing the connection with the boards, these modules perform check tasks: they verify the proper functionality of the boards, including the correct programming of the chips. The program then sets up the mode registers of the boards according to the required type of run. The TTC parameters are also set up here. At the end of the startup procedure the program waits for the general start command from the operator or, in case of an automatic startup, starts the system itself.

#### Supervisor Modules

The supervisor modules are active during the normal run of the system. They allow access to basic system parameters and hardware registers. They perform a constant monitoring of the maintenance functions and observe the state of the system synchronization. They also allow a certain level of data spying.

These functions are performed in such a way that the DTTF functions are not affected at all. No accesses are allowed that would have an impact on the data or speed of the track finding.

#### Maintenance Modules

The board level maintenance programs perform connectivity tests on the DTTF boards. First the VME access itself is tested, then the connections between chips. Finally, these modules test the correctness of the chip programs and memory content. The programming routines are used if the on-board chips have to be reprogrammed or the content of the LUTs has to be changed. The functionality test programs check the internal functionality of the chips in order to detect possible malfunctions. They generate input data by programming the chip input pins and check the response on the chip outputs. The data spy functions make it possible to follow the data words on the boards. This function uses the pipelined structure of the chips and also the on-chip FIFO units where the used data is stored. It is available to a limited extent during the run and fully available in maintenance mode, where it is subject to either a limited clock frequency or a limited data block length.

## 10.9 Simulation Results

The performance of the overall DTTF design has been tested using software simulations which describe the system in all its different components. The simulations were run in both the CMSIM and ORCA frameworks, in order to compare and cross-check the results. Both simulations have been made using the complete geometrical description of the CMS detector, version 118. The software packages include a detailed simulation of the chamber noise. Extensive comparison with the VHDL implementation of the hardware design has also been carried out, validating the software simulation results.

All simulation results reported here have been obtained using a sample of about 100 000 single muon events, positively and negatively charged, with delta-ray generation. The muon sample is uniformly distributed in the $\phi$ coordinate ($0 < \phi < 2\pi$ rad), in $\eta$ ($|\eta| < 1.3$) and in $p_T$ ($3 < p_T < 100$ GeV/c). The background and trigger rate studies have been performed on the minimum bias samples described in Section 8.4.1. All the samples have been analyzed using the ORCA simulation software, release 4.

### 10.9.1 Overall Performance

Fig. 10.23 shows the DTTF efficiency with respect to the  $\phi$  coordinate. The efficiency is rather flat along the coordinate; small losses can be seen in the efficiency distribution, due to the geometrical acceptance of the Muon detector.



**Fig. 10.23:** DTTF efficiency versus the  $\phi$  coordinate ( $|\eta| < 1.04$ ). Small losses can be seen in the efficiency distribution, due to the geometrical acceptance of the Muon detector.



**Fig. 10.24:** DTTF efficiency versus the  $\eta$  coordinate.

The overall efficiency with respect to the $\eta$ coordinate is shown in Fig. 10.24. Full coverage by the muon barrel system is achieved in the region $|\eta|<0.8$. This explains the efficiency drop for $|\eta|>0.8$, which is recovered by the Global Muon Trigger system using the information from the RPC system and the CSC Track-Finder (see Chapter 14). The efficiency drop around $|\eta| \sim 0.2$ is due to tracks crossing the chamber gap between wheel 0 and wheel 1.

The efficiency with respect to the transverse momentum of the track is shown in Fig. 10.25. Full efficiency is already reached at $p_T>5\text{ GeV}/c$, with a steep rise from the minimum detectable transverse momentum. Above this value the distribution is flat, showing comparable efficiencies for different muon energies.



**Fig. 10.25:** DTTF efficiency versus the transverse momentum  $p_T$  ( $|\eta|<1.04$ ).

The transverse momentum resolution is given in terms of the $1/p_T$ residual distribution, defined as

$$\text{Residual} = \frac{1/p_T^{\text{meas}} - 1/p_T^{\text{gen}}}{1/p_T^{\text{gen}}}$$

The  $1/p_T$  residual distribution in the barrel region is shown in Fig. 10.26. The overall resolution (barrel+overlap region) is 15.2%. The resolution improves to 13.5% for tracks which are reconstructed using track segments from both MB1 and MB2. The negative offset of the residual distribution is due to the definition of the  $p_T$  scale at the 90% efficiency (see Section 10.4.1).
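The residual definition above translates directly into code. The function name is chosen for this example; the formula is the one given in the text. Note that the expression simplifies to $p_T^{\text{gen}}/p_T^{\text{meas}} - 1$, so a measured $p_T$ above the generated one gives a negative residual, in line with the negative offset discussed above.

```python
def inv_pt_residual(pt_meas: float, pt_gen: float) -> float:
    """Relative residual of 1/pT: (1/pt_meas - 1/pt_gen) / (1/pt_gen)."""
    return (1.0 / pt_meas - 1.0 / pt_gen) / (1.0 / pt_gen)


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    values = list(values)
    return sum(values) / len(values)
```

For example, a 50 GeV/c muon assigned 40 GeV/c has a residual of +0.25, while one assigned 60 GeV/c has a residual of about -0.17.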



**Fig. 10.26:** Transverse momentum ( $1/p_T$ ) residual distribution ( $p_T$  scale at 90% efficiency).

The dependence of the $1/p_T$ residual resolution on the muon transverse momentum is shown in Fig. 10.27.

The dependence of the  $1/p_T$  residual resolution on the  $\eta$  coordinate is shown in Fig. 10.28. The resolution is rather stable, except for the regions around  $|\eta|=0.3$  and  $|\eta|=0.8$ . The gap between the central wheel and the neighbour wheel ( $|\eta|\sim 0.3$ ) degrades the  $p_T$  measurement, because in that region most of the tracks are reconstructed without track segments from MB1 and/or MB2. Similar geometrical issues hold for the region around  $|\eta|\sim 0.8$ , where the non-uniformity of the magnetic field has sizeable effects.



**Fig. 10.27:**  $1/p_T$  residual resolution with respect to the muon transverse momentum ( $p_T$  scale at 90% efficiency).

The combined effects of the overall efficiency and transverse momentum resolution are shown in Fig. 10.29, where the efficiency is plotted versus the muon  $p_T$  for different  $p_T$  thresholds applied.
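A turn-on curve of this kind can be computed from simulated events as the fraction of muons in each generated-$p_T$ bin whose assigned $p_T$ is greater than or equal to the threshold. The sketch below is a minimal illustration: the event representation (a pair of generated and assigned $p_T$, with `None` for muons without a candidate) and the function name are assumptions, not the ORCA analysis code.

```python
from typing import Iterable, List, Optional, Sequence, Tuple


def turn_on_curve(events: Iterable[Tuple[float, Optional[float]]],
                  threshold: float,
                  bin_edges: Sequence[float]) -> List[Optional[float]]:
    """Efficiency per generated-pT bin for a given pT threshold.

    events: (pt_gen, pt_assigned) pairs; pt_assigned is None if no
    candidate was found. Empty bins yield None.
    """
    n_bins = len(bin_edges) - 1
    passed = [0] * n_bins
    total = [0] * n_bins
    for pt_gen, pt_assigned in events:
        for b in range(n_bins):
            if bin_edges[b] <= pt_gen < bin_edges[b + 1]:
                total[b] += 1
                # An unreconstructed muon counts as an inefficiency.
                if pt_assigned is not None and pt_assigned >= threshold:
                    passed[b] += 1
                break
    return [p / t if t else None for p, t in zip(passed, total)]
```

Both the reconstruction efficiency and the $p_T$ resolution enter this quantity: an inefficient region lowers the plateau, while a poorer resolution makes the rise around the threshold less steep.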

Studies have been made to estimate the resolving power of the DTTF system with respect to events in which two muons were produced close in the  $\phi$  coordinate. A sample of dimuon events has been generated requiring two muons close in the  $\phi$  coordinate measured in station MB1. The sample has a flat  $p_T$  distribution ( $5 < p_T < 100$  GeV/ $c$ ); each muon has an independent random  $p_T$  generation.

Fig. 10.30 shows the efficiency of flagging an event as a dimuon event: only the events in which two muons were found are taken into account in the efficiency plot. When the muons are closer than $\Delta\phi \sim 2.5^\circ$, the overall efficiency drops to 40%. The resolving power of the system is limited only by the resolving power of each single muon chamber, meaning that the DTTF does not degrade the intrinsic dimuon resolution of the muon system.



**Fig. 10.28:**  $1/p_T$  residual resolution versus the pseudorapidity ( $p_T$  scale at 90% efficiency).



**Fig. 10.29:** Turn-on-curves (Efficiency versus  $p_T$  for various  $p_T$  thresholds).

### 10.9.2 Extrapolation Filter Effects

As described in Section 10.4.1, track segments are used as the source in the extrapolations only if they are at least of HTRG uncorrelated quality. This filter reduces the impact of the out-of-time track segments coming from the DTBX chambers. Simulation results are summarized in Table 10.7, where the reconstruction efficiency is computed at the correct bx and the out-of-time candidate rate is computed at the previous bx. It should be noted that even without the filter the out-of-time rate is rather low compared to the rate of out-of-time trigger segments received as input by the DTTF system. The filter reduces this rate further, without a large impact on the overall efficiency.

**Table 10.7:** Overall efficiency, ghost rate and out-of-time rate of the DTTF.

| Barrel Reconstruction Efficiencies (%) |         |        |         |                       |
|----------------------------------------|---------|--------|---------|-----------------------|
| Method                                 | no muon | 1 muon | >1 muon | out-of-time candidate |
| Without extrapolation filter           | 3.35    | 96.5   | 0.15    | 3.6                   |
| Applying extrapolation filter          | 4.19    | 95.7   | 0.11    | 2.0                   |
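The trade-off in Table 10.7 can be quantified with a few lines of arithmetic. The numbers are copied from the table; the dictionary layout and function names are chosen for this example.

```python
TABLE_10_7 = {
    "without filter": {"no_muon": 3.35, "one_muon": 96.5, "multi_muon": 0.15, "out_of_time": 3.6},
    "with filter": {"no_muon": 4.19, "one_muon": 95.7, "multi_muon": 0.11, "out_of_time": 2.0},
}


def in_time_total(row):
    """Sum of the three in-time categories, expected to be 100%."""
    return row["no_muon"] + row["one_muon"] + row["multi_muon"]


def relative_change(before, after):
    """Relative change of a quantity, e.g. -0.25 for a 25% reduction."""
    return (after - before) / before


if __name__ == "__main__":
    off, on = TABLE_10_7["without filter"], TABLE_10_7["with filter"]
    print("efficiency change (points):", round(on["one_muon"] - off["one_muon"], 2))
    print("out-of-time change (relative):", round(relative_change(off["out_of_time"], on["out_of_time"]), 3))
```

The filter costs 0.8 percentage points of single muon efficiency while reducing the out-of-time candidate rate by about 44% in relative terms.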

Another possible effect is the degradation of the $p_T$ resolution when the extrapolation filter is applied. This could happen because losing a track segment could mean a degradation of the track class of the candidate and a possible degradation of the candidate $p_T$ assignment. The simulations showed no sizeable effect on the $p_T$ resolution caused by the extrapolation filter.

### 10.9.3 $\eta$ Track-Finder Performance

In order to calibrate and test the $\eta$ Track-Finder performance, a sample of 2 million single muon events, positively and negatively charged, has been generated. The muon sample is uniformly distributed in $\phi$ ($0 < \phi < 2\pi$ rad), $\eta$ ($|\eta| < 1.3$) and $p_T$ ($3.5 < p_T < 100$ GeV/c).

Fig. 10.31 shows the resolution in the  $\eta$  coordinate achieved by the  $\eta$  Track-Finder board. The  $\eta$  resolution obtained using the middle superlayer of the DTBX stations is compared to the resolution obtained with the coarse  $\eta$  assignment. The non-gaussian shape of the coarse assignment distribution is due to the non-linearized scale of the  $\eta$  assignment. Fig. 10.32 shows the  $\eta$  resolution for the coarse and improved  $\eta$  assignment with respect to the  $\eta$  coordinate.

Fig. 10.33 shows the calculated standard deviation ($\sigma$) of the $\eta$ resolution for each $\eta$ assignment scheme (see Section 10.4.3). The resolution improves with a higher matching probability because of the improved assignment. Studies on the mis-matching probability due to muon background have been performed but did not show a significant impact. The assignment scheme can easily be tuned should a much higher background rate than expected occur and affect the $\eta$ assignment quality.



**Fig. 10.30:** Dimuon efficiency with respect to the difference in the  $\phi$  coordinate measured in station MB1.

### 10.9.4 Trigger Rates

The studies on the trigger rates have been performed on the samples of minimum bias events described in Section 8.4.1. The minimum bias samples have been generated according to an LHC luminosity of $10^{34} \text{ cm}^{-2} \text{ s}^{-1}$, which implies an average of 17.3 piled-up events per beam crossing. Fig. 10.34 shows the integrated trigger rate of the DTTF system for a single muon trigger condition. All DTTF candidates have been included in the trigger rate calculation, i.e. neither a selection on track quality nor a geometrical restriction has been applied. A trigger rate of 1.6 kHz can be achieved by setting a $p_T$ threshold of 25 GeV/c (the $p_T$ scale is defined at 90% efficiency with the measured $p_T$ being greater than or equal to the $p_T$ cut).

### 10.9.5 Radiation Background

Studies have been performed on the behavior of the system in a high radiation background environment. In Section 9.12 tests on the performance of the Drift Tube chambers in a gamma radiation environment have been reported, showing the effects of the maximum expected rate at LHC in the barrel detector due to gamma radiation background. These tests have been performed on a prototype chamber in a test beam facility setup. An increasing rate of the out-of-time LTRG track segments is the most sizeable effect reported: in the maximum irradiation scenario the noise rate due to the LTRG track segments increases by a factor 2.3 per Muon Chamber with respect to the noise rate in the no-radiation scenario (see Table 9.11).



**Fig. 10.31:** Left side:  $\eta$  resolution achieved by the coarse  $\eta$  assignment. Right side:  $\eta$  resolution achieved using the middle superlayer (TS $\theta$ ) segments.



**Fig. 10.32:** Pseudorapidity resolution versus  $\eta$  coordinate for coarse and improved  $\eta$  assignment.



**Fig. 10.33:**  $\eta$  Track-Finder pseudorapidity resolution and matching probability for each matching scheme (matching schemes defined in Section 10.4.3).

This effect can also increase the rate of the out-of-time candidates in the DTTF system, while the in-time efficiency and ghost rate would not change. The out-of-time candidate rate is already reduced by the effect of the extrapolation filter described in Section 10.4.1. In order to estimate the effects of a high radiation background on the DTTF system, a rate four times higher than the expected radiation rate at LHC has been simulated. Table 10.8 compares three different schemes in terms of efficiency loss and out-of-time candidate rate.

Scheme A and Scheme B select the candidates according to their quality. The Sector Processor can implement quality codings other than the one given in Table 10.2, and the Barrel Sorter can sort the candidates according to their quality assignment. In particular, Scheme A assigns the lowest quality code to a candidate reconstructed using just two track segments, the source segment being an uncorrelated segment and the target being a low quality uncorrelated segment. The lowest quality tracks are then cut out by the Barrel Sorter. This scheme halves the rate of out-of-time candidates while leaving the overall efficiency almost unchanged. A more stringent selection is performed in Scheme B, where the lowest quality code is assigned to a candidate reconstructed using just two track segments, either one of the two being an uncorrelated low quality segment. Using Scheme B the out-of-time rate is comparable with the out-of-time rate in the expected radiation scenario (see Table 10.8), while the efficiency decreases by 3.4%. These are just examples of the flexibility of the DTTF system in reducing the impact of possible background scenarios.

**Fig. 10.34:** Integrated single muon trigger rate versus  $p_T$  thresholds ( $p_T$  scale at 90% efficiency,  $|\eta| < 1.04$ ), based on minimum bias event samples for an LHC luminosity of  $10^{34} \text{ cm}^{-2} \text{ s}^{-1}$ .

**Table 10.8:** High radiation background effects on the DTTF performance. Results from a DTTF simulation in a radiation scenario four times higher than the expected rate at LHC. See text for the description of Schemes A and B.

|                          | Efficiency loss (%) | out-of-time candidate rate (%) |
|--------------------------|---------------------|--------------------------------|
| <b>Default algorithm</b> | /                   | 12.7                           |
| <b>Scheme A</b>          | 0.8                 | 6.2                            |
| <b>Scheme B</b>          | 3.4                 | 1.9                            |
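The two quality assignments described in the text can be expressed as simple predicates, which also makes it explicit why Scheme B is the more stringent one. This is an illustrative sketch only: the `Seg` enumeration, the mapping of "low quality uncorrelated" to LTRG uncorrelated segments, and the function names are assumptions for this example and do not reproduce the quality coding of Table 10.2.

```python
from enum import Enum


class Seg(Enum):
    """Hypothetical track segment classes used for this illustration."""
    CORRELATED = "correlated"
    HTRG_UNCORRELATED = "HTRG uncorrelated"
    LTRG_UNCORRELATED = "LTRG uncorrelated"  # assumed to be the 'low quality' class


UNCORRELATED = {Seg.HTRG_UNCORRELATED, Seg.LTRG_UNCORRELATED}


def lowest_quality_scheme_a(source: Seg, target: Seg, n_segments: int) -> bool:
    """Scheme A: two segments, uncorrelated source, low quality uncorrelated target."""
    return (n_segments == 2
            and source in UNCORRELATED
            and target is Seg.LTRG_UNCORRELATED)


def lowest_quality_scheme_b(source: Seg, target: Seg, n_segments: int) -> bool:
    """Scheme B: two segments, either one being a low quality uncorrelated segment."""
    return n_segments == 2 and Seg.LTRG_UNCORRELATED in (source, target)
```

Every candidate that receives the lowest quality code under Scheme A also receives it under Scheme B, but not the reverse, so Scheme B removes more candidates, which is consistent with its larger efficiency loss and lower out-of-time rate in Table 10.8.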

### 10.9.6 Muon Chamber Mis-alignment Effects

The individual and global performances of the Muon Chambers can affect the performance of the DTTF system. The most sizeable effect is related to the relative mis-alignment of Muon Chambers inside a sector. Since the  $p_T$  assignment algorithm is based on the difference of the  $\phi$  coordinate in two different Muon Stations, a shift from the nominal position of a Muon Chamber could degrade the  $p_T$  resolution.

In particular, a longitudinal shift of a Muon Chamber from its nominal position could result in an under-estimation of $p_T$ for a high momentum candidate (the greater the $\phi$ difference, the lower the assigned $p_T$, see Fig. 10.9). In order to estimate the mis-alignment effect, a single muon sample with fixed $p_T$ has been used ($p_T=200 \text{ GeV}/c$). The efficiency measured when applying a $p_T$ threshold cut can be used to estimate the effects of a longitudinal shift. Fig. 10.35 shows the percentage efficiency loss due to a mis-alignment from the nominal position of MB1 and MB2, which are the chambers where the effect is most pronounced. It can be seen that a sizeable effect shows up for a longitudinal displacement greater than 5 mm in MB1, which is above the mechanical placement tolerance of the Muon Barrel chambers [10.8].

**Fig. 10.35:** Muon Chambers relative mis-alignment effects. The plots show the percentage efficiency loss when different  $p_T$  thresholds are applied, due to a longitudinal shift from the nominal position of MB1 (top) and MB2 (bottom).

On the other hand, a mis-alignment effect can be recovered after calibration runs, which allow the look-up tables for the extrapolations and for the parameter assignment to be re-tuned, since each Sector Processor can load different look-up tables tuned for the chambers of its own sector.

## 10.10 Prototypes and Tests

The DTTF system design needs a series of prototypes with increasing functionality. The Simulation Mapping Prototype (Fig. 10.36) was used to test the first simulation model's functions.



**Fig. 10.36:** Simulation Mapping prototype.

These functions were refined and the basic connectivity model was changed using the first prototype results [10.4].

The Technology Evaluation Prototype (Fig. 10.37) is used to check the feasibility and handling of basic technology issues, such as the links, their handling and speed, the FPGA handling and programming, the test data generation and the control software-hardware solutions.

The Functionality Evaluation Prototype will be built as a board with possible realization of all DTTF functions. The same will be done for the  $\eta$  Track-Finder. In this stage no separate handling will be done for the barrel-endcap overlap connection features.



**Fig. 10.37:** Technology Evaluation prototype.

The Pre-Production Prototypes will show the full functionality of the DTTF  $\phi$  and  $\eta$  boards. They will allow one or several crates to be built up and will provide a development bench for the software activity.

## 10.11 Status and Schedule

### 10.11.1 Design Status

The DTTF conceptual hardware design is completed. Projecting the hardware functions into programmable circuits is finished. All chips are described either by a VHDL behavioral model or an Altera AHDL model. Some models that are available in the AHDL version are reconstructed in the VHDL one, but the tests show that for the final design in some cases the AHDL approach might result in a more effective chip design.

All chip behavioral models mentioned above have been compiled and the hardware feasibility has been checked. The compilation included the production of chip-level VHDL models, which describe the future circuits down to the last detail. To analyze the model as an entity a testbed VHDL model was created. This testbed connects all chip-VHDL models in the same way as they will be connected on the PCB. In addition the testbed contains two external behavioral models. One of them is able to read files with input data that are generated by the CMSIM and ORCA detector and front-end electronics simulations. This model evaluates these files and feeds their content into the board model in the same way as it will happen through the input links. The other external model receives the outputs of the testbed: not only the final output representing the DTTF output towards the Wedge Sorter, but also the intermediate signals carried on the chip-to-chip connections. All these signals are captured and written to files in a human-readable ASCII format. As this file format is fixed, these output files can be used for off-line evaluation as well.

### 10.11.2 Prototype Status

The Simulation Mapping Prototype was evaluated in late 1998. This activity has led to a new design structure embodied in the hardware simulation models. The Technology Evaluation Prototype is under test. The first series of programming and link tests is completed and the second series is planned using the results of the first one. The Functionality Evaluation Prototype is in the planning phase, but before it can be built the results of the Technology Evaluation Prototype tests must be analyzed and fully understood.

### 10.11.3 Control Software Status

A set of control programs was developed for the prototyping activity. They contain the drivers, the framework and the utilities for hardware control. These programs will constitute the backbone of the final control software.

### 10.11.4 Schedule

The prototype setup should deliver results by the 1st quarter of the year 2001. Using these results the DTTF design will be finalized and the corresponding design review will take place in the 2nd quarter of 2001. In parallel with this the board design will start. The first DTTF Functionality Evaluation prototype board will be ready in the 3rd quarter of 2001. Thorough stand-alone tests will be performed at single board level in 2001, and connection and track finding tests with neighbouring boards in 2002.

In parallel with the DTTF prototype the hardware of the  $\eta$  Track-Finder processor will be designed. This will first result in developing the VHDL behavioral models, their synthesis and integration into the full DTTF VHDL testbed in order to investigate the interface. After these tests the  $\eta$  Track-Finder board will be designed by end of 2001.

The next step will be the development of the barrel-endcap overlap DTTF. This also needs VHDL models that contain all the extensions that allow it to accept and evaluate CSC data, and to forward DT data to the CSC Track-Finder processor. The development and prototype design of the overlap DTTF version is due in the first half of 2002.

The production of the boards will follow the requirements of the CMS trigger construction schedule. Preproduction prototypes are planned for mid-2003 and the production boards will be available in the first quarter of 2004.

The software development is scheduled to follow a parallel process with the hardware development. The driver units will be tested with the corresponding hardware implementation and the control software is planned to follow the hardware test needs. The final version that will be used with the CMS run should be finished during the trigger system test runs.

## References

- [10.1] A. Kluge, T. Wildschek, CMS Note 1997/091.
- [10.2] A. Kluge, T. Wildschek, CMS Note 1997/092.
- [10.3] A. Kluge, T. Wildschek, CMS Note 1997/093.
- [10.4] G. M. Dallavalle et al., CMS Note 1998/042.
- [10.5] G. M. Dallavalle et al., “Issues Related to the Separation of the Barrel and Endcap Muon Trigger Track-Finders”, CMS Note in preparation.
- [10.6] M. Kloimwieder, CMS Note 1999/054.
- [10.7] J. Erö, “New Approach for the CMS Muon Trigger Track Finder Processor”, Proceedings of the Fifth Workshop on Electronics for LHC Experiments, Snowmass CO/USA, CERN/LHCC/99-33, p.309.
- [10.8] CMS, The Muon Project, Technical Design Report, CERN/LHCC 97-3.

# 11 Cathode Strip Chamber Local Trigger

## 11.1 Requirements

The basic criteria for the CSC Local Trigger have been established for some time [11.1]. Because the several megahertz of low-momentum muons produced at full LHC luminosity far exceeds the data acquisition system bandwidth, the Level 1 trigger electronics of the muon detectors of CMS must measure the momentum of penetrating particles using muon system information alone. The muon trigger selection will be on the observed  $p_T$ , and for a given  $p_T$  cut the muon momentum increases as pseudorapidity increases in the forward region. The best trigger momentum measurement comes from combining a vertex constraint with precise measurements of the bend coordinate ( $\phi$ ) supplied by the CSC Local Trigger in each muon station. The electronics that combine individual station coordinates into tracks and assign  $p_T$ , the CSC Track Finder, is described in the following chapter.

The CSC chambers contain six layers of radial cathode strips to precisely measure the  $\phi$  coordinate and six layers of nearly orthogonal anode wires whose signals are used to measure the non-bend coordinate ( $\eta$ ) [11.2]. The CSC cathode strip technique of measuring position by the centroid of charge deposition works better in the presence of high-momentum muon bremsstrahlung than drift-time measuring devices. Another concern for the CSC system, because of its forward location, is the high level of backgrounds expected from punchthrough pions, low-momentum primary muons, secondary muons, and neutron-induced gamma rays. Figure 8.14 shows that the hit rate reaches as high as  $500 \text{ Hz/cm}^2$  in the innermost portion of ME1/1 and as high as  $70 \text{ Hz/cm}^2$  elsewhere. The integrated hit rates are as high as  $10 \text{ kHz}$  per strip and  $20 \text{ kHz}$  per wire group.

The CSC Local Trigger uses the six-layer redundancy of the CSC chambers to provide precise position information as well as to provide high rejection power against backgrounds. Muon segments, also known as Local Charged Tracks (LCTs), are found in the nearly orthogonal cathode and anode projections by somewhat different algorithms and by different electronic boards. For cathode and anode segments (CLCTs and ALCTs), the number of layers hit and the position and track angle through the chamber are reported. Up to two CLCTs and two ALCTs are found in each chamber during any bunch crossing. The two projections are then combined into 3-dimensional LCTs by timing coincidence. Each correlated LCT then provides to the CSC Track Finder a precision measurement of the bend coordinate ( $\phi$ ), bend coordinate angle of passage through the chamber ( $\phi_b$ ), approximate measurement of the non-bend angle coordinate ( $\eta$ ), and identification of the muon bunch crossing ( $bx$ ).

The basic design of the CSC Local Trigger electronics is driven by several general performance requirements:

1. Since the CSC Track Finder correlates 2-4 CSC stations, the LCT efficiency must be larger than 95% to have a high overall track finding efficiency. The LCT efficiency can be factored into four largely independent efficiencies: CLCT pattern finding, ALCT pattern finding, ALCT bunch crossing assignment, and ALCT-CLCT time coincidence.

2. The bend coordinate must be measured to an RMS accuracy of 0.15 strips (typically 1 mm), so that the CSC Track Finder can make a meaningful momentum measurement up to 100 GeV/c [11.3].
3. The system must be able to function well in the presence of high anode and cathode background hit rates.
4. The trigger must operate essentially without dead time.
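The factorization in requirement 1 can be illustrated with a short calculation. This is only a sketch: the per-factor values are illustrative assumptions rather than measured numbers, and `all_stations_efficiency` uses the simplifying assumption that a track needs an LCT in every station considered, which is stricter than the real Track Finder logic.

```python
from math import prod


def lct_efficiency(clct_pattern, alct_pattern, alct_bx, coincidence):
    """Overall LCT efficiency as the product of the four largely
    independent factors named in requirement 1."""
    return prod([clct_pattern, alct_pattern, alct_bx, coincidence])


def all_stations_efficiency(eff_lct, n_stations):
    """Probability that each of n_stations delivers an LCT
    (simplifying assumption: stations are independent and all are required)."""
    return eff_lct ** n_stations


# Illustrative value only: 99% for each of the four factors.
eff = lct_efficiency(0.99, 0.99, 0.99, 0.99)
meets_requirement = eff > 0.95
```

Under these assumptions each individual factor has to stay close to 99% for the product to remain above the 95% target.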

There are additional electronics requirements that influence the design:

1. The CSC Track Finder must receive the LCTs by 52 bx, 1.3  $\mu$ s after the time of primary interaction.
2. The electronics must be able to survive the radiation level in the endcap muon region for 10 LHC years at  $10^{34}/\text{cm}^2/\text{s}$  luminosity. For electronics mounted on the chambers this is  $6.2 \times 10^{11}$  neutrons/ $\text{cm}^2$  and 1.8 krad of ionizing particles. For electronics mounted on the periphery of the endcap muon iron disks, this is  $4.1 \times 10^{10}$  neutrons/ $\text{cm}^2$  and 0.13 krad of ionizing particles. These estimates are quoted with a factor of three uncertainty.

The design of the CSC Local Trigger has evolved in parallel with the design of the CSC chambers. Therefore, the trigger design makes several requirements on the CSC detectors:

1. The chambers should be able to operate at or above a high voltage corresponding to a minimum ionizing signal charge of 112 fC when summed over cathode strips. The preamplifier noise level is approximately 1-2 fC. If the chamber signals are smaller than 50 fC, the momentum resolution of the CSC trigger will be degraded.
2. The CSC chambers should be located with an initial precision better than 1 cm from their nominal positions, in all coordinates, in order to supply a crude muon trigger. The alignment precision required in  $r\phi$  coordinates for optimum trigger momentum resolution is 1 mm. The trigger system contains look-up tables to reach this accuracy after some period of analysis of alignment system or muon track data.
3. The time distribution of hits arriving on the anode wires should have an RMS width no larger than 10 ns in order for the multi-layer anode bunch identification algorithm to correctly identify the bunch crossing with high (above 99%) accuracy.
4. The low-voltage power distribution must be designed to avoid over-voltage or under-voltage accidents, due to the inaccessibility of much of the system.

## 11.2 Overview

The Endcap CSC Muon Local Trigger receives signals from front-end cathode and anode electronic boards connected to the Cathode Strip Chambers. Segments of muon tracks are found separately in the nearly orthogonal anode and cathode views. In each view, segment positions, angles, and timing (bunch crossing) are measured. The cathode electronics design is optimized to measure the  $\phi$  coordinate with high precision, while the anode electronics design is optimized to determine the muon bunch crossing with high efficiency. Cathode and anode segments are correlated in time as well as the number of layers hit. These segments are found in the presence of large background rates from:

1. neutron-induced showers<sup>1</sup>,
2. decay muons, especially at the lowest momenta and in the first muon station,
3. punch-through pions, particularly in the first muon station,
4. primary muons having low energy, and
5. bremsstrahlung showers from the high-momentum muons themselves.

The maximum rate from neutron-induced gamma rays is approximately 10 kHz per strip for cathode hits and 20 kHz per wire group for anode hits. The maximum hit rates from all charged-particle sources are approximately a factor of 10 lower. Simulations indicate that at maximum luminosity, several background clusters exist within each CSC at any given time. To reduce the otherwise huge trigger rate, CSC trigger primitives are formed from tight spatial coincidences of clusters in the 6 chamber layers.

The CSC Local Trigger system selects the two highest-quality LCTs in each CSC chamber and forwards them to the CSC Trigger Track Finder. The CSC Trigger Track Finder operates with a basic segmentation of  $60^\circ$ . The ME1 chambers and outer chambers of ME2-4 cover  $10^\circ$ , while the inner chambers of ME2-4 cover  $20^\circ$ . Figure 11.1 shows how the chamber segmentations are mapped into the CSC Trigger Track Finder segmentation, as well as the  $30^\circ$  sectors (ME1/3 only) for use by the Barrel Track Finder.

The CSC Local Trigger forms LCTs from cathode and anode signals according to the block diagram shown in Figure 11.2. This figure also shows the physical location of each part of the CSC Local Trigger system. The division between the CSC Local Trigger and the CSC Track Finder are the optical links that carry LCT data from the collision hall to the counting room.

The most precise track measurement is obtained by charge digitization and precise interpolation of the cathode strip charges. A simpler and more robust method is used for the CSC local trigger to achieve half-strip localization of the muon track in each cathode layer [11.4]. This is done with a 16-channel “comparator” ASIC that compares the amplified and shaped signals from adjacent strips. If a strip signal is found to be larger than those on its neighbors, a hit is assigned to the strip. Simultaneous comparison of left versus right neighbor strip signals allows assignment of the hit to the right or left side of the central strip, effectively improving the position resolution by a factor of two. The six layers are then brought into coincidence in “Local Charged Track” (LCT) pattern circuitry as shown in Figure 11.3. This establishes the position of the muon to an RMS accuracy of 0.15 strip widths. Strip widths range from 6 to 16 mm. Because of the slow 150 ns rise-time of the cathode amplifier/shapers, the cathode electronics does not uniquely identify the bunch crossing.

---

<sup>1</sup> Large numbers of neutrons over a broad range of energy are produced from secondary hadronic interactions in the forward region of CMS. These neutrons induce nuclear reactions that produce photons. These photons in turn create electrons that deposit energy in the CSC gas.

**Fig. 11.1:** The data mapping from CSC chamber-level LCT information into  $60^\circ$  sector information for the CSC Track Finder. Also shown are interfaces to and from the Barrel DT system for handling overlap regions.

In the CSC muon system, anode wires are spaced by about 3 mm. The anode wires are hard-wired together (‘ganged’) at the readout end in groups of 10-15 wires to reduce channel count. The algorithm used in determining muon segment position and bunch crossing in the anode view is shown in Figure 11.4. Anode signals are fed into amplifier/constant-fraction discriminators. Since the drift time can be longer than 50 ns, a multi-layer coincidence technique in the anode LCT pattern circuitry is used to identify the bunch crossing. For each spatial pattern of anode hits, a low coincidence level, typically 2 layers, is used to establish timing, whereas a higher coincidence level, typically 4 layers, is used to establish the existence of a muon track.
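The two-level coincidence can be sketched as follows. This is a minimal illustration, not the ALCT firmware: the per-layer arrival times are given in bunch-crossing units, and the stretch length of 6 bx during which a hit stays active is an assumed value chosen only for the example.

```python
def active_layers(hit_bx, t, stretch):
    """Number of layers whose (stretched) hit is active at bunch crossing t."""
    return sum(1 for h in hit_bx if h is not None and h <= t < h + stretch)


def assign_bunch_crossing(hit_bx, low=2, high=4, stretch=6):
    """hit_bx: arrival bx of the hit in each of the six layers of one
    spatial pattern (None if the layer has no hit).

    The bx at which `low` layers first coincide sets the timing; the
    candidate is kept only if `high` layers coincide within the stretch
    window that follows. Returns the assigned bx, or None.
    """
    present = [h for h in hit_bx if h is not None]
    if not present:
        return None
    previous = 0
    for t in range(min(present), max(present) + stretch):
        n = active_layers(hit_bx, t, stretch)
        if n >= low and previous < low:  # low-level coincidence just formed
            if any(active_layers(hit_bx, t2, stretch) >= high
                   for t2 in range(t, t + stretch)):
                return t
        previous = n
    return None


# Five layers hit within three bunch crossings: timing is set at bx 10.
bx_track = assign_bunch_crossing([10, 11, 10, 12, None, 11])
# Two isolated hits reach the low level but never the high level.
bx_noise = assign_bunch_crossing([10, None, None, None, None, 11])
```

The early hits fix the time while the later, slower-drifting hits only confirm the track, so a long drift time in some layers does not shift the assigned bunch crossing.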

**Fig. 11.2:** Block diagram for data flow in the CSC Local Trigger.

**Fig. 11.3:** Cathode LCT formation from cathode comparator bits.

**Fig. 11.4:** Anode LCT formation from wire group hits (left), and bunch crossing assignment based on numbers of hit layers (right).

The CSC Local Trigger electronics system consists of seven types of boards:

1. CFEB - Cathode front-end boards. These boards amplify the cathode signals. After amplification, the CFEB boards contain parallel and independent trigger and precision charge readout data paths. In the trigger path, the positions of charge clusters (hits) are digitized in units of one-half of a cathode strip per plane at the 40 MHz LHC frequency. These hits are compressed by a factor of four and sent to the CLCT/TMB boards described below. (In the precision charge readout path [11.5], charge is stored in switched capacitor arrays until a Level 1 Accept signal is received, and then the charges are digitized using 12-bit 20-MHz ADCs.)

2. AFEB - Anode front-end boards. These boards contain a combined amplifier/constant fraction discriminator ASIC to digitize the anode information. The anode hits are sent to the ALCT boards described below.
3. ALCT - Anode LCT-finding boards. These boards latch the anode hits at 40 MHz, find hit patterns in the six-layer chambers that are consistent with having originated at the bunch crossing point, and determine the muon bunch crossing by a multiple-layer coincidence timing technique. Up to two anode LCTs can be found per chamber. The anode LCT information is sent to the CLCT/TMB board described below.
4. CLCT/TMB - a combination of Cathode LCT-finding circuits plus Trigger Motherboard circuits. The CLCT section of these boards decodes the pattern of cathode hits from the CFEBs, and finds half-strip hit patterns in the six-layer chambers that are consistent with high-momentum muon tracks. The TMB section of these boards performs a time coincidence of anode and cathode LCT information, and when a coincidence is found, sends the information to the MPC board described below. The TMB selects up to two LCTs based on quality cuts. The TMB may perform a coincidence of LCT positions with RPC hits to reduce the likelihood of ‘ghost’ hits in the case that two or more LCTs are found. Upon receipt of a Level 1 Accept signal (L1A), the anode LCT, cathode LCT, and raw hits information is sent through FIFOs to the DAQMB board described below.
5. MPC - Muon Port Cards. Each MPC receives the LCTs from all of the CLCT/TMB cards in one sector of one endcap muon station, selects the three ‘best’ LCTs, and sends them over optical fiber links to the CSC Track Finder electronics located in the CMS counting room.

6. DAQMB - Motherboards with DAQ interfaces. These boards are part of the trigger system in the sense that they record the anode and cathode LCT and raw hits data in the case of a Level 1 Accept signal (as well as recording the precision charge/position information). The DAQMB sends the data over optical fiber links to the CSC muon system Detector-Dependent Unit (DDU) located in the CMS counting room.
7. CCB - Clock and Control Boards. These boards are the interface from the global CMS Trigger, Timing, and Control (TTC) system [11.6] to the CSC muon system, distributing those signals which are required for operation of the CSC electronics.
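The two selection stages described for the CLCT/TMB and MPC boards (up to two LCTs per chamber, then the three best per sector of a station) can be sketched as below. The `LCT` record and its integer `quality` field are hypothetical stand-ins for the real quality information, and ties are broken simply by input order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LCT:
    chamber: int     # chamber index within the sector (hypothetical field)
    quality: int     # hypothetical figure of merit; larger is better
    half_strip: int  # cathode position in half-strip units


def select_best(lcts, n):
    """Return the n highest-quality LCTs (stable for equal quality)."""
    return sorted(lcts, key=lambda lct: lct.quality, reverse=True)[:n]


def mpc_select(lcts_by_chamber):
    """TMB stage keeps up to two LCTs per chamber; MPC stage keeps three."""
    kept = []
    for chamber_lcts in lcts_by_chamber.values():
        kept.extend(select_best(chamber_lcts, 2))
    return select_best(kept, 3)


sector = {
    0: [LCT(0, 5, 12), LCT(0, 9, 40), LCT(0, 7, 41)],
    1: [LCT(1, 8, 100)],
    2: [LCT(2, 3, 7), LCT(2, 6, 90)],
}
best_three = mpc_select(sector)
```

In this example the third LCT of chamber 0 is already dropped at the TMB stage, so the MPC chooses among five candidates rather than six.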

Figure 11.5 shows the physical organization of the system. The CFEB, AFEB, and ALCT boards are mounted on the chambers, while CLCT/TMB, MPC, DAQMB, and CCB boards are housed in crates mounted around the periphery of the endcap iron disks.



**Fig. 11.5:** Physical layout of the CSC trigger electronics.

There are 4 or 5 CFEB boards on each chamber, except for ME1/1 where there are 8. Each CFEB receives inputs from 96 cathode strips. There are 18, 24, or 42 AFEBs per chamber, corresponding to chambers that have 48, 64, or 112 wire groups per plane. Each AFEB receives input signals from 16 anode wire groups. There is one ALCT mounted on each chamber. The on-chamber CFEB and ALCT boards are mounted on the side of the chamber away from the iron disk to which the chamber is attached. The small AFEB boards are plugged into connectors on one edge of the chambers, close to the anode wires. Halogen-free twisted-pair cables carry discriminated signals from the AFEB to the ALCT boards using differential LVDS signal levels.
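The AFEB counts quoted above follow directly from the channel arithmetic, as this short check shows. It assumes only what the text states: six planes per chamber and 16 wire-group inputs per AFEB.

```python
LAYERS_PER_CHAMBER = 6
AFEB_CHANNELS = 16


def afebs_per_chamber(wire_groups_per_plane):
    """Number of 16-channel AFEBs needed to read all six anode planes."""
    total_channels = wire_groups_per_plane * LAYERS_PER_CHAMBER
    boards, remainder = divmod(total_channels, AFEB_CHANNELS)
    if remainder:
        raise ValueError("channel count is not a multiple of 16")
    return boards


afeb_counts = [afebs_per_chamber(n) for n in (48, 64, 112)]
```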

There is one CLCT/TMB and one DAQMB per chamber. These are located in crates mounted on the periphery of the endcap iron disks. Trigger signals from the on-chamber CFEB and ALCT are sent on high-quality (low-skew) cables to the front panel of CLCT and DAQMB boards located in the peripheral crates. Each peripheral crate services eight (ME1) or nine (ME2-4) CSC chambers. Each peripheral crate services  $20^\circ$  in  $\phi$  in ME1 and  $60^\circ$  in ME2, ME3, and ME4. There is also one CCB, one MPC, and a VME controller in each CSC peripheral crate. The CCB receives clock and control signals from the TTC system [11.6] and distributes them on a custom backplane within a peripheral crate. The custom backplane is also used for data transfer from CLCT/TMB modules to the DAQMB modules, and from CLCT/TMB modules to the MPC module. A possible layout of a peripheral crate is shown in Figure 11.6: each chamber is serviced by a CLCT/TMB board paired with a DAQMB board. The central functions of the CCB and MPC modules are reflected in the central locations of these modules within the peripheral crates, which minimizes signal propagation delay times and total signal routing length on the custom backplane.

**Fig. 11.6:** Slot assignments in a CSC electronics crate mounted on the periphery of the endcap iron disk.

## 11.3 Cathode Signal Processing

### 11.3.1 Amplification and Shaping

Each CFEB board reads out 96 cathode strips, arranged 16 strips wide by 6 layers deep. Halogen-free twisted-pair cables carry cathode signals from the edge of the chamber to the CFEBs mounted nearby on the surface of the chamber. The input cathode signals are sent into 16-channel amplifier-shaper ASICs. Each input signal is amplified and shaped into pulses having peak voltage approximately 100 mV per MIP (minimum ionizing particle) for a typical chamber gain of  $10^5$ . To optimize high-rate performance, circuits to cancel the long tail of the chamber pulse due to ion drift are integrated into the shaper. The output pulse shape is semi-Gaussian. The peaking time and the time for return to baseline with chamber pulse inputs are each about 150 ns. One output of each preamp/shaper channel is connected to a switched capacitor array for storage before possible precision digitization by a commercial ADC. The other output is connected to the trigger path which uses a comparator-network ASIC to digitize the CSC trigger cathode signals. The shaper output signals corresponding to cathode strips at the edge of CFEB towers are sent to adjacent CFEB cards to allow the comparator-network ASIC to function in a seamless manner. Channel by channel gain calibration is done using a set of precisely matched capacitors that couple a test pulse to each input channel. Four levels of charge injection are available under programmable control for each strip. No calibration procedure is required for the cathode trigger.

### 11.3.2 Trigger Digitization

In CSC chambers, charge collected on the anode wires produces an opposite-sign signal on several strips. For precision track measurement, the position is determined by precise interpolation of the cathode strip charges. For the CSC local trigger, a simpler and more robust method is used to gain somewhat coarser resolution. A threshold is applied to determine strips containing significant charge depositions. The threshold voltage is set by a JTAG-controlled DAC. The trigger position is determined to one-strip width accuracy by determination of the strip with maximum signal. A further factor of two improvement to half-strip accuracy is gained by comparison of signals from the adjacent strips. Were this determination to be perfect, the RMS position resolution would be  $0.5/\sqrt{12} = 0.144$  strip widths, or about 1.5 mm per layer for a typical cathode strip width of 1.0 cm. These comparisons of strip charges to threshold and to each other are accomplished in the comparator-network ASIC as shown in Figure 11.7: the pulse from the preamp/shaper for strip  $N$  is compared to a pre-set threshold level and compared to pulses from neighboring strips (strip  $N-1$  and strip  $N+1$ ). Strip  $N$  has the peak charge if its pulse is larger than all three. The track hit position is localized to either right or left half of strip  $N$  by a fourth comparator which compares pulses from strip  $N-1$  and strip  $N+1$ . The output levels from the comparators are fed into AND gates and latched to produce two digital signals  $L_N$  and  $R_N$  indicating hits on strip  $N$  left or right side, respectively.
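The comparator decision described above can be sketched in software. This is an illustration of the logic only, not of the ASIC: pulse heights are taken as already-sampled peak values, "left" is assumed to mean the lower strip number, and an exact tie between the two neighbors is arbitrarily assigned to the right half.

```python
import math


def comparator_hits(pulses, threshold):
    """pulses: peak pulse heights of consecutive strips in one layer.

    Strip N is a hit if its pulse exceeds the threshold and both
    neighbors; the fourth comparison (N-1 versus N+1) picks the half.
    Returns a list of (strip, side, half_strip_index).
    """
    hits = []
    for n in range(1, len(pulses) - 1):
        left, centre, right = pulses[n - 1], pulses[n], pulses[n + 1]
        if centre > threshold and centre > left and centre > right:
            side = "L" if left > right else "R"
            half_strip = 2 * n + (0 if side == "L" else 1)
            hits.append((n, side, half_strip))
    return hits


# RMS of a uniform distribution over half a strip, in strip widths.
ideal_rms = 0.5 / math.sqrt(12)

# Charge peaks on strip 2 and leaks more into strip 3 than into strip 1.
found = comparator_hits([5, 20, 100, 40, 3], threshold=30)
```

Here strip 2 wins all three comparisons and the larger right-hand neighbor places the hit in its right half, that is half-strip index 5.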



**Fig. 11.7:** Comparator-network ASIC block diagram.

The comparator ASIC receives inputs from 16 strips plus two additional neighbor strips. The signal is compared to threshold every clock cycle. After the signal first exceeds the threshold, a programmable delay (typically 150 ns) allows the cathode signals to reach their peaks. After this delay, the strip signal is compared to neighbor signals. The slow control system is used to select the operating mode for the comparator ASICs and to set the threshold DAC. A JTAG link from the DAQMB writes three CFEB register bits that select the comparator's peaking time of 25 ns to 200 ns in 25 ns steps. In addition, two register bits select one of three possible trigger modes. Mode 0 requires the preamp/shaper output voltage to be above threshold for only one clock. Mode 1 requires that the voltage still be above threshold after the peak delay time. Mode 2 requires that the voltage remain above threshold for every clock cycle up to and including the peak delay time.
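The three trigger modes can be written out as a small decision function. This is a sketch under stated assumptions: the signal is represented as one above-threshold flag per 25 ns clock, starting at the clock where it first exceeds threshold, and the mapping of the 3-bit register code to a delay (code 0 meaning 25 ns) is assumed for illustration.

```python
CLOCK_NS = 25


def peaking_delay_clocks(register_code):
    """Assumed mapping of the 3-bit code (0..7) to 1..8 clock cycles."""
    if not 0 <= register_code <= 7:
        raise ValueError("register code must fit in three bits")
    return register_code + 1


def comparator_accepts(above, delay_clocks, mode):
    """above[i]: True if the shaper output is above threshold i clocks
    after it first crossed threshold (so above[0] is True for a signal)."""
    if len(above) <= delay_clocks:
        raise ValueError("not enough samples to cover the peaking delay")
    if not above[0]:
        return False
    if mode == 0:   # above threshold at the first clock only
        return True
    if mode == 1:   # also above threshold after the peaking delay
        return above[delay_clocks]
    if mode == 2:   # above threshold at every clock up to the delay
        return all(above[:delay_clocks + 1])
    raise ValueError("mode must be 0, 1, or 2")


delay = peaking_delay_clocks(5)  # 6 clocks = 150 ns
samples = [True, True, False, True, True, True, True]
decisions = [comparator_accepts(samples, delay, m) for m in (0, 1, 2)]
```

The brief dip below threshold at the third clock is tolerated in modes 0 and 1 but vetoes the hit in mode 2.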

Half-strip localization errors occur for 10-20% of hits, partly due to analog circuit noise and partly due to the presence of delta rays and muon bremsstrahlung. In all but about 2% of the hits, the discrepancy is limited to one half-strip width.

Internally, 32 half-strip bits are produced by the comparator ASIC. Digital circuitry in the comparator ASIC compresses the 32 half-strip bits into 8 output time-sequenced "di-strip triad" bits. This is accomplished in lossless fashion because of two facts. First, the comparison of neighbor strips to find the charge maximum only allows one of two adjacent strips to produce a half-strip bit at any one time. Second, the shaper signals develop slowly compared to the bunch crossing time. The output triad bits come from the comparator ASIC at 40 MHz, thus taking 75 ns compared to the preamp/shaper return to baseline time of about 150 ns. The first triad bit indicates the presence of a hit on one of two adjacent strips, the second indicates which of the two strips contains the hit, and the third bit indicates whether the hit was located on the left or the right side of the strip.
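The triad compression can be sketched as an encoder and decoder pair. The bit polarities (0 for the lower-numbered strip, 0 for the left half) are assumptions made for the example, and each of the three returned lists stands for the 8 output lines at one 40 MHz clock.

```python
N_DISTRIPS = 8  # 16 strips = 32 half-strips = 8 di-strips per ASIC


def encode_triads(half_strips):
    """half_strips: half-strip indices (0..31), at most one per di-strip.

    Returns three 8-bit frames: [hit present, which strip, which half].
    """
    frames = [[0] * N_DISTRIPS for _ in range(3)]
    for h in half_strips:
        if not 0 <= h < 4 * N_DISTRIPS:
            raise ValueError("half-strip index out of range")
        distrip = h // 4
        if frames[0][distrip]:
            raise ValueError("more than one hit in a di-strip")
        frames[0][distrip] = 1             # first bit: hit in this di-strip
        frames[1][distrip] = (h % 4) // 2  # second bit: strip within pair
        frames[2][distrip] = h % 2         # third bit: left or right half
    return frames


def decode_triads(frames):
    """Recover the half-strip indices from three time-sequenced frames."""
    return sorted(4 * d + 2 * frames[1][d] + frames[2][d]
                  for d in range(N_DISTRIPS) if frames[0][d])


frames = encode_triads([5, 30])
recovered = decode_triads(frames)
```

Because at most one half-strip per di-strip can fire, the three bits per di-strip carry the full position, which is why the 32-to-8 reduction loses no information.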

The digitization circuitry of the comparator ASIC can be controlled by several input lines. The "peaking" time delay between the first observation of a signal over threshold and the comparison of neighbor strips can be adjusted between 1 and 8 clock cycles (25 ns to 200 ns). The signal can be required to exceed threshold in three modes: only at one clock cycle, at each clock cycle up to the "peaking time", or at the first clock cycle plus the cycle at the "peaking time".

Analog and digital signals are sent between adjacent CFEB boards in order to perform seamless cluster finding across board boundaries. Six analog signals (one for each layer) are sent to neighboring boards on each side, and six analog input signals are received. These are in pairs that alternate signal with ground. Digital signals are needed as "carry" bits for the cluster finding logic. Carry bits are received from the left-adjacent board (which has lower strip numbers), and transmitted to the right-adjacent board (higher strip numbers). LVDS drivers and receivers are used to minimize noise.

The six comparator ASICs on each 96-channel CFEB card produce 48 single-ended output bits at 40 MHz. This data is fed into serializers that convert the data to a higher bit rate and send it out as differential LVDS signals, including a clock signal which is carried with the data. A low-skew output cable brings these signals to the CLCT boards located in crates mounted on the periphery of the endcap iron disks. The links also send their own ID number (0 or 1) and a "Link Alive" bit that indicates the CFEB is powered up. The cable also contains two signals that are not fed through the bit serializers: the 40 MHz strobe sent to the CFEB board in order to clock the digital portions of the comparator ASICs, and a reset signal.

### 11.3.3 Cathode LCT Pattern-Finding

A muon passing through a CSC chamber will produce distinctive patterns of half-strip hits in the six-layer endcap muon CSC chambers. By identifying these patterns, the CSC Local Trigger provides high rejection power against backgrounds. The largest background source, neutron-induced gamma ray conversions, is generally low in energy and produces mostly single-layer or short multi-layer hits. Other backgrounds, such as low-momentum muons or punch-through particles, often do not point well enough to the primary interaction region to be considered high-momentum muon candidates. The six-layer correlation can allow for half-strip location errors. These errors occur for 10-20% of hits, partly due to analog circuit noise, and partly due to the presence of delta rays and muon bremsstrahlung. The six-layer correlation also provides somewhat improved position resolution and provides a rough measurement of the bend angle ( $\phi_b$ ) within the approximately 15 cm path length between the first and the sixth plane of the chamber.

Patterns are found within half-strip patterns for tracks with  $p_T > 10 \text{ GeV}/c$  in ME1 and all tracks in ME2, ME3, or ME4. Low-momentum tracks ( $2.5 < p_T < 10 \text{ GeV}/c$ ) in ME1 bend the most, requiring an additional set of di-strip patterns. In ME1, where muon tracks have the largest curvature, a minimum number of layers (usually 4) with hits is required within the envelope of half-strip or di-strips shown in Figure 11.8. The envelope used in other stations is in general narrower, and depends on the station number and whether it is an inner or an outer CSC chamber. If gate array technology improves substantially by the time that CLCT implementation is frozen, more specific envelopes can be defined and a somewhat improved position resolution can be obtained.



**Fig. 11.8:** The envelope of cathode half-strip or di-strip hits used in CLCT pattern-finding.

On the CLCT/TMB board, inputs from the comparator ASICs are de-serialized and converted from differential LVDS to single-ended TTL levels. The clock signals returned with the comparator data from the CFEBs are synchronous with the LHC clock, but have different phases than the TMB/CLCT board clock. These phases depend on the lengths of cables between the CFEB and CLCT/TMB boards and may be inconvenient for reliable latching in the CLCT circuitry. Therefore, synchronization of the input comparator data on a time scale finer than one bunch crossing is necessary. A simple circuit eliminates the possibility of unreliable data reception from the CFEB. This circuit latches input comparator data on both rising and falling edges of the on-board clock. Then a programmable multiplexer selects the more favorable phase. Finally, a latch having a setup and hold time of less than one-half of a clock cycle stores the data on the rising edge of the on-board clock signal.

The comparator signals are then fed into one large field-programmable gate array (called variously FPGA or PLD, depending on the manufacturer) per CSC chamber. Hot or dead comparator di-strip channels can be masked on input to this device. The gate array performs the cathode segment-finding (CLCT) function shown in Figure 11.9. The CLCT gate array decodes the sequential comparator ASIC triad bits into up to 240 internal di-strip bits and up to 960 half-strip bits. The bits are “stretched” to a length such as 75 ns which allows for coincidence between layers in the presence of varying drift times. These bits are fed into the LCT trigger processor which looks for multi-layer coincidences within predetermined patterns. The magnetic field in the endcap causes bending of charged tracks in the azimuthal direction, transverse to the strips. In the first muon station, high  $p_T$  ( $>10$  GeV/c) tracks bend a maximum of 1.8 strips in the 15 cm between the first and sixth layer in the chamber. Low  $p_T$  (2.5-10 GeV/c) tracks will bend as much as 7.2 strips in the chamber. The amount of bending is much less in the other endcap muon stations.

An  $n$ -layer ( $1 \leq n \leq 6$ ) coincidence within the envelope of di-strips shown in Figure 11.8 gives a pre-trigger indication. Typically,  $n=2$ . When a pre-trigger is found, a time delay such as 50 ns is taken to allow long-drift time hits to arrive. Then an LCT is found using restrictive  $m$ -layer ( $1 \leq m \leq 6$ ) patterns among the half-strip bits. Typically,  $m=4$ . Each pattern has an 8-bit pattern number. Higher pattern numbers are assigned to straighter high-momentum tracks with more layers hit. Muon stubs that overlap two CFEBs are recognized as a single stub. If more than one stub is found within 16 adjacent strips, a priority encoder on the output of the pattern-finding circuitry selects the single best cathode LCT according to the pattern number. If more than two cathode LCTs are found within the entire chamber, the best two are retained according to the pattern number. For diagnostic purposes, the raw comparator bits can be stored in a pipeline and frozen when a CLCT is found, for later serial readout through the DAQ chain. This readout is initiated by reception of L1A for the appropriate bunch crossing.

Data for each cathode LCT is sent to the TMB in the format given in Table 11.1 [11.7]. The “Valid Pattern flag” indicates that a valid LCT pattern has been found and that information is being sent on the current clock cycle. The 8-bit pattern number encodes the number of layers, the pattern of half-strips or di-strips found, and whether the pattern consists of half-strips or di-strips. The bend bit indicates whether the track is heading towards lower or higher strip number. Although the half-strip and di-strip patterns can be distinguished by pattern number, a separate bit to distinguish these cases is sent for ease of decoding. For high $p_T$ patterns, the 8-bit half-strip ID is between 0 and 159. For low $p_T$ patterns, the 8-bit di-strip ID is between 0 and 39. This number corresponds to the position of the pattern selected at the third or “key” layer of the chamber. It should be noted that this does not require a hit to have actually been registered in the third chamber layer. Finally, the 5 low-order bits of the bunch crossing number with which the cathode LCT data is associated are sent for timing verification.
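As an illustration of the 24-bit CLCT output word, the following sketch packs and unpacks the fields using the widths listed in Table 11.1. The bit ordering (first field in the most significant bits) and the field names are assumptions made for this example; the actual bit assignment is not specified here.

```python
# Field widths follow Table 11.1. The ordering within the word is assumed.
CLCT_FIELDS = [
    ("valid", 1),       # Valid Pattern flag
    ("pattern", 8),     # pattern number, 0-255
    ("bend", 1),        # bend left/right
    ("strip_type", 1),  # half- or di-strip pattern flag
    ("strip_id", 8),    # half-strip ID 0-159 or di-strip ID 0-39
    ("bx_low", 5),      # 5 low-order bits of the bunch crossing number
]


def pack_fields(fields, values):
    """Pack named values into one integer, first field most significant."""
    word = 0
    for name, width in fields:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_fields(fields, word):
    """Inverse of pack_fields: recover the named values from the word."""
    values = {}
    for name, width in reversed(fields):
        values[name] = word & ((1 << width) - 1)
        word >>= width
    return values
```

For example, a bunch crossing number of 1234 contributes only `1234 & 0x1F`, i.e. 18, to the `bx_low` field.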

## 11.4 Anode Signal Processing

The anode trigger is designed to optimize the muon bunch crossing identification. Each input channel of the AFEB is a ganged group of wires (10 to 20) from a layer. The AFEB cards amplify and discriminate the anode signals and send logic pulses when anode signals exceed a pre-determined threshold.

**Fig. 11.9:** CLCT block diagram, including input data decoding, pattern lookup, pattern selection, trigger output formatting, and DAQ diagnostic data recording.

**Table 11.1:** CLCT output bits to Trigger Motherboard.

| CLCT Output Data                    | Bits |
|-------------------------------------|------|
| Valid Pattern flag                  | 1    |
| Pattern number (0-255)              | 8    |
| Bend left/right (0/1)               | 1    |
| Half or Di-strip pattern flag (0/1) | 1    |
| Half- or Di-strip ID (0-159, 0-39)  | 8    |
| bx low-order bits                   | 5    |
| Total                               | 24   |

Logic pulses from the AFEB discriminators are sent by differential LVDS to a single on-chamber ALCT board which finds track segments and determines the bunch crossing time of the track segment. Although the drift time distribution in CSC chambers has a small but long tail beyond 50 ns, the correct 25 ns bunch crossing can be identified with high efficiency by a coincidence technique. The ALCT board latches the anode wire group hits at 25 ns intervals, and identifies the bunch crossing from the first $n$-layer ($1 \leq n \leq 6$) coincidence of hits. Test beam studies show that once the system is properly timed any choice of multiplicity within $1 \leq n \leq 4$ yields a bunch crossing tagging efficiency in excess of 99%. The optimum *phase* of the 25 ns coincidence window depends on the required multiplicity. Although a 1-fold coincidence level produces high efficiency in the absence of backgrounds, it is susceptible to being fooled by early signals from the high rate of neutron-induced background hits, and higher coincidence levels are preferred.

The $n$-layer coincidence that identifies the bunch crossing also serves as a pre-trigger indication. As on the CLCT board, after a pre-trigger is found, a time delay such as 50 ns is taken to allow long-drift time hits to arrive. Then an LCT is found using restrictive $m$-layer ($1 \leq m \leq 6$) patterns among the wire group bits that are consistent with coming from the primary interaction. Typically, $m=4$.

The ALCT board sends data for up to two muon track segments to the CLCT/TMB board mounted in the peripheral crates, which forms a coincidence between anode and cathode LCTs. For diagnostic purposes, the ALCT muon stub information and the discriminator output pulses are latched and pipelined for readout into the DAQ system, providing hit/no-hit information for each of the wire groups. The connection to the DAQ system is provided through a cable to the CLCT/TMB module.

### 11.4.1 CSC Anode Digitization

Each AFEB card contains one 16-channel preamplifier-shaper-discriminator ASIC. The small AFEB cards are mounted on the sides of the CSC chambers close to the anode wires in order to minimize input signal path length and thus noise levels. Typical thresholds are 20 fC, while a MIP signal exceeds 100 fC. The anode amplifiers are similar to the ones on the CFEB boards, but optimized for the summed anode input capacitance and shaped with a peaking time of 30 ns. The amplifier outputs are sent into constant-fraction discriminators. Time walk between 20 fC and 100 fC levels is observed to be less than 4 ns. The output stage produces logic pulses with a minimum width of 35 ns and a maximum width equal to the time over threshold. The logic pulses are sent from AFEB cards to the on-chamber ALCT board using differential LVDS levels. The AFEB cards accept an analog test pulse input from the ALCT that is fed to the input of all amplifier channels simultaneously. Power levels of +5.5V and (in the case of ME1/1) -4.3V are required. The AFEB cards also accept a “stand-by” level that can be used to disable the ASIC, for instance, in case of latch-up. A single 40-pin cable between the ALCT and each AFEB carries the 16 differential discriminator output signals, power levels and ground, the threshold level, test pulse, and stand-by level.

### 11.4.2 Anode LCT Pattern Finding

LCT trigger patterns among a set of hits in anode wire groups are found in the same way as those for the strips. Here the segmentation is much coarser and the roads are straight lines to the interaction region, independent of  $p_T$ . The roads differ across a chamber due to the changing polar angle. The anode patterns are found within the envelope of wire group hits shown in Figure 11.10.

The functional diagram for the anode trigger is shown in Figure 11.11. Within each road, the number of layers containing hits is counted on every bunch crossing. Two programmable layer-coincidence levels are employed. When the pre-trigger level is exceeded, the bunch crossing time is identified and a fixed delay, such as 50 ns, is imposed to wait for long-drifting anode hits. After the delay, if the second coincidence level is exceeded, the muon track position is defined.
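The two-level coincidence logic described above can be sketched as follows. This is a simplified model under stated assumptions: it treats a single road, takes as input the set of layers with (already stretched) hits at each bunch crossing, uses a delay of 2 bx (50 ns), reads "level exceeded" as "at least that many layers", and evaluates only the first pre-trigger. The function name and the default pre-trigger level are illustrative.

```python
def find_lct(layers_hit_per_bx, n_pretrigger=2, m_pattern=4, delay_bx=2):
    """Return (pretrigger_bx, layers_at_decision) for one road, or None.

    layers_hit_per_bx[t] is the set of chamber layers (1-6) with a hit
    inside the road at bunch crossing t.
    """
    for bx, layers in enumerate(layers_hit_per_bx):
        if len(layers) >= n_pretrigger:
            # Pre-trigger: this bx tags the muon bunch crossing.
            decision_bx = bx + delay_bx
            if decision_bx >= len(layers_hit_per_bx):
                return None  # not enough data to wait out the delay
            n_layers = len(layers_hit_per_bx[decision_bx])
            if n_layers >= m_pattern:
                return bx, n_layers
            return None  # second coincidence level not reached
    return None


# Example: hits accumulate as late-drifting electrons arrive.
road = [set(), {1, 3}, {1, 3, 4}, {1, 3, 4, 6}, {3, 4, 6}]
result = find_lct(road)
```

In this example the pre-trigger fires at bx 1 with two layers, and four layers are present two crossings later, so the road yields an LCT tagged with bx 1.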

**Fig. 11.10:** The envelope of anode wire group hits used in ALCT pattern-finding.

Each pattern has a 2-bit pattern number. Higher pattern numbers are assigned to ALCTs with more layers hit. If more than one stub is found within 16 adjacent wire groups, a priority encoder on the output of the pattern-finding circuitry selects the single best ALCT according to the pattern number. If more than two anode LCTs are found within the chamber, the best two are retained and sent to the TMB according to the pattern number. For diagnostic purposes, the wire hit bits are stored in a pipeline and frozen when an ALCT is found for later serial readout through the DAQ chain. This readout is initiated by reception of L1A for the appropriate bunch crossing.

On the ALCT board, inputs from the anode preamplifier/discriminator ASICs are converted from differential LVDS to single-ended TTL levels in custom 16-channel delay ASICs. The delay ASIC delays the anode signals over a range of 32 ns in 4 ns steps, depending on 3-bit control inputs. The fine delay control changes the phase of the anode signals with respect to the LHC clock used for latching the anode signals. This is necessary to get optimum bunch crossing tagging efficiency. The delayed anode discriminator signals are then fed into field-programmable gate arrays that perform the anode segment-finding (ALCT) function.

Hot or dead anode channels are masked on input to the pattern-finding FPGAs. The anode bits are then stretched to a duration, such as 75 ns, that allows for coincidence between layers in the presence of varying drift times. These bits are fed into the LCT trigger processor, which looks for $n$-layer coincidences within predetermined patterns.
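The effect of stretching can be seen in a short sketch, assuming hits sampled once per bunch crossing and a stretch length of 3 bx (75 ns). The function name is hypothetical.

```python
def stretch(bits, length_bx=3):
    """Hold each hit high for length_bx consecutive bunch crossings."""
    out = []
    for t in range(len(bits)):
        window = bits[max(0, t - length_bx + 1): t + 1]
        out.append(1 if any(window) else 0)
    return out


# Two layers hit by the same muon, two crossings apart because of drift time.
layer_a = [0, 1, 0, 0, 0, 0]
layer_b = [0, 0, 0, 1, 0, 0]

raw_overlap = any(a and b for a, b in zip(layer_a, layer_b))
stretched_overlap = any(
    a and b for a, b in zip(stretch(layer_a), stretch(layer_b))
)
```

Without stretching the two hits never coincide; after stretching they overlap at bx 3, so a layer coincidence can be formed.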

There are collision muon and accelerator muon ALCT patterns. The collision muon patterns project to the collision point, while the accelerator muon patterns are parallel to the beam axis. It is possible to completely shut off either of these types under software control. The number of layers struck within each pattern (0-6) defines a 3-bit pattern quality number. The pattern type bit (collision versus accelerator) is added to the pattern quality number to make a 4-bit quantity by which patterns are sorted. The pattern type bit is added as the high-order bit, with a software-selectable polarity. In normal operation, the polarity will be set so that collision patterns are represented by a one and thus preferred to accelerator patterns. If more than one ALCT pattern is found within 16 adjacent wire groups, a priority encoder on the output of the pattern-finding circuitry selects the single best pattern according to the pattern number. If two equal pattern numbers are found on different wire groups, then the pattern on the wire group furthest from the beam axis (lowest pseudorapidity) is selected. If more than two ALCT patterns are found within the entire chamber, the best two are retained according to the pattern number.
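A minimal sketch of the sorting quantity and the final best-two selection follows. The tuple layout, the function names, and the convention that a larger wire-group number lies farther from the beam axis are assumptions made for illustration only; the 16-wire-group priority encoding stage is omitted.

```python
def sort_key(layers_hit, is_collision, collision_preferred=True):
    """4-bit sort quantity: pattern type bit above the 3-bit quality (0-6)."""
    if not 0 <= layers_hit <= 6:
        raise ValueError("layers_hit must be in 0..6")
    # Software-selectable polarity of the pattern type bit.
    type_bit = 1 if (is_collision == collision_preferred) else 0
    return (type_bit << 3) | layers_hit


def best_two(candidates, collision_preferred=True):
    """Keep the two best candidates.

    candidates: list of (wire_group, layers_hit, is_collision).
    Ties are broken in favor of the larger wire-group number, which is
    ASSUMED here to be the group farthest from the beam axis.
    """
    ranked = sorted(
        candidates,
        key=lambda c: (sort_key(c[1], c[2], collision_preferred), c[0]),
        reverse=True,
    )
    return ranked[:2]
```

With the normal polarity, a 4-layer collision pattern (key 12) outranks a 6-layer accelerator pattern (key 6), which is the preference described above.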



**Fig. 11.11:** ALCT block diagram, including pattern lookup, pattern selection, trigger output formatting, and DAQ diagnostic data recording.

Table 11.2 shows the data sent to the TMB for each anode LCT pattern [11.7]. The “Valid Pattern flag” signals that a valid LCT pattern has been found and that information is being sent on the current clock cycle. The 2-bit pattern quality number is the number of layers hit minus three. The accelerator muon bit indicates if there were hit patterns that appear to be parallel to the beam axis. This may be used for triggering on accelerator or halo muons, or may be used to veto chambers containing such muons. The latter capability could be useful if the rate of these muons is higher than anticipated. The 7-bit wire group ID number, which indicates the position of the pattern within the chamber, runs from 0 to 111. This number corresponds to the position of the pattern selected at the third or “key” layer of the chamber. This does not require a hit to have been registered in the third chamber layer. Finally, the 5 low-order bits of the ALCT bunch crossing number are sent to the TMB for timing verification.

**Table 11.2:** Anode LCT output to Trigger Motherboard

| ALCT Output Data      | Bits |
|-----------------------|------|
| Valid Pattern flag    | 1    |
| Pattern quality (0-3) | 2    |
| Accelerator muon      | 1    |
| Wire group ID (0-111) | 7    |
| bx low-order bits     | 5    |
| Total                 | 16   |

## 11.5 Cathode-Anode Correlation

The Trigger Motherboard (TMB) portion of the CLCT/TMB card receives up to two anode stubs from the ALCT board and two cathode stubs from the CLCT portion of the CLCT/TMB card [1]. The functions of the TMB circuitry are:

1. Bunch crossing alignment of the anode and cathode tags.
2. Correlation of the Anode and Cathode LCT words and construction of two combined LCTs.
3. Transmission of LCT data to the Muon Port Card (MPC) for triggering, and transmission of DAQ data to the DAQ Motherboard (DAQMB).

Each of these functions is described in more detail in the following sections. A preliminary plan has been developed [11.8] in which a coincidence between the LCT and the RPC information can be made at the TMB. This may allow a reduction in the rate of CSC ghosts in the case of two or more muon candidates in a CSC chamber.

### 11.5.1 Bunch Crossing Alignment

Incoming anode and cathode LCTs are not aligned in time. Anode LCTs are created faster than cathode LCTs because of the slow development of the cathode preamp signal, and because processing inside the ALCT card is faster than processing inside the CLCT logic. The TMB contains input pipeline logic in order to delay anode LCTs for a programmable number of bunch crossings up to 10.

### 11.5.2 Cathode-Anode Matching

The anode and cathode LCTs are matched according to the more precise ALCT bunch crossing number (BXN). The Cathode LCT BXN can differ by at most  $\pm 1$  bunch crossing. For each of the selected muons the TMB outputs a 2-bit bunch crossing match word as shown [11.7] in Table 11.3. These may be used by later boards in the trigger chain if additional quality information is needed. They also allow the analysis of the bunch crossing matching in the TMB, since a large number of bad matches could be an indication of a timing alignment problem.

**Table 11.3:** Bunch Crossing Match Bits.

| BXN Match | ALCT BXN - CLCT BXN |
|-----------|---------------------|
| 0         | 0                   |
| 1         | 1                   |
| 2         | -1                  |
| 3         | $> \pm 1$ (Error)   |
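The encoding of Table 11.3 can be written as a small function. The sketch compares full bunch crossing numbers and treats the wrap-around at the end of the orbit (3564 crossings) as a difference of one; that wrap-around handling is an assumption, as is the function name.

```python
BX_PER_ORBIT = 3564  # bunch crossing numbers run 0-3563


def bxn_match_code(alct_bxn, clct_bxn, modulus=BX_PER_ORBIT):
    """Return the 2-bit BXN match word for ALCT BXN - CLCT BXN."""
    diff = (alct_bxn - clct_bxn) % modulus
    if diff == 0:
        return 0
    if diff == 1:
        return 1  # ALCT one crossing later than CLCT
    if diff == modulus - 1:
        return 2  # ALCT one crossing earlier than CLCT
    return 3      # more than one crossing apart: error
```

A histogram of the returned codes would show mostly 0 for a well-timed chamber, while a large population of code 3 would point to the timing alignment problem mentioned above.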

The ideal case for a high-momentum muon is one anode and one cathode LCT pattern. However, other cases may occur, which are distinguished by a 2-bit “STA” (Status type A) code as shown [11.7] in Table 11.4:

1. The TMB may receive one or two anode LCTs and zero cathode LCT patterns. This happens, for example, for very low-momentum muons. Although the non-zero data is forwarded to the MPC, this case is flagged by STA=1, as is the similar case of one or two cathode LCT and zero anode LCT patterns.
2. If the TMB receives two anode LCTs and one cathode LCT, the TMB outputs two LCTs, by copying the Cathode LCT bits into both muons. These, and the similar case of two cathode LCTs and one anode LCT, are flagged by STA=2.
3. If there are two anode LCTs and two cathode LCTs in one chamber, they are matched according to their pattern numbers: the largest ALCT and CLCT pattern numbers are paired, and the second largest ALCT and CLCT pattern numbers are paired. These, and the ideal case of a single match, are flagged by STA=3.

**Table 11.4:** Trigger Motherboard anode-cathode coincidence codes.

| STA value | Description                                                   |
|-----------|---------------------------------------------------------------|
| 0         | No CSC trigger data                                           |
| 1         | There is CSC trigger data but no anode-cathode match is found |
| 2         | Two cathode and one anode LCT patterns, or vice versa         |
| 3         | One or two LCT patterns, unambiguous assignment               |
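The STA assignment in Table 11.4 depends only on how many anode and cathode LCTs were received. A minimal sketch, with a hypothetical function name:

```python
def sta_code(n_anode, n_cathode):
    """Return the 2-bit STA code for the given numbers of LCTs (0-2 each)."""
    if not (0 <= n_anode <= 2 and 0 <= n_cathode <= 2):
        raise ValueError("the TMB receives at most two LCTs of each type")
    if n_anode == 0 and n_cathode == 0:
        return 0  # no CSC trigger data
    if n_anode == 0 or n_cathode == 0:
        return 1  # data present, but no anode-cathode match possible
    if n_anode != n_cathode:
        return 2  # two of one type and one of the other
    return 3      # one-to-one or two-to-two: unambiguous assignment
```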

TMBs maintain a local Bunch Crossing Number (BXN) using signals from the Clock and Control Board. The internal BXN is compared to the BXN received from the ALCT module, and the Sync Error bit is set if a mismatch is detected.

### 11.5.3 LCT Data Transmission to the MPC

The TMB sends up to two anode LCT and two cathode LCT patterns for one CSC chamber to the MPC every 25 ns. The bits are indicated [11.7] in Table 11.5. Some bits are the same for the two muons from one CSC; these are marked with an asterisk in the table. The data is passed on a custom connector backplane. Since 630 bits (data from nine TMBs) are sent every clock cycle, the data is compressed onto fewer backplane lines, using serialization before transmission to the MPC.

## 11.6 Muon Port Card Selection of LCTs

The high cost of optical links makes it prohibitively expensive to send every LCT from the TMBs to the counting room. Thus, a Muon Port Card (MPC) is used to reduce the data. In each of stations 2, 3, and 4, an MPC receives signals from 9 chambers corresponding to $60^\circ$ in $\phi$ in one station: three high-$\eta$ $20^\circ$ chambers and six low-$\eta$ $10^\circ$ chambers. In station 1, an MPC receives signals from 8 chambers corresponding to $20^\circ$ in $\phi$: two $10^\circ$ chambers in each of four types, the high-$\eta$ and low-$\eta$ sections of ME1/1, plus ME1/2 and ME1/3. Each MPC in stations 2, 3, and 4 reduces the number of LCTs to three or fewer and sends them to the Sector Receiver (SR) module via optical links. In station 1, the number of output LCTs is two or fewer.

**Table 11.5:** Cathode and Anode LCT data sent from TMB to MPC.

| Signal                                       | Bits per 1 Muon | Packed Bits per 2 Muons |
|----------------------------------------------|-----------------|-------------------------|
| Valid pattern flag                           | 1               | 2                       |
| Cathode pattern number (0-255)               | 8               | 16                      |
| Cathode Left/Right bend (0/1)                | 1               | 2                       |
| Cathode half- or di-strip ID (0-159 or 0-39) | 8               | 16                      |
| Anode pattern quality (0-3)                  | 2               | 4                       |
| Accelerator muon flag                        | 1               | 2                       |
| Anode wire gang ID (0-111)                   | 7               | 14                      |
| BXN match                                    | 2               | 4                       |
| * Anode BXN low-order bits (0-31)            | 5               | 5                       |
| * TMB status bits STA (LCT trigger)          | 2               | 2                       |
| * TMB status bits STB (CSC/RPC coincidence)  | 2               | 2                       |
| * Synchronization error                      | 1               | 1                       |
| Total                                        | 40              | 70                      |
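The bit totals in Table 11.5 and the 630 bits per clock cycle quoted in Section 11.5.3 can be cross-checked with a few lines. The field list mirrors the table; the tuple layout and function names are only a convenience for this check.

```python
# (signal, bits per muon, shared between the two muons of one CSC)
TMB_TO_MPC_FIELDS = [
    ("Valid pattern flag", 1, False),
    ("Cathode pattern number", 8, False),
    ("Cathode left/right bend", 1, False),
    ("Cathode half- or di-strip ID", 8, False),
    ("Anode pattern quality", 2, False),
    ("Accelerator muon flag", 1, False),
    ("Anode wire gang ID", 7, False),
    ("BXN match", 2, False),
    ("Anode BXN low-order bits", 5, True),
    ("TMB status bits STA", 2, True),
    ("TMB status bits STB", 2, True),
    ("Synchronization error", 1, True),
]


def bits_per_muon(fields):
    """Bits needed to describe one muon on its own."""
    return sum(bits for _, bits, _ in fields)


def packed_bits_two_muons(fields):
    """Bits for two muons when the shared (asterisked) fields are sent once."""
    return sum(bits if shared else 2 * bits for _, bits, shared in fields)


per_muon = bits_per_muon(TMB_TO_MPC_FIELDS)            # 40
per_tmb = packed_bits_two_muons(TMB_TO_MPC_FIELDS)     # 70
per_crossing_nine_tmbs = 9 * per_tmb                   # 630
```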

### 11.6.1 MPC Functionality

The MPC performs the following functions:

1. Data collection. The MPC receives data from nine TMBs every bunch crossing. Each TMB sends up to two LCT patterns. Up to 18 LCT patterns may come to an MPC simultaneously.
2. Synchronization of incoming LCTs with the MPC local master clock.
3. LCT selection. The MPC selects the best three LCTs out of 18 possible, except in ME1 where two LCTs are selected out of 16 possible.
4. Reformatting of the selected LCTs and their transmission to the SRC via optical links.

### 11.6.2 MPC Input Synchronization

When the LCTs are sent from the TMB to the MPC, they have a phase shift with respect to the local clock at the destination, and must be synchronized. The local 40.08 MHz master clock at the MPC is sent from the Clock and Control Board. Also, there are 18 bunch crossing counts accompanying the LCTs. These will be compared to the Port Card bunch crossing counter to detect errors. For the bunch crossing synchronization we propose to use pipeline shift registers on the inputs of LCT logic.

### 11.6.3 Selection Logic

Flexible selection logic is based on Look-Up Table (LUT) conversion. The pattern IDs from each incoming LCT serve as the address for the LUT, and the LUT output represents a quality factor corresponding to a particular combination of wire and strip patterns. Presently we are allowing up to 11 bits to be used to address the memory, of which the lower 8 bits are the CLCT pattern number and two higher bits are the ALCT quality number. One bit is unassigned. The LUT output is currently an 8-bit floating point number representing the product of the two pattern IDs. The output of the LUT is arbitrary and reprogrammable, allowing us to modify the function as we gain experience under real running conditions.
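The LUT-based ranking can be sketched as follows. The address layout follows the description above (CLCT pattern number in the lower 8 bits, ALCT quality in the next two); the unassigned bit is left out, so the table has 1024 entries. The quality function used to fill the table is a placeholder chosen for illustration, not the programmed contents, and all names are hypothetical.

```python
def lut_address(clct_pattern, alct_quality):
    """10-bit address: ALCT quality (2 bits) above CLCT pattern (8 bits)."""
    return ((alct_quality & 0x3) << 8) | (clct_pattern & 0xFF)


def build_lut(quality_fn):
    """Fill the table from quality_fn(clct_pattern, alct_quality).

    Outputs are clamped to 0..255 to respect the 8-bit output width.
    """
    lut = []
    for addr in range(1 << 10):
        score = quality_fn(addr & 0xFF, (addr >> 8) & 0x3)
        lut.append(max(0, min(255, int(score))))
    return lut


def select_best(lcts, lut, n_keep=3):
    """Keep the n_keep LCTs with the highest LUT output.

    lcts: list of (tmb_index, clct_pattern, alct_quality).
    """
    ranked = sorted(
        lcts, key=lambda l: lut[lut_address(l[1], l[2])], reverse=True
    )
    return ranked[:n_keep]


# Placeholder quality function (an assumption, for illustration only).
example_lut = build_lut(lambda pattern, quality: pattern * (quality + 1))
```

Because the table is reprogrammable, changing the ranking only requires calling `build_lut` with a different function; the selection logic itself is unchanged.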

### 11.6.4 Interface to Sector Receiver Card

The distance between on-chamber electronics (FEBs, MBs, MPC) and the counting room is about 100 m. The MPC must be able to transmit 120 bits of data every 25 ns. Optical links are the best and possibly the only choice for communication between MPC and SRC. The general requirements for these optical links are:

1. Simplex links.
2. 25 ns framing.
3. Simple error detection, no error correction.
4. Effective bandwidth of 1 Gbit/s, or 25 bits per bunch crossing.

Each MPC contains parallel-to-serial converters and optical transmitter modules, while each Sector Receiver (SR) contains optical receiver modules and serial-to-parallel converters. For prototyping purposes we have been using the Hewlett Packard HDMP-1022/1024 Transmitter/Receiver chip set. This chip set has user-selectable parallel data widths (16, 17, 20, or 21 bits) and high-speed serial data rates. To transmit 120 bits of data at 40 MHz using a 20-bit data width, we need six transmitters on each MPC, six optical modules, and six optical fibers. This chip set has relatively high power consumption and high price. Other chip sets will be evaluated in the coming year, including the Texas Instruments TLK2500/2501 chip set running at 80 MHz.
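The link count follows from simple arithmetic, reproduced here as a sketch. It assumes one parallel word per link per bunch crossing and a nominal 40 MHz frame rate; the function names are illustrative.

```python
import math


def links_needed(bits_per_bx, word_width_bits):
    """Number of links when each carries one parallel word per crossing."""
    return math.ceil(bits_per_bx / word_width_bits)


def payload_rate_mbit_s(word_width_bits, frame_rate_mhz=40.0):
    """Payload carried by one link, in Mbit/s."""
    return word_width_bits * frame_rate_mhz


six_links = links_needed(120, 20)          # 20-bit data width
rate_20_bit = payload_rate_mbit_s(20)      # 800 Mbit/s per link
rate_25_bit = payload_rate_mbit_s(25)      # 1000 Mbit/s, the stated requirement
```

With the narrower 16-bit width the same 120 bits would need eight links, which shows why the wider word is attractive.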

## 11.7 Clock and Control Board

CSC Clock & Control Boards (CCBs) receive timing information from the LHC accelerator Timing, Trigger, and Control (TTC) system [11.6]. One CCB resides in each detector-mounted peripheral crate to distribute the 40 MHz system-clock to the ALCT, CLCT, DAQMB, and TMB modules. The backplanes and CCB modules are designed to account for path-length delays so each trigger module receives the clock at the same point in time.

Clock, bunch-crossing-reset, and bunch-crossing-zero signals are distributed from the CCB to the trigger modules, instead of sending the bunch crossing number. Each trigger module that needs the BXN will maintain an internal counter that increments every clock cycle. When a bunch-crossing-reset signal arrives, the counter halts, and resets to its initialization value. Counting resumes when the bunch-crossing-zero signal is received.

Additional CCB signals that may be needed by peripheral crate electronics to mitigate radiation upsets (SEU), such as CSC logic resets, are under discussion.

## 11.8 Synchronization and Latency

The clock signals for the CSC Local Trigger originate from TTCrx [11.6] receivers; one is mounted on each CCB card, *i.e.*, one per peripheral trigger crate. The CCB card has been designed to deliver isochronous clock signals to each occupied slot of these crates. Each CLCT/TMB card distributes the clock signal to the on-chamber trigger cards, *i.e.*, one ALCT card and up to five CFEB cards.

The return trigger data signals will arrive at the CLCT/TMB board with an arbitrary phase relative to the CLCT/TMB on-board clock due to chamber-to-chamber variations in the connecting cable lengths. Since these signals might lie within the setup and hold time of the input latches and not be reliably latched, phase adjustment in several steps within the 25 ns cycle will be provided for the clock signals from CLCT/TMB to the on-chamber trigger cards.

The timing of the anode hits relative to the LHC clock with which they are latched needs to be adjusted within $\pm 2$ ns to obtain the optimum probability for identifying the muon bunch crossing. Muons have times of flight to the various endcap chambers that vary from 19 to 42 ns. Within a single chamber the time of electrical signal propagation varies by as much as 9 ns due to different wire lengths and cable lengths from front-end anode boards to the ALCT board. Therefore, the fine timing needs to be adjusted within each chamber. This is done on the ALCT card by 16-channel ASICs that delay the signals by 0-30 ns in 2 ns steps. These chips also shift differential LVDS signals to TTL levels.
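Choosing a delay setting for a given input amounts to quantizing the required delay to the available steps. The sketch below uses the step size and range quoted in this paragraph (2 ns steps, 0-30 ns) as parameters; the notion of a per-input target time and the function name are assumptions for illustration.

```python
import math


def delay_setting(arrival_ns, target_ns, step_ns=2.0, max_code=15):
    """Return (code, residual_ns) aligning a signal to a target time.

    code * step_ns is the programmed delay; residual_ns is the remaining
    offset (delayed arrival minus target) after quantization and clamping.
    """
    needed_ns = target_ns - arrival_ns
    code = int(math.floor(needed_ns / step_ns + 0.5))  # round to nearest step
    code = max(0, min(max_code, code))
    residual_ns = arrival_ns + code * step_ns - target_ns
    return code, residual_ns
```

Within the adjustable range the residual is at most half a step, i.e. 1 ns for 2 ns steps, which is inside the $\pm 2$ ns requirement stated above.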

No comparable provision is made for fine timing of cathode comparator signals, since the CLCT timing requirement is imposed as  $\pm 1$  bx by the anode/cathode time coincidence at the TMB. However, since the ALCT data is ready well before the CLCT data, the ALCT information is delayed by an integer number of crossings in the TMB in order to bring both types of LCT data into time coincidence.

The bunch crossing number BXN is calculated on the ALCT, CLCT/TMB, and MPC boards. In each case, the calculation is done using BxReset and BC0 pulses distributed from the CCB board. The BxReset pulse stops a 12-bit (0-3563) bunch crossing counter and loads it with a pre-determined offset value. Counting resumes from the preset values with the arrival of the bunch zero (BC0) pulse. The ALCT, CLCT/TMB, and MPC modules have appropriate BXN preset values that compensate for their different processing times so that their BXNs for a given muon will match. At each stage, the low-order bits of BXN are compared to ensure synchronization. Errors are handled by zeroing output data and recording the discrepancy for DAQ readout.
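The counter behavior described above can be modeled in a few lines. This is a behavioral sketch, not the board logic: it assumes that the clock cycle carrying BC0 is labeled with the preset value and that counting wraps at 3564. The class and method names are hypothetical.

```python
class BxnCounter:
    """Behavioral model of a BXN counter driven by BxReset and BC0."""

    def __init__(self, preset=0, modulus=3564):
        self.modulus = modulus
        self.preset = preset % modulus
        self.value = self.preset
        self.running = True

    def clock(self, bx_reset=False, bc0=False):
        """Advance one clock cycle and return the BXN for that cycle."""
        if bx_reset:
            # Halt and load the pre-determined offset.
            self.running = False
            self.value = self.preset
        elif bc0 and not self.running:
            # Resume; this cycle keeps the preset value (assumption).
            self.running = True
        elif self.running:
            self.value = (self.value + 1) % self.modulus
        return self.value

    def low_bits(self, n_bits=5):
        """Low-order bits compared between boards for synchronization."""
        return self.value & ((1 << n_bits) - 1)
```

Two modules with different presets that receive the same BxReset and BC0 pulses keep a constant BXN difference equal to the difference of their presets, which is how differing processing times can be compensated.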

MPC optical link synchronization is handled by the Sector Receiver and is described in the following chapter.

### 11.8.1 Synchronization Procedure

The Anode LCT is used to synchronize the trigger system. The Anode LCT identifies the correct bunch crossing with greater than 99% efficiency. The BXN generated by the ALCT will be histogrammed and compared to the bunch crossing structure of the LHC beam. By using the repeating nature of the bunch structure, the ALCT synchronization can be determined from the CSC data in 25 minutes of running at  $10^{32} \text{cm}^{-2}\text{s}^{-1}$ . Each board in the CSC trigger chain counts BXN starting from its preset value and sends BXN on the output trigger link. This makes it possible to determine the offsets for the other boards in the CSC trigger system.
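One way to compare the BXN histogram with the bunch structure is a cyclic correlation against the known filling pattern, taking the shift with the largest overlap as the offset. The sketch below illustrates that idea under stated assumptions; it is not necessarily the procedure that will be used, and the toy fill pattern in the example is not a real LHC filling scheme.

```python
def find_bxn_offset(bxn_histogram, fill_pattern):
    """Return the cyclic shift that best aligns the histogram with the fill.

    bxn_histogram[k] is the number of ALCTs recorded with BXN k;
    fill_pattern[i] is 1 for a filled bunch slot and 0 for an empty one.
    The returned shift s maximizes sum_i histogram[(i + s) % n] * fill[i].
    """
    n = len(fill_pattern)
    if len(bxn_histogram) != n:
        raise ValueError("histogram and fill pattern must have equal length")
    best_shift, best_score = 0, -1
    for shift in range(n):
        score = sum(
            bxn_histogram[(i + shift) % n]
            for i in range(n)
            if fill_pattern[i]
        )
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


# Toy example: 12 slots, with the counter offset by 4 crossings.
toy_fill = [1, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0]
toy_hist = [0] * 12
for i, filled in enumerate(toy_fill):
    toy_hist[(i + 4) % 12] = 10 * filled
toy_offset = find_bxn_offset(toy_hist, toy_fill)
```

The method relies on the gaps in the bunch train: a perfectly uniform fill would give the same score for every shift.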

### 11.8.2 TMB Synchronization of ALCT Data

On the CLCT/TMB module, the ALCT data is first de-serialized. Although the incoming data clock is synchronous to the LHC clock, it will have a different phase from the TMB/CLCT board clock, depending on the length of cables between the ALCT and CLCT/TMB boards. This phase can be inconvenient for reliable latching in the TMB circuitry. Therefore, synchronization of the input ALCT data on a time scale finer than one bunch crossing is necessary. A simple circuit eliminates the possibility of unreliable data reception from the ALCT. One possible circuit latches input ALCT data on both rising and falling edges of the on-board clock. Then a programmable multiplexer selects the more favorable phase. Finally, a latch having a setup and hold time of less than one-half bunch crossing stores the data on the next cycle on the rising edge of the on-board clock signal.

### 11.8.3 Latency Determination

The estimated latency of the CSC Local Trigger is 51.5 bx, from the time of the collision until data is available at the end of the optical fiber in the counting room. This, plus the latency of the CSC Track-Finder, is sufficient to furnish CSC trigger data to the Global Muon Trigger 78 bx after the collision. The accounting of this latency, with emphasis on the critical path cathode signals, is shown in Table 11.6.

**Table 11.6:** Latency of the CSC Local Trigger

| Description                                             | bx this step | Total bx |
|---------------------------------------------------------|--------------|----------|
| Time of Collision                                       | 0.0          | 0.0      |
| Time of flight and signal propagation                   | 2.0          | 2.0      |
| Cathode preamp peaking time and latency                 | 6.0          | 8.0      |
| Comparator latency plus signal transmission to CLCT/TMB | 5.0          | 13.0     |
| Logic to find Anode and Cathode LCTs and combine them   | 15.5         | 28.5     |
| Port Card processing                                    | 5.0          | 33.5     |
| Optical link transmission (90 m)                        | 18.0         | 51.5     |
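The running totals in Table 11.6 can be reproduced with a cumulative sum. The step values are taken from the table; the conversion to nanoseconds assumes 25 ns per bunch crossing.

```python
from itertools import accumulate

# (description, latency of this step in bunch crossings), from Table 11.6
LATENCY_STEPS_BX = [
    ("Time of flight and signal propagation", 2.0),
    ("Cathode preamp peaking time and latency", 6.0),
    ("Comparator latency plus signal transmission to CLCT/TMB", 5.0),
    ("Logic to find Anode and Cathode LCTs and combine them", 15.5),
    ("Port Card processing", 5.0),
    ("Optical link transmission", 18.0),
]

BX_NS = 25.0  # nominal bunch crossing interval

running_totals_bx = list(accumulate(bx for _, bx in LATENCY_STEPS_BX))
total_bx = running_totals_bx[-1]
total_ns = total_bx * BX_NS
```

The final entry of `running_totals_bx` is the 51.5 bx quoted in the text, corresponding to 1287.5 ns.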

## 11.9 Prototypes

Electronics prototypes were tested on full-size prototypes of the largest CSC chamber in the summers of 1998 and 1999 at CERN [11.9]. The first tests in 1998 were done at the H2 beam line, where a silicon beam telescope was used for resolution studies. Later tests in 1998 and 1999 were done at the GIF (Gamma Irradiation Facility), where LHC-like backgrounds were provided by an intense gamma source. The main purpose of the 1998 tests was to verify the performance of the CSC trigger electronics, while the main purposes of the 1999 tests were to test performance under high background rate conditions and verify engineering features such as large count and serial DAQ readout through a DDU.

Prototype comparator ASICs were produced in 1997, 1998, and 1999. Prototype ALCT and CLCT boards were produced in 1998 and 1999, and an additional ALCT prototype was produced in 2000. Near-final prototypes exist for all on-chamber CSC system electronics (CFEB, AFEB, and ALCT boards). These boards have been successfully radiation-tested during 2000.

Off-chamber CSC local trigger-related electronics (CLCT/TMB, MPC, DAQMB, and CCB boards) have been successfully prototyped but not finalized.

### 11.9.1 Comparator ASIC Prototypes

The 16-channel cathode comparator ASICs tested during summer 1998 had 32 half-strip output bits. Six of these chips were mounted on a 96-channel comparator board, shown in Figure 11.12, that attached directly through connectors to the cathode front-end board. The 192 half-strip bits were converted from TTL levels to differential LVDS and 96 of these signals were driven on cables to a cathode LCT card in a CAMAC crate, where they were recorded. From the 1998 data, the efficiency of the comparator ASICs for identifying the correct half-strip was determined in two ways. First, the half-strip bits were compared to bits predicted by the precision charge determination of the front-end DAQ cards [3]. These cards employ switched-capacitor arrays (SCAs) for charge storage and ADCs for digitization. The typical noise level on the DAQ data was 1.6 fC (1.6 mV after the amplifier/shapers), while the typical total cathode charge was 100 fC. The efficiency was found to be $90.4 \pm 0.2\%$ for exact half-strip match, while a match window of $\pm 1$ half-strips yielded an efficiency of $98.3 \pm 0.1\%$. The second way the comparator match efficiency was determined is less biased but gives lower statistics. This method uses the precision DAQ data to track muons through the chamber, leaving out one layer (#3) from the fit. The extrapolated position in this layer is then compared to the half-strip bit found by the comparator ASIC. By this method, the efficiency for exact half-strip match is measured to be $88.2 \pm 0.7\%$, while a widened match window of $\pm 1$ half-strips yields an efficiency of $94.9 \pm 0.4\%$.

Digital circuitry in the 1999 version of the comparator ASIC compresses the 32 half-strip bits into 8 output time-sequenced “di-strip triad” bits. A complete set of these ASICs for one full-size CSC chamber was tested during the summer 1999 test beam studies. Performance was similar to that of the 1998 Comparator ASIC, while the 4:1 compression plus the use of bit serializers resulted in a much higher ratio of channels to signal cables.

### 11.9.2 Cathode LCT Prototypes

In the 1998 tests, the half-strip bits found by the comparator ASICs were sent to a 48-strip cathode LCT card, which identified the multi-layer patterns of valid muon trajectories. All



**Fig. 11.12:** The 96-channel cathode Comparator cards used in the 1998 test beam studies. These receive amplified cathode signals from CFEBs on which they are mounted. Six Comparator ASICs produce 192 half-strip bits that are converted to differential LVDS for transmission to CLCT cards.

ideal patterns of muon tracks within the envelope shown in Figure 11.8 were included. A mezzanine card converted LVDS signals to TTL levels. These signals were distributed by a “front” Altera 10K50 PLD (programmable logic device) to Cypress 128Kx8bit SRAMs, which found the patterns, while a “rear” Altera 10K20 PLD collected the patterns (if any) found by the SRAMs and selected the best according to pattern number. Higher numbers corresponded to larger numbers of hit layers and straighter tracks. The CAMAC interface was implemented in another Altera 10K20 PLD. Output trigger information was fed through a National Instruments Channel-Link to a “Trigger Motherboard” which sorted LCTs in the case of multiple candidates, and performed a time coincidence between cathode LCTs and anode LCTs. These modules are shown in Figure 11.13, running in internal trigger mode at the H2 beam line.
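The selection of the best pattern by pattern number can be sketched as follows. The pattern table, the minimum of four hit layers, and the encoding of the pattern code are all assumptions made for illustration; the actual tables loaded into the SRAMs are not given in this section.

```python
# Hypothetical pattern table: shape number -> expected half-strip offset in
# each of the six layers, relative to the key half-strip in layer 3.
# Larger shape numbers stand for straighter tracks.
PATTERNS = {
    1: (-2, -1, 0, 1, 2, 2),
    2: (-1, -1, 0, 0, 1, 1),
    3: (0, 0, 0, 0, 0, 0),
}
MIN_LAYERS = 4  # assumed minimum number of hit layers for a valid pattern

def best_pattern(hits, key_half_strip):
    """Return the highest pattern code found at key_half_strip, or None.

    hits is a list of six sets, one per layer, of hit half-strip numbers.
    The code 8 * layers + shape makes the layer count dominate the ranking
    (an assumed encoding, not the one used in the prototype).
    """
    best = None
    for shape, offsets in PATTERNS.items():
        layers = sum(
            1 for layer, offset in enumerate(offsets)
            if key_half_strip + offset in hits[layer]
        )
        if layers < MIN_LAYERS:
            continue
        code = 8 * layers + shape
        if best is None or code > best:
            best = code
    return best

straight_track = [{10} for _ in range(6)]
print(best_pattern(straight_track, 10))
```

With this encoding a six-layer straight track always outranks any five-layer candidate, which mirrors the statement that higher numbers correspond to more hit layers and straighter tracks.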

The efficiency of the cathode LCT card for identifying muons was measured both by comparison with the precision DAQ data, and by comparison to external muon tracking provided by a silicon telescope. The position reported by the cathode LCT card corresponded to the track position at chamber layer 3. When DAQ data was found in layer 3, the position of the cathode LCT was compared to the layer 3 cluster center. Of these events, 95% were found as half-strip (high-momentum) patterns, 99.2% were found either as half-strip or di-strip (low-momentum) patterns, and 0.8% were not found by the cathode LCT card. The differences in position between trigger and precision DAQ data are shown in Figure 11.14. The inefficiencies can be mostly eliminated in the future by inclusion of additional trigger patterns for tracks that travel very close to the boundary between half-strips or di-strips.

The second method for estimating the efficiency and resolution of the cathode LCT card used the silicon tracking telescope. This telescope defines tracks to an accuracy of about  $100\text{ }\mu\text{m}$  after 100 cm extrapolation to the chamber. Half-strip cathode patterns were found for 96% of the



**Fig. 11.13:** CSC trigger prototypes for the 1998 test beam at H2. From the middle to the right of the picture are shown: ALCT, CLCT, and TMB modules. On the left are shown anode readout TDCs.

tracks, with position residuals very close to a box distribution one-half strip wide. Another 3.7% of tracks were found as di-strip patterns by the cathode LCT logic. The positions of these tracks in di-strip units show that di-strip patterns occur at the boundaries between half-strips, where the half-strip pattern tables were not optimally tuned. Another 0.25% of tracks were not found by the cathode LCT logic. Again, the inefficiencies can be mostly eliminated by including additional trigger patterns for tracks that travel very close to the boundary between half-strips or di-strips. The net cathode LCT efficiency is found to be $99.75 \pm 0.04\%$ in this study.

The VME 9U-height LCT cards produced in 1999 are shown in Figure 11.15. These handle large numbers of front-end signals, due to the data compression in comparator ASICs as well as the use of Channel-Links. For this round of tests, the same hardware was used for cathodes and anodes, while the different algorithms were implemented using different PLD configurations. CLCT boards input signals from 480 strips, an entire chamber, while ALCT boards handled 192 wire groups, one half of a chamber. Input signals were received by Channel-Links and fed into a single Altera 10K200E PLD. Cathode and anode LCT boards differed only by the algorithms implemented in the PLDs. The LCT boards sent input bits and output LCTs to the DAQ system through a FIFO dump to a DAQ motherboard.

The ALCT version of the LCT99 module used the full envelope of wire groups shown in Figure 11.10. The CLCT version used the envelope shown in Figure 11.8 but with the edges at  $\pm 2$  units trimmed. The different patterns within the envelope were not distinguished, but rather the number of layers hit within the envelope was counted. We anticipate the final version of CLCT/TMB boards will use this type of module, updated with larger gate arrays having the capacity to use more patterns.



**Fig. 11.14:** Differences in position as measured by the trigger electronics and muon track location determined by precision charge readout. Test beam data from 1998 was used, and the position differences are shown in half-strip bins. The top plot shows the tracks found by triggering on half-strip patterns, while the bottom plot contains tracks found by triggering on di-strip patterns as well as tracks that were missed (right-most bin).

### 11.9.3 Anode LCT Prototypes

In the 1998 tests, the anode discriminator output bits were sent to a 48-strip anode LCT card (see Figure 11.13), which identified the multi-layer patterns of valid muon trajectories and the bunch crossing. A mezzanine card converted LVDS signals to TTL levels. These signals were distributed by a “front” Altera 10K50 PLD to Cypress 128Kx8bit SRAMs. The SRAMs contained all possible patterns as LUTs. A “rear” Altera 10K20 PLD collected the patterns from the SRAMs and selected the best according to pattern number. Higher numbers corresponded to larger numbers of hit layers and straighter tracks. The CAMAC interface was implemented in another Altera 10K20 PLD.

For the anode LCT card, there are two efficiencies to be evaluated: wire pattern-finding, and bunch crossing identification. The wire pattern-finding efficiency was found by comparison of



**Fig. 11.15:** LCT99, a VME 9U-height LCT prototype card produced in 1999, that can be configured to handle 480 cathode strips or 240 anode wire groups from a CSC chamber.



**Fig. 11.16:** The reduced envelope of cathode half-strip or di-strip hits used in pattern-finding in the 1999 CLCT prototype.

the position of layer 3 TDC hits to the anode LCT position. It is found that 98.7% of tracks match exactly, 0.8% match within 1 wire group, 0.4% are further away, and 0.1% of events contain no anode LCT pattern. If $\pm 1$ wire group matching is allowed, the overall anode LCT pattern finding is $99.5 \pm 0.1\%$ efficient.

To find the bunch crossing efficiency, the anode LCT module uses an  $n$ -layer coincidence technique. At the test beam, the muons arrive asynchronously to the 40 MHz clock used by the synchronous electronics, unlike in the LHC conditions. The phase of the muon arrival was determined by using a TDC to measure the delay between a test beam muon scintillator paddle and the 40 MHz clock. The efficiency for correct bunch crossing identification depends on this phase. After choosing only test beam muons with phase within the  $\pm 2$  ns specification for system timing accuracy, the bunch ID efficiencies are, for layer coincidence levels 1-6, 98.0%, 99.2%, 99.2%, 99.1%, 98.5%, and 98.0%, respectively. Optimal efficiency is found for coincidence levels 2, 3, and 4.
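A minimal sketch of an $n$-layer coincidence is given below: the bunch crossing is taken as the first one in which at least $n$ layers have an active hit. The stretch length of six bunch crossings and the data layout are assumptions for illustration, not parameters taken from the prototype.

```python
def bunch_crossing_id(layer_bx, n_layers, stretch=6):
    """Return the first bunch crossing with at least n_layers active layers.

    layer_bx holds, for each of the six layers, the bunch crossing of the
    hit in that layer, or None if the layer has no hit. Each hit is kept
    active for `stretch` bunch crossings (an assumed pulse length).
    """
    hit_times = [bx for bx in layer_bx if bx is not None]
    if not hit_times:
        return None
    for bx in range(min(hit_times), max(hit_times) + 1):
        active = sum(1 for t in hit_times if t <= bx < t + stretch)
        if active >= n_layers:
            return bx
    return None

hits = [3, 3, 4, 4, 5, None]  # one layer without a hit
for n in range(1, 7):
    print(n, bunch_crossing_id(hits, n))
```

A low coincidence level fires on the earliest hits and is therefore sensitive to early noise, while a high level waits for the latest drift times and fails when a layer is missing. This is consistent with the optimum at intermediate levels reported above.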

The dependence of ALCT bunch ID efficiency on background rate was studied at the GIF facility. The efficiency is above 99% up to the nominal maximum LHC rate of 20 kHz/wire group, dropping only slightly to about 98.5% at seven times the maximum rate (140 kHz). The implications for the CSC trigger system are important: the CSC Track Finder does not need to correlate muon stubs from apparently different bunch crossings, but can demand that all muon stubs arrive in the same bunch crossing. This represents a large simplification in the CSC Track Finder logic described in Chapter 12.

Based on the 1999 prototyping, we re-designed the system to save costs in cabling and PC boards. This required new anode electronics that are mounted on the chamber. A 384-channel third-generation anode LCT trigger and readout card, called ALCT2000 (shown in Figure 11.17), supplies anode triggering and readout for an entire CSC chamber. Five such modules were built in early 2000 and have been used in tests with cosmic rays at Fermilab. The new module incorporates additional features in support of the anode front-end boards, such as front-end amplifier test pulsing, temperature monitoring, and voltage and current self-monitoring [11.10].

A block diagram of the ALCT2000 board is shown in Figure 11.18. Anode data flows from left to right, starting with the discriminator output signals. These pass into custom delay/translator ASICs which allow a variable delay (0-30 ns in steps of 2 ns) to ensure correct phasing of the anode signals with respect to the board clock, and convert differential LVDS levels to TTL single-ended levels. The delayed signals are then fed to four 96-channel LCT PLDs that find wire patterns and output them to a single Concentrator PLD. The Concentrator PLD finds the best two LCT patterns and outputs them to Channel-Link bit serializers for transmission to the TMB and full data dump to the DAQMB. Analog functions are at the bottom of the diagram. The Altera PLDs are loaded from flash RAMs on power-up, and can be re-configured using one of the two JTAG chains. Another JTAG chain allows for quick re-configuration of registers that control common functions such as trigger configuration and AFEB discriminator thresholds, and read back currents, temperature, and thresholds.
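The delay range of the delay/translator ASICs (0-30 ns in steps of 2 ns) corresponds to 16 settings. The sketch below maps a requested delay to the nearest available step; the function name and the integer code are hypothetical and do not describe the actual register layout of the ASIC.

```python
DELAY_STEP_NS = 2
MAX_DELAY_NS = 30

def delay_setting(requested_ns):
    """Return (code, actual_ns) for the closest available delay step."""
    if not 0 <= requested_ns <= MAX_DELAY_NS:
        raise ValueError("delay outside the 0-30 ns range")
    code = int(requested_ns / DELAY_STEP_NS + 0.5)  # round half up
    return code, code * DELAY_STEP_NS

n_settings = MAX_DELAY_NS // DELAY_STEP_NS + 1
print(n_settings)             # 16 settings, which would fit in 4 bits
print(delay_setting(13.2))
```

With 2 ns steps, the residual phase error after adjustment is at most 1 ns.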

The trigger features of the ALCT2000 prototype are separated into two types of PLDs. The LCT PLDs each handle 96 channels of anode input data. They perform a number of functions shown in Figure 11.19. On the left side are the external ADB cards and delay ASICs. Proceeding to the right are the functions for masking off hot anode channels, injecting test patterns, one-shots to stretch anode signals to allow coincidence with late-arriving hits, sharing necessary signals between gate arrays, calculating the number of anode planes with hits in the predetermined collision and accelerator patterns, and priority encoding to select the best LCTs. In addition, there



**Fig. 11.17:** The “ALCT2000” module, a 384-channel on-chamber board incorporating trigger functions, pipelined raw data storage, and support of anode front-end boards.

is a parallel DAQ data path for the raw anode hits, which are fed into a FIFO for readout under control of the Concentrator PLD.

The Concentrator PLD provides a number of functions: it finds the best two LCT patterns, outputs them to the TMB, and controls the LCT PLDs. A block diagram of this chip is shown in Figure 11.20.

### 11.9.4 TMB and CCB Prototypes

TMB and CCB prototypes were built for use in 1998 and 1999 test beam studies at CERN. These prototypes used Altera PLDs. The prototypes used in 1998 are shown in Figure 11.13. Signals were received by the TMB from ALCT and CLCT prototypes. Studies were complicated by the variable phase of the 40 MHz oscillator with respect to the test beam muons, which makes an accurate estimate of the time correlation efficiency impossible. Nonetheless, the tests showed a TMB efficiency of 98% for a  $\pm 1$  bx time coincidence. This number should be higher with a synchronous beam. The prototypes built for the 1999 studies, shown in Figure 11.21, used VME form factors and implemented VME interfaces for ease of configuration.



**Fig. 11.18:** Block diagram for the ALCT2000 prototype, including both analog and digital features.

### 11.9.5 MPC Prototype

The Muon Port Card prototype board built in 2000 receives data from up to three Trigger Motherboards (TMB99 prototypes), sorts up to the three best muons out of 18, and transmits data representing those three muons to the Sector Receiver (SR) board over six optical links. It was decided to interface with only three TMB99s in order to simplify the design of this first prototype. Subsequent prototypes will have the full input capacity. The prototype includes six HDMP-1022 G-Link serializers and six Methode MDX-19 optical modules for communication with one SR.
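The sorting step can be illustrated with a short sketch that keeps the three highest-ranked stubs. The `Stub` fields and the use of a single integer quality as the ranking key are assumptions; the ranking criteria implemented in the sorter PLD are not specified in this section.

```python
import heapq
from dataclasses import dataclass

@dataclass(frozen=True)
class Stub:
    tmb: int      # index of the source Trigger Motherboard (hypothetical field)
    quality: int  # larger means better (assumed ranking key)

def select_best(stubs, n_best=3):
    """Return up to n_best stubs with the highest quality, best first."""
    return heapq.nlargest(n_best, stubs, key=lambda s: s.quality)

stubs = [Stub(0, 5), Stub(0, 9), Stub(1, 7), Stub(1, 2), Stub(2, 9), Stub(2, 1)]
for stub in select_best(stubs):
    print(stub)
```

The same function with `n_best=2` would describe the selection of the best two LCT patterns by the ALCT2000 Concentrator PLD, under the same assumption about the ranking key.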

In addition to the main sorter logic, two groups of FIFO buffers are implemented to test the MPC internal logic and its communication with the Trigger Motherboards (TMB) over channel



**Fig. 11.19:** Block diagram for logic contained in the 96-channel LCT PLDs on the ALCT2000 prototype.

links and with the Sector Receiver (SR) board over optical links. The MPC testing circuitry consists of two independent groups of FIFO buffers: an input FIFO which can be used to input muon stubs into the board and an output FIFO which can be used to examine the results of the sorting logic. Test patterns representing six muons can be transmitted simultaneously from all input FIFOs to the G-Link transmitters and further to the SR at 40 MHz upon a VME command.

Data communication was tested between the TMB and MPC by loading pass-through logic in the sorting PLD. LCTs were input from the TMB over Channel-Links and then read back from the Port Card via the output FIFO over VME. The test was successful. In addition, the sorting logic was tested by loading many millions of test patterns into the input FIFO and reading the results of the sorting logic from the output FIFO. This test was also successful. A more complete test of the CSC Track Finder system using the MPC, SR and SP together is described in Chapter 12.

### 11.9.6 Radiation Resistance

The CMOS electronics used in the CSC Local Trigger are subject to two types of radiation effects [11.11]. The first is due to ionization by charged particles. This gives rise to cumulative charge build-up and defect activation in the silicon dioxide insulation layers, both of which depend on the total ionizing dose (TID). The second is logic upsets that cause errors, particularly in memory-based devices such as programmable gate arrays and RAM chips. These are known as single event upsets (SEUs).

The total ionizing dose for 10 LHC years at full luminosity is expected to be no more than 1.7 krad at the inner CSC chambers and no more than 100 rad at the outer CSC chambers [11.12]. These numbers are far below the 10-30 krad tolerance of most CMOS devices. The integrated neutron flux is estimated to be no more than $6 \times 10^{11} / \text{cm}^2$ at the inner CSC chambers and no more than $1.3 \times 10^{11} / \text{cm}^2$ at the outer CSC chambers [11.12]. Our guideline is to apply a safety factor of three to the neutron rate: a maximum integrated neutron flux of $2 \times 10^{12} / \text{cm}^2$ for



**Fig. 11.20:** Block diagram for logic contained in the Concentrator PLD on the ALCT2000 prototype.

on-chamber electronics and  $4 \times 10^{11} / \text{cm}^2$  for electronics in the peripheral crates for  $5 \times 10^7$  s of running time.

All of the on-chamber CSC trigger electronics have been tested for radiation resistance [11.13] [11.14] using 63 MeV proton beams at UC Davis, which give a well-calibrated TID dose and are expected to give SEU rates similar to the rates for neutrons from LHC collisions. No significant effects were seen due to TID within the expected dose on any devices. No problems were seen for the comparator ASICs.

Among ALCT components, the Altera PLDs show a strong SEU sensitivity. The PLDs can be periodically reloaded from Altera flash RAMs, which showed no SEUs. Reloading these devices takes about 150 ms. This refresh would be done using a central command, distributed



**Fig. 11.21:** The CCB (left) and TMB (right) modules that were used in 1999 studies, shown on extender cards.



**Fig. 11.22:** Block diagram of the MPC prototype modules that were built in 2000 for use in tests with CSC Sector Receiver and Sector Processor prototypes.

through the TTC system [11.6]. Tests have shown that if any bit flip is counted as an error, the cross-section is $2.3 \times 10^{-9} \text{ cm}^2$ per neutron: each chip will suffer an SEU every 3 hours at full luminosity (including the safety factor of 3). If only ALCT trigger errors are counted, the cross-section decreases to $7 \times 10^{-11} \text{ cm}^2$.
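The quoted interval of about 3 hours between upsets can be reproduced from the numbers given in this section, assuming the upset rate is simply the cross-section multiplied by the average neutron flux. The sketch below uses the on-chamber integrated flux of $2 \times 10^{12} / \text{cm}^2$ (which already includes the safety factor) spread uniformly over $5 \times 10^7$ s; the function name is illustrative.

```python
RUNNING_TIME_S = 5e7        # running time quoted in the text
FLUENCE_ON_CHAMBER = 2e12   # neutrons/cm^2; 3 x 6e11 = 1.8e12, rounded up

def hours_between_upsets(cross_section_cm2,
                         fluence=FLUENCE_ON_CHAMBER,
                         running_time_s=RUNNING_TIME_S):
    """Mean time between SEUs per chip, in hours, for a constant flux."""
    flux = fluence / running_time_s          # neutrons / cm^2 / s
    upset_rate = cross_section_cm2 * flux    # upsets / s / chip
    return 1.0 / upset_rate / 3600.0

print(f"any bit flip:       {hours_between_upsets(2.3e-9):.1f} h")
print(f"ALCT trigger error: {hours_between_upsets(7e-11):.0f} h")
```

With these inputs the mean interval is about 3 hours when any bit flip is counted and roughly 100 hours when only ALCT trigger errors are counted.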

In principle, the SEU problem can be greatly reduced by a design in which critical ALCT logic is triplicated and a small “voting” circuit chooses the answer found by at least two of the three circuits. This reduces the SEU problem to only those cases in which two SEUs have occurred within the same FPGA or PLD. According to Poisson statistics, this happens with probability $P(2)=0.5 \times P^2$, where $P$ is the probability of each SEU. Since $P$ is generally a small number, $P(2)$ can be made extremely small. A refresh from flash RAMs at that time resets $P$ to zero, and also fixes the rare cases in which the voting circuit itself undergoes an SEU. However, tests of such a design have shown a cross section of $5 \times 10^{-11} \text{ cm}^2$, only modestly smaller than the non-voted logic cross-section. One benefit of the voting technique is that the errors are internally detected and the erroneous ALCT outputs are suppressed.
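The approximation $P(2) = 0.5 \times P^2$ can be compared with the exact Poisson probability of two upsets between refreshes. In the sketch below, the upset rate is derived from the cross-section and neutron flux quoted in this section, while the 10-minute refresh interval is a hypothetical value chosen only for illustration.

```python
import math

def poisson(k, mu):
    """Poisson probability of exactly k events for a mean of mu."""
    return math.exp(-mu) * mu ** k / math.factorial(k)

def double_upset_probability(rate_per_s, refresh_interval_s):
    """Return (exact, approximate) probability of two SEUs per interval."""
    mu = rate_per_s * refresh_interval_s  # mean number of SEUs per interval
    return poisson(2, mu), 0.5 * mu ** 2

# 2.3e-9 cm^2 times 2e12 neutrons/cm^2 over 5e7 s, as quoted in the text.
rate = 2.3e-9 * 2e12 / 5e7
exact, approx = double_upset_probability(rate, refresh_interval_s=600.0)
print(f"exact {exact:.2e}, approximation {approx:.2e}")
```

The two values differ only by the factor $e^{-\mu}$, which is close to one whenever the mean number of upsets per refresh interval is small.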

The periodic refresh cycle is also required for the CFEB Xilinx FPGAs, which show SEU cross-sections in the range  $(1-4) \times 10^{-10} \text{ cm}^2$ . These chips reload in about 5 ms. Currently, conversion from Altera designs to Xilinx is being studied because of the much faster reload time.

Peripheral trigger electronics will be studied for radiation effects when the pre-production prototypes become available.

## 11.10 Simulation Status and Results

In the ORCA reconstruction program (version 4), the CSC Local Trigger package [11.15] implements each of the CSC Local Trigger modules and the trigger data passed from one module to the next as classes. The CSC Local Trigger package simulates the trigger functionality up to and including the Sector Receiver, and passes its results to the CSC Track Finder package.

The CSC Local Trigger package receives input from the CSC chamber simulation. These ‘DIGIs’ are the software analogues to the outputs of the AFEB and CFEB boards. The anode wire DIGIs contain the anode discriminator hits, and the cathode strip DIGIs contain the results of the cathode strip preamplifiers and comparator ASICs (half-strip hits). Both anode and cathode DIGIs contain timing information digitized in units of bunch crossings.

The classes for the ALCT, CLCT, TMB, MPC, and SR implement the patterns and logic as previously described for the hardware. The four data classes in the CSC Local Trigger package represent the connections from ALCT to TMB, from CLCT to TMB, from TMB to MPC, and the SR output (to the CSC Sector Processor). Since the Muon Port Card only sorts the correlated LCTs and adds a chamber ID, the class used for the data passed from TMB to MPC is also used for the data passed from MPC to SR.

The efficiency of CLCT pattern-finding can be determined by recording the number of chambers with CLCT stubs, given that an ALCT stub has been recorded for that chamber. Likewise, the efficiency of ALCT pattern-finding is given by the fraction of chambers with ALCT stubs, given that a CLCT stub has been found. If the stubs are required to lie within the chamber struck by the generated high-momentum muon, the measured efficiencies are nearly 100%. Since both types of muon stubs are found with full efficiency, the combination is also fully efficient. The small inefficiency is strongly dependent on the tails of the drift time distribution and the amplifier response. Study of these effects is continuing [11.15].

Figure 11.23 shows the deviations between the $\phi$ position reconstructed by the CSC local trigger and the $\phi$ of the generated muon track. The deviations are shown for each muon station, in units of CSC Track Finder $\phi$ bins, which are 0.26 milliradians wide. Since the CSC strips are radial, a constant resolution in strip widths corresponds to a constant resolution in the $\phi$ coordinate. The distributions are observed to be independent of radius, with only small tails. This is an indication that the deviation is essentially due to the one-half strip width granularity of the simple CLCT trigger algorithm used in the simulation. Future versions of CLCT boards may have larger FPGAs than current prototypes, which would allow them to accommodate additional trigger patterns, thus improving the position resolution.



**Fig. 11.23:** The differences between the  $\phi$  position reconstructed by the CSC local trigger and the  $\phi$  of the generated muon track at layer 3 of the chambers. The differences are shown for each type of muon chamber in units of CSC Track Finder  $\phi$  bins (0.26 milliradians).

The $\phi$ resolutions that are obtained by taking the RMS of the distributions within $3\sigma$ of the means are shown for each CSC station in milliradians in Figure 11.24. This resolution determines the ability of the CSC Track Finder to measure track curvature in the muon system, which will be discussed in the following chapter. These resolutions correspond to 0.15 to 0.2 strip widths in ME1 and 0.125 to 0.17 strip widths in the other stations.
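The truncated RMS used for these resolutions can be sketched as below, assuming a single truncation pass at $3\sigma$ around the mean (the text does not say whether the procedure is iterated). As an illustration, the residuals are taken to be flat over one half-strip, for which the RMS is $0.5/\sqrt{12} \approx 0.144$ strip widths.

```python
import math

def truncated_rms(values, n_sigma=3.0):
    """RMS about the mean of the values lying within n_sigma of the mean."""
    mean = sum(values) / len(values)
    sigma = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    kept = [v for v in values if abs(v - mean) <= n_sigma * sigma]
    kept_mean = sum(kept) / len(kept)
    return math.sqrt(sum((v - kept_mean) ** 2 for v in kept) / len(kept))

# Residuals uniformly spread over one half-strip, in units of strip widths.
residuals = [-0.25 + 0.5 * i / 10000 for i in range(10001)]
print(f"{truncated_rms(residuals):.3f} strip widths")
```

The value for a purely flat half-strip distribution falls within the range of resolutions quoted above, in line with the half-strip granularity being the dominant contribution.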



**Fig. 11.24:** RMS differences between the  $\phi$  position reconstructed by the CSC local trigger and the  $\phi$  of the generated muon track. The differences are shown for each CSC station as a function of pseudorapidity, in units of milliradians.

The next set of plots show deviations between the  $\eta$  position reconstructed by the CSC local trigger and the  $\eta$  of the generated muon track, for each muon station. In Figure 11.25, the deviation is shown in units of  $\eta$  bins at the input of the CSC Track Finder, which are 0.025 units of pseudorapidity. This binning is wider than the intrinsic resolution of about one wire group, but has been shown to be sufficient for all CSC track finding purposes. In most cases, the distribution lies well within one  $\eta$  bin width.

Note that the  $\eta$  resolutions in ME1/A and ME1/1 are not quite ideal. This is due to the  $25^\circ$  tilt of the anode wires in that chamber, which is confirmed by Figure 11.26 (right). There is also a bowed appearance to the plots of the  $20^\circ$  wide chambers (ME2/1, ME3/1, and ME4/1) in Figure 11.26 (left). This effect is due to the difference in  $\eta$  for a given wire between the center of the chamber and the edge: the wire is stretched straight, while the line of constant  $\eta$  is an arc of a circle. The  $\eta$  resolution, shown in Figure 11.27, is obtained by taking the RMS of the distributions within  $3\sigma$  of the mean values. The resolution is almost constant at 0.007 units.



**Fig. 11.25:** The differences between the  $\eta$  position reconstructed by the CSC local trigger and the  $\eta$  of the generated muon tracks. The differences are shown for each type of muon chamber in units of  $\eta$  bins (0.025 units).

## 11.11 Maintenance and Operation

We plan to have spare boards and components sufficient for 10 years of LHC operations. Each board will have a JTAG- or VME-readable register that labels the particular trigger configuration that was loaded into the board. The entire set of LUT contents and FPGA programs will be given one unique identifier. There are many FPGAs and LUTs in the design, so it is important to verify that the correct patterns were loaded. We do not foresee generating LUT contents on the boards from a VME-loadable register. Rather, we will have precompiled FPGA programs (and LUT contents) that are certified to work properly and that meet all timing specifications.



**Fig. 11.26:** Differences between  $\eta$  reconstructed by the CSC local trigger and  $\eta$  of the generated muon track. These are shown for ME1/1 (left) and ME3/1 (right) versus strip number, in units of  $\eta$  bins (0.025 units).



**Fig. 11.27:** The RMS difference between the  $\eta$  position reconstructed by the CSC local trigger and the  $\eta$  of the generated muon track. This is shown for each type of muon chamber, as a function of pseudorapidity, in absolute units.

## 11.12 Status and Schedule

The CSC trigger development schedule is shown in Figure 11.28.



**Fig. 11.28:** Schedule for CSC Local Trigger Development.

Near-final prototypes and radiation test results exist for all on-chamber CSC system electronics, *i.e.*, the CFEB, AFEB, and ALCT boards. The on-chamber boards will be produced on a time scale similar to that for the CSC chambers, so that electronics may be mounted semi-permanently on the chambers soon after the time of production for testing. Orders for production of on-chamber CSC system ASICs, including the Comparator chip, will be placed in early 2001. Production of on-chamber boards will commence later in 2001. Production and testing of these boards will take about two years.

Off-chamber CSC local trigger-related electronics, *i.e.*, CLCT/TMB, MPC, DAQMB, and CCB boards, have been successfully prototyped but not finalized. Production of these boards is expected to begin in 2002, about a year later than the on-chamber CSC electronics. Production and testing of these boards will take about two years.

The installation schedule for the CSC Local Trigger calls for on-chamber electronics to be installed as the electronics are checked out, so that chambers and electronics can be tested, shipped to CERN, and mounted as a single unit. Peripheral crate electronics will be installed at CERN starting in 2003 and will be completed by the middle of 2004 in order to perform integration with other elements of the trigger and DAQ, and to allow sufficient time for system tests. This schedule is dependent on a timely completion of electrical, cooling, rack, and safety support services on the periphery of the endcap iron disks in the collision hall and the counting room.

## References

- [11.1] F. Loddo *et al.*, “CMS Muon Trigger - Preliminary Specifications of the Baseline Trigger Algorithms”, CMS TN-1996/060, May 1996.
- [11.2] The CMS Collaboration, “The Muon Project Technical Design Report,” CERN/LHCC 97-32, December 1997; “Large Cathode Strip Chambers For The CMS Endcap Muon System”, Nucl. Instr. Meth. A419:469-474, 1998.
- [11.3] S.M. Wang and D. Acosta, “Simulation Studies on the Transverse Momentum Resolution of the CSC Track-Finder”, CMS IN 2000/026.
- [11.4] M.M. Baarmand *et al.*, “Spatial Resolution Attainable With Cathode Strip Chambers at the Trigger Level”, Nucl. Instr. Meth. A425:92-105, 1999.
- [11.5] T.Y. Ling, “Front-End Electronics of the CMS Endcap Muon System”, proceedings of the 4th Workshop on Electronics for the LHC Experiments (LEB98), Rome, Italy, September 1998, CERN-LHCC-98-36.
- [11.6] “TTC Distribution for LHC Detectors”, IEEE Trans. Nuclear Science, Vol. 45, No. 3, June 1998, pp. 821-828; “1999 Status Report on the RD-12 Project”, CERN/LHCC 2000-002, 3 January 2000; and J. Christiansen *et al.*, “TTCrx Reference Manual Version 3.0”, October 1999.
- [11.7] The UCLA CMS Group, “CMS Endcap Muon Trigger System Inter-Module Signals”, CMS note in preparation.
- [11.8] G. Wrochna *et al.*, “CMS Endcap Muon RPC-CSC Trigger Connection”, CMS IN-1998/030, December 1998.
- [11.9] J. Hauser, “Primitives for the CMS Cathode Strip Muon Trigger”, proceedings of the 5th Workshop on Electronics for the LHC Experiments (LEB99), Snowmass, Colorado, CERN 99-09, pages 304-308, September 1999.
- [11.10] The UCLA CMS Group, “Anode LCT 2000 Design”, CMS note in preparation.
- [11.11] F. Facchio, private communication.
- [11.12] M. Huhtinen, private communication.
- [11.13] T.Y. Ling and S. Durkin, private communication.
- [11.14] M. von der Mey and J. Hauser, “Radiation Hardness of CSC Trigger Electronics”, CMS note in preparation.
- [11.15] B. Tannenbaum, “Endcap Muon Trigger Primitive Software”, CMS note in preparation.

# 12 Cathode Strip Chamber Track-Finder

## 12.1 Requirements

### 12.1.1 Physics Requirements

The L1 trigger electronics of the CMS muon system must measure the momentum of penetrating particles in order to reduce the several megahertz rate of low-momentum muons produced at full LHC luminosity. The task of the Cathode Strip Chamber (CSC) Track-Finder is to reconstruct tracks in the CSC endcap muon system and to measure the transverse momentum ( $p_T$ ), pseudo-rapidity ( $\eta$ ), and azimuthal angle ( $\phi$ ) of each muon. Although this task is the same for the Drift-Tube (DT) and CSC muon systems, the optimization of the design of the Track-Finder is significantly different for each muon system because of the different logical partitioning of the trigger primitives and the non-axial magnetic field in the endcap region. The algorithms of the CSC Track-Finder are inherently 3-dimensional to achieve maximum background rejection, as illustrated in Fig. 12.1. Moreover, the measurement of  $p_T$  uses spatial information from up to three stations to achieve a precision similar to that of the DT Track-Finder despite the reduced magnetic bending in the endcap. A  $p_T$  resolution of 25% is necessary (see Ref. [12.1]) to have sufficient rate reduction at L1 with a reasonable threshold.



**Fig. 12.1:** Illustration of the three-dimensional track-finding procedure.

### 12.1.2 Boundary Between DT and CSC Track-Finders

The region of overlap between the DT and CSC muon systems should be covered as efficiently as possible by the L1 trigger; however, it is important to define a sharp boundary between the DT and CSC Track-Finders to avoid a duplication of triggers for single muons in this region. Issues related to this separation are documented in Refs. [12.2] and [12.3].

The boundary in the coverage between the DT and CSC Track-Finders is currently set at  $\eta = 1.04$ , as shown in Fig. 12.2. This extends the coverage of the DT Track-Finder to the high- $\eta$  limit of the MB2/2 chambers, and the coverage of the CSC Track-Finder extends approximately to the outer radius of the ME2/2 chambers. This boundary also corresponds to the separation between the barrel and endcap RPC trigger system, which simplifies the association made in the Global Muon Trigger between RPC muons and DT/CSC muons. The CSC Track-Finder requires at least one track segment in the CSC muon system, since otherwise the track should be found by the DT Track-Finder.
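For orientation, the boundary at $\eta = 1.04$ can be converted into a polar angle using the standard definition of pseudorapidity, $\eta = -\ln \tan(\theta/2)$. The helper names below are illustrative.

```python
import math

def polar_angle_deg(eta):
    """Polar angle in degrees for pseudorapidity eta = -ln(tan(theta / 2))."""
    return math.degrees(2.0 * math.atan(math.exp(-eta)))

def pseudorapidity(theta_deg):
    """Pseudorapidity for a polar angle given in degrees."""
    return -math.log(math.tan(math.radians(theta_deg) / 2.0))

print(f"{polar_angle_deg(1.04):.1f} degrees")
```

The boundary therefore lies at a polar angle of roughly 39° with respect to the beam axis.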



**Fig. 12.2:** Illustration of the CMS muon system showing the boundary division between the DT and CSC Track-Finders at  $\eta = 1.04$ .

To cover the region of overlap efficiently, information from MB2/1 chambers is shared with the CSC Track-Finder, and information from the ME1/3 chambers is shared with the DT Track-Finder.

## 12.2 System Overview

The CSC Track-Finder is defined to be the collection of electronic boards which are on the receiving end of the optical links sent by the CSC local trigger and that transmit L1 muons to the Global Muon Trigger (GMT). The CSC muon system is logically partitioned into 12 azimuthal sectors (6 per endcap) for purposes of regional track-finding. Thus, 12 Sector Processors (SP) identify up to the three best muons in each $60^\circ$ azimuthal sector. Each processor is a 9U VME card housed in a crate in the underground counting room of CMS. Three Sector Receiver (SR) cards per sector also reside in the crate to collect the optical signals sent from the Muon Port Cards of one sector, although technology may permit the SR functionality to be merged onto the SP board. A maximum of six track segments are delivered to a Sector Processor from the first muon station (ME1) of a sector. These track segments arrive from three Muon Port Cards, each delivering up to 2 track segments in a $20^\circ$ subsector, as illustrated in Fig. 11.1. For the other muon stations (ME2-ME4), one Muon Port Card per station delivers 3 track segments. In addition, up to four track segments from the DT muon system (2 from each $30^\circ$ subsector) are propagated to a transition board in the back of the crate and delivered to each Sector Processor as well. The output from all 12 Sector Processors is sent to a Muon Sorter (MS) which selects the 4 best muons out of 36 for transmission to the GMT. A block diagram of the CSC Track-Finder architecture is shown in Fig. 12.3.
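The data volumes described above can be tallied in a few lines. This is bookkeeping only, using the counts stated in the text; the constant and function names are illustrative.

```python
ME1_SEGMENTS = 3 * 2     # three Muon Port Cards, up to 2 segments each
ME234_SEGMENTS = 3 * 3   # stations ME2-ME4, one Muon Port Card each, 3 segments
DT_SEGMENTS = 2 * 2      # two 30-degree DT subsectors, 2 segments each

def max_segments_per_sector():
    """Maximum number of track segments delivered to one Sector Processor."""
    return ME1_SEGMENTS + ME234_SEGMENTS + DT_SEGMENTS

def sorter_inputs(n_sectors=12, muons_per_sector=3):
    """Number of muon candidates arriving at the Muon Sorter."""
    return n_sectors * muons_per_sector

print(max_segments_per_sector())
print(sorter_inputs())
```

Each Sector Processor therefore receives at most 19 track segments per bunch crossing, and the Muon Sorter chooses its 4 output muons from 36 candidates.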



**Fig. 12.3:** Architecture of the CSC Track-Finder. The Sector Receiver functionality may be incorporated onto the same board as the Sector Processor.

## 12.3 System Interfaces

### 12.3.1 Crate and Backplane

The CSC Track-Finder is contained in several 9U VME crates. In the present prototype design, 2 Sector Processors, 6 Sector Receivers (3 per SP), and 1 Clock and Control Board occupy one crate; so 6 crates are needed for the entire CSC Track-Finder. The Muon Sorter is contained in an additional crate. VME addressing up to A24/D16 is carried over a standard VME 3U backplane.

A custom 6U point-to-point backplane is used to carry all track segment data from the Sector Receivers and DT Track-Finder to the Sector Processor. Channel-Link LVDS transmitters from National Semiconductor drive the data onto the custom backplane. The layout of one sector of one crate is shown in Fig. 12.4.



**Fig. 12.4:** Illustration of the card placement and backplane connections for one sector in the Track-Finder crate, according to current prototypes.

However, new optical link technology and recent high-density FPGAs may allow SR and SP functionality to be merged onto the same board, thus allowing the entire CSC Track-Finder to be housed in one 9U VME crate along with the CSC Muon Sorter. The custom backplane in this case would deliver signals from the 12 SPs to the Muon Sorter.

### 12.3.2 Transition Modules

#### Transition Module to the DT Track-Finder

CSC track segments from the ME 1/3 chambers in the DT/CSC overlap region are sent from the ME1 Sector Receiver to the DT Track-Finder via a transition board on the back of the crate. One connection is needed to transmit two CSC track segments to one sector of the DT Track-Finder, and two such connections are provided from the ME1 Sector Receiver. The information sent is the quality and the  $\phi$  coordinate of each track segment as well as the bunch crossing number (BXN), as listed in Table 12.1. Look-up tables on the Sector Receiver convert the track segment quantities into the format expected by the DT Track-Finder. The transmission technology currently proposed is Channel-Link LVDS. A connector on the backplane carries the signals to the transition board from the Sector Receiver.

**Table 12.1:** Information delivered from one SR to one DT Track-Finder sector.

| Variable | Function           | bits / muon | bits / 2 muons |
|----------|--------------------|-------------|----------------|
| $\phi$   | azimuth coordinate | 12          | 24             |
| quality  | quality            | 3           | 6              |
| BXN      | LSBs of bunch i.d. | –           | 2              |
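The bit budget of Table 12.1 adds up to 32 bits per connection (two muons of 12 + 3 bits each, plus 2 shared BXN bits). The following is a minimal sketch of how such a word could be packed. The function name `pack_sr_to_dt` and the field ordering are assumptions made for illustration only; the actual bit assignment on the link is not specified here.

```python
PHI_BITS = 12      # azimuth coordinate, per muon (Table 12.1)
QUALITY_BITS = 3   # quality, per muon (Table 12.1)
BXN_BITS = 2       # LSBs of bunch i.d., shared by both muons (Table 12.1)


def pack_sr_to_dt(muons, bxn):
    """Pack two (phi, quality) pairs and the BXN LSBs into one integer.

    Assumed layout (illustrative only): muon 0 occupies the lowest bits,
    followed by muon 1, with the BXN bits on top.
    Returns (word, total_width_in_bits).
    """
    if len(muons) != 2:
        raise ValueError("exactly two track segments are sent per connection")
    word = 0
    shift = 0
    for phi, quality in muons:
        if not 0 <= phi < (1 << PHI_BITS):
            raise ValueError("phi out of range")
        if not 0 <= quality < (1 << QUALITY_BITS):
            raise ValueError("quality out of range")
        word |= phi << shift
        shift += PHI_BITS
        word |= quality << shift
        shift += QUALITY_BITS
    # Only the least significant BXN bits are transmitted.
    word |= (bxn & ((1 << BXN_BITS) - 1)) << shift
    return word, shift + BXN_BITS


word, width = pack_sr_to_dt([(0xABC, 5), (0x123, 2)], bxn=7)
```

With this layout the total width comes out at 32 bits, matching the sum of the "bits / 2 muons" column.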

#### Transition Module from the DT Track-Finder

The DT track segments from the MB2/1 chambers in the DT/CSC overlap region are delivered to the Sector Processor via a transition board on the back of the crate. Connections from two sectors of the DT Track-Finder are required. The information received is the quality,  $\phi$  coordinate, and  $\phi_b$  bend angle of each track segment as well as the BXN, as listed in Table 12.2. Look-up tables on the DT Track-Finder convert the track segment quantities into the format expected by the Sector Processor. The chosen transmission technology is Channel-Link LVDS. A connector on the backplane carries the signals from the transition board to the Sector Processor.

### 12.3.3 Clock and Control Module

#### The TTC interface

The Clock and Control Board (CCB) provides the crate level interface to the TTC system. The TTC interface is based on the TTCrx chip [12.4]. The general sequence of L1A, Reset and BC0 commands is described in Chapter 16. Particularly, for the Reset sequence, the TTC sends a broadcast command (either system, or user) to the TTCrx indicating that the next L1A has to be treated as a Reset. After this data is sent, a single L1A is generated. CCB decoding logic recognizes this command and treats the next incoming L1A as a RESET signal. After some predetermined interval the next broadcast command is transmitted over the TTC indicating that the next L1A should be treated as Bunch Crossing 0. In a similar fashion, CCB recognizes this command and treats the next L1A as a BC0. After that CCB internal logic enables generation of L1A to backplane upon every L1A from TTC system. The TTCrx chip can be programmed using an I2C interface and interface controller PCF8584 over VME.

**Table 12.2:** Information delivered from one DT Track-Finder sector to the SP

| Variable      | Function           | bits / muon |
|---------------|--------------------|-------------|
| $\phi$        | azimuth coordinate | 12          |
| $\phi_b$      | $\phi$ bend angle  | 5           |
| quality       | quality            | 3           |
| BXN           | LSBs of bunch i.d. | 2           |
| Synch./Calib. | Special Mode       | 1           |
| Flag bit      | Denote if 2nd muon | 1           |
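The Reset/BC0 decoding sequence described for the CCB can be pictured as a small state machine. The sketch below is illustrative only: the class `CcbDecoder`, its method names, and the choice to block L1A forwarding again after a RESET are assumptions, not a description of the CCB firmware.

```python
class CcbDecoder:
    """Toy model of the CCB decoding of TTC broadcast commands and L1A."""

    def __init__(self):
        self.pending = None       # "RESET" or "BC0" once armed by a broadcast
        self.l1a_enabled = False  # True after the Reset/BC0 sequence completes

    def broadcast(self, command):
        """A TTC broadcast command announces how to treat the next L1A."""
        if command not in ("RESET", "BC0"):
            raise ValueError("unknown broadcast command")
        self.pending = command

    def l1a(self):
        """Return the signal driven onto the backplane for this L1A, or None."""
        if self.pending is not None:
            signal, self.pending = self.pending, None
            # Assumption: L1A forwarding is enabled by BC0 and disabled by RESET.
            self.l1a_enabled = (signal == "BC0")
            return signal
        return "L1A" if self.l1a_enabled else None


ccb = CcbDecoder()
before = ccb.l1a()       # no sequence yet: nothing is forwarded
ccb.broadcast("RESET")
first = ccb.l1a()        # treated as RESET
ccb.broadcast("BC0")
second = ccb.l1a()       # treated as BC0
third = ccb.l1a()        # ordinary L1A, now forwarded
```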

#### Clock adjustments and settings

There are fine and coarse delays for clock and command signals incorporated on the TTCrx chip [12.4]. We use the TTCrx Clock40Des1 deskewed signal as the main clock signal from the TTCrx board. Three other possible clock sources are ECL and NIM clocks from external connectors on the front panel, and a clock from a quartz oscillator.

The selected clock signal acts as the main master clock for the CCB internal logic. The phase of the clock signal provided to the synchronization logic can be adjusted in 2 ns steps with respect to the main master clock. The phase of the L1A, BC0, RESET, and two reserve signals (RSV1 and RSV2) distributed to all slots in a crate can likewise be adjusted in 2 ns steps with respect to the main CCB master clock.

The selected clock is distributed to the SP and SR modules in the crate via its custom backplane. The phase of each clock signal distributed over this backplane can be adjusted in 2 ns steps with respect to the main master clock, individually for each slot in the crate. It is also possible to send a single 25 ns clock pulse to all modules in the crate upon a special VME command.

### 12.3.4 Crate Power and Cooling

The crate power supply delivers $\pm 12\text{V}$, $+5\text{V}$, and $+3.3\text{V}$. All other voltages, such as $2.5\text{V}$ for high-density FPGAs, are obtained using DC-to-DC converters on the boards. The crate power consumption is approximately 500 W, both for current prototypes (6 crates total) and for a single-crate Track-Finder solution using new low-power devices.

## 12.4 Sector Receiver

Each Sector Receiver (SR) receives via optical links the Local Charged Track (LCT) information for 3 muons from each of two Muon Port Cards (MPCs) located at the periphery of the CMS detector (except for the first station, where 2 muons from each of three Muon Port Cards are received). This information is then synchronized and reformatted within the SR (via look-up tables) into angular variables for the muons: the azimuthal angle ($\phi$), the local slope angle in $\phi$ ($\phi_b$), and the pseudo-rapidity ($\eta$). These data, along with bits summarizing muon quality and other diagnostic information, are then communicated to the Sector Processor and the (barrel muon) DT Track-Finder in the (differing) format expected by each. Complete input information is also stored for readout by the DAQ system for accepted events. In addition, a VME-readable counter and log of detected hardware errors is included. The basic elements of the SR are shown in Fig. 12.5.




**Fig. 12.5:** Block diagram of the Sector Receiver logic

### 12.4.1 Optical link inputs

Each MPC transmits two or three muon LCTs to a SR. Each SR can receive data from two or three MPCs. For each  $60^\circ$  sector in the current prototype architecture, there are:

1. A SR receiving data from the three MPCs from ME1
2. A SR servicing the MPC from each of ME2 and ME3
3. A half-used SR servicing the MPC from ME4, in the re-scoped scenario with the fourth CSC station

A single SR design has the flexibility to handle these cases. In total, there are 120 bits sent from each MPC to a SR.

The bits from the MPCs are sent via optical links. The present design uses the HP serializer/de-serializer chipset (HDMP 1022/1024) with Methode optical transceivers. The HDMPs operate in simplex mode with frames 24 bits long, with one frame transmitted per bunch crossing. (The HDMPs automatically divide the time between bunch crossings into 24 bits, each approximately 1 ns long.) Of these 24 bits, 3 are for defining the frame and one is a “flag bit” for checking transmitter/receiver synchronization, leaving 20 bits available for data. Thus, each MPC drives 6 links, organized as 3 pairs.
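The link count follows directly from the frame arithmetic in this section. As a quick check (a sketch only; the constant names are introduced here for illustration, and the 25 ns period corresponds to the 40 MHz bunch crossing clock):

```python
FRAME_BITS = 24        # bits per HDMP frame, one frame per bunch crossing
FRAMING_BITS = 3       # bits that define the frame
FLAG_BITS = 1          # transmitter/receiver synchronization flag
BITS_PER_MPC = 120     # payload sent from one MPC to a SR per crossing
BX_PERIOD_NS = 25.0    # bunch crossing period at 40 MHz

data_bits_per_frame = FRAME_BITS - FRAMING_BITS - FLAG_BITS
links_per_mpc, leftover_bits = divmod(BITS_PER_MPC, data_bits_per_frame)
bit_period_ns = BX_PERIOD_NS / FRAME_BITS

print(data_bits_per_frame, links_per_mpc, leftover_bits, round(bit_period_ns, 2))
```

The 120 payload bits fill exactly six 20-bit frames, which is why each MPC drives 6 links, and each serialized bit lasts slightly more than 1 ns.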

### 12.4.2 Backplane inputs

The SR gets its clock and control signals from the CCB off the backplane, as described in Section 12.3.3. The SR maintains an internal bunch crossing counter which is incremented each clock cycle. With the arrival of a RESET signal, the counter halts and re-initializes to a preset value. It then starts counting upon receipt of the BC0 signal. The 5 least significant bits of this locally determined bunch crossing number are compared to the Anode BXN arriving from the MPC, and an error condition results if they are different. One of the two reserved signals will be used to specify a test mode.
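The counter behaviour described above (halt and preset on RESET, start on BC0, compare the low bits with the BXN from the MPC) can be sketched as follows. The class and method names are hypothetical, and wrap-around at the end of an orbit is deliberately left out of this simplified model.

```python
class BunchCrossingCounter:
    """Simplified model of the SR internal bunch crossing counter."""

    def __init__(self, preset=0, compare_bits=5):
        self.preset = preset
        self.mask = (1 << compare_bits) - 1  # 5 LSBs are compared per the text
        self.count = preset
        self.running = False

    def reset(self):
        """RESET: halt and re-initialize to the preset value."""
        self.count = self.preset
        self.running = False

    def bc0(self):
        """BC0: start counting."""
        self.running = True

    def clock(self):
        """One 40 MHz clock cycle; counts only while running."""
        if self.running:
            self.count += 1

    def bxn_mismatch(self, mpc_bxn):
        """True if the local LSBs differ from the BXN arriving from the MPC."""
        return (self.count & self.mask) != (mpc_bxn & self.mask)


counter = BunchCrossingCounter(preset=0)
counter.reset()
counter.clock()              # ignored: counter is halted until BC0
counter.bc0()
for _ in range(35):
    counter.clock()
error = counter.bxn_mismatch(35 & 0x1F)
```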

### 12.4.3 Sector Receiver Outputs to Trigger Path

There are two separate streams of outputs from the SR, one to the SP via the backplane, and one to the DT Track-Finder via transition modules and cables. These output streams are similar but are customized for each recipient. Table 12.3 lists the bits transmitted to the SP. Table 12.1 lists the bits transmitted to the DT Track-Finder. The muon track variables $\phi$, $\phi_b$, and $\eta$ are described further below. The accelerator muon bit is simply copied from the input from the MPC. The quality bits are computed from MPC inputs as described below. For each set of 3 muons coming from a MPC, the “CSC ghost” code encodes in 2 bits which pair of muons, if any, had a common CSC ID; this information helps resolve ambiguities in combining anode and cathode views. Finally, the Error bit is set if the data are invalid, for example if an optical link error has been detected.

### 12.4.4 Sector Receiver Outputs to DAQ Path

The SR will store all incoming data in a buffer for a time long enough (several  $\mu$ s) so that the DAQ system can read it for accepted events. The data will be transmitted from the SR to a Front-End Driver using an optical link.

Another buffer will contain a VME-readable error counter and error log, with bunch crossing number and error type for any errors found. Error conditions detectable by the SR include: the above-mentioned mismatch in BXN; a set error bit in data from the MPC, indicating an error further upstream; and an error bit provided by the HDMP chipset, indicating loss of synchronization in the optical link.

### 12.4.5 Hardware Implementation

#### Optical receiving, de-serialization, synchronization

As discussed above, the data from the MPCs arrive on optical cables and are de-serialized, forming a 120-bit-wide data set arriving from each MPC each beam crossing. The data are latched into the Front FPGA with clocks derived from the 40-MHz clock supplied by the CCB. Data arriving on separate links can be re-synchronized by programmable delays within the FPGA.

**Table 12.3:** Information delivered from one SR to the SP

| Variable          | Function           | bits / muon | bits / 6 muons |
|-------------------|--------------------|-------------|----------------|
| $\phi$            | azimuth coordinate | 12          | 72             |
| $\phi_b$          | $\phi$ bend angle  | 5           | 30             |
| $\eta$            | pseudo-rapidity    | 6           | 36             |
| Accelerator $\mu$ | $\eta$ bend angle  | 1           | 6              |
| quality           | –                  | 3           | 18             |
| CSC ghost         | 2 hits in same CSC | –           | 4              |
| Error             | Data not valid     | –           | 1              |

#### Front FPGA

For each muon, the Front FPGA receives the data from the MPCs. It is connected to all address lines of the muon’s first stage of memory look-up tables, to the DAQ output path, and to the SR’s VME interface. Thus, there is some room for flexibility in the design presented here, should the requirements evolve. The Front FPGA keeps a copy of the incoming data in a buffer several $\mu$s deep for readout via DAQ. For diagnostic purposes, the Front FPGA can insert dummy data into the Sector Receiver, simulating data arriving from the MPC, either in single-event mode or in burst mode at 40 MHz for 256 beam crossings. Finally, the Front FPGA contains logic for downloading the LUT contents via the VME interface.

#### Memory Look-up Tables

The LUT functions are shown schematically in Fig. 12.6. In the current prototype, 6 identical 256K by 16 bit memories are used for each of the six muons. In the first stage, the cathode LCT information is translated into local or approximate values of $\phi$ and $\phi_b$, and anode LCT information is translated into the approximate $\eta$. In the second stage, each coordinate is corrected using information from other coordinates. The second stage also corrects for any alignment problems as well as slanted anode wires. In addition, a sixth LUT is used to compute quality bits, as described in the next subsection. The details of the two stages are as follows.

The 8-bit “1/2 strip ID” corresponds to a coarse $\phi$ position at layer 3 within a particular “station” (ME1–ME4) of 6 layers. For a track traversing this station at an angle, the true $\phi$ depends on the depth within the 6 layers. Furthermore, at each station, chambers alternating in $\phi$ are overlapped, with “front” chambers closer to the interaction region than “back” chambers. The Sector Processor can function most effectively if the $\phi$ passed to it is at a conventional distance from the interaction region. In the first LUT, such a $\phi$ is computed, using in addition: the 8-bit “CLCT pattern” which encodes which strip pattern was recorded in the 6 layers (and which also carries information for decoding the “1/2 strip ID”); the “L/R Bend” bit, and a “Front/Rear chamber bit” which is derived in the Front FPGA from the 4 CSC ID bits. The resulting “local $\phi$” is 10 bits, in half-strip units, spanning only that particular chamber. From the same LUT also comes a raw $\phi_b$, which with 6 bits encodes the change in $\phi$ between layer 1 and layer 6, in fractional strip units, inferred from the CLCT pattern. (Although there are 256 possible CLCT patterns, many of them correspond to identical values of $\phi_b$.)

**Fig. 12.6:** Sector Receiver memory lookup connections (six 256K x 16 RAMs)

In the other first-stage LUT, the approximate 6-bit  $\eta$  is computed from the 7-bit ALCT wire group and the 4-bit CSC ID (which is necessary since wire grouping depends on the chamber). Unlike the first-stage local  $\phi$ , this  $\eta$  is already global, although still requiring correction in the second stage. In this LUT, we also use some of the extra output bits to pass on an in-time copy of the 4 CSC ID bits to the second stage LUTs.

In the second-stage LUTs, four independent quantities are calculated:

1. For the SP, a 12-bit global $\phi$ is computed from the 10-bit local $\phi$ using the CSC ID, with alignment corrections also possible using the most significant 4 bits of $\eta$ from the first stage.
2. For the DT Track-Finder, a similar 12-bit global $\phi$ is computed independently in order to provide for coordinate conventions different from the SP.
3. For the SP, a 5-bit corrected $\phi_b$ in radians is computed from the raw value in fractional strip units. This requires $\eta$ since the strip width depends on the position along the strip. The CSC ID is also input to this LUT in case there are chamber-dependent corrections.
4. For the SP, the 6-bit corrected $\eta$ is computed from the first-stage $\eta$. This is corrected using the 2 most significant bits of local $\phi$ from the first stage, primarily to account for slanted anode wires in ME1/1. The (essentially negligible) $\phi$-dependence of $\eta$ along other anode wires can also be corrected for, at no cost.

Items 1 and 2 each have a dedicated LUT, while items 3 and 4 are combined into one LUT. The memories used in the working prototype are described in Section 12.10.1.

#### Computation of quality bits

For each muon, 3 quality bits are passed to both the DT Track-Finder and the SP. They are functions of the following inputs to the SR: the Valid Pattern flag, the cathode pattern number, the anode pattern quality, the ALCT/CLCT BXN match, all 4 TMB status bits, the incoming error bit, as well as the SR’s internally generated error flags. Flexibility is needed, since the preferred function may change with operating conditions and as understanding of the trigger performance increases. The simple functions now envisioned could probably be implemented in the Front FPGA. However, in order to maintain maximum flexibility and eliminate potential latency problems with this calculation, we are dedicating the sixth LUT to it for each muon.

#### Back FPGA

After the second LUT stage, the data go directly to both the Back FPGA and the Channel-Link chips for transmission to the SP on the backplane (or to SP functions on the same board). Unlike the data to the SP, the data to the DT Track-Finder must go through logic in the Back FPGA in order to select 2 of 3 muons.

Since the Back FPGA is connected to all output signals as well as the VME interface, there is flexibility for various diagnostic tests, all run at 40 MHz. In a typical test, the Back FPGA records the data from 256 events as it passes by on the way to the Sector Processor. For more specialized tests of the SR-SP interface, 256 events can be loaded into the Back FPGA buffer and then clocked out to the SP. These capabilities facilitated debugging of both the SR and the SP. As with the Front FPGA, the Back FPGA is used to load and read the memory LUT contents.

#### Computation of CSC ghost bits for SP

It is possible that two track segments delivered by a single MPC may have come from the same CSC chamber, in which case there is an ambiguity in the association of the anode and cathode LCTs. The SR can compare the CSC IDs of each track segment coming from a MPC and set a “CSC ghost” flag for the SP, which in turn will try all  $\eta, \phi$  combinations for this pair of track segments to resolve the ghosts in the track-finding. For example, for stations ME2-ME4, the three muons (designated A, B, and C) in the top half of one SR all come from one MPC. Each has its own CSC ID, and at most two of them can have the same ID. To assist the SP in resolving the ambiguities, 2 “CSC ghost” bits (see Table 12.3) are sent to the SP with the following binary code: 00 means all three IDs are different, 01 means A=B, 10 means A=C, and 11 means B=C. Calculation of the code is in one of the FPGAs for that set of three muons. Similarly, a separate 2-bit code is computed for the three muons in the bottom half of the SR. The case for ME1 is slightly different since 3 MPCs each deliver two track segments to one SR. In that case, only 3 bits are needed, where each bit denotes whether the two track segments from one MPC are from the same chamber.
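The 2-bit code for ME2-ME4 and the 3-bit flag word for ME1 can be written down compactly. The function names below are hypothetical, and the assignment of ME1 bit positions to MPCs is an assumption; the code values themselves follow the text.

```python
def csc_ghost_code(id_a, id_b, id_c):
    """2-bit ghost code for three muons A, B, C delivered by one MPC.

    00: all CSC IDs differ, 01: A=B, 10: A=C, 11: B=C.
    At most two of the three can share an ID, so the checks are exclusive.
    """
    if id_a == id_b:
        return 0b01
    if id_a == id_c:
        return 0b10
    if id_b == id_c:
        return 0b11
    return 0b00


def me1_ghost_bits(pairs):
    """3-bit word for ME1: bit i is set if both segments from MPC i share a chamber.

    `pairs` holds one (csc_id_0, csc_id_1) tuple per MPC; the bit order is
    an illustrative assumption.
    """
    bits = 0
    for i, (first_id, second_id) in enumerate(pairs):
        if first_id == second_id:
            bits |= 1 << i
    return bits


codes = [csc_ghost_code(1, 2, 3), csc_ghost_code(4, 4, 7),
         csc_ghost_code(4, 7, 4), csc_ghost_code(7, 4, 4)]
me1_word = me1_ghost_bits([(1, 1), (2, 3), (5, 5)])
```

This sketch ignores empty track segments; a real implementation would presumably have to exclude them from the comparison.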

#### Reduction from three to two muons for the DT

As agreed on with the DT group, in the endcap-barrel overlap region, the DT Track-Finder will use muons only from (part of) chamber ME1/3. The MPC servicing this chamber provides up to three muons to the SR, including muons from ME1/1 and ME1/2. The DT Track-Finder can accept at most two muons. Therefore, special logic on the SR is required to select (at most) two muons relevant to the DT Track-Finder from the muons coming from the MPC, transfer those muons, and mask out the rest. The logic for this reduction is in the Back FPGA.

#### VME and JTAG Interfaces

The VME interface is in a FPGA near the backplane. This FPGA also contains miscellaneous logic such as the bunch crossing counter. It drives a JTAG controller (design copied from the SP) for servicing the Front and Back FPGAs. There is also a front-panel JTAG connector for bench tests; this will not be used in the working experiment. Currently we use it to load the EEPROMs at the top of the board; the FPGAs themselves are then loaded from the EEPROMs.

#### Operational Modes

The Sector Receiver has VME addressable registers to define various modes for reading and writing to memory LUTs, for normal operation, and for test modes. The number of clock cycles per start command and various clock delays are all VME-programmable. Depending on the mode, access to the memory LUT address and data lines is reconfigured using tri-state and bi-directional buffers. The desired interactions among the firmware of the three functional types of FPGAs have been successfully demonstrated in the prototypes.

## 12.5 Sector Processor

### 12.5.1 Overview

The Sector Processor reconstructs tracks from the track segments delivered by the Sector Receivers and the DT Track-Finder. The number of CSC track segments collected by one Sector Processor is 15 per bunch crossing, assuming that ME4 participates. Six track segments are delivered from ME1; three each are delivered from ME2–ME4. Additionally, 4 DT track segments are delivered from the MB 2/1 chambers in the outer wheel of the barrel muon system.

A description of the algorithms for the Sector Processor can be found in Refs. [12.5] and [12.6]. The reconstruction of complete tracks from individual track segments is partitioned into several steps to minimize the logic and memory size of the Track-Finder. First, the track segments from the CSC and DT trigger systems must be synchronized and possibly held for more than one bunch crossing to accommodate bunch-crossing misidentification from the LCT and BTI processors. Next, nearly all possible pairwise combinations of track segments are tested for consistency with a single track. That is, each track segment is *extrapolated* to another station and then compared to other track segments in that station. Successful extrapolations yield tracks composed of two segments, which is the minimum necessary to form a trigger. If an ambiguity is created when two muons enter the same CSC chamber, all possible $\eta, \phi$ combinations are tried. The process is not complete, however, since the Track-Finder must report the number of *distinct* muons to the L1 trigger. A muon which traverses all four muon stations and registers four track segments would yield six track “doublets.” Thus, the next step is to *assemble* complete tracks from the extrapolation results and cancel redundant shorter tracks. Finally, the best three muons are selected, and the track parameters are measured.
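The count of six doublets is simply the number of station pairs among four stations. A short check, which also shows the effect of leaving out the direct ME1 to ME4 extrapolation excluded by the Extrapolation Unit design (variable names are illustrative):

```python
from itertools import combinations

stations = ["ME1", "ME2", "ME3", "ME4"]

# Every unordered pair of stations is a potential two-segment "doublet".
doublets = list(combinations(stations, 2))

# Direct ME1-ME4 extrapolations are excluded in the Extrapolation Unit.
tested = [pair for pair in doublets if pair != ("ME1", "ME4")]

print(len(doublets), len(tested))
```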

The overall scheme for the Sector Processor is illustrated in Fig. 12.7. Each of the important blocks is described in detail below.



**Fig. 12.7:** Block diagram of the Sector Processor logic

### 12.5.2 Bunch Crossing Analyzer

The input data to the Sector Processor from the DT and CSC trigger systems are synchronized to the local clock before being sent to the Extrapolation Units. A provision was made in the design to include some ability to analyze track segments received in out-of-time bunch crossings, for several reasons:

- The DT Track-Finder sends two track segments from one chamber over consecutive bunch crossings
- The bunch crossing assignment of the DT and CSC local triggers is not 100% accurate, although it is nearly so for the CSC system
- It will be easier to commission the system when cable delays are not exactly known

To incorporate a multi-bunch mode, we take advantage of the sparseness of the data. If the data were not sparse, CSC track segments would already be lost at the Muon Port Card, which selects only the three best track segments from 9 chambers. Therefore, we consider track segments from other bunch crossings only if there are empty track segments in the current crossing; otherwise, the size of the extrapolation logic would grow enormously.

The window over which track segments are collected is at least two bunch crossings wide. Although the window is left open for more than one bunch crossing, the Sector Processor must report triggers at the correct bunch crossing every crossing. In other words, overlapping time buckets are used.

This capability is introduced before the track segments are stored in a FIFO (for later retrieval by the Assignment Unit) and before the extrapolation logic. For a given station, the best three track segments (the best six for ME1) are selected from  $N$  crossings based on the track segment quality and on the deviation from the current crossing. The same can be done for the best track segments from MB2/1. In the simplest scenario, the track segments in crossing  $N$  have highest priority, followed by those in  $N+1$ . To keep the sorting logic compact and fast, the Muon Port Card sends the best three track segments in ranked order.
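In the simplest scenario described above, segments from the current crossing always win, and segments from the following crossing only fill slots that would otherwise be empty. A minimal sketch of that priority rule is given below. The function name, the list-based representation, and the restriction to exactly two crossings are assumptions; the inputs are taken to be already in ranked order, as delivered by the Muon Port Card.

```python
def select_segments(current, following, n_slots=3):
    """Choose up to n_slots segments, preferring the current crossing.

    `current` and `following` are ranked lists of track segments (best first)
    for crossings N and N+1. Returns (segment, out_of_time) pairs, where the
    flag records that the segment came from a different bunch crossing.
    """
    selected = [(segment, False) for segment in current[:n_slots]]
    for segment in following:
        if len(selected) >= n_slots:
            break
        selected.append((segment, True))
    return selected


sparse = select_segments(["a"], ["x", "y", "z"])
full = select_segments(["a", "b", "c"], ["x"])
me1 = select_segments(["a", "b"], ["x"], n_slots=6)
```

The out-of-time flag corresponds to the information later used by the Final Selection Unit to avoid generating extra triggers over several bunch crossings.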

This scheme is shown in Fig. 12.8 for 3 track segments as input. The order of the track segments into the rest of the Sector Processor can be changed; but as this occurs before storage in the local FIFO, it does not influence the rest of the logic. A flag is set to record whether a track segment comes from the current bunch crossing or a different one. This flag will be used in the Final Selection Unit of the Sector Processor to determine if a trigger should be inhibited so that the Sector Processor does not generate extra triggers over several bunch crossings.

### 12.5.3 Extrapolation Unit

A single extrapolation unit forms the core of the Track-Finder trigger logic. It takes the three-dimensional spatial information from two track segments in different stations, and tests if those two segments are compatible with a muon originating from the nominal collision vertex with a curvature consistent with the magnetic bending in that region. All possible extrapolation pairs should be tested in parallel to minimize the trigger latency. However, we have excluded direct extrapolations from ME1 to ME4 in order to reduce the number of combinations and to reduce some random coincidences (since those chambers are expected to have the highest rates). The exclusion also facilitates track assembly based on “key stations,” which is explained in the next section.

The extrapolation logic should be programmable, and it is expected to be implemented in FPGAs. A logic diagram for the extrapolation of one track segment from station A to another in station B is shown in Fig. 12.9. The flip-flops for data pipelining are not shown. The extrapolation unit is composed of several sub-units which analyze the  $\eta$  coordinates of the two track segments from different stations, the  $\phi$  coordinates, and the quality of the resulting extrapolation. These sub-units are described below.

**Fig. 12.8:** Block diagram of the Bunch Crossing Analyzer

#### Eta Road-Finder:

The tests involving the $\eta$ information from the two track segments are the following:

1. Determine if each track segment is in the allowed trigger region in $\eta$
2. Compare the $\eta$ values of the two track segments to determine if both lie along a straight line projection to the collision vertex within a certain tolerance
3. Check that the bend angle in $\eta$ for at least one track segment is consistent with a track originating from the collision vertex. Presently, only one bit (the Accelerator Muon bit) is used to flag if a track segment is parallel to the beam axis rather than projective.

The “AND” of all 3 conditions results in one bit which is sent to all other extrapolation units involving the same pair of stations. In the event that two track segments come from the same CSC chamber, there is an ambiguity in the association of the $\eta$ and $\phi$ hits which gives rise to ghost hits. The Sector Processor can test all possible combinations by swapping the $\eta$ coordinates of two track segments which come from the same CSC chamber. This is accomplished by sharing the result of the $\eta$ tests with other extrapolation units. The overall output of an $\eta$ unit, then, is the “OR” of its own test with the result from another $\eta$ unit with a different track segment in the same source chamber (but the same track segment in the target station). This handling of CSC ghosts is currently foreseen only for ME1.
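The three $\eta$ conditions and the sharing of results between units can be sketched as below. This is an illustration under stated assumptions: $\eta$ values are treated as integers in LUT units, the window and tolerance parameters are hypothetical, and gating the shared result with the ghost flag is one plausible reading of the text rather than a documented detail.

```python
def eta_test(eta_a, eta_b, accel_a, accel_b, eta_min, eta_max, tolerance):
    """AND of the three eta conditions for one pair of track segments."""
    # 1. Both segments lie in the allowed trigger region.
    in_region = eta_min <= eta_a <= eta_max and eta_min <= eta_b <= eta_max
    # 2. Both segments lie on a common projection to the vertex, within tolerance.
    projective = abs(eta_a - eta_b) <= tolerance
    # 3. At least one segment is not flagged as an accelerator (beam-parallel) muon.
    bend_ok = not (accel_a and accel_b)
    return in_region and projective and bend_ok


def eta_unit_output(own_result, partner_result, ghost_flag):
    """OR of this unit's test with that of the unit holding the swapped eta.

    Assumption: the partner result is used only when the CSC ghost flag
    says the two source segments share a chamber.
    """
    return own_result or (ghost_flag and partner_result)


good = eta_test(20, 21, False, False, eta_min=0, eta_max=63, tolerance=2)
far = eta_test(20, 30, False, False, eta_min=0, eta_max=63, tolerance=2)
halo = eta_test(20, 21, True, True, eta_min=0, eta_max=63, tolerance=2)
rescued = eta_unit_output(False, True, ghost_flag=True)
```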

Those extrapolation units that test track segments from the DT muon system have modified conditions for the  $\eta$  unit because no  $\eta$  information is sent from the DT trigger system. In general, the conditions listed here apply only to the CSC track segment for tracks in the overlap region. The value of  $\eta$  from the CSC track segment is used for further tests in the  $\phi$  road-finder.



**Fig. 12.9:** Block diagram of the extrapolation unit logic, which compares a track segment in one station ( $A_1$ ) with that in another ( $B_1$ ).

#### Phi Road-Finder:

The tests involving the  $\phi$  information from the two track segments are the following:

1. Compute the difference in  $\phi$  between the two track segments
2. Check that the difference in  $\phi$  is consistent with the bend angles in  $\phi_b$  measured at each station
3. Compare the difference in  $\phi$  to the maximum allowed at that  $\eta$ . Several thresholds may be employed to provide a coarse  $p_T$  measurement.
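The three $\phi$ tests can be illustrated as follows. Everything specific in this sketch is an assumption: the function name, the integer units, the form of the bend-angle consistency check (bend angles taken as signed and pre-scaled to the units of $\Delta\phi$), and the threshold values. The only physics used is that a smaller $\Delta\phi$ corresponds to less bending and hence a higher $p_T$.

```python
def phi_road(phi_a, phi_b, bend_a, bend_b, thresholds, bend_slack):
    """Return a coarse pT code: 0 = failed, 1 = low, 2 = medium, 3 = high.

    `thresholds` is (t_high, t_medium, t_low): increasing maximum |dphi|
    values for the eta region of the segments; t_low is the largest allowed.
    """
    # 1. Difference in phi between the two track segments.
    dphi = phi_b - phi_a
    # 2. Consistency with the locally measured bend angles (assumed form).
    if any(abs(dphi - bend) > bend_slack for bend in (bend_a, bend_b)):
        return 0
    # 3. Compare |dphi| with the thresholds for a coarse pT assignment.
    t_high, t_medium, t_low = thresholds
    magnitude = abs(dphi)
    if magnitude <= t_high:
        return 3
    if magnitude <= t_medium:
        return 2
    if magnitude <= t_low:
        return 1
    return 0


limits = (4, 16, 64)  # hypothetical thresholds
stiff = phi_road(100, 102, 2, 3, limits, bend_slack=5)
soft = phi_road(100, 150, 48, 52, limits, bend_slack=5)
too_wide = phi_road(100, 300, 200, 200, limits, bend_slack=5)
inconsistent = phi_road(100, 110, -10, 10, limits, bend_slack=5)
```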

#### Quality Assignment Unit:

The final quality assignment for the extrapolation is based on the bits from the  $\eta$  and  $\phi$  road-finder units as well as the track segment quality bits. It is generated by a small look-up table. The resulting quality word is either 1 or 2 bits, depending on the stations involved. Its definition is programmable, but we use it to assign a coarse  $p_T$  (low, medium, and high) to extrapolations involving the first muon station (ME1 or MB1). Otherwise, the quality just represents whether the extrapolation was successful or not. The expected  $p_T$  resolution for a  $\phi$  resolution of 10 bits is about 30% when ME1 is involved. The quality word is used later when muon candidates are sorted.

### 12.5.4 Track Assembly Unit

The track assembly stage examines the output of the extrapolation units and determines if any track segment pairs belong to the same muon. If so, those segments are combined and a code is assigned to denote which muon stations are involved. The identification of the participating track segments is registered also.

The underlying feature of a Track Assembly Unit is the concept of a “key station.” For this Track-Finder design, ME2 and ME3 are key stations. A valid trigger in the endcap region must have a hit in one of those two stations. In this way, the output of the extrapolation units can be separated into three data streams: one for patterns keying off ME3, one for patterns keying off ME2 in the endcap region, and one for patterns keying off ME2 in the DT/CSC overlap region. This is illustrated in Fig. 12.10. Only ME2 is used as a key station in the overlap region, since ME3 has no coverage there and ME1 has too many track segments. Some muons will be found by more than one stream, so the Final Selection Unit described in the next section must resolve the double counting.

Each track segment of a key station, of which there are three each for ME2 and ME3, is tested for extrapolations to the other stations. Therefore, the extrapolation results appropriate for that key segment are interrogated. The Track Assembler logic checks if the key track segment has successful extrapolations to more than one station. The output of this logic is a code designating the best track pattern which contains the given key segment. Thus, up to three tracks may be found per data stream, 9 total for all three streams.

**Fig. 12.10:** Illustration of the track assembly procedure separated into three data streams.

There are six track segments allowed in ME1, and the extrapolation quality to ME1 is 2 bits. There are three track segments allowed in each of the other non-key stations, and the extrapolation quality to those stations is 1 bit. Thus, a total of 18 bits are interrogated. Since the number of input bits is small, each of these “Link” units can be implemented as a static RAM look-up memory, as shown in Fig. 12.11. The latency, therefore, is just one beam crossing. The output code is a 9-bit word labelling the track segments used in each station (*e.g.* 3 bits for ME1, 2 bits each for ME2–ME4), and a 6-bit quality word giving the type and rank of the assembled track. It is possible, however, that a reasonable latency also can be achieved using FPGA track-assembly logic, so this option is kept open as well.
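The bit budget of one “Link” look-up memory can be checked with a short sketch. Only the bit counts (six 2-bit ME1 qualities, three 1-bit qualities for each of the two other non-key stations, 9 + 6 output bits) come from the text; the packing order and the function name `link_address` are assumptions made for illustration.

```python
# Sketch of the address formation for one "Link" look-up memory.
# Bit counts follow the text; the bit ordering is an ASSUMPTION.

ME1_SEGMENTS = 6        # six track segments allowed in ME1
ME1_QUALITY_BITS = 2    # 2-bit extrapolation quality to ME1
OTHER_SEGMENTS = 3      # three segments in each other non-key station
OTHER_QUALITY_BITS = 1  # 1-bit extrapolation quality to those stations


def link_address(me1_quality, station_a_ok, station_b_ok):
    """Pack the extrapolation results for one key segment into an address.

    me1_quality: six values in 0..3, one per ME1 segment.
    station_a_ok, station_b_ok: three values in 0..1 for each of the two
    remaining non-key stations.
    """
    assert len(me1_quality) == ME1_SEGMENTS
    assert len(station_a_ok) == OTHER_SEGMENTS
    assert len(station_b_ok) == OTHER_SEGMENTS
    address = 0
    for quality in me1_quality:
        assert 0 <= quality < (1 << ME1_QUALITY_BITS)
        address = (address << ME1_QUALITY_BITS) | quality
    for bit in list(station_a_ok) + list(station_b_ok):
        assert bit in (0, 1)
        address = (address << OTHER_QUALITY_BITS) | bit
    return address


ADDRESS_BITS = (ME1_SEGMENTS * ME1_QUALITY_BITS
                + 2 * OTHER_SEGMENTS * OTHER_QUALITY_BITS)
LUT_DEPTH = 1 << ADDRESS_BITS  # number of words the memory must hold
OUTPUT_BITS = 9 + 6            # segment labels plus quality word
```

With these counts the address is 18 bits wide, so a 256K-deep memory with a 16-bit data word is sufficient to hold the 15 output bits per entry.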

### 12.5.5 Final Selection Unit

The final selection logic combines the information from the Track Assembler streams, cancels redundant tracks, and selects the three best distinct tracks. For example, a muon which leaves track segments in all four CSC stations will be identified in both track assembler streams of the endcap since it has a track segment in each key station. The Final Selection Unit must interrogate the track segment labels from each combination of tracks from the two streams to determine whether one or more track segments are in common. If the number of common segments exceeds a preset threshold, the two tracks are considered identical and one should be canceled (presumably the lower rank combination, if the two tracks are not completely identical). Thus, the Final Selection Unit is a sorter with cancellation logic. It sorts and cancels 9 tracks down to 3 since there are two endcap data streams and one overlap data stream.

**Fig. 12.11:** The Track Assembler Unit implemented as 9 static RAM memories for the endcap and overlap region

A block diagram of the Final Selection Unit is shown in Fig. 12.12. The sorter part of the logic compares the qualities of all pairwise combinations of tracks from the Track Assembler streams. The cancellation part of the logic does the same for the hit labels. Not all track segments need to be identical for two tracks to be considered identical. Bremsstrahlung, for example, might cause a single muon to deliver two track segments in one station, and this would lead to a fake di-muon trigger which should be suppressed. The actual criterion employed should be programmable. The two comparison steps are done in parallel in one beam crossing. The next step of the logic, the Final Decision Unit, examines the results of all these comparisons and reports the identities of the three best and distinct muons. It also takes one beam crossing. Finally, the track segment information of the selected muons is taken from a multiplexer and transmitted in the next beam crossing to the Assignment Units of the measurement system. Additional logic connected to the multiplexer determines if all track segments of a given muon come from a later bunch crossing, in which case the muon is suppressed before going to the Assignment Unit. This inhibits one class of double triggers mentioned in Section 12.5.2.
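The behaviour of a sorter with cancellation logic can be illustrated in software. The sketch below is sequential, whereas the hardware performs all pairwise comparisons in parallel; the function name `select_tracks`, the dictionary layout, and the segment labels are hypothetical, and `max_shared` stands in for the programmable threshold on common segments.

```python
# Software sketch of "sort with ghost cancellation". The hardware
# compares all pairs in parallel; this sequential loop only aims to
# reproduce the outcome. All names are ASSUMPTIONS for illustration.

def select_tracks(tracks, max_shared=0, n_best=3):
    """Return up to n_best distinct tracks, best rank first.

    tracks: list of dicts with an integer 'rank' and a set 'segments'
            of (station, segment index) labels.
    max_shared: two tracks sharing more than this many segments are
            treated as the same muon and the lower-ranked one is dropped.
    """
    kept = []
    for track in sorted(tracks, key=lambda t: t["rank"], reverse=True):
        is_ghost = any(
            len(track["segments"] & other["segments"]) > max_shared
            for other in kept
        )
        if not is_ghost:
            kept.append(track)
        if len(kept) == n_best:
            break
    return kept


# One muon crossing all four stations is found by both endcap streams;
# a second, unrelated muon is found once.
candidates = [
    {"rank": 50, "segments": {("ME1", 0), ("ME2", 0), ("ME3", 0), ("ME4", 0)}},
    {"rank": 48, "segments": {("ME1", 0), ("ME2", 0), ("ME3", 0)}},
    {"rank": 30, "segments": {("ME1", 1), ("ME2", 1)}},
]
selected = select_tracks(candidates, max_shared=1)
selected_ranks = [t["rank"] for t in selected]
```

In this example the rank-48 candidate shares three segments with the rank-50 candidate and is cancelled, so only two distinct muons are reported.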



**Fig. 12.12:** Block diagram of the Final Selection Unit

### 12.5.6 Assignment Unit

The Sector Processor measures the momentum of the identified muons in the final stage of processing. This includes the $\phi$ and $\eta$ coordinates of the muon, the magnitude of the transverse momentum $p_T$, the sign of the muon, and an overall quality which we interpret as the uncertainty of the momentum measurement. The format of the data is specified in Table 12.4. In particular, $p_T$ and the track quality are combined into an overall rank before transmission to the Muon Sorter. The coordinates are to be reported at the second station, since this is convenient for later association with RPC trigger data in the Global Muon Trigger. This is also convenient for the Track-Finder because the muon track parameters do not need to be extrapolated back to the interaction point, which would be prone to errors.

**Table 12.4:** Information delivered from one SP to the Muon Sorter

| Variable | unit / precision              | range          | bits / muon | bits / 3 muons |
|----------|-------------------------------|----------------|-------------|----------------|
| $\phi$   | 2.5°                          | 0–60°          | 5           | 15             |
| $\eta$   | 0.075 $\eta$ unit             | 0.9–2.4        | 5           | 15             |
| Rank     | $p_T$ (nonlinear) and Quality | 2–140 GeV/ $c$ | 5+2         | 21             |
| Sign     | Muon sign                     | —              | 1           | 3              |
| BXN      | —                             | —              | —           | 4              |
| Error    | —                             | —              | —           | 1              |

The most important quantity to calculate accurately is the muon  $p_T$ , as this quantity has a direct impact on the trigger rate and on the efficiency. Simulations have shown [12.1] that the accuracy of the momentum measurement in the endcap using the displacement in  $\phi$  measured between two stations is about 30% at low momenta, when the first station is included. (It is worse than 70% without the first station.) We would like to improve this so as to have better control on the overall muon trigger rate, and the most promising technique is to use the  $\phi$  information from three stations when it is available. This should improve the resolution to at least 20% at low momenta, which is sufficient. (The best momentum resolution possible from an offline standalone muon measurement in the endcap is 15%, from Ref. [12.7].) We take advantage of the large multiple scattering for low  $p_T$  muons. Although there is a small probability that a scattering will offset the large magnetic bending between the first two stations (and thus appear as a high momentum muon), it is much less likely to offset the bending between all three stations.

In order to achieve a 3-station $p_T$ measurement, one must be careful not to include too much data; otherwise, the size of the look-up memories will be prohibitive. We have developed a scheme that uses the minimum number of bits necessary in the calculation. The first step is to do some pre-processing in FPGA logic: the difference in $\phi$ is calculated between the first two track segments of the muon, and between the second and third track segments when they exist. Only the essential bits are kept from the subtraction. For example, we do not need the same accuracy on the second subtraction because we are only trying to untangle the multiple scattering effect at low momenta. The subtraction results are combined with the $\eta$ coordinate of the track and the track type, and then sent into a megabyte-sized memory for assignment of the track rank ($p_T$ and quality) and sign. Tracks composed of only two track segments are also allowed in certain cases. This scheme is illustrated in Fig. 12.13 for the parameter assignment of one muon. Three such units are necessary for the three best muons selected by the Final Selection Unit. Since the first stage of the parameter assignment is done in an FPGA, additional logic may be added to cancel certain track classes when they occur near sector boundaries. This may help to reduce fake di-muon triggers.

**Fig. 12.13:** Block diagram of the Assignment Unit for one muon.

The 2-bit quality assigned to a muon reflects the uncertainty in the $p_T$ assignment. Specifically, the highest quality is assigned to tracks that have segments in 3 or 4 CSC stations, including ME1, since the best resolution is possible. Medium quality is assigned to tracks that have segments in only two stations, of which ME1 must be one. Finally, lowest quality is assigned to all tracks that do not include ME1, since only a very poor $p_T$ resolution is possible.
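The quality rule just described can be written as a small function. The three categories follow the text; the numerical codes (3, 2, 1, and 0 for "no valid track") and the name `muon_quality` are assumptions, and the sketch covers CSC stations only, not the MB1 case of the overlap region.

```python
# Sketch of the 2-bit quality assignment for endcap tracks.
# The numeric encoding is an ASSUMPTION; only the ordering
# (high / medium / low) is taken from the text.

def muon_quality(stations):
    """stations: set of CSC station names that contributed a segment."""
    if len(stations) < 2:
        return 0  # not a valid track (hypothetical code)
    if "ME1" not in stations:
        return 1  # lowest quality: poor p_T resolution without ME1
    if len(stations) >= 3:
        return 3  # highest quality: 3 or 4 stations including ME1
    return 2      # medium quality: two stations, one of them ME1
```

For example, `muon_quality({"ME1", "ME2", "ME3"})` falls in the highest category, while `muon_quality({"ME2", "ME3", "ME4"})` falls in the lowest despite having three stations.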

### 12.5.7 Hardware Implementation

The Sector Processor logic should be fully programmable, so FPGAs and SRAM should be used. Microprocessors and DSPs are too slow for L1. The hardest challenge for the hardware implementation is the design of the Extrapolation Units, which have a high I/O count and a large amount of logic. It is expected that the Extrapolation Unit logic will be implemented in high-density FPGAs such as the Virtex family from Xilinx. The Bunch Crossing Analyzer and global FIFO are expected to be implemented in more moderately-sized FPGAs. The Track Assembler Units, and the Assignment Unit, will be implemented in SRAM memory in conjunction with FPGA logic.

Approximately 500 signals will be received by one Sector Processor every bunch crossing. Thus, both connector space and board routing will be a challenge. The Sector Processor also contains several high-density FPGAs with a very large pin-count (presumably in a ball-grid array), which further complicates the board routing.

## 12.6 CSC Muon Sorter

A total of 12 SPs are needed for both endcap regions and the DT/CSC overlap region. Each SP outputs three muons, but only the four best muons from the CSC chambers are to be reported to the Global Muon Trigger (GMT), in ranked order. Thus the purpose of the CSC Muon Sorter (MS) is to select the four best muons out of up to 36 muons coming from the 12 SPs, and to transmit them to the GMT every 25 ns. The format of the data sent to the GMT is specified in Table 12.5.

**Table 12.5:** Information delivered from the CSC Muon Sorter to the GMT

| Variable | unit / precision  | range                         | bits / muon | bits / 4 muons |
|----------|-------------------|-------------------------------|-------------|----------------|
| $\phi$   | $2.5^\circ$       | $0\text{--}360^\circ$         | 8           | 32             |
| $\eta$   | $0.075 \eta$ unit | $0.9\text{--}2.4$             | 6           | 24             |
| $p_T$    | non-linear        | $2\text{--}140 \text{ GeV}/c$ | 5           | 20             |
| Quality  | –                 | –                             | 3           | 12             |
| Sign     | Muon sign         | –                             | 1           | 4              |
| BXN      | –                 | –                             | –           | 4              |
| Error    | –                 | –                             | –           | 1              |

### 12.6.1 Algorithm

The sorting is based on a 7-bit rank, which is provided by the SP. Higher ranks (i.e. larger 7-bit rank patterns) correspond to “better” muons for the purposes of sorting, so the MS selects the four muons with the largest rank and outputs them in descending order. The best muon should always be present on the first link to the GMT, the second best muon – on the second link to the GMT and so on. The rest of the bits belonging to each incoming muon are stored in pipeline logic until the sorting result is obtained.
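As a software reference for the selection just described, the sketch below returns the addresses of the four highest-ranked muons in descending order. Treating rank 0 as "no muon" and breaking ties in favour of the lower address are assumptions, as is the function name `four_best`.

```python
# Reference model of the Muon Sorter selection: pick the four largest
# 7-bit ranks out of up to 36 inputs and report their addresses.
# Rank 0 meaning "empty" and the tie-break rule are ASSUMPTIONS.

def four_best(ranks):
    """ranks[a] is the 7-bit rank of the muon at input address a (0..35)."""
    assert len(ranks) <= 36
    assert all(0 <= r < 128 for r in ranks)
    order = sorted(range(len(ranks)), key=lambda a: (-ranks[a], a))
    return [a for a in order[:4] if ranks[a] > 0]


ranks = [0] * 36
ranks[5], ranks[7], ranks[11], ranks[20], ranks[30] = 100, 100, 1, 127, 3
best_addresses = four_best(ranks)
```

The returned addresses fit in 6 bits each and would drive the multiplexers that route the remaining pipelined bits of each selected muon to the output links.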

Given the different demands upon the Muon Sorter, unlike the RPC project [4], we have chosen to implement the sorter algorithm in PLD logic and not an ASIC. Such a solution can provide a lot of flexibility and can be done faster than the current RPC ASIC. Also we can benefit from the rapid growth in PLD/FPGA technology. Our first sorter implementation is targeted to the 20KE Altera PLDs, the fastest available Altera PLD family. We also concentrate on a single chip solution for the sorting logic, which would provide the minimal latency and optimal board design.

All sorting schemes are based on multiple comparisons and data multiplexing. Different design approaches and schemes require different numbers of comparison steps and different numbers of comparisons at each step. Our main goal is to reduce the latency of sorting. We define the latency as the time interval between the latching of the input patterns into the sorter chip and the moment when the addresses of the selected patterns are available for latching by the external logic outside the sorter chip.

The block diagram of the sorter PLD which was designed and simulated is shown in Fig. 12.14. Two “4 out of 18” sorting steps are performed in parallel at the beginning of the sorting tree, followed by one “4 out of 8” sorting step. For $n$ input patterns the total number of comparisons between all patterns is $N=n(n-1)/2$. If $n=36$, then $N=630$, and if $n=18$, then $N=153$. The first step of our scheme therefore requires $2\times 153=306$ comparisons, and the second only 28.
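The comparison counts can be reproduced directly from $N=n(n-1)/2$, and a short check confirms that the two-step tree selects the same four ranks as a single 36-input sort. The helper names below are hypothetical, and the model ignores addresses and tie-breaking.

```python
import random

# Number of pairwise comparisons among n patterns: N = n(n-1)/2.
def pairwise_comparisons(n):
    return n * (n - 1) // 2


single_step = pairwise_comparisons(36)      # one 36-input sorter
first_step = 2 * pairwise_comparisons(18)   # two "4 out of 18" sorters
second_step = pairwise_comparisons(8)       # one "4 out of 8" sorter
two_step_total = first_step + second_step


def best4(ranks):
    """Four largest ranks in descending order."""
    return sorted(ranks, reverse=True)[:4]


def two_step_best4(ranks36):
    """Model of the sorting tree: 2 x (4 out of 18), then 4 out of 8."""
    return best4(best4(ranks36[:18]) + best4(ranks36[18:]))


# Spot check on arbitrary 7-bit ranks (fixed seed for repeatability).
rng = random.Random(12)
sample = [rng.randrange(128) for _ in range(36)]
same_result = two_step_best4(sample) == best4(sample)
```

The two-step tree needs 334 comparisons in total, roughly half of the 630 required by a single-step sort of all 36 inputs.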



**Fig. 12.14:** Block diagram of the sorter PLD. It receives 7 bits of rank for each of 36 muons found by the Sector Processors and selects the 4 highest rank. “FF” denotes a flip-flop, “LUT” denotes a look-up table RAM.

The sorting PLD contains input, output, and intermediate (not shown in Fig. 12.14) flip-flops (FF) for proper data pipelining in order to provide synchronous operation at 40 MHz. Our initial single-chip design is based on the Altera EPF20K200EFC484-1 PLD. The sorting latency is four clock cycles, or 100 ns. The sorting PLD outputs the 6-bit addresses of the first, second, third and fourth best muons. These addresses enable multiplexing of the pipelined muons to the sorter board outputs. One clock cycle later, the sorting logic outputs four 8-bit patterns (5-bit $p_T$ + 3-bit Quality) which correspond to the selected muons. The total number of sorter logic outputs is therefore [6-bit address (ADR) + 8-bit pattern (PAT)] $\times$ 4 = 56.

### 12.6.2 Hardware Implementation

As discussed in previous sections, the MS will accept trigger data from 12 separate Sector Processors. To match the current Sector Processor prototype, it should contain 12 input connectors and receive $60 \times 12 = 720$ input signals using $12 \times 4 = 48$ 16-bit input LVDS receivers. It should also contain four connectors and eight parallel LVDS transmitters to the GMT. Due to the large number of inputs and outputs, a single-chip solution for the whole sorter board is not feasible at the moment. We propose to use several PLDs for the sorting and pipelining of 36 muons. We intend to implement the sorting logic, the interface to the GMT, the VME control logic, and the interface to the Clock and Control Board (CCB) on a single 9U $\times$ 400 mm board. This board would need to carry four stacked mezzanine receiver boards, each of them consisting of the connectors, interface, and pipeline logic, for communication with three SPs of the current prototype design. This leads to a five-board Muon Sorter which we intend to locate in a separate 9U crate in the counting room.

However, if the Sector Receiver and the Sector Processor are combined onto one board using improved technology, it is possible to fit the entire CSC Track-Finder into one VME crate, including the Muon Sorter. In this case, a custom backplane will deliver the signals from the 12 Sector Processors to the Muon Sorter.

## 12.7 Synchronization and Latency

### 12.7.1 Synchronization Procedure

The Anode LCT (ALCT) is used to synchronize the trigger system. The Anode LCT can identify the correct bunch crossing with greater than 99% efficiency. Thus the BXN generated by the ALCT will be histogrammed and compared to the bunch crossing structure of the LHC beam. By using the repeating nature of the bunch structure it is estimated that a determination can be made in 25 minutes of running at $10^{32} \text{ cm}^{-2}\text{s}^{-1}$. Once that has been determined, each subsequent board in the chain counts the BXN for itself. By comparing with the BXNs from the prior boards it will be possible to determine the offsets for each board in the system.
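One way to compare a BXN histogram with the bunch structure is to search for the cyclic shift that puts the most entries into filled bunch slots. The sketch below illustrates that idea only; it is not claimed to be the procedure used by the CSC electronics. The function name, the toy 40-slot fill pattern, and the noise level are all assumptions (a real LHC orbit has 3564 bunch slots).

```python
# Sketch: find the offset between a board's local BXN counter and the
# machine bunch numbering by aligning a hit histogram with the fill
# pattern. Toy pattern and all names are ASSUMPTIONS for illustration.

def find_bxn_offset(histogram, fill_pattern):
    """Return the cyclic shift that best aligns histogram to the pattern.

    histogram[b]:    number of ALCTs recorded with local BXN b
    fill_pattern[b]: 1 if machine slot b holds colliding bunches, else 0
    """
    n = len(fill_pattern)
    assert len(histogram) == n
    filled = [b for b in range(n) if fill_pattern[b]]
    best_shift, best_score = 0, -1
    for shift in range(n):
        score = sum(histogram[(b + shift) % n] for b in filled)
        if score > best_score:
            best_shift, best_score = shift, score
    return best_shift


# Toy orbit: two bunch trains separated by gaps of different length,
# so that only one shift reproduces the pattern.
fill_pattern = [1] * 12 + [0] * 4 + [1] * 12 + [0] * 12
true_offset = 7
histogram = [0] * len(fill_pattern)
for slot, is_filled in enumerate(fill_pattern):
    # 100 hits in filled slots, 1 noise hit in empty slots
    histogram[(slot + true_offset) % len(fill_pattern)] = 100 if is_filled else 1

measured_offset = find_bxn_offset(histogram, fill_pattern)
```

The method relies on the gaps in the bunch structure being irregular; a perfectly periodic fill pattern would leave the offset ambiguous.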

### 12.7.2 Latency Determination

The estimated latency of the CSC Track-Finder is expected to be 26.5 bx, from the time data is available at the end of the optical fiber in the counting room (51.5 bx after the collision) until it is delivered to the Global Muon Trigger crate. Thus, the CSC trigger data is available at the Global Muon Trigger 78 bx after the collision. The accounting of this latency is shown in Table 12.6. This is not the latency presently achieved in the Track-Finder prototypes, which is documented in Section 12.10, but what is expected based on current technology. In particular, we expect to save 6.5 bx from the current SR and SP prototypes.

**Table 12.6:** Latency of the CSC Track-Finder

| Description                                     | bx this step | Total bx |
|-------------------------------------------------|--------------|----------|
| Delivery of optical signals to CSC Track-Finder | –            | 51.5     |
| SR optical receiving and synchronization        | 2            | 53.5     |
| SR Processing and transmission to SP            | 3.5          | 57       |
| SP processing                                   | 11           | 68       |
| SP to Muon Sorter transmission over 5m cable    | 2.5          | 70.5     |
| Muon Sorter processing                          | 4            | 74.5     |
| Muon Sorter to GMT transmission over 11m cable  | 3.5          | 78       |
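The running totals in Table 12.6 can be verified with a few lines of arithmetic. The step values are copied from the table; the conversion to nanoseconds uses the 25 ns bunch-crossing period, and the variable names are illustrative.

```python
# Cross-check of the latency budget in Table 12.6 (values in bx).
START_BX = 51.5  # optical signals delivered to the CSC Track-Finder
STEPS = [
    ("SR optical receiving and synchronization", 2.0),
    ("SR processing and transmission to SP", 3.5),
    ("SP processing", 11.0),
    ("SP to Muon Sorter transmission over 5 m cable", 2.5),
    ("Muon Sorter processing", 4.0),
    ("Muon Sorter to GMT transmission over 11 m cable", 3.5),
]
BX_NS = 25.0  # one bunch crossing in nanoseconds


def running_totals(start, steps):
    """Return (description, bx for this step, cumulative bx) rows."""
    total = start
    rows = []
    for description, bx in steps:
        total += bx
        rows.append((description, bx, total))
    return rows


rows = running_totals(START_BX, STEPS)
track_finder_latency_bx = sum(bx for _, bx in STEPS)
arrival_bx = rows[-1][2]
arrival_ns = arrival_bx * BX_NS
```

The steps sum to 26.5 bx for the Track-Finder itself and 78 bx after the collision for arrival at the Global Muon Trigger, in agreement with the text.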

## 12.8 System Monitoring

The Sector Receiver and Sector Processor will have VME-readable registers to log errors and other diagnostic information so that the system can be monitored through the VME bus. For example, if any board detects that a synchronization error has occurred, all subsequent trigger data will be flagged as “data not valid” and the error and bunch crossing will be logged in the VME registers.

The trigger data received and generated by the CSC Track-Finder for each event will be stored in FIFOs on the Sector Receiver and Sector Processor for subsequent readout by the DAQ if an L1A is issued. A dedicated link to a Front-End Driver of the DAQ is anticipated, since VME is too slow for the event readout. This trigger data will be used to monitor the performance and efficiency of the CSC Track-Finder from the offline data stream.

## 12.9 Simulation Results

The performance of the CSC Track-Finder has been simulated using the GEANT-based CMSIM and ORCA software packages. Most results are obtained using the ORCA4 framework, where GEANT hits were obtained using version 118 of CMSIM; however, some earlier studies (see Ref. [12.1]) of the $p_T$ resolution reported here were obtained using version 114 of CMSIM and a previous LCT simulation. Efficiency studies based on ORCA were carried out with a sample of positively and negatively charged single muons generated flat in the $\phi$ coordinate ($0 < \phi < 2\pi$ rad), flat in pseudorapidity ($|\eta| < 2.4$), and flat in $p_T$ ($5 < p_T < 100$ GeV/c). The trigger rate studies have been performed on the minimum bias samples described in Chapter 8.

An object-oriented description of the CSC Track-Finder has been written to exactly mimic the functionality of the current prototypes. In fact, this code has been used to validate the hardware, where perfect agreement has been achieved for 200,000 single muon events.

The  $p_T$  measurement in the Sector Processor is based on the sagitta of the track induced by the magnetic bending. The sagitta is determined from the difference in the azimuthal angle  $\phi$  of the track segments measured in different CSC stations. It is possible to determine the  $p_T$  of a track from the difference in  $\phi$  values measured in any pair of CSC stations at a given pseudorapidity. However, a more precise  $p_T$  measurement can be obtained if more of the track's sagitta is used; *i.e.* using track segments measured in three CSC stations. The techniques for the two-station and three-station  $p_T$  measurements used by the CSC Track-Finder are described in Ref. [12.1].

Figure 12.15 shows the resolution of  $p_T$  for the 2-station measurement as a function of  $\eta$  for several  $p_T$  values as determined in the earlier CMSIM 114 studies of Ref. [12.1]. The  $p_T$  was reconstructed from  $\Delta\phi$  measured between MB2/1 and ME1/3 for the overlap region, and between ME1 and ME2 in the endcap region. The resolution is defined to be the width of the residual distribution  $(1/p_T^{\text{meas}} - 1/p_T^{\text{true}}) \times p_T^{\text{true}}$ , where the measured  $p_T$  is reported at 50% efficiency. The resolution improves with decreasing  $p_T$ , and tends to stay at about 30% at low  $p_T$ . At high  $p_T$  the resolution is dominated by the width of the cathode strip, whereas at low  $p_T$  the resolution is dominated by multiple scattering. The resolution gets worse (~70% at low  $p_T$ ) if MB1 is excluded in the overlap region or ME1 is excluded in the endcap region.

**Fig. 12.15:** The resolution of $p_T$ for the 2-station measurement as a function of $\eta$ for several $p_T$ values. The $p_T$ was reconstructed from $\Delta\phi$ measured between MB2/1 and ME1/3 for the overlap region, and between ME1 and ME2 in the endcap region.

Figure 12.16 shows the resolution of the $p_T$ for the three-station measurement (ME1–ME2–ME3) as a function of $\eta$ compared to the two-station (ME1–ME2) measurement at $p_T=5 \text{ GeV}/c$. There are no results for the three-station $p_T$ measurement for $|\eta| < 1.2$ because this method of measurement was only implemented in the endcap region during this study. The figure shows that the three-station $p_T$ measurement provides a significant improvement in the $p_T$ resolution for low $p_T$ muons in the endcap region as compared to the two-station $p_T$ measurement.

These results are confirmed with the ORCA simulation. In particular, Fig. 12.17 shows the residual distribution of $(1/p_T^{\text{meas}} - 1/p_T^{\text{true}}) \times p_T^{\text{true}}$ for reconstructed muons with segments in at least two CSC stations, including ME1, with $5 < p_T^{\text{true}} < 50 \text{ GeV}/c$ and $1.2 < |\eta| < 2.0$. The distribution is centered at zero with a width of 29%.
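The residual used to define the resolution can be computed as follows. The sketch uses the standard deviation as a stand-in for the "width" of the distribution (the text does not say how the width was extracted), and the toy sample with a 29% Gaussian smearing of $1/p_T$ is generated here purely for illustration; it is not simulation output.

```python
import random
import statistics

# Residual of the inverse transverse momentum, as defined in the text:
# (1/pT_meas - 1/pT_true) * pT_true
def inverse_pt_residual(pt_meas, pt_true):
    return (1.0 / pt_meas - 1.0 / pt_true) * pt_true


def residual_mean_and_width(pairs):
    """pairs: iterable of (pt_meas, pt_true) in GeV/c.

    Returns (mean, standard deviation) of the residuals. Using the
    standard deviation as the width is an ASSUMPTION.
    """
    residuals = [inverse_pt_residual(m, t) for m, t in pairs]
    return statistics.mean(residuals), statistics.pstdev(residuals)


# Toy sample (NOT simulation data): smear 1/pT by a 29% Gaussian.
rng = random.Random(2000)
toy_pairs = []
for _ in range(20000):
    pt_true = rng.uniform(5.0, 50.0)
    smear = rng.gauss(0.0, 0.29)
    if abs(1.0 + smear) < 1e-9:
        continue  # avoid an infinite measured pT
    inv_pt_meas = (1.0 + smear) / pt_true
    toy_pairs.append((1.0 / inv_pt_meas, pt_true))

toy_mean, toy_width = residual_mean_and_width(toy_pairs)
```

Working in $1/p_T$ rather than $p_T$ keeps the residual approximately Gaussian, since the measured bending angle is proportional to $1/p_T$.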

**Fig. 12.16:** The resolution of $p_T$ for the 3-station measurement (ME1–ME2–ME3) as a function of $\eta$ compared to the 2-station (ME1–ME2) measurement at $p_T = 5 \text{ GeV}/c$.

The CSC trigger efficiency, as estimated from ORCA, for single muons as a function of $\eta$ is shown in Fig. 12.18 for two sets of criteria applied to tracks found by the CSC Track-Finder. The solid line shows the efficiency for tracks that have segments in at least two muon stations, of which one must be ME1 for the endcap region ($|\eta| > 1.2$) in order to maintain satisfactory $p_T$ resolution. Any two stations are allowed for the DT/CSC overlap region ($1.05 < |\eta| < 1.2$) to keep the efficiency high. In contrast to this loose set of conditions, the dashed line shows the efficiency for tracks when segments in at least three muon stations are required, including ME1 in the endcap region and MB1 in the DT/CSC overlap, so that the best $p_T$ resolution is achieved. The efficiency is reduced in this case, particularly in the DT/CSC overlap, but maximum background rejection will be achieved. The configuration used in the trigger is a trade-off between efficiency and increased rate from low quality tracks. At high luminosity, we expect that three stations will be required if the CSC system is used standalone, without coincidence with the RPC system. It should be noted that in this efficiency plot, and in those that follow, all four CSC stations (ME1–ME4) are assumed to be present.

**Fig. 12.17:** Residual distribution of the inverse transverse momentum measured by the CSC Track-Finder for single muons generated flat in $5 < p_T < 50 \text{ GeV}/c$ and $1.2 < |\eta| < 2.0$. A track segment in ME1 is required.

**Fig. 12.18:** The CSC Track-Finder efficiency as a function of $\eta$ for single muons. The solid line corresponds to loose requirements on the track quality, the dashed line corresponds to tight requirements.

The CSC trigger efficiency for single muons as a function of $\phi$ is shown in Fig. 12.19. The solid line corresponds to the loose set of track requirements just described, and the dashed line corresponds to the tighter set. The overall CSC trigger efficiency is 92% for muons generated in the range $1.05 < |\eta| < 2.4$ using the loose set of criteria, and is flat in $\phi$. For the tight track criteria, the efficiency drops to 70% because of the geometric holes in the $\eta$ coverage seen in Fig. 12.18.



**Fig. 12.19:** The CSC Track-Finder efficiency as a function of  $\phi$  for single muons. The solid line corresponds to loose requirements on the track quality, the dashed line corresponds to tight requirements.

Figure 12.20 shows the trigger efficiency turn-on curves as a function of the true $p_T$ for several trigger thresholds (defined at 90% efficiency) for the loose set of track criteria in the endcap region. The sample of single muons used for the study is generated flat in pseudorapidity, $1.2 < |\eta| < 2.4$, so the $p_T$ resolution is an average over this interval.

The CSC trigger rate has been studied using the samples of minimum bias events described in Chapter 8. In particular, an LHC luminosity of  $10^{34} \text{cm}^{-2}\text{s}^{-1}$  is assumed, which implies 17.3 minimum bias events are piled-up on average every beam crossing. Figure 12.21 shows the single muon trigger rate from the CSC trigger as a function of the  $p_T$  threshold applied (defined at 90% efficiency according to the binning shown in Table 14.1) for the two sets of requirements on the track quality shown in the efficiency plots. The solid line corresponds to the loose set of criteria, which yields good trigger efficiency but high rate ( $>10$  kHz for any threshold). This rate can be reduced to acceptable levels by the Global Muon Trigger, however, when the CSC tracks are combined with RPC tracks, as discussed in Chapter 14. On the other hand, the CSC trigger alone can reduce the rate to acceptable levels when the requirement on the tracks reported by the CSC Track-Finder is tightened, as shown by the dashed line in Fig. 12.21. A single muon rate of about 5 kHz is achieved when the single muon threshold is set to 25 GeV/c because of the improved  $p_T$  resolution of high quality tracks, at the expense of some efficiency loss.
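The quoted pile-up of 17.3 events per crossing can be reproduced with a simple estimate. The luminosity and the 25 ns crossing period are taken from the text; the inelastic cross section of 55 mb and the filling scheme of 2835 bunches out of 3564 slots are assumptions chosen here because they reproduce the quoted value, and the text does not state which inputs were actually used.

```python
# Back-of-envelope estimate of the mean number of minimum bias events
# per filled bunch crossing. Cross section and filling scheme are
# ASSUMPTIONS; only the luminosity and 25 ns spacing are from the text.

LUMINOSITY = 1.0e34       # cm^-2 s^-1
BX_SPACING_S = 25.0e-9    # s
SIGMA_INELASTIC_MB = 55.0 # ASSUMED inelastic pp cross section, mb
MB_TO_CM2 = 1.0e-27       # 1 mb = 1e-27 cm^2
ORBIT_SLOTS = 3564        # 25 ns slots per LHC orbit
FILLED_BUNCHES = 2835     # ASSUMED number of colliding bunches

interaction_rate_hz = LUMINOSITY * SIGMA_INELASTIC_MB * MB_TO_CM2
events_per_slot = interaction_rate_hz * BX_SPACING_S
events_per_filled_crossing = events_per_slot * ORBIT_SLOTS / FILLED_BUNCHES
```

Because the interactions are concentrated in the filled bunches, the average per filled crossing is larger than the average over all 25 ns slots.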



**Fig. 12.20:** CSC trigger efficiency turn-on curves as a function of the true $p_T$ for several trigger thresholds. The $p_T$ thresholds are defined at 90% efficiency.

## 12.10 Prototypes and Tests

### 12.10.1 Sector Receiver

In order to test the Sector Processor with a full complement of inputs, three prototype Sector Receivers for the CSC Track-Finder have been built and tested. A photograph of one of them is shown in Fig. 12.22. The 12 optical receiver blocks are visible on the left followed (left to right) by de-serializers, the 6 Front FPGAs, two columns of 16 memory LUTs with buffers in between, 6 Back FPGAs, and more buffers and Channel-Link drivers. The VME interface FPGA is in the upper right corner. Front-panel LEDs display the status of a number of internal and VME bits. At the top of the board are Xilinx EEPROMs which load all FPGAs upon power-up. The board has 10 layers and about 9400 vias.

**Fig. 12.21:** Single muon trigger rate from the CSC system as a function of the $p_T$ threshold (defined at 90% efficiency) for a luminosity of $10^{34} \text{ cm}^{-2}\text{s}^{-1}$. The solid line corresponds to loose requirements on the track quality, the dashed line corresponds to tight requirements.

The (thirteen) Front, Back, and VME Interface FPGAs were implemented in Xilinx Virtex FPGAs, part number XCV50-6BG256C. The memory LUTs were implemented in 36 identical 256Kx16 synchronous static RAMs, GSI part number GS74116TP. The use of SRAM allows the output of the second-stage RAMs to appear as soon as the inputs to the first-stage RAM propagate freely through the chips. Two of the SR prototypes were built with 8-ns RAM, the fastest available, to guarantee that the entire propagation through the two chips could take place within one cycle at 40 MHz. The third prototype was built with 10-ns RAM, which has worked equally well in all tests thus far. Further tests will determine if the (cheaper, more readily available) 10-ns RAM has a sufficient safety margin for reliability in production boards.

**Fig. 12.22:** Picture of the Sector Receiver prototype board

For tests involving the Sector Processor, ORCA-simulated CMS data and corresponding LUT contents are used. Thus, many of the LUTs contain the same contents, and the LUT addresses checked reflect the population of tracks in CMS. For further testing of a stand-alone Sector Receiver, random numbers are used for all LUT contents and event data, allowing more stringent tests of the addressing. Events are loaded into the Front FPGA in groups of 252, passed through the SR at 40 MHz, and read out of the storage buffer in the Back FPGA. With random data, a typical test run of 100,000 cycles of 252 events per cycle showed perfect agreement between the expected and observed data, for all three SR prototypes. In some longer trials, we have recently observed some rare discrepancies which are under investigation.

### 12.10.2 Sector Processor

A prototype Sector Processor for the CSC Track-Finder has been completed and tested, and first results are reported in Ref. [12.8]. Differences with respect to the design described in this Chapter have to do with the specification of the DT interface. In particular, the prototype design accommodates track segments from both MB2/1 *and* MB2/2 chambers (rather than MB2/1 only), and up to two track segments per chamber can be delivered on the same bunch crossing (rather than serialized across two bunch crossings).

The prototype receives its input from a custom point-to-point backplane operating at 280 MHz. The signals are transmitted and received using Channel-Link LVDS from National Semiconductor. The hardware implementation of the Sector Processor trigger logic is listed below.

**Bunch Crossing Analyzer:** The logic of the Bunch Crossing Analyzer is partitioned across 7 moderately-sized FPGAs from the Xilinx Virtex series (XCV50-6BG256C).

**Extrapolation Units:** The extrapolation logic, as well as the global FIFO which stores the information for the Assignment Unit, occupies 4 large Xilinx Virtex FPGAs (XCV400-6BG560C).

**Track Assembler Units:** As discussed in the text, the Track Assembler Units are realized with nine 256Kx16 SRAM memory chips from Integrated Device Technology.

**Final Selection Unit:** The Final Selection Unit is implemented in one Xilinx Virtex FPGA (XCV150-6BG352C).

**Assignment Unit:** The Assignment Unit is implemented in three Xilinx Virtex FPGAs (XCV50-6BG256C) and three 2Mx8 SRAM memory chips from Toshiba.

In addition to the fast trigger logic, a Xilinx Virtex FPGA makes up the VME interface for the board, and a parallel-to-serial interface chip (SCANPSC100F) from National Semiconductor makes up the JTAG interface. A front panel connector provides the SP output in parallel LVDS.

In total, 17 Xilinx FPGAs with a ball-grid array footprint, 12 memory chips, and 25 32-bit buffers were used. The entire board was routed using 12 layers and approximately 10,000 vias. A picture of the prototype is shown in Fig. 12.23.



**Fig. 12.23:** Picture of the Sector Processor prototype board

### 12.10.3 Track-Finder Crate Test

The CSC Track-Finder prototypes (SR, SP, and CCB), as well as the MPC (described in Chapter 11) which sends CSC trigger primitives, underwent crate tests during the summer and fall of 2000. Figure 12.24 shows the arrangement of the prototypes in a single 9U VME crate. All critical functions of the boards were tested. The CCB prototype was used to distribute clock and control signals to each board. In particular, the CCB issues a BC0 to initiate the start of a chain test between two or more prototypes.



**Fig. 12.24:** Photograph of the CSC trigger prototypes undergoing system tests in a 9U VME crate. From right to left: two Muon Port Cards, the Clock and Control Board, a Sector Receiver, the Sector Processor, and two other Sector Receivers. The Port Cards send trigger primitives to the Sector Receivers via optical cables, and the Sector Receiver and Sector Processor communicate through a custom Channel-Link backplane. Data was successfully transmitted and processed in this chain test with no errors.

**VME Interface:** Downloading of FPGA programs, LUT contents, and test data was achieved with a PCI to VME interface (Bit3 Model 917 from SBS Technology) connected to a PC. Each board contained an FPGA for the VME interface that was automatically loaded on power-up. The data for the rest of the FPGA JTAG chain was sent in parallel over the VME bus and de-serialized at up to 25 MHz on each board.

**SR Functionality:** The data conversion FPGAs and memories of the Sector Receiver have been successfully tested dynamically at 40 MHz using pseudo-random input data as well as simulated muon data from ORCA. Perfect agreement was achieved between the hardware and simulation for 30,000 cycles of 256 bx. The latency of the FPGA and SRAM conversion is 2 bx, excluding de-serialization/serialization on the input/output. In particular, data are successfully sent through two consecutive SRAMs within 1 bx.

**SP Functionality:** All the track-finding algorithms of the Sector Processor have been tested dynamically at 40 MHz, except for the Bunch Crossing Analyzer which was configured as an input FIFO for the tests. In particular, perfect agreement was achieved between the ORCA simulation and the SP hardware for about 200,000 single muon events. Moreover, agreement also was achieved when the same events were piled-up three at a time to mimic triple-muon events, which is a stronger test of the track assembly and sorting functions on the board. Each sub-processor on the board was tested separately and in concert. The maximum clock frequency of the extrapolation and track assembly logic was measured to be 63 MHz. The latency of the SP prototype is determined to be 15 bx, excluding the Channel-Link de-serialization.

**MPC to SR Optical Communication and Data-Flow Tests:** After an SR passed its stand-alone tests, a Muon Port Card was connected to it with 100-m optical fibers driven and received by HP GLinks. Unsorted tracks were loaded into a buffer at the front of the MPC. These were clocked through the MPC, and the sorted highest-priority tracks were sent on to the SR and through it to the Back FPGA, where they were recorded. In October 2000, success was first achieved in a test in which 1.6 million random events were processed without error before the test was stopped. Since then, these boards and a second MPC have been tested together in a different crate. No errors were encountered using random data or simulated muon data in the tests. The overall latency from the input of the MPC to the input of the SR is 28 bx, broken down as follows: 5 bx for MPC processing, 1 bx for HP GLink serialization, 20 bx for transmission through 100 m of optical fiber, and 2 bx for HP GLink de-serialization.
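The latency breakdown quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes a nominal 25 ns bunch-crossing period (40 MHz clock) and a typical fiber propagation delay of about 5 ns/m, neither of which is stated in this paragraph.

```python
# Latency budget from MPC input to SR input, in bunch crossings (bx),
# using the breakdown quoted in the text.
BX_NS = 25.0  # ASSUMED nominal bunch-crossing period in ns (40 MHz clock)

mpc_to_sr_bx = {
    "MPC processing": 5,
    "HP GLink serialization": 1,
    "100 m optical fiber": 20,
    "HP GLink de-serialization": 2,
}

total_bx = sum(mpc_to_sr_bx.values())
total_ns = total_bx * BX_NS

# Cross-check of the fiber term, ASSUMING ~5 ns/m propagation delay in fiber
# (a typical value, not taken from the text).
FIBER_NS_PER_M = 5.0
fiber_bx = 100 * FIBER_NS_PER_M / BX_NS

print(total_bx, total_ns, fiber_bx)
```

Under these assumptions the fiber term alone accounts for most of the 28 bx, so the latency of this path scales almost directly with the fiber length.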

**SR to SP Backplane Communication:** The integrity of the data sent from two SR prototypes (corresponding to stations ME2–ME4) to the SP prototype through the custom Channel-Link backplane operating at 280 MHz has been verified. A FIFO in either the input or output of each SR can be loaded with data 256 bx deep. The overall latency from the input of the SR to the output of the SP is 20 bx, broken down as follows: 2 bx for SR processing, 4 bx for Channel-Link serialization/de-serialization and transmission over the custom backplane, and 15 bx for SP processing.

**MPC to SR to SP Chain Test:** The entire trigger path from the MPC to the SP has been tested successfully using simulated muons from ORCA as input. Two MPCs representing ME2 and ME3 of one trigger sector sent data to one SR, which in turn re-formatted and transmitted the data to one SP. The SP successfully reconstructed tracks in agreement with the ORCA simulation.

## 12.11 Maintenance and Operation

We plan to have spare boards and components sufficient for 10 years of LHC operations. Each board will have a VME-readable register that labels the particular trigger configuration that was loaded into the board. The entire set of LUT contents and FPGA programs will be given one unique identifier. There are many FPGAs and LUTs in the design, so it is important to verify that the correct patterns were loaded. We do not foresee generating LUT contents on the boards from a VME-addressable register. Rather, we will have precompiled FPGA programs (and LUT contents) that are certified to work properly and that meet all timing specifications.

## 12.12 Status and Schedule

Prototypes of all boards for the CSC Track-Finder, except the CSC Muon Sorter, were constructed and tested by October 2000. The schedule for the CSC trigger development is shown in Fig. 11.29. Final design of all boards should be finished by mid-2002. Final production is expected to be completed by the end of 2003, with full system testing beginning in 2004.

### References

- [12.1] S.M. Wang and D. Acosta, “Simulation Studies on the Transverse Momentum Resolution of the CSC Track-Finder”, CMS IN 2000/026.
- [12.2] G.M. Dallavalle *et al.*, “Issues Related to the Separation of the Barrel and Endcap Muon Trigger Track-Finders”, CMS IN 1999/015.
- [12.3] D. Acosta *et al.*, “Coverage of the DT/CSC Overlap Region by the Level-1 Track-Finders”, CMS Note in preparation.
- [12.4] J. Christiansen, A. Marchioro, P. Moreira, TTCrx Reference Manual, RD12 Working Document.
- [12.5] D. Acosta *et al.*, “The Track-Finding Processor for the Level-1 Trigger of the CMS Endcap Muon System”, CMS Note 1999/060.
- [12.6] D. Acosta *et al.*, “The Track-Finder Trigger for the Level-1 Trigger of the CMS Endcap Muon System”, Proceedings of the Fifth Workshop on Electronics for LHC Experiments (LEB’99), Snowmass, CERN 99-09, CERN/LHCC/99-33.
- [12.7] CMS, “The Muon Project”, Technical Design Report, CERN/LHCC 97-32.
- [12.8] S.M. Wang *et al.*, “A Prototype Track-Finding Processor for the Level-1 Trigger of the CMS Endcap Muon System”, Proceedings of the Meeting of the Division of Particles and Fields of the American Physical Society (DPF2000), Columbus, to be published in the International Journal of Modern Physics A.



# 13 RPC Trigger

## 13.1 Requirements and design considerations

The RPC trigger system is required to identify and locate high-$p_T$ muons in both the barrel and endcaps. The requirements on RPC physics performance are summarized in Chapter 8. The main technical requirement is that the system deliver to the Global Muon Trigger (GMT) up to four muon candidates from the barrel and up to four from the two endcaps, for every LHC bunch-crossing. The maximum latency allowed for this operation is 81 bx at the GMT input. The RPC chambers are dedicated to triggering, and so the trigger electronics must also facilitate the readout of the detector data via the CMS DAQ system.

## 13.2 System Overview

The RPC Pattern Comparator Trigger (PACT) is based on the spatial and temporal coincidence of hits in four RPC muon stations. We shall call such a coincidence a candidate track or a hit pattern. Because of energy loss fluctuations and multiple scattering, there are many possible hit patterns in the RPC muon stations for a muon track of definite transverse momentum emitted in a certain direction. Therefore, the PACT should recognize many spatial patterns of hits for a given transverse momentum muon. In order to trigger on a particular hit pattern left in the RPCs by a muon, the PACT electronics performs two functions: it requires a time coincidence of hits in at least 3 out of 4 muon stations in a given direction (a so-called “3/4 candidate”, of lower quality than a 4/4), and it assigns a $p_T$ value. The coincidence gives the bunch crossing assignment of the candidate track. The $p_T$ value is obtained by matching the spatial distribution of these hits with one of many possible pre-defined patterns for muons with defined transverse momenta. The pre-defined patterns of hits have to be mutually exclusive, i.e. a pattern should have a unique transverse momentum assignment. The patterns are divided into classes with a transverse momentum value assigned to each of them. PACT is a trigger which recognizes muons of transverse momentum greater than some $p_T^{\text{cut}}$; it gives a momentum code if an actual hit pattern is straighter than any of the pre-defined patterns with a lower momentum code.

Generally, the patterns depend on the direction of a muon, i.e. on $\phi$ and $\eta$. The segmentation of the RPC Muon Trigger in pseudorapidity is shown in Fig. 13.1. It was optimized for the best possible performance, i.e. high efficiency and acceptable trigger rates within the constraints from the detector geometry. There are 33 trigger towers in $\eta$, each of which is subdivided in azimuth into 144 logical units called segments. Details of the simulated performance of the RPC system are given in Section 13.7.
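The pattern comparison described above can be sketched in software as follows. This is a minimal illustration, not the PAC implementation: the pattern table is invented for the example, and the rule for choosing between several matching patterns (higher coincidence quality first, then higher $p_T$ code) is an assumption.

```python
from typing import Optional

# HYPOTHETICAL pattern table: (strip in stations 1..4) -> pT code.
# Real PACT patterns are derived from simulation; these are made up.
PATTERNS = [
    ((5, 5, 5, 5), 3),  # straight track: highest pT code
    ((4, 5, 5, 6), 2),  # slightly bent track
    ((3, 4, 6, 7), 1),  # strongly bent track: lowest pT code
]

def match(hits: list[set[int]]) -> Optional[tuple[int, int]]:
    """Return (stations matched, pT code) of the best pattern, or None.

    hits[i] is the set of fired strips in station i for one bunch crossing.
    """
    best = None
    for strips, pt_code in PATTERNS:
        n = sum(strip in fired for strip, fired in zip(strips, hits))
        if n >= 3:  # 3/4 or 4/4 coincidence
            candidate = (n, pt_code)
            # ASSUMPTION: prefer higher quality, then higher pT code.
            if best is None or candidate > best:
                best = candidate
    return best

print(match([{5}, {5}, {5}, {5}]))    # 4/4 candidate
print(match([{4}, {5}, set(), {6}]))  # 3/4 candidate, station 3 missing
```

In hardware all patterns are compared in parallel within one bunch crossing; the loop here only reproduces the logic, not the timing.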

The aim of RPC trigger electronics is to deliver to the Global Muon Trigger 4 high- $p_T$  muon candidates from the barrel and a total of 4 from the two endcaps combined. Because the RPCs are used only for triggering, the trigger electronics has to provide a normal readout of the chamber information as well, and deliver the data to the CMS DAQ system.

The system is designed in such a way as to provide the necessary diagnostics and calibration tools (see Section 13.9). A large degree of flexibility is provided in the system in order to guarantee safety in case of, e.g., unexpectedly high backgrounds or partial failures. This is achieved through an extensive use of programmable logic (FPGAs, ASICs) and lookup tables (LUTs).



**Fig. 13.1:** Segmentation of the RPC trigger into towers in  $\eta$

Figure 13.2 shows the functional layout of the RPC trigger system. The individual components are described in detail in the following Sections. The system can be conceptually divided into two parts:

- The Optical Communication System (OCS)
- The main trigger electronics, in the underground counting room.

The input - information from the RPC - is processed and delivered to the trigger system by Front End Boards (FEBs). These are part of the CMS Muon System and their properties are described in the Muon Technical Design Report [13.1]. For completeness, a summary of the functionality is given in Section 13.3 and Subsection 13.9.2 of this document.

The OCS (see Section 13.4) is responsible for bunch-crossing assignment of RPC data, and for the transfer of data to the trigger electronics. RPC data from the FEBs are sent over a distance of typically 5 m to Link Boards (LBs), mounted on the detector. There, the data are synchronized with a TTC clock, providing bunch-crossing identification. The data are then compressed and time-multiplexed in order to reduce the number of optical links. The data are transmitted to the underground counting room, approximately 90 m distant, and fanned out to a number of Trigger Crates. The LBs are the only components of the RPC trigger electronics which must be radiation tolerant. The choice of components on the LBs is dictated by irradiation test results (see Section 13.10).



**Fig. 13.2:** General layout of the RPC Muon Trigger system

The main trigger electronics is divided into Trigger Crates (TCs), which in turn contain Trigger Boards (TBs), Readout Boards (RBs), Timing Distribution Boards and VME crate controllers. The TCs also house the sorting and ghostbusting system, implemented partially on the TC backplanes and partially on separate Sorter Cards. There is one TC for each trigger tower in  $\eta$ . Muon identification is performed by Pattern Comparator (PAC) ASICs on the TBs; the identification algorithms are described in Chapter 8, and the implementation in Section 13.5. The resulting muon candidates are processed in order to remove ghost tracks, and sorted in order to reduce the number of candidates presented to the GMT; these parts of the system are explained in Section 13.6.

## 13.3 RPC Front End

### 13.3.1 Overview

The RPC Front End electronics consists of about 12000 Front End Boards (FEB) handling 16 channels each. The general scheme of the Front End Control system and the data transmission interface to an optical link is shown in Fig. 13.3. Each FEB contains two Front End Chips (FEC). The FEC is an 8-channel ASIC, consisting of amplifier, discriminator, monostable and differential line driver.

The RPC readout strips are connected to FEBs by kapton foils, whose shapes are optimized to obtain the right channel characteristic impedance and to have the same arrival time for all channels on the FE input. In this way the same FEB design can be used for all chambers, and only the shape of the kapton foils will differ between chambers. To simplify things we have decided to keep the same number of strips for each FEB belonging to the same layer, requiring that this number be less than or equal to 16. This is obtained provided that the FEB has a minimum width. The implemented FEB has a size of 110.6 × 228.6 mm<sup>2</sup>.

A detailed study has been done for each RPC type, allowing us to reach a good compromise between complexity and cost, while still preserving good trigger performance [13.8].



**Fig. 13.3:** General RPC Front End layout

The RPC parameters to be controlled consist of two branches:

1. RPC Detector Control (RPC-DC). This includes HV and LV control for the chamber and FEB, setting and checking of discriminator thresholds on the FEB, and temperature and pressure control.
2. RPC Data Transfer Control (RPC-DTC). This controls different elements of the trigger data path used in Muon Trigger diagnostics, including FEC test pulses, LV control for the Link Board, Synchronization Unit control, LMUX test pattern control, and test histograms of various units.

While the first group of parameters will be controlled directly on the FEB, the second will be set and verified at Link Board level. Two different I2C network interface channels are foreseen to control the RPC-DC branch. The I2C network has its own LV power supply, independent of the Front End electronics LV supply.

Details of the Optical Communication System are given in [13.9] and are summarized in Section 13.4.

### 13.3.2 RPC Electrical Characteristics

The RPC detectors are gaseous detectors that will be used in the CMS experiment for the Muon Trigger system. They combine good spatial resolution with an excellent time resolution, which means that the RPC Muon Trigger can identify a muon track, measure its transverse momentum and relate it to the correct bunch crossing. The RPC proposed for CMS consists of two 2-mm gaps with common pick-up readout strips in the middle and will be operated in avalanche mode, for sustaining event rates up to 1000 Hz/cm<sup>2</sup>.

The chambers will be filled with a freon-based gas mixture, having an electron drift speed of the order of 130 μm/ns. The shape of the current signal induced by a single cluster is described by the function

$$I(t) = I_0 e^{-t/\tau}$$

where  $0 < t < 15$  ns, and  $\tau$  (gas time constant)  $\sim 1$  ns at the nominal working point of the detector. This can also be considered a good approximation of the real signal, since almost the whole induced current originates from the first two clusters.

In the barrel RPC, the current signal comes from a strip-line 1.3 m long. The characteristic impedance  $R_0$  of a strip, for an RPC with 2-mm double-gap geometry and a strip width ranging from 2 to 4 cm, ranges from 40 to 15 Ω. The corresponding strip capacitance ranges between  $\sim 160$  pF and  $\sim 420$  pF. The propagation delay is  $\sim 5.5$  ns/m.

The rise time of the induced signal is about 1 ns, shorter than the propagation delay along the strip. The strip must therefore be treated as a transmission line and properly terminated at both ends. If the termination is poor, reflected signals can increase occupancy. Further, if the termination resistance is greater than $R_0$, a fraction of the input charge is reflected and lost so that the effective threshold is increased. Each strip is terminated at one end by the input impedance of the preamplifier, and on the other by an ohmic termination. This is a less expensive option and consumes less power than other, active types of termination, at the cost of a small increase in noise.
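The strip parameters quoted above can be cross-checked against each other with the standard lossless transmission-line relations: the characteristic impedance equals the total propagation delay divided by the total capacitance, and the voltage reflection coefficient at a resistive termination $R_T$ is $(R_T - R_0)/(R_T + R_0)$. The sketch below applies them to the numbers in the text; treating the strip as an ideal lossless line is an assumption, and the function names are chosen for this illustration only.

```python
def z0_from_delay(delay_ns_per_m: float, length_m: float, cap_pf: float) -> float:
    """Lossless-line impedance in ohms: total delay / total capacitance."""
    return (delay_ns_per_m * length_m * 1e-9) / (cap_pf * 1e-12)

def reflection(r_term: float, r0: float) -> float:
    """Voltage reflection coefficient at a resistive termination."""
    return (r_term - r0) / (r_term + r0)

# Values from the text: 1.3 m strip, 5.5 ns/m, capacitance 160 pF to 420 pF.
z_narrow = z0_from_delay(5.5, 1.3, 160.0)  # text quotes about 40 ohm
z_wide = z0_from_delay(5.5, 1.3, 420.0)    # text quotes about 15 ohm

print(round(z_narrow, 1), round(z_wide, 1))
print(reflection(40.0, 40.0), reflection(60.0, 40.0))
```

The impedances obtained this way lie within roughly 10 to 15% of the quoted 40 Ω and 15 Ω, which is the level of agreement one would expect from approximate capacitance values.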

Since the termination resistance has a small and variable value, AC coupling is required at the amplifier input. Simulations and past experience show that if the threshold is set at around 20 fC, the detector is fully efficient with small streamer probability. This means that a noise  $< 4$  fC is tolerable. In an RPC working in avalanche mode, the avalanche pulse is often accompanied by an after-pulse with a delay ranging from zero to some tens of nanoseconds: any possible second trigger must be masked. Therefore the discriminator must be followed by a one-shot, and the length of the pulse should be chosen based on the trade-off between the possibility of a second trigger and dead time. A length of 100 ns, giving a dead time of 2%, has been considered a good compromise.
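The trade-off between pulse length and dead time can be illustrated with a short sketch. It assumes the simple estimate that the dead fraction equals the hit rate times the pulse length, which holds only for small fractions. The per-channel rate is not given in the text, so the value computed below is merely the rate implied by the quoted 2% and 100 ns under that assumption.

```python
PULSE_NS = 100.0     # one-shot length from the text
THRESHOLD_FC = 20.0  # discriminator threshold from the text
NOISE_MAX_FC = 4.0   # tolerable noise from the text

def dead_fraction(rate_hz: float, pulse_ns: float = PULSE_NS) -> float:
    """Fraction of time a channel is masked (ASSUMES rate * pulse << 1)."""
    return rate_hz * pulse_ns * 1e-9

# Per-channel rate implied by a 2% dead time (derived, not quoted in the text).
implied_rate_hz = 0.02 / (PULSE_NS * 1e-9)

# Threshold expressed in units of the tolerable noise.
threshold_over_noise = THRESHOLD_FC / NOISE_MAX_FC

print(round(implied_rate_hz), dead_fraction(implied_rate_hz), threshold_over_noise)
```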

### 13.3.3 Front End Chip Characteristics

The FEC has been designed and manufactured in AMS 0.8 μm BiCMOS technology. It has eight identical channels, each consisting of an amplifier, a zero crossing discriminator, a one-shot monostable and an LVDS driver. Some measured characteristics of the FEC circuit are presented in [13.5].

The FEC measurements were made in order to establish the parameters of the analog chain. There are two main parameters of interest: the charge sensitivity of the amplifier, measured to be 2 mV/fC for input charges below 100 fC, and the timing spread of the zero crossing discriminator (ZCD). The latter is of primary importance for bx determination. As shown in Fig. 13.4, it is found to be below 1 ns and nearly independent of the amplitude of the input pulse.



**Fig. 13.4:** ZCD output jitter

### 13.3.4 FEB Functional Description

The Front End Board (FEB) is directly connected to the RPC and contains 16 channels of RPC front end electronics. The FEB block diagram is shown in Fig. 13.5. The main components are two FECs of 8 channels each, one DAC (AD5301) for threshold settings, an ADC (AD7416) to read back the analog threshold settings, a temperature sensor, an LVDS receiver for test patterns, various I2C components providing a DSC interface, and voltage regulators.

Each FEC device has four test inputs, one for every two readout channels. For simplicity, however, we intend to use only one test input per four channels, connecting 2 test inputs together at board level. In this way we have to inject only a four-bit test pattern, using only one quad LVDS receiver per board. The test inputs will be used to check the channel connectivity and to monitor dead channels. Additionally, procedures will be established to measure the delay between the FEB and Link Board and so to use the right clock phase for synchronization.

The threshold input will fix the equivalent charge threshold value applied to the discriminator, ranging between 10 fC and a maximum of 300 fC.



**Fig. 13.5:** Front End Board block diagram

## 13.4 Optical Communication System

### 13.4.1 Overview

The CMS RPC muon detector system has almost 200,000 channels, but a rather low occupancy. Significant savings on the number of high speed data links needed are possible using data compression and multiplexing.

As illustrated below in Fig. 13.6, data communication from the on-detector electronics to the counting room is organized as follows. Several FEBs are connected to a Link Board (LB), which contains synchronization and data compression functions and the optical link transmitter. In the counting room, data from one link has to be split to several destinations. A Trigger Board (TB) forwards data to the Level-1 trigger, while Readout Boards (RB) handle pipeline storage for the DAQ. Dedicated FPGA designs have been developed for the data transmission system. The compression on the LB is handled by the LMUX chip, while the demultiplexing and decompression are performed on the TB and RB in the LDEMUX chip. These latter boards contain other functionality and are described in detail in sections 13.5.2 and 13.5.4, respectively.

The chambers will be served by different numbers of links according to their occupancy [13.9]. Up to three chambers are connected to one link in the barrel, but in the endcap some relatively high occupancy chambers have two links per chamber. The occupancy in the RPCs is mostly due to neutron background and the intrinsic RPC noise.



**Fig. 13.6:** Block diagram of the data transfer system

At the transmitting end, the data transfer system is based on an arrangement of Master Link Boards (MLB) and Slave Link Boards (SLB). Only MLBs have a fibre optic link out to the counting room. Each SLB compresses 96 channels of RPC data into 24 bits and passes them on to an MLB; a maximum of two SLBs are connected to one MLB. The connections are 1.5 m long and use LVDS signalling at 120 MHz, three times the bunch-crossing frequency, so that the 24 bits need only 8 data lines and 1 clock line. This arrangement was arrived at in order to minimize the number of links and to ease the input pin count demands on the LMUX chip. The detailed mapping of physical chambers to fibre optic links is described in [13.9]. Due to the large variation in the occupancy and even the physical size of the chambers, the optimum arrangement is not straightforward. Moreover, the chamber data have to be delivered to many destinations: to several Trigger Boards, because of the overlapping cones for PAC processors (see Section 13.5.2), and to the Readout Boards. The number of Splitter Boards and LDEMUXs has to be optimized as well. The current numbers of Link Boards, Splitter Boards and LDEMUX devices are given in Table 13.1.

The LBs are mounted close to the chambers on the detector. The radiation levels at the muon detectors are rather moderate on the LHC scale: between 1 and 100 Gy total ionizing dose for the first 10 years of LHC operation, and at most $10^{11}$ neutrons/cm<sup>2</sup> annually. Even though the average values are much smaller than this, it is necessary to make sure that all components used can tolerate these amounts without unacceptable degradation in the performance of the electronics system. Test procedures recommended by CERN experts [13.15], supplemented with other standard test procedures, will be used to validate each part to be used on the detector.

**Table 13.1:** Link Boards inventory

|                          | barrel | endcap | total |
|--------------------------|--------|--------|-------|
| MLBs (fibre optic links) | 300    | 432    | 732   |
| SLBs                     | 500    | 432    | 932   |
| Splitter Boards          | 300    | 432    | 732   |
| Trigger LDEMUX           | 912    | 1800   | 2712  |
| Readout LDEMUX           | 300    | 432    | 732   |

There is also a connection from the RPC detector to the CSC, which helps to resolve ghosts in the CSC data. Data from the RPCs are ORed in sections of eight neighbouring channels and sent to the CSCs via LVDS cables. Two LBs running synchronously send out 12 bits of ORed RPC data and 2 bits of their bunch crossing counters each. However, only 26 signals at the maximum are received by the CSCs, since both bx counters should give out the same value.
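The ORing and the signal count described above can be sketched as follows. The sketch assumes that each Link Board handles the 96 channels of one chamber, which yields the 12 ORed bits quoted in the text; the function name is hypothetical.

```python
def or8(strips: list[int]) -> list[int]:
    """OR 96 strip bits in groups of eight neighbours, giving 12 bits."""
    if len(strips) != 96:
        raise ValueError("expected 96 strip bits")
    return [int(any(strips[i:i + 8])) for i in range(0, 96, 8)]

# Signals seen by the CSC for two synchronous LBs:
# 2 x 12 ORed bits plus a single shared 2-bit bx counter.
N_SIGNALS = 2 * 12 + 2

strips = [0] * 96
strips[0] = 1    # hit in the first group of eight
strips[95] = 1   # hit in the last group of eight
print(or8(strips), N_SIGNALS)
```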

### 13.4.2 Data Compression Scheme

The customized data compression scheme we have developed is illustrated in Fig. 13.7. The scheme has been implemented using ALTERA FPGAs as the LMUX/LDEMUX devices, and test results are discussed in Subsection 13.10.2. The final FPGA will be one of the Xilinx devices, which are more suitable than the ALTERA devices for use where SEUs can occur.

**Fig. 13.7:** Main functional components of the LMUX and LDEMUX

### LMUX

The data compression scheme in the LMUX FPGA is based on dividing the RPC chambers into partitions of 12 strips, and sending only those containing hits. New data are introduced into the device for each bunch crossing. A block of input latches synchronizes the RPC data with an internal clock, and stores them in the RPC Output Register (ROR). Data possessing at least one non-empty partition are stored in a FIFO memory. The maximal length of the FIFO memory is equal to the maximal delay value. Data from the FIFO memory are shifted to the PACKET CONSTRUCTOR, which selects non-empty partitions and supplies them with partition numbers and delay values. Sending of the current data word is aborted when the maximal delay value is reached. In this case, the last partition has an overload flag set to indicate that the data word being sent is not complete. When the PACKET CONSTRUCTOR does not have a non-empty partition to send, an empty partition is sent with a non-existent partition number. The data from LMUX are sent to the Serializer and Optical Fiber Transmitter described in the next Subsection.

LMUX is also equipped with a PACKET MULTIPLEXER (not shown in Fig. 13.7 for simplicity) for reception of the data from the SLBs. Its function is to merge data from neighbouring RPCs (two at maximum) and to send their data onwards to a single optical link. Each output data packet, therefore, also contains the RPC chamber number. The PACKET MULTIPLEXER is preceded by a block of latches, which equalize the relative delays between streams of packets from different RPCs.
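The packet construction described in this subsection can be modelled in a few lines. The sketch below is a behavioural illustration under stated assumptions, not the FPGA design: it assumes one packet is sent per bunch crossing, a 96-strip chamber (8 partitions of 12 strips), and a maximal delay value of 3 bx chosen only for the example. The PACKET MULTIPLEXER and chamber numbers are left out.

```python
from collections import deque

N_PART, PART_BITS = 8, 12      # 96 strips = 8 partitions of 12 strips
MAX_DELAY = 3                  # ASSUMED maximal delay value, in bx
EMPTY = (N_PART, 0, 0, False)  # non-existent partition number = filler

def lmux(words):
    """Return one packet (partition, delay, data, overload) per bunch crossing.

    words: sequence of 96-bit integers, one per bx. Append empty (0) words
    at the end so that queued partitions are flushed.
    """
    mask = (1 << PART_BITS) - 1
    fifo = deque()  # entries: (arrival bx, queue of (partition, data))
    packets = []
    for bx, word in enumerate(words):
        parts = deque(
            (p, (word >> (p * PART_BITS)) & mask)
            for p in range(N_PART)
            if (word >> (p * PART_BITS)) & mask
        )
        if parts:  # only words with a non-empty partition are stored
            fifo.append((bx, parts))
        if not fifo:
            packets.append(EMPTY)
            continue
        arrival, queue = fifo[0]
        delay = bx - arrival
        partition, data = queue.popleft()
        # Abort the word at the maximal delay; flag it if it is incomplete.
        overload = delay == MAX_DELAY and len(queue) > 0
        if not queue or delay == MAX_DELAY:
            fifo.popleft()
        packets.append((partition, delay, data, overload))
    return packets

word = 0x001 | (0x800 << 24)  # hits in partitions 0 and 2
print(lmux([word, 0, 0]))
```

With low occupancy most bunch crossings produce only filler packets, which is what allows several chambers to share one link.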

### LDEMUX

The LDEMUX FPGA implements the data decompression algorithm. The data from an optical link are received, deserialized and split into several identical streams going to one or more Trigger Boards and one Readout Board. Each LDEMUX circuit demultiplexes the data from one chamber only. Partitions from this selected chamber are the only ones accepted into the device (the chamber number is compared to the BASE NUMBER). Accepted partitions are fed into the SHIFT REGISTERS consisting of D+1 positions, where D is the maximum delay value. Each position of the SHIFT REGISTER has a DELAY INDEX assigned to it, with values from 0 to D. If the partition's delay value is equal to the DELAY INDEX, then the partition data are shifted, via the demultiplexer, to the PAC Input Register (PIR), according to the partition number. The partition that was sent by LMUX with zero delay value is fed into LDEMUX after the full SHIFT REGISTER delay, while the partition with the maximum LMUX delay value is sent to LDEMUX immediately. The method provides compensation of delays between the LMUX and LDEMUX circuits.

When LDEMUX receives a broken data packet with the overload flag set, the output data are modified according to a special algorithm (for example, all bits of the data word are zeroed).
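The delay compensation can be illustrated by a behavioural sketch of the receiving side. It uses the same assumptions as a simple one-packet-per-bx model of the link: packets are tuples (partition, delay, data, overload), the maximal delay D is set to 3 bx for the example, and the chamber-number comparison with the BASE NUMBER is omitted. A broken word is zeroed, which is the example policy mentioned in the text.

```python
N_PART, PART_BITS = 8, 12
MAX_DELAY = 3  # ASSUMED maximal delay value D, in bx

def ldemux(packets):
    """Rebuild 96-bit words from (partition, delay, data, overload) packets.

    packets[t] is the packet received at bx t. The returned list holds the
    word leaving the device at each bx; index k corresponds to the chamber
    data of bx k - MAX_DELAY.
    """
    words = [0] * (len(packets) + MAX_DELAY)
    broken = set()
    for t, (partition, delay, data, overload) in enumerate(packets):
        if partition >= N_PART:          # filler packet, nothing to store
            continue
        t_out = t + (MAX_DELAY - delay)  # wait out the remaining delay
        words[t_out] |= data << (partition * PART_BITS)
        if overload:
            broken.add(t_out)
    for t_out in broken:                 # example policy: zero the whole word
        words[t_out] = 0
    return words

packets = [(0, 0, 1, False), (2, 1, 0x800, False), (8, 0, 0, False)]
print([hex(w) for w in ldemux(packets)])
```

Both partitions of the example word, sent one bx apart, reappear in the same output word, D bunch crossings after they were recorded.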

### 13.4.3 The Link Board

A functional block diagram of the Link Board is shown in Fig. 13.8. It receives data from the Front End Boards, synchronizes and compresses them and sends them to the trigger system via optical fibre. Master Link Boards and Slave Link Boards are identical, except that the latter do not have a fibre optic link, and consequently consume less power. The board needs the TTC receiver chips (TTCrx) to provide timing signals.

All components that will go onto this board must be tested for performance and for radiation tolerance, including SEU. The main items to be tested are the LVDS receivers, the Xilinx Virtex FPGA used to implement the LMUX compressor, the TTCrx, fibre optic transmitters (and receivers) capable of transmitting at 1.2 Gbit/s or more, and serializer-deserializer (serdes) devices. The necessary tests are underway.



**Fig. 13.8:** Functional diagram of Link Board

### FEB Interface

Data from the FEB are sent to the LB via twisted-pair LVDS cables of up to 5 metres in length. One 25-pair cable from each FEB carries 16 data lines, four test pulses and ground lines. The 96 channels from an RPC chamber are read out by six FEBs, which are all connected to the same LB. Based on previous results with LVDS components, and our own preliminary tests, no major problems with the radiation tolerance of the LVDS connection are expected.

### Serdes Device and Fibre Optics

The current baseline design uses the AMCC S2061B serdes device and any commercial VCSEL transceiver<sup>1</sup>. The speed of the serdes device is variable and determined by the source clock, which has to be between 100 and 125 MHz. To run synchronously to the 40.08 MHz LHC clock, we have to multiply it by three. In this case the data rate of the link will be 120.2 Mbyte/s. The serdes device includes 8b/10b encoding, giving a serial rate through the fibre of 1.202 Gbit/s. The latency of the link is less than 2 bx, excluding the cable delay.

<sup>1</sup>. In the test we have used the Siemens V23826-K305-C73 VCSEL device.

Passive irradiation test results have shown the components to tolerate doses of up to 300 Gy gamma irradiation and $5 \times 10^{12}$ protons/cm<sup>2</sup>. No significant deterioration of the operating parameters of the components was observed for these doses. The SEU rate for the complete link (AMCC S2061B and Siemens V23826-K305-C73 were used for the test) was found to be of the order of one upset per week during LHC operation [13.14]. It has to be noted, however, that in this test we could not distinguish between errors generated in the transmitter and errors generated in the receiver. More detailed tests will be carried out in the near future.

Components having similar functionality to our baseline solution are available from several manufacturers. Alternative solutions, which would preferably be designed for a simple connection, are continuously being investigated. But since we have to freeze the design before the end of the year, it looks unlikely that we will find anything else of comparable performance in the same price range.

The length of the fibres is approximately 90 m. The combination of multimode fibre and short wavelength can be rather prone to radiation damage. However, the radiation levels at the muon detectors are not so severe as to make finding a satisfactory fibre particularly difficult, and this solution costs less than using 1300 nm light with single-mode fibre. The radiation tolerance of fibres with a pure Si core has been shown to be better than that of fibres with a Ge-doped core. For example, the Fujikura fibre tested by an ATLAS group [13.15] would satisfy our requirements.

### 13.4.4 Splitter System in the Counting Room Electronics

The RPC trigger logic is naturally divided into 12 sectors of 30° each, with one sector of one $\eta$-tower treated by one Trigger Board. Since a bent muon track does not respect chamber boundaries, the cones in which PAC processors find the candidates overlap. Therefore, most of the chambers have to be connected to several Trigger Boards, and the data received from one MLB have to be distributed to several destinations to ensure proper functioning of the trigger. One link has to be split to different trigger boards serving up to two sectors in $\phi$ and up to four towers in $\eta$, and to the readout board as well. This makes nine connections at maximum. The task of distributing the signal to several destinations is performed by a Splitter Board, which is a simple module having a fibre optic receiver, a deserializer and nine LVDS connections to transfer the data to the Trigger and Readout Boards. We plan to use the same AMCC serdes device and a commercial VCSEL transceiver also at this end of the link. The output of the deserializer device is directly fed to the LVDS drivers, giving a data format of 120 Mbyte/s, 24-bit words. Not all LVDS interfaces are used on all Splitter Boards. The Splitter Boards will be located in the 8U extra space left in the RPC trigger crates, as described in the next section, making the distance to the TBs very short.
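The link rates and the splitter fan-out quoted in this section follow from simple arithmetic, sketched below. The 40.08 MHz clock, the factor of three, the 8b/10b encoding and the 100 to 125 MHz serdes range are taken from the text; the statement that the serdes carries one byte per clock cycle is an assumption inferred from the quoted 120.2 Mbyte/s.

```python
LHC_CLOCK_MHZ = 40.08  # bunch-crossing clock quoted in the text

serdes_clock_mhz = 3 * LHC_CLOCK_MHZ             # must lie between 100 and 125 MHz
payload_mbyte_s = serdes_clock_mhz               # ASSUMED: one byte per serdes clock
line_rate_gbit_s = serdes_clock_mhz * 10 / 1000  # 8b/10b: 10 line bits per byte
word_bits = 3 * 8                                # three bytes per bunch crossing

# Maximum fan-out of one link on a Splitter Board:
# 2 phi sectors x 4 eta towers of Trigger Boards, plus 1 Readout Board.
max_outputs = 2 * 4 + 1

print(round(serdes_clock_mhz, 2), round(line_rate_gbit_s, 4), word_bits, max_outputs)
```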

## 13.5 Trigger Crates

### 13.5.1 Overview

The RPC Muon Trigger task is performed in Trigger Crates (TC) which will be located in the underground CMS counting room. These crates house Trigger Boards, which contain the Pattern Comparator track finding logic; Readout Boards, used to capture and read out the RPC data to the DAQ; and the sorting tree, which delivers the 4 highest  $p_T$  muon candidates for barrel and endcaps separately. Each crate receives the data from the link subsystem for a  $360^\circ$  ring sector – a tower in  $\eta$  – and uses them to find the 4 highest  $p_T$  muon candidates. The total number of crates, and so the number of  $\eta$  towers, is 33. The readout and sorting functions will be divided among the 33 crates, giving three different crate designs as detailed in Table 13.2. All three types of crate

**Table 13.2:** RPC Muon Trigger crates

| Crate version       | Number of crates |
|---------------------|--------------|
| Trigger and Readout | 18           |
| Trigger and Sorting | 3            |
| Trigger only        | 12           |
| Total               | 33           |

contain a trigger processing part, plus a Timing Distribution Board and VME crate controller. The layout of the Trigger/Readout crate type is shown in Fig. 13.9.



**Fig. 13.9:** Trigger/Readout crate type - Front Panel view

The trigger processing function proper is implemented in two types of module. Each crate contains 12 Trigger Boards (TB), which find muon track candidates. Candidates are assigned an 8-bit code comprising  $p_T$ , sign and quality bits. The TB functionality is described in Subsection 13.5.2, and the track finding ASIC in Subsection 13.5.3. Muons from the 12 TBs in a crate are sent to a Sorter Board (SB) located in the back of the crate. The Sorter Board receives 4 muon candidate codes and locations ( $4 \times (8+4)=48$  bits) via the backplane connector from each Trigger Board and finds the 4 highest  $p_T$  muon candidates from the incoming  $12 \times 4=48$  muon codes. It also performs the  $\phi$  ghostbusting.

The tasks of sorting and ghostbusting are implemented in a set of FPGAs or ASICs arranged in a tree structure. More details about the algorithm and its implementation may be found in Section 13.6. The codes and addresses of the 4 highest  $p_T$  muon candidates ( $4 \times (8+8)=64$  bits) in each crate are sent by the backplane SB to the appropriate SB located in the Barrel or Endcap Trigger/Sorter crates. A back view of this crate type is shown in Fig. 13.10. These last SB perform



**Fig. 13.10:** Trigger/Sorter crate type - Back Panel view

the final selection, so that the final output from the RPC subsystem to the Global Muon Trigger consists of the 4 highest  $p_T$  muons from the barrel and 4 from the endcaps. The last sorter in the tree also contains the selective readout engine.

The Trigger/Readout crate type contains Readout Boards which receive the compressed RPC chamber data from the data link subsystem and merge them into event segments. Master Readout Boards are connected to the DAQ system. The data arriving on each link are sent to several trigger boards, but only to one readout board. The readout subsystem is described in Subsection 13.5.4.

A VME Controller in each crate is used to control the boards. It performs the downloading of programmable devices on different TC boards, the status control of different boards, and selective readout (VME) control. Finally a Timing Board, located in the middle of the crate, distributes five timing signals from the TTC system via the special P3 backplane: 40 MHz LHC clock, L1Accept, BC0, Reset, and Test strobe.

### 13.5.2 Trigger Board

The Trigger Board (TB) is where the 4 highest momentum muon candidates from a sector,  $30^\circ$  in  $\phi$  and  $0.1$  in  $\eta$ , are found. Twelve TBs, covering one tower in  $\eta$ , are contained in one TC. The functional scheme of a TB is shown in Fig. 13.11. It can be divided into 4 parts: input circuit, segment finding logic, local ghostbusting and sorting tree, and services. A cartoon view of the TB design is shown in Fig. 13.12.

The input part of the TB receives the coded data from the RPC data link subsystem and uses them to define the input vectors for the PAttern Comparator (PAC) matrix. It is built of a set of FPGAs. These FPGAs perform the following tasks:

- decode the link data (LDEMUX),
- select the required data, since the input link can contain the data from several RPC chambers, and some of them may not be needed on a given TB,
- resynchronize the data to the local clock,
- perform calibration and selective readout tasks (Section 13.9),
- build ORs of 4 consecutive strip (OR4) signals needed for the PAC inputs,
- multiplex single strip and OR4 signals onto both edges of the 40 MHz clock. This last function is needed to reduce the number of PAC input lines.
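The last two tasks in the list can be illustrated with a short sketch. It assumes that the OR4 signals are formed over non-overlapping groups of four adjacent strips and that the multiplexer simply sends one half of the signals on each clock edge; neither detail is specified here, and the function names are hypothetical.

```python
from typing import List, Tuple


def build_or4(strips: List[bool]) -> List[bool]:
    """OR together each group of 4 consecutive strips.

    Assumption: groups do not overlap, so the strip count must be a
    multiple of 4.
    """
    if len(strips) % 4 != 0:
        raise ValueError("strip count must be a multiple of 4")
    return [any(strips[i:i + 4]) for i in range(0, len(strips), 4)]


def mux_two_edges(signals: List[bool]) -> Tuple[List[bool], List[bool]]:
    """Split the signals over the two clock edges.

    Only half as many physical lines are then needed. The ordering used
    here (first half on one edge, second half on the other) is hypothetical.
    """
    half = (len(signals) + 1) // 2
    first_edge = signals[:half]
    padding = [False] * (2 * half - len(signals))
    second_edge = signals[half:] + padding
    return first_edge, second_edge


def demux_two_edges(first_edge: List[bool], second_edge: List[bool],
                    n_signals: int) -> List[bool]:
    """Recover the original signal vector on the receiving side."""
    return (first_edge + second_edge)[:n_signals]


if __name__ == "__main__":
    strips = [False, True, False, False, False, False, True, False]
    or4 = build_or4(strips)
    lines = mux_two_edges(strips + or4)
    print(or4, len(lines[0]))
```

With 8 strips and 2 OR4 signals, the 10 logical signals travel over 5 physical lines in this model.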



**Fig. 13.11:** Trigger Board - functional scheme

A TB receives up to 10 links in the barrel and overlap regions, and up to 11 links in the endcaps. The optimal organization of the RPC readout into links is vital to minimize the number of links needed on a TB. The details of the present configuration of links are given in [13.9]. The link naming conventions are also given there.

The code finding block on a TB is built out of a set of 12 PAC ASICs, which together implement the algorithm described in Subsection 8.3.3. Each PAC finds the highest  $p_T$  muon in a segment, based on eight strips in the reference plane,  $2.5^\circ$  in  $\phi$  and  $0.1$  in  $\eta$ . In the barrel, the data from 6 layers of RPC chambers are used, while in the endcaps data from only 4 layers are used. The PAC ASICs are described in detail in the next Subsection.

The local ghostbusting and sorting FPGA takes care of selecting the 4 highest  $p_T$  muons from the 12 PAC outputs. All ghosts in  $\phi$  (see Subsection 13.6.2) are eliminated. To perform ghostbusting across the boundaries of the  $30^\circ$  sectors, two muon codes are needed from neighbouring TBs. These codes are provided via backplane connection.



**Fig. 13.12:** Trigger Board - cartoon view

### 13.5.3 PAC Trigger Processor

#### PAC Overview

The PAttern Comparator (PAC) ASIC is a programmable, full custom device for finding one highest  $p_T$  muon candidate in the segment area. The segment is the region of detector covering  $\Delta\eta = 0.1$  and  $\Delta\phi = 2.5^\circ$ . The algorithm which is embedded in the PAC was described in detail in Section 13.2. Here we concentrate on the technical realization and optimization of the PAC. In the following discussion it is important to recall that different algorithms are to be implemented in the barrel and endcaps. In the barrel we have six RPC planes, to provide low and high  $p_T$  candidates, whereas in the endcaps we have only four RPC planes to work with.

In Fig. 13.13, two neighbouring segments are shown. The segments are defined at the level of the reference muon station 2, where the strip assignment to a segment is unique. The reference segment contains eight strips. The segment areas in the other, non-reference stations overlap. The set of strips in non-reference stations connected, within a PAC processor, to one strip in the reference station is called a cone. The definition of the cone size is obviously a matter of optimization: a large cone will require many inputs to a PAC processor and may be unrealizable in practice, while a small cone may not contain all the hit patterns needed and will result in loss of trigger efficiency.

For every recognized muon, PAC furnishes five bits of momentum (bits 0:4), one sign bit (bit 5), and two quality bits (bits 6:7) packed into an 8-bit code. The quality bits distinguish



**Fig. 13.13:** Matching tracks with patterns - the idea behind the PAC processor. In this Figure two neighbouring segments in  $r\phi$  are shown. They are uniquely defined in reference station 2. In other muon stations the RPC strips are connected to several segments, giving rise to overlapping cones, ghosts and ambiguities. The strip widths are exaggerated w.r.t distances between muon stations.

between muon candidates with different numbers of layers hit, with a higher value indicating a better quality candidate. Muons with hits on all four layers have the quality bits set to 11. For high transverse momentum candidates with only three hit layers out of four (3/4), different codes are defined depending on which layer has a missing hit. The next highest quality (code 10) is given to 3/4 with a hit missing in station 3 or 4, the next (code 01) to 3/4 with the missing hit in station 1, and the lowest (code 00) is 3/4 with no hit in the reference station 2. All low-momentum candidates with 3/4 hits only are flagged with code 00.
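The quality assignment and the 8-bit code layout described above can be summarized in a short sketch. It covers only the four-layer case spelled out in the text; the function names are illustrative, and the polarity of the sign bit is left to the caller since it is not specified here.

```python
from typing import Optional, Sequence, Tuple


def quality_bits(hits: Sequence[bool], high_pt: bool) -> Optional[int]:
    """Return the 2-bit quality code, or None if fewer than 3 layers are hit.

    hits[i] is True when station i+1 (stations 1..4) has a matching hit;
    station 2 is the reference station.
    """
    if len(hits) != 4:
        raise ValueError("exactly four layers are expected")
    n_hit = sum(1 for h in hits if h)
    if n_hit == 4:
        return 0b11
    if n_hit < 3:
        return None                     # no 3/4 or 4/4 coincidence
    if not high_pt:
        return 0b00                     # all low-momentum 3/4 candidates
    missing_station = list(hits).index(False) + 1
    if missing_station in (3, 4):
        return 0b10
    if missing_station == 1:
        return 0b01
    return 0b00                         # no hit in reference station 2


def pack_code(momentum: int, sign: int, quality: int) -> int:
    """Pack momentum (bits 0:4), sign (bit 5) and quality (bits 6:7)."""
    if not (0 <= momentum < 32 and sign in (0, 1) and 0 <= quality < 4):
        raise ValueError("field out of range")
    return (quality << 6) | (sign << 5) | momentum


def unpack_code(code: int) -> Tuple[int, int, int]:
    """Return (momentum, sign, quality) from an 8-bit PAC code."""
    return code & 0x1F, (code >> 5) & 0x1, (code >> 6) & 0x3


if __name__ == "__main__":
    q = quality_bits([True, True, False, True], high_pt=True)
    print(q, pack_code(momentum=17, sign=1, quality=q))
```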

In general the PAC can be considered as a special Content Addressable Memory (CAM) – for every input pattern of 124 signals, a well-defined 8-bit output value is produced. The PAC is a synchronous, pipelined device. It produces a valid output code on every clock cycle, corresponding to the value of the input pattern from three clock cycles earlier.

The PAC processor contains 9 logical track finding blocks. Eight of these are used to recognize high transverse momentum candidates (tracks) passing through the eight strips in the reference station, while the last one finds low transverse momentum candidates. For high momentum candidates we use full granularity of the RPCs in the optimal way. For low momenta, however, the strip signals are ORed to reduce the number of programmable patterns.

Programmable pre-defined patterns are loaded into a PAC through the Boundary Scan data registers. At the experiment startup, the lists of pre-defined patterns for each PAC processor will be generated from the RPC Muon Trigger simulation; the current procedure is explained in Section 13.7.2. Later, these simulated lists of patterns will most likely be modified and improved through analysis of the data.

Optimization of the high transverse momentum algorithm, which is implemented in blocks 1-8 of the PAC, requires the use of the dynamical cone concept briefly explained in Chapter 8. This idea is used to reduce the number of programmable patterns needed. In the dynamical cone, the full granularity of the RPCs is used for nearly non-bending track candidates in the middle of a cone, subtended on a strip in the reference station. If the track approaches the cone boundary in any non-reference station, we assume that this track is of lower transverse momentum; so that we do not need the full granularity and the strips may be logically ORed together. By extensive simulation we have proved that this concept allows us to maintain sharp efficiency curves with a reduced number of programmable patterns.



**Fig. 13.14:** General view of PAC ASIC input and output signals

Each of the high and low  $p_T$  blocks may deliver up to 80 muon candidates of each charge. At the same time, the quality bits are assigned as discussed above. The candidate muons are assigned a 5-bit momentum code, with the assignment of codes to tracks done in a programmable way. The candidate with the highest quality bits and momentum code will be selected. There is an option to adjust low and high momentum codes to a common scale, again in a programmable way, at the PAC output.

#### Input signals

The PAC inputs from RPCs are grouped into 7 signal vectors. In order to reduce the number of pins in the chip, the input signals are multiplexed onto both edges of the 40 MHz input clock. The grouping of RPC input signals and use of both leading and trailing edges of the 40 MHz clock results from a compromise between reducing the number of connections on the TB, and simplifying the PAC design. The programming of the track patterns and codes inside the PAC structure uses the Boundary Scan interface.

Fig. 13.14 shows the PAC inputs used in the barrel. Here the input vectors correspond to the strip signals from the 6 RPC planes in the muon stations RB4, RB3, RB2out, RB1in, RB2in and RB1out. As described in Subsection 13.5.2, the Trigger Board input FPGA (LDEMUX) prepares the input signals from RPC data taken from links. The first four input vectors are the individual strip signals from RB4, RB3, RB2out, RB1in. These signals are input to the eight track finding blocks devoted to high  $p_T$  muon recognition, one block for each strip in the reference plane. The signals from RB1in, RB1out, RB2out and RB2in (single strip or OR4 of 4 consecutive strips) are input to the track finding block 9, which is used to recognize low  $p_T$  muons. In the endcap regions, the strip signals from the four RPC stations ME4, ME3, ME2 and ME1, together with the corresponding OR4 signals, are input to blocks 1-8 and block 9 of the endcap PACs.



**Fig. 13.15:** Track programming in blocks 1-8 and block 9

#### Pattern logic

In each of the nine blocks, 80 positive and 80 negative tracks can be programmed. In blocks 1-8, positive and negative tracks have exclusive asymmetric strip cones except in the reference layer. In block 9, all strip connections are symmetric.

Fig. 13.15 shows the track programming in blocks 1-8 and block 9. The logic for one track is a coincidence unit which selects 3-out-of-4 or 4-out-of-4 coincidences of hits from four RPC planes that conform to a pre-programmed spatial pattern. For each track pattern, the strip to be hit in each of the three non-reference layers can be selected from a defined subset of the input strips in that layer. Programming is done by writing the selection codes into multiplexers MX1, MX3, MX4.
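The coincidence logic for a single track can be sketched as follows. In this illustrative model a programmed pattern is simply a tuple giving the expected strip index in each of the four planes, standing in for the selection codes written into the multiplexers; the names and the data representation are assumptions made for the sketch, not the ASIC implementation.

```python
from typing import Optional, Sequence, Set, Tuple

# Expected strip index in each of the four RPC planes (stations 1..4).
Pattern = Tuple[int, int, int, int]


def match_pattern(pattern: Pattern,
                  hits: Sequence[Set[int]]) -> Optional[Tuple[bool, ...]]:
    """Test one pre-programmed pattern against the hit strips.

    hits[i] is the set of strip indices hit in plane i. Returns the
    per-plane match flags if at least 3 of the 4 planes match the pattern
    (3-out-of-4 or 4-out-of-4 coincidence), otherwise None.
    """
    if len(pattern) != 4 or len(hits) != 4:
        raise ValueError("four planes are expected")
    flags = tuple(strip in plane_hits
                  for strip, plane_hits in zip(pattern, hits))
    return flags if sum(flags) >= 3 else None


if __name__ == "__main__":
    straight_track = (3, 5, 6, 8)
    event = [{3}, {5}, set(), {8}]          # no hit in the third plane
    print(match_pattern(straight_track, event))
```

The per-plane flags returned here are what a quality assignment would need in order to distinguish 4/4 candidates from the different 3/4 cases.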



**Fig. 13.16:** Asymmetric strip connections for blocks 1-8

As an illustration of the mapping of the input strips to these multiplexers, Fig. 13.16 shows the connections for muon station 1 (MS1) to the high- $p_T$  blocks 1-8. A total of 32 strips from this station are connected to the PAC ASIC. For any given strip in station 2, the reference layer, 15 of these strips are sent to the positive or negative track pattern logic. Strips close to the reference strip in  $\phi$  are sent individually to the logic. For larger  $\Delta\phi$ , corresponding to lower  $p_T$  tracks, the MS1 inputs are ORed together in groups of 2, 3 or 4 strips. The number of MS1 signals for a given reference strip is thereby reduced to 9, as shown for the MX1 multiplexer in Fig. 13.15. The middle strip in Fig. 13.16 lies just below the reference strip. Thus the track going through the middle strip in MS1 and through the reference strip is a non-bending one. The Figure clearly shows the dynamic cone, where the arrangement of MS1 strips connected to successive track finding blocks, corresponding to successive reference strips, is shifted by one strip to the right each time.



**Fig. 13.17:** Symmetric strip connections for block 9

In track finding block 9, all the input data from the non-reference layers is connected to every pattern logic block. This is illustrated in Fig. 13.17. Again, Fig. 13.15 shows the numbers of input signals from each of the three non-reference layers.

The track finding logic generates one of four quality codes for each track found. The subsequent circuitry looks for the highest quality code out of 1280 possible high  $p_T$  tracks and 160 low  $p_T$  tracks separately. Only these highest quality tracks are allowed to pass to the next stage of the logic, which consists of high  $p_T$  and low  $p_T$  momentum encoding circuits.

#### Encoding of track momentum and output track selection

The total number of tracks which can be programmed into the PAC is  $9 \times 160 = 1440$ . These tracks are then classified into at most 31 different codes. For blocks 1-8, these are 15 for positive and 15 for negative tracks, with code 0 reserved for the case where no muon candidate is found in this PAC. For block 9 there are 8 codes for positive and 7 for negative tracks. To achieve the reduction from  $2 \times 80$  tracks, these are first ORed into groups – 1 group of OR16 tracks, 2 groups of OR8 tracks, 4 groups of OR4 tracks, 8 groups of OR2 tracks and 16 single tracks. This reduces the number to  $2 \times 31$  groups which can then be programmed into any of  $2 \times 15$  codes. Fig. 13.18



**Fig. 13.18:** Code programming circuit for block 9 tracks - low  $p_T$

shows the corresponding code programming circuit for block 9 (low  $p_T$ ). Programming is done by writing the selection codes into programmable demultiplexers dmux7. Only selected input signals are used in the track signal definition.
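The track and group counts quoted above can be verified, and the grouping step illustrated, with a short sketch. The order in which the 80 tracks of one sign are assigned to the OR groups, and the example group-to-code table, are hypothetical; only the group sizes and the reservation of code 0 for "no candidate" are taken from the text.

```python
from typing import List

# 1 x OR16, 2 x OR8, 4 x OR4, 8 x OR2 and 16 single tracks.
GROUP_SIZES: List[int] = [16] + [8] * 2 + [4] * 4 + [2] * 8 + [1] * 16

N_BLOCKS = 9
TRACKS_PER_SIGN = 80


def group_tracks(track_fired: List[bool]) -> List[bool]:
    """OR the 80 track outputs of one sign into 31 group signals."""
    if len(track_fired) != sum(GROUP_SIZES):
        raise ValueError("expected one flag per programmable track")
    groups, pos = [], 0
    for size in GROUP_SIZES:
        groups.append(any(track_fired[pos:pos + size]))
        pos += size
    return groups


def encode(groups_fired: List[bool], group_to_code: List[int]) -> int:
    """Return the highest code among the fired groups (0 = no candidate)."""
    fired_codes = [code for fired, code in zip(groups_fired, group_to_code)
                   if fired]
    return max(fired_codes, default=0)


if __name__ == "__main__":
    print(len(GROUP_SIZES), sum(GROUP_SIZES))       # groups, tracks per sign
    print(N_BLOCKS * 2 * TRACKS_PER_SIGN)           # tracks per PAC
```

The group sizes add up to 80 tracks in 31 groups per sign, and nine blocks of $2 \times 80$ tracks give the 1440 programmable tracks per PAC.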

Fig. 13.19 shows the final selection circuit of the PAC ASIC. This logic selects the highest code from those delivered by the track recognition and code programming circuits. The selection is based only on the momentum code, and the corresponding sign bit is then joined to it. The 4-bit code of high  $p_T$  muons can be expanded into a 5-bit code using a LUT5 circuit; similarly, the 3-bit code of low  $p_T$  muons can be expanded into a 4-bit code. Depending on the flag value



**Fig. 13.19:** Code selection circuit

(endcap or barrel region) and the value of the corresponding quality bits, either the high or the low  $p_T$  code is sent to the output.

#### Boundary Scan implementation

The PAC ASIC is equipped with a Boundary Scan Test (BST, JTAG) circuit as defined by the IEEE 1149.1 standard. Three mandatory (EXTEST, BYPASS, SAMPLE/PRELOAD) and two optional (INTEST, IDCODE) boundary scan functions are implemented. The BST circuit is used to test the pins of the PAC and also to program the PAC functions. Nine BST user\_data\_registers are used to program the patterns and codes corresponding to the 9 track recognition blocks, and a tenth is used to program the input mask registers, output sign codes, the barrel/endcap flag bit and the high and low  $p_T$  LUTs.

### 13.5.4 The Readout System

All single strip data used as input to the RPC track finding logic are read by the readout system. The link subsystem (see Section 13.4) delivers the data via splitters to the RPC readout subsystem. The zero suppression realized in the link subsystem by the LMUX devices means that only non-empty data are read. The data collected by the readout system are delivered to the CMS DAQ by standard Readout Dual-Port Memory (RDPM) modules.

The design of the readout system has been determined in part by considering the rate of zero-suppressed data input. The coding/decoding algorithm and the structure of the link system define and limit this data bandwidth. The required bandwidth has been found based on the results of simulation studies. The quality and safety of the trigger system depend on the link system bandwidth. The results of rate analyses for station RE1/1, where the most severe background conditions are expected, are presented in Table 13.3. Fig. 13.20 shows the simulated distribution of number of transmitted data packets.



**Fig. 13.20:** Histogram of event size expressed in number of packets



**Fig. 13.21:** Readout System structure in the Trigger rack

**Table 13.3:** Number of data packets from RE1/1 stations connected to 48 optical links (worst-case rate)

| Number of packets/event    | No noise | 100 Hz/cm<sup>2</sup> noise |
|----------------------------|----------|-----------------------------|
| average/link               | 0.04     | 0.065                       |
| average/crate              | 3.14     | 4.59                        |
| max/crate ( $10^6$ events) | 14       | 17                          |

The results of this analysis allow us to accept the following constructional parameters:

- each data packet sent to an RDPM consists of a single data word,
- the maximum event size reaches about 1 kB, while the average is 300 bytes. Both sizes are considerably less than the 2 kB page size fixed for the CMS experiment, so that a large margin of safety exists.

The accepted solution for the readout system design is presented in Fig. 13.21. A single module of the readout system serves 40 optical links and cooperates with one RDPM. This large number of links dictated the division of the system into two functional parts:

1. Slave Readout Boards (SRB) accept the compressed data streams from the links. One SRB serves 8 optical links. The incoming data are stored for the pipeline delay, buffered synchronously with an L1Accept and derandomized. All SRBs work in parallel. There are SRBs in all 18 Trigger/Readout crates.
2. A Master Readout Board (MRB) in every Trigger/Readout crate collects the data from the SRBs; the MRBs work in pairs to transfer the data to the CMS DAQ. The MRBs work in two stages: first, each one takes the data stored in the buffer memories of the SRBs within its crate; then the DAQ MRB executes the final data concentration from two crates, and makes these data available to the RDPM.

All readout boards are located in the trigger crates, and communicate by a local bus. The positions of the readout boards, and the back-plane connections in the trigger crate, are shown in Fig. 13.22. The Slave Readout Board derandomises compressed data streams delivered from the RPC by splitters and optical links. The principle of the derandomisation for a single data stream from an optical link is shown on the left-hand side of Fig. 13.23. Data packets corresponding to a given L1A trigger are stored in a fixed page of the buffer memory (DPM) together with information about the number of event packets. In this way, data from the same event are stored in the same address space of the DPMs. The right-hand side of Fig. 13.23 presents the structure of an SRB. Each of 8 channels is served by a single RLDEMUX (PLD or ASIC) and one buffer memory (DPM). The VME (or PCI) interface steers the board. The SRB is equipped with a JTAG interface dedicated to test purposes.
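The page-based derandomisation can be pictured with a small software model. The sketch below is only an illustration of the principle: the number of pages, the page capacity and the rule mapping an event number to a page (here a simple modulo) are all assumptions, as are the class and method names.

```python
from typing import List, Tuple


class DerandomizerBuffer:
    """Toy model of one SRB channel: a buffer memory divided into pages."""

    def __init__(self, n_pages: int = 16, page_words: int = 32) -> None:
        self.n_pages = n_pages            # hypothetical number of pages
        self.page_words = page_words      # hypothetical page capacity
        self.pages: List[List[int]] = [[] for _ in range(n_pages)]
        self.counts: List[int] = [0] * n_pages

    def page_of(self, event_number: int) -> int:
        """Fixed page used for a given L1A (assumed mapping: modulo)."""
        return event_number % self.n_pages

    def store_event(self, event_number: int, packets: List[int]) -> int:
        """Store the packets of one triggered event with their count."""
        if len(packets) > self.page_words:
            raise OverflowError("event does not fit in one page")
        page = self.page_of(event_number)
        self.pages[page] = list(packets)
        self.counts[page] = len(packets)
        return page

    def read_event(self, event_number: int) -> Tuple[int, List[int]]:
        """Return (number of packets, packets) for the given event."""
        page = self.page_of(event_number)
        return self.counts[page], list(self.pages[page])


if __name__ == "__main__":
    channels = [DerandomizerBuffer() for _ in range(8)]   # 8 links per SRB
    pages = {ch.store_event(21, [0xABC]) for ch in channels}
    print(pages)   # every channel uses the same page for the same event
```

Because every channel uses the same page for the same event in this model, a concentrator can collect one event by reading a single page address from all buffers.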

Each Master Readout Board concentrates data from the buffer memories on the SRBs in its crate. The first stage of concentration is executed simultaneously in all MRBs, and relies on the creation of a “crate data event” for each crate separately. An example of this process is shown on the left side of Fig. 13.24. Then the DAQ MRB, which is shown in the central crate on Fig. 13.21, merges both “crate data events” into a common “rack data event” and sends this out



**Fig. 13.22:** Readout boards positions and back-plane structure in the Trigger Crate



**Fig. 13.23:** The principle of operation (left) and the functional scheme (right) of the Slave Readout Board



**Fig. 13.24:** The principle of operation (left) and the functional scheme (right) of the Master Readout Board

over its interface to the RDPM. A local readout bus internal to each crate, shown in Fig. 13.21, is used for the synchronous transmission of data in pipeline mode between SRBs and MRB.

## 13.6 Sorting and Ghostbusting

The muon track finding algorithm described in the previous section creates ghosts in segments close to the one containing a true muon. A multiplication of candidates in the  $\phi$  plane results from the use of a three-out-of-four hits algorithm and the overlapping segment definition, whereas  $\eta$  ghosts are a consequence of the geometry of the RPC system and of the definition of  $\eta$  towers. Ghosts should be eliminated as early as possible, preferably before any sorting of muon candidates from some detector region. Since the track finding for one  $\eta$  tower is contained within one crate, it is possible to eliminate ghosts in  $\phi$  locally within the crate, before any sorting. The algorithm for that is hard-wired into the Trigger Board logic, as described in Subsection 13.5.2. The remaining candidates in an  $\eta$  tower are then sorted, and the four best ones are retained. The elimination of  $\eta$  ghosts is more complicated, since it requires connections between neighbouring  $\eta$  towers, and has to use candidate addresses. In this Section we discuss in turn: ghostbusting in  $\phi$ , ghostbusting in  $\eta$ , sorter ASIC, and finally the sorting overview.

### 13.6.1 Ghostbusting

A situation in which more than one segment returns track candidates for a single muon passing through the RPC system gives rise to "ghosts". In fact almost all ghosts are due to the use



**Fig. 13.25:**  $\phi$ -ghosts are due to the overlap between hourglass shaped sets of strips (cones) connected to the nearby PACs.

of 3-out-of-4 logic, which means that we allow a missing hit in one layer. Such 3/4 candidates are called "low quality muons" in contrast to "high quality" 4/4 ones. The quality code takes precedence over the  $p_T$  code in the sorting of candidates. High quality candidates always win over low quality ones during the sort operations inside the PAC as well as the subsequent, external sorting. By allowing 3/4 muons in order to increase trigger efficiency, we create ghosts.

Ghosts in the same  $\eta$  tower, called  $\phi$ -ghosts, occur because of the overlap between the hourglass-shaped sets of strips connected to neighbouring PACs, as shown schematically in Fig. 13.25. The ghosts in separate  $\eta$  towers, called  $\eta$ -ghosts, are due to the splitting of the signals from non-reference layers to two or even three towers. This is illustrated in Fig. 13.26. Large rectangles in the Figure are PACs. One row corresponds to an  $\eta$  tower, with only four out of the 144  $\phi$  segments shown. On the right part of the drawing, thin vertical rectangles represent the  $\eta$  coverage for strips. Each strip of the second, reference RPC layer is connected to only one  $\eta$  tower. The strips of non-reference layers connected to the towers 19, 18, 17 and 16 are marked with black, dark grey, middle grey and light grey thin vertical bars respectively.

The majority of ghosts can be eliminated by a simple algorithm, rejecting low-quality track candidates with only 3/4 layers hit if a high-quality track is found in any adjacent segment. Studies have shown, however, that a significant rate of ghosts remains after applying this procedure [13.29]. More sophisticated algorithms have therefore to be developed to reduce the ghost rate further. Fig. 13.27 illustrates two mechanisms which can give rise to additional ghosts. Clusters of hits in the reference layer, RPC station 2, that spread across more than one 8-strip segment can result in 4-layer tracks being found in adjacent PACs. Also ghost tracks can be found in non-adjacent segments, due to the requirement to accept low-momentum tracks and the reduction in precision from ORing strips together. A ghostbusting algorithm has been developed [13.22] to reject ghosts from these sources, based on the observation that ghosts will in general be found with a lower  $p_T$  code than the true muon.
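The simple rejection rule mentioned at the start of this paragraph can be written down in a few lines. The sketch assumes that the segments of one $\eta$ tower form a closed ring in $\phi$, so that the first and last segments are neighbours, and it represents the candidates by their 2-bit quality codes; the function name is illustrative.

```python
from typing import List, Optional

HIGH_QUALITY = 0b11      # 4/4 candidate; any other code is a 3/4 candidate


def reject_simple_ghosts(quality: List[Optional[int]]) -> List[bool]:
    """Return one keep flag per segment.

    quality[i] is the quality code of the candidate in segment i, or None
    if the segment has no candidate. A 3/4 candidate is rejected when an
    adjacent segment holds a 4/4 candidate.
    """
    n = len(quality)
    keep = []
    for i, q in enumerate(quality):
        if q is None:
            keep.append(False)
        elif q == HIGH_QUALITY:
            keep.append(True)
        else:
            neighbours = (quality[(i - 1) % n], quality[(i + 1) % n])
            keep.append(HIGH_QUALITY not in neighbours)
    return keep


if __name__ == "__main__":
    print(reject_simple_ghosts([None, 0b10, 0b11, 0b01, None, 0b00]))
```

In this example the two 3/4 candidates next to the 4/4 candidate are dropped, while the isolated 3/4 candidate in the last segment survives.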



**Fig. 13.26:**  $\eta$ -ghosts are due to splitting of the signals from non-reference layers to two or even three towers.



**Fig. 13.27:** The effect of enlarged cones in  $\phi$  and the presence of clusters. The PAC marked "OR-2" returns a 3/4 ghost because of the cone enlargement. A cluster of hits in the reference layer generates a 4/4 ghost in the segment adjacent to the true muon track.

### 13.6.2 Ghostbusting within a tower

As described in Subsection 13.5.2, the  $\phi$ -view ghostbusting is performed locally on the Trigger Boards. Nearest neighbour segment information is passed between adjacent boards in  $\phi$  to enable the algorithm to cover all segments in one  $\eta$  tower. The algorithm is based on searching for a local maximum in the  $p_T$  codes of tracks, with one additional complication to cope with the case where equivalent codes are found in two or more adjacent segments. The best local candidate is retained, and its neighbours are suppressed.

Fig. 13.28 shows a sketch of several typical configurations of outputs from neighbouring PACs. Each vertical bar in the Figure represents the  $p_T$  code returned by a PAC. For the first three groups of responses to a single muon, the local maximum algorithm selects the best candidate. For the last group, where several segments return equivalent codes, the algorithm must select one muon candidate to go forward to the subsequent sorting stage. Simulation has shown that such series, where two, three or even more consecutive segments contain equally good muon candidates, are detected quite often. The series will be discarded as a whole by the ghostbusting logic if there is a better candidate at either edge. If the neighbouring segments on both sides produce candidates of lower quality, however, the algorithm must select one candidate. The number of equivalent candidates in a series is usually three or fewer, according to our simulation studies. Rather than select the “centre” of the series, which is not well-defined if the number of segments is even, the algorithm picks the last-but-one. This gives a sensible result up to series lengths of four.



**Fig. 13.28:** Schematic display of the PAC trigger response.

This  $\phi$ -view ghostbusting algorithm is implemented on the Trigger Boards in FPGAs, together with the first level of the sorting. The algorithm logic is illustrated in Fig. 13.29. The muon candidate detected in segment A can only pass through the ghost buster chip if one of the following requirements is fulfilled:

- there is a local maximum in the segment A. The muon candidate in A is better than candidates in the next (B) and preceding (Z) segments. In the Figure, this condition is tested by the upper AND gate in the A channel,
- segment A is the last-but-one candidate. It is no worse than the preceding one (Z), and the next one (B) is equally good, but better than next-to-next (C). This condition is tested by the lower AND gate in the A channel in the Figure.
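The two pass conditions can be expressed compactly in software. In the sketch below each segment carries a single integer rank, larger meaning better and 0 meaning no candidate; how quality and $p_T$ are combined into this rank, the closed-ring treatment of the segment index, and the function name are assumptions of the sketch rather than features of the FPGA design.

```python
from typing import List


def phi_ghostbust(rank: List[int]) -> List[bool]:
    """Apply the local-maximum and last-but-one conditions to a ring.

    rank[i] is the candidate rank in segment i (0 = no candidate).
    Returns one pass flag per segment.
    """
    n = len(rank)
    passed = []
    for i in range(n):
        a = rank[i]                  # segment A
        z = rank[(i - 1) % n]        # preceding segment Z
        b = rank[(i + 1) % n]        # next segment B
        c = rank[(i + 2) % n]        # next-to-next segment C
        local_maximum = a > z and a > b
        last_but_one = a >= z and a == b and b > c
        passed.append(a > 0 and (local_maximum or last_but_one))
    return passed


if __name__ == "__main__":
    single = [0, 3, 5, 2, 0, 0, 0, 0]
    series = [0, 4, 4, 4, 0, 0, 0, 0]
    print(phi_ghostbust(single))
    print(phi_ghostbust(series))
```

For an isolated peak only the peak segment passes; for a series of three equal candidates only the middle one, which is the last-but-one of the series, passes.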



**Fig. 13.29:** Implementation scheme for the  $\phi$ -view Ghost Buster.

The shaded rectangles in the scheme on Fig. 13.29 represent separate building blocks. Only one block (AB) is shown in full. The arrows at the block borders indicate the information to be shared between neighbouring blocks: one data bus plus one bit are sent to the previous building block, with equivalent data received from the next one; and two bits are received from the previous block, with equivalent data sent to the next one. A building block can contain any number of channels, although only two are shown. The number can be increased by inserting additional channels along the horizontal dashed line drawn between channels A and B. It is, however, expected that the number of channels per block will remain even. This is because the first stage of sorting follows the same block structure. This sorting stage is indicated by the darker shaded box attached at the right of the Figure.

### 13.6.3 Ghostbusting between towers



**Fig. 13.30:** Layout of the veto and sorter algorithm.

After the  $\phi$ -view vetoing we do local sorting, so as to get no more than 4 muon candidates per  $\eta$ -tower. It is not reasonable to sort them further without  $\eta$ -view ghost elimination. This can be done by comparing all candidates from the given  $\eta$  tower with those from the neighbouring towers, as shown for the case of three neighbouring towers in Fig. 13.30. A possible FPGA implementation of the algorithm has been investigated. This FPGA should comprise 2 building blocks, GB\_comparator and GB\_gate. For each muon candidate, the “gate” is opened if the corresponding bits from the “comparators” are on. These bits are switched off if there is a better muon candidate, not separated in  $\phi$ , in the neighbouring  $\eta$ -tower. As in the  $\phi$ -view ghostbusting case, the algorithm is not left-right (forward-backward) locally symmetric in the case of equally good muon candidates in neighbouring towers. It is possible, however, to keep mirror symmetry between positive and negative  $\eta$ .
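A software sketch of the comparator-and-gate idea is given below. Several details are assumptions made for illustration only: candidates are represented as (rank, $\phi$ segment) pairs, "not separated in $\phi$" is modelled by a hypothetical window `max_dphi`, and ties between equally good candidates are resolved in favour of the tower with the lower index, which is one possible asymmetric choice.

```python
from typing import List, Tuple

Candidate = Tuple[int, int]          # (rank, phi segment index)


def eta_ghostbust(towers: List[List[Candidate]],
                  max_dphi: int = 1,
                  n_phi: int = 144) -> List[List[Candidate]]:
    """Veto candidates that have a better nearby candidate in a
    neighbouring eta tower. All comparisons use the input lists, so the
    towers are treated in parallel."""

    def close_in_phi(p: int, q: int) -> bool:
        d = abs(p - q) % n_phi
        return min(d, n_phi - d) <= max_dphi

    result = []
    for t, candidates in enumerate(towers):
        kept = []
        for rank, phi in candidates:
            gate_open = True
            for nt in (t - 1, t + 1):
                if not 0 <= nt < len(towers):
                    continue
                for n_rank, n_phi_seg in towers[nt]:
                    if not close_in_phi(phi, n_phi_seg):
                        continue
                    # A better neighbour closes the gate; an equally good
                    # one does so only from the lower tower (assumed rule).
                    if n_rank > rank or (n_rank == rank and nt < t):
                        gate_open = False
            if gate_open:
                kept.append((rank, phi))
        result.append(kept)
    return result


if __name__ == "__main__":
    towers = [[(10, 5)], [(7, 5), (9, 60)], [(9, 61)]]
    print(eta_ghostbust(towers))
```

The asymmetric tie rule guarantees that exactly one of two equally good candidates in adjacent towers survives.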

Extrapolation from the three towers shown in Fig. 13.30 to the situation with  $2k+1=33$   $\eta$  towers is straightforward. Each  $\eta$  tower logic has its own  $\phi$ -view veto part and sorter part. The  $\eta$ -view veto algorithm described here is implemented in FPGAs on Sorter Boards located in the sorter area of the 3 special Trigger/Sorter Crates. The system contains one of these special crates for the barrel and one for each endcap.

### 13.6.4 Sorting

#### Overview of Sorting

The PAttern Comparator Trigger provides a total of  $33 \times 12 \times 12 = 4752$  muon candidates, equal to the number of RPC muon segments. This number of candidates must be reduced to two sets of four, for the barrel and endcap areas separately. The sorting task is divided into three separate levels:

1. Trigger Board level,
2. Trigger Crate level,
3. area (i.e. barrel or endcap) level.

The Trigger Board level sorting circuit reduces the number of muon candidates from 12 to 4 and is preceded by the ghostbusting circuit, as described in Subsection 13.6.2. As a result, four muon candidates are found for every TB, covering $30^\circ$ in $\phi$ and 0.1 in $\theta$. Both ghostbusting and sorting tasks are implemented in one FPGA, with a latency of 5 bunch crossings. The sorter by itself can also be realized using ASICs, with a latency in this case of 6 bunch crossings. After this level of sorting, a four-bit address is added to the candidate codes (bits 8:11). The address identifies the segment in the reference plane corresponding to the PAC processor in which each candidate was found.



**Fig. 13.31:** TC level sorting tree on the TC backplane

The crate level of the sorting tree comprises the task of finding the 4 highest code muons from the 12 groups of 4 muon candidates, i.e. in one tower in $\eta$. This task is performed on a sorter board located in the back portion of the trigger crates. After this stage an additional four bits of address (bits 12:15) are added to the candidate codes to identify the $30^\circ$ $\phi$ sector. Fig. 13.31 shows the crate sorting circuit. The sorting tree is realized in a set of FPGAs. The last FPGA also contains the diagnostics and the selective readout circuit, as described in Section 13.9.

**Fig. 13.32:** Endcap (barrel) area sorting tree

The area sorting circuit, shown in Fig. 13.32, selects the 4 highest momentum muons for the complete barrel or endcap area. The function is accomplished in the sorter part of the Trigger/Sorter crate. The inputs and outputs of this circuit can be selectively routed to the diagnostics and selective readout. Four further bits of $\eta$ address within barrel or endcap (bits 16:19) and a 2-bit area code (bits 20:21) are added to the code to identify a muon candidate unambiguously to the Global Muon Trigger. The final muon candidate code and address bits are given in Table 13.4.

**Table 13.4:** Muon candidate code and address bits

| Area       | Tower number within an area | $30^\circ$ sector in $\phi$ | Eight-strip segment (PAC processor) | PAC output code |
|------------|-----------------------------|-----------------------------|-------------------------------------|-----------------|
| out[21:20] | out[19:16]                  | out[15:12]                  | out[11:8]                           | out[7:0]        |
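The field layout of Table 13.4 can be written down as a short packing routine. This is a sketch for illustration: the function names are hypothetical and the only information used is the bit ranges given in the table.

```python
def pack_candidate(area, tower, sector, segment, pac_code):
    """Pack the fields of Table 13.4 into one 22-bit word (MSB first)."""
    fields = ((area, 2), (tower, 4), (sector, 4), (segment, 4), (pac_code, 8))
    word = 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"value {value} does not fit in {width} bits")
        word = (word << width) | value
    return word


def unpack_candidate(word):
    """Split a 22-bit candidate word back into its address and code fields."""
    return {
        "area": (word >> 20) & 0x3,
        "tower": (word >> 16) & 0xF,
        "sector": (word >> 12) & 0xF,
        "segment": (word >> 8) & 0xF,
        "pac_code": word & 0xFF,
    }
```

Each sorting level only appends its own address field, so the PAC output code in the lowest 8 bits is never modified on the way to the Global Muon Trigger.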

#### The Sorter Chip

The main component of the Sorting Processor is the Sorter Chip. The function of this device is to reorder and provide the 4 highest words among 8 input words, in decreasing order. An ASIC implementation was developed in AMS 0.8 $\mu\text{m}$ BiCMOS technology in 1996. In the final version of the sorter we will most probably use one of the programmable Xilinx Virtex devices which have since appeared on the market and provide the required functionality at a modest price. The remainder of this Section describes the ASIC, but we believe that the transfer to an FPGA will be straightforward.

The input format is defined by the 8-bit PAC output word described in Section 13.5.3. As was already mentioned, we always prefer high quality (i.e. four-out-of-four) candidates over low quality (three-out-of-four) ones. For candidates with the same quality bits the comparison is effectively performed only between the first 5 bits (momentum code bits), representing the effective magnitude of the words. The 6th bit (“sign bit”) is not used in the comparison. All the other bits (sign bit and address bits) just propagate through the network.

The general scheme of the sorter ASIC is shown in Fig. 13.33. The design takes advantage of the fact that after the first level of sorting the candidates are ordered, so they do not have to be reordered but only merged. This allows a latency saving in the sorting network, since the first stage (Sorter4 in the Figure) takes 2 bx, while the second one (Merger) takes only 1 bx.
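The two-stage structure (full sort of four words, then merge of two ordered lists) can be sketched in Python. The bit positions used in `rank` are an assumption made for illustration: bits 0-4 are taken as the momentum code, bit 5 as the sign and bits 6-7 as the quality; the actual layout is defined in Section 13.5.3.

```python
def rank(word):
    """Comparison key: quality first, then momentum code; sign is ignored."""
    quality = (word >> 6) & 0x3   # assumed position of the quality bits
    momentum = word & 0x1F        # assumed position of the momentum code
    return (quality, momentum)


def sorter4(words):
    """First stage: fully order four input words, best first."""
    assert len(words) == 4
    return sorted(words, key=rank, reverse=True)


def merger(a, b, n_out=4):
    """Second stage: merge two already ordered lists, keep the best n_out."""
    out, i, j = [], 0, 0
    while len(out) < n_out and (i < len(a) or j < len(b)):
        take_a = j >= len(b) or (i < len(a) and rank(a[i]) >= rank(b[j]))
        if take_a:
            out.append(a[i])
            i += 1
        else:
            out.append(b[j])
            j += 1
    return out


def sorter_chip(words8):
    """Return the 4 best of 8 input words, in decreasing order."""
    return merger(sorter4(words8[:4]), sorter4(words8[4:]))
```

Because the merger only compares the heads of two ordered lists, it needs fewer comparison steps than a full sort, which is the origin of the latency saving quoted above.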

The circuit is also provided with JTAG Boundary Scan circuitry, in order to test the chip at the board level once it is installed in the apparatus. Moreover, an additional internal scan path improves the on-line testability of the chip, making use of the same JTAG control signals and protocol.



**Fig. 13.33:** Sorter general scheme

#### Timing Analysis

The timing measurements made on the SORTER chip placed on a test board running at 40 MHz show that the typical propagation time of the longest step of the pipeline (Sorter4 or Merger) is about 10 ns (Fig. 13.34). In the worst process case, this time is still less than 20 ns, guaranteeing the correct behaviour of the circuit with a 40 MHz system clock, as required in CMS. Tests made on the 10 pre-sample chips received from the foundry show that the ultimate sustainable rate, with the present version of the Sorter, is 66 MHz. A new version of the circuit, implemented on newer programmable devices such as Virtex-E by Xilinx, will allow us to achieve a circuit running at 80 MHz or more.



**Fig. 13.34:** Timing of sorter/merger

## 13.7 Simulation Results

### 13.7.1 Overview

The simulation results presented in this Section were, unless otherwise indicated, obtained with the ORCA C++ software, using the latest RPC geometry [13.32]. The subjects addressed here are: acceptance and efficiency curves of the RPC Muon Trigger, single muon trigger rates from various sources, and trigger robustness. A special section is devoted to the simulation of the Optical System transmission and its optimization.

#### Software Packages

The MTRIG software in CMSIM 118 and ORCA is written in a way which closely mimics the hardware structure with PAC processors, ghostbusting and sorting blocks; each of the software blocks embeds an algorithm very similar, if not identical, to that realized in its hardware counterpart. The input signals (digitized RPC hits) are connected to the PAC processor blocks in exactly the same way as we foresee the connections between RPCs and Trigger Crates. During the digitization, the crossing of a muon track through an RPC gives rise to a cluster of hit strips. In the simplest case only one strip is hit, but it is possible to generate clusters of varying sizes on demand; in particular, one can use the cluster size parametrization of the test data. The sorting algorithm described in Section 13.6 is also embedded in the software.

### 13.7.2 Simulation of the Pre-defined Patterns

The RPC Muon trigger software MTRIG needs two types of input: the simulated digitized muon hits in the RPC stations and the list of predefined valid patterns of hits, used by the PAC processor. The latter have to be obtained from the dedicated simulation runs. Creation of the valid patterns lists is of importance not only for the simulation of the RPC Muon Trigger but also for its operation at the startup of the experiment, as explained in Section 13.5.3.

#### List of Predefined Patterns

The list of predefined patterns (sometimes called masks) is an ordered one: it consists of classes of patterns, and each class is assigned to a definite $p_T^{\text{cut}}$. Every hit pattern (track) appears only once on it. To generate a list<sup>2</sup> of valid patterns, approximately 25000 muons for each of 24 values of transverse momentum (corresponding to the transverse momentum codes) were generated flat in $\eta$ and $\phi$, and their hits were digitized. The latest geometry and MTRIG package were used. The procedure of $p_T^{\text{cut}}$ assignment to a pattern was an involved one and required two passes through the generated list, starting from the highest momentum. During the first pass, only the unambiguous 4/4 hit patterns were selected, forming a pre-set of valid patterns<sup>3</sup>. At this stage, the patterns are ordered in transverse momentum, but no $p_T^{\text{cut}}$ values are assigned to them yet. During the second pass, the patterns from the list were compared to those from the pre-set<sup>4</sup>. The purpose of this was the comparison and matching of possible 3/4 patterns with unambiguous 4/4 ones. During this pass, the comparison was done by running the PAC algorithm on all patterns from the list, treated as RPC hits, with the patterns from the pre-set programmed into it. As a result, the rate of matching between all generated patterns and patterns from the pre-set was obtained. For a given pattern from the pre-set we knew the rate of its matching with all generated patterns of a given transverse momentum and pseudorapidity. Then, for a given $\eta$ tower, we assigned the $p_T^{\text{cut}}$ values to patterns in the pre-set, starting from the highest $p_T$, keeping 95% of the accumulated rate in the patterns above the $p_T^{\text{cut}}$ and minimizing the rate below that value. The pre-set became the set of valid masks. The last stage was probably the hardest to turn into an algorithm, and some degree of arbitrariness remained there. By construction, this arbitrariness was at the level of a percent of the total rate.

The construction of the list of predefined patterns described above depends on the PAC structure and the algorithm embedded in it. It also depends on the detector geometry used in the simulation, on the RPC segmentation, and on the connections between RPCs and PAC processors. The algorithm described above produces a predefined list of patterns which may, in principle, be different for every PAC processor, reflecting the detector structure.
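One possible reading of the final $p_T^{\text{cut}}$ assignment step is sketched below in Python. It is an interpretation, not the procedure actually used: for a single pre-set pattern it takes the number of matches per generated $p_T$ code (a hypothetical input, `match_counts`, ordered by increasing $p_T$) and returns the highest code for which at least 95% of the matches lie at or above it. The minimization of the rate below the cut is not modelled beyond choosing the highest such code.

```python
def assign_pt_cut(match_counts, keep_fraction=0.95):
    """Return the highest p_T code c such that at least keep_fraction of the
    pattern's matches come from muons generated with p_T code >= c.

    match_counts[c] is the number of matches for generated p_T code c
    (codes ordered by increasing p_T). Returns None for an unmatched pattern.
    """
    total = sum(match_counts)
    if total == 0:
        return None  # pattern never matched; no cut can be assigned
    accumulated = 0
    # Walk down from the highest p_T code, accumulating the matched rate.
    for code in range(len(match_counts) - 1, -1, -1):
        accumulated += match_counts[code]
        if accumulated >= keep_fraction * total:
            return code
    return 0
```

For example, a pattern matched mostly by high-momentum muons receives a high cut, while a pattern with a long low-momentum tail is pushed to a lower one.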

<sup>2</sup>. This list uses the relative strip numbers, i.e. the differences between a strip number in station 1, 3, or 4 and that in the reference station 2. Thus, the approximate rotational symmetry of the RPC system can be exploited. It was, however, technically quite difficult to take proper account of the complicated RPC geometry, with all its gaps etc., when calculating the strip difference.

<sup>3</sup>. For any $\eta$ tower, there are gaps in the azimuthal coverage of the four RPCs. The 4/4 selection meant that the pre-set list was formed from tracks generated in the regions of complete azimuthal coverage in four stations. These tracks represented the response of an idealized RPC system.

<sup>4</sup>. The appropriate rotation in azimuth was performed before this comparison. It was facilitated by the list structure, which contained only the relative strip numbers.

### 13.7.3 Geometry and acceptance



**Fig. 13.35:** CMSIM 118 implementation of the RPC geometry. The transverse view of the barrel is shown.

An important part of the simulation is the detector geometry embedded in the software; in this case it is the Geant geometry in CMSIM 118. The results shown in Section 13.7 were obtained using the latest optimization of the RPC geometry from June 2000, including optimization of the forward RPCs [13.26]. A transverse view of the barrel is shown in Fig. 13.35; the longitudinal segmentation was already discussed and is shown in Fig. 13.1. The mechanical structure of the RPCs is properly taken into account, together with all gaps and holes between the chambers. The resulting strip geometry is only approximately projective, since the strip widths are constant for a given (flat) RPC. There are gaps between chambers in the Barrel, and there is no azimuthal overlap in the RE2/1 forward chambers. The acceptance of the RPC Muon Trigger system obtained with this geometry is shown in Fig. 13.36, for several selected towers in $\eta$ and for muons with different transverse momenta. The acceptance for at least one, two, three or four hits in different stations is shown. In all but one of the $\eta$ towers the acceptance for 3 or more hits on a straight muon track exceeds 90%, while the acceptance for 4 hits is about 80%. The exception is the second tower ($0.06 < \eta < 0.25$), where the 3-hit acceptance drops to 80%, while the 4-hit one is about 40%. This degradation is due to the relatively large gap between the barrel central wheel and the next one.

The acceptance shows no azimuthal dependence in any $\eta$ tower except the second one.

The 3/4 and 4/4 acceptance, relevant for PACT performance, is shown in Fig. 13.37 as a function of pseudorapidity  $\eta$  and for several transverse momentum ranges.



**Fig. 13.36:** Acceptance of the RPC trigger system: probability of finding at least 1, 2, 3 or 4 hits from a muon track of a given  $p_T$  in RPC planes in selected pseudorapidity towers.

### 13.7.4 Simulation of RPC Performance

Realistic simulation of the trigger response, and especially of its robustness should include detector effects such as noise, cluster size, cross-talks, inefficiencies and time jitter. These are included in the software packages. Any combination of these can be switched on allowing for systematic studies of varying complexity.

The effects of RPC noise and inefficiencies are relatively easy to parametrize and simulate. The latter is basically multiplicative, while the former turns out to have little impact on the trigger performance (see below). The timing properties of RPCs were thoroughly studied in the test beams and the resulting time jitter is adequately parametrized by a gaussian with $\sigma = 2.5$ ns [13.1].



**Fig. 13.37:** Acceptance of the RPC trigger system: probability of 3 out of 4 and 4 out of 4 hits as a function of pseudorapidity and for four selected transverse momentum intervals.

The cluster size and cross-talk are much harder to simulate realistically, largely because there are not enough data on these from RPC tests with the final electronics, and none in the final data acquisition environment. This is unfortunate, since it turns out that their influence on the trigger performance is strong (see below).

Results presented in the following sections use the parametrization of the cluster size taken from the data [13.26]. The probability distribution of having a cluster of $N$ strips on a muon track is given by the exponential formula:

$$P(N) \propto \exp(-N/(cs))$$

where, for the RPC with 3 cm wide strips the parameter  $cs$  has the experimental value of 1.3, giving the average cluster size of 1.8 strips. Varying the value of  $cs$  we can study the effect of the cluster size on the trigger performance. The cluster size distributions for three values of  $cs$  are shown in Fig. 13.38. The cross-talk, in the absence of input data, was not simulated.
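The parametrization can be evaluated numerically with a short Python sketch. It assumes that $N$ runs over the integers starting from 1 and that the distribution is truncated at a large `n_max`; both are assumptions, since the normalization range is not stated in the text.

```python
import math


def cluster_size_pmf(cs, n_max=50):
    """Normalized P(N) proportional to exp(-N/cs) for N = 1..n_max strips."""
    if cs <= 0:
        raise ValueError("cs must be positive; cs -> 0 is the single-strip limit")
    weights = [math.exp(-n / cs) for n in range(1, n_max + 1)]
    norm = sum(weights)
    return [w / norm for w in weights]


def mean_cluster_size(cs, n_max=50):
    """Average number of strips per cluster for a given cs parameter."""
    pmf = cluster_size_pmf(cs, n_max)
    return sum(n * p for n, p in enumerate(pmf, start=1))
```

Under these assumptions `mean_cluster_size(1.3)` gives a value between 1.8 and 1.9, close to the average of 1.8 strips quoted above, and the mean tends to 1 as $cs$ approaches zero.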

### 13.7.5 Main Results



**Fig. 13.38:** Cluster size distributions from the exponential parametrization for three values of the cs parameter

The transverse momentum spectra of muons fall very rapidly. The rate of the muon trigger system is governed by the steepness of the efficiency curves near the threshold. To keep low trigger rates and obtain high purity of the trigger the efficiency dependence on the muon transverse momentum should be as steep as possible.

The efficiency curves as a function of muon transverse momentum for several selected rapidity towers are given in Fig. 13.39. Different curves on one plot show the efficiencies for a given $p_T^{cut}$, defined as the $p_T$ value at which the efficiency crosses the 90% level. As explained in Section 13.7.2, each of the $p_T^{cut}$ curves is associated with a set of predefined patterns, used in the PAC processor. The plots were obtained for muon tracks generated with a cluster size of one; no noise or inefficiency was included. The curves therefore represent the best case; they become markedly less sharp when the average cluster size increases above 3-4 strips [13.29]. There is no appreciable effect of noise. The RPC inefficiencies basically result in a multiplicative decrease in the trigger efficiency for high $p_T$ muons and have very little effect on the sharpness of the threshold curve. From Fig. 13.39 it is clear that there are some towers in $\eta$ with lower efficiencies at higher transverse momenta and/or less steep thresholds. One such tower with lower overall efficiency is the second one ($0.06 < \eta < 0.25$); the explanation for that was already given in Section 13.7.3. The low efficiency and flatter efficiency curves in the 10th tower ($1.24 < \eta < 1.36$) are due to a different reason: there is no RPC in the endcap in the region of large magnetic field at the end of HE (outside of RE1/1 coverage). Hence the momentum resolution of the trigger system and the overall acceptance are both impaired there.



**Fig. 13.39:** Examples of efficiency curves for selected $\eta$ towers as a function of transverse momentum $p_T$.

The stability of the trigger rates against uncertainties in the background and noise levels was specially studied [13.31]. Trigger rates for single muons from prompt sources are shown in Fig. 13.40 [13.31]. The solid line shows the prompt muon rate at the vertex. Three dashed curves show the influence of cluster size, RPC noise, and neutron background as predicted by [13.27]. The rates are strongly influenced only by the cluster size; going from a single strip ($cs=0$) to an average of 1.8 strips ($cs=1.3$) changes the rate at high $p_T$ by a factor of 2. The summary is shown in Fig. 13.41. For the nominal average cluster size, noise and background rates, the rate of fake triggers at high $p_T$ is about one tenth of that from the prompt muons given in Fig. 13.40. To make the fake rates comparable with those from real muons, the noise and background rates would have to increase by a factor of five.

### 13.7.6 Transfer Losses in the OCS

Optimization of the Optical Communication System (OCS), and in particular of the compression/decompression scheme realized by the LMUX/LDEMUX chips [13.13], required a dedicated simulation.

To a first approximation, the OCS transfer loss influences the trigger's performance multiplicatively, just like any other mechanism leading to chamber inefficiency. Thus, a 1% loss is equivalent to a 1% inefficiency. In reality, we would require losses smaller by at least a factor of 10 to 100.
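The multiplicative treatment can be made concrete with a small Python sketch. It assumes four independent RPC planes with a common hit efficiency and a 3-out-of-4 coincidence requirement; geometrical acceptance is ignored, and the 95% chamber efficiency used in the example is an illustrative value, not a measured one.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of at least k successes in n independent trials."""
    return sum(comb(n, m) * p**m * (1 - p) ** (n - m) for m in range(k, n + 1))


def trigger_efficiency(chamber_eff, transfer_loss, k=3, n=4):
    """Coincidence efficiency when the link loss scales the plane efficiency."""
    p = chamber_eff * (1.0 - transfer_loss)
    return prob_at_least(k, n, p)


# Illustrative numbers only: 95% chamber efficiency, with and without 1% loss.
lossless = trigger_efficiency(0.95, 0.0)
with_loss = trigger_efficiency(0.95, 0.01)
```

In this model a transfer loss is indistinguishable from a lower chamber efficiency, which is why the requirement on the loss is set well below the percent level.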



**Fig. 13.40:** Single muon rates within $|\eta| < 2.1$ as a function of $p_T$.

The dominant hit rate in the RPCs, which has to be transferred via the OCS, comes from the machine-induced background (neutrons/gammas, charged particles etc.). We have used the background rates from [13.27], and to test the system robustness we also applied a safety factor of two to them. RPC noise of 10 Hz/cm$^2$ was added as well. The RPC performance, in particular the cluster size, the cross-talk between adjacent $\eta$ regions of a chamber<sup>5</sup> and any correlations in time, is another vital ingredient of this simulation.

We decided to err on the cautious side and, if anything, to overestimate the number of transferred bits. Thus, instead of using the cluster parametrization described in Section 13.7.4, we estimated the maximum cluster multiplicity from the RPC prototype irradiated in the Gamma Irradiation Facility (GIF) in 1998/99 [13.30]. The data from GIF have an advantage over muon test beam data: they contain clusters caused by gammas, which might be different from those from minimum ionizing muons. The gamma hit rate available covers the whole range expected in the CMS experiment. Thus we had a pessimistic estimate consistent with the GIF data from one chamber prototype. It has to be stressed that we have deliberately chosen the worst case we found during the beam tests.

<sup>5</sup>. This is of particular importance in the endcaps, where one RPC subtends several towers in $\eta$.

**Fig. 13.41:** The fake trigger rates due to RPC noise and neutron background. Different curves correspond to nominal and five times bigger source rates and are calculated for 3 different cluster sizes.

The layout of the OCS, in particular the envisaged number of strips per link in various regions of the detector, was described in [13.9]. This scheme was optimized for the lowest number of links, i.e. low cost, assuming the simplest possible Link Boards. It is important to note that the presently envisaged scheme of link connections is based on physical chambers; each of the chambers has a Link Board with a number of links connected to it (from 1 link per 3 RPCs in the barrel to 2 links per RPC in the endcaps). For the barrel this causes no problem since strips in one RPC are not divided (in $\eta$). For the endcap RPCs, which span several towers in $\eta$ and only 10 or 20 degrees in $\phi$, it leads to the data from different towers of the same chamber going through the same link. Thus, the cross-talk between $\eta$ towers within an RPC becomes important. The alternative would be to complicate the Link Boards and send the data from the same $\eta$ tower, but from several adjacent RPCs, through the same link. This solution is more costly, since it increases the number of links by ca. 30%, and it also requires more complicated Link Boards.

Careful analysis of the OCS performance, including all possible effects mentioned above (such as RPC cross-talk) [13.30], led to the following conclusions:

1. Transfer losses in the barrel are negligible (typically smaller than $10^{-4}$, everywhere below $10^{-3}$ per link) even under pessimistic assumptions on clusters and cross-talk in the RPC, and assuming a machine background a factor of two higher than nominal.
2. Transfer losses in the endcaps are also negligible if there is no cross-talk between the $\eta$ towers. For 100% cross-talk between adjacent towers, the losses reach an unacceptable 2% at the outer regions of RE4 for noise of $10 \text{ Hz/cm}^2$ and background rates a factor of two higher than nominal. Clearly, the cross-talk probability has to be studied in more detail in the full-scale RPC prototypes now being constructed.
3. Cluster size depends on the RPC running conditions (HV, gas mixture). It has to be studied and optimized for full scale endcap RPC prototypes, and possibly individually adjusted during running.

Thus we conclude that the present OCS design is robust enough, but the RPC performance, especially in the forward region, still needs to be studied and optimized.

### 13.7.7 Robustness of the RPC Muon Trigger

In this Section we gather together and summarize the studies on the trigger robustness. To quantify the subject we have studied the influence of various parameters on the trigger rates:

1. Probably the most important single factor influencing the single muon rates is the RPC cluster size. Large cluster sizes result in flatter efficiency curves and in the promotion of low transverse momentum muons to higher $p_T$, which leads to lower trigger purity. We have checked that up to average cluster sizes of order 3 strips the rates at high $p_T$ are kept at the acceptable level of a few kHz, while the rate dependence on the $p_T^{\text{cut}}$ shows a reasonably sharp fall between 20 and 100 GeV/c. This is important for adjusting the threshold level in the GMT.
2. The rates are stable (within 10%) when the RPC noise and the machine-induced background are varied by a factor of five with respect to their nominal (simulated or measured) values.
3. Geometrical misalignment of the RPC chambers up to 0.5 cm in  $r\phi$  direction has practically no influence on the pre-defined PAC patterns, hence no influence on the trigger rates. There are also procedures foreseen to detect misalignment effects from the RPC data and appropriately correct the pre-defined patterns, when necessary.
4. We have identified several detector regions (see e.g. Section 13.7.5) which are more sensitive or less robust than the others, and we understand the reasons behind that, even when we cannot improve the performance there. In case of unacceptable trigger performance in these regions we are preparing strategies, which will allow us to run with decreased efficiency but acceptable rates in these regions.
5. The simulation results were mostly obtained with a noise rate of $10 \text{ Hz/cm}^2$. However, the RPC trigger design has significant built-in flexibility to handle rates that are much higher. Recent technology evolution permits the use of higher bandwidth optical links of 2.4 GHz, in place of the 1.2 GHz links foreseen. This higher link bandwidth would prevent much higher noise and background rates from saturating the hit transmission system. The Trigger Board that processes the RPC hits has an FPGA alongside the PAC chip. This FPGA can flexibly implement additional coincidences of chamber planes to suppress false triggers from noise hits. The evolution of FPGA technology allows the implementation of even more sophisticated and flexible logic on the RPC Trigger Board than originally planned. In addition, for a small decrease in efficiency, an increase in chamber threshold also significantly reduces the noise rate. Other fully programmable adjustments include narrowing the time gate for RPC hits and selecting more restrictive hit patterns.

## 13.8 Latency, Synchronization, BX Identification

### 13.8.1 Latency

The main contributions to the latency are given in Table 13.5.

**Table 13.5:** Main latency contributions

| Subsystem                                                   | latency in bx. |
|-------------------------------------------------------------|----------------|
| synchronization and data compression on the detector        | 8              |
| optical fiber delay (90 m)                                  | 18             |
| Trigger Board (splitters, demultiplexing, PAC, GB, sorting) | 29             |
| Final sorting tree, link to the GMT                         | 24             |
| Total                                                       | 79             |

The latency due to the multiplexing and demultiplexing in the OCS is now fixed at 8 bx. This must be considered as an upper limit since the LB design is not yet frozen. In view of rapid progress in the optical telecommunications it is not excluded that we will finally decide on the 2.4 Gb/s links, thus freeing some latency (2-3 bx).

The Trigger Board latency of 29 bx is largely due to the sorting and ghostbusting; the trigger decision proper inside a PAC takes only 3 bx. With the appearance of large and relatively cheap FPGAs, which may replace the sorting ASIC described in Section 13.6.4, the sorting can be carried out more quickly (by as much as 8 bx). The cost implications of that are being continuously studied. The same comment applies to the final sorting latency of 24 bx.

To summarize, the latency contributions given in Table 13.5 are the best conservative estimates obtained from the studies of the prototypes. The final latency is likely to be lower than 79 bx.
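The budget of Table 13.5 can be cross-checked with a few lines of Python. The sketch assumes a nominal 25 ns bunch-crossing spacing; the dictionary simply restates the table entries.

```python
BX_NS = 25.0  # nominal bunch-crossing spacing in ns (assumed exact here)

# Latency contributions in bunch crossings, restated from Table 13.5.
LATENCY_BX = {
    "synchronization and data compression on the detector": 8,
    "optical fiber delay (90 m)": 18,
    "Trigger Board (splitters, demultiplexing, PAC, GB, sorting)": 29,
    "Final sorting tree, link to the GMT": 24,
}

total_bx = sum(LATENCY_BX.values())
total_ns = total_bx * BX_NS

# Implied signal delay per metre of optical fibre.
fiber_ns_per_m = LATENCY_BX["optical fiber delay (90 m)"] * BX_NS / 90.0
```

The contributions add up to the quoted total of 79 bx, and the fibre entry corresponds to 5 ns per metre under the 25 ns assumption.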

### 13.8.2 The Synchronization Unit

Good detector time properties are important for the performance of the Muon Trigger, where the short bunch crossing time separation (25 ns) imposes strict requirements on time resolution and “time walk”. The CMS Trigger System has a tree-like structure and the data flow through the entire chain is synchronous, driven by the 40.08 MHz clock.

The LHC Control system determines the interaction moment by driving RF and magnet currents. Its clock is distributed by the TTC system [13.2] to Front End and Trigger electronics. Particles created at the Interaction Point have non-negligible time of flight to the detectors. Generated detector signals are sent from the Front End boards to the Trigger Processors. The following requirements have to be fulfilled:

- Incoming data should be in phase with the local clock (clock phase adjustment)
- Data from different sources should correspond to the same bunch crossing bx (relative bx synchronization), and the data should correspond to the bx given by the local TTCrx (absolute bx synchronization)
- Phase adjustment at digitization: analog signals from the detectors have to be of a correct phase with respect to the clock in order to be correctly digitized and processed. The signals often have jitter due to time of flight, detector response and signal propagation.
- Phase adjustment of digitized signals: once the signals are digitized they have only electronics jitter, which is usually below 1 ns. However, when the signals are sent to another board they might have a constant shift in phase with respect to the local clock at the destination. The rule to be followed is to synchronize the phase of the signal at the destination to the local clock.

In order to synchronize the system, real data or test data can be used. The first iteration, however, should be done without data, by measuring and calculating all the delays in the system. Test data provide an efficient way for partial synchronization of the system. However, final synchronization can only be done with real data, because there is no other way to measure precisely the path between the LHC Control master accelerator clock and the Front End synchronizers.

The synchronization procedures are explained extensively in [13.3]; here we show only the various sources of signal timing uncertainty and how they may be adjusted to allow the correct bx alignment.

The heart of the RPC Trigger system is the PACT [13.4], which identifies tracks and assigns them the right transverse momentum value. The components that precede the PAC in the system have to be analyzed first.

The block diagram of the RPC PACT is shown in Fig. 13.3, the data flow and functionality of its various components are briefly described in Section 13.2. The TTC network is not shown on the Figure. There is one TTCrx per Link Board and one per each crate in the counting room.

The main components relevant for synchronization are the Synchronization Unit (SU) and the Synchronization Buffer (SBUF) residing on the Link Board. The SU performs the WINdow phase adjustment. The SBUF is used for bx synchronization. Synchronization with test data is performed by the test data generator (TEST DATA) and the error detection circuit (CHECK).

The total delay time of the signals  $T_{\text{tot}}$  between the collision time and the signal arrival at the output of a FE has 4 contributions:

$$T_{\text{tot}} = T_{\text{flight}} + T_{\text{RPC}} + T_{\text{prop}} + T_{\text{FE}}.$$

$T_{\text{flight}}$ is due to the time of flight of the particle from the interaction point to the detector. $T_{\text{RPC}}$ depends on the processes of signal generation inside the RPC and on the drift towards the electrodes, and $T_{\text{prop}}$ represents the signal propagation time along the read-out strip towards the FECs.

Finally, $T_{\text{FE}}$ represents the delay time introduced by the amplification and discrimination process, which is affected by three main sources of uncertainty. The first one is related to the “time-walk” introduced by the discrimination method used. In our case, using a zero-crossing discrimination technique, we reduce this error source to nearly zero, exploiting the fact that RPC signals have the same rise time and the same peaking time. The second one is due to the limited gain-bandwidth product of the discriminator, which introduces the dominant contribution to the error at small charge overdrive. The last is the “time-jitter” due to the electronic noise which in our case, in terms of equivalent noise charge (ENC), is less than 2 fC. The contribution from the preamplifier and discriminator jitter is usually much smaller than 1 ns; in fact, a final value of $\sigma = 0.7$ ns for $T_{\text{FE}}$ is obtained by weighting the signal time distribution with the probability of occurrence of each charge value given by the charge spectrum (Fig. 13.42) [13.5]. Moreover, it is difficult to distinguish $T_{\text{FE}}$ experimentally from the intrinsic RPC jitter, and therefore the measured value of $\Delta T_{\text{RPC}}$ often contains $\Delta T_{\text{FE}}$.



**Fig. 13.42:** 2 mm gas gap RPC signal time distribution

The total jitter allowed ($\Delta T_{\text{tot}}$) must be lower than 25 ns in order to recognize the bunch crossing. The two major contributions to the signal uncertainty are, as stated above, ($T_{\text{flight}} + T_{\text{prop}}$) and $T_{\text{RPC}}$. Assuming 1-3 ns for the setup time of the synchronization electronics and taking 5-7 ns as the maximum variation for the first contribution, one gets 15-18 ns remaining for the jitter introduced by the RPC. In the case of a gaussian distribution this would correspond to $\sigma_{\text{RPC}} < 3.0$-3.5 ns at 99% efficiency. This value has been confirmed by measurements made in the test beam.
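The conversion from the remaining time window to a limit on $\sigma_{\text{RPC}}$ can be reproduced with a short Python sketch. It assumes a gaussian time distribution centred in a two-sided window; the function name is hypothetical.

```python
from statistics import NormalDist


def max_sigma(window_ns, efficiency=0.99):
    """Largest gaussian sigma for which a centred window of width window_ns
    contains the given fraction of the signals."""
    # Two-sided containment: efficiency = P(|t| < window/2).
    z = NormalDist().inv_cdf(0.5 + efficiency / 2.0)
    return window_ns / (2.0 * z)


sigma_low = max_sigma(15.0)   # tightest budget quoted in the text
sigma_high = max_sigma(18.0)  # loosest budget quoted in the text
```

With these assumptions the 15-18 ns window gives limits of roughly 2.9 ns and 3.5 ns, consistent with the 3.0-3.5 ns range quoted above.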

From the point of view of the detector, test beam results have shown [13.1] that one can reach an average value of  $\sigma = 1.6$  ns for a 2 mm gas gap RPC of  $130 \times 120 \text{ cm}^2$ . The 2 mm gas gap RPC chamber efficiency is plotted in Fig. 13.43 as a function of the time (with arbitrary zero) at which a 25 ns gate is opened. To achieve efficiencies higher than 95%, the time plateau widths are limited to 15 ns.



**Fig. 13.43:** Chamber efficiencies versus start time window

### The Synchronization Unit

The Synchronization Unit (SU) is a circuit which shapes the detector signals and aligns them with the LHC clock. It works as follows (see Fig. 13.44). From the LHC clock (denoted CK) provided by the TTCrx, a WINdow signal is derived. Its width and phase can be adjusted from 0 to 25 ns in 1 ns steps. The width should be adjusted so that the rising edge of INPUT from the detector always falls within the high level of WINdow; the WINdow should therefore be wide enough to contain the jitter $\Delta T_{\text{tot}}$. The coincidence of the WINdow signal and the rising edge of INPUT generates the OUTPUT signal, which is 25 ns wide and in phase with the clock. The OUTPUT can be delayed by 0-3 clocks using the Synchronization Buffer (SBUF).
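The behaviour of one SU channel can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: clock edges are taken at multiples of 25 ns, the WINdow opens `win_phase_ns` after each clock edge, and OUTPUT is asserted on the clock edge following the window in which the INPUT edge was seen. The function and parameter names are hypothetical and do not describe the FPGA implementation.

```python
BX_NS = 25.0  # LHC bunch-crossing period in ns


def synchronize(edge_ns, win_phase_ns, win_width_ns, sbuf_clocks=0):
    """Return the clock index on which OUTPUT is asserted, or None if the edge misses WINdow.

    edge_ns      -- arrival time of the INPUT rising edge
    win_phase_ns -- delay of the WINdow opening after each clock edge (0..25 ns)
    win_width_ns -- WINdow width (0..25 ns)
    sbuf_clocks  -- additional SBUF delay (0..3 clocks)
    """
    if not 0 <= sbuf_clocks <= 3:
        raise ValueError("SBUF delay must be between 0 and 3 clocks")
    # Position of the edge relative to the opening of the current window.
    offset = (edge_ns - win_phase_ns) % BX_NS
    if offset >= win_width_ns:
        return None  # edge arrived while WINdow was low
    window_index = int((edge_ns - win_phase_ns) // BX_NS)
    # Assumption: OUTPUT is aligned to the clock edge after this window.
    return window_index + 1 + sbuf_clocks


# Two edges 15 ns apart that fall in the same window map to the same clock.
print(synchronize(30.0, 3.0, 20.0), synchronize(45.0, 3.0, 20.0))
```

In this model, edges that differ only by jitter smaller than the window width are assigned to the same clock, which is the purpose of the alignment.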

The SU is implemented in the programmable devices (FPGAs) that will sit on the Link Board. The electrical scheme of one channel of the Synchronizer is shown in Fig. 13.45. Inside the same device as the Synchronizer, one 16x128 words deep FIFO has been included to allow histogramming of the RPC occupancy on a channel-by-channel basis. This is very useful for ensuring correct synchronization with the LHC clock and for revealing any dead channels. During a test run this FIFO is filled on any LHC clock, in order to adjust both the width of the WINdow signal and its phase with respect to the clock. Test data patterns are pre-loaded into another FIFO residing on the Link Board and devoted to this function. The histogramming FIFO may also be filled with real data on any bunch crossing, to continuously check the status of the entire electronics and to detect any malfunction.



**Fig. 13.44:** Alignment of RPC signals



**Fig. 13.45:** Synchronizer channel implementation scheme

## 13.9 Diagnostics and Calibration

### 13.9.1 General description

The Diagnostics capabilities of the RPC Muon Trigger System are based on monitoring and tests, as described below.

**Monitoring** allows the observation of the activity of the whole electronics system, detecting and registering malfunctions as they appear. Monitoring information is transferred via a commercial local optical bus (e.g. CAN standard) to a Computer Diagnostic Manager System. The monitoring system provides information on the quality of the Muon Trigger signals, and delivers essential physical information (e.g. strip rates in the RPC chambers).

**Tests** allow the analysis of the electronics function on the basis of test vectors (e.g. from a digital sequencer) introduced by means of programmable simulators. The test results are diagnosed through the monitoring system.

Efficient diagnostics demands that diagnostic modules be placed at all critical points, and that the RPC Muon Trigger electronic channels be divisible into sections which provide unique performance signatures that may be analyzed separately and/or hierarchically. The common functional structure of diagnostic modules is shown in Fig. 13.46.



**Fig. 13.46:** Common functional structure of diagnostic blocks

Each diagnostic block surrounds the processing layer with two diagnostic layers.

**A simulation layer** provides tests by means of programmable simulators. Input simulators allow the introduction of programmable input data sequences to a process through an input selector. The DPMs and FIFOs may be used to input the test vectors. Output simulators send fixed data to further parts of the RPC Muon Trigger system via an output selector.

**A monitoring layer** is built of two elements:

1. An analyzer, a block for the detection and analysis of discrepancies. The functional structure of an analyzer depends on the analyzed process, and is implemented in a programmable electronic chip (ASIC, PLD etc.).
2. A readout, registering statistical information (e.g. quantity of detected irregularities, histograms etc.) and all information relevant to the appearance of the discrepancy (e.g. input and output data for 3 consecutive bx).

There is a difference between working with the CMS experiment global trigger (L1Accept) and the local diagnostic trigger. The global trigger is of a single-source type and its bx number is always known. The minimum separation of consecutive events (bx) is guaranteed.

Diagnostic trigger processes may, however, have many sources and appropriate bx numbers must be associated with data from every source separately, as registered events may overlap.



**Fig. 13.47:** Functional structure of Diagnostic Readout

The functional structure of Diagnostic Readouts is shown in Fig. 13.47. It consists of:

- Pipeline memory, which delays the data stream for a fixed number of clocks in order to synchronize it with trigger signals,
- Derandomizer, which selects data from the pipeline and stores it in a DPM together with information about trigger type and bx number,
- Statistical analyzer, which provides statistical information (e.g. chamber rate). It consists of programmable registers and counters,
- Synchronizer, which synchronizes Diagnostic Readout timing with TTC signals (clock, L1Accept etc.),
- Communication interface, which connects the Diagnostic Readout to the Computer Diagnostic Manager System. Monitoring is done by a local optical bus.
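The interplay of the pipeline memory, the derandomizer and the statistical analyzer listed above can be sketched in software. This is a simplified model under stated assumptions: one data word per bx, a trigger that arrives exactly `depth` clocks after the data it refers to, and a statistical analyzer reduced to a single counter of non-empty words. The class and attribute names are hypothetical.

```python
from collections import deque


class DiagnosticReadout:
    """Toy model: fixed-latency pipeline, derandomizer into a DPM, one rate counter."""

    def __init__(self, depth):
        self.depth = depth
        # Pipeline memory: a delay line of `depth` clocks, initially empty.
        self.pipeline = deque([None] * depth, maxlen=depth)
        self.dpm = []        # stands in for the dual-port memory
        self.hit_count = 0   # statistical analyzer: number of non-empty words
        self.bx = 0          # local bunch-crossing counter

    def clock(self, data, trigger_type=None):
        """Advance by one bx; store the delayed word if a trigger is present."""
        delayed = self.pipeline[0]   # word that entered `depth` clocks ago
        self.pipeline.append(data)   # the oldest word drops out automatically
        if data:
            self.hit_count += 1
        if trigger_type is not None and delayed is not None:
            # Derandomizer: keep trigger type and the bx number of the data.
            self.dpm.append((trigger_type, self.bx - self.depth, delayed))
        self.bx += 1


readout = DiagnosticReadout(depth=3)
readout.clock(0b101)                    # data at bx 0
readout.clock(0)
readout.clock(0)
readout.clock(0, trigger_type="L1A")    # trigger arrives 3 clocks later
print(readout.dpm)
```

Tagging each stored word with its own trigger type and bx number is what allows events from several diagnostic trigger sources to overlap without ambiguity.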

The presently envisaged diagnostic tasks of the RPC Muon Trigger system are described in detail in [13.7]. We envisage a system giving out statistical information on rates and data transfer quality (e.g. empty or lost data packets). The system will work in the monitoring mode during normal data taking. The diagnostic information from different levels may be requested in the dedicated runs, when corresponding input simulators will be switched on.

### 13.9.2 Diagnostics and Calibration on the Detector

The trigger electronics on the detector consists of FEBs and LBs. The former contain the tools to introduce test pulses to the inputs of FECs, but there is no monitoring and diagnostic readout foreseen there. These, together with the control of FEB test pulses, are provided by the LBs, which also contain many tools for analyzing the FEB data quality. The LBs also contain monitoring and diagnostic tools for synchronization circuits and LMUXes, and are equipped with a dedicated diagnostic readout.

FEB testing procedures are crucial and have a threefold purpose:

- a) to check the FEB connectivity,
- b) to monitor dead front end channels,
- c) to set the correct phase between the LHC clock and the WINdow signal for synchronization.

The test pulse is differentiated inside the FE chip, in order to provide a negative delta-like pulse simulating the RPC signal, corresponding to the trailing edge of the test pulse. The test pulse width should be 25 ns and the repetition rate should be less than 10 MHz.

The Synchronization Unit (SU) sitting on the Link Board (one per 96 channels at most) stores the FEC output data if they fall within a pre-defined time window within a bunch crossing period, and synchronizes them to the next bunch crossing period. Each SU contains its own rate histogramming for every data channel.

To test the trigger electronics chain after the Synchronizer we use a special memory running at 40 MHz at the Link Board level. In the synchronization part of the Link Board the WIN pulses re-synchronize the IN pulses coming from the RPC. Taking into account the many sources of delay introduced in the signal transmission (FEC internal delay, board delay, cable delay, etc.) we must calibrate everything in such a way that signals coming from the same RPC fall in the same window interval. The test patterns sent to FEBs are stored into one FIFO sitting on the Link Board.

### 13.9.3 Diagnostics and Calibration in the Counting Room

Diagnostics and monitoring of the Counting Room electronics is divided into six functional parts, corresponding to the physical division of the system: Splitter, LDEMUX, Phi Ghostbuster, Eta Ghostbuster, Last Sorter layer, and Master and Slave Readout Boards. In accordance with the general philosophy described above, the tools at a given level are used to debug the preceding level.

At the Splitter, the first element of the Counting Room electronics, special diagnostic triggers are foreseen to capture the different aspects of the data received from the link. They can operate on the normal data or on the simulated test pulses. Test data to the Splitter may be introduced at its input. We foresee the monitoring histograms for all links, which may be selected for further analysis after their transmission through the dedicated diagnostics readout.

The LDEMUX, located on the Trigger Board, receives the compressed link data. At that level we can monitor the same data, real or simulated (test patterns), as at the Splitter level. The test patterns introduced at the Splitter may serve for the TB debugging.

The Phi GB diagnostics are mainly used to debug the preceding array of PAC processors. Test patterns injected at this level are useful for the debugging and monitoring of the sorter chain. To address and localize specific problems which may occur within this chain, we foresee two other levels, the Eta GB and the Last Sorter, where test patterns may be injected and/or results read out. At the Last Sorter a final check of the RPC Trigger system is possible.

The Master and Slave Readout Boards are the branch of the data stream connected to the main DAQ. Input level diagnostics are similar to those at the LDEMUX level.

## 13.10 Milestones, Prototypes, Test Results

### 13.10.1 Important Passed Milestones

The first versions of the PAC and Sorter ASIC prototypes were successfully tested on a test bench (D303). The second, final version of the PAC ASIC will be tested before Nov. 2000. This will lead to the completion of the D311 milestone (Final choice of PAC technology) in 2000.

The Trigger Board (D305) and Readout Board (D306) are conceptually designed.

The Trigger Board passed the Interim review (D307).

The tests of various prototypes are described in more detail below.

### 13.10.2 Tests of Data Compression

LMUX and DEMUX tests were realized in Altera CPLD devices in 1997. Descriptions of the compression and decompression algorithms were written in AHDL (Altera Hardware Description Language). Using Altera's MAX+PLUS II development system, compilation and simulation of the algorithms were performed for different Altera devices. The Altera EPF8820-AGC192-4 was selected to construct the prototype of the compression/decompression test circuit. Because of the size of the selected Altera circuits, the test project assumed the following set of compression/decompression parameters:

- 20-bit data word,
- 4-bit partition word (5 partitions in total),
- maximum delay set to 4.

The compression efficiency was not very high (approximately 2) in this case. However, the aim of the tests was to check the correct function of the algorithm, and its timing.
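The test parameters above can be illustrated with a minimal sketch of partition-based zero suppression. It assumes that only non-empty partitions are transmitted, each tagged with its partition number; the maximum-delay handling and the actual link frame format are not modelled. The function names `compress` and `decompress` are hypothetical.

```python
WORD_BITS = 20                     # test data word length
PART_BITS = 4                      # partition width
N_PARTS = WORD_BITS // PART_BITS   # 5 partitions in total


def compress(word):
    """Return (partition number, partition data) pairs for the non-empty partitions."""
    if not 0 <= word < (1 << WORD_BITS):
        raise ValueError("word does not fit in 20 bits")
    mask = (1 << PART_BITS) - 1
    frames = []
    for part in range(N_PARTS):
        data = (word >> (part * PART_BITS)) & mask
        if data:  # empty partitions are suppressed
            frames.append((part, data))
    return frames


def decompress(frames):
    """Rebuild the 20-bit word from the transmitted partitions."""
    word = 0
    for part, data in frames:
        word |= data << (part * PART_BITS)
    return word


frames = compress(0x00A01)
print(frames, hex(decompress(frames)))
```

With such a scheme a word with all five partitions occupied needs five frames, which is why the number of non-empty partitions per bx and the allowed delay determine the achievable compression.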

The test circuit was built as a VME board. Test data were generated by an FDPM card (CERN, RD12). Results were checked with an HP LSA 1662C. Typical results of the tests are presented in Fig. 13.48.

The total latency of the compression and decompression circuits is 9 clock periods. The LMUX and LDEMUX circuits each add one additional period of latency. The maximum frequency of the LMUX was 19.7 MHz, and that of the LDEMUX 29.6 MHz. The total maximum frequency of the LMUX/DEMUX test circuit (with Flex 8820AGC-4) was 24 MHz.

The synchronous compression/decompression algorithm was investigated. The simulation results for the Altera device were compared with measurements on a test device. The agreement between simulation and measurements was fair. This is encouraging for further work on compression and decompression circuits.



**Fig. 13.48:** Domain screenshot for one bx data containing 5 non-empty partitions

### 13.10.3 Irradiation Tests for the Link and FEB Components

#### FEB Irradiation Tests Results

In most parts of the muon system the integrated dose over 10 years of LHC operation is less than 1 Gy. An FE board with two chips was irradiated with an 11 mCi $\gamma$ source ($^{60}\text{Co}$) at an irradiation facility available at Bari University, achieving an integrated absorbed dose of 1.6 Gy after one year of operation. The performance was checked after irradiation to spot possible damage. A similar test was performed with neutrons using the LENA reactor at Pavia University. At the position where the test FE board was placed, the total neutron flux was $5.9 \times 10^5 \text{ n/cm}^2 \text{ s}$, with energies ranging from less than 0.4 eV up to 10 MeV, distributed according to the values shown in Table 13.6.

**Table 13.6:** Fluxes and total doses of neutrons in the FEB irradiation test

| Neutron energy                          | $F_n$ ($\text{n/cm}^2\,\text{s}$) | Total ($\text{n/cm}^2\,\text{s}$) |
|-----------------------------------------|-----------------------------------|-----------------------------------|
| $E_n < 0.4 \text{ eV}$                  | $1.6 \cdot 10^3$                  |                                   |
| $0.4 \text{ eV} < E_n < 10 \text{ keV}$ | $3.4 \cdot 10^5$                  | $5.9 \cdot 10^5$                  |
| $10 \text{ keV} < E_n < 10 \text{ MeV}$ | $2.5 \cdot 10^5$                  |                                   |

After these tests we conclude the following: for neutrons having energies between 0.4 eV and 10 MeV we reached a total dose of $4.9 \cdot 10^{10} \text{ n/cm}^2$, corresponding to 10 LHC years in the Barrel, and for neutrons having an energy less than 0.4 eV we obtained a total dose of $3.6 \cdot 10^{13} \text{ n/cm}^2$, corresponding to more than 10 LHC years of operation in the endcaps. After these tests no permanent damage and no significant variations in the chip performance (gain sensitivity and noise, timing degradation and power absorption variation) have been observed. At the same time, another two Front End boards were given a total dose of about 24 Gy of $\gamma$ irradiation, with no appreciable changes in operation observed. Proton irradiations were also performed at Jyvaskyla University in Finland, where a 75 MeV proton beam with a current of 120 pA was used to measure SEU and latch-up effects, using, in the latter case, a special circuit to count latch-up occurrences and to protect against them. The total dose (fluence) received by the FE chip was $5 \cdot 10^{12}$ protons/cm<sup>2</sup> and no evidence at all of the mentioned effects was seen during the first run in February 2000.

### 13.10.4 LINX - the Data Transfer Test System

A versatile prototype LINX module has been designed to carry out real-time tests of the RPC data transfer. LINX is a PCI board that can also be used in stand-alone mode through either an RS-232 or a USB interface. The key component on the board is a Xilinx Virtex XCV300 FPGA. Initial synthesis results show that the RPC data compression and decompression algorithm designs take up only a fraction of this FPGA.

The LINX includes an on-board fibre optic interface based on the AMCC S2061B serial backplane device and a commercial short wavelength optoelectronic transceiver component (VCSEL+PIN diode). Interfacing and configuration tasks are handled by a Motorola DSP56301 [13.12].

A test readout chain for RPC detectors has been realized with LINX modules. For carrying out development work without real RPC chambers, one to three LINX modules can be used to emulate data from an RPC. Data are transferred via LVDS cables to a LINX module functioning as an LB prototype. On this board the data are compressed and sent via the fibre optic link to another LINX serving as a Splitter and transferring the data to several destinations via LVDS cables. The chain is completed by a LINX module receiving the LVDS data and housing the LDEMUX. From there the data can be sent back to the FEB emulator boards via the fibre optic link that would otherwise be idle on these boards and compared with the data that was originally sent [13.9].

Many different aspects of the data transfer system have already been studied with the LINX prototype modules. Some encouraging results have been obtained in irradiation tests, where the LINX module was used to monitor SEUs (Single Event Upsets) in the link components. Synchronization issues will be thoroughly studied in the May 2000 LHC-like test beam at CERN, where a real RPC chamber will be read out using the prototype readout chain.

### 13.10.5 PAC Prototypes 1 and 2

#### Tests of the PAC ASIC Version 1

Fig. 13.49 shows the general scheme of the first version of the PAttern Comparator (PAC) processor. The PAC processor is described in Section 13.5.3. Because of the limitations of the 0.7 μm ES2 technology, version 1 of the PAC contained only 4 track programming blocks, only one quality bit, and a cascading circuit to build the segment processor from 5 PAC ASICs. The final version of the PAC in 0.35 μm will contain the logic necessary to build the segment processor in one ASIC, and the cascading circuit will not be needed. The quality bit calculation circuit has been designed using dynamic logic. The PAC was programmed by writing into JTAG internal registers. JTAG internal registers 2-5 were used to program the 4 track and code programming blocks. JTAG internal register 1 was used to program the input and output masks and the configuration of the cascade circuit.



**Fig. 13.49:** PAC version 1 - General scheme

Fig. 13.50 shows the layout of the PAC testing setup. It contains the PAC test board, JTAG Controller VME board, FDPM generator, HP 8110A Pulse Generator and HP1662C LSA Analyzer. Tests are driven by the VME controller running under Sun Solaris. Fig. 13.51 shows the scheme of the PAC test board, and Fig. 13.52 is its photo.



**Fig. 13.50:** Layout of the PAC version 1 test setup

Ten PAC version 1 devices have been delivered and tested. A frequency limitation of 30 MHz was detected, due to the dynamic logic implemented in the quality bit calculations.



**Fig. 13.51:** The functional scheme of PAC test board



**Fig. 13.52:** PAC test board

#### Tests of the PAC ASIC Version 2

The test environment for the version 2 PAC differs from that for version 1 mostly in the more sophisticated online software. The test board is similar to that shown in the preceding section. The general outline of the testing setup has not changed, except that we use customized CMS fast pattern units instead of FDPMs. The online packages are closer to what we envisage for the final diagnostic software, and are based on a database of predefined patterns.

The tests are not yet finished (as of November 2000), but up to now the PAC version 2 prototype performs according to specifications. Some minor errors in the design have been detected, and will be corrected in the final design.

### 13.10.6 Sorting ASIC Prototype

Preproduction and testing of the sorting ASIC prototype was one of the milestones for 1997. We received 15 Sorter prototypes in ceramic PGA packages, each having a total of 256 I/O pins.

The Test Board is a VME module consisting of a VME slave interface, some control logic implemented in an FPGA, 14 FIFOs and 1 Sorter. The FIFOs are used as the interface between the user and the Sorter: the input FIFOs are loaded via VME with a set of 512 test patterns at the VME clock frequency (16 MHz). Then a Start command enables the sorting: the input FIFOs are read at 40 MHz (or 66 MHz) and fed to the Sorter. The outputs are written into the output FIFOs and, after 512 clock cycles, the sorting is disabled. Finally, the results are read via VME and compared with the expected results. Fig. 13.53 shows a photograph of the VME board designed to test the Sorter.



**Fig. 13.53:** Sorter Test Board

The Sorter ASIC prototypes performed well and the design was fully validated.

### 13.10.7 Readout Board Prototype

Readout system tests were realized in Altera CPLD devices in 2000. Descriptions of the Slave Readout (SR) and Master Readout (MR) algorithms were written in AHDL (Altera Hardware Description Language) using Altera's MAX+PLUS II development system. The principle of the Readout System is based on two successive stages:

1. Data packet derandomisation in the SR blocks. Each SR block receives a data stream. Both SR blocks work in parallel.
2. Building of the common event in the MR block. The data for this event are read successively from the SRs via a local bus. The MR block steers the DAQ-SLAVE block via a dedicated local bus: clock, trigger and address signals are distributed that way.



**Fig. 13.54:** Slave and Master Readout test boards

Fig. 13.54 shows the Slave Readout test board. There are two SR modules on the test board. LENGTH PIPELINE, DATA BUFFER and DATA LENGTH BUFFER are realized in DPMs. The DATA ANALYZER, in which synchronous data decompression is realized, is implemented in an Altera EPF10K20RC240-3 chip. The input signals from the compressed data stream may be fed in two ways:

- Electrically, by front connectors, used for the test connection with the FDPM,
- Optically, through a fiber optic link.

Fig. 13.54 also shows the Master Readout test board. There is one MR module on the test board. The EVENT DATA BUFFER is implemented in a DPM. The COMMON DATA PACKER, which builds the common event packet, is implemented in an Altera EPF10K30RC240-3 chip. The board may work autonomously (with an internal clock and an external trigger signal) or in conjunction with a TTC circuit; in the latter mode it additionally stores the event number and bunch crossing number. The event packet, stored in the event data buffer, is accessible via:

- a VME interface for the computer readout system,
- a standardized DDU interface (implementation work is in progress).

The Readout test system assumes nominal parameters for the CMS RPC trigger and works with a nominal clock of 40 MHz (data transmission from the Slave boards to the Master board via the internal bus is performed at 20 MHz).

**Table 13.7:** The foreseen schedule of the RPC Muon Trigger design and construction in the years 2000-2004



## 13.11 Status and Schedule

The foreseen schedule for the design and construction of the RPC Muon Trigger electronics is shown in Table 13.7. The items on the critical path are marked in red. They are Link Board and Splitter Board design, prototype tests and validation, which have to be completed by mid 2001.

The LB design has to fit in with the FEB production which is now in the tendering stage. The FEB-LB interface has been agreed upon but some minor points concerning the monitoring and control functions of the LB, which acts as a slow control master for the FEBs connected to it, remain to be settled during November 2000.

The SB design and testing must be completed before the design of a Trigger Board is finalized. This puts a severe constraint on the SB design schedule.

The LB production must be coordinated with the mounting of muon chambers on the detector, which is foreseen to start on June 2nd, 2002. By that time, a fraction of the LBs should be ready for mounting on the iron. This implies, as a minimal requirement, an ESR for the LBs at the end of 2001, immediately followed by LB production. It is only natural to demand at the same time the ESR for the underground counting room end of the Optical Communication System, the Splitter Board.

The PAC processor version 2 tests are now under way (see Section 13.10.5). The schedule allows some time for a possible version 3 design and tests, but the final choice of PAC technology must be made by August 2001. Once this is done the PAC production should start as soon as possible. This implies an ESR for the PAC at the same time as for the LB/SB.

## References

- [13.1] The Muon Project Technical Design Report, CERN/LHCC 97-32.
- [13.2] "TTC distribution for LHC detector", IEEE Trans.Nucl.Science, Vol.45, nr.3, June 1998, pp. 821-828.
- [13.3] G. Wrochna, "Synchronization of the CMS Muon Detector", CMS CR 1998/017, CMS-IN 1998/007.
- [13.4] E. Piwowarska et.al. "PAC prototype ASIC for the CMS Muon Trigger", CMS IN-1999/021.
- [13.5] C. Binetti et.al. "A new Front-End board for RPC detector of CMS", CMS Note-1999/047.
- [13.6] M. Abbrescia et.al. "Local and global performance of double gap resistive plate chambers operated in avalanche mode" NIM A 434 (1999) 90-95.
- [13.7] M. I. Kudla et al. "Diagnostics and Calibration of the RPC Muon Trigger electronics ", CMS Note in preparation.
- [13.8] G. Iaselli et.al. "Study of Detailed Geometry of Barrel RPC Strips", CMS Note 2000/044.
- [13.9] K. Banzuzi, E. Pietarinen, M. Konecki, J. Krolikowski, M. I. Kudla, M. Gorski, G. Wrochna and P. Zalewski, "Layout of the Link System for the RPC Pattern Comparator Trigger", CMS IN-2000/043.
- [13.10] F. Loddo "A prototype Front end chip for the CMS Resistive Plate Chambers", CERN/LHCC/99-33.
- [13.11] K. Banzuzi et.al., "FEB-LB interface" CMS Note in preparation.
- [13.12] K. Banzuzi and E. Pietarinen, "LINX: prototyping environment for the CMS RPC fibre optic links", HIP Internal Report HIP-1999-07/I.

- [13.13] Maciej Górski, Ignacy M. Kudla, Krzysztof Pozniak, "High Speed Data Transmission and Compression for the CMS RPC Muon Trigger", Proceedings of the 3rd Workshop on LHC Electronics, London 1998.
- [13.14] K. Banzuzi, E. Pietarinen, "Irradiation test results for the LINX test board components", CMS Note in preparation.
- [13.15] M.L. Andrieux, B. Dinkespiler, G. Evans, L. Gallin-Martel, J. Lundquist, O. Martin, M. Pearce, J. Ye, "Irradiation Studies of Gb/s Optical Links Developed for the Front-end Readout of the ATLAS Liquid Argon Calorimeter", to be published in Nuclear Physics, presented at the Como conference, October 5th-9th, 1998.
- [13.16] M. Kudla, private information based on "RPC Trigger Crate", Trigger Internal Review CERN-TRIDAS Meeting, 8-9 November 1999.
- [13.17] "PAC Prototype ASIC for the CMS Muon Trigger", CMS IN-1999/021.
- [13.18] Z.Jaworski et al., "RPC - Pattern Comparator (PAC) ASIC", NIM A 419 (1998), 707.
- [13.19] K. Pozniak private information based on "RPC Muon Trigger - Readout System", CMS IN Note in preparation.
- [13.20] "TTC distribution for LHC detector", IEEE Trans.Nucl.Science, Vol.45, nr.3, June 1998, pp. 821-828.
- [13.21] "RPC Muon Trigger - Global Muon Trigger Interface", CMS IN Note in preparation.
- [13.22] A. Fengler, M. Kudla and P. Zalewski, "Ghosts Buster for the RPC Based Muon Trigger", CMS TN /98-012.
- [13.23] M. Kudla and P. Zalewski, "ALTERA Implementation of the PACT Ghosts Buster", CMS IN Note in preparation.
- [13.24] G. De Robertis et.al. "The sorting processor project", CMS TN 95/28.
- [13.25] G. De Robertis et.al. "A high speed Sorting Processor ASIC for the RPC trigger system of the CMS experiment", NIM A436, (1999) 394-400.
- [13.26] G. Iaselli et.al. "Endcap RPC Mounting and Trigger Performance", CMS IN-2000/018.
- [13.27] M. Huhtinen et.al. "Radiation Environment simulation for the CMS detector", CMS TN/ 95-198, and CMS Note 2000/068
- [13.28] A. Colaleo et. al. "Performance of the First RPC Station Prototype for the CMS Barrel Detector", CMS CR-2000/008, and private information from G. Bruno.
- [13.29] A. Fengler, M.Sc. thesis (Warsaw University, 1995), unpublished.
- [13.30] M.Cwiok et. al., "Summary of transfer losses in the Optical Communication System of the RPC Muon Trigger", CMS IN-2000/ in preparation, and private information from M. Gorski.
- [13.31] Private information from G. Bruno, M. Konecki and P. Zych.
- [13.32] M. Konecki et.al., "Simulated Geometry of the RPC System in CMSIM 118-120 and ORCA 4.2", CMS IN-2000/054.

# 14 Global Muon Trigger

## 14.1 Requirements

### 14.1.1 Functional Requirements

The L1 CMS Muon Trigger identifies muons, determines their transverse momenta and locations and assigns the trigger data to the correct bunch crossing (bx). CMS has a sophisticated muon system consisting of special dedicated trigger chambers, the Resistive Plate Chambers (RPC), and normal tracking chambers, the Drift Tube (DT) and Cathode Strip Chambers (CSC). All three systems are used in the trigger. The RPCs produce a pattern of hits, while the Drift Tube and Cathode Strip Chambers deliver track vectors at each muon station. Regional triggers of the three subsystems, the PAC Trigger of the RPCs and the DT and CSC Track Finders, independently deliver muon tracks. In most cases a muon will be found by both the RPC Trigger and the DT or CSC Track Finders. Due to different detector types, different trigger algorithms and slightly different geometry, the DT/CSC and the RPC trigger systems show different efficiencies in $\eta$ and $\phi$, different precision in momentum measurement and different response to background.

The task of the Global Muon Trigger (GMT) is to combine the results of all subsystems by finding the best four muons in every bunch crossing and to transmit them to the Global Trigger (GT). The two extreme cases of such a combination would be the logical OR, which is optimized for efficiency, and the logical AND, which is optimized for background rejection. Making use of the quality information associated with the tracks from the regional triggers, the GMT applies a more sophisticated algorithm which can be generalized as follows: a candidate is forwarded if it was seen by both the RPC and the DT/CSC systems, regardless of quality. If the candidate was seen by only one system, quality criteria are applied to decide whether to forward it. Specific quality criteria can be applied depending on detector types, detector regions and transverse momenta. Using this algorithm the GMT achieves high overall efficiency, a smoothing effect in problematic regions such as cracks, and powerful background rejection. It also reduces the number of fake muons (ghosts) found in any single regional trigger.
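The generalized selection rule described above can be sketched in a few lines. This is an illustration only: the matching windows in $\eta$ and $\phi$, the single quality threshold, the ranking by quality and then $p_T$, and the choice to forward the DT/CSC parameters for a confirmed muon are all assumptions made for the example, as are the names `Candidate`, `matches` and `gmt_select`.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    eta: float
    phi: float
    pt: float
    quality: int


def matches(a, b, d_eta=0.1, d_phi=0.1):
    """True if two candidates are close in eta and phi (hypothetical windows)."""
    dphi = abs(a.phi - b.phi) % (2 * math.pi)
    dphi = min(dphi, 2 * math.pi - dphi)  # wrap around at 2*pi
    return abs(a.eta - b.eta) < d_eta and dphi < d_phi


def gmt_select(rpc, dtcsc, min_quality=2, n_best=4):
    """Forward confirmed muons always, unconfirmed ones only if quality suffices."""
    selected, used = [], set()
    for track in dtcsc:
        partner = next((i for i, r in enumerate(rpc)
                        if i not in used and matches(track, r)), None)
        if partner is not None:
            used.add(partner)
            selected.append(track)   # seen by both systems: quality ignored
        elif track.quality >= min_quality:
            selected.append(track)   # seen by DT/CSC only: quality cut
    for i, r in enumerate(rpc):
        if i not in used and r.quality >= min_quality:
            selected.append(r)       # seen by RPC only: quality cut
    # Assumed ranking: best quality first, then highest transverse momentum.
    selected.sort(key=lambda c: (c.quality, c.pt), reverse=True)
    return selected[:n_best]


rpc = [Candidate(0.5, 1.0, 20, 0), Candidate(1.5, 2.0, 10, 1), Candidate(-0.3, 3.0, 5, 3)]
dtcsc = [Candidate(0.52, 1.02, 22, 0), Candidate(-1.0, 0.5, 8, 1)]
best = gmt_select(rpc, dtcsc)
print(best)
```

Setting `min_quality` to zero turns this rule into the logical OR, while a threshold above the maximum quality turns it into the logical AND, the two extreme cases mentioned in the text.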

In addition to finding the best four muons, the GMT appends to every muon data record two bits set by the Calorimeter Regional Trigger: a MIP bit and a Quiet bit (isolation or ISO bit). The MIP bit is set if the calorimeter energy is consistent with the passage of a minimum ionizing particle; the Quiet bit is set if a certain energy threshold in the calorimeter trigger towers surrounding the muon is not exceeded. Both bits are used in the GT to suppress background and to improve selectivity.

### 14.1.2 Input Requirements

The GMT expects input data from the regional muon trigger systems to fulfill the following criteria:

- a) Common and sharp $\eta$ boundaries: The DT and the CSC regional muon trigger systems deliver muons from the barrel and the endcap regions, respectively. The GMT expects the boundary in $\eta$ between the two systems to be as sharp as possible to avoid double counting of muon candidates found in both systems. It also expects the RPC Trigger to choose the same fixed limit between its barrel and forward parts to avoid double counting of muons found once in the barrel part and once in the forward part of a regional trigger.

Both cases of double counting can be cured by requiring an AND-condition between DT/CSC muons and RPC muons in the small region of  $\eta$  where the above mentioned criteria cannot be fulfilled. As this practice reduces efficiency it should however be limited to as small a region in  $\eta$  as possible by optimizing the  $\eta$  boundaries at the regional trigger level.

- b) Bunch crossing corrected data: The GMT requires the bunch crossing assignment or correction to be already done since it does not correct data in time and matches only candidates from the same bunch crossing. It does, however, synchronize the different latency subsystems to each other.
- c) Ghost suppression: The GMT checks only if muons from the RPC and DT/CSC systems are related to each other. Checks for identical muons (ghosts) found two or more times within the same trigger subsystem are not made. Ghosts can however be suppressed by cancelling out muons based on their quality bits if they are not confirmed by the complementary system.
- d) Identical reference frames: All input data should use the same coordinate system and scale for  $\eta$  and  $\phi$  and the same definition and scale for  $p_T$  to save latency and to avoid loss of precision due to conversion.

This does not mean that the  $p_T$  scale cannot be adapted according to changing physics requirements; it only requires all systems to change their scales simultaneously. Heavy ion and initial low luminosity proton-proton runs will in general require a finer scale for the lower momentum range than high luminosity “discovery” runs.

For the  $\eta$ -coordinate a subsystem-specific scale can have advantages over the common scale. The GMT might for example need to perform a certain action only for specific RPC towers. To facilitate this the  $\eta$ -coordinate from the RPC trigger can be coded as a tower number and converted to a common scale at a later stage in the GMT.

## 14.2 System Overview

The Global Muon Trigger logically belongs to the L1 Muon Trigger System. Physically, however, it is part of the Global Trigger (GT) described in Chapter 15. It consists of two logic boards and three pipeline synchronizing input boards mounted in the left part of the Global Trigger crate as shown in Figure 14.1.

The two logic boards receive muons from all regional muon triggers: the barrel logic board receives four muons from the barrel Drift Tube Trigger Track Finder and four central muons from the Resistive Plate Chamber Trigger; the forward logic board receives four muons from the Cathode Strip Chamber Track Finder and four forward muons from the RPC Trigger. A candidate muon is described by its transverse momentum, the sign of the charge, its location expressed in  $\eta$  and  $\phi$ , information about its quality and several technical bits used for synchronization and error-detection. The 252 MIP and 252 Quiet bits denoting compatibility with a minimum ionizing particle and calorimeter isolation from the Regional Calorimeter Trigger are received by the synchronizing boards - the Pipeline Synchronizer and Buffer (PSB) modules. All bits are sent via the GT backplane to the GMT logic boards.



**Fig. 14.1:** The Global Muon Trigger in the Global Trigger Crate

In a first step, synchronization circuits on each of the two GMT logic boards and on the PSB boards align all input channels to the LHC orbit and to each other in time. Then the matching logic spatially compares the CSC with the forward RPC muons and concurrently the DT with the barrel RPC muons to detect candidates found in both systems. In parallel to the matching procedure a rank is determined for each muon candidate. Ranks are functions of  $p_T$ , quality and  $\eta$  and may depend on the detector type. If matching muons are found, the muon merger logic merges the parameters of the two candidates and forwards the resulting new candidate with an increased rank. In a last step the resulting matched muons and the remaining unmatched muons are sorted by their rank. Unmatched muons with low rank can be suppressed to reduce possible fake muons (ghosts). The selection logic then selects the best four muons to be sent to the Global Trigger. In parallel to the matching and ranking procedures the GMT determines for each muon if it passed through a calorimeter region with MIP or Quiet bits set and appends the corresponding bits to the muon data. The complete output data are sent via the custom-designed backplane to the Global Trigger Logic (GTL) boards.

Figure 14.2 shows the two GMT logic boards. Each board contains two Input FPGAs to synchronize all muons and one GMT Logic FPGA containing the match logic, pair logic and muon merger logic as well as the first stage of the sorter. The Final Sorter FPGA chip, which is only activated on one of the boards, contains the final sorting circuits. It receives four muons from the first stage of sorting on the same board and four muons from the first stage of sorting on the other board through a board-to-board connector. For each muon channel a Dual Port Memory (DPM) chip contains a Single Rank LUT which assigns a rank based on the  $p_T$ , quality and  $\eta$ -coordinate of the muon. The MIP and Quiet bits are received on Channel Links via the backplane and are then interchanged between the two logic boards via a second board-to-board connector.

The projection FPGA contains the logic to assign the MIP and Quiet bits to the muons. For each muon channel a DPM chip contains a Projection LUT which is used to obtain the corresponding calorimeter region by backwards extrapolation. Each of the logic boards contains Ring Buffer DPMs to save all muon data for readout requests due to L1 Accepts or for monitoring tasks. They are located in the Input, GMT Logic and Sorter FPGA chips. Besides using standard PSB input boards the GMT shares the timing and clock circuits and the readout logic with the Global Trigger.



**Fig. 14.2:** Global Muon Trigger Logic Boards

## 14.3 Input Processing of DT, CSC and RPC Data

### 14.3.1 Input channels

The regional muon trigger data arrive at the logic boards on Shielded Twisted Pair (STP) flat cables, in parallel at 40 MHz. The LVDS receivers convert the signals to low voltage TTL level. A muon data word contains information about  $p_T$  (5 bits),  $\phi$  (8 bits),  $\eta$  (6 bits), quality (3 bits), charge (1 bit). Table 14.1 shows the non-linear  $p_T$  scale that is common to all subsystems and the output of the GMT. The  $p_T$  scale is defined at 90% efficiency. Each regional trigger delivers four muons in the barrel and the forward regions sorted by rank. If fewer than four muons are found then the channels with the lower ranks are empty. Empty channels contain all zero or at least  $p_T = 0$  and quality = 0.

**Table 14.1:** 5 bit coding of the transverse momentum assignment.

| Bit Code         | 0          | 1   | 2   | 3   | 4   | 5   | 6   | 7    | 8    | 9    | 10  |
|------------------|------------|-----|-----|-----|-----|-----|-----|------|------|------|-----|
| $p_T$<br>(GeV/c) | Null Track | 0.  | 1.5 | 2.0 | 2.5 | 3.0 | 3.5 | 4.0  | 4.5  | 5.0  | 6.0 |
| Bit Code         | 11         | 12  | 13  | 14  | 15  | 16  | 17  | 18   | 19   | 20   | 21  |
| $p_T$<br>(GeV/c) | 7.0        | 8.0 | 10. | 12. | 14. | 16. | 18. | 20.  | 25.  | 30.  | 35. |
| Bit Code         | 22         | 23  | 24  | 25  | 26  | 27  | 28  | 29   | 30   | 31   |     |
| $p_T$<br>(GeV/c) | 40.        | 45. | 50. | 60. | 70. | 80. | 90. | 100. | 120. | 140. |     |
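The non-linear scale of Table 14.1 can be held as a simple lookup. The sketch below copies the table values; the function names and the encoding rule (the highest code whose table value does not exceed the measured $p_T$) are illustrative assumptions, not something the text specifies.

```python
from bisect import bisect_right

# pT values in GeV/c for bit codes 1..31, copied from Table 14.1.
# Code 0 is reserved for "Null Track" (empty channel).
PT_SCALE = [0.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 6.0,
            7.0, 8.0, 10.0, 12.0, 14.0, 16.0, 18.0, 20.0, 25.0, 30.0, 35.0,
            40.0, 45.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0, 120.0, 140.0]


def pt_from_code(code):
    """Return the pT value (GeV/c) of a 5-bit code, or None for a null track."""
    if not 0 <= code <= 31:
        raise ValueError("pT code must fit in 5 bits")
    return None if code == 0 else PT_SCALE[code - 1]


def code_from_pt(pt):
    """Assumed encoding: highest code whose table value does not exceed pt."""
    if pt < 0:
        raise ValueError("pT must be non-negative")
    return bisect_right(PT_SCALE, pt)
```

With this assumed rule a 27.3 GeV/c muon is encoded as code 19 (25 GeV/c), and anything above 140 GeV/c saturates at code 31.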

### 14.3.2 Input synchronization

The GMT synchronizes all input muons to the LHC orbit and then to each other. All input bits are sampled four times per bunch crossing at 160 MHz. The phase sampled furthest from the switching time of the input word is selected and sent to the following delay. To save latency, pipeline registers are used as programmable delays as shown in Figure 14.3. A delay can be as small as 1 bx and is defined by switching data from one of the pipeline registers to the output drivers. A distributed multiplexer circuit keeps all delays between flip-flops as small as possible and even allows connection of an input sampling flip-flop to the output. The pipeline delays compensate different latencies and synchronize different channels to each other.

A start register synchronizes the input data to the LHC orbit defining the start time for writing data of the first bx into the first address of the monitoring DPM. The address of the memory is equal to the bunch crossing number modulo the memory size. The content of the start register has to be the same for all muon channels.

## 14.4 Input from the Global Calorimeter Trigger

The Regional Calorimeter Trigger compares the transverse energies, deposited in  $\Delta\phi \times \Delta\eta = 0.35 \times 0.35$  calorimeter regions, with thresholds reflecting calorimeter isolation and minimum ionizing particles and sends the results as 252 Quiet and 252 MIP bits to the Global Calorimeter Trigger (GCT). The GCT combines all bits and sends them on 36 Channel Link cables to the GMT.

Three PSB modules receive and store the bits and send them after a programmed delay via the backplane to the GMT boards. The synchronization procedure for PSB boards is described in Chapter 15. The delays for the MIP and Quiet bits are selected to append them correctly to the muon data. The adjustment can be checked in software by comparing the content of the PSB modules for the GMT with the GMT output data.



**Fig. 14.3:** Synchronization pipeline in the input FPGA chips

### 14.4.1 Isolation and MIP bit logic

The GMT correlates a muon with a quiet region fulfilling an isolation criterion and adds the ISO (or Quiet) bit to the trigger data. The ISO bit can be used to suppress the rate of background and prompt muons from heavy quark decays when triggering on muons not accompanied by jets. The MIP bit is added to a muon if it passed through a calorimeter region with an energy deposit compatible with that of a minimum ionizing particle.

Each input muon is projected back to the calorimeter. The calculation uses the charge and  $\eta$ ,  $\phi$  and  $p_T$  with a reduced precision. If the projection points to an area with a Quiet or MIP bit set the corresponding bit is set in the muon data word. If a muon hits a boundary both calorimeter regions concerned contribute to the bit assignment.

## 14.5 Matching, Merging and Sorting Logic

### 14.5.1 Match and Pair Logic

The GMT compares the DT with the barrel RPC muons and in parallel the CSC with the forward RPC muons to find muon candidates close to each other in  $(\eta, \phi)$ -space. Match Qualities for all possible pairs are obtained from LUTs to digitize the likeness of two candidates. The Match Quality is a programmable function of the differences in  $\eta$  and  $\phi$ . For example it can be chosen inversely proportional to a weighted sum of the squares of the differences in  $\eta$  and  $\phi$ .

Figure 14.4 shows the algorithm of the pair logic. A matrix of match qualities is calculated for all possible pairs based on proximity in space. The pair logic finds pairs with maximum match quality and takes care that a muon is only used in one pair. The results are stored in a Pair Matrix. A matrix element  $\text{PAIR}(i,k)=1$  means that the DT muon ( $i$ ) and the barrel RPC muon ( $k$ ) are measurements of the same physical muon. For each possible pair the combinatorial equations are calculated in parallel. Empty channels are removed by the condition that the  $p_T$  value of a candidate has to be greater than zero.



**Fig. 14.4:** The pair logic for DT and RPC muons. A second pair logic pairs the CSC and RPC muons.
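The match and pair logic described above can be sketched in software. This is only an illustration under stated assumptions: the match quality is taken as a clipped inverse of the weighted squared distance (the example function mentioned in the text), with hypothetical weights, scale and a 6-bit ceiling; a pair is accepted when its match quality is non-zero and is the strict maximum of both its row and its column, which is one simple way to guarantee that a muon enters at most one pair. The hardware obtains these values from LUTs and evaluates all pairs in parallel.

```python
def match_quality(d_eta, d_phi, w_eta=1.0, w_phi=1.0, scale=1.0, max_mq=63):
    """Hypothetical match quality: larger when two candidates are closer.

    Inversely proportional to the weighted sum of squared differences,
    truncated to an integer and clipped to max_mq. Zero means "no match".
    """
    dist2 = w_eta * d_eta ** 2 + w_phi * d_phi ** 2
    if dist2 == 0:
        return max_mq
    return min(max_mq, int(scale / dist2))


def pair_matrix(mq):
    """Return PAIR with PAIR[i][k] = 1 if muon i and muon k form a pair.

    Assumed rule: mq[i][k] must be non-zero and strictly larger than every
    other entry in row i and in column k. Ties therefore produce no pair.
    Empty channels are expected to carry match quality 0.
    """
    n_rows, n_cols = len(mq), len(mq[0])
    pair = [[0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for k in range(n_cols):
            q = mq[i][k]
            if q == 0:
                continue
            best_in_row = all(q > mq[i][j] for j in range(n_cols) if j != k)
            best_in_col = all(q > mq[l][k] for l in range(n_rows) if l != i)
            pair[i][k] = int(best_in_row and best_in_col)
    return pair
```

For a 4×4 matrix whose only non-zero entries are `mq[0][0]=5`, `mq[0][1]=9`, `mq[1][0]=7` and `mq[1][1]=3`, the sketch pairs DT muon 0 with RPC muon 1 and DT muon 1 with RPC muon 0.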

### 14.5.2 Rank assignment

In parallel to the matching logic a LUT assigns a rank for each input muon taking into account  $p_T$ , quality and to some extent  $\eta$  and the detector type. The quality of a muon candidate indicates the number of stations involved and the precision of the  $p_T$ -assignment method used for the candidate. Normally the rank increases with  $p_T$  and quality of the muon. The rank is used by the muon merger logic and sorter to combine and sort muons. The rank can also be used to indicate that an input muon should be suppressed in the sorter if it is not confirmed by the complementary system.

### 14.5.3 Muon Merger Logic

The muon merger logic receives as input the parameters and single ranks of a matched pair of muons. It merges the parameters to form a new muon candidate with new parameters and an increased rank. For the muon merger logic three different implementations are foreseen:

- winner/loser implementation
- parameter selection implementation
- parameter mixing implementation

The final choice of implementation will depend on the actual performance of the regional trigger systems and on latency restrictions. In winner/loser implementation the muon merger logic forwards all the parameters of the muon with higher rank (winner). It discards the parameters of the muon with lower rank (loser). The winner is forwarded with a final rank equal to its rank increased by a value that depends on the match quality of the pair.

In parameter selection implementation the muon merger logic composes a new muon candidate from the parameters of the two input muons. Each parameter is selected either to come from the DT/CSC input muon or from the RPC input muon. The criteria for choosing a parameter can a) be fixed (e.g. always take the  $\phi$ -coordinate from the DT muon), b) depend on the ranks of the two muons or c) depend on another condition made up from the input parameters and ranks of the two muons. In case c) the complexity of the condition is limited by the availability of RAM blocks for look-up tables inside the GMT logic FPGA.

In parameter mixing implementation the muon merger logic has all the functionality of the parameter selection implementation. Additionally, it can combine the two measurements of a parameter into a new measurement by a simple arithmetic function. For instance it can calculate a weighted average  $p_T$  from the two measurements of the  $p_T$  or take the minimum of the two measurements.
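The three foreseen implementations can be contrasted in a short sketch. The `Muon` record, the function names and the `match_bonus` argument are hypothetical. The rank of the merged candidate in the selection and mixing variants is assumed here to be the higher single rank plus the bonus, since the text only states that the rank is increased.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Muon:
    pt: int
    eta: int
    phi: int
    quality: int
    charge: int
    rank: int


PARAMS = ("pt", "eta", "phi", "quality", "charge")


def merge_winner_loser(a, b, match_bonus):
    """Forward all parameters of the higher-rank muon with an increased rank."""
    winner = a if a.rank >= b.rank else b
    return replace(winner, rank=winner.rank + match_bonus)


def merge_select(dtcsc, rpc, take_from_rpc, match_bonus):
    """Take each parameter from either input; take_from_rpc names the RPC ones."""
    fields = {p: getattr(rpc if p in take_from_rpc else dtcsc, p) for p in PARAMS}
    # Assumption: merged rank = higher single rank + bonus.
    return Muon(rank=max(dtcsc.rank, rpc.rank) + match_bonus, **fields)


def merge_mix_min_pt(dtcsc, rpc, take_from_rpc, match_bonus):
    """Parameter mixing: as selection, but pT is the minimum of both measurements."""
    merged = merge_select(dtcsc, rpc, take_from_rpc, match_bonus)
    return replace(merged, pt=min(dtcsc.pt, rpc.pt))
```

Taking the minimum of the two $p_T$ measurements is one of the two arithmetic functions named in the text; a weighted average would replace the `min` call.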

### 14.5.4 Sorting Logic

Matched muon candidates from the muon merger logic and unmatched candidates found only in one system are forwarded to the sorting logic. The muons are sorted according to their final rank. For unmatched muons the final rank is equal to their respective single rank; for matched muons the final rank is assigned in the muon merger logic as described above.

A multiplexer switches the best muon to the first output channel, the next multiplexer the second best muon to the second channel and so on. The four muons with the highest rank are sent to the Global Trigger. The multiplexer is split in two sets of four multiplexers sorting the forward and barrel muons separately and a set of four multiplexers doing the final sorting. The first sets of multiplexers are integrated in the GMT logic FPGAs on the two GMT logic boards. The final set of multiplexers is implemented in a separate sorter FPGA which receives the pre-sorted muon candidates from both logic boards.
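The two-stage structure can be modelled as below. It is a sketch only: candidates are represented as hypothetical `(rank, label)` tuples, and a rank of 0 is assumed to mark an empty or suppressed channel. Because the four best candidates overall always survive their own first-stage sort, the result is the same as sorting all candidates at once.

```python
def top4(muons):
    """Return up to four candidates in order of decreasing rank.

    Candidates are (rank, label) tuples; rank 0 is treated as empty/suppressed.
    """
    live = [m for m in muons if m[0] > 0]
    return sorted(live, key=lambda m: m[0], reverse=True)[:4]


def gmt_sort(barrel, forward):
    """First stage sorts barrel and forward separately, final stage merges them."""
    return top4(top4(barrel) + top4(forward))
```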

## 14.6 Latency

In order to keep the latency as small as possible the GMT receives muon candidates as parallel data, uses a fast synchronization circuit, does calculations in parallel and sends the output muons as parallel data via the backplane to the Global Trigger. The Calorimeter Quiet and MIP bits are received by standard PSB modules. The overall GMT-GT latency is given in Chapter 15, and Figure 14.5 shows the details. Table 14.2 shows the distances of the electronics connected to the GMT, the corresponding delays and link technologies.

**Table 14.2:** Link environment of the GMT

| Connection         | Distance | Latency | Link Technology    |
|--------------------|----------|---------|--------------------|
| GCT to GMT         | 6 m      | 4 bx    | Channel Link       |
| RPC-barrel to GMT  | 8 m      | 3 bx    | parallel LVDS      |
| RPC-forward to GMT | 6 m      | 3 bx    | parallel LVDS      |
| CSC to GMT         | 6 m      | 3 bx    | parallel LVDS      |
| DT to GMT          | 6 m      | 3 bx    | parallel LVDS      |
| GMT to GT          | 0.3 m    | 1 bx    | par. on back plane |

**Fig. 14.5:** Estimated latencies in the Global Muon Trigger

## 14.7 Output Processing and Monitoring

One set of DPMs stores the records of the 16 input muon candidates and another one stores the records of the four output muons. The MIP and ISO bits are added to the in- and output muon records in order to be able to check the projection logic later. The Calorimeter Quiet/MIP bits are stored in the DPMs of the PSB boards. All DPMs run as ring buffers with a length of 2048 bx. Data from bx=0 are written into address=0 so that the desired bx-number of a readout request is equal to the memory address (modulo 2048).
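The addressing scheme amounts to a ring buffer indexed by the bunch crossing number modulo 2048. The class below is a minimal software sketch; storing the bx number alongside each record, so that an overwritten entry can be recognised, is an assumption made for the illustration and is not described in the text.

```python
class RingBuffer:
    """Ring buffer addressed by bunch crossing number modulo its depth."""

    def __init__(self, depth=2048):
        self.depth = depth
        self.mem = [None] * depth

    def write(self, bx, record):
        # Address is the bx number modulo the memory size.
        self.mem[bx % self.depth] = (bx, record)

    def read(self, bx):
        """Return the record stored for bx, or None if it is no longer held."""
        entry = self.mem[bx % self.depth]
        if entry is None or entry[0] != bx:
            return None  # never written, or overwritten by a later bx
        return entry[1]
```

A readout request for a bx that lies more than 2047 crossings in the past finds the slot overwritten, which corresponds to the case in which the request cannot be fulfilled.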

In case of a monitoring or L1A readout request the standard Readout Processor (ROP) collects data from all DPMs, adds format words and sends the GMT-record(s) via the backplane to the readout board of the Global Trigger (GTFE). If the readout request cannot be fulfilled an empty record will be sent. The Detector Dependent Unit (DDU) on the GTFE board combines the GMT data with the GT data to send them as one event record to the Data Acquisition System.

## 14.8 Simulation Results

The performance of the GMT has been simulated using the detailed CMS simulation software. For previous studies ([14.1]) CMSIM has been used to simulate the whole trigger chain. In recent studies particles have been tracked through the detector up to the level of hits using CMSIM 118 (based on GEANT 3.21). The digitization, trigger primitive generation and simulation of the regional muon triggers and the Global Muon Trigger have been performed with the object-oriented simulation software package ORCA4.

### 14.8.1 Samples

For the study of efficiency, ghosts and turn-on-curves a sample of 100,000 single muon events has been used. The events are evenly distributed in charge, in  $\eta$  from -2.4 to 2.4, in  $\phi$  from  $0^\circ$  to  $360^\circ$  and in  $p_T$  from 3 GeV/c to 100 GeV/c. The background and trigger rate studies have been performed on the single muon minimum bias, W and Z samples described in Section 8.4.1.

### 14.8.2 GMT Algorithm

As described in the previous sections the GMT algorithm is very flexible and can be adjusted according to the actual performance of the subsystems by changing the contents of LUTs and choosing the right implementation for the muon merger logic. These LUT contents as well as the choice of implementation and tuning of the muon merger logic define the performance of the GMT. The LUTs and muon merger logic have been optimized according to the simulated performance of the regional triggers as described in [14.2] and will later be optimized according to their actual performance.

The optimized algorithm can be summarized as follows: the muon merger logic has been simulated in parameter mixing implementation. For pairs of matched muons the lower one of the two  $p_T$  measurements has been assigned to the output muon candidate. Quality criteria have been applied for muons not confirmed by the complementary system. The single rank tables have been computed to reject the following types of muon candidates if they are unconfirmed:

- a) RPC of quality code 1 in all towers
- b) RPC of quality code 0 in towers 5, 8-10, 14-16 ( $0.7 < |\eta| < 0.83$ ,  $1.04 < |\eta| < 1.36$  and  $|\eta| > 1.73$ )
- c) CSC of quality 2 or lower with  $|\eta| \leq 2.10$
- d) CSC of any quality with  $|\eta| < 1.06$  (to require confirmation by RPC in overlap region)
- e) DT of any quality with  $|\eta| > 0.91$  (to require confirmation by RPC in overlap region)
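Rules a) to e) can be written as a single predicate. In the GMT these decisions are encoded in the single rank LUTs; the function below is only a readable restatement, with hypothetical names and argument conventions.

```python
# RPC towers in which unconfirmed quality-0 candidates are rejected (rule b).
RPC_Q0_REJECT_TOWERS = {5, 8, 9, 10, 14, 15, 16}


def keep_unconfirmed(system, quality, abs_eta, rpc_tower=None):
    """Return True if an unconfirmed candidate survives rules a) to e)."""
    if system == "RPC":
        if quality == 1:                                        # rule a)
            return False
        if quality == 0 and rpc_tower in RPC_Q0_REJECT_TOWERS:  # rule b)
            return False
        return True
    if system == "CSC":
        if abs_eta < 1.06:                                      # rule d)
            return False
        if quality <= 2 and abs_eta <= 2.10:                    # rule c)
            return False
        return True
    if system == "DT":
        return abs_eta <= 0.91                                  # rule e)
    raise ValueError("unknown system: %r" % system)


def forward(system, quality, abs_eta, confirmed, rpc_tower=None):
    """Confirmed candidates always pass; unconfirmed ones face the quality cuts."""
    return confirmed or keep_unconfirmed(system, quality, abs_eta, rpc_tower)
```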

Table 14.3 shows a comparison of the performance of the optimized algorithm with the performance of a logical OR-algorithm that accepts all muons seen by at least one system and the performance of a logical AND-algorithm that only accepts muons seen by two systems. The OR-algorithm results in the highest possible efficiency but also a high number of ghosts and very high trigger rates. The AND-algorithm results in a low number of ghosts and lowest possible trigger rates but causes gaps in efficiency in several regions of pseudorapidity. The optimized algorithm keeps the efficiency close to the highest possible efficiency while keeping the trigger rates close to the lowest possible rates. It also results in a very low number of ghosts. Unless otherwise indicated all the following results were obtained with the optimized algorithm.

**Table 14.3:** Performance of the optimized GMT algorithm compared to the performance of a logical OR and a logical AND-algorithm

| GMT Algorithm    | Efficiency    | Ghosts       | Trigger Rate<br>$p_T \geq 25 \text{ GeV}/c$ |
|------------------|---------------|--------------|---------------------------------------------|
| Logical OR       | 97.61%        | 1.60%        | 22.6 kHz                                    |
| Logical AND      | 89.70%        | 0.07%        | 7.2 kHz                                     |
| <b>Optimized</b> | <b>96.08%</b> | <b>0.18%</b> | <b>8.1 kHz</b>                              |

### 14.8.3 Efficiencies

The DT/CSC and the RPC trigger systems show different efficiencies in  $\eta$  and  $\phi$ , and different behavior in  $p_T$  due to cracks, construction properties and different trigger algorithms. The GMT can make use of these differences resulting not only in a higher efficiency overall but also a smoothing effect in the less efficient regions of  $\eta$  and  $\phi$ .

Figure 14.6 shows the efficiencies as a function of the pseudorapidity  $\eta$  for the Global Muon Trigger and the regional muon triggers. Efficiency in this and the following figures is defined as the probability to find at least one muon of any  $p_T$  for events in which one muon was generated. It can be seen that the GMT improves the efficiency over the whole range of  $|\eta| < 2.1$ . For  $2.1 < |\eta| < 2.4$  the efficiency becomes equal to the efficiency of the CSC system as RPC coverage only extends up to  $|\eta| < 2.1$ . The gaps in efficiency result from the geometric acceptance of the muon system. As the geometric acceptance of the RPC and the DT/CSC systems are slightly different the GMT can partly recover the efficiency loss in these gaps.

Figure 14.7a and Figure 14.7b show the efficiencies of the Global Muon Trigger and the regional muon triggers as a function of the  $\phi$  coordinate for the barrel ( $|\eta| < 1.04$ ) and endcaps ( $|\eta| > 1.04$ ), respectively. It can be seen that the GMT improves the efficiency over the whole range of  $\phi$ . The gaps in efficiency correspond to the  $30^\circ$  and  $20^\circ$  sector boundaries in the barrel and endcap region, respectively. As for the efficiency gaps in  $\eta$  the GMT can partly recover the efficiency loss in these gaps.

The efficiencies of the Global Muon Trigger and the regional muon triggers as a function of the transverse momentum of the simulated muons are shown in Figure 14.8a and Figure 14.8b. The GMT improves the efficiency with respect to the efficiency of each of the regional systems in either case.

Table 14.4 shows the overall efficiencies and percentages of ghosts for the three subsystems and the GMT. It clearly shows how the GMT improves overall efficiency.



**Fig. 14.6:** Efficiencies of GMT, CSC, DT and RPC versus  $\eta$



**Fig. 14.7a:** Efficiencies of GMT, CSC, DT and RPC versus  $\phi$ , barrel



**Fig. 14.7b:** Efficiencies of GMT, CSC, DT and RPC versus  $\phi$ , endcap



**Fig. 14.8a:** Efficiencies of GMT, CSC, DT and RPC versus  $p_T$ , barrel



**Fig. 14.8b:** Efficiencies of GMT, CSC, DT and RPC versus  $p_T$ , endcap

**Table 14.4:** Efficiencies and Ghosts of the Regional Triggers and the GMT

| System                    | Efficiency to find at least one muon | Percentage of Ghosts |
|---------------------------|--------------------------------------|----------------------|
| RPC $0 <  \eta  < 2.1$    | 94.4%                                | 0.99%                |
| DT $0 <  \eta  < 1.04$    | 93.3%                                | 0.10%                |
| CSC $1.04 <  \eta  < 2.4$ | 92.4%                                | 0.34%                |
| GMT $0 <  \eta  < 2.4$    | 96.1%                                | 0.18%                |

### 14.8.4 Ghosts

Figure 14.9 shows the percentage of duplicated muons of any  $p_T$  (events in which one muon was generated and two were reported) versus  $\eta$ . Separate curves show the percentage of duplicated muons for the regional triggers and the GMT. It can be seen that the GMT reduces the rate of ghosts in regions where there is a high ghost rate at the level of the subsystems, by requiring low-quality candidates to be confirmed by the complementary system. It can also be seen that the required AND-condition in the overlap region improves the rate of duplicated candidates while still maintaining an acceptable efficiency (see Figure 14.6). The overall percentages of ghosts in the three subsystems and the GMT are given in Table 14.4.



**Fig. 14.9:** Percentage of fake double muons for GMT, DT, CSC and RPC versus  $\eta$

### 14.8.5 Turn-on curves

Figure 14.10 shows the turn-on curves as a function of  $p_T$  for several trigger thresholds (defined at 90% efficiency). For RPC and GMT the turn-on curves are shown separately for barrel and endcap regions. It can be seen that the turn-on curves at the output of the GMT are steeper and reach a higher plateau.

### 14.8.6 Trigger Rates

Figure 14.11 shows the integrated single muon trigger rates of the regional triggers and the GMT as a function of the  $p_T$  threshold applied (defined at 90% efficiency with the measured  $p_T$  being greater than or equal to the  $p_T$  cut) at a LHC luminosity of  $10^{34} \text{ cm}^{-2}\text{s}^{-1}$ . Two separate plots show the rates for DT, RPC and GMT in the barrel and CSC, RPC and GMT in the endcap. In these plots the generated muons as well as the reported muons have been restricted to the pseudorapidity ranges indicated in the plots. A combined plot shows the integrated single muon trigger rates for the whole detector. Each subsystem can be optimized to work in standalone mode and reduce the rate without confirmation of the complementary system. Standalone performance of DT, CSC and RPC Systems is given in Sections 10.10, 12.9 and 13.7, respectively. However, the best overall performance and robustness can only be achieved using the GMT. In this case the regional triggers do not apply any cuts on the reported muon candidates and may deliver a high rate. The GMT makes use of all the information from the regional triggers to reduce the total muon rate while still maintaining high efficiency. For the GMT Figure 14.11 also shows the trigger rate for two fictitious algorithms: a logical AND-algorithm (dashed) and a logical OR-algorithm (dotted). It can be seen that the rate delivered by the (optimized) GMT is very close to the lowest possible rate that would result from an AND-algorithm. The efficiency on the other hand is very close to the highest possible efficiency that would result from an OR-algorithm (see also Figure 14.6 and Table 14.3).

**Fig. 14.10:** Turn-on curves for DT, CSC, RPC and GMT as a function of  $p_T$  for several  $p_T$  thresholds. RPC and GMT plots are shown separately for barrel and endcap

**Fig. 14.11:** Integrated single muon trigger rates at  $L=10^{34} \text{ cm}^{-2}\text{s}^{-1}$  for the trigger subsystems and the GMT as a function of the  $p_T$  threshold

## 14.9 Status and Schedule

The conceptual design of the Global Muon Trigger was presented in November 2000 [14.3] and the corresponding milestone (Conceptual Global Muon Trigger design) has been fulfilled. As shown in Figure 14.12 the GMT logic FPGA chip design will be done during 2002 and the boards will be produced and tested by November 2003. A set of spare modules will be produced later (2004/2005). The software version for FPGA design will be preserved after the board tests to enable subsequent changes of algorithms.



**Fig. 14.12:** Global Muon Trigger schedule

## References

- [14.1] N. Neumeister, P. Porth, H. Rohringer, “Simulation of the Global Muon Trigger”, CERN CMS Internal Note 1997/023.
- [14.2] M. Fierro, H. Sakulin, “Studies of the Global Muon Trigger Performance”, CMS Note, in preparation.
- [14.3] M. Fierro, N. Neumeister, P. Porth, H. Rohringer, H. Sakulin, A. Taurok, C.-E. Wulz, “Conceptual Design of the Global Muon Trigger”, CMS Note, in preparation.

# 15 Global Trigger

## 15.1 Introduction

The purpose of the trigger system is to select all interesting events in the presence of an overwhelming background. The ultimate task of the L1 Global Trigger (GT) is therefore to decide whether to accept or to reject an event and to generate the corresponding L1 Accept signal (L1A). The GT environment is depicted in Figure 15.1. The Global Trigger Processor is a custom-built electronics system. The Timing, Trigger and Control (TTC) optical network sends the L1A signal to all readout units of the subsystems to move data of the current bunch crossing (bx) from their pipeline- or ring buffers into derandomizing memories. Later all or part of the bx-data will be fetched by the data acquisition, first to run Higher Level Trigger algorithms and finally to store accepted events.

A principal requirement of the GT is that it has to run dead-time free and provide a trigger decision every 25 ns, synchronously with the LHC clock. The GT is implemented using 40 MHz pipelined logic. It does not introduce any dead-time and could theoretically deliver a rate of up to 40 MHz. However, due to readout limitations of some subsystems, trigger rules permitting no more than a certain number of triggers within a given number of bunch crossings have to be established. These rules have to be set such that the overall dead-time stays within the order of one percent.
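A trigger rule of the kind mentioned above ("no more than a certain number of triggers within a given number of bunch crossings") can be sketched as a sliding-window check. The class name and the parameter values in the usage note are hypothetical; the text does not state the actual rule parameters.

```python
from collections import deque


class TriggerRule:
    """Allow at most max_triggers L1As in any `window` consecutive bx."""

    def __init__(self, max_triggers, window):
        self.max_triggers = max_triggers
        self.window = window
        self.recent = deque()  # bx numbers of accepted triggers, oldest first

    def allow(self, bx):
        """Return True and record the trigger if an L1A at bx obeys the rule.

        bx is assumed to be a monotonically increasing bunch crossing counter.
        """
        # Drop accepted triggers that have left the window ending at bx.
        while self.recent and self.recent[0] <= bx - self.window:
            self.recent.popleft()
        if len(self.recent) >= self.max_triggers:
            return False
        self.recent.append(bx)
        return True
```

With `TriggerRule(1, 3)`, a trigger accepted at bx 10 vetoes candidates at bx 11 and 12, and the next one can be accepted at bx 13. The fraction of vetoed candidates gives the dead-time introduced by the rule.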

For physics runs only data from the calorimeter and muon L1 trigger systems are used to take the L1A decision. The GT receives the best four of each of the following objects: muons, isolated electrons or photons, non-isolated electrons or photons, central jets, forward jets and isolated hadrons or  $\tau$ -jets. The trigger objects, also denoted below as “particles”, are ordered by rank, which is a function of transverse energy or momentum and quality. In addition, the GT receives the magnitude and the direction of the missing transverse energy as well as the total transverse energy and eight numbers of jets passing different  $E_T$  thresholds. In order to combine data from the same bunch crossing the GT compensates the different latencies of the Muon and Calorimeter Trigger by synchronising all input channels to the LHC orbit and to each other. For technical runs such as synchronisation, calibration, setup and tests, special signals may directly be fed into the GT electronics.

**Fig. 15.1:** Global Trigger environment

A basic principle of the CMS L1 Trigger is that details of the highest rank trigger objects are always sent to the Global Trigger, rather than histogram or summary information. It is the job of the Global Trigger to apply thresholds and other selection criteria based on this detailed information. These criteria may be adapted as needed. There is no dependence on local or regional level thresholds, except for inherent ones like those necessary for the calculation of isolation information for electromagnetic clusters, the definition of a jet or the calculation of jet counts.

Another special and important feature of the L1 Global Trigger is that it not only receives particle energies or momenta but also location information, namely pseudorapidity and azimuth. For muon candidates charge information is also delivered. Trigger conditions based on event topology can therefore be applied already at L1. Furthermore, the space coordinates can be used in the Higher Level Triggers to select regions of interest. Spatial information is not only useful for physics triggers but also for calibration and trouble-shooting purposes.

The heart of the Global Trigger Processor is the logic performing the trigger algorithm calculations. An algorithm is a combination of trigger objects satisfying defined thresholds and spatial conditions. The GT logic can be programmed to calculate up to 128 different trigger algorithms in parallel for every bunch crossing. A final OR-function combines all active algorithms and generates the L1A signal. Rates can be kept under control by adjusting energy or momentum thresholds of physics objects or by prescaling algorithms selecting processes with large cross-sections.
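The final OR with prescaling can be illustrated as follows. The counter behaviour (pass every N-th firing of an algorithm, with a factor of 0 disabling it) is an assumed convention for this sketch, and the class and function names are hypothetical.

```python
MAX_ALGORITHMS = 128  # number of parallel algorithms quoted in the text


class Prescaler:
    """Pass every factor-th firing of an algorithm; factor 0 disables it."""

    def __init__(self, factor=1):
        self.factor = factor
        self.count = 0

    def passes(self, fired):
        if not fired or self.factor == 0:
            return False
        self.count += 1
        if self.count == self.factor:
            self.count = 0
            return True
        return False


def final_or(algo_bits, prescalers):
    """Return the L1A decision for one bunch crossing."""
    if len(algo_bits) > MAX_ALGORITHMS or len(algo_bits) != len(prescalers):
        raise ValueError("need one prescaler per algorithm, at most 128")
    # Evaluate every prescaler (no short-circuit) so that all counters
    # advance independently, as separate hardware counters would.
    results = [p.passes(bit) for bit, p in zip(algo_bits, prescalers)]
    return any(results)
```

An algorithm selecting a process with a large cross-section would be given a large factor, so that it contributes to the L1A only on a fraction of the crossings in which it fires.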

The Trigger Control System (TCS) is located logically between the Global Trigger and the TTC and Data Acquisition Systems, but physically it is located in the Global Trigger crate. An important part of it is the Trigger Throttle System (TTS) which simulates the behaviour of the readout logic of predictable subsystems and reduces the L1A rate to avoid overflows in the readout chain. The parameters of the TTS rules are programmable. The GT can be programmed to run also test and calibration algorithms. This is explained in detail in Chapter 16.

The GT expects corrected Muon and Calorimeter Trigger data. All tasks concerning the data quality such as “ghost” suppression, correction of bx-assignment errors etc. have to be done locally in the subsystems.

The GT Processor will be built in FPGA technology. ASICs have not been chosen for several reasons: The most important aspects determining the choice were the required flexibility and adaptability of the algorithm logic calculations in order to be ready to fully exploit new physics. The number of identical chips is not large and the speed requirements can already be met by today’s gate arrays. Independence from ASIC manufacturing technologies was also a criterion for the decision.

The concept and the hardware implementation of the Global Trigger are described in detail in references [15.1] and [15.2].

## 15.2 System Overview

### 15.2.1 Functionality

The input data coming from the subsystems are first synchronised to each other and to the LHC orbit and then sent via the crate backplane to the Global Trigger Logic modules where the trigger algorithm calculations are performed. For each quadruplet of channels (4  $\mu$ , 4 non-isolated and 4 isolated e/  $\gamma$ , 4 central and 4 forward jets, 4  $\tau$ -jets) Particle Conditions are calculated. A condition for a group of up to 4 particles of the same type may require that  $E_T$  or  $p_T$  is above a threshold, that the particles are within a selected window in  $\eta$  or in  $\phi$  or that two particles are opposite or close to each other in  $\eta$  or/and  $\phi$  etc. In parallel Delta Conditions calculate relations in  $\eta$  and  $\phi$  between two particles of different kinds.

Several Particle and Delta Conditions are then combined by a simple AND-OR logic to form algorithms. Of course all Particle Condition bits can be used either as trigger or as veto condition. Each of the 128 algorithms represents a complete physics trigger condition and is monitored by a rate counter.
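The AND-OR combination can be illustrated with a minimal software sketch. The representation chosen here (an algorithm as an OR of AND-terms, where a condition bit required to be false acts as a veto) and all names such as `evaluate_algorithm` are assumptions made for illustration; the actual logic is implemented in FPGA firmware.

```python
from typing import Dict, List, Tuple

# One AND-term: a list of (condition name, required value) pairs.
# A required value of False turns the condition into a veto.
# This data layout is an illustrative assumption, not the hardware format.
Term = List[Tuple[str, bool]]


def evaluate_algorithm(terms: List[Term], bits: Dict[str, bool]) -> bool:
    """Return True if at least one AND-term is fully satisfied."""
    return any(
        all(bits[name] == wanted for name, wanted in term)
        for term in terms
    )


# Hypothetical condition bits for one bunch crossing.
condition_bits = {
    "two_muons": True,
    "two_electrons": False,
    "missing_et": True,
    "forward_jet": False,
}

# (two muons AND missing E_T) OR (two electrons AND missing E_T)
dilepton_met = [
    [("two_muons", True), ("missing_et", True)],
    [("two_electrons", True), ("missing_et", True)],
]

# two muons AND NOT forward jet (the forward jet bit is used as a veto)
dimuon_no_forward_jet = [[("two_muons", True), ("forward_jet", False)]]

print(evaluate_algorithm(dilepton_met, condition_bits))
print(evaluate_algorithm(dimuon_no_forward_jet, condition_bits))
```

In hardware all terms are evaluated concurrently for every bunch crossing; the sequential evaluation above only reproduces the logical result.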

In the last step a final OR is applied to the algorithms to generate an L1 Accept signal that starts the Data Acquisition System and the Higher Level Trigger software. All algorithms can be prescaled to limit the overall L1 trigger rate. In fact several final ORs are provided in parallel to run subsystems independently for tests and calibration. More details can be found in Chapter 16.

In case of an L1A the Global Trigger is read out like any other subsystem. The readout requests arrive via the TTC network. The requests are queued, a bunch crossing number is appended and then they are broadcast to all Global Trigger boards, including those of the Global Muon Trigger. On each board a Readout Processor chip extracts data from the ring buffers, adds format and synchronisation words and sends the event record to a readout module, the Global Trigger Front-end board. The incoming data are checked and combined into one Global Trigger event record. According to an identifier the events are either collected in monitoring memories or sent to the DAQ interface.

### 15.2.2 Implementation

The tasks of the Global Trigger include the synchronisation of all input channels, the algorithm calculations and the generation of the event accept/reject decision, the production of the data containing the trigger information and the interaction with the TTC and DAQ Systems. The Global Trigger electronics accordingly consists of several modules described in the next paragraphs. All modules are housed in a single 9U VME crate. The Global Trigger crate contains three different parts, the Global Muon Trigger, the Global Trigger and the Trigger Control logic. Figure 15.2 shows the layout.

The Pipelined Synchronising Buffer (PSB) input modules receive the data to synchronise all input channels to the LHC orbit and to each other. Global Trigger Logic (GTL) modules combine the input channels and calculate up to 128 trigger algorithms in parallel. The Final Decision Logic (FDL) module optionally downscales the algorithms and combines them by a final OR-function to generate the L1 Accept signal that starts the CMS readout. Rate counters for each algorithm and a dead-time counter to monitor the trigger system are also foreseen on this module.



**Fig. 15.2:** Global Trigger hardware overview

The Trigger Control System (TCS) module limits the L1A rate and provides calibration control signals for all readout and trigger crates as described in Chapter 16. The Global Trigger Front-end (GTFE) module collects the trigger data from all modules after a readout request (L1 trigger) and sends them to the Data Acquisition System like any other detector part. The Timing (TIM) module contains a TTCrx receiver chip linked to the common clock distribution system (TTC), provides the clock and other fast control signals for the crate and sends readout requests to all modules. The backplane is fully custom-built. The upper part carries all VME signals for 32 bit access. The lower part contains all point-to-point links between the GT, GMT and TCS boards.

The Trigger Control System receives status information from the readout and trigger electronics to either limit the trigger rate or to inhibit triggers for a time. Readout and trigger electronics are divided into so-called subsystems. For example the RPC electronics consists of a forward and a barrel RPC subsystem. An additional set of two PSBs with 1 GTL module accepts fast control signals with status information from all subsystems. The Trigger Control System consists of 2 PSB boards, a GTL board and the TCS module and receives the Final OR decisions from the FDL board. A 6U VME crate contains receiver boards to convert the fast control signals to Channel Links. Both PSBs synchronise the fast signals to the Global Trigger clock and monitor all bits for every bunch crossing. The TCS board contains the circuits for the Trigger Throttle System and the calibration control. Free inputs on the PSB boards and free logic on the GTL modules can be used for future additions. In total the GTL logic board for the physics algorithms accepts 4 muons and up to 28 other input channels. 24 channels are occupied in the present design. The 4 free channels are available for future additions of trigger objects. The GT crate contains the following modules: 8 PSBs, 3 GTLs, 2 GMTs, 1 TIM, 1 FDL, 1 TCS and 1 GTFE.

## 15.3 Hardware Details

### 15.3.1 Input from Global Calorimeter Trigger

The Global Calorimeter Trigger (GCT) sends the following data to the Global Trigger: the total transverse energy $E_T$, the magnitude and the direction of the missing transverse energy $E_T^{\text{miss}}$, 4 non-isolated and 4 isolated $e/\gamma$, 4 central and 4 forward jets, 4 $\tau$-jets and 8 numbers of jets above different thresholds, two of which are reserved for forward jets. The objects are sorted by $E_T$, the one with the lowest $E_T$ going to channel 1, the one with the highest $E_T$ to channel 4. If fewer than four objects are found, the channels for lower $E_T$ are empty. Empty channels contain all zeros. The bit pattern of each jet and $e/\gamma$ channel contains $E_T$, $\eta$, $\phi$ and several control bits. Missing and total transverse energies consist of 12 bits, whereas the numbers of jets above threshold are represented as 4 bits.

The GCT crate is located close to the Global Trigger crate and sends the trigger data on Ethernet cables using 21 bit Channel Links. Due to the very low error rate no error bit correction is foreseen. A simple parity check will detect faulty hardware channels. To check for synchronisation the GCT sends the two least significant bunch counter bits on each channel for comparison with the local GT bunch counter as explained below in the description of the synchronisation chip.

### 15.3.2 Input from Global Muon Trigger

The Global Muon Trigger (GMT) is mounted in the Global Trigger crate and sends the best four muons immediately to the GTL logic boards to keep the latency as small as possible.

The four muons are sorted by rank. The rank is determined by  $p_T$  and quality. The muon with the lowest rank goes to channel 1 and the one with the highest rank to channel 4. If fewer than four particles are found, the channels for lower ranks are empty. Empty channels contain all zeros or at least  $p_T=0$ , quality=0.

As well as  $p_T$ ,  $\eta$  and  $\phi$ , a charge or sign bit, an isolation bit, a MIP bit and 3 quality bits define the properties of muons.  $p_T$  is represented by 5 bits with a non-linear scale. The scale does not have to be identical for all types of physics runs or luminosities. It can be optimized by the Muon Trigger subsystems according to physics requirements.

### 15.3.3 Summary of input bits

Table 15.1 and Table 15.2 show the input bits received by the Global Trigger from the Global Calorimeter Trigger and the Global Muon Trigger. The numbers in brackets are the bit numbers. P0, P1 are parity bits, SYN is a synchronisation bit, and B0, B1 are bunch counter bits. Bit 15 is always set to zero for electrons, jets and  $\tau$ -jets. OV represents an overflow bit for the total and missing transverse energies.

**Table 15.1:** Cable and bit assignment for GCT to GT links

| Cable numbers | Data                                                    | Bit assignment per cable                                                                                                  |
|---------------|---------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------|
| 1-4           | Non-isolated e/$\gamma$ (1-4)                           | $E_T(0:5)$, $\eta(6:9)$, $\phi(10:14)$,<br>$P0(16)$, $P1(17)$, SYN(18), $B0(19)$, $B1(20)$                                |
| 5-8           | Isolated e/$\gamma$ (1-4)                               | $E_T(0:5)$, $\eta(6:9)$, $\phi(10:14)$,<br>$P0(16)$, $P1(17)$, SYN(18), $B0(19)$, $B1(20)$                                |
| 9-12          | Central jets (1-4)                                      | $E_T(0:5)$, $\eta(6:9)$, $\phi(10:14)$,<br>$P0(16)$, $P1(17)$, SYN(18), $B0(19)$, $B1(20)$                                |
| 13-16         | Forward jets (1-4)                                      | $E_T(0:5)$, $\eta(6:9)$, $\phi(10:14)$,<br>$P0(16)$, $P1(17)$, SYN(18), $B0(19)$, $B1(20)$                                |
| 17-20         | $\tau$-jets (1-4)                                       | $E_T(0:5)$, $\eta(6:9)$, $\phi(10:14)$,<br>$P0(16)$, $P1(17)$, SYN(18), $B0(19)$, $B1(20)$                                |
| 21            | Total $E_T$, bits 3:5 of<br>$\phi-E_T^{\text{miss}}$    | $\Sigma E_T(0:11)$, OV(12), $\phi-E_T^{\text{miss}}(13:15)$,<br>$P0(16)$, $P1(17)$, SYN(18), $B0(19)$, $B1(20)$           |
| 22            | Missing $E_T$, bits 0:2<br>of $\phi-E_T^{\text{miss}}$  | $E_T^{\text{miss}}(0:11)$, OV(12), $\phi-E_T^{\text{miss}}(13:15)$,<br>$P0(16)$, $P1(17)$, SYN(18), $B0(19)$, $B1(20)$    |
| 23            | Jet counts (1-4)                                        | Count #1 (0:3), Count #2 (4:7),<br>Count #3 (8:11), Count #4 (12:15)                                                      |
| 24            | Jet counts (5-8)                                        | Count #5 (0:3), Count #6 (4:7),<br>Count #7 (8:11), Count #8 (12:15)                                                      |

**Table 15.2:** Bit assignment for the backplane GMT to GT link

| Data        | Bit assignment per line                                                                             |
|-------------|-----------------------------------------------------------------------------------------------------|
| Muons (1-4) | $\phi(0:7)$ , $p_T(8:12)$ , quality (13:15), $\eta(16:21)$ ,<br>sign(22), MIP(23), ISO(24), SYN(25) |
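
As an illustration of Tables 15.1 and 15.2, the following sketch unpacks one calorimeter object word and one muon word into their fields. It assumes that bit 0 is the least significant bit and that the bit ranges in the tables are inclusive; the function names are hypothetical and the code is not part of any trigger software.

```python
def field(word: int, lo: int, hi: int) -> int:
    """Extract bits lo..hi (inclusive, bit 0 = least significant bit)."""
    width = hi - lo + 1
    return (word >> lo) & ((1 << width) - 1)


def unpack_gct_object(word: int) -> dict:
    """Decode an e/gamma, jet or tau-jet word (cables 1-20 of Table 15.1)."""
    return {
        "et": field(word, 0, 5),
        "eta": field(word, 6, 9),
        "phi": field(word, 10, 14),
        "p0": field(word, 16, 16),
        "p1": field(word, 17, 17),
        "syn": field(word, 18, 18),
        "b0": field(word, 19, 19),
        "b1": field(word, 20, 20),
    }


def unpack_muon(word: int) -> dict:
    """Decode a muon word on the GMT to GT backplane link (Table 15.2)."""
    return {
        "phi": field(word, 0, 7),
        "pt": field(word, 8, 12),
        "quality": field(word, 13, 15),
        "eta": field(word, 16, 21),
        "sign": field(word, 22, 22),
        "mip": field(word, 23, 23),
        "iso": field(word, 24, 24),
        "syn": field(word, 25, 25),
    }


# Example: build a muon word from known field values and decode it again.
example = 200 | (21 << 8) | (5 << 13) | (40 << 16) | (1 << 22) | (1 << 24)
print(unpack_muon(example))
```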

### 15.3.4 Synchronisation and Latency Buffer Hardware

The synchronisation hardware is located on PSB boards shown in Figure 15.3. The PSB boards are also used as input boards in the Global Muon Trigger. Channel Link receivers convert fast serial LVDS data to parallel words (21 bits at a rate of 40 MHz).

Each PSB module receives 12 input channels. Three PSB boards are foreseen for the GMT, another 3 PSB boards for the GT logic and 2 PSB boards to accept fast control signals such as ready, busy or error signals for the Trigger Control System.



**Fig. 15.3:** Pipeline Synchronising Buffer Board

### 15.3.5 Synchronisation FPGA

The Synchronisation Chip shown in Figure 15.4 contains the oversampling circuits, the synchronisation pipelines and the synchronisation control logic for two channels. The input data are sampled four times per bunch crossing and the sample furthest from the data switching time is stored in the following pipeline registers. Additionally four samples of one data bit are appended to the original data word. The data words are delayed in the programmable synchronisation pipeline and then sent to ring buffers and to the algorithm logic board.

Each Synchronisation Chip is locked independently to the LHC clock and to the orbit. For test purposes a simulated Bunch Crossing Zero (BC0) signal can be generated by VME to start artificial orbits at the same time on all channels of a PSB board. The pipeline delay and all registers can be preloaded and read by VME. At the end of every LHC cycle the contents of counters for parity and synchronisation errors and for state transitions of the input words are saved as well as the number of bunch crossings during the last LHC orbit to check for counting errors. The status and error registers can be read by VME.

Two parity bits are used to check the input data bits 15 - 00 for transmission errors. To detect synchronisation errors the two least significant bunch counter bits of an input channel are compared to the local on-chip bunch counter. An error condition is encountered if the offset (modulo 4) changes during the run. This check can be limited to run only during a certain part of the LHC orbit to avoid error messages during reset procedures.
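The bunch counter comparison can be sketched as follows. This is a minimal model under the assumption that the offset observed on the first checked word is taken as the reference; the class and attribute names are hypothetical, and the restriction of the check to a part of the orbit is omitted.

```python
class BunchCounterCheck:
    """Software model of the modulo-4 synchronisation check (illustrative only)."""

    def __init__(self) -> None:
        self.reference_offset = None  # offset seen on the first checked word
        self.errors = 0               # number of detected offset changes

    def check(self, received_bits: int, local_bx: int) -> bool:
        """Compare the two received bunch counter bits with the local counter.

        Returns False and counts an error if the offset (modulo 4) differs
        from the reference offset established at the start of the run.
        """
        offset = (received_bits - local_bx) % 4
        if self.reference_offset is None:
            self.reference_offset = offset
            return True
        if offset != self.reference_offset:
            self.errors += 1
            return False
        return True


checker = BunchCounterCheck()
for bx in range(16):
    # The sender is assumed to run one bunch crossing ahead of the local counter.
    checker.check((bx + 1) & 0b11, bx)
print(checker.errors)
```

Since the orbit length of 3564 bunch crossings is a multiple of 4, the offset of a correctly synchronised channel stays constant across orbit boundaries.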

### 15.3.6 Dual Port Memories for input data

The input data for each channel are stored in a Dual Port Memory (DPM) working as a ring buffer and wait for a readout request either from the DAQ or from a monitoring program. Every LHC orbit the bunch counter reset (BCR) signal resets the DPM address counters making the address equal to the bunch crossing number (modulo L, if the DPM of length L does not contain a complete LHC orbit). Data from the previous LHC cycle are overwritten. On the final PSB boards the DPMs are implemented inside the Synchronisation Chips. Their size is chosen to store data of more bunch crossings than other subsystems in order to keep the trigger history for a sufficient length of time. On the prototype board the DPM consists of discrete DPM chips and contains data of one complete LHC orbit. The DPMs can be read during a data taking run by the Readout Processors (ROP) or by VME for tests.
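The addressing scheme of the ring buffer can be modelled in a few lines. The sketch below only illustrates the relation between the bunch crossing number and the memory address (modulo the buffer length L); the class name and the chosen lengths are assumptions.

```python
ORBIT_LENGTH = 3564  # bunch crossings per LHC orbit


class RingBuffer:
    """Model of a Dual Port Memory used as a ring buffer (illustrative only)."""

    def __init__(self, length: int) -> None:
        self.length = length
        self.cells = [0] * length

    def address(self, bx: int) -> int:
        """Memory address for a given bunch crossing number."""
        return bx % self.length

    def write(self, bx: int, word: int) -> None:
        # Older data stored at the same address are overwritten.
        self.cells[self.address(bx)] = word

    def read(self, bx: int) -> int:
        return self.cells[self.address(bx)]


# A buffer holding a complete orbit: the address equals the bunch crossing number.
full_orbit = RingBuffer(ORBIT_LENGTH)
full_orbit.write(100, 0x1ABCD)
print(full_orbit.address(100), hex(full_orbit.read(100)))
```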



**Fig. 15.4:** Synchronisation Chip

## 15.4 Logic

Figure 15.5 shows the basic layout of the GTL board which calculates 128 trigger algorithms in parallel. Input channels are combined into groups of four objects (4 $\mu$, 4 non-isolated e/$\gamma$, 4 isolated e/$\gamma$, 4 central jets, 4 forward jets, 4 $\tau$-jets) denoted as “particles”. In addition the total transverse energy $\Sigma E_T$, the vectorial missing transverse energy $E_T^{\text{miss}}$ and 8 numbers of jets above different thresholds are received and combined to a group. The first step in the Algorithm Logic consists of applying conditions to each group of objects. The chips performing these operations are called Condition Chips. Due to the pin count limitations on the chips, each group of four input channels is only sent to four out of six Condition Chips.

**Fig. 15.5:** Global Trigger Logic board

In the Condition Chips, Particle Conditions between identical object types and Delta Conditions between different object types are calculated. The Particle Conditions are composed of conditions for single particles and correlations. The first consist of the application of  $p_T$  or  $E_T$  thresholds and windows in  $\eta$  and/or  $\phi$ . For muons the required isolation bits are also checked. The second calculate the differences  $|\Delta\eta|$  and  $|\Delta\phi|$  between two particles of the same type. For muons, the sign and MIP bit patterns are also checked. The Delta Conditions calculate the absolute differences in  $\eta$  and  $\phi$  between different types of particles and are discussed further below.

Different conditions may be required for each of the four particles. If for instance four muons are required in the trigger the  $p_T$  thresholds and other conditions may be different for all of them [ $p_T(\mu_1) > 60 \text{ GeV}/c$  **and**  $p_T(\mu_2) > 40 \text{ GeV}/c$  **and**  $p_T(\mu_3) > 20 \text{ GeV}/c$  **and**  $p_T(\mu_4) > 8 \text{ GeV}/c$ ]. Two muons might be requested to be in the forward and the other two in the central region. Other trigger algorithms could require two muons opposite to each other in  $\phi$  or a dimuon pair of opposite sign with at least one MIP bit set. Many different conditions are possible.

In the next step of the Algorithm Logic the condition bits are combined by a simple AND-OR logic to form a trigger algorithm. All particle condition bits can be used either as trigger or as veto condition. If complicated algorithms require conditions from particles not available on the Condition Chip, the creation of algorithms is deferred to the following Algorithm Chips.

To illustrate the trigger logic further, examples of the steps necessary for the calculation of an algorithm are given. The first example is the use of a Particle Condition with a space correlation necessary to trigger on two isolated electrons back-to-back in azimuth, a signal which might come from the decay of a heavy vector boson ( $Z'$ ). Figure 15.6 shows a pictorial representation and the cuts used. Physically reasonable values for the thresholds and other conditions are indicated, not those actually used in the hardware. Figure 15.7 shows the case of two opposite sign isolated muons back-to-back in azimuth, which could also come from a  $Z'$ . Here isolation conditions as well as MIP and sign templates are applied in addition to the threshold and spatial requirements as for the electron case.



**Fig. 15.6:** Particle Condition for back-to-back electrons



**Fig. 15.7:** Particle Condition for back-to-back opposite sign isolated muons with MIP bits

An algorithm requiring two muons or two electrons exceeding certain transverse momentum or energy thresholds in conjunction with missing transverse energy, as could be the case for a SUSY slepton, is depicted in Figure 15.8.

**Fig. 15.8:** Algorithm for two muons or two electrons with missing transverse energy

The algorithm bits are sent to the Final Decision Logic module where each algorithm can be pre-scaled by a programmable factor. Algorithms are not fixed a priori but can be fine-tuned to physics or operational needs. The $p_T$ and $E_T$ thresholds of existing conditions can be loaded into registers using VME instructions. If a new algorithm has to be programmed, the new layouts for the FPGA chips are calculated first and then loaded as explained below.

### 15.4.1 Standard algorithms

#### Configuration of Condition and Algorithm Chips

To ease the design of a Condition Chip for a new trigger set-up several types of template circuits for each type of 'particles' are set up and used like building blocks to compose predefined Particle Conditions. In the Particle Condition circuit the input data are applied to a set of Single Particle and Correlation templates and the results are combined by an AND-OR function. The predefined Particle Conditions are tested with worst case values to ensure that the latency and space requirements are met. Delta Conditions as explained below between different 'particle' types are also predefined. The predefined Particle and Delta Conditions are used to compose Algorithms representing a physics trigger.

Actual Particle and Delta Conditions are created loading the look-up tables of all used templates with actual values and placing the circuits on the chip. Several Conditions are then combined either to complete Algorithms or still incomplete Pre-Algorithms and connected to an output pin. A set-up & layout program is used to define the parameters for actual Pre-Algorithms or Conditions and to place them on the FPGA chip. The program delivers VHDL files, which are appended to the more general VHDL code to complete the design for an actual Condition Chip. A similar set-up program defines the content of the following Algorithm Chips designed either to pass on already finished algorithms or to combine Pre-Algorithm bits to entire Algorithms.

#### Particle Conditions

##### (a) Muon Conditions

The group of four muons is checked against a set of templates to generate a Muon Condition.

A *single muon template* checks for:

- $p_T \geq$  threshold (for tests also a fixed  $p_T$  may be required),
- Isolation bit (check may depend on  $p_T$ ),
- Inside  $\eta$  window(s) ('forward' designates actually two windows in hardware corresponding to the forward and backward  $\eta$ -hemispheres),
- Inside  $\phi$  window (for tests only),
- Quality bits: Find any of the allowed quality values.

A *4-muon template* consists of four single muon templates, a sign template and a MIP bit correlation template. A *dimuon template* consists of two single muon templates, a sign template and a MIP bit correlation template. A *dimuon template with space correlation* consists of two single muon templates and four correlation templates for  $\Delta\eta$ ,  $\Delta\phi$ , sign and MIP bits. The four muons are applied in all possible permutations (1234, 1243, 1324...) to a set of templates to find a permutation that fulfills all the conditions. If fewer than four or two muons are required for a particular algorithm the templates for the other muons are set to trivial values (e.g.:  $p_T \geq 0$  GeV/c,  $0^\circ < \phi < 360^\circ$  etc.). An  $\eta$  or  $\phi$  *correlation template* checks for two muons opposite or close to each other in  $\eta$  or  $\phi$ . Other values of space difference can be checked upon by the Higher Level Triggers. A *sign correlation template* checks for any possible sign pattern of up to four muons. A *MIP bit correlation template* checks for a MIP bit pattern of up to four muons.
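The permutation search can be sketched in software as follows. This is a simplified model: only the $p_T$ threshold and the isolation bit of the single muon template are included, $p_T$ is given in arbitrary scale units, and the names `single_muon_template` and `muon_condition` are hypothetical. In hardware all permutations are evaluated in parallel rather than one after the other.

```python
from itertools import permutations
from typing import Callable, Dict, Sequence

Muon = Dict[str, int]
Template = Callable[[Muon], bool]


def single_muon_template(pt_min: int = 0, require_iso: bool = False) -> Template:
    """Build a simplified single muon template (threshold and isolation only)."""
    def check(mu: Muon) -> bool:
        return mu["pt"] >= pt_min and (mu["iso"] == 1 or not require_iso)
    return check


def muon_condition(muons: Sequence[Muon], templates: Sequence[Template]) -> bool:
    """True if some ordering of the muons satisfies all templates at once."""
    return any(
        all(template(mu) for template, mu in zip(templates, ordering))
        for ordering in permutations(muons)
    )


# Four-muon condition with a different threshold for each muon.
four_muon = [single_muon_template(60), single_muon_template(40),
             single_muon_template(20), single_muon_template(8)]

# Dimuon condition: the two unused templates are set to trivial values.
dimuon = [single_muon_template(20, require_iso=True), single_muon_template(20),
          single_muon_template(), single_muon_template()]

channels = [{"pt": 8, "iso": 0}, {"pt": 60, "iso": 0},
            {"pt": 20, "iso": 1}, {"pt": 40, "iso": 0}]
print(muon_condition(channels, four_muon), muon_condition(channels, dimuon))
```

Empty channels with $p_T = 0$ are accepted by the trivial templates, so a dimuon condition is not affected by the number of additional muons found.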

With these facilities the following muon combinations can all be triggered upon at L1: two muons opposite to each other in  $\phi$ ; one muon with a selected sign; an opposite sign dimuon pair with two MIP bits set and many others.

##### (b) Calorimeter Conditions (Electron/Photon, Jet, $\tau$-Jet Conditions)

A similar logic as for muons is used for calorimeter trigger objects. A set of templates is applied to a group of four isolated or non-isolated e/ $\gamma$ , central or forward jets and  $\tau$ -jets. A calorimeter 4(2)-particle template consists of 4(2) single particle templates; a 2-particle template with space correlation consists of 2 single particle templates and 2 correlation templates for  $\Delta\eta$  and  $\Delta\phi$ . The four particles are applied in all possible permutations (1234, 1243, 1324...) to the set of templates to find an arrangement that fulfills all conditions. Each of the single particle templates can be programmed with different values. If fewer than 4(2) particles are required for a particular algorithm the unused templates are set to trivial values.

A *single particle template* checks for:

- $E_T \geq$  threshold (for tests also a fixed  $E_T$  may be required),
- Inside $\eta$ window(s) ('forward' designates two windows in hardware as described for the muon case),
- Inside $\phi$ window (for tests only).

An  $\eta$  or  $\phi$  *correlation template* checks for two particles opposite or close to each other in  $\eta$  or  $\phi$ .

##### (c) Calorimeter Conditions (Total and Missing Transverse Energy Conditions)

Conditions for the total  $E_T$  and the 8 numbers of jets consist of simple threshold comparators. For the missing  $E_T$  vector a Condition consists of a threshold comparator and a  $\phi$  window comparator. The  $\phi$  window can be used for tests.

#### Topological triggers and the Delta Conditions

This section describes how particles can be selected in defined pseudorapidity windows and how pairs of particles with predetermined correlations in $\phi$ or $\eta$ can be found.

$\eta$  windows are important to find one particle in the barrel and another particle of the same or of a different type in the endcap region. If jets are involved, for example forward tagging jets, there is an additional possibility to use explicitly the forward jets sent by the Global Calorimeter Trigger. With the logic of the Particle Conditions the determination of  $\eta$  or  $\phi$  correlations between a pair of particles of the same type is possible. One can find objects opposite or close to each other in  $\eta$  and  $\phi$  within a programmable tolerance. For muons, correlations in  $\phi$  will be made with reduced resolution to save logic resources and to meet timing constraints inside the FPGA chips.

For space relations between different types of particles special “Delta Conditions” have been designed. The  $|\Delta\eta|$  and  $|\Delta\phi|$  differences between all possible pairs of two particle groups are calculated. The differences are compared to limits to trigger on objects opposite or close to each other in  $\eta$  or  $\phi$  within a programmable tolerance. Both conditions in  $|\Delta\eta|$  and  $|\Delta\phi|$  can be applied concurrently to trigger on a spatial relation. However, it should be remarked that the trigger does not calculate a true three-dimensional distance in space. If muons are compared with calorimeter particles the muon values, which have higher precision, are converted to calorimeter units using lookup tables. The participating particles have to fulfil also  $E_T$  and  $p_T$  requirements. An example requiring a Delta Condition would be a trigger on a jet and a muon opposite to each other in  $\phi$ .
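A Delta Condition between two particle groups can be sketched as below. The sketch assumes that both groups are already expressed in common integer $\eta$ and $\phi$ bin units (for muons, after the lookup-table conversion), that the key `et` holds $E_T$ or $p_T$, and that the number of $\phi$ bins is passed as a parameter. All names and the example value of 18 $\phi$ bins are assumptions for illustration.

```python
from typing import Dict, Sequence, Tuple

Particle = Dict[str, int]


def delta_phi(phi_a: int, phi_b: int, n_phi_bins: int) -> int:
    """Absolute azimuthal difference in bins, including the wrap-around."""
    d = abs(phi_a - phi_b) % n_phi_bins
    return min(d, n_phi_bins - d)


def delta_condition(group_a: Sequence[Particle], group_b: Sequence[Particle],
                    n_phi_bins: int,
                    dphi_range: Tuple[int, int], deta_range: Tuple[int, int],
                    et_min_a: int = 1, et_min_b: int = 1) -> bool:
    """True if any pair (a, b) passes the thresholds and both difference windows."""
    for a in group_a:
        if a["et"] < et_min_a:      # also skips empty channels (all zeros)
            continue
        for b in group_b:
            if b["et"] < et_min_b:
                continue
            dphi = delta_phi(a["phi"], b["phi"], n_phi_bins)
            deta = abs(a["eta"] - b["eta"])
            if (dphi_range[0] <= dphi <= dphi_range[1]
                    and deta_range[0] <= deta <= deta_range[1]):
                return True
    return False


# A jet and a muon opposite to each other in phi (18 phi bins assumed).
jets = [{"et": 30, "eta": 5, "phi": 2}, {"et": 0, "eta": 0, "phi": 0}]
muons = [{"et": 12, "eta": 6, "phi": 11}]
print(delta_condition(jets, muons, 18, dphi_range=(8, 9), deta_range=(0, 21)))
```

The two windows are tested independently, which reflects the remark above that no true three-dimensional distance is calculated.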

### 15.4.2 Special algorithms

Several algorithm bits are reserved for special runs used for data checking, calibration, synchronisation and hardware testing purposes. The conditions cannot be completely fixed at this time since the running-in phase of the CMS detector at the start-up of the LHC will influence the needs for special procedures. The following special triggers will certainly be needed during the entire lifetime of the detector:

- A Random Trigger with programmable rate and start value for calibration and testing.
- A Synchronisation Trigger to trigger at a selected bunch crossing number, with a prescale option.
- A Single Channel Trigger for a specific input channel. The algorithm bits are used for synchronisation to the LHC orbit.
- A Minimum Bias Trigger to check trigger efficiencies.
- An external trigger generated by various sources.

## 15.5 Final Decision Logic Module

The FDL module depicted in Figure 15.9 contains the Final Decision Logic to combine all Algorithm Bits to a final L1A signal. If necessary, separate L1A signals can be provided for sub-systems. There are a number of final decision gates, each selecting a programmable set of algorithms to control subsystems either in common or separately (“partitions”). For standard physics data taking there is only one single partition consisting of all CMS sub-systems. For calibration and different test readout modes running with more than one partition may be necessary.

**Fig. 15.9:** Final Decision Logic Board

The FDL module receives 128 Algorithm Bits per bunch crossing from the GTL module and some signals from a GTL board of the TCS where external signals can be combined to generate special trigger signals. After optional downscaling, all Algorithm Bits are combined to make the final L1A signal, which goes via the Trigger Control System (TCS) to the TTC system to save data from the front-end buffers. The L1 Accept is sent concurrently to the Event Manager to start the CMS data readout. Each algorithm can be prescaled by a programmable factor to keep the trigger rate under control. All Algorithm Bits and the final decisions of every bunch crossing are stored in DPMs used as ring buffers to be read by the DAQ.
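The prescaling and the final OR can be modelled with a short sketch. It assumes a simple counter-based prescaler that passes every N-th firing of an algorithm, and a per-partition enable mask selecting the algorithms that enter the OR; neither detail is specified here, and the names `Prescaler` and `final_or` are hypothetical.

```python
from typing import List, Sequence


class Prescaler:
    """Counter-based prescaler: passes every `factor`-th firing (assumed scheme)."""

    def __init__(self, factor: int = 1) -> None:
        self.factor = factor
        self.count = 0

    def accept(self, fired: bool) -> bool:
        if not fired:
            return False
        self.count += 1
        if self.count == self.factor:
            self.count = 0
            return True
        return False


def final_or(algorithm_bits: Sequence[bool], prescalers: Sequence[Prescaler],
             enabled: Sequence[bool]) -> bool:
    """Combine all prescaled and enabled algorithm bits into one decision."""
    # Every prescaler is updated, so its counter advances even if masked.
    passed: List[bool] = [p.accept(bit) for p, bit in zip(prescalers, algorithm_bits)]
    return any(bit and on for bit, on in zip(passed, enabled))


N_ALGORITHMS = 128
prescalers = [Prescaler(1) for _ in range(N_ALGORITHMS)]
prescalers[5] = Prescaler(3)          # algorithm 5 is prescaled by a factor of 3
enabled = [True] * N_ALGORITHMS

bits = [False] * N_ALGORITHMS
bits[5] = True
decisions = [final_or(bits, prescalers, enabled) for _ in range(6)]
print(decisions)
```

Running several such ORs with different `enabled` masks corresponds to the parallel final decision gates used for partitions.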

## 15.6 Synchronisation Procedure

After the fine time adjustment each individual input channel has to be adjusted to the LHC orbit. Then all channels are synchronised to each other by starting the transfer to the GTL modules at the same time. All input channels except for muons are first synchronised in the PSB modules and then sent by Channel Links via the backplane to the GTL modules. The synchronisation of the muons is done at the input of the Global Muon Trigger. Then the muons are sent without any further delay via the backplane to the GTL modules to keep the overall L1 latency as small as possible. Therefore the latency of the Global Trigger has to be seen in context with the Global Muon Trigger.

### 15.6.1 Fine time adjustment of input channels

As software can change the clock phase of the GCT and Muon Trigger crates relative to the Global Trigger crate, input data from these crates cannot be captured safely by cutting the interface cables to an appropriate length. To find the best ‘cable length’ the Synchronisation Chip samples all parallel input bits four times per bunch crossing (160 MHz) and selects the best sample to store the input data into the following synchronisation pipeline as shown on the left side of Figure 15.10 and explained in detail in [15.3]. If the phase of an input channel changes, a different sample is selected to store the data bits. State transition counters help to find the best sample. The distribution as illustrated in the same Figure 15.10 shows the time stability of an input channel relative to the local clock and can be used to monitor input channels continuously.
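One possible way to derive the best sample from the state transition counters is sketched below. The heuristic (choosing the sample whose two neighbouring sampling intervals saw the fewest transitions) and the counter layout are assumptions for illustration; the actual selection logic is described in [15.3].

```python
from typing import Sequence

N_SAMPLES = 4  # samples per bunch crossing (160 MHz oversampling)


def best_sample(transitions: Sequence[int]) -> int:
    """Select the sample index furthest from the data switching time.

    transitions[i] is assumed to count how often the data changed between
    sample (i - 1) and sample i, with indices taken modulo N_SAMPLES.
    The sample with the fewest transitions in its two adjacent intervals
    is returned; ties are resolved in favour of the lowest index.
    """
    def score(sample: int) -> int:
        return transitions[sample] + transitions[(sample + 1) % N_SAMPLES]

    return min(range(N_SAMPLES), key=score)


# Most transitions occur between samples 0 and 1, some between 1 and 2.
print(best_sample([0, 950, 50, 0]))
```

A shift of the histogram maximum over time would indicate a phase drift of the input channel relative to the local clock.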



**Fig. 15.10:** Fine Time Adjustment of input channels

### 15.6.2 Bunch Crossing Synchronisation

The principle of bunch crossing synchronisation is shown in Figure 15.11. Each Synchronisation Chip contains a bunch crossing counter (BC) and receives a common “Bunch Counter Reset” (BCR) signal to lock the circuit to the LHC orbit. If BCR is not sent every orbit a LIMIT comparator resets the bunch crossing counter after 3564 (programmable) bunch crossings automatically. In any case a common periodic BCR signal guarantees that all local counters in the crate run synchronously.

**Fig. 15.11:** Bunch crossing synchronisation
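
The behaviour of the bunch crossing counter with BCR and the LIMIT comparator can be modelled as follows. This is a behavioural sketch only; the exact clock-cycle timing of the reset in the chip is an assumption, and the class name is hypothetical.

```python
class BunchCounter:
    """Bunch crossing counter with external reset and programmable limit."""

    def __init__(self, limit: int = 3564) -> None:
        self.limit = limit  # number of bunch crossings per orbit
        self.bc = 0

    def clock(self, bcr: bool = False) -> int:
        """Advance by one bunch crossing and return the new counter value.

        The counter returns to zero on a BCR signal, or automatically when
        the programmed limit is reached without a BCR.
        """
        if bcr or self.bc + 1 >= self.limit:
            self.bc = 0
        else:
            self.bc += 1
        return self.bc


counter = BunchCounter(limit=8)   # short orbit for demonstration
print([counter.clock() for _ in range(10)])
```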

First all synchronisation pipelines of all channels are set to a minimum delay. Starting at the time of  $BC=0$  data are written into the dual port memories. The content of the memories is compared to the LHC orbit structure to find data from the actual first bunch crossing. The start time is then changed until data from the first bunch crossing go into the first memory location. The start time found represents the relative latency of the input channel. The procedure is done in parallel for all channels. For the synchronisation of the channels to each other the start time of the latest channel is selected as the common start time. For all other channels with a smaller latency the delays of their synchronisation pipelines are increased accordingly. The procedure is finished when data of the first bunch crossing of all channels are stored in the first memory location using the same start time.
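The two steps of this procedure, finding the relative latency of each channel and deriving the additional pipeline delays, can be sketched as follows. The occupancy patterns, channel names and function names are hypothetical; latencies are counted in bunch crossings.

```python
from typing import Dict, Sequence


def measure_latency(recorded: Sequence[int], expected: Sequence[int]) -> int:
    """Find the shift that maps the recorded occupancy onto the orbit structure.

    Both sequences hold one entry per bunch crossing (1 = filled, 0 = gap)
    and must have the same length.
    """
    n = len(expected)
    for shift in range(n):
        if all(recorded[(i + shift) % n] == expected[i] for i in range(n)):
            return shift
    raise ValueError("recorded data do not match the expected orbit structure")


def alignment_delays(latencies: Dict[str, int]) -> Dict[str, int]:
    """Extra pipeline delay per channel so that all match the latest channel."""
    common_start = max(latencies.values())
    return {channel: common_start - latency
            for channel, latency in latencies.items()}


# Toy orbit of 8 bunch crossings; data arrive 3 bunch crossings late.
orbit = [1, 1, 1, 0, 0, 1, 1, 0]
recorded = [orbit[(j - 3) % len(orbit)] for j in range(len(orbit))]
print(measure_latency(recorded, orbit))
print(alignment_delays({"iso_eg_1": 3, "central_jet_1": 5, "tau_jet_1": 4}))
```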

### 15.6.3 Synchronisation of Muon and Calorimeter Trigger data

Calorimeter Trigger data are expected to arrive early and therefore have to wait for the later-arriving muons. As described in Chapter 14, the Global Muon Trigger synchronises the Regional Muon Trigger data to each other and to the LHC orbit. The last channel from the Regional Muon Trigger passes through the synchronisation circuit in the fastest possible way to minimize the overall latency. After a constant and known delay, the GMT latency, the best four muons arrive at the GTL board. The common start time for the calorimeter channels is then chosen such that all trigger data arrive at the GTL module at the same time. The latency overview in Figure 15.12 shows the time relation between the two systems.

To verify that all channels arrive at the same time at the GTL board a simple test algorithm for each channel delivers a trigger for all non-zero  $E_T$  and  $p_T$  values as mentioned in Section 15.4.2 describing special trigger procedures. After this the position of all gaps in the LHC orbit should be identical for all channels. For adjustments without beam a synchronisation word can be used for a similar test.

## 15.7 Latencies of the GMT and the GT

To minimize the overall latency for muons the Global Muon Trigger is mounted in the Global Trigger crate and the synchronisation procedure is done at its entrance. It sends the best four muons via the backplane directly to the Global Trigger Logic boards.



**Fig. 15.12:** Latency of the Global Muon Trigger and the Global Trigger

The Global Calorimeter Trigger sends the trigger objects using Channel Links to the PSB input modules, which send the data also by Channel Links via the backplane to the GTL logic board. The Channel Links will run with an 80 MHz clock to reduce the overall latency. The latency numbers of the different steps of the Global Trigger can directly be derived from Figure 15.12.

## 15.8 Output Processing and Monitoring

Figure 15.13 shows the readout procedure and the GTFE module. The Timing (TIM) board contains a TTCrx chip and starts a readout sequence after a L1A or a monitoring request. The readout request is broadcast to all modules (PSB, GTL, GMT, FDL etc.) in the GT crate via the readout bus on the backplane. On each module a Readout Processor (ROP) collects the data from registers and all DPMs and sends a formatted event record via the backplane to the GTFE readout module. A Device Dependent Unit (DDU) merges and checks data from all boards, builds a formatted GT event and sends it into an event buffer. The standard Readout Unit Interface (RUI) fetches the events from the buffer and sends them to the DAQ.

**Fig. 15.13:** Readout in the Global Trigger crate

Readout requests for monitoring use the same circuits and are inserted during idle periods. The data are moved into separate monitoring memories on the GTFE module.

### 15.8.1 L1A requests

In case of a L1A an identifier word, the event number and the current contents of the bunch counter are moved into the L1 queue FIFO. If the TTCrx is programmed to run in a simpler mode omitting event or bunch crossing number, the missing words are provided by the local event and bunch counter. The readout bus (RO-bus) controller sends readout requests until the L1 queue FIFO is empty. If there is no L1A request pending, requests are taken from a monitoring queue FIFO. Readout requests are sent as long as the collecting GTFE board accepts new events. Between two consecutive L1A signals no minimum distance in time is required. The maximum event rate depends on the number of bunch crossings read by one request. For a length of 3 bx a rate of 200 to 300 kHz could be achieved. The length of the L1 queue FIFO is defined by the size of the FPGA memory. Before the L1 queue FIFO becomes full a “GT queue warning” signal is sent to the Trigger Control System.

### 15.8.2 Monitoring requests

Monitoring requests either read the DPMs like normal event triggers or read all counters for statistics purposes. The requests are generated by external signals or by VME instructions and can be sent as single or periodical requests. All monitoring requests are fed into the monitoring queue FIFO and are broadcast whenever there is no L1A request pending.
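The priority scheme of the two request queues can be sketched as follows. This is a software model for illustration, not the RO-bus controller itself; the class name, the request tuples and the warning level are hypothetical, since the text only states that the FIFO length is set by the FPGA memory.

```python
from collections import deque


class ReadoutArbiter:
    """Model of the RO-bus controller: L1A requests beat monitoring requests."""

    def __init__(self, warn_level=12):
        self.l1_queue = deque()
        self.mon_queue = deque()
        self.warn_level = warn_level  # hypothetical "GT queue warning" level

    def push_l1a(self, event_nr, bx_nr):
        """Queue a L1A request; return True if the queue warning is raised."""
        self.l1_queue.append(("L1A", event_nr, bx_nr))
        return len(self.l1_queue) >= self.warn_level

    def push_monitoring(self, bx_nr):
        """Queue a monitoring request (no event number attached)."""
        self.mon_queue.append(("MON", None, bx_nr))

    def next_request(self, gtfe_ready=True):
        """Return the next request to broadcast, or None if nothing is sent."""
        if not gtfe_ready:
            return None  # the collecting GTFE board accepts no new events
        if self.l1_queue:
            return self.l1_queue.popleft()
        if self.mon_queue:
            return self.mon_queue.popleft()
        return None
```

A monitoring request queued before a L1A is still broadcast after it, which reflects the rule that monitoring requests are sent only when no L1A request is pending.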

### 15.8.3 Readout Processors

All boards in the crate providing event data contain a Readout Processor (ROP). The ROP receives L1A and monitoring requests from the RO-bus (16 bits). The bunch crossing number of a request points directly to the correct address in the Dual Port Memories. The ROP collects data from several bunch crossings around the requested one, adds format words and sends the event record via the backplane to the GTFE readout module. The size of the event records is constant as well as the time between two consecutive requests. Queuing of requests is done on the TIM module. On the FDL module a readout request collects all Algorithm and Final Decision Bits. From the PSB modules all input data from three or more bunch crossings are collected to provide a trigger history, which is particularly useful to trace pile-up conditions. A special readout request for statistics data reads all error and phase transition counters from the PSB modules and all rate counters and the dead-time counter from the FDL module.
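Since the bunch crossing number of a request points directly to a memory address, the set of addresses read for one request can be sketched as below. The sketch assumes a ring buffer indexed by bunch crossing number with a depth of one orbit and a window centred on the requested crossing; both are assumptions, as is the function name.

```python
def readout_addresses(bx, n_bx=3, depth=3564):
    """Return the DPM addresses read for a request at bunch crossing bx.

    n_bx  -- number of consecutive bunch crossings collected per request
    depth -- ring buffer depth (assumed here to cover one LHC orbit)
    """
    before = n_bx // 2  # crossings read ahead of the requested one
    return [(bx + offset) % depth for offset in range(-before, n_bx - before)]
```

For `bx = 0` and three crossings this gives addresses 3563, 0 and 1, that is, the window wraps around the end of the buffer.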

### 15.8.4 Interface to Data Acquisition

Depending on their identifier, event records are transferred either into the monitoring DPMs or into the DPM arrays of the Device Dependent Unit (DDU) logic.

#### Alignment check and resynchronisation

The DDU receives the readout request from the RO-bus as reference and compares the bunch crossing number and the least significant part of the event number to the current reference numbers for each channel. Two status bits flag a possible misalignment in order to be able to identify the bad channel later. A bad bunch crossing number is corrected automatically with the next LHC orbit except for a fatal hardware failure. In case of bad local event numbers on single boards the DDU returns an “Update Event Numbers” command to the TIM board. This command adds an identifier bit to the next event request forcing all ROPs to align their short local event number to the general event number during the next request. The general event number on the TIM board can be updated by a synchronous TTC command. The new event number will then be propagated to all other boards as described above.
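The alignment check performed per channel can be expressed compactly. The sketch below is illustrative; the function name and the width of the short local event number (`low_bits`) are assumptions, as the text does not specify how many bits are compared.

```python
def alignment_status(ref_bx, ref_event, frag_bx, frag_event_low, low_bits=8):
    """Return the two misalignment flags for one readout channel.

    ref_bx, ref_event -- reference numbers taken from the RO-bus request
    frag_bx           -- bunch crossing number in the event fragment
    frag_event_low    -- short local event number in the event fragment
    low_bits          -- assumed width of the short local event number
    """
    mask = (1 << low_bits) - 1
    bx_mismatch = frag_bx != ref_bx
    event_mismatch = frag_event_low != (ref_event & mask)
    return bx_mismatch, event_mismatch
```

In this model an event mismatch on any channel would trigger the “Update Event Numbers” command, while a bunch crossing mismatch is expected to clear at the next orbit.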

#### Merging of channels, format generation

The DDU merger fetches all event fragments from the DPM arrays, checks them, adds format words and sends the complete GT event into the event DPM used as event buffer. Each readout channel can be removed separately from the merging process. Apart from this case the size of events is constant. The merging procedure runs as fast as the transfer from all boards to the GTFE module to avoid an additional event queue.

#### Monitoring of the event buffer

The DDU logic checks the filling of the event buffer. If a warning threshold is exceeded a “Warning - GT buffer nearly full” signal is sent to the Trigger Control System as described in Chapter 16.

#### Readout of event buffer and DAQ interface

A DPM-to-RUI protocol logic executes all commands received from the RUI. It fetches events from the DPM and feeds them into the serial interface running either in an “auto-push” or “push-on-demand” mode.

#### Private monitoring buffer on the GTFE board

Data with a monitoring identifier are written without any further check into DPMs. There is one DPM for four events allocated to each RO-channel. The monitoring requests on the RO-bus are annotated and will be used by software to compose monitoring “events”. Unread events will be overwritten. The DPMs are accessed by VME.

### 15.8.5 Hardware monitoring and failures

In case of a hardware failure VME and JTAG test programs will be used to identify a faulty board. If reloading of the FPGA chips does not cure the problem the faulty board will be replaced and the FPGA chips reloaded. In order to run JTAG test programs all boards are connected to the backplane JTAG bus and can be accessed separately using a boundary scan slot number. All FPGA chips with boundary scan pins are included in one JTAG loop per board. An additional JTAG connector on all boards allows stand-alone JTAG tests. The VME-FPGA chips are loaded at power-up time from a PROM. The other FPGAs will be loaded either by VME or PROMs or JTAG. Most of the modules contain an onboard 40 MHz oscillator to run tests in stand-alone mode.

## 15.9 Simulation

A major concern of the physicists running the trigger system will be to keep the event rates under control. Rates have therefore been simulated as a function of thresholds applied to the trigger objects. The simulations have been performed with ORCA simulation software, release 4, containing a complete geometrical description of the CMS detector, version 118. The rates for the calorimeter objects have been computed using the minimum bias samples described in Chapter 3; the muon rates and the combined muon-calorimeter rates have been computed using the minimum bias samples described in Chapter 8. The Global Muon Trigger algorithm has been implemented in the simulation as described in Section 14.8.2. The calorimeter and muon rates computed at LHC low luminosity ( $L = 10^{33}\text{cm}^{-2}\text{s}^{-1}$ ) are shown in Table 15.3; the rates computed at LHC high luminosity ( $L = 10^{34}\text{cm}^{-2}\text{s}^{-1}$ ) are shown in Table 15.4. The combined muon-calorimeter cumulative rates account for the overlap of the combined channels with the single calorimeter or muon channels. The  $p_T$  thresholds for the muon objects are defined at 90% efficiency, since the Global Muon Trigger delivers the  $p_T$  of the muon candidates according to a transverse momentum scale defined at 90% efficiency (see Section 14.3.1).

As can be seen from Table 15.3 and Table 15.4 reasonable thresholds for the trigger objects can be chosen in a way that the cumulative event rates never exceed the envisaged 12.5 kHz limit for calorimeter and muon triggers both for high and low luminosities. There is flexibility to adjust the thresholds and thus the rates in case of unexpectedly high event cross-sections. The shown rates correspond to a selection of possible trigger algorithms which can be set by the Global Trigger. Other triggers can be set up, and different bandwidths can be allocated, in order to allow studies of specific physics channels.
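The difference between individual and cumulative rates can be made concrete with a small sketch: the cumulative rate after adding a trigger counts each event only once, however many triggers it fires. The function below is an illustration of this bookkeeping and is not taken from ORCA; the names and the input format are assumptions.

```python
def trigger_rates(events, triggers, sample_rate_khz):
    """Return (name, individual rate, cumulative rate) in kHz per trigger.

    events          -- one set of fired trigger names per simulated crossing
    triggers        -- trigger names in the order used for the cumulative rate
    sample_rate_khz -- rate that the whole simulated sample corresponds to
    """
    scale = sample_rate_khz / len(events)
    rows = []
    included = set()  # triggers already counted in the cumulative rate
    for name in triggers:
        included.add(name)
        individual = sum(1 for fired in events if name in fired)
        # An event firing several included triggers is counted only once.
        cumulative = sum(1 for fired in events if fired & included)
        rows.append((name, individual * scale, cumulative * scale))
    return rows
```

With this definition the last cumulative entry is the total rate, which is always at most the sum of the individual rates.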

## 15.10 Prototypes and Tests

Prototypes for the backplane, the PSB and the GTL boards are built as 6U VME boards. All other boards will be designed only in the final version, which foresees 9U high modules. The PSB-6U and GTL-6U prototype boards can be used in the final 9U VME crate together with the other 9U boards to allow continuous upgrading. Concurrently with board production, test software written in the National Instruments CVI environment is being developed.

### 15.10.1 Custom prototype backplane

The 6U VME backplane provides all slots for the Global Muon Trigger and the Global Trigger. It contains the VME signals, all clock and timing signals, the RO-bus, the JTAG bus, all point-to-point Channel Links for the readout and the connections between the PSB input boards and the logic boards of the GMT and the GT. The 6U backplane exists and is used to run the PSB modules. It is equipped with 160-pin connectors for the VME part and with AMP 2mm Z-pack connectors for all other signals.

### 15.10.2 Prototype input boards PSB

The PSB-6U prototype shown in Figure 15.14 accepts 6 input channels. The synchronisation logic is the same as on the final 9U boards. For the ring buffer DPM chips are used. A 40 MHz oscillator allows stand-alone testing. Two channels can be programmed as outputs to generate external test data for other PSB boards. A small add-on board (PSB-IN) is plugged in on the front panel of the PSB-6U board. It converts Channel Links to parallel LVTTL data. It can be replaced to test other options for input signals. Two PSB prototype boards have been built and tested.

**Table 15.3:** Calorimeter and muon L1 trigger rates for low luminosity ( $L = 10^{33} \text{cm}^{-2}\text{s}^{-1}$ ).

| Trigger Algorithm                         | Trigger $E_T$ cutoff (GeV) | 95% Eff. Threshold (GeV) | 90% Eff. Threshold (GeV) | Rate, indiv. (kHz) | Rate, cumul. (kHz) |
|-------------------------------------------|----------------------------|--------------------------|--------------------------|--------------------|--------------------|
| e                                         | 20                         | 24                       | 22                       | 5.7        |        |
| e e                                       | 10                         | 18                       | 12                       | 2.7        |        |
| $\tau$                                    | 80                         | 95                       | 85                       | 3.2        |        |
| $\tau\tau$                                | 60                         | 75                       | 65                       | 1.5        |        |
| jet                                       | 120                        | 150                      | 140                      | 1.2        |        |
| Di-jet                                    | 90                         | 115                      | 105                      | 1.0        |        |
| Tri-jet                                   | 70                         | 95                       | 85                       | 0.3        |        |
| Quadri-jet                                | 50                         | 75                       | 65                       | 0.3        |        |
| $\tau e$                                  | 65, 10                     | 80, 14                   | 70, 12                   | 3.5        |        |
| jet e                                     | 100, 10                    | 125, 14                  | 115, 12                  | 1.1        |        |
| $E_T^{\text{miss}}$                       | 100                        |                          | 275                      | 0.01       |        |
| $e E_T^{\text{miss}}$                     | 10, 50                     |                          | 12, 175                  | 0.2        |        |
| jet $E_T^{\text{miss}}$                   | 50, 50                     |                          | 65, 175                  | 0.6        |        |
| $\Sigma E_T$                              | 500                        |                          | 1000                     | 0.02       |        |
| Total Rate (kHz)                          |                            |                          |                          |            | 12.24  |
| $\mu$                                     |                            |                          | 10                       | 8.7        | -      |
| $\mu\mu$                                  |                            |                          | 3, 3                     | 1.6        | 9.8    |
| $\mu e/\gamma$                            |                            |                          | 4, 12                    | 3.1        | 11.9   |
| $\mu\tau$                                 |                            |                          | 4, 80                    | 0.42       | 12.0   |
| $\mu$ jet                                 |                            |                          | 4, 80                    | 0.96       | 12.2   |
| $\mu \Sigma E_T$                          |                            |                          | 4, 600                   | 0.48       | 12.4   |
| $\mu E_T^{\text{miss}}$                   |                            |                          | 4, 140                   | 0.72       | 12.7   |

**Table 15.4:** Calorimeter and muon L1 trigger rates for high luminosity ( $L = 10^{34} \text{cm}^{-2}\text{s}^{-1}$ ).

| Trigger Algorithm        | Trigger $E_T$ cutoff (GeV) | 95% Eff. Threshold (GeV) | 90% Eff. Threshold (GeV) | Rate, indiv. (kHz) | Rate, cumul. (kHz) |
|--------------------------|----------------------------|--------------------------|--------------------------|--------------------|--------------------|
| e                        | 30                                        | 35                             | 32                             | 7.2           |        |
| e e                      | 15                                        | 20                             | 18                             | 0.6           |        |
| $\tau$                   | 150                                       |                                |                                | 1.3           |        |
| $\tau \tau$              | 80                                        |                                |                                | 2.5           |        |
| jet                      | 250                                       | 285                            | 275                            | 0.4           |        |
| Di-jet                   | 200                                       | 225                            | 215                            | 0.4           |        |
| Tri-jet                  | 100                                       | 125                            | 115                            | 0.7           |        |
| Quadri-jet               | 80                                        | 105                            | 95                             | 0.2           |        |
| $\tau e$                 | 90, 15                                    | 125, 20                        | 115, 18                        | 1.4           |        |
| jet e                    | 150, 15                                   | 165, 20                        | 155, 18                        | 0.2           |        |
| $E_T^{\text{miss}}$      | 150                                       |                                | 350                            | 0.005         |        |
| $e E_T^{\text{miss}}$    | 15, 100                                   |                                | 18, 250                        | 0.005         |        |
| jet $E_T^{\text{miss}}$  | 80, 100                                   |                                | 95, 250                        | 0.1           |        |
| $\Sigma E_T$             | 1000                                      |                                | $\sim 1500$                    | 0.03          |        |
| non-isolated electron    | 55                                        | 60                             | 58                             | 0.7           |        |
| non-isolated di-electron | 25                                        | 30                             | 28                             | 0.2           | 12.9   |
| $\mu$                    |                                           |                                | 25                             | 8.1           | -      |
| $\mu \mu$                |                                           |                                | 8, 5                           | 2.8           | 10.4   |
| $\mu e/\gamma$           |                                           |                                | 5, 32                          | 1.6           | 11.5   |
| $\mu \tau$               |                                           |                                | 5, 140                         | 0.44          | 11.6   |
| $\mu$ jet                |                                           |                                | 5, 155                         | 0.86          | 11.9   |
| $\mu \Sigma E_T$         |                                           |                                | 5, 800                         | 0.62          | 12.3   |
| $\mu E_T^{\text{miss}}$  |                                           |                                | 5, 200                         | 0.61          | 12.5   |



**Fig. 15.14:** PSB prototype board

### 15.10.3 Prototype logic board GTL

The GTL-6U board contains the algorithm logic for a reduced number of trigger objects and Algorithm Bits. The functions are designed as for the final board.

## 15.11 Status and Schedule

The current status of the GT electronics and the past and upcoming milestones are described below. The final Global Trigger Processor including the Global Muon Trigger will be available at the end of 2004.

The first milestone was passed in June 1999 with the completion of the design of the 6U backplane prototype. The planned production of the PSB prototype was deliberately delayed to include more functionality. The synchronisation procedure was tested.

By November 1999 production of the 6U backplane was complete, and the board was used to test the PSB-6U prototype, which had also been finished.

June 2000 saw the completion of the VHDL design of the GTL-6U board.

With the publication of this Technical Design Report in November 2000 the GTL-6U trigger logic board design has been completed. The conceptual design of the FDL module directly as a 9U board is in progress. The GTL-6U board will be built and a test of the complete system comprising the backplane, the PSB and the GTL electronics will be performed to pass the milestone set for November 2001. In June 2003 the final Global Trigger for 24 input channels will be available. The completion of the final 32-channel 9U Global Trigger Processor is planned for November 2004. The schedule is shown in Figure 15.15.



**Fig. 15.15:** Global Trigger Schedule

## References

- [15.1] C.-E. Wulz, “Concept of the First Level Global Trigger for the CMS Experiment at LHC”, CMS Note 2000/052
- [15.2] A. Taurok, H. Bergauer, M. Padrta, “Implementation and Synchronisation of the First Level Global Trigger for the CMS Experiment at LHC”, CMS Note 2000/057
- [15.3] A. Taurok, “Phase Synchronisation of trigger data in the CMS Level 1 Global Trigger”, CMS Internal Note 1999/010



# 16 Trigger Control

## 16.1 System Requirements

The Trigger Control System (TCS) is logically located between the L1 Global Trigger and the CMS readout and data acquisition (Figure 16.1). The main task of the Trigger Control is to control the delivery of L1 Trigger Accepts generated by the Global Trigger, depending on the status of the readout electronics or the data acquisition. This status is derived from local state machines that emulate the occupancy of the front-end buffers, as well as from direct information transmitted back by the CMS subsystems through a Fast Monitoring network. In addition, the Trigger Control will require that the sequence of L1A signals complies with precise Trigger Rules in order to reduce the probability of buffer overflows in the readout and data acquisition chain. The Trigger Control is also responsible for generating the Bunch Crossing Zero and the Level 1 Reset commands, as well as for controlling the delivery of test and calibration triggers. The Trigger Control System uses the TTC network to distribute information to the subsystems.



**Fig. 16.1:** Overview of the Trigger Control System context.

The Trigger Control Software is the software system that allows a user to operate, read out and monitor all the Trigger System equipment. It is composed of three quasi-independent subsystems, namely the Calorimeter Trigger, the Muon Trigger and the Global Trigger Control Software. In Sections 16.6 to 16.8 we describe the basis of a common framework used by the trigger subsystems control software, referring to the relevant chapters for more detailed information.

### 16.1.1 Requirements on L1A Control

The Trigger Control is responsible for guaranteeing that the subsystems are ready to receive every L1 Accept delivered. This functionality is essential to prevent buffer overflows and missed trigger signals when the subsystems are not ready to receive them. In either case, the consequence would be a loss of event synchronization, that is, an incorrect event numbering in one subsystem or in part of it. This situation has to be avoided since it would give rise to false events, that is, events in which data from different crossings are mixed.

The Trigger Throttling Subsystem is the Trigger Control System component that takes care of this function. The L1A signals are throttled if any of the CMS subsystems is not ready to receive them.

The TCS receives warning signals from the CMS subsystems through the Fast Monitoring network indicating that some of their buffers are almost full. However, this feedback signal can take several microseconds to reach the TCS, which may meanwhile have delivered a number of L1A signals that could cause a buffer overflow. This problem is particularly acute in the front-end de-randomizers, which have a storage capacity of only a few events.

The front-end de-randomizers store a fixed number of words per L1A, and for a given sub-detector they all behave identically. Therefore, their occupancy depends only on the L1A rate and on the event write and read latencies. Provided these latencies are precisely known, a state machine receiving the L1A signals can emulate the de-randomizer behavior and determine its occupancy at each new L1A. If a new L1A is estimated to cause a de-randomizer overflow, this L1A is not delivered.

In general, it would be very difficult to guarantee that the state machine reproduces the buffer status exactly at all times. However, in the present case the L1A signals are synchronous with the clock, and the write and read latencies are measured in multiples of the clock period. It is this time quantization that makes the de-randomizer emulation feasible.
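A clocked state machine of the kind described above might look as follows. This is a minimal sketch under simplifying assumptions: events are written instantly at the L1A, one event at a time is read out, and reading an event takes a fixed number of clock periods. The class name, the depth and the readout time are hypothetical values chosen for illustration, not parameters of any CMS front-end.

```python
class DerandomizerEmulator:
    """Clock-by-clock emulation of a front-end de-randomizer buffer."""

    def __init__(self, depth=4, read_clocks=10):
        self.depth = depth              # buffer capacity in events
        self.read_clocks = read_clocks  # clocks needed to read out one event
        self.occupancy = 0
        self.countdown = 0              # clocks left for the event being read

    def clock(self, l1a_candidate=False):
        """Advance one bunch crossing; return True if the L1A is delivered."""
        # Readout side: progress the event currently being read, if any.
        if self.occupancy > 0:
            self.countdown -= 1
            if self.countdown == 0:
                self.occupancy -= 1
                if self.occupancy > 0:
                    self.countdown = self.read_clocks
        # Trigger side: veto the L1A if it would overflow the buffer.
        if not l1a_candidate or self.occupancy >= self.depth:
            return False
        self.occupancy += 1
        if self.occupancy == 1:
            self.countdown = self.read_clocks  # readout starts immediately
        return True
```

Because every quantity is an integer number of clock periods, the emulated occupancy can track the real buffers exactly, provided the latencies used in the model match those of the hardware.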

A complementary solution to the same problem is to oblige the delivery of L1A signals to comply with a set of Trigger Rules. These rules take the general form 'no more than n L1A signals in a given time interval'. Suitable rules, inducing a negligible dead time, would minimize the buffer overflow probability, eventually bringing it to zero, and would guarantee that all subsystems at all levels can handle all L1A signals delivered.
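Rules of the form 'no more than n L1A signals in a given time interval' can be checked with a sliding window over the recent L1A history. The sketch below assumes time is measured in bunch crossings by a monotonically increasing counter; the class and function names and any rule values used with it are illustrative, as the text does not specify the rules.

```python
from collections import deque


class TriggerRule:
    """At most max_l1a L1A signals in any window of window_bx crossings."""

    def __init__(self, max_l1a, window_bx):
        self.max_l1a = max_l1a
        self.window_bx = window_bx
        self.history = deque()  # crossings of recently delivered L1As

    def allows(self, bx):
        """Return True if a L1A at crossing bx would respect the rule."""
        # Forget L1As that have left the window ending at bx.
        while self.history and bx - self.history[0] >= self.window_bx:
            self.history.popleft()
        return len(self.history) < self.max_l1a

    def record(self, bx):
        """Register a delivered L1A."""
        self.history.append(bx)


def deliver(bx, rules):
    """Deliver a L1A at crossing bx only if every rule allows it."""
    if all(rule.allows(bx) for rule in rules):
        for rule in rules:
            rule.record(bx)
        return True
    return False
```

Several rules with different window lengths can be combined by passing them together to `deliver`; a candidate L1A is suppressed as soon as one of them is violated.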

The occupancy of buffers following the de-randomizers in the data acquisition chain depends on the event fragment sizes after data selection (zero suppression or selective readout). Potentially, every buffer has a different occupancy at a given time. Therefore, their occupancy cannot be emulated in a central place. The trigger rules can reduce the overflow probability but are insufficient to completely prevent overflows, since the event fragment sizes are not taken into account in the algorithm.

The strategy to avoid overflows in this case is based on the following points:

- a buffer handler monitors the buffer occupancy and detects when the buffer is filled above a certain warning level;
- at this point a warning message is sent to the TCS, which can decide to inhibit the L1A temporarily;
- meanwhile, until the L1A has been paused, the buffer handler stores just 'empty events', that is, small data blocks containing the event identification and a buffer overflow error flag;
- the storage of complete events is resumed when the buffer occupancy falls below the warning level.

The buffer safety region above the warning level can easily store a very large number of 'empty events', giving enough time for the fast monitoring feed-back loop to take place. With this scheme, the probability of overflowing the safety region due to statistical fluctuations is zero in practical terms. The synchronization of event fragments is guaranteed.
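The strategy listed above can be sketched as a small buffer handler model. It is an illustration only: the class name, the header size and the use of words as the unit of occupancy are assumptions, and the total buffer capacity is not modelled.

```python
from collections import deque

HEADER_WORDS = 2  # hypothetical size of the event identification plus flags


class BufferHandler:
    """Stores full events below the warning level, 'empty events' above it."""

    def __init__(self, warning_level_words):
        self.warning_level = warning_level_words
        self.fifo = deque()  # entries: (event_id, size in words, overflow flag)
        self.occupancy = 0

    @property
    def warning(self):
        """True while the buffer is filled above the warning level."""
        return self.occupancy > self.warning_level

    def store(self, event_id, payload_words):
        """Store one event; return the warning status to report to the TCS."""
        overflow = self.warning
        # Above the warning level only the header with an error flag is kept.
        size = HEADER_WORDS + (0 if overflow else payload_words)
        self.fifo.append((event_id, size, overflow))
        self.occupancy += size
        return self.warning

    def read_out(self):
        """Remove the oldest event; return its ID and overflow flag."""
        event_id, size, overflow = self.fifo.popleft()
        self.occupancy -= size
        return event_id, overflow
```

Every L1A still produces an entry, so the event sequence stays complete and synchronization is preserved even while payloads are being dropped.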

On the other hand, because to first approximation the buffer occupancies are independent, the buffer size below the warning level has to be such that the probability of reaching it, multiplied by the number of buffers, is negligible and does not contribute significantly to the data acquisition inefficiency. These probabilities are estimated with system simulation tools.

### 16.1.2 Requirements on Fast Controls

In addition to the L1A signal, the TCS must be able to send to the subsystems a number of other fast control signals for synchronization, fast reset, calibration or test purposes. The distribution of fast control signals is organized by sub-detector partition in order to allow independent operation of the partitions in test or calibration mode.

The Trigger Control should send to the CMS subsystems two commands synchronous with the LHC orbit signal. The Bunch Crossing 0 command (BC0) is used by the trigger subsystems to flag the bunch zero data flowing in the trigger system (see Chapter 17). The Bunch Counter Reset (BCR) is used to reset the Bunch Counters in the front-end readout (see Section 16.4.1). The BCR command differs from the BC0 only by a constant phase and is issued for the convenience of the subsystems.

The Trigger Control should be able to send to the CMS subsystems Reset commands, aimed at re-synchronizing all CMS subsystem components to the same event ID or to recover from some hardware faults. Two distinct Reset commands are foreseen corresponding to two different levels of reset operations. These commands can be issued on a periodic basis, independently of the status of the readout electronics, or in response to a loss of synchronization identified in some subsystem.

The Trigger Control, upon reception of run control commands issued by the Run Control/Detector Control System (DCS), should be able to send Start/Stop Trigger commands through the fast control network (TTC network). These commands are recognized by the Trigger Primitives Generators which start/stop sending trigger data synchronously at the next LHC orbit.

The Trigger Control will be able to reserve pre-defined orbit gaps or longer periods for private use by the Subsystems. Fast commands are sent to the subsystems marking the periods of time the subsystems can use for dedicated activities (test, monitoring or calibration activities). No other command than the BC0 command, in particular no test/calibration triggers, will be issued by the TCS during these special periods.

### 16.1.3 Requirements on Fast Monitoring

The CMS Sub-detectors should send to the Trigger Control, using a Fast Monitoring network, the following feedback status: Ready, Busy, Warning Overflow, Out of Sync and Error. The Trigger Control should be able to receive this status information per sub-detector partition.

The Ready status is asserted continuously to indicate that the system is connected and working. The Ready status has to be different from the signal received when cables are unconnected or the electronics are switched off.

The subsystems should have programmable logic to determine when to issue the signals (e.g. more than  $n$  modules in error state implies that the Error signal is sent to TCS). It is the responsibility of the CMS Subsystems to merge the feedback signals from individual modules into a partition status to be transmitted to the TCS.

The subdetectors should send the fast monitoring messages as fast as possible. The collection of all signals and the decision to send a message to the TCS should be done by hardware without any software intervention.
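The merging of module states into a partition status is required to be done in hardware; the following behavioural model only illustrates the kind of programmable threshold logic involved. The priority order among the states and the default threshold of zero are assumptions made for this sketch, as are the function and parameter names.

```python
def partition_status(module_states, thresholds):
    """Merge per-module states into one partition status.

    module_states -- list of state names, one per module
    thresholds    -- dict mapping a state to n: the state is reported when
                     more than n modules are in it (default n = 0)
    """
    # Assumed priority order; the most severe condition wins.
    priority = ["Error", "Out of Sync", "Busy", "Warning Overflow"]
    for state in priority:
        if module_states.count(state) > thresholds.get(state, 0):
            return state
    return "Ready"
```

With `thresholds = {"Error": 1}`, for instance, a single module in error state does not yet cause the Error signal to be sent to the TCS, whereas two such modules do.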

### 16.1.4 Requirements on Calibration and Test Triggers

The list of calibration and test runs foreseen by the CMS sub-detectors is shown in Table 16.1. Calibration and test triggers in CMS can be delivered in several different contexts:

1. Sub-detectors in standalone mode: some detectors are able to generate test and calibration sequences using their own resources and capturing the data with the sub-detector local DAQ.
2. Sub-detectors in DAQ partition mode: the TCS generates test and calibration triggers at the rate required by the sub-detector partition and the data is collected by the central DAQ. In this mode the sub-detectors have access to a fraction (partition) of the CMS on-line computing and storage resources.
3. Periodic test and calibration triggers issued by TCS during a Physics Run: the triggers are issued centrally and all subsystems deliver an event data block in order to keep the event synchronization (the event block can be empty if the sub-detector doesn't require test data).
4. Local test and calibration triggers issued by the subsystems during a Physics Run: the subsystems perform test, calibration or monitoring activities during Private Gaps or Private Orbits defined by the TCS. The generation of the tests and the data acquisition is the responsibility of the sub-detector local DAQ.

The Trigger Control System will have the ability to deliver calibration and test triggers, with the frequency required by the sub-detectors. However the fraction of the data acquisition bandwidth used by calibration/test data will be restricted to a few percent.

Test and calibration triggers issued centrally by TCS should follow a well defined protocol. With a fixed delay before the test/calibration trigger, the TCS will issue a command that is used by the Subsystems to prepare the test (e.g. laser pulse generation). It is the responsibility of the Subsystems to synchronize the test/calibration pulses with the test trigger.

**Table 16.1:** List of sub-detector calibration runs.

| SUB-DETECTOR            | RUN                        | PARAMETERS                                                 |
|-------------------------|----------------------------|------------------------------------------------------------|
| <b>Pixels</b>           |                            |                                                            |
|                         | Calibration run            | Thresholds, Pedestals, Gain (charge/count)                 |
| <b>Silicon tracker</b>  |                            |                                                            |
|                         | Pedestal run               | APV pedestals and noise.                                   |
|                         | Test pulse run             | Pulse shape                                                |
|                         | Synchronization run        | Timing                                                     |
|                         | Optical link special run   | Laser bias point, Gain, Receiver dc offset, Link noise     |
|                         | Linearity run              | Linearity of readout chain                                 |
|                         | Alignment run              | Alignment constants                                        |
|                         | MIP calibration            | Energy and inter-strip calibration                         |
| <b>Preshower</b>        |                            |                                                            |
|                         | Pedestal run               | Pedestals, noise                                           |
|                         | Test pulse run             | Linearity, gain, cross talk                                |
|                         | Synchronization run        | Clock phase, pulse shape measurement                       |
|                         | MIP calibration            | Energy and inter-strip calibration                         |
|                         | Calibration with electrons | Cross calibration with ECAL                                |
| <b>ECAL</b>             |                            |                                                            |
|                         | Calibration data           | Energy calibration constants                               |
|                         | Monitoring run             | Pedestals, short term calibration                          |
|                         | Slow control run           | Temperature, leakage current                               |
|                         | Test pattern run           | Delays in electronics                                      |
|                         | Laser run                  | VFE and fibers latency                                     |
|                         | Synchronization run        | Clock phases, BC0 phase, L1A deskewing                     |
| <b>HCAL</b>             |                            |                                                            |
|                         | Radioactive source         | Ageing                                                     |
|                         | Laser calibration          | Gains, ageing, timing                                      |
|                         | LED calibration            | Trouble shooting                                           |
|                         | Charge injection           | Trouble shooting                                           |
| <b>Drift Tubes</b>      |                            |                                                            |
|                         | Threshold run              | Thresholds                                                 |
|                         | Relative $t_0$ calibration | Relative $t_0$                                             |
|                         | Rates                      | Rates per wire                                             |
|                         | Alignment run              | Alignment constants                                        |
|                         | Gap tests                  | Misalignment, dead channels                                |
|                         | Absolute synchronization   | BC0 phase                                                  |
| <b>CSC</b>              |                            |                                                            |
| Cathode DAQ electronics | Test Pulse Run             | Gain, cross talk, linearity                                |
|                         | Pedestals                  | Capacitor constants and noise                              |
| Trigger electronics     | Trigger thresholds         | Threshold                                                  |
|                         | Trigger patterns           | Patterns                                                   |
| Anode DAQ Electronics   | Test pulse run             | Threshold and noise in fC, gain in mV/fC, and offset in mV |
|                         | Synchronization run        | Time delay                                                 |
|                         | Alignment run              | Alignment constants                                        |
| <b>RPC</b>              |                            |                                                            |
|                         | Test patterns runs         | Thresholds                                                 |
|                         | Synchronization runs       | Timing windows                                             |

### 16.1.5 Requirements on Subsystems Reset

The subsystem electronics have to be initialized at each data taking run, according to the chosen run configuration. This activity is initiated by the Run Control and uses the DCS network and services. It implies access to the configuration and parameter data bases when the current defaults need to be superseded, the full reset of the electronics modules, the download of parameters into programmable registers and memories, and the verification (read back) of the parameters. In some circumstances, the re-configuration of FPGAs could be necessary. In case of software problems, the crate controllers may need to re-boot. A hierarchy of start run procedures may be defined to reduce the data acquisition inefficiency to a minimum. Nevertheless, these procedures imply human intervention and are normally relatively long.

We can anticipate a number of circumstances during data taking that will require a reset in order to recover normal functioning, in particular synchronization errors due to single event upsets in the front-end pipelines, to buffer overflows or to errors in signal transmission. The standard recovery procedure, involving a new run start, may translate into an unacceptably high data acquisition inefficiency.

For this reason we have foreseen the possibility that TCS distributes reset commands through the fast control network to the subsystems. The reset command defines a time interval without triggers that can be used by the subsystems to partially reset their electronics.

The subsystem electronics may foresee different reset levels, for example:

1. Reset of event and bunch counters and reset of readout memories and pointers.
2. Reset of readout state machines.
3. Reset of TTC components.
4. Reset of configuration registers.
5. Full reset (equivalent to power-on reset).

Two fast reset commands are foreseen. The Level 1 Reset command should be interpreted as a re-synchronization of all subsystems to the same event identifier and concerns just the first item in the list. The Hard Reset is intended to recover from electronics malfunctions. Its range of action depends on the particular module that receives it, but it could at least involve the reset of the readout state machines. However, resets that imply a reloading of parameters must be issued under software control.
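The relation between the two fast reset commands and the reset levels listed above can be sketched as follows. This is a minimal illustration: the function `fast_reset_levels`, the per-module parameter `hard_reset_levels` and the choice of levels 4 and 5 as those that imply a reloading of parameters are assumptions of the sketch.

```python
RESET_LEVELS = {
    1: "event and bunch counters, readout memories and pointers",
    2: "readout state machines",
    3: "TTC components",
    4: "configuration registers",
    5: "full reset (equivalent to power-on reset)",
}

# Assumption: these levels require parameters to be reloaded afterwards,
# so they may only be issued under software control.
SOFTWARE_ONLY_LEVELS = {4, 5}


def fast_reset_levels(command, hard_reset_levels=(2,)):
    """Return the reset levels a module applies on a fast reset command.

    `hard_reset_levels` is a hypothetical per-module setting, since the
    range of action of the Hard Reset depends on the receiving module.
    """
    if command == "L1 Reset":
        return [1]
    if command == "Hard Reset":
        levels = sorted(set(hard_reset_levels))
        if set(levels) - set(RESET_LEVELS):
            raise ValueError("unknown reset level")
        if SOFTWARE_ONLY_LEVELS.intersection(levels):
            raise ValueError("levels 4 and 5 need software control")
        return levels
    raise ValueError(f"unknown fast reset command: {command}")
```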

### 16.1.6 Requirements on Partitioning

It is a general CMS requirement that sub-detectors and sub-detector main components should have the possibility of operating in stand-alone mode, as independently as possible, during setting-up, test or calibration phases. These components are installed and tested by different teams that need the largest possible autonomy to start and stop runs, reset the electronics, change high voltages, change programmable timings or trigger conditions, etc.

It is clear that sub-detector commissioning will need a non-negligible beam time during which sub-detectors may need different and mutually incompatible run conditions. Test and maintenance of some components may need to be done in parallel with stable physics running on the rest of the detector. In practice, this signifies that the Trigger and Data Acquisition systems should allow flexible configuration of independent partitions.

In this document we are only concerned with the L1 Trigger. The partitioning of the L1 Trigger means that each sub-detector partition will be able to define its own trigger conditions and will receive the L1 Accepts that correspond to those conditions. It also means that the Trigger Control System is able to control the L1A delivered to independent partitions, as well as the other fast commands, and is able to handle the fast feedback received from independent partitions.

Partitioning requires that the TTC distribution network is organized in branches with intelligence at the top of each partition. This would allow the sub-detectors to define their own fast commands specific to operation in stand-alone mode. The sub-detector TTC partitions have to be integrated in a single network in normal experiment operation.

### 16.1.7 Requirements on Trigger Control Software

The Trigger Control Software is a distributed software system running in a large number of crate controller CPUs as well as in dedicated back-end computers. The system provides a number of services including trigger configuration and initialization, distribution of run control commands, and production of status, monitoring and error reports. The Trigger Control Software allows a simple integration with the central RCS/DCS through well defined software interfaces.

The Trigger Control Software should be based on a modular distributed control system in order to respond to the needs of various users working in parallel with different parts of the hardware during the setting-up and test periods. The users should be able to interact with the system in stand-alone mode through standard DCS graphical user interfaces. During normal running, it should be possible to isolate faulty hardware components and operate them in test mode without disturbing the normal operation of the rest of the trigger system. Modularity is also the key to a scalable system able to cope with unavoidable upgrades of the trigger hardware.

The Trigger Control Software should have access to data samples acquired through the spying channel in the trigger readout hardware. Monitoring tasks running locally in the crate controllers shall use these data to survey the operation of the trigger hardware, producing monitoring reports accessible to the central run monitoring. The software on the crate controllers shall be able also to monitor hardware counters and status registers, or respond to interrupts issued by the trigger hardware, providing status reports and the appropriate feed-back in case of faulty hardware.

Run conditions, configuration data, hardware settings, logging information and reference monitoring data should be organized in a data base management system, accessible to all processors in the distributed system. Attention should be given to performance issues, in particular to the response time of the data servers in case of simultaneous access at the beginning of a run.

In summary, the Trigger Control Software should:

- a. support user control of a multi partition environment;
- b. support a multi user monitoring environment;
- c. offer tools to operate and protect the subsystem from faulty equipment;
- d. support the testing and simulation of equipment with respect to its integration within the subsystem;
- e. provide continuous monitoring of its operation so that malfunctions may be readily identified and corrective measures taken;
- f. allow the collaboration with other CMS data acquisition subsystems;
- g. organize all relevant data in a data base management system;
- h. allow the integration with already existing commercial tools.

## 16.2 Overview of the Trigger Control System

### 16.2.1 Architecture

For the sake of simplicity, we will refer to the collection of tasks performed by the TCS as a separate system, although it will be physically integrated in the Global Level 1 Trigger as described later. The TCS uses the output from the Final Decision Logic of the Global Trigger, along with machine information, plus operator control through the Run Control to generate the root Control information to be transmitted via the TTC system (see Figure 16.2).

The TCS needs an interface to the LHC RF clock and orbit signal and a Beam Pickup Interface to communicate with the beam pickup electronics. The interface to the LHC clock and orbit is physically located in the central TTC machine interface crate (TTCmi), where the fan-out of clock and orbit signal takes place.

The TCS connects to the Final Decision Logic (FDL) in the Global Trigger. It generates the contents of the control data to be transmitted by the TTC system to the sub-detectors, or directly to the DAQ Event Manager. The system broadcasts the control data to the sub-detector TTC crates, containing a TTCvi and a transmitter per sub-detector partition, and a CPU. The CPU receives DAQ operator control and initializes the TTCvi, which then can receive fast commands via its front panel signal inputs that trigger preprogrammed fast commands to be transmitted on the TTC distribution system.

The TCS includes the Trigger Throttle Subsystem (TTS) which generates information on the front-end electronics readout buffer status, and decides accordingly whether or not to deliver L1A signals to the TTC system and to the DAQ. The TCS receives hardware signals giving the status of the subsystems from the sub-detector partitions, through the Fast Monitoring network, and directly from the DAQ Event Manager. It also receives software instructions from the Run Control program. The function of all inputs can be programmed, and logical relations between them can be defined to create special conditions that accept, inhibit or even force L1A pulses.

The TCS contains a Deadtime Monitor that tracks the disposition of each crossing, i.e. whether accepted, rejected, or lost due to downtime of trigger and/or DAQ.
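The bookkeeping performed by the Deadtime Monitor can be pictured with the following sketch. The class and method names are hypothetical, and it assumes that any crossing during which L1As are inhibited is counted as lost, whatever the Global Trigger decision was.

```python
from collections import Counter


class DeadtimeMonitor:
    """Count the disposition of each bunch crossing (illustrative model)."""

    def __init__(self):
        self.counts = Counter()

    def record(self, gt_accept, l1a_allowed):
        """Classify one crossing as accepted, rejected or lost."""
        if not l1a_allowed:
            # Trigger or DAQ was not able to take the crossing.
            self.counts["lost"] += 1
        elif gt_accept:
            self.counts["accepted"] += 1
        else:
            self.counts["rejected"] += 1

    def dead_fraction(self):
        """Fraction of recorded crossings lost to downtime."""
        total = sum(self.counts.values())
        return self.counts["lost"] / total if total else 0.0
```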

The TCS also sends calibration and test triggers. First the system inhibits other triggers and then it sends a pre-trigger command via the TTC system to the detector readout. After a specified interval it sends the L1A to read the calibration event. As long as dead time does not increase seriously, calibration events can be inserted during the physics run and are handled like normal triggers.



**Fig. 16.2:** Architecture of the Trigger Control System

### 16.2.2 Interfaces

The Trigger Control System has the following interfaces to external subsystems:

1. Interface to the Global Trigger, from where it receives the L1A information.
2. Interface to LHC machine Clock and Orbit signals and Beam Pickup interface.
3. Interface to the TTC network, to distribute L1A and Fast Commands to the Sub-detector partitions.
4. Interface to the Fast Monitoring network to collect information on the status of the front-end electronics.
5. Interface to the DAQ Event Manager, to transmit L1A and Fast Commands and to receive feedback status information.
6. Interface to Run Control/Detector Control Systems.

### 16.2.3 Components

The main components and respective functions of the Trigger Control system are the following:

1. The Fast Control Generator, responsible for generating the fast commands to be distributed to the DAQ Event Manager and to the sub-detectors by the TTC network.
2. The Fast Monitoring Receiver, responsible for collecting the fast monitoring feedback from the subsystems and requesting the appropriate action when necessary.
3. The Trigger Throttling System, responsible for controlling the delivery of the L1A based on the verification of Trigger Rules and on the emulation of the front-end buffers.
4. The Calibration and Test Control, responsible for generating the calibration and test trigger sequences.

## 16.3 Trigger Control Interfaces

### 16.3.1 Interface to Global Trigger

After the Global Trigger processing, a decision is made to issue a Level 1 Accept or Reject for each bunch crossing. The Global Trigger has the ability to define 128 programmable Trigger Conditions that are based on the trigger objects (electrons/photons, jets, muons,  $E_T$  sums) computed by the trigger system sub-components (see Chapter 15). It performs 8 final ORs of selected sub-sets of the trigger conditions.

The TCS receives from the Global Trigger the results of the 8 final ORs, i.e. the 8 L1As to be distributed to the various partitions (see Section 16.5). It also receives the 128 trigger condition bits to be sent to the DAQ Event Manager.
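The formation of the 8 final ORs from the 128 trigger condition bits can be illustrated with a short sketch. The representation of the conditions as a single integer bit field and the function name `final_ors` are assumptions made for the example; the actual logic is implemented in the Global Trigger hardware.

```python
N_CONDITIONS = 128
N_FINAL_ORS = 8


def final_ors(condition_bits, masks):
    """Return the 8 final OR results for one bunch crossing.

    condition_bits: integer whose bit i is set when trigger condition i fired.
    masks: 8 integers, each selecting the sub-set of conditions that
           contributes to one final OR.
    """
    if len(masks) != N_FINAL_ORS:
        raise ValueError("exactly 8 masks are required")
    if not 0 <= condition_bits < (1 << N_CONDITIONS):
        raise ValueError("condition_bits must fit in 128 bits")
    return [bool(condition_bits & mask) for mask in masks]
```

For example, with each mask selecting a single condition, `final_ors(0b101, [1 << i for i in range(8)])` sets the first and third final ORs only.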

### 16.3.2 LHC Clock Interface

The LHC Clock and Orbit signals are distributed from the Prevessin Control Room to the LHC experiments through single-mode optical fibers. At the experiment Counting Room, the Clock and Orbit signals are recovered by circuitry (LHCrx module) in the TTC Machine Interface (TTCmi) crate. At this level the clock jitter is expected to be of the order of 10 ps rms. Fanout modules (TTCcf) in the TTCmi crate are used to distribute the Clock and Orbit signals to nearby crates housing the Trigger Control System and the sub-detector TTCvi/TTCex modules (top elements of each sub-detector partition TTC network).

### 16.3.3 Interface to TTC System

The Trigger Timing and Control (TTC) system provides for distribution of timing, trigger and fast control signals using two time division multiplexed channels transmitted over an optical passive distribution network (Fig. 16.3). The channel A is used to transmit the L1A signal whereas channel B transmits other fast commands. Both channels have a bandwidth of 40 Mbit/s.

#### TTC VME Interface and TTC Transmitter

The top elements of each sub-detector TTC partition are the TTC VME Interface (TTCvi) and TTC Encoder and Transmitter (TTCex) modules. The TTCvi receives the L1A and generates various types of programmable commands. The commands are synchronous with the Orbit signal (used in standalone mode) or triggered by front panel signals received from the TCS. The front-panel signals (B-Go Signals) are associated with FIFOs where the TTC commands are programmed. Each FIFO can be operated in repetitive mode. Asynchronous commands initiated by VME control are also possible. The data consists of 8 bits of broadcast commands and 32 bits of individually addressed commands. All of the data has Hamming Error detection and correction.



**Fig. 16.3:** Architecture of the TTC System and interface to TCS.

The TTCex module encodes both channels A and B from the TTCvi and has a transmitter laser with sufficient power to drive the optical splitters. The clock from the TTCmi crate is plugged directly into the TTCex modules. At reception, Optical Receivers with a pin diode and an amplifier generate an electrical signal at the TTCrx chip input.

#### TTC Receiver chip

The TTC signals are received by the TTC Receiver (TTCrx) chip (Fig. 16.4) which provides as its output the 40 MHz LHC clock, both raw and deskewed, the Level 1 Accept (L1A) trigger and command data. Deskewing is provided for the clock, the L1A and the broadcast commands. The coarse deskew is provided in 16 steps of 25 ns each and the fine deskew is provided in 240 steps of 104 ps each. In tests, the TTCrx shows a clock jitter of less than 100 ps. The TTCrx was revised into a Radiation Hard version, which has a hardwired ID at startup, Boundary Scan, an I<sup>2</sup>C interface with all registers read/write, and 64 possible user broadcast commands. The new TTCrx has its latency reduced from 100 ns to 70 ns.
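The deskew ranges quoted above are consistent: 240 fine steps of 104 ps cover 24.96 ns, i.e. one 25 ns coarse step. The sketch below decomposes a requested delay into coarse and fine settings. It assumes that settings are indexed from zero and that the two delays simply add; the function name and return format are illustrative and do not describe the TTCrx register interface.

```python
COARSE_STEP_PS = 25_000  # 25 ns per coarse step
COARSE_STEPS = 16
FINE_STEP_PS = 104       # 104 ps per fine step
FINE_STEPS = 240


def deskew_settings(delay_ps):
    """Return (coarse, fine, residual_ps) approximating `delay_ps`.

    residual_ps is the requested delay minus the realized delay.
    """
    max_delay = (COARSE_STEPS - 1) * COARSE_STEP_PS + (FINE_STEPS - 1) * FINE_STEP_PS
    if not 0 <= delay_ps <= max_delay:
        raise ValueError("delay outside the deskew range")
    coarse = min(delay_ps // COARSE_STEP_PS, COARSE_STEPS - 1)
    remainder = delay_ps - coarse * COARSE_STEP_PS
    # Round to the nearest fine step using integer arithmetic.
    fine = min((remainder + FINE_STEP_PS // 2) // FINE_STEP_PS, FINE_STEPS - 1)
    realized = coarse * COARSE_STEP_PS + fine * FINE_STEP_PS
    return coarse, fine, delay_ps - realized
```

A requested delay of 26.04 ns, for instance, maps onto one coarse step and ten fine steps with no residual.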



**Fig. 16.4:** TTC receiver chip (TTCrx)

#### Interface to TTC Partitions

The TCS sends the L1A to each subsystem TTCvi module. The TCS delivers independently programmable trigger signals in order to allow independent operation of the CMS partitions during the setting-up phases. In normal operation a single trigger is distributed to all partitions.

The TCS sends signals to each TTCvi that will trigger the transmission of the commands stored in the TTCvi FIFOs through the TTC channel B (B-Go signals). The B-Go signals will be used for BC0, event counters reset, fast resets, test enable and trigger start/stop commands. The present version of the TTCvi module allows for four B-Go inputs on the front panel. The specifications of the CMS requirements on the TTCvi module are presently being reviewed.

### 16.3.4 Interface to Fast Monitoring Network

The Sub-detector partitions send to the TCS information on their status. This status is characterized by digital signals, Ready, Busy, Error, Warning Overflow and Out of Sync. When active the status signals have the following meaning:

- a) Ready: the partition is ready to receive triggers. TCS action: allow L1As;
- b) Busy: the partition is temporarily busy preparing itself to take data and can't yet receive triggers. TCS action: inhibit L1As;
- c) Warning Overflow: the partition buffers are close to overflow. TCS action: inhibit L1A until warning message disappears;
- d) Out of Sync: event fragments collected in the partition do not correspond to the same front-end pipeline position or have different Event IDs. TCS action: send L1 Reset;
- e) Error: the system is in error state and needs a reset. TCS action: send Hard Reset.
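The status list above amounts to a small lookup from status signal to TCS action. A minimal sketch of how the statuses of several partitions could be combined in global mode follows. The partition names are hypothetical, and the sketch assumes that L1As are inhibited whenever any partition is not Ready, including while a reset is pending.

```python
TCS_ACTION = {
    "Ready": "allow L1As",
    "Busy": "inhibit L1As",
    "Warning Overflow": "inhibit L1As",
    "Out of Sync": "send L1 Reset",
    "Error": "send Hard Reset",
}


def tcs_decision(partition_statuses):
    """Return (l1a_allowed, resets) for a dict of partition -> status."""
    l1a_allowed = True
    resets = set()
    for status in partition_statuses.values():
        if status not in TCS_ACTION:
            raise ValueError(f"unknown status: {status}")
        if status != "Ready":
            l1a_allowed = False
        if status == "Out of Sync":
            resets.add("L1 Reset")
        elif status == "Error":
            resets.add("Hard Reset")
    return l1a_allowed, sorted(resets)
```

For example, `tcs_decision({"partition_a": "Ready", "partition_b": "Warning Overflow"})` inhibits the L1A without requesting any reset.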

The Fast Monitoring Network will transport and combine status signals of individual components that imply a reaction from TCS within the time scale of the L1A rate. This is the case for the Ready, Busy and Warning Overflow signals. The partition status signals are updated at 40 MHz using hardware state machines to deduce the partition status from the state of individual components. The fast monitoring signals are sent to the TCS in parallel LVDS. The latency of the feed-back signals should be constant. The signals that initiate reset procedures can be collected on longer time scales, allowing if necessary for more sophisticated processor-based algorithms to decide when and which resets to apply.

The collection of status information from the thousands of front-end components in a central place is too difficult and not necessary. For small sections of the readout and trigger, it is sufficient to take local steps to either send an error header or zero the defective data and log the problem for subsequent readout by the DAQ. The small subset of the subsystem involved would then remain in an error state until a local CPU took action (if this is possible) or a Reset command was issued to the subsystem. The critical points in this situation are to log the error and flag or zero the defective data, rather than to use the fast control network to initiate an immediate reset.

The granularity at which the TCS receives Fast Monitoring feedback is the sub-detector partition (see Section 16.5.2). The feedback signals (Ready, Busy, Error, Warning Overflow, Out of Sync) indicate whether the partition is affected by some problem (error, buffer overflow, loss of synchronization). It is the responsibility of the subsystems to establish under which conditions the feedback signals are raised. When running in global mode, the TCS only takes global actions (e.g. L1A throttling, global L1 Reset); it doesn't perform actions addressed to a particular component. When a partition is running in stand-alone mode some commands can be issued to that partition only (see Section 16.5).

The Fast Monitoring Network implementation follows two different models. The first is a tree-like structure with point-to-point connections and hardware state machines in the nodes combining the status of individual components. This model is used by the sub-detector partitions since the latency of the feedback signals can be short and constant as required. The second is a bus-like, message passing channel (e.g. fast ethernet) supported by software. This structure is more appropriate for the collection of feedback from the DAQ units or in the cases where the timing constraints are not severe.

### 16.3.5 Interface to DAQ Event Manager

The interface to the DAQ Event Manager follows a model similar to that of the interface to the Sub-detector partitions. The TCS sends to the Event Manager all 128 Algorithm bits and for each Partition a 4 bit Trigger Type indicating the partitions that have received the L1A signal. The partition trigger type tells the Event Manager how to read the next event. For a physics trigger the procedure is to read the data and send them to the Higher Level Trigger processors. Some physics triggers (e.g. low threshold single electrons) are used for detector calibration and are sent to a special data stream. For calibration triggers, the trigger type tells which detectors are to be read and where to store the calibration events.

The TCS also sends to the Event Manager the command L1 Reset, which is used to reset the DAQ buffers and the event ID. In the context of the Event Manager the commands Test Enable and Start/Stop Trigger have no meaning. The Event Manager sends back to the TCS the same status signals as the sub-detector partitions. The Busy signal is interpreted by the TCS as a trigger inhibit. The trigger pipeline logic may run but does not send any L1A until the Event Manager releases the Busy signal to allow broadcasting of L1A signals. During the run the Event Manager may send Warning Overflow signals to the TCS to inhibit the L1A temporarily and to allow the DAQ system to recover. Dead time counters in the TCS are incremented until the end of the inhibit signal. The Event Manager may also send the Out of Sync signal to the TCS to force a L1 Reset command.

The Event Manager will be located in the surface Control Room, whereas the Trigger Control System will be placed in the underground Counting Room. For this reason, we will use optical links for the communication between the TCS and the EM.

## 16.4 Trigger Control System Components

### 16.4.1 Fast Control Generator

#### Fast Control Signals

Table 16.2 summarizes the characteristics of the fast control signals distributed by the TCS. This table does not include test and calibration signals described in Section 16.4.5.

#### Front-End Event Identifier

The Sub-detector event fragments delivered to the Data Acquisition are identified with a standard Front-End Event Identifier. This identifier contains the Event Number and the Bunch Number.

The Event Number counts the number of L1As received since the last Event Counter Reset and the Bunch Number counts the number of clocks since the last Bunch Counter Reset. Both counters are incremented locally in the front-end TTCrx chips.

The Event Counter has 24 bits and completes a new cycle every 168 seconds (at 100 kHz L1A rate). This time interval is larger than the readout latency of the entire trigger and data acquisition chain. The Bunch Counter with 12 bits counts the periods within one LHC orbit (3564 periods). This counter is reset by the Bunch Counter Reset (BCR) distributed once per orbit.
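The counter sizes quoted above can be checked with a few lines of arithmetic: a 24-bit counter wraps after $2^{24}$ = 16 777 216 accepts, i.e. after about 168 s at 100 kHz, and 12 bits (4096 values) are enough to number the 3564 periods of one orbit. The sketch below also models the two counters. The class `FrontEndCounters` and its method names are hypothetical, and counting from zero is an assumption.

```python
EVENT_COUNTER_BITS = 24
BUNCH_COUNTER_BITS = 12
BUNCHES_PER_ORBIT = 3564
L1A_RATE_HZ = 100_000


def event_counter_period_s(bits=EVENT_COUNTER_BITS, rate_hz=L1A_RATE_HZ):
    """Time for the event counter to complete one full cycle."""
    return (1 << bits) / rate_hz


class FrontEndCounters:
    """Illustrative model of the Event and Bunch counters."""

    def __init__(self):
        self.event_number = 0
        self.bunch_number = 0

    def clock(self):
        """One 40 MHz clock period."""
        self.bunch_number = (self.bunch_number + 1) % (1 << BUNCH_COUNTER_BITS)

    def l1a(self):
        """One Level 1 Accept."""
        self.event_number = (self.event_number + 1) % (1 << EVENT_COUNTER_BITS)

    def bcr(self):
        """Bunch Counter Reset, distributed once per orbit."""
        self.bunch_number = 0

    def ecr(self):
        """Event Counter Reset."""
        self.event_number = 0
```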

**Table 16.2:** Fast Control Signals distributed by TCS.

| Fast Control Signals      | TTC command type                 | Comments                                                                                                                     |
|---------------------------|----------------------------------|------------------------------------------------------------------------------------------------------------------------------|
| L1 Accept (L1A)           | Channel A                        | High priority signal transmitted by dedicated channel (Channel A)                                                            |
| Trigger Type              | Broadcast Word                   | Word specifying the trigger type. Optionally, it is broadcast over the TTC Channel B with 1 $\mu$s latency relative to L1A.  |
| Bunch Crossing Zero (BC0) | Broadcast Synchronous with Orbit | Command synchronous with the LHC Orbit.                                                                                      |
| Bunch Counter Reset (BCR) | Broadcast Synchronous with Orbit | Special TTC command that clears the TTCrx internal Bunch Counter.                                                            |
| L1 Reset                  | Broadcast                        | Command that initiates a reset of the L1 readout buffers.                                                                    |
| Hard Reset                | Broadcast                        | Command used for a partial reset of the front-end electronics                                                                |
| Event Counter Reset (ECR) | Broadcast                        | Special TTC command that clears the TTCrx internal Event Counter.                                                            |
| Event Number              | Broadcast Long Word              | 24-bit word Event Number. Optionally, it is broadcast over the TTC Channel B.                                                |
| Start/Stop Trigger        | Broadcast                        | Broadcast command used to start/stop synchronously the trigger system components.                                            |

The events will also be identified by the Event Time, which gives the absolute time of the L1A and is used to correlate the event data with the slow control data. The Event Time is measured in the TCS and in the sub-detector partitions either using Orbit Number Counters or the time of the day derived from GPS devices.

#### L1 Accept Sequence

A Level 1 Accept is issued to the subsystems by the TCS normally in response to input from the GT Final Decision Logic or due to an operator request. The Level 1 Accept as transmitted by the TTC system is always associated with a particular bunch crossing in the detector, even for those triggers such as cosmic ray triggers which are not directly associated with a crossing.

Level 1 Accepts are transmitted to all front end and trigger subsystem crates in the detector, as well as to the DAQ Event Manager. They are used by front-end subsystems to capture into buffer storage the data associated with particular crossings and to hold that data for readout by the DAQ system. They are used by trigger subsystems to capture the raw trigger data coming up from the front-end crates, and to similarly hold it for DAQ readout.

The Level 1 Accept sequence consists of the following stages. An interaction of interest occurs in the detector. Front-end readout subsystems capture the event and hold it in analog or digital buffer storage pending transmission of a Level 1 Accept. The Level 1 Trigger processes the data in a 25 ns pipeline over a number of crossing clock cycles. This includes feed-forward interconnections between multiple layers of crates. The last level of decision processing is performed by the Global Trigger Final Decision Logic. At this point, the accept/reject decision is made for the interaction. The decision to issue a Level 1 Accept for the crossing is shipped to the Trigger Control logic, where it is reconciled with any pending status conditions in the detector, such as a Busy or Level 1 Reset-in-progress. If the detector is able to trigger, then the Level 1 Accept is transmitted via the TTC system to all readout and trigger systems.

### Bunch Crossing Zero Sequence

Synchronously with the LHC Orbit signal, the commands BC0 and BCR are sent to the front-ends. These commands are generated by each partition TTCvi upon reception of the LHC orbit signal. The TCS and all TTCvi in the experiment receive the LHC Orbit signal (see Section 16.3.2).

The phase of the BC0 and BCR commands relative to the Orbit signal is globally adjusted per sub-detector partition with a programmable delay in the TTCvi, and locally adjusted at the level of the TTCrx receiver in a smaller range (16 clock periods). It is the responsibility of the sub-detectors to adjust the timing of these commands.

The BC0 command is used by the trigger system to synchronize the trigger data in the trigger processing pipeline (see Chapter 17). It can also be used by the sub-detectors to localize the orbit gap when particular commands are needed in that period (e.g. sub-detector specific reset or test procedures).

The BCR command is a special TTC command used to reset the Bunch Counter in the TTCrx chip. The TTCrx Bunch Counter, used by the subsystems in the event identification, therefore counts from 0 to 3563, spanning the 3564 clock periods of one LHC orbit.
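The counting behavior can be illustrated with a minimal sketch. It is not the TTCrx logic itself: the class and function names are hypothetical, and the only number used is the orbit length of 3564 clock periods.

```python
ORBIT_LENGTH_BX = 3564  # clock periods per LHC orbit (counter values 0..3563)


class BunchCounter:
    """Toy model of a bunch counter cleared by BCR (hypothetical names)."""

    def __init__(self):
        self.value = 0

    def clock(self, bcr=False):
        """Advance one 25 ns clock period; a BCR forces the counter to 0."""
        if bcr:
            self.value = 0
        else:
            self.value = (self.value + 1) % ORBIT_LENGTH_BX
        return self.value


def run_orbit(counter):
    """Issue one BCR followed by the remaining clock periods of one orbit."""
    values = [counter.clock(bcr=True)]
    values += [counter.clock() for _ in range(ORBIT_LENGTH_BX - 1)]
    return values
```

Calling `run_orbit(BunchCounter())` yields one counter value per clock period of the orbit, starting at 0 when the BCR arrives and ending at 3563.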

For most applications the phase of the BC0 command is adjusted so that it is synchronous with data at the front-end input, whereas the phase of the BCR command is such that it is synchronous with data at the end of the front-end pipelines. In consequence the phase between the two commands is of the order of the trigger latency. The detectors that do not participate in the trigger will need in principle only one of the two commands.

### Reset Sequence

A Level 1 Reset is issued by the TCS either at specific intervals or by internal logic in the TCS when it detects a predetermined error condition or by an operator request. A reset request may be generated by a subsystem as a result of the detection of a mis-synchronization condition that affects a significant part of the subsystem (see Section 16.3.4).

The Level 1 Reset sequence consists of the following stages. After receipt of an external request or the internal generation of a L1 Reset command, Level 1 Accepts are shut off. The L1 Reset command is sent via the TTC system. Upon receipt of a L1 Reset, the subsystems assert the Busy signal, read pending data in the readout buffers (de-randomizers, DCC/DDU buffers and Readout Unit buffers), reset pipelines and readout buffer pointers. When this operation is concluded the subsystems drop the Busy signal. An Event Number Reset is then transmitted via the TTC system. After the reset procedure, the subsystems should be in a state to receive BC0 and L1A signals as at run start.

After a predetermined interval set by the longest time for a subsystem to complete this operation, the TCS resumes BC0 and L1A commands. During the time that the subsystem is being reset or is unable to capture data with a L1A, a Busy status should be asserted via the fast control. This only serves as a system check and is not used to control the Reset sequence.

Subsystems which do not have TTCrxs on their front-end and instead use their own internal sequences of L1As to generate resets and other commands are responsible for translating the TTC signals into these special internal sequences. Special sequences of L1As will not be generated by the TCS for such purposes.

A similar procedure is followed for the Hard Reset command, but hopefully less frequently. The main difference is the type of reset performed on the electronics, intended in this case to recover from electronics problems. Registers containing programmable parameters downloaded by software should not be affected by this reset.

Special reset procedures affecting one subsystem or part of it can be initiated by the subsystems, provided they preserve the ability to respond to centrally issued TTC commands. One such case is the pixel detector, which will need to reset the front-end pipelines once per orbit due to the high probability of radiation-induced SEUs. In order to allow the pixel pipeline reset, the main orbit gap will be artificially extended to a size slightly larger than the trigger latency so that the reset can be issued at the end of the gap, after the readout of the remaining events of the previous orbit. It should be noted that the present estimate of the L1 trigger latency is 127 bx (see Chapter 17), slightly larger than the duration of the main LHC gap. The pixel pipeline reset does not affect the pixel TTCrx chips, and in particular the local Event Number and Bunch Number, sitting in the Front End Driver (FED) in the counting room.

In addition to the fast reset sequences described here, there will be different levels of resets controlled by software. Some of these will reset limited local amounts of front end electronics, others will reset whole subsystems, and still another will involve a reset of the entire Trigger and DAQ system, resulting in a loss of all data in the pipelines at all levels. The issuing of these resets, which do not involve time critical constraints and are distributed by the Detector Control System, is decided by the Run Control System.

### Start/Stop Trigger Command

The Start Trigger fast command is intended to guarantee that, after initialization of the trigger electronics, all trigger components start processing data at the same crossing. This command is generated by the TCS after reception of the Start Run command, which is issued by the Run Control, and after all sub-detectors have issued the Ready signal on the Fast Monitoring network.

The Start fast command is decoded by the Trigger Primitive Generators of the calorimeter and muon triggers which will start sending trigger data at the next Bunch Crossing Zero. After the Stop command the trigger stops sending trigger data at the end of the current orbit.

### 16.4.2 Fast Monitoring Receiver

The Fast Monitoring handles all the signals received from the sub-detector partitions and from the Event Manager with a state machine programmed either to stop L1A signals (Trigger Inhibit) or to deliver Reset signals, if needed. The state machine will also take into account the status of the data acquisition received from the Run Control.

The Fast Monitoring status is reported via the DCS network to the Run Control. In turn, the Run Control can ask the sub-detector DCS for more detailed information and present it on the control display.

### 16.4.3 Trigger Throttling Subsystem

#### Trigger Rules

To limit overflows as much as possible, a set of Trigger Rules for the minimal spacing of L1As is implemented by the TTS. Examples of these rules are:

- i) No more than 1 Level 1 Accept per 75 ns (minimum 2 bx between L1As), dead time $5 \cdot 10^{-3}$.
- ii) No more than 2 Level 1 Accepts per 625 ns (25 bx), dead time $1.3 \cdot 10^{-3}$.
- iii) No more than 3 Level 1 Accepts per 2.5 $\mu$s (100 bx), dead time $1.2 \cdot 10^{-3}$.
- iv) No more than 4 Level 1 Accepts per 6 $\mu$s (240 bx), dead time $1.4 \cdot 10^{-3}$.

The total deadtime cost for such rules (estimated for L1A rate 100 kHz) is of the order of 0.9%. It is likely that it will not be possible to prevent all overflows in all subsystems with such rules. However, this would allow the creation of sufficient buffer depth to have a much lower probability for holding off the readout due to buffers getting full.
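A windowed rule set of this kind can be sketched in a few lines. The following is only an illustration, not the TTS implementation: the class name, the method name and the use of absolute crossing numbers are assumptions, and requests are assumed to arrive in increasing crossing order. The window lengths (3, 25, 100 and 240 bx) are the ones quoted in the example rules.

```python
from collections import deque

# (max L1As, window length in bx) for the example rules i) to iv)
TRIGGER_RULES = [(1, 3), (2, 25), (3, 100), (4, 240)]


class TriggerRuleFilter:
    """Vetoes an L1A if accepting it would break any windowed rule."""

    def __init__(self, rules=TRIGGER_RULES):
        self.rules = rules
        self.max_window = max(window for _, window in rules)
        self.accepted = deque()  # absolute bx numbers of recent L1As

    def request(self, bx):
        """Return True if an L1A at crossing `bx` is allowed and record it."""
        # Forget accepts that can no longer matter for any rule.
        while self.accepted and bx - self.accepted[0] >= self.max_window:
            self.accepted.popleft()
        for max_accepts, window in self.rules:
            recent = sum(1 for a in self.accepted if bx - a < window)
            if recent + 1 > max_accepts:
                return False  # this rule would be violated: inhibit
        self.accepted.append(bx)
        return True
```

With an accept at crossing 0, requests at crossings 1 and 2 are vetoed by rule i), a request at crossing 3 is accepted, and a third request inside the same 25 bx window is vetoed by rule ii). The individual dead times quoted for the four rules add up to $8.9 \cdot 10^{-3}$, consistent with the total of about 0.9%.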

#### Front-end Buffers Emulation

According to the front-end electronic logical model, the front-end derandomizers after the L1 latency pipelines are the first devices to overflow when the L1A rate is too high. Space and power constraints in the front-ends imply small derandomizer depth and hence these queues are very sensitive to bursts of L1As. In general, the derandomizers behave like a first-in-first-out queue: the input/output frequency is directly the L1A rate and the event readout time (detector dependent) ranges from 3 $\mu$s to 7 $\mu$s.

Simulations have been made showing the relation between the derandomizer depth, the event readout time and the probability to overflow (see Fig. 16.5). The overflow probability is strongly dependent on the ratio between the service time and the buffer depth. A derandomizer with capacity for 8 events and a service time of 7 $\mu$s per event, as in the tracker, has an overflow probability of $2 \cdot 10^{-3}$. The data acquisition inefficiency is negligible, but at the highest trigger rate the buffer would overflow every 5 ms. Resetting the buffers at this frequency is obviously not a good solution.
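The 5 ms interval follows from the quoted numbers, assuming that the overflow probability is understood per L1A and that the highest trigger rate is the 100 kHz used for the dead time estimate of the trigger rules:

$$T_{\text{overflow}} \approx \frac{1}{R_{\text{L1A}} \cdot P_{\text{overflow}}} = \frac{1}{10^{5}\ \text{s}^{-1} \cdot 2 \cdot 10^{-3}} = 5\ \text{ms}$$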

On the other hand, the use of the fast Warning Overflow to stop the L1As is not effective in this situation. The latency of the fast feedback network is of the order of a few microseconds, mainly due to cable lengths, during which additional triggers can occur. In practice a reserve of a few events in the derandomizer would be needed for these triggers, so that the effective length of the derandomizer is smaller than the real one, implying a larger inefficiency (several percent).



**Fig. 16.5:** Overflow probability as a function of the service time and the derandomizer event depth.

We have established that the best way to handle this problem is to emulate the readout buffers that have deterministic behavior, as is the case for the derandomizers, and to reduce the L1A rate to avoid fatal overflows in the front-end buffers. When the TTS determines through a model that a subsystem readout buffer is getting close to full, it either reduces or shuts off the L1A until the buffer occupancy is reduced to a safe level. This is done in the TCS logic, where the resulting deadtime is also recorded.
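The idea of a buffer emulator can be sketched as a deterministic FIFO model that is stepped once per crossing. This is a toy model under stated assumptions, not the TTS design: the class name and the `reserve` margin are hypothetical, and the defaults (8 events, 7 $\mu$s service time, i.e. 280 crossings of 25 ns) are the tracker figures quoted above.

```python
class DerandomizerEmulator:
    """Deterministic FIFO model of a front-end derandomizer.

    depth and service_bx are illustrative defaults: 8 events and
    7 microseconds (280 bx at 25 ns per crossing), as quoted for the tracker.
    reserve is a hypothetical safety margin, not a value from the text.
    """

    def __init__(self, depth=8, service_bx=280, reserve=1):
        self.depth = depth
        self.service_bx = service_bx
        self.reserve = reserve
        self.occupancy = 0
        self.remaining_bx = 0  # bx left to finish reading the head event

    def tick(self, l1a_requested):
        """Advance one crossing; return True if the L1A is issued."""
        # Read out the event at the head of the queue.
        if self.occupancy > 0:
            self.remaining_bx -= 1
            if self.remaining_bx == 0:
                self.occupancy -= 1
                if self.occupancy > 0:
                    self.remaining_bx = self.service_bx
        # Inhibit the L1A when the emulated buffer is close to full.
        if not l1a_requested or self.occupancy >= self.depth - self.reserve:
            return False
        self.occupancy += 1
        if self.occupancy == 1:
            self.remaining_bx = self.service_bx
        return True
```

Because the model is deterministic, the same emulator run centrally sees the same occupancy as the front-end buffer it mirrors, so the inhibit can be applied without waiting for a feedback signal from the detector.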

### 16.4.4 Deadtime Monitor and Counters

The TCS has a number of counters that are used for monitoring purposes and that are made available through DCS. Some of these counters are used to measure the experiment dead time.

The TCS has at least the following counters that are incremented for the complete orbits between the fast Start and Stop Trigger commands:

1. Number of orbits (Orbit Number).
2. Number of clock periods in the orbit (Bunch Number).
3. Number of beam crossings.
4. Number of beam crossings where L1A was potentially inhibited.
5. Number of triggers per Trigger Condition (without dead time).
6. Number of global triggers (L1As without dead time).
7. Number of distributed L1As (Event Number).

For each new crossing the TCS combines all possible L1A inhibit conditions (trigger rules, buffer emulators, fast monitoring requests to hold L1A, calibration or reset activities, etc.) and creates the final inhibit signal. This signal not only prevents a potential L1A from being distributed but is also used to gate counter number 4. The ratio of counters 4 and 3 gives the experiment dead time.
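The dead time measurement can be illustrated with a small sketch. The class and argument names are hypothetical, and the sketch assumes that counter 4 is only incremented on beam crossings, as its description suggests.

```python
class DeadtimeCounters:
    """Toy version of TCS counters 3 and 4 (names are hypothetical)."""

    def __init__(self):
        self.beam_crossings = 0       # counter 3
        self.inhibited_crossings = 0  # counter 4

    def crossing(self, has_beam, inhibit_sources):
        """Record one crossing; inhibit_sources maps source name -> bool."""
        inhibit = any(inhibit_sources.values())  # final inhibit = OR of all
        if has_beam:
            self.beam_crossings += 1
            if inhibit:
                self.inhibited_crossings += 1
        return inhibit

    def dead_time(self):
        """Fraction of beam crossings in which an L1A was inhibited."""
        if self.beam_crossings == 0:
            return 0.0
        return self.inhibited_crossings / self.beam_crossings
```

For example, 9 inhibited crossings out of 1000 beam crossings give a dead time of 0.9%.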

Some signals will be available externally to gate the luminosity counters, in particular the run signal, active between start and stop, and the L1A inhibit signal.

### 16.4.5 Calibration and Test Control

### Calibration and Test modes

Requirements for the calibration activities in CMS were presented in Section 16.1.4. Several calibration and test operation modes are foreseen with the following properties:

1. *Sub-detectors in stand-alone mode:*
   - Test and calibration sequences (see below) are generated locally;
   - Data is captured with the sub-detector local DAQ.

2. *Sub-detectors in DAQ partition mode:*
   - TCS generates test and calibration triggers at the rate required by the sub-detector partition;
   - Data is collected by the central DAQ.

3. *Central test and calibration triggers during a Physics Run:*
   - Test trigger sequences are issued centrally and distributed to all partitions;
   - Event Number is incremented in the local TTCrx;
   - Subsystems deliver an event data block (the event block can be empty);
   - Calibration/test triggers are issued at pre-programmed cycles in the LHC orbit;
   - Data is collected by the central DAQ.

4. *Local test and calibration triggers during a Physics Run:*
   - Test and calibration triggers can also be handled at the subsystem level during a Physics Run, provided the test or calibration activities occur during the Private Gaps or Private Orbits marked by TCS.

Table 16.3 lists the fast signals issued by TCS and distributed by TTC to control the calibration and test activities.

**Table 16.3:** Calibration Control Signals

| Fast Control Signals | TTC command type | Comments                                                                      |
|----------------------|------------------|-------------------------------------------------------------------------------|
| Test Enable          | Broadcast        | Broadcast command sent a fixed time before a test or calibration trigger.     |
| Private Gap          | Broadcast        | Broadcast command marking the next gap for private use by the sub-detectors   |
| Private Orbit        | Broadcast        | Broadcast command marking the next orbit for private use by the sub-detectors |

### Calibration Trigger Sequence

The operation and scheduling of calibration and test triggers during a run are controlled by the TCS. A schedule of all tests to be run is downloaded to the TCS before the beginning of a run by the DAQ. The order of the tests determines which test is performed. The TCS sends to all subsystems a Test Enable signal a fixed number of crossings before the crossing when the test is to be done. This enables subsystems to prepare for the test. The subsystems set up their tests so that the test data will be contained in the exact crossing indicated by the TCS Test Enable signal. The TCS inhibits normal triggers a fixed time before the test trigger crossing so that there will be no interference. The TCS then sends out a Level 1 Accept for the appropriate crossing to read in the test data.

The timing of the test and calibration triggers can be programmed in order to happen at pre-defined bunch crossing numbers in the LHC orbit. Normally these triggers will occur during the LHC main gap but they can also be delivered in superposition to beam crossings to study pile-up effects.

If different types of tests are needed (e.g. laser pulse, electric pulse, pulses sent to restricted geographical regions in the detector, etc.) it is the responsibility of the subsystems to pre-load sequences of different test commands in the relevant TTCvi FIFO.

When received by the subsystem front-ends, the Test and Calibration L1As are treated in the same way as the physics triggers. The front-end readout sequence is the same, the Event Number is incremented and the data is collected by the central DAQ. As pointed out before, the subsystems may respond with an empty event block.

Some sub-detectors need to reconfigure the front-end electronics to perform some type of tests. This operation is done by the Front-End Controllers (e.g. tracker and preshower) or by dedicated logic in the Readout Boards (e.g. calorimeters) sending specific commands to the front-end electronics on the detector. Despite the fact that this action can be initiated by a TTC fast command, it is not practical to switch the front-end electronics configuration to test mode and then back to normal mode at each calibration trigger. For this purpose calibration mini-runs can be inserted during normal physics runs, under software control, providing the subsystems with a period without physics triggers during which the front-end electronics mode is switched and a short sequence of calibration triggers is taken.

### Calibration Triggers at the Subsystem Level

Some subsystems have the ability to generate test and calibration signals by themselves. This feature is already being used in the test beams and will naturally be integrated in the experiment for tests in stand-alone mode. In normal physics data taking, these systems may be used provided that they do not interfere with the main DAQ path.

For this purpose, the TCS will periodically issue the command Private Gap (Private Orbit), marking the next Gap (Orbit) for private use by the sub-detectors. The subsystems may use the gap to generate test/calibration signals at the top of the subsystem TTC (TTCvi) or by an independent system. The Event Number is not incremented and the data is collected by the subsystem local DAQ. This possibility is used when the volume of test or calibration data is small and can be easily handled locally.

### Calibration Logic

The Calibration logic in TCS controls calibration cycles in close collaboration with the connected TTCvi board. A calibration cycle includes a 'Test Enable' signal, the generation of data in the subsystem (laser pulses, test pulses, test patterns...) and the L1A to read the data. The programmed parameters in the Calibration module and in the TTCvi board have to match correctly. The Calibration logic accepts software requests as well as fast external requests from the connected subsystem and either inserts just one calibration cycle during normal data taking or runs calibration cycles continuously.

When a calibration cycle is initiated a B\_Go(i) signal is sent to the TTCvi board at a programmed bunch crossing number to start a calibration cycle. The next preloaded command stored in the FIFO(i) is sent to the subsystem front-end to start the calibration procedure. The actual code of the synchronous commands stored in FIFO(i) is the responsibility of the subsystem. It may include one or more 'Test Enable' codes for different types of tests and an 'Empty Event' code if the subsystem is not interested in the test data. Some time later (of the order of 100 bx) at a preloaded bx number the Calibration logic sends a L1A to read the calibration data. During the time from B\_Go(i) until the corresponding L1A the normal physics triggers are stopped.

Several different cases are considered:

- Calibration for another subsystem of the same partition group: if several subsystems are connected to the same partition group (see Section 16.5) and calibration is issued for one or more of the subsystems the other subsystems have to generate empty events to keep the common event number consistent within the group.
- Dedicated Calibration Run: continuous calibration cycles can be requested for dedicated calibration runs.
- Several L1As per calibration cycle: the logic allows more than one L1A to be sent per calibration cycle if needed.
- Several calibration cycles per orbit: if a subsystem wants to take calibration data more frequently, several calibration cycles can be sent per orbit.
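The timing of a single calibration cycle, as described above, can be laid out as a simple timeline. This is a sketch only: the function names and the action labels are hypothetical, and the cycle is assumed to fit within one orbit.

```python
ORBIT_LENGTH_BX = 3564  # clock periods per LHC orbit


def calibration_cycle(b_go_bx, l1a_bx, fifo_command="Test Enable"):
    """Return the (bx, action) timeline of one calibration cycle.

    b_go_bx and l1a_bx are the programmed bunch crossing numbers; the
    text quotes a spacing of the order of 100 bx between them.
    """
    if not 0 <= b_go_bx < l1a_bx < ORBIT_LENGTH_BX:
        raise ValueError("the L1A must follow B_Go within the same orbit")
    return [
        (b_go_bx, "inhibit physics triggers"),
        (b_go_bx, "B_Go(i): send next FIFO(i) command: " + fifo_command),
        (l1a_bx, "calibration L1A"),
        (l1a_bx + 1, "re-enable physics triggers"),
    ]


def physics_inhibited(bx, b_go_bx, l1a_bx):
    """True while normal physics triggers are stopped for the cycle."""
    return b_go_bx <= bx <= l1a_bx
```

For instance, `calibration_cycle(3400, 3500)` describes a cycle with 100 bx between B\_Go(i) and the calibration L1A; the crossing numbers used here are purely illustrative.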

## 16.5 Partitioning

Partitioning into subsystems may be accomplished in two ways. The principal method (DAQ Partition Mode) is to maintain the full TTC tree and generate separate L1As for different subsystems in the GT. Each partition receives its own L1 trigger and the Event Manager requests readout of the partitions receiving the L1A signal.

A second method (Standalone Mode) is to shut off the TTC link between the subsystem and the TCS. The DAQ Processors controlling the subsystem TTC signal distribution and the TCS accomplish this shutdown. Under control of the local DAQ Processor in the subsystem TTC distribution crate, the TTCvi then takes over the function of generating control data over the subsystem's TTC network. This method has a restricted use since it is only employed with local triggers and local DAQ readout.

The Level 1 Trigger presents a special case in partitioning. Conceived as a subsystem of its own, alongside the various front-end DAQ subsystems, it nonetheless has components within it that are closely associated with each of the DAQ subsystems. For some types of testing, it is desirable to include both the front-end DAQ crates and their associated Level 1 Trigger crates in the same test partition. To accommodate this situation, the Level 1 Trigger system, along with the selected DAQ subsystems, can be controlled from the TCS as a limited configuration using as much of the trigger decision logic as possible (based on which subsystems are available in the limited configuration). The remaining subsystems that are not in the configuration may be operated as separate partitions as described above.

### 16.5.1 Trigger Control Partitioning

The architecture of the Trigger Control System provides control of 32 physical partitions (see Section 16.5.2). Identical units in the TCS implement the L1A and fast command generation, as well as the trigger throttling and calibration functions, for each sub-detector partition (see Figure 16.6).

Each Partition Trigger Control (PTC) unit consists of a TTS (trigger throttle system) module and a Calibration module, and each can select one out of eight Final-OR signals. The PTC unit sends a L1A, a trigger type word and the B\_Go signals to the respective TTCvi board. All units receive the trigger bits computed by the Final Decision Logic (FDL) in the Global Trigger (see Chapter 15).

The PTC units can be combined to form partition-groups that wish to run common test and calibration procedures. In normal data taking all subsystems are connected to one group. Within one group, one of the units is programmed as master and the others as slaves. The master provides the L1A and the event number of the group. The PTC unit for the Event Manager (EVM) runs either as the group master or as slave for all partitions. In normal data taking mode the EVM-PTC unit is master. As slave, the EVM-PTC unit sends an OR of the L1As of all partitions to the DAQ Event Manager. The trigger type words tell the EVM to which subsystems the current L1A has been sent. The TCS will be physically located in the Global Trigger crate in order to minimize the L1 trigger latency.

**Fig. 16.6:** Partitioning of the Trigger Control System.

In summary, the foreseen implementation of the trigger control allows the following possibilities:

- i) The TCS allows the subsystem partitions to run either in single partition mode or combined in partition-groups.
- ii) The TCS provides all functions necessary to run partitions or groups of partitions in parallel for calibration, tests and commissioning with beam.
- iii) The partitions are configured by the Run Control both in the TCS and in the DAQ Event Manager. The TCS and the EVM count the Event Number per partition or partition-group. The DAQ system combines data from the subsystems of a group to build events.
- iv) Local readout of events triggered by central L1As is possible: a partition may want L1A to read events by the local DAQ and does not require central DAQ resources. The TCS sends L1A to the partition but the partition bit is not sent to the EVM. Therefore the EVM does not read this partition.
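Possibility iv) can be illustrated with a short sketch of how the set of partitions reported to the Event Manager might be formed. The function name and the bit assignment `bit_of` are hypothetical; the text does not specify the format of the trigger type word.

```python
def evm_partition_word(l1a_partitions, local_readout, bit_of):
    """Build the partition bit mask reported to the Event Manager.

    l1a_partitions: partitions that received the current L1A
    local_readout:  partitions read by their local DAQ only
    bit_of:         hypothetical mapping partition name -> bit index (0..31)
    """
    word = 0
    for name in l1a_partitions:
        if name in local_readout:
            continue  # L1A is sent, but the EVM is not told to read it
        word |= 1 << bit_of[name]
    return word
```

A partition listed in `local_readout` still receives the L1A, but its bit stays cleared, so the EVM does not request its data.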

### 16.5.2 Sub-detector partitions

The number of partitions is limited to 32. Table 16.4 gives a preliminary list of the sub-detector partitions. This list will be updated after the sub-detectors have precisely defined their requirements.

In the present preliminary design, and due to L1 latency limitations, the number of trigger condition OR's in the Final Decision Logic is limited to 8. That means that only up to 8 partitions (or partition groups) can run physics triggers simultaneously. For calibration and tests up to 32 partitions can run concurrently.
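The two limits can be expressed as a simple configuration check. This is a sketch under stated assumptions: the function name and the layout of the `groups` dictionary are hypothetical, and each group running physics triggers is assumed to need one Final-OR signal.

```python
MAX_PARTITIONS = 32  # physical partitions handled by the TCS
MAX_FINAL_ORS = 8    # trigger condition ORs in the Final Decision Logic


def check_configuration(groups):
    """Validate a partition-group configuration.

    groups maps a group name to a dict with keys:
      "partitions": list of partition names
      "physics":    True if the group needs a Final-OR (physics triggers)
    Returns the number of Final-OR signals in use.
    """
    all_partitions = [p for g in groups.values() for p in g["partitions"]]
    if len(all_partitions) != len(set(all_partitions)):
        raise ValueError("a partition belongs to more than one group")
    if len(all_partitions) > MAX_PARTITIONS:
        raise ValueError("more than 32 partitions configured")
    final_ors = sum(1 for g in groups.values() if g["physics"])
    if final_ors > MAX_FINAL_ORS:
        raise ValueError("more than 8 groups request physics triggers")
    return final_ors
```

Groups used only for calibration or tests do not count against the limit of 8, which is why up to 32 partitions can run concurrently in those modes.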

**Table 16.4:** Preliminary list of sub-detector partitions

| Sub-Detector                                | Number of Partitions | Partitions                                                  |
|---------------------------------------------|----------------------|-------------------------------------------------------------|
| Pixels                                      | 2                    | Barrel, Forward                                             |
| Si-Tracker                                  | 4                    | Disk+, Disk-, Inner barrel, Outer barrel                    |
| ECAL                                        | 6                    | EB+, EB-, EE+, EE-, SE+, SE-                                |
| HCAL                                        | 6                    | HB+, HB-, HE+, HE-, HF+, HF-                                |
| RPC                                         | 4                    | Endcap+, Endcap-, Barrel+, Barrel-                          |
| DT                                          | 2                    | Barrel+, Barrel-                                            |
| CSC                                         | 2                    | Endcap+, Endcap-                                            |
| Calorimeter, Global Muon and Global Trigger | 3                    | Calorimeter trigger, Global Muon trigger and Global trigger |
| Muon Trigger                                | 2                    | CSC and DT trigger                                          |

## 16.6 Trigger Control Software

### 16.6.1 Operational Environment

The user will be able to pilot the entire Trigger System, or any of its sub-components, either from the central Run Control/Detector Control System, or in standalone mode, isolated from the other data acquisition subsystems.

The user can choose a specific configuration of the trigger hardware to test, control, read out and eventually monitor. These partitions are described in terms of elementary partitions (elements controlled by a CPU), relationships between these elementary partitions, and between these and particular equipment modules. Once the equipment to be operated and the run conditions have been specified, the user will be able to initialize the equipment and start its operation.

During the run, the control software will monitor the behavior of the trigger hardware and will display information characterizing its behavior. After a Stop, the user can look at a statistical summary of the run activity. Concurrently, other users can log into the system to run specific monitoring tasks, or to operate equipment that has not yet been allocated.

During the setting-up periods, several users will be independently running the trigger sub-components (calorimeter, muon and global triggers). The debugging of each component will be more efficient if done by rapidity regions, or physical space sectors. In this scenario one can expect different users to be operating various detector partitions at the same time.

When testing, the user will be provided with a suite of test functions. At the end of each test the user can then look at a summary of the test results.

### 16.6.2 Control Software Framework

In this section we describe the general characteristics of the trigger software framework (CARDS) common to the various sub-component controllers (e.g. Calorimeter Trigger Control, Muon Trigger Control and Global Trigger Control). This framework provides a common tool that can be configured and adapted to the needs of the specific controllers. The trigger software framework will be based on the components and tools of the commercial SCADA selected for DCS, as long as these tools give the necessary functionality.

The functional requirements of trigger control software are assigned to different actors which perform different activities. The actors are the Administrators, the Experts, the Operators and the Users. The activities or use cases that the actors will perform are Administration, Maintenance, Test & Diagnostics, Configuration, Operation and Monitoring. A short description of the content of the Use Cases is given in Table 16.5.

These Use Case categories can be mapped to different applications:

- an administration application, for user administration of the Control Software. Implements Use Case Category Administration;
- an inventory application, to allow the users to manipulate and maintain equipment, elementary partitions and partitions, belonging to the trigger subsystem. This application's purpose is to provide a suitable user interface with the repository database. Implements Use Case Category Maintenance;
- a configuration tool, to allow the user to select the part of the trigger subsystem to be operated. Implements Use Case Category Configuration;
- a control framework supporting concurrent user access and capable of working interactively or in server mode on the selected configuration of the trigger subsystem. Implements Use Case Category Operation;
- an interactive test and simulation framework to verify the functionality of a given piece of hardware equipment or software component. Implements Use Case Category Tests & Diagnostics;
- a monitoring framework supporting multi-user access to supervise the trigger subsystem. Implements Use Case Category Monitoring. In addition, APIs will be provided to allow a user to integrate specific monitoring services within the subsystem;
- an accounting application, to monitor the trigger subsystem occupation. Implements Use Case Category Accounting.

The difference between the control application and the test and simulation application is that the first can be driven by another (external) detector control system, like the CMS Run Control System. The test and simulation application can only be driven by an interactive client. Therefore, from the control perspective, CARDS can be seen as yet another Sub-Detector Control System (DCS node) and thus should be implemented according to the CMS SCADA framework recommendations.

**Table 16.5:** Use Case Categories.

| Actor            | Use Case Category   | Description                                                                                                        |
|------------------|---------------------|--------------------------------------------------------------------------------------------------------------------|
| Administrator    | Administration      | User registration.                                                                                                 |
| Expert           | Maintenance         | Partition, elementary partition and device registration, modification and deletion.                                |
| Expert           | Tests & Diagnostics | Partition test: functional and boundary scan tests, synchronization adjustments.                                   |
| Expert, Operator | Configuration       | System configuration operations: partition, elementary partition and device selection and setup.                   |
| Expert, Operator | Operation           | Partition operation: control operations.                                                                           |
| User             | Monitoring          | Partition monitoring: raw data displays, raw data consistency checks, synchronization checks, statistical reports. |
|                  | Accounting          | System status: alarms, occupation, statistical displays.                                                           |

From the architecture point of view, CARDS can be seen as a 3-tier system for the administration application and the inventory application, and as an n-tier system for the remaining products. The advantages of these architectures are manifold:

- the control logic is located solely in the application server;
- clients communicate with application servers using lightweight, standards-based protocols;
- data is processed in the application server, thereby reducing the client network traffic and permitting clients to be deployed across bandwidth-poor networks such as the global Internet;
- instead of interacting with the system infrastructure directly, the client calls control logic in the application server. Since clients have no knowledge of the ultimate data sources, they can evolve independently of the system infrastructure;
- they offer better hardware architecture flexibility;
- they permit the integration of legacy systems by object wrapping;
- they can provide heterogeneous database support.

All CARDS applications are characterized by a client process, one or more server processes and a database server process. Each client process is driven by a graphical user interface and can be launched either from a remote site or from one of the machines sitting in the Experiment's Control Room.

To support the monitoring application we need several back-end computers with graphical capabilities in the Experiment's Control Room. Some of these machines can and will be used to display the status of the trigger system permanently. The rest will be used to support the sporadic needs of the shift crew members.

It should be noted that this architecture is not yet final. The size of the global process to be controlled (in terms of data rates and response-time scales) depends on the evolution of the user requirements. A more accurate estimate of that size may require an additional processor layer between the Service Provider and some subsystems.

## 16.7 Software Functions

The Trigger Control Software is integrated in the global CMS on-line software structure, providing services requested by the Run Control System (RCS) or by the Detector Control System (DCS). In this sense the Trigger Control Software is a node of both systems, providing services that can be invoked either by RCS or DCS through software interfaces.

The CMS Run Control System is a distributed, hierarchical and partitionable system that takes the control of the experiment during physics data taking or detector calibration. RCS commands the start/stop of runs, takes the initiative for reset procedures, has control over the main data acquisition path and organizes the on-line data processing activities. To perform its activities, RCS requires from DCS the downloading of parameters as well as the monitoring of slow control data.

The CMS Detector Control System (DCS) is a distributed, hierarchical and partitionable system built out of software and hardware components. It will be based on a commercial Supervisory Control And Data Acquisition (SCADA) system, which provides features such as alarm handling, logging, archival, access control, scripting and user interface. During beam-off periods DCS remains active, controlling the status of the detector infrastructure.

The RCS and DCS systems will be accessible from anywhere in the world via the Internet, allowing an expert to work with the system from any location without additional restrictions.

The part of DCS which deals with the trigger will ensure that, on request of start of run from the Run Control, all necessary configuration data is available to the trigger control, which then distributes the configuration to its clients within the crates. Which configuration was selected, by whom, and when, will be logged in DCS. In case of calibration and test triggers, the sequence is pre-programmed at the beginning of the run. As for the other sub-detectors, DCS will require access to an external data base with its management system to handle the large data volumes required to keep all configuration and monitoring data. The data base information can be accessed directly by the crate controllers once they are informed of the required configuration.

The access control system of DCS will ensure that parameters are only changed and downloaded in a controlled way. In case of a problem within the configuration process, an alarm would be raised and the start of the run suspended. In parallel, DCS has to control all standard slow control items within the trigger system, such as power for the crates.

The trigger hardware at the various levels will transfer monitoring information and error reports to the trigger software where, if necessary, an alarm would be raised and passed on to DCS.

A fraction of the trigger data sent to the central DAQ is spied on by the crate processors, where it can be used for monitoring or diagnostic purposes. Summary reports and statistical data will be produced and made available to RCS or DCS.

## 16.8 Software Design

### 16.8.1 The software process

We will adopt appropriate engineering techniques for the construction of the control software. The software process inputs are the user requirements for the software system and the output is the software system itself - including code and documentation. This process is then split into sub-processes which are linked by intermediate deliverables (project documentation).

The project deliverables are documents of various kinds, such as designs or source code, which developers produce at particular stages of the software process. Each deliverable should be reviewed by a small group before being released for general use, so that there is a guarantee that it is of acceptable quality. The format of the deliverables should be defined so as to encourage uniformity across the project.

After the requirements have been collected and the outline of the system structure worked out, the development should be carried out in a sequence of controlled cycles. At the end of each new cycle a new working system should be delivered, although with limited functionality. Each new cycle should add some more functionality, integrate it with the existing system and perform some global tests. This type of development is termed 'incremental'. A developer can work independently on the implementation and test of a module, during a cycle, and then integrate it for system testing at the end of the cycle.

To ensure and improve software quality we should review the deliverables and identify those software metrics that might give an indication of quality. It must be possible to show that each deliverable satisfies the requirements implied in each preceding document of the software process.

### 16.8.2 The software development environment

By software development environment is meant the integrated set of tools on the developer's desktop (CASE tools, testing tools, compilers, linkers, debuggers etc.) which will allow him/her to participate in the development or modification of the software. This software development environment should include:

- a CASE tool, to help in the design process, to generate code (at least partially) and reverse-engineer existing code, thereby providing a permanent link between the design documents and the corresponding code;
- compilers, debuggers and linkers, supporting the chosen implementation languages (C++ and Java);
- testing tools, for code verification and validation. There are tools in the market which can help with consistency checking of designs, and with the generation and execution of test cases. These tools can also analyze the code test coverage and look for memory leaks. Besides, if the developed code follows a pre-defined set of standard rules, we can automatically check code against standards, and collect several metrics from the code. Such metrics can be used to measure the quality of the code;

- a configuration management tool. The idea is to have a master repository at CERN available to external collaborators;
- documentation tools, for automatic generation of documentation from the code and design diagrams, making the fullest possible use of the World-Wide Web.

## References

- [16.1] W. Smith, A. Taurok, J. Varela, C. Wulz, “The CMS Trigger Control System”, CMS Note in preparation.
- [16.2] W. Smith, “CMS Synchronization Workshop Conclusions”, CMS IN-1999/016.
- [16.3] A. Racz, “Trigger Throttling System for CMS DAQ”, CMS Note in preparation.
- [16.4] J. Varela, “Timing and Synchronization in the LHC Experiments”, proceedings of the 6th Workshop on Electronics for LHC Experiments, Krakow, September 2000, CERN 2000-010, CERN/LHCC/2000-041.
- [16.5] Ph. Farthouat, P. Gallno, “TTC-VMEbus Interface TTCvi”, RD12 Working Document.
- [16.6] J. Christiansen, A. Marchioro, P. Moreira, “TTCrx Reference Manual”, RD12 Working Document.
- [16.7] B. Taylor, “TTC Distribution for LHC Detectors”, IEEE Trans. Nuclear Science, Vol. 45, 1998.
- [16.8] B.G. Taylor, “LHC machine Timing Distribution for the Experiments”, proceedings of the 6th Workshop on Electronics for LHC Experiments, Krakow, September 2000, CERN 2000-010, CERN/LHCC/2000-041.

# 17 Synchronization and Latency

## 17.1 Introduction

The design of the CMS front-end readout and level 1 trigger follows a synchronous pipeline model: the detector data are stored in pipeline buffers at the LHC frequency awaiting the L1 decision, while data from the calorimeters and muon detectors are processed by a distributed, pipelined, tree-like processor that computes the L1 decision. The L1 latency must be constant and shall match the pipeline buffer length. The whole system behaves synchronously. By definition, the operation of the system implies that the synchronization of signals at several levels is achieved and monitored.

The detector signals shall be synchronized with the 40 MHz clock phase. The L1 trigger signal broadcast to the detector electronics shall be synchronized with the pipeline data. These data shall correspond to the same crossing as the trigger. Programmable delays are adjusted to achieve synchronization and correct bunch crossing identification. The number of different delay parameters in the experiment is very large. A clear procedure to determine these parameters needs to be defined.

The trigger system is based on the assumption that, at the input of every processing stage, data are synchronized and belong to the same bunch crossing. The monitoring of the bunch number of trigger data flowing in the trigger pipeline is of the greatest importance. After the L1 accept signal, event fragments have to be collected to form complete events. Careful checking of event identifiers, recovery from synchronization losses and management of buffer overflows are some of the issues in this context.

The sources of bad synchronization are numerous. The assignment of a pulse to a bunch crossing depends on the shape and jitter of the signals. Variations of the signal shape with amplitude, large signal jitter and jitter of the clock phase can all cause misidentification of the signal timing. The detector signals in a given crossing are distorted by possible pile-up of signals from previous or following crossings. This effect deteriorates not only the pulse height measurement but its timing as well. For the first time, such a large number of high-speed optical serial links will be put in synchronous operation. Any loss of link synchronization will have immediate consequences for the trigger and readout synchronization. Optical fibers are mechanically fragile media. Friction when pulling or moving cables can stretch the fibers and change their lengths. It is also expected that the optoelectronics and associated electronics will introduce propagation time changes of a few nanoseconds with signal amplitude and temperature. The system operates with a single clock frequency, the LHC frequency. However, the phase of the clock signal, after distribution to tens of thousands of destinations, is unpredictable at the level of a few ns. When transmitting data between sub-systems, the data must be re-synchronized to the clock phase at the receiving end. Much care has to be taken in the way this operation is done so as to avoid missing or adding one clock cycle to the transmission latency. Some of these problems and their respective solutions will be investigated and validated using the LHC-like test beam available at the SPS.

The Level 1 trigger latency is a basic parameter of CMS. It determines the amount of pipeline data storage that must be provided in the front end electronics systems located near the detector. CMS has defined a global L1 latency envelope of 128 bunch crossings (3.2 $\mu$s). This value was assumed in the sub-detector front-end pipeline design, the analog tracker and preshower pipelines in particular.

In this chapter we will refer to the following synchronization issues (Figure 17.1):

- Sampling Synchronization: synchronization of the detector signals with the clock phase.
- Serial Link Synchronization: recovery of parallel data words from the serial bit stream.
- Bunch Crossing Identification: assignment of a bunch crossing number to data in the Trigger path.
- Trigger Data Alignment: alignment of trigger data at the input of the trigger pipeline processors.
- Sub-Triggers Synchronization: alignment of trigger data from different sub-systems at the Global Trigger input.
- L1 Accept Synchronization: synchronization of the L1A signal with data in the readout pipelines.
- Event Synchronization: assignment of bunch and event number to data fragments in the DAQ path.



**Fig. 17.1:** Synchronization in the CMS trigger and readout systems.

## 17.2 Latency

There are many contributions to the latency. These include time of flight to the detector; propagation of signals within the sensitive elements of the detector; signal processing and trigger primitive generation times; cable runs within the detector hall and from the detector hall to the control room; time to form regional trigger components; time to make the global trigger decision; and time to distribute the L1 trigger accept signal back to the electronics in the front end.

In the CMS detector, the muon drift tubes have by far the largest “signal collection” time. The maximum drift time within the tubes is about 0.4 $\mu$s. The primitive data from this subsystem are the last to be available to the trigger decision logic. In addition, the front-end pattern logic for the CSC trigger also absorbs significant time. The estimates of the trigger logic for each subsystem indicate that the muon subsystem will be the last system ready for the global level 1 trigger. The calorimeter will be done ahead of it and will be delayed or “aligned” to the muon system. The time critical elements then become the muon trigger itself (including signal paths to the counting room), the time required to form the global level 1 trigger, and the time required to distribute the L1 Accept signal back to the front end electronics. These items are the critical ones in determining the total L1 latency.

An appreciable portion of the latency is in fixed cable delays, in particular the cables between the detector cavern and the electronics room. In order to minimize the cable latency, dedicated straight paths for trigger cables are foreseen in the underground layout.

### 17.2.1 L1 Latency Budget

Four numbers are used to specify the L1 latency requirements for CMS. They are:

1. ‘Trigger Processing’ time: The maximum time allowed for any subsystem to process information and for the global L1 trigger to make a decision and produce the L1 accept; in the case of the DT trigger this also includes the tube drift time;
2. ‘Data Transmission’ time: The maximum time allowed for the transmission of the trigger primitive information to the trigger decision racks in the CMS electronics room and for transmission between trigger components in the counting room;
3. ‘L1 Accept Fanout Processing’ time: The maximum time allowed for the L1 accept to be fanned out for transmission back to detector front-end electronics;
4. ‘L1 Accept Transmission’ time: The maximum time allowed for the L1 accepts to be transmitted from the electronics room to the front end electronics on the detector.

These numbers are listed in Table 17.1.

**Table 17.1:** Time in bunch crossings (bx) for the four major divisions contributing to the L1 latency.

|                                 | Maximum latency |
|---------------------------------|-----------------|
| Drift Time & Trigger Processing | 70 bx           |
| Data Transmission               | 28 bx           |
| L1 Accept Fanout                | 6 bx            |
| L1 Accept Transmission          | 24 bx           |
| Total                           | 128 bx          |
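The budget in Table 17.1 can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes one bunch crossing equals 25 ns (the 40 MHz LHC clock period), and the variable names are chosen here, not taken from any CMS software.

```python
# Level 1 latency budget from Table 17.1, in bunch crossings (bx).
BX_NS = 25.0  # assumed duration of one bunch crossing, in ns

budget_bx = {
    "Drift Time & Trigger Processing": 70,
    "Data Transmission": 28,
    "L1 Accept Fanout": 6,
    "L1 Accept Transmission": 24,
}

# Sum the four divisions and convert the total to microseconds.
total_bx = sum(budget_bx.values())
total_us = total_bx * BX_NS / 1000.0

for name, bx in budget_bx.items():
    print(f"{name:32s} {bx:4d} bx  {bx * BX_NS:7.1f} ns")
print(f"{'Total':32s} {total_bx:4d} bx  {total_us:.1f} us")
```

The total reproduces the 128 bx (3.2 $\mu$s) envelope quoted in the introduction.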

### 17.2.2 L1 Latency Components

#### L1 subsystems latency

The latency of the L1 trigger subsystems was discussed in detail in the relevant subsystem chapters. Figure 17.2 provides a summary of the trigger latency. Table 17.2 lists individual items contributing to the subsystems latency.



**Fig. 17.2:** Trigger latency chart

**Table 17.2:** Summary of L1 trigger subsystems latencies.

| Trigger Sub-system                                      | Latency    |
|---------------------------------------------------------|------------|
| Calorimeter Trigger                                     |            |
| VFE and detector link (90m)                             | 23 bx      |
| Primitive formation and transmission to CRT (20 m)      | 23 bx      |
| Regional trigger and transmission to GCT (20m)          | 24 bx      |
| Global calorimeter trigger and transmission to GT (GMT) | 15 (5) bx  |
| Total at GT (GMT) input                                 | 85 (75) bx |
| RPC Trigger                                             |            |
| On-detector sync and compression                        | 8 bx       |
| Transmission (90 m)                                     | 18 bx      |
| Splitters and trigger board                             | 29 bx      |
| Sorting tree and transmission to GMT                    | 24 bx      |
| Global Muon Trigger                                     | 10 bx      |
| Total at GT (GMT) input                                 | 89 (79) bx |
| CSC Trigger                                             |            |
| Comparator signals at CLCT/TMB cards input              | 13 bx      |
| Anode and cathode LCTs found and combined               | 15.5 bx    |
| Port card processing                                    | 5 bx       |
| Optical link (90m)                                      | 18 bx      |
| SR and SP processing                                    | 16.5 bx    |
| CSC muon sorting and transmission to GMT                | 10 bx      |
| Global Muon Trigger                                     | 10 bx      |
| Total at GT (GMT) input                                 | 88 (78) bx |

**Table 17.2 (Cont.):** Summary of L1 trigger subsystems latencies.

| Trigger Sub-system                                  | Latency    |
|-----------------------------------------------------|------------|
| DT Trigger                                          |            |
| Drift time and front-end electronics                | 18 bx      |
| BTI-TRACO-TSS-TSM                                   | 13 bx      |
| Chamber link, sector collector and sector link      | 23 bx      |
| Sector receiver, extrapolator and quality sorter    | 7 bx       |
| Track assembler, parameter assignment, wedge sorter | 13 bx      |
| DT muon sorter and transmission to GMT              | 7 bx       |
| Global Muon Trigger                                 | 10 bx      |
| Total at GT (GMT) input                             | 91 (81) bx |
| Global Trigger                                      | 8 bx       |
| Trigger Control System and TTCvi                    | 4 bx       |
| Link to Detector (90 m)                             | 18 bx      |
| Receiver TTCrx and local transmission               | 6 bx       |
| TOTAL                                               | 127 bx     |
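As a consistency check on Table 17.2, the following sketch sums the individual items per subsystem, finds the subsystem that arrives last at the Global Trigger input, and adds the items downstream of that input. The numbers are copied from the table; the data layout and names are illustrative assumptions.

```python
# Per-subsystem latency items from Table 17.2, in bunch crossings (bx).
# Each list ends with the item that brings the data to the Global Trigger input.
subsystems = {
    "Calorimeter": [23, 23, 24, 15],
    "RPC": [8, 18, 29, 24, 10],
    "CSC": [13, 15.5, 5, 18, 16.5, 10, 10],
    "DT": [18, 13, 23, 7, 13, 7, 10],
}

# Items after the Global Trigger input: Global Trigger, TCS and TTCvi,
# link to detector (90 m), TTCrx and local transmission.
after_gt_input = [8, 4, 18, 6]

# Latency of each subsystem at the Global Trigger input.
at_gt = {name: sum(items) for name, items in subsystems.items()}

# The slowest subsystem fixes the overall L1 latency.
slowest = max(at_gt, key=at_gt.get)
total = at_gt[slowest] + sum(after_gt_input)

for name, bx in at_gt.items():
    print(f"{name:12s} {bx:5.1f} bx at GT input")
print(f"Latest: {slowest}; total L1 latency: {total} bx")
```

The sums agree with the totals listed in the table for each subsystem and with the overall total.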

#### Cable latency

The trigger cable length was studied in detail by the CMS Integration group. This study used a CAD model of the CMS detector where the cable layout was simulated. Figure 17.3 illustrates the cable layout on the endcap detector. The three iron yoke endcap disks can be opened by 3 m between any two disks without disconnecting cables. Figure 17.4 illustrates the cable path to the Forward Hadron Calorimeter. As shown, the detector can be opened without disconnecting the trigger cables. Finally, Figure 17.5 shows the special cable path between the experimental hall and the counting room, for the barrel cables. Similar cable paths exist for the two endcaps.

Table 17.3 shows the minimum and the maximum distances to the entrance of Counting Room USC55 from specific detector points. These values are smaller than the 90 m cable length assumed in Table 17.2.

**Table 17.3:** Distances to entrance of USC55 from specific detector points.

| Location | EB/1<br>(+z face) | EB/1<br>(-z face) | SE/1 | YE/2 | YE/3 |
|----------|-------------------|-------------------|------|------|------|
| Min (m)  | 50.7              | 48.1              | 58.9 | 65.4 | 65.1 |
| Max (m)  | 61.6              | 59.1              | 79.1 | 85.6 | 87.9 |

**Fig. 17.3:** Cable path in the Calorimeters and Muon detectors endcaps.



**Fig. 17.4:** Cable path to the Forward Hadron Calorimeter.



**Fig. 17.5:** Cable path from the barrel detector to the Counting Room.

#### Overall L1 Latency

The latency of the Calorimeter Trigger is estimated at 85 bx, including 30 bx for data transmission at various levels. The Drift Tube Trigger, with a latency of 91 bx, including the drift time, the cable paths and the global muon trigger, determines the latest arrival at the Global Trigger input.

The GT issues a Level 1 Trigger Accept/Reject at its output 99 bunch crossings after the event occurred. The distribution of the L1A signal back to the detector is estimated at 28 bx, including 4 bx in the TCS/TTCvi electronics, 18 bx in the cable path and 6 bx in the TTCrx chip and local transmission. Adding up all latency components, the present estimation of the L1 latency is 127 bunch crossings. Assuming 10% contingency both in the cable lengths and in the pipeline electronics, we conclude that the L1 trigger latency will be smaller than 140 bx.

This latency is much smaller than the capacity of the digital pipeline buffers used in the pixel detector, calorimeters and muon detectors. The tracker analog buffer in the APV25 chip (0.25 $\mu$m CMOS technology) has 192 buffer cells, of which 24 cells are used for post Level-1 buffers (8 events x 3 cells/event). This leaves 168 cells that can be used for pipeline storage during the L1 latency and for timing measurements (shifting time). The present version of the Preshower front-end chip (DMILL technology) has 160 buffer cells. Assuming the same storage for de-randomizing buffers (24 cells), 136 cells remain for pipeline storage, and we conclude that the present preshower pipeline length is too close to the L1 latency. There is an ongoing effort to evaluate the possibility of enlarging the preshower front-end buffer length by 5-10%.
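The buffer margins discussed above can be made explicit with a short calculation. The sketch below uses the cell counts and latency figures quoted in the text; the assumption that an enlarged preshower buffer would keep the same 24 de-randomizing cells is made here for illustration only.

```python
L1_LATENCY_BX = 127              # present latency estimate
L1_LATENCY_CONTINGENCY_BX = 140  # upper bound with 10% contingency
DERANDOMIZER_CELLS = 24          # 8 events x 3 cells/event


def pipeline_cells(total_cells):
    """Cells left for pipeline storage after the post-L1 buffers."""
    return total_cells - DERANDOMIZER_CELLS


def margins(total_cells):
    """Margin (in cells) against the estimate and against the 140 bx bound."""
    usable = pipeline_cells(total_cells)
    return usable - L1_LATENCY_BX, usable - L1_LATENCY_CONTINGENCY_BX


chips = {"APV25 (tracker)": 192, "Preshower (present)": 160}
# Hypothetical enlarged preshower buffers (+5% and +10%), integer arithmetic.
for pct in (5, 10):
    chips[f"Preshower (+{pct}%)"] = 160 * (100 + pct) // 100

for name, cells in chips.items():
    nominal, worst = margins(cells)
    print(f"{name:22s} {cells:4d} cells  margin {nominal:+d} / {worst:+d} bx")
```

With the present 160 cells the margin against the 140 bx bound is negative, which is why an enlargement is being evaluated.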

## 17.3 Overall Trigger Pipeline Alignment

### 17.3.1 Synchronization of the Detector Signals with the Clock

All level 1 trigger and DAQ pipelines must be driven with a common clock synchronized to the LHC crossing frequency. This clock, distributed centrally via the TTC system, is phase locked to the LHC machine clock and has a 25 ns cycle. The processing of the Level 1 decision is driven by this cycle.

The phase difference between the LHC 40 MHz clock and the arrival of detector signals from collisions at the front-end electronics must be determined, adjusted for and monitored. The methods used to convert the analog detector pulses to digital information are detector dependent but all rely on the clock signal. The determination of pulse amplitude and timing, in particular the assignment of pulses to bunch crossings, depends critically on the clock phase initial adjustment and stability.

#### Requirements on Clock Jitter

The individual subdetectors have different requirements on the jitter and long-term phase drift of the TTC clock at their receipt at the front-end electronics (Table 17.4).

The tracker requires a jitter of less than 0.5 - 1.0 ns. This is most critical in the deconvolution mode readout. The clock drives a PLL on the Front End, which regenerates locally the clock and level 1 accept with the required jitter.

The Pixel system requires the clock jitter and drift to be less than  $\pm 2$  ns. Degradation in resolution occurs if the jitter exceeds 5 ns and the wrong time stamp can occur if the jitter exceeds 10 ns.

**Table 17.4:** Requirements on clock jitter and phase adjustments

| CMS sub-system   | Clock jitter<br>(ns) | Clock phase<br>adjustment<br>step (ns) |
|------------------|----------------------|----------------------------------------|
| Pixel Detector   | < 2                  | 1-2                                    |
| Tracker          | < 0.5 - 1            | 1                                      |
| ECAL             | < 0.25               | 1                                      |
| Preshower        | < 1                  | 1                                      |
| HCAL             | < 2                  | 1                                      |
| Muon Drift Tubes | < 1                  | 1                                      |
| CSC              | < 2                  | 0.5                                    |
| RPC              | < 1                  | 1                                      |

The charge measurement accuracy of the preshower is approximately 5%. The preshower charge integration in 3 time slots is insensitive to jitter at the ns level. Simulation of voltage sampling with three samples shows that a 1 ns jitter from one sample to the next gives an error of about 2%.

The ECAL requires a clock jitter with a sigma smaller than 250 ps in order to keep the contribution to the energy resolution below 0.2%. Drifts of the clock phase up to  $\pm 2$  ns are acceptable. The HCAL requires the jitter and drift to be less than  $\pm 4$  ns.

The RPC system requires the clock jitter and drift to be less than 1 ns since 2 ns degrades the trigger performance and 5 ns may cause a failure (i.e. wrong bunch crossing). The CSC system requires clock jitter and drift to be less than 2 ns. The CSC system is designed with 1 ns clock precision.

#### Requirements on Fine Adjustments of the Clock

The individual subsystems have different requirements on the precision of adjustment, or “step size”, of the clock within the space of one bunch crossing (Table 17.4). The tracker uses clocks derived from the TTCrx on the FED. These are tunable in steps of 0.5 ns over a 25 ns range. The tracker front end boards have a PLL that provides a fine adjustment of the clock in steps of 1 ns in the 25 ns range. The Pixel system requires a fine adjustment of 1 - 2 ns. The preshower requires a 1 ns step size since this is the step size of its PLL. The ECAL and RPC systems also require a fine adjustment of 1 ns. The CSC system requests a 0.5 ns step size on the clock adjustment.

#### Clock Phase Adjustments and Monitoring

The Pixel front-end clock phase will be adjusted by looking at the average pixel cluster size. This must be done only with real particles, which will require special calibration runs with beam. The Pixel ADC clock can be adjusted without beam during machine filling times. The Pixel phase will be monitored using the average pixel cluster size for isolated tracks. The Pixel FED/ADC will monitor the position and amplitude of markers, which appear at fixed positions in the data packets.

The ECAL constantly monitors the stability of the clock phase by analyzing the waveform of sampled events. The HCAL plans individual channel phase trims using the channel control ASIC in a 32 ns range with 1 ns steps. The HCAL plans monitoring by reading out 5 time slices and histogramming the charge sharing. The RPC system requires clock phase adjustment and monitors it with histograms read out by slow control.

### 17.3.2 Bunch Crossing Assignment of Trigger Primitives

The trigger primitive data (amplitude or pattern) for each trigger channel, and for each crossing, are transmitted to the regional trigger logic in digital form, the signals having a width equal to the LHC clock period (approximately 25 ns). These data are transmitted for every crossing, synchronously with the clock, even though for most crossings they will be zero.

By bunch crossing assignment we mean the process that assigns the trigger primitive digital data to a certain clock cycle. In most cases, this process involves the treatment of detector pulses that span several crossings. The performance of these algorithms is crucial for the trigger synchronization. The main requirement is that the offset between the time of a given beam crossing and the generation of the corresponding trigger primitives is constant.

#### Calorimeters

In the calorimeter trigger path, the L1 Filter processes energy sums in order to assign a pulse to a clock tick. The L1 Filter performs a weighted sum of five consecutive pulse samples and in parallel searches for a peak in the filtered pulse. Samples outside the peak are set to zero (see Chapter 4).
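The principle of such a filter followed by a peak finder can be sketched as follows. This is a simplified software illustration, not the hardware implementation described in Chapter 4: the weights, the sample values and the peak criterion (a strict rise followed by a non-rise) are all hypothetical choices made for this example.

```python
def l1_filter(samples, weights):
    """Weighted sum of len(weights) consecutive samples, then a peak finder.

    Returns a list of the same length as `samples` in which only the
    filtered values at local maxima are kept; all others are zero.
    """
    n = len(weights)
    filtered = [0.0] * len(samples)
    # filtered[i] combines samples[i-n+1 .. i]; earlier entries stay at zero
    # because a full window of n samples is not yet available.
    for i in range(n - 1, len(samples)):
        window = samples[i - n + 1 : i + 1]
        filtered[i] = sum(w * s for w, s in zip(weights, window))

    # Peak finder: keep a value only if it exceeds its predecessor and is
    # not exceeded by its successor; everything else is set to zero.
    output = [0.0] * len(filtered)
    for i in range(1, len(filtered) - 1):
        if filtered[i] > filtered[i - 1] and filtered[i] >= filtered[i + 1]:
            output[i] = filtered[i]
    return output


# Hypothetical pulse spanning several 25 ns samples, and hypothetical weights.
pulse = [0, 0, 0, 2, 10, 7, 4, 2, 1, 0, 0, 0]
weights = [0.1, 0.2, 0.4, 0.2, 0.1]

result = l1_filter(pulse, weights)
peaks = [i for i, v in enumerate(result) if v != 0.0]
print("non-zero output at sample index:", peaks)
```

In this example the single non-zero output appears a fixed number of samples after the pulse maximum. That offset is a property of the filter length and weights, so it is the same for every pulse of this shape.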

The efficiency of clock tick assignment depends on the relative amplitude of signal and noise. In the ECAL case, supposing a noise level of 30 MeV per crystal, the efficiency is 99% for an energy sum of 600 MeV. For the same noise, the efficiency is 100% for energy sums above 1 GeV. If the noise increases to 60 MeV the efficiency is 99% for an energy sum of 1.3 GeV. The efficiency is also affected by pile-up. For a signal of 5 GeV the efficiency remains 100% for pile-up energies up to 1.5 GeV.

#### RPC

In the RPCs, the total signal propagation time in the upstream part  $t_{\text{up}}$  before the Synchronization Unit, which shapes the detector signals and aligns them with the LHC clock, has four components:

$$t_{\text{up}} = t_{\text{flight}} + t_{\text{RPC}} + t_{\text{propag}} + t_{\text{preamp}}$$

The time of flight  $t_{\text{flight}}$  is different for different chambers. It varies from  $4\,\text{m}/c = 13\,\text{ns}$  for station MB1, to  $12.6\,\text{m}/c = 42\,\text{ns}$ . More important is the variation within one chamber. The strip length may cause a flight path variation  $\Delta t_{\text{flight}} = 3.5\,\text{ns}$ . The differences between various chambers can be corrected by adjusting the length of cables or the electronics delay. The variation within one chamber cannot be corrected and should be considered as a random jitter.

The second contribution is the time of intrinsic RPC phenomena, denoted all together by  $t_{\text{RPC}}$ . It has a quasi-Gaussian distribution with  $\sigma = 1-5\,\text{ns}$ . The third contribution is the signal propagation along the strip,  $t_{\text{propag}}$ . The propagation time varies from 0 to max  $t_{\text{propag}}$ . One cannot correct for it on-line, and from the trigger point of view it should be considered as having an approximately flat random distribution. In the worst case of MB/0/1 the combined variation is  $\Delta(t_{\text{flight}} + t_{\text{propag}}) = 5.7\,\text{ns}$ .

The contribution from preamplifier and discriminator jitter  $\Delta t_{\text{preamp}}$  is usually much smaller than 1 ns. The total jitter of the upstream part  $\Delta t_{\text{up}}$  must be lower than 25 ns in order to recognize the bunch crossing. We have seen that it has two major contributions:  $\Delta(t_{\text{flight}} + t_{\text{propag}})$  and  $\Delta t_{\text{RPC}}$ . Assuming 1-3 ns for the setup time of the synchronization electronics and taking the worst case of  $\Delta(t_{\text{flight}} + t_{\text{propag}}) = 5.7\,\text{ns}$ , one gets 15-18 ns remaining for  $\Delta t_{\text{RPC}}$ . In the case of a Gaussian distribution this would correspond to  $\sigma_{\text{RPC}} < 3.0-3.5\,\text{ns}$  with 99% efficiency. This requirement is fulfilled by recently tested RPC prototypes (see Chapter 13).
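The step from the remaining time window to the limit on $\sigma_{\text{RPC}}$ can be reproduced numerically. The sketch below assumes that "99% efficiency" means 99% of a Gaussian distribution falls inside a window centred on its mean; this interpretation, and the function name, are assumptions made here.

```python
from statistics import NormalDist


def max_sigma_ns(window_ns, efficiency=0.99):
    """Largest Gaussian sigma for which `efficiency` of the distribution
    fits inside a centred window of total width `window_ns`."""
    # Two-sided containment: the window spans +/- z sigma around the mean.
    z = NormalDist().inv_cdf(0.5 + efficiency / 2.0)
    return window_ns / (2.0 * z)


# Remaining windows for the intrinsic RPC jitter quoted in the text.
for window in (15.0, 18.0):
    print(f"window {window:4.1f} ns -> sigma_RPC < {max_sigma_ns(window):.2f} ns")
```

Under this assumption the 15-18 ns window corresponds to limits of roughly 2.9 to 3.5 ns, in line with the 3.0-3.5 ns range quoted above.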

## CSC

The timing structure of the CSC system has several components similar to those of RPC. The difference is that the chamber response time  $t_{\text{CSC}}$  includes drift time, which makes the  $t_{\text{CSC}}$  distribution as wide as 50-70ns at the base. Because of that the bunch crossing identification is more difficult. Apart from that the synchronization procedures are similar to those of the RPC PACT.

The local trigger is based on a coincidence of at least 4 out of 6 layers within a 75ns gate. The bunch crossing is identified by the second (in time) hit of those contributing to the coincidence. Prototype tests indicate that the distribution of the second hit arrival time is fully contained within 20ns (see Chapter 11). Assuming 1-3ns for the setup time of the electronics, about 2-4ns remain for the phase adjustment precision.
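The second-hit rule can be illustrated with a short Python sketch. It is a simplified software model, not the chamber electronics: hit times are taken as plain numbers in ns relative to an arbitrary clock edge, and the function name `local_trigger_bx` and its arguments are hypothetical.

```python
BX_NS = 25.0  # nominal bunch-crossing period used for binning


def local_trigger_bx(hit_times_ns, min_layers=4, gate_ns=75.0):
    """Return the bunch-crossing index tagged by the second hit, or None.

    hit_times_ns holds one entry per layer: the arrival time in ns, or None
    if the layer has no hit. A coincidence needs `min_layers` hits whose
    times all fall inside one gate of `gate_ns`.
    """
    times = sorted(t for t in hit_times_ns if t is not None)
    for first in range(len(times) - min_layers + 1):
        last = first + min_layers - 1
        if times[last] - times[first] <= gate_ns:
            second_hit = times[first + 1]  # second hit of the coincidence
            return int(second_hit // BX_NS)
    return None


print(local_trigger_bx([30.0, 42.0, 55.0, 70.0, None, None]))
```

In this example four layers fire within 40ns, and the crossing is taken from the hit at 42ns.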

In the DAQ path the anode (wire) signals are discriminated, whereas the cathode (strip) signals are sampled 8 times with 50ns step. Because the signal can arrive at any clock phase due to the long drift time, there is no requirement on the phase adjustment precision.

## Drift Tubes

Requirements for Drift Tubes are similar to those of CSC. Here the drift time is even longer, about 400ns. The bunch crossing recognition is performed by the Bunch and Track Identifier (BTI) circuit, using a generalized meantimer technique (see Chapter 9). This method relies on the clock phase adjustment relative to the incoming data. The required precision is about 5ns. In order to determine the delay one has to find the maximum of the BTI efficiency for real muons.

There is no requirement on phase adjustment at the Front End in the DAQ path. The signals are digitized by TDC, so the exact time can be reconstructed off-line.

### 17.3.3 Trigger Subsystems Alignment

The basic architecture for the Level 1 trigger is that of a fully pipelined structure with a 25 ns clock. The result is a complex structure in which raw trigger data flows to the Level 1 Trigger Logic from a number of front-end subsystems, each at a different offset with respect to the absolute bunch phase. Each trigger decision subsystem in turn has its own offset with respect to the front-end data as well as the other trigger decision subsystems. At the Global Level 1 Trigger, the remaining offsets between trigger decision data streams are reconciled to a single offset.

**Table 17.5:** Programmable delays and synchronization FIFOs in the L1 trigger pipeline

| Trigger Sub-system  | Component     | Function                                   | Location                             |
|---------------------|---------------|--------------------------------------------|--------------------------------------|
| Calorimeter Trigger |               |                                            |                                      |
| TPG                 | Channel delay | Sync to local clock; alignment registers   | Calorimeter readout & trigger boards |
|                     | Sync FPGA     | Overall alignment of trigger primitives    | Calorimeter readout & trigger boards |
| RCT                 | Phase ASIC    | Sync to local clock; phase adjustment FIFO | Receiver Card                        |
| GCT                 | Sync FPGA     | Sync to local clock; FIFO                  | Input Module                         |
| DT Trigger          |               |                                            |                                      |
| DTTF                | Sync Pipeline | Sync to local clock; alignment FIFO        | Sector Receiver Unit                 |
| CSC Trigger         |               |                                            |                                      |
| CSC LT              | Sync Pipeline | Sync to local clock; alignment pipeline    | Muon Port Card                       |
| CSC TF              | Front FPGA    | Sync to local clock; alignment pipeline    | Sector Receiver Card                 |
| RPC Trigger         |               |                                            |                                      |
| Link Board          | Sync Unit     | BC identification; alignment buffer        | On-detector board                    |
| RPC Trigger Crate   | FPGA          | Sync to local clock; alignment pipeline    | Trigger Board                        |
| Global Muon Trigger | Sync FPGA     | Synchronization pipeline                   | GMT Boards                           |
| Global Trigger      | Sync FPGA     | Sync to local clock; alignment FIFO        | PSB Boards                           |

The Level 1 Trigger logic units have the capability to provide a programmable multi-clock buffer delay on data that they transmit to or receive from other logic. This delay is necessary to compensate for the different inherent processing latencies in the different logical units and for different cable lengths. With these capabilities, it is possible to adjust the timing delays of convergent data streams as necessary to guarantee the proper alignment of data for trigger decision calculations. A summary of the alignment stages in the trigger system is given in Table 17.5.

When the signals are sent to another board they might have a constant shift in phase with respect to the local clock at the destination. The rule to be followed is to synchronize the phase of the signal at the destination to the local clock.

The bunch crossing number assigned to trigger primitive data allows the alignment of trigger data transmitted between subsystems to be checked. For this purpose, a BC0 flag that marks the beginning of each orbit is transmitted along with the trigger data. At the input of a subsystem the BC0 flags of the different input channels are checked for consistency.
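A minimal Python sketch of such a consistency check is given below. It assumes that each input channel has been captured over the same window of clock ticks and contains exactly one BC0 flag; the majority position is taken as the reference. The function name `bc0_offsets` and the data layout are illustrative assumptions.

```python
from collections import Counter


def bc0_offsets(bc0_streams):
    """Return {channel: offset in ticks} for channels whose BC0 is misplaced.

    bc0_streams maps a channel name to a list of 0/1 BC0 flags, one per
    clock tick. Assumption: every list contains at least one flag set to 1.
    """
    positions = {ch: flags.index(1) for ch, flags in bc0_streams.items()}
    # The most common BC0 position is used as the reference.
    reference, _ = Counter(positions.values()).most_common(1)[0]
    return {ch: pos - reference for ch, pos in positions.items() if pos != reference}


streams = {
    "ch0": [0, 0, 1, 0, 0],
    "ch1": [0, 0, 1, 0, 0],
    "ch2": [0, 0, 0, 1, 0],  # arrives one tick late
}
print(bc0_offsets(streams))
```

A non-empty result points to the channels whose programmable delay needs to be changed, and by how many ticks.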

### 17.3.4 Alignment of TTC

Offsets exist between individual subdetector crates for the distribution of TTC data. These offsets reflect mainly the difference in cable interconnection lengths between those crates on the detector and those upstairs. Thus, each crate is assigned offsets which reflect both its position in the trigger decision pipeline as well as its distance from the global Trigger Control.

The TTC system has various adjustable delays. At the top of the TTC partition, the TTCvi module allows programmable delays to be set on all fast commands except the L1A signal, which is excluded for latency reasons. In particular, these delays allow the timing of the BC0 command, as well as of the Calibration and Reset commands distributed to the front-ends, to be adjusted globally.

On the front-ends, the TTCrx receiver provides for fine adjustment of the clock phase and for coarse adjustment (in bx steps) of the L1A, BC0 and other fast commands. Some subdetectors (e.g. Tracker, ECAL) send the clock and L1A signals received on the counting room readout boards by the TTCrx to the detector front-ends using dedicated links. Procedures for adjusting these delays are described in Section 17.6.

### 17.3.5 Global Alignment

Trigger subsystems ship trigger data to the GL1T every 25 nsec. All of the data relevant to one crossing is shipped in one 25 nsec time period. All subsystems send their data to the GT so that it arrives at the GT input within 2.3  $\mu$ sec after the crossing occurred at the interaction point. The subsystems send data continuously even if they are being read out. The GT receives the trigger data and aligns this data in time for all the subsystems since the arrival time for the data of a particular crossing is different among the different subsystems.

## 17.4 Alignment with LHC Crossings

The LHC frequency of 40.08 MHz corresponds to a 24.95ns period. One LHC orbit consists of 3564 periods. They are often called “bunches” although some of them do not contain protons. The proton bunches are grouped in 39 trains, 72 bunches each (Fig. 17.6). The structure of gaps between them can be used for the absolute synchronization. The main gap is 3  $\mu$ s long (119 missing bunches).

**Fig. 17.6:** LHC proton beam structure

At the LHC start the machine will operate with proton bunches 75 ns apart. This operation condition is well suited to establish the trigger and readout synchronization.

The heavy ion beams have a different bunch structure, which can be represented by the formula  $3\times[3\times(13\times 4b+7e)+e+0.2e]+[2\times(13\times 4b+7e)+(9\times 4b+24e+0.2e)]=712.8$ , where  $b$  denotes occupied bunches and  $e$  empty ones, in units of 124.75 ns. Thus, one orbit consists of 608 bunches spaced 124.75 ns apart, grouped in 11 trains with 52 bunches each, and one train with 36 bunches. The spacing between trains is of the order of 1  $\mu$ s and the main gap is 3.125  $\mu$ s long (24 missing bunches). During heavy ion runs the basic 40.08 MHz clock is still available. The bunch spacing is five times the clock period.
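The arithmetic of the two bunch structures can be verified with a few lines of Python. The sketch transcribes the formula directly, reading each  $13\times 4b+7e$  term as 52 occupied slots followed by 7 empty ones; the variable and function names are illustrative.

```python
import math

PROTON_PERIOD_NS = 24.95   # one LHC clock period
PERIODS_PER_ORBIT = 3564
ION_SLOT_NS = 124.75       # ion bunch spacing, five clock periods


def train_slots(groups):
    """Slots in one train: `groups` x 4 occupied slots followed by 7 empty ones."""
    return groups * 4 + 7


# Direct transcription of the heavy ion formula, in units of 124.75 ns.
ion_slots = 3 * (3 * train_slots(13) + 1 + 0.2) + (
    2 * train_slots(13) + (9 * 4 + 24 + 0.2)
)
ion_bunches = (3 * 3 + 2) * 13 * 4 + 9 * 4   # 11 trains of 52 plus one of 36
proton_bunches = 39 * 72                     # 39 trains of 72 bunches

orbit_ns_protons = PERIODS_PER_ORBIT * PROTON_PERIOD_NS
orbit_ns_ions = ion_slots * ION_SLOT_NS

print(ion_bunches, proton_bunches)
print(math.isclose(orbit_ns_protons, orbit_ns_ions, rel_tol=1e-9))
```

Both descriptions give the same orbit length of about 88.92  $\mu$ s, which confirms that the 712.8 slots of 124.75 ns fill exactly one orbit of 3564 clock periods.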

#### 17.4.1 Histograms of Occupancy per Bunch

The absolute synchronization of trigger and DAQ data is based on the identification of the LHC bunch structure. Histograms of the occupancy (per channel or group of channels) per bunch crossing number are used for this purpose. Empty bins in the histogram shall be made to correspond to the gaps of the LHC beam structure.

#### Statistics

In the calorimeters, the histograms are incremented at the LHC frequency using dedicated synchronization circuits in the ECAL and HCAL readout and trigger primitive boards. Each circuit receives the output of the TPG for two trigger towers and increments the histogram if the energy in one of the two towers is above a given threshold. The histogramming energy threshold is applied after L1 filtering so that the histograms show clean gap boundaries. The bunch crossing number is reset by the TTC fast command BC0, synchronous with the beginning of each LHC orbit.

The content of the histogram is read by the crate CPU, where the correlation function between the data and the expected bunch profile is computed. Misalignments are compensated by a corresponding number of steps in the synchronization FIFOs (see Chapter 4).
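The correlation step can be sketched in Python as follows. This is an illustrative software model under simple assumptions: the histogram and the expected profile have the same length (one bin per crossing in the orbit), the correlation is circular, and the shift with the largest correlation is taken as the misalignment. The function name `best_shift` is hypothetical.

```python
def best_shift(histogram, expected):
    """Circular shift (in bx) by which `histogram` lags `expected`.

    Evaluates the circular cross-correlation for every possible shift and
    returns the shift with the largest value (the smallest one on a tie).
    """
    n = len(expected)

    def correlation(shift):
        return sum(histogram[(i + shift) % n] * expected[i] for i in range(n))

    return max(range(n), key=correlation)


# Toy orbit of 10 crossings: 1 = filled, 0 = gap.
expected = [1, 1, 1, 0, 0, 1, 1, 0, 0, 0]
# Measured occupancy: same profile, delayed by 3 crossings, 10 counts per bin.
measured = [10 * expected[(j - 3) % 10] for j in range(10)]
print(best_shift(measured, expected))
```

The returned shift corresponds to the number of steps by which the synchronization FIFO has to be adjusted. For a full orbit of 3564 bins this brute-force loop is slow in pure Python, but adequate as an illustration.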

The threshold for histogram incrementing is set at 1 GeV, in order to guarantee efficient clock tick assignment by the L1 Filter. At this threshold the crystal occupancy is around  $10^{-5}$  for minimum bias collisions. Requiring 10 counts per bin in the histogram and assuming one collision per crossing ( $L=10^{33}\text{cm}^{-2}\text{s}^{-1}$ ) we estimate that the determination of the alignment constants for all the ECAL channels will take about 2 hours of beam time.

The rate of muons, especially in the barrel, is very low. On average less than 1 muon per 1000 pp interactions enters the first barrel station MB1 and a few times fewer enter the last one, MB4. At a luminosity of  $10^{33}\text{cm}^{-2}\text{s}^{-1}$  a muon rate of about  $6\text{ Hz/m}^2$  is expected at MB4. The area of an MB4 RPC is  $1.28 \times 3.75\text{m}^2 \approx 5\text{m}^2$ . Similarly, the area of the smallest Drift Tube chamber at MB4 (the one in the CMS leg) is  $2.52 \times 2.0\text{m}^2 \approx 5\text{m}^2$ . This gives  $\sim 30\text{Hz}$  per chamber, i.e.  $\sim 1800$  muons / minute / chamber.

In the endcap the rate expressed in  $\text{Hz/m}^2$  varies rather fast with rapidity. Therefore, it is more useful to quote the rate per  $\eta$ -unit. The lowest rate will be seen by the CSC ME1/3, which is  $10^\circ$  wide and covers  $\eta=0.88-1.14$ . In this region the muon rate is about  $4 \cdot 10^4\text{Hz}/\eta\text{-unit}$ , which results in  $300\text{Hz}/\text{chamber}$ . This is 10 times higher than in the case of RPC and Drift Tubes, therefore we consider only the RPC/DT rate, as the worst case.

Absolute synchronization can be done only with muons from bunches that are isolated (at least on one side). Running with only one bunch in the machine would reduce the luminosity by a factor of  $\sim 5000$  to the level of  $2 \cdot 10^{29}\text{cm}^{-2}\text{s}^{-1}$ . In such a case only  $20\mu/\text{hour}/\text{chamber}$  are expected, i.e. 50 hours are needed to collect  $1000\mu/\text{chamber}$ .

A much better solution is to run with the full available luminosity, e.g.  $10^{33}\text{cm}^{-2}\text{s}^{-1}$ , and make use of the LHC bunch structure. For the absolute synchronization only the first or last bunch in each train can be used. Hence  $\sim 1\%$  of the muons are useful. This gives an effective rate of  $0.3\text{Hz}/\text{chamber}$ , i.e.  $1000\mu/\text{hour}/\text{chamber}$ , which is a reasonable rate.

The most effective configuration should have single bunches of protons separated by several “empty bunches”. For example, 1 bunch of protons followed by 4 “empty bunches” would give a muon rate of  $\sim 6\text{Hz}$  per chamber, i.e.  $\sim 360\mu/\text{min}/\text{chamber}$ . In such a case 3-5 minutes would be enough to collect reasonable statistics.
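The three running scenarios discussed above can be compared with a short Python calculation. It uses only the numbers quoted in the text (about 6 Hz/m², about 5 m² per chamber, and the useful fraction of muons in each scenario); the names are illustrative and the target of 1000 muons per chamber is the one used above.

```python
RATE_HZ_PER_M2 = 6.0      # muon rate at MB4 at 1e33 cm^-2 s^-1
CHAMBER_AREA_M2 = 5.0     # approximate area of an MB4 chamber
FULL_RATE_HZ = RATE_HZ_PER_M2 * CHAMBER_AREA_M2  # about 30 Hz per chamber


def muons_per_hour(useful_fraction):
    """Useful muons per hour per chamber for a given useful fraction."""
    return FULL_RATE_HZ * useful_fraction * 3600.0


scenarios = {
    "single bunch in the machine": 1.0 / 5000.0,
    "train edges at full luminosity": 0.01,
    "1 filled bunch + 4 empty": 1.0 / 5.0,
}

for name, fraction in scenarios.items():
    per_hour = muons_per_hour(fraction)
    hours_for_1000 = 1000.0 / per_hour
    print(f"{name}: {per_hour:.0f} muons/hour, {hours_for_1000:.2f} h for 1000 muons")
```

The results agree with the rounded figures in the text: roughly 20 muons per hour for a single bunch, about 1000 per hour using train edges, and about 360 per minute for isolated bunches.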

## Background

Synchronization with muons can be disturbed by the presence of background. There are two major kinds of background to be considered:

- electrons — mainly from thermal neutron capture followed by  $\gamma$  emission and conversion,
- charged hadrons — mainly due to punch through from hadronic showers and backsplashes from the forward calorimeter (HF).

The single hit rate due to neutrons ( $n \rightarrow \gamma \rightarrow e$ ) is 1-3 orders of magnitude higher than that of muons. CSC and Drift Tubes are able to eliminate this background by local coincidence of several layers in one chamber. The relative timing of those layers is ensured by construction, so the synchronization is not affected. The case of RPC is more difficult, because there is no local coincidence within one muon station. The only place where the neutron background can be suppressed is the Pattern Comparator processor, which looks for a coincidence of at least 3 RPC planes. This implies that the RPC synchronization with real data must involve the trigger. That, in turn, means that the first iteration, done without the beam, should be precise enough to enable the trigger to work with at least 10% efficiency and thus to make the relative synchronization in <10 min.

The rate of charged hadrons is of the same order as the rate of muons. Charged hadrons can traverse several CSC or DT layers and satisfy the local trigger coincidence. However, they often come in time, so their presence helps, rather than disturbs, the synchronization process. Those created in HF can come 1 bunch crossing later. Their number, however, is too small to cause any synchronization problems.

### 17.4.2 Subdetectors Bunch Crossing Identification

For completeness, we briefly describe the algorithms that the subsystems plan to use to identify which LHC bunch crossing a specific element of data belongs to.

While the pixel system does not need to identify at the front end the LHC bunch crossing number to which a specific piece of data belongs, the FEC needs to know the end of the LHC orbit so that it can perform resets in the abort gap if necessary. The pixel FED can histogram the occupancy per crossing and communicate this to the FEC, if a fast communication path is available.

The preshower systems will histogram occupancy per crossing. Sufficient statistics are accumulated at low luminosity ( $3 \cdot 10^{32} \text{ cm}^{-2}\text{s}^{-1}$ ) in about 800 sec with a 75 kHz Level 1 trigger rate. In the calorimeters, the histograms are incremented in dedicated synchronization circuits, as described above. Masking all but one crystal channel feeding a particular TPG allows differences in time alignment to be measured on a channel-by-channel basis.

The RPC system identifies the bunch crossing by checking the RPC signals in coincidence with the selected 25 ns window. The CSC systems will histogram LCTs per crossing. Sufficient statistics are accumulated at high luminosity ( $10^{34} \text{ cm}^{-2}\text{s}^{-1}$ ) in about 3 min with a 1 kHz Level-1 trigger rate. At low luminosity ( $10^{32} \text{ cm}^{-2}\text{s}^{-1}$ ), it takes about 5 hours if the full LHC cycle is used and only 25 min if the pattern is matched to the LHC cycle sub-structure that repeats 12 times per cycle.

## 17.5 Alignment with Readout

Data in a given DAQ channel wait in the pipeline for the L1A decision. They are read out from the end of the pipeline if there was a positive L1A response. The L1 Accept corresponding to a given b.x. has to match the data from the same b.x. This can be achieved by delaying the L1 Accept or by adjusting the length of the pipeline.

### 17.5.1 Sub-detector Frontend Pipelines

The subsystems have various implementations for the pipeline-derandomizer. The tracker uses an analog pipeline with 192 locations for each channel. It uses “read” and “write” pointers regulated by the 40 MHz clock. The actual depth for event pipeline storage is 160. The tracker employs two readout modes. The deconvolution mode, used when pileup is a problem, involves 3 samples per event and the peak value mode stores only one sample. This implementation explicitly requires no triggers less than 2 bunch crossings apart.

There is no pixel front-end pipeline, but there is an 8-event buffer in the front-end readout chips (ROC). The data is removed if there is no L1A after 128 crossings. The ECAL uses a dual port memory with the read and write addresses differing by an offset that corresponds to the L1 trigger latency. The RPC system pipeline-derandomizer is placed on the readout board located in the RPC trigger crates in the counting room and is implemented as a shift register.
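The dual port memory scheme, with read and write addresses separated by the trigger latency, can be modelled in a few lines of Python. This is a behavioural sketch only: the class name `LatencyPipeline`, the depth and the latency value are illustrative assumptions, and derandomizer buffering after the pipeline is not modelled.

```python
class LatencyPipeline:
    """Circular memory whose read address trails the write address by the latency."""

    def __init__(self, depth, latency_bx):
        if not 0 < latency_bx < depth:
            raise ValueError("latency must be positive and smaller than the depth")
        self.depth = depth
        self.latency_bx = latency_bx
        self.memory = [None] * depth
        self.write_addr = 0

    def clock(self, sample, l1a=False):
        """Store this crossing's sample; on L1A return the sample from latency_bx ago."""
        self.memory[self.write_addr] = sample
        read_addr = (self.write_addr - self.latency_bx) % self.depth
        selected = self.memory[read_addr] if l1a else None
        self.write_addr = (self.write_addr + 1) % self.depth
        return selected


# Toy example: the sample stored at each crossing is the crossing number itself.
pipe = LatencyPipeline(depth=8, latency_bx=3)
outputs = [pipe.clock(bx, l1a=(bx == 5)) for bx in range(10)]
print(outputs[5])
```

With a latency of 3 crossings, the L1A arriving at crossing 5 selects the sample written at crossing 2. Changing `latency_bx` is the software analogue of adjusting the pipeline length.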

### 17.5.2 Alignment of L1A with Readout Pipelines

In the readout chain, we are concerned with the synchronization of the L1A signal with the pipeline data. The L1 trigger signal is broadcast to all channels through the same distribution system that distributes the clock (TTC system). The TTC receivers assign to each trigger a double identifier: the event number and the bunch crossing number. The event number is counted since the last event counter reset and the bunch crossing number is counted since the last BC0.

Three parameters can be adjusted to achieve pipeline synchronization: programmable delays on the L1A and BC0 signals and the pipeline length. The goal is that the trigger and the data extracted from the pipeline correspond to the same bunch crossing.

The value of the delays can be established using one of the following methods.

1. Multi-crossing readout of real data

This method is especially suitable for low occupancy detectors participating in the trigger, e.g. Calorimeters, RPC, CSC and Drift Tubes. A given detector region is read out if there was an L1 Accept caused by the data from this region. Several consecutive b.x. are read out in order to discover a possible misalignment of data with respect to the L1 Accept. High occupancy may disturb this method if the probability of having data in consecutive b.x. is high.

2. Multi-crossing readout of test data

This method is suitable for any detectors participating in the trigger, i.e. muon detectors and calorimeters. A test pattern causing a trigger is generated in a certain detector region. The data from this region, covering several consecutive b.x. are read out in order to discover a possible misalignment of data with respect to LV1 Accept.

3. Histogramming real data

This method is similar to the trigger b.x. synchronization with real data (Section 17.4). Whenever there is a trigger, the data are stored in a histogram according to the b.x. number given by the trigger. A possible misalignment can be detected by comparing the obtained histograms to the LHC bunch structure. This method can be used by any detector, not necessarily participating in the trigger, e.g. by the tracker. High occupancy is an advantage in this method, because the needed statistics can be collected faster. Background not correlated in time with the collisions (e.g. loopers in the tracker due to the magnetic field) limits the precision that is finally achieved.
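The multi-crossing readout methods (1 and 2 above) amount to locating the triggering data inside a window of consecutive crossings. A minimal Python sketch, assuming each readout window is a list of per-crossing hit counts and that the slot where the data should appear is known, is given below; the function name `l1a_offset` and the data layout are illustrative.

```python
def l1a_offset(frames, expected_index):
    """Offset in bx between the most populated slot and the expected slot.

    frames is a list of readout windows, each a list of hit counts for
    consecutive crossings, recorded on an L1 Accept caused by this region.
    A result of 0 means data and L1 Accept are aligned.
    """
    n_slots = len(frames[0])
    totals = [sum(frame[i] for frame in frames) for i in range(n_slots)]
    peak = max(range(n_slots), key=totals.__getitem__)
    return peak - expected_index


frames = [
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 0, 0],  # an out-of-time background hit
    [0, 0, 0, 1, 0],
]
print(l1a_offset(frames, expected_index=2))
```

Summing over many windows makes the result robust against occasional out-of-time hits, which is why high occupancy in consecutive crossings degrades the method.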

### 17.5.3 Bunch Number and Event Number

The particular bunch crossing number of a subsystem's data is indicated by a local bunch counter, named Bunch Number. The Bunch Number is used to check the alignment of the data from each subsystem. The subsystem also counts Level 1 triggers. This number, Level 1 Event Number, is used to check that the data read out corresponds to the correct Level 1 Accept.

At the front end, each subsystem counts the number of crossings since the last bunch-crossing zero. Each subsystem then forwards either all (preferably) or a number (if transmitting all is too much of a burden) of the lowest order bits of these counters with its data to the next level in the readout chain. The number of bits is that needed for the next level to unambiguously check that the value transmitted is consistent with its own counters. The next level then transmits a like number of bits to the level following it and so on.
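The check on the lowest order bits can be sketched in Python as follows. The number of forwarded bits and the function names are illustrative assumptions; the sketch only shows the masking and comparison performed at the next level.

```python
def forward_bits(counter, n_bits):
    """Lowest `n_bits` bits of a counter, as forwarded with the data."""
    return counter & ((1 << n_bits) - 1)


def counters_consistent(received_bits, local_counter, n_bits):
    """True if the forwarded bits agree with the receiver's own counter."""
    return received_bits == forward_bits(local_counter, n_bits)


# A front end at bunch number 3563 forwards its 4 lowest bits.
sent = forward_bits(3563, 4)
print(counters_consistent(sent, 3563, 4))  # receiver in step
print(counters_consistent(sent, 3564, 4))  # receiver one crossing off
```

With `n_bits` forwarded bits, slips that are a multiple of  $2^{n\_bits}$  crossings go unnoticed, which is why the number of bits must be large enough for the check to be unambiguous.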

This check does not need to be done for every front end, but over logical units which are operated by the same clock or are locked together by design, or can verify their relative synchronization at high frequency.

If there is an inconsistency, a single error bit can be forwarded with this data and the information logged. This error should be transmitted via fast controls to the Trigger Control. The important principle is that the checking of the synchronization is done in the same direction as the dataflow. The individual front ends do not receive the total crossing and level 1 accept numbers, but instead count these quantities and forward the phases of these counters to the subsequent levels which then check them.

#### **17.5.4 Synchronization of Event Fragments**

The event fragments are synchronized once all the necessary timing constants are adjusted correctly, so that corresponding event fragments are labeled by identical event and bunch crossing IDs; the emphasis is on the "and". This consistency is checked during the transfer of the data toward the data acquisition system (DAQ).

### **17.6 Timing Setup Procedures**

#### **17.6.1 Cable Length**

All cables and fibers, including those of the laser monitoring systems, will be measured before installation and their lengths stored in a database. The programmable timing parameters (deskewing parameters, pipeline lengths, etc.) will be initially adjusted to compensate for differences in cable lengths.

All particle and signal paths should be measured or calculated. Corresponding time differences should be compensated either by cutting the cables or by adjusting programmable electronics delays. It should not be too difficult to achieve precision better than 5ns, which corresponds to  $\sim 1\text{m}$  of cable. Special care should be taken with TTC fibres. Good knowledge of their length will facilitate synchronization with test data.

#### **17.6.2 Timing of TTC distribution**

For each subdetector partition (see Chapter 16), the TTC distribution fibers to the crates at a given level in the trigger and DAQ pipeline chain will have the same length. In this way, we guarantee that the fast commands, in particular BC0, arrive synchronously to all trigger or readout modules in a given level (e.g. TPG level).

The initial adjustment of the programmable BC0 deskewing is based on the knowledge of the cable lengths, both for data and control signals distribution, to the various levels (Front-end, TPG, Regional Trigger, Global Trigger). The goal is that the BC0 timing at different levels is synchronous with data from bunch crossing zero flowing in the trigger pipeline.

For the electronics placed in the counting room the alignment of the TTC distribution system can be directly verified using an output in the module's front panel. In the case of TTC fibers distributed to the electronics located on the detector (e.g. muon system) there is no direct way to check the timing of the TTC signals on the detector.

The final verification of the TTC timing is done with real data using the methods described in Section 17.4.

### 17.6.3 Timing of Trigger Pipeline

The timing of the trigger pipeline, that is the adjustment of the delays needed for the synchronous behavior of the trigger electronics (Table 17.5), will be made initially with test patterns.

Test patterns should be generated at the source (e.g. Front End board) and transmitted on a request broadcast by the TTC. At the destination (e.g. Trigger Processor board) they should be compared to the generated ones. Let us consider a simple example: a sequence (00100) sent through all channels. The same sequence should be observed at the destination, namely the “1” should come at the same time, defined by the bunch crossing number provided by TTC. If in one of the channels the “1” is observed e.g. one bunch crossing later, it means that this channel is delayed by one b.x. with respect to the others.
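The comparison at the destination can be illustrated with a small Python sketch. It assumes that each channel is sampled once per bunch crossing over a window of the same length as the pattern; the function name `channel_delays` and the channel labels are hypothetical.

```python
def channel_delays(received, pattern=(0, 0, 1, 0, 0)):
    """Delay of each channel in bx relative to the generated pattern.

    received maps a channel name to the bits sampled at the destination,
    one per bunch crossing. Channels where the "1" is missing get None.
    """
    expected_pos = list(pattern).index(1)
    delays = {}
    for channel, bits in received.items():
        bits = list(bits)
        delays[channel] = bits.index(1) - expected_pos if 1 in bits else None
    return delays


received = {
    "ch0": [0, 0, 1, 0, 0],  # aligned
    "ch1": [0, 0, 0, 1, 0],  # one b.x. late
    "ch2": [0, 0, 0, 0, 0],  # marker lost
}
print(channel_delays(received))
```

A non-zero delay gives directly the number of steps by which the programmable delay of that channel has to be corrected; a missing marker indicates a shift larger than the window or a faulty link.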

Test data can be used for absolute synchronization of data to the LHC clock at the destination if the absolute synchronization was already done at the source. Before using real data this can be done only approximately, taking the LHC clock and bunch crossing number provided by TTC at the Front End as a reference. The precision of this method is limited by the knowledge of all delays in the TTC network. The main purpose of this procedure is to set up the system so that it can fully operate with test data. In this way trigger hardware and algorithms can be tested before the LHC beam is available.

The generated test pattern should unambiguously mark one b.x. Let us denote by N its number given by the TTC at the source. Again one can use the sequence (00100) as an example. The “1” should be sent in bunch crossing N. The delay of the signal or the TTC clock at the destination should be adjusted in such a way that the “1” is received in b.x. N, according to the local TTC.

Once beam is on, the alignment of the trigger pipeline will be checked with real data using the methods described in Section 17.4.

### 17.6.4 Timing of Readout Pipelines

The procedures needed to obtain the synchronization between L1A and the pipeline data are detector dependent and will not be discussed here in detail. However, they follow the general principles described in Section 17.5.

Detectors participating in the trigger (calorimeters and muons) are first aligned at the level of the electronics chain. A test pattern generated at the input of the trigger boards propagates through the system and generates an L1A signal, which after distribution triggers the readout of a time frame in the readout pipelines. The position of the test pattern in the time frame gives a measurement of the trigger latency, and the pipeline length or the L1A deskewing is adjusted accordingly. This procedure is repeated for all channels, in order to check that the trigger latency is the same irrespective of the geographical origin of the trigger.

Detectors not participating in the trigger (e.g. tracker, preshower) are aligned at the level of the electronics chain using test triggers delivered by the Trigger Control System. The timing of the test pulse and corresponding L1A is adjusted, relative to BC0, in such a way that the test pulse simulates data produced at bunch crossing N. The latency of the test L1A is adjusted to reproduce the latency of the physics trigger capturing data at crossing N.

The synchronization procedures with beam are aimed at verifying the relative synchronization between system components and getting the absolute BC0 synchronization. These procedures follow three main steps:

- (a) Setting the clock and trigger primitive phases, as described in Section 17.3.
- (b) Alignment to the BC0 reference, as described in Section 17.4.
- (c) Synchronization of L1A with pipeline data, as described in Section 17.5.

## 17.7 Monitoring and Diagnostics Procedures

### 17.7.1 Data and Trigger Links

The synchronization of the detector links is monitored continuously by the deserializer circuits at reception. If the link desynchronizes, the circuit sets a flag that accompanies the data in the readout pipeline. This flag stays up until synchronization is recovered, so that all data produced during that interval are flagged.

In case of the trigger links, when a loss of synchronization is identified by the deserializer a bit is set which causes the corresponding trigger channel to be masked. Masking is also applied if the error detection logic recognizes a frame error.

In parallel the local controller at reception is requested to initiate a synchronization procedure. A command is sent to the transmitter side which instructs the serializer to send synchronization patterns during a fixed interval of time. This procedure can be executed during data taking without the need for a full system reset.

### 17.7.2 Bunch Crossing Assignment

The bunch crossing assignment is monitored continuously with the bunch profile histograms accumulated on dedicated circuits in the trigger boards or on the crate controller using spying events. The monitoring can be done channel by channel or per group of channels. Estimates of the histogram filling time are given in Section 17.4.

The correlation function between the histogram content and the expected bunch profile is computed in the crate CPU, allowing the time alignment to be monitored. Misalignments are compensated by a corresponding number of programmable steps in the synchronization FIFOs in the trigger boards.

The status of the bunch crossing assignment is stored periodically in the database. The recovery from bunch crossing mis-assignments implies an adjustment of the BC0 deskewing in the relevant TTCrx circuit or a reprogramming of the relevant synchronization pipeline length. This operation requires a system reset.

### 17.7.3 Synchronization between L1A and pipeline data

The synchronization between L1A and pipeline data is monitored with real events by verifying the matching between Global Trigger data and detector data. In addition, taking advantage of the readout of time frames,  $Z^0$  candidates triggered by the single lepton trigger will be reconstructed in 4-D space in order to check for possible misalignments between detector regions. Summary information is stored in the database periodically. The method requires that the detector is already calibrated, at least at a level of precision sufficient for good  $Z^0$  reconstruction.

The recovery of L1A-pipeline synchronization implies an adjustment of the L1A deskewing at the level of the TTCrx circuits concerned or an adjustment of pipeline lengths. This operation requires a system reset.

### 17.7.4 Synchronization of event fragments

A number of faults can cause a loss of synchronization between event fragments collected by the readout and data acquisition systems. Recovery from this loss of synchronization relies on global reset signals (L1 Reset) distributed by the TTC system (Chapter 16).

## 17.8 Operation with Test and Calibration Triggers

### 17.8.1 Special Beam Conditions

Various subsystems may need a special fill of the LHC in order to establish synchronization. A special fill would have a series of empty crossings before a single full crossing in order to clearly identify a specific point in the pipelined samples. This pattern would be repeated as much as possible. The CMS trigger could then operate only on the full crossing after each gap. These conditions will be fulfilled at LHC start, when the proton bunches will be 75 ns apart.

Special runs dedicated to the synchronization might be very useful. Such a run may differ from a normal physics run in several ways:

- a special LHC bunch structure (e.g. single, separated bunches) can be set up;
- different trigger and DAQ partitions can be run independently in order to facilitate internal synchronization of each one;
- trigger algorithms may run in a “loose” mode to collect needed statistics in a shorter time;
- DAQ partitions may run in special modes, e.g. without zero suppression, reading out several consecutive bx.

### 17.8.2 Test and Calibration Triggers

The timing of test and calibration triggers must be carefully adjusted so that it reproduces as exactly as possible the timing of physics triggers. Having this in mind, test and calibration triggers will always correspond to a well defined bunch crossing number. The test data should be generated synchronous with that bunch crossing and the test trigger shall capture the test data in the pipelines at the same bunch crossing. In this way test and calibration triggers can be issued during physics runs without any change in the timing settings.

### References

- [17.1] J. Varela, “A method for synchronization of the trigger data”, CMS NOTE/1996-011.
- [17.2] L. Berger, R. Nóbrega, J.C. da Silva, J. Varela, “Trigger synchronization circuits in CMS”, in proceedings of ‘Electronics for LHC experiments’, Oxford, Sept 97.
- [17.3] J. Varela, “Using an LHC-like Test Beam to Study the Trigger and Front-End Readout Synchronization of the CMS Detector”, CMS IN 1998-012.
- [17.4] G. Wrochna, “Synchronization of the CMS Muon Detector”, CMS CR-1998/017.
- [17.5] A. Taurok, “Phase Synchronization of Trigger Data in the CMS Level 1 Global Trigger”, CMS IN-1999/010.
- [17.6] W. Smith, “CMS Synchronization Workshop Conclusions”, CMS IN-1999/016.
- [17.7] A. Ranieri, “A New Front-End Board for RPC Detector of CMS”, CMS NOTE-1999/047.
- [17.8] J. Varela, J. C. Silva, “Technical Specifications of the ECAL Trigger Synchronization and Link Board Prototype”, CMS IN-2000/005.
- [17.9] R. Benetta et al., “Synchronization of the ECAL data”, CMS IN-2000/006.
- [17.10] R. Cirio, R. Martinelli, L. Ventura, C. Willmott, P. Zotto, “A synchronization procedure for the muon drift tubes detector”, CMS IN-2000/031.



# 18 Installation and Maintenance

## 18.1 On-Detector Electronics

The only trigger subsystem with on-detector electronics is the Muon subsystem. The DT, CSC and RPC trigger systems generate trigger primitives on the chambers (RPC and DT) or in peripheral crates on the detector outer perimeter (CSC). The installation and maintenance of these electronics are described in the CMS Muon Project Technical Design Report [18.1].

## 18.2 Counting Room Electronics



**Fig. 18.1:** CMS Underground Counting Room (USC55).

The CMS underground counting room is shown in Fig. 18.1. The trigger electronics in the counting room is placed in racks and organized in locations designed to minimize the latency caused by cable lengths, both for carrying the trigger primitive information from the detector and for bringing the L1A from the Global L1 trigger back to the front-end electronics. In general, the trigger primitive processing is done in the same crate, and often on the same board, as the readout data processing. Therefore the specification of the installation and integration of these components is found in the subsystem TDRs.

All trigger signals and readout signals used for trigger primitive generation are carried from the detector on optical fibers through the shielding wall fiber tunnels into the underground counting room. Inside the counting room, cables are routed under the floor down the rows of racks and in most cases cross between rows of racks only at the counting room walls. This arrangement allows for the optimal installation and maintenance of cables. However, in a few cases of critical latency, provision is made for small numbers of cables to run under the floor orthogonal to the rows of racks. Cables are brought into the rear of crates whenever possible to avoid blocking easy removal of individual modules in the crates.

The racks have standard hardware to permit the installation of 6U and 9U format crates and contain both water-cooled heat exchanger units and fan units, taking up 2U apiece between the crates. The locations of the crates are specified in advance and the heat exchangers and fan units are pre-installed in all of the racks, so that the crates of trigger electronics can be installed afterwards with all cooling and air handling already provided. The racks contain at most three 9U VME crates or five 6U VME crates. In some cases, fewer crates are included to reduce the heat load per rack. Maintenance of crates is performed by removing them for servicing. Electronics modules in the crates are also removed for servicing individually, using the standard practice for securing and removing VME modules.

## 18.3 Electronics Maintenance

All CMS trigger electronics is manufactured with sufficient spares to provide for 10 years of running without necessity for additional purchases of boards or components. This entails ordering sufficient spares of all boards and components during the manufacturing process. The actual number of spares ordered will depend on experience with the technology involved through examples of other large systems and prototype testing experience with the system being produced. The computation of the number of spares will include sufficient numbers to address manufacturing and testing losses, infant mortality during burn-in, and long-term repair needs. The spares calculation also considers board repairability by evaluating the feasibility of removal of such parts as BGA (Ball Grid Array) ASICs. The rubric for spares purchasing assures independence from manufacturer product line and foundry process obsolescence. This is required by the long time-scale of the CMS project.
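The ingredients of the spares computation listed above can be combined in a simple estimate. The sketch below is an illustration only: the function name `spares_needed`, the way the terms are combined, and every numerical rate in the usage example are assumptions, not values or methods taken from this report. Only the 10-year running period comes from the text.

```python
import math


def spares_needed(installed, production_yield, infant_mortality,
                  annual_failure_rate, years, repairable_fraction):
    """Estimate the number of spare boards to order (illustrative model).

    All rates are fractions between 0 and 1 and are hypothetical inputs.
    """
    # Fraction of produced boards that survive testing and burn-in.
    good_fraction = production_yield * (1.0 - infant_mortality)
    # Boards to produce so that `installed` good boards remain.
    to_produce = math.ceil(installed / good_fraction)
    # Failures expected in operation over the running period.
    failures = installed * annual_failure_rate * years
    # Only boards that cannot be repaired must be replaced from spares.
    replacements = math.ceil(failures * (1.0 - repairable_fraction))
    return (to_produce - installed) + replacements


# Hypothetical example: 100 installed boards, 10 years of running.
print(spares_needed(installed=100, production_yield=0.95,
                    infant_mortality=0.02, annual_failure_rate=0.01,
                    years=10, repairable_fraction=0.5))
```

Boards carrying parts that are hard to remove, such as BGA ASICs, would enter this model through a lower `repairable_fraction`, which raises the number of spares.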

## 18.4 Configuration Control

Since the exact configuration of the CMS trigger electronics directly impacts the physics data being taken by the detector, all circuit diagrams of the final boards, ASICs and cables installed in CMS will be stored electronically in a central CMS repository. In addition, copies of all software used to control the trigger logic will be kept in this repository.

Since the scale of programmable logic in LHC experiments is vastly greater than heretofore experienced in HEP detectors, specific steps must be taken to document the state of Field Programmable Gate Arrays (FPGAs) and other Programmable Logic Devices (PLDs). In addition, with the large number of programmable gates, the configuration of each PLD is a large file, which when multiplied by the number of PLDs becomes a very large volume of data to store. Because of this, CMS policy is to avoid changing trigger cuts or other adjustable trigger parameters by changing the PLD logic configuration, and instead to include registers, written via VME, JTAG or other means, that contain the values of these adjustable parameters. The alteration of these parameters is therefore straightforward, can be documented in a small file, and does not run the risk associated with reconfiguring the PLD logic. A method employed by CMS to avoid problems with the large file size associated with PLD configuration is to require that each PLD include a register in which the version of the logic configuration last loaded and the date of loading are stored. This date and version then refer to the PLD code used to set up a large number of PLDs, and this code is stored in the central CMS repository.
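A version register of the kind described above could, for example, pack a version number and a date into a single 32-bit word. The bit layout, the function names and the repository lookup table in this sketch are hypothetical; the report specifies only that a version and a date are stored, not how they are encoded.

```python
from datetime import date

# Hypothetical 32-bit layout (an assumption, not a CMS specification):
#   bits 31..16  logic configuration version
#   bits 15..9   year offset from 2000
#   bits  8..5   month
#   bits  4..0   day


def pack_version_word(version, loaded_on):
    """Pack a version number and load date into one register word."""
    if not 0 <= version <= 0xFFFF:
        raise ValueError("version must fit in 16 bits")
    if not 2000 <= loaded_on.year <= 2127:
        raise ValueError("year must be in the range 2000..2127")
    return ((version << 16) | ((loaded_on.year - 2000) << 9)
            | (loaded_on.month << 5) | loaded_on.day)


def unpack_version_word(word):
    """Recover (version, date) from a register word."""
    version = (word >> 16) & 0xFFFF
    year = 2000 + ((word >> 9) & 0x7F)
    month = (word >> 5) & 0xF
    day = word & 0x1F
    return version, date(year, month, day)


# Hypothetical repository index: one stored file serves many PLDs.
REPOSITORY = {0x0103: "example_pld_code_v1_3.bit"}

word = pack_version_word(0x0103, date(2000, 12, 15))
version, loaded_on = unpack_version_word(word)
print(hex(word), REPOSITORY[version], loaded_on.isoformat())
```

Reading back one word per device is enough to identify which stored configuration it runs, so the large configuration file is kept once in the repository rather than once per PLD.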

A specific difficulty is the obsolescence not only of the PLD, but also of its programming software and of the platforms upon which this software runs. In order to address this problem, CMS has requested the CERN IT division to provide and maintain a small number of “legacy machines”, which are workstations running the operating systems and PLD programming code used to set up the PLDs in CMS. These machines would be maintained until the PLDs programmed by them are replaced.

## References

- [18.1] CMS Muon Project Technical Design Report, CERN/LHCC 97-32.



# 19 Safety and Environment

## 19.1 On-Detector Electronics

Trigger-specific on-detector electronics is subject to the CERN Electrical Code C1, complemented by the CERN safety instructions. The requirements are either full compliance with the harmonised European Decrees or a set of complementary measures for all installations that cannot comply for scientific reasons. The complementary measures are not carried to the subdetector level but are part of the prevention plan for the full CMS experiment. As owners of an electrical installation we accept the responsibility of submitting to the relevant CERN authorities full schematics, the layout of electrical protections and proof of compliance with safety requirements. Because full electrical protection requires the detector control system to be running, the parameters for electrical protection are considered part of the safety-relevant data and software. Traceability of modifications carried out in this domain is a safety requirement. The GLIMOS must be able to provide the authorities with the safety data sets loaded at any time.

Fire protection is done at the detector level. The sole requirement for the trigger electronics is the minimisation of fire risks as outlined above. No additional fire prevention measures are foreseen at the subdetector level.

Safety requirements that become known at an advanced stage of the design of the electronics will be followed as far as possible. In particular, CERN's zero-halogen policy, which stipulates the use of materials (including printed circuit boards) with zero-halogen flame retardants, will be respected, but no follow-up is possible on items already designed. No guarantee can be given in this document.

Environmental measures include safe dismounting and waste disposal when the electronics becomes obsolete. Full traceability of on-detector electronics and parts thereof is required for later disposal of potentially radioactive waste. Again, this issue cannot and will not be treated at the subdetector level. Full traceability is an LHC requirement also imposed by the French authorities, who have declared that the LHC, including its experiments, is a public utility but at the same time a basic nuclear installation subject to specific French rulings. Apart from the traceability requirement, there are no particular consequences for the trigger system.

## 19.2 Counting Room Electronics

Counting room electronics will be subject to the same set of rules as outlined in Section 19.1. In the counting room the ease of access to the equipment requires that the equipment strictly follow the general safety rules issued by the European Union and by France as the host state. We refer to the rules published by CERN TIS on the Web and to the CMS fire prevention plan, which includes, e.g., technical details on rack safety. CERN premises are subject to initial and periodic general safety inspections. In addition, all risk-ridden equipment is inspected initially and periodically. Trigger counting room electronics does not pose any particular risk.

## 19.3 Electrical and Non-Ionising Radiation Safety

The electrical safety is treated in some detail in the two previous sections. The relevant code is the CERN Electrical Code C1 which is based on the harmonised requirements of IEC-364. Equipment must comply with the “CE” rulings that combine two European decrees, 89/336/CEE for electromagnetic compatibility and 88/1056/CEE for low voltage. In case of non-compliance CERN/TIS will be requested to check and conditionally accept the equipment to run exclusively on CERN premises. Restrictions may apply such as access limitations, metallic enclosures, earthing, operational procedures, fusing, galvanic separation. Foreign equipment is treated the same way as equipment without CE-marking.

Electromagnetic compatibility is a basic requirement. Although it is required for the “CE” marking, the equipment will not undergo EMC testing or certification by CERN or others. Compatibility will be sought in case of reported and verified disturbance. An explicit policy on electromagnetic compatibility will be issued by CERN. The present document stresses the importance of knowing the parameters and compatibility levels for locations inside physics detectors.

The trigger electronics uses non-ionising radiation because of the timing and control distribution via optical fibres that are driven by a class 2 laser system. Data transfer will use similar but less powerful systems. We refer to CERN safety note #9 (non-ionising radiation) and CERN safety instruction #22 (laser safety) that will be strictly followed.

## 19.4 Fire Safety

Fire safety is assured by the CMS fire prevention plan. The trigger electronics and its cabling represent a sizeable amount of fuel. Ignition from inside the system is considered a very small risk due to compliance with CERN’s safety instruction #23 and cross-checked electrical protections. However, trigger cables or fibres are exposed to other risks and may serve as fuel for large fires. The trigger subdetector will not significantly change existing fire risks in the CMS cavern.

## 19.5 Radiation Tolerance

The majority of the trigger electronics is located in the underground counting room, USC55. The exception is the trigger primitive generation for the barrel muon drift tube trigger and the endcap muon CSC trigger. The radiation tolerance of the barrel muon drift tube trigger is discussed in Chapter 9 and that of the CSC system is discussed in Chapter 11.

## 19.6 Radiation Levels and Access

CMS will draw up an intervention plan that includes access conditions. Access to detector-mounted electronics will be subject to a shutdown of the LHC, a waiting time for the radiation to decay, and very strict planning according to the severity of the intervention.

In order to assess the radiation effects on electronics components the radiation environment has to be characterized by three quantities:

- **Hadron flux above 100 keV.** This includes both charged hadrons and neutrons and displacement damage is to a good approximation proportional to this quantity. The average damage constant of the LHC spectrum is very close to 1 MeV neutron equivalent.
- **Hadron flux above 20 MeV** including both neutrons and charged hadrons above the indicated threshold. It has been recently shown that Single Event Upset rates are proportional to this quantity. For typical components the average upset rate of the LHC spectrum corresponds to that of 200 MeV protons.
- **Total ionizing dose.** Damage in CMOS components and structural damage in plastics is proportional to this quantity. By definition, dose does not have any dependence on particle type.

These three quantities are listed in Table 19.1 for some characteristic positions of the Muon and HCAL system. The values for the HF FEE racks correspond to the racks positioned outside the lateral shielding of the HF. The values in the table are extracted from the most recent FLUKA simulations, which include a detailed description of the entire shielding system of CMS.

**Table 19.1:** Radiation levels: hadron fluxes and ionising dose around the muon chambers and HCAL. The quoted errors show only the simulation statistics. All values are for an integrated luminosity of $5 \times 10^5 \text{ pb}^{-1}$.

| Location     | Hadrons >100 keV ( $\times 10^{10} \text{ cm}^{-2}$ ) | Hadrons >20 MeV ( $\times 10^{10} \text{ cm}^{-2}$ ) | Dose (Gy)       |
|--------------|-------------------------------------------------------|------------------------------------------------------|-----------------|
| MB1 center   | $0.20 \pm 0.02$                                       | $0.018 \pm 0.003$                                    | $0.04 \pm 0.01$ |
| MB1 end      | $0.84 \pm 0.09$                                       | $0.19 \pm 0.02$                                      | $0.20 \pm 0.06$ |
| MB4 center   | $0.25 \pm 0.04$                                       | $0.004 \pm 0.001$                                    | $0.05 \pm 0.01$ |
| MB4 end      | $0.54 \pm 0.09$                                       | $0.015 \pm 0.004$                                    | $0.07 \pm 0.01$ |
| ME1/1 inner  | $61 \pm 3$                                            | $18 \pm 1$                                           | $23 \pm 4$      |
| ME1/1 outer  | $4.9 \pm 0.3$                                         | $1.4 \pm 0.1$                                        | $1.9 \pm 0.4$   |
| ME3/1 inner  | $11.0 \pm 0.9$                                        | $3.9 \pm 0.3$                                        | $3.1 \pm 0.4$   |
| ME3/2 outer  | $0.94 \pm 0.06$                                       | $0.12 \pm 0.01$                                      | $0.8 \pm 0.6$   |
| ME4/1 inner  | $13.8 \pm 0.9$                                        | $5.8 \pm 0.4$                                        | $3.9 \pm 0.8$   |
| ME4/2 outer  | $2.6 \pm 0.2$                                         | $0.44 \pm 0.04$                                      | $0.52 \pm 0.05$ |
| HB FEE box   | $17 \pm 1$                                            | $3.1 \pm 0.6$                                        | $2.1 \pm 0.5$   |
| HE FEE box   | $2.0 \pm 0.7$                                         | $0.10 \pm 0.04$                                      | $0.07 \pm 0.02$ |
| HF FEE racks | $6.5 \pm 0.3$                                         | $2.7 \pm 0.1$                                        | $3.5 \pm 0.5$   |
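Since Single Event Upset rates are stated to be proportional to the hadron flux above 20 MeV, the entries of Table 19.1 can be turned into rough upset estimates. The sketch below is illustrative only: the instantaneous luminosity `ASSUMED_INST_LUMI`, the per-bit upset cross section and the number of bits in the usage example are assumptions and not values from this report. The fluences and the integrated luminosity are copied from the table and its caption.

```python
# Hadron fluence above 20 MeV from Table 19.1, in cm^-2,
# for an integrated luminosity of 5e5 pb^-1.
FLUENCE_20MEV = {
    "MB1 center": 0.018e10,
    "ME1/1 inner": 18e10,
    "HB FEE box": 3.1e10,
    "HF FEE racks": 2.7e10,
}

INTEGRATED_LUMI_PB = 5e5    # pb^-1, from the table caption
PB_INV_TO_CM2_INV = 1e36    # 1 pb^-1 = 1e36 cm^-2
ASSUMED_INST_LUMI = 1e34    # cm^-2 s^-1, assumed for illustration


def average_flux(fluence, inst_lumi=ASSUMED_INST_LUMI):
    """Average hadron flux in cm^-2 s^-1 at constant luminosity."""
    running_time_s = INTEGRATED_LUMI_PB * PB_INV_TO_CM2_INV / inst_lumi
    return fluence / running_time_s


def expected_upsets(fluence, sigma_bit_cm2, n_bits):
    """Expected upsets: fluence times per-bit cross section times bits."""
    return fluence * sigma_bit_cm2 * n_bits


# Hypothetical device: 1e6 bits with an upset cross section of 1e-14 cm^2/bit.
for location, fluence in FLUENCE_20MEV.items():
    print(location, average_flux(fluence),
          expected_upsets(fluence, sigma_bit_cm2=1e-14, n_bits=1_000_000))
```

The linear dependence on fluence means that the relative upset rates between locations follow directly from the ratios of the table entries, independent of the assumed cross section.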

The local induced radioactivity due to long-term operation of the LHC is negligible at the positions of the Muon and HCAL trigger cards. However, nearby elements, the HF absorber in particular, will be highly radioactive and need dedicated shielding.



# 20 Project Management

## 20.1 Institutes and Responsibilities

**Table 20.1:** Major CMS trigger tasks and institutional responsibilities.

| System                            | Responsible Institutes |
|-----------------------------------|------------------------|
| <b>Calorimeter Trigger</b>        |                        |
| ECAL Trigger Primitive Generation | Palaiseau, Lisbon      |
| HCAL Trigger Primitive Generation | Maryland, Fermilab     |
| Regional Calorimeter Trigger      | Wisconsin              |
| Global Calorimeter Trigger        | Bristol                |
| Calorimeter Trigger Control & DAQ | Lisbon                 |
| <b>Drift Tube Trigger</b>         |                        |
| Track Segment Generation          | Padova                 |
| Track Segment Linking             | Bologna, Padova        |
| Muon Track Finding                | Vienna                 |
| Muon Track Sorting                | Bologna                |
| <b>CSC Trigger</b>                |                        |
| Track Segment Generation          | Rice, UCLA             |
| Track Segment Collection          | UCLA                   |
| Muon Track Finding                | Florida                |
| Muon Track Sorting                | Rice                   |
| <b>RPC Trigger</b>                |                        |
| Front End Boards                  | Bari                   |
| Optical Communications            | Helsinki, Korea        |
| Pattern Logic                     | Warsaw                 |
| Ghostbusting & Sorting            | Warsaw & Bari          |
| <b>Global Muon Trigger</b>        | Vienna                 |
| <b>Global Trigger</b>             | Vienna                 |

The major systems of the CMS Trigger are listed in Table 20.1 along with the responsible institutes. These responsibilities are determined by the TriDAS Institutional Board in consultation with the CMS Level 1 Trigger Project Manager and approved by overall CMS management.

## 20.2 Management Organization

The CMS Level 1 Trigger Project Manager is W. Smith. The TriDAS Resource Manager is J. Varela. The individuals responsible for the subsystem projects in the CMS Level 1 Trigger are listed in Table 20.2.

**Table 20.2:** CMS Trigger projects and responsible individuals

| System                       | Responsible Individuals  |
|------------------------------|--------------------------|
| <b>Calorimeter Trigger</b>   | W. Smith                 |
| Trigger Primitive Generation | P. Busson                |
| Regional Calorimeter Trigger | S. Dasu                  |
| Global Calorimeter Trigger   | G. Heath                 |
| Readout & Control            | S. Silva                 |
| <b>Muon Trigger</b>          | G. Wrochna               |
| Drift Tube Trigger           | R. Martinelli, P. Zotto  |
| Drift Tube Track Finder      | J. Erö                   |
| CSC Trigger                  | J. Hauser                |
| CSC Track Finder             | D. Acosta                |
| RPC Trigger                  | J. Krolikowski, M. Kudla |
| Global Muon Trigger          | A. Taurok                |
| <b>Global Trigger</b>        | C.-E. Wulz               |

## 20.3 Overall Schedule

The overall schedule for the CMS trigger project is shown in Fig. 20.1. The important milestones are the integration and test of the trigger systems with their respective front-end electronics, the beneficial occupancy of the underground counting room, USC55, and the integration of the individual trigger systems with the Global Trigger and the CMS DAQ.

Progress towards this schedule is monitored internally by the TriDAS group through two yearly reviews. Progress is also officially reported once per year through the CMS Annual Review (AR) and at more frequent intervals through reporting on the CMS L1 trigger milestones. Major procurements of the Trigger system are approved through Electronics Procurement Readiness Reviews (PRR) and Electronics System Reviews (ESR) organized by CMS Technical Coordination. The AR, PRR and ESR reviewers include non-CMS members who are experts in the area of the review and expert CMS members outside of TriDAS.

**Fig. 20.1:** CMS Trigger Schedule.

## 20.4 Costs and Resources

The cost estimate for the CMS trigger is based on experience with similar systems deployed in other HEP collider detectors and the actual costs of the prototype modules built thus far. These costs are compiled for the major trigger subsystems in Table 20.3. The costs listed are for materials and supplies. The labor is provided by the participating institutes without inclusion in this table. The costs shown are the summaries of detailed individual items provided as part of an extensive cost and schedule exercise. Contingency and spares are included as part of this exercise. The costs of each item are computed in the currency of the country responsible for the item and then converted to CHF at CERN-approved exchange rates.

Much of the cost of the systems that are used in the trigger and described in this report is contained in the electronics of the ECAL, HCAL and Muon subsystems, as they are integrated with these detectors and their front-end electronics. Nevertheless, they are described in this report in order to provide a complete understanding of the trigger system and its operation.

The contributions to the cost of the trigger from the participating countries are given in the CMS money matrix for the trigger project in Table 20.3. This has been approved by the CMS Finance Board, the individual country representatives and the CERN Resource Review Board.

**Table 20.3:** CMS Trigger Money Matrix

| No.        | Item                | Total Cost | AUSTRIA | CERN | FINLAND | GREECE | ITALY | KOREA | POLAND | PORTUGAL | U.K. | USA DOE | Assigned |
|------------|---------------------|------------|---------|------|---------|--------|-------|-------|--------|----------|------|---------|----------|
| <b>6.1</b> | <b>TRIGGER</b>      | 12259      | 1220    | 200  | 1020    | 200    | 100   | 400   | 2060   | 255      | 314  | 6375    | 12143    |
| 6.1.1      | CALORIMETER TRIGGER | 5177       |         |      |         |        |       |       |        | 255      | 314  | 4608    | 5177     |
| 6.1.2      | CSC TRIGGER         | 1767       |         |      |         |        |       |       |        |          |      | 1767    | 1767     |
| 6.1.3      | DT TRIGGER          | 780        | 780     |      |         |        |       |       |        |          |      |         | 780      |
| 6.1.4      | RPC TRIGGER         | 3695       |         |      | 1020    |        | 100   | 400   | 2060   |          |      |         | 3580     |
| 6.1.5      | GLOBAL TRIGGER      | 840        | 440     | 200  |         | 200    |       |       |        |          |      |         | 840      |

# Appendix A: Acronyms and Abbreviations

A list of acronyms and abbreviations used in the Technical Design Report is given below:

|          |                                                                                  |
|----------|----------------------------------------------------------------------------------|
| ADC      | Analog-to-digital converter                                                      |
| AFEB     | (CSC) Anode Front-End Board                                                      |
| ALCT     | (CSC) Anode Local Charged Track segment, theta view                              |
| ASIC     | Application Specific Integrated Circuit                                          |
| AU       | Assignment Unit                                                                  |
| AWG      | American Wire Gauge                                                              |
| BC0      | Bunch Crossing Zero                                                              |
| BCR      | Bunch Counter Reset                                                              |
| BER      | Bit Error Rate                                                                   |
| BTI      | Bunch and Track Identifier                                                       |
| BTIM     | BTI Module                                                                       |
| b.x.     | bunch crossing                                                                   |
| BXA      | Bunch Crossing Analyzer - input stage of CSC Sector Processor                    |
| BXN      | Bunch Crossing Number                                                            |
| CARDS    | Trigger Control Software framework                                               |
| CCB      | Clock and Control Board, responsible for distributing TTC signals within a crate |
| CERN     | European Laboratory for Particle Physics                                         |
| CFEB     | (CSC) Cathode Front-End Board                                                    |
| CLCT     | (CSC) Cathode Local Charged Track (cathode view muon stub)                       |
| CLCT/TMB | A board containing both CLCT and TMB functions                                   |
| CMC      | Common Mezzanine Card                                                            |
| CMS      | Compact Muon Solenoid experiment                                                 |
| CMSIM    | A program to simulate CMS detector response to particle interactions.            |
| CPU      | Central Processing Unit                                                          |
| CSC      | Cathode Strip Chamber                                                            |
| DAC      | Digital-to-analog converter                                                      |
| DAQ      | Data AcQuisition                                                                 |
| DAQMB    | (CSC) Data Acquisition MotherBoard                                               |
| DC       | Direct Current                                                                   |
| DC-DC    | Devices to convert DC voltage from one level to another                          |
| DCC      | Data Concentrator Card                                                           |
| DCS      | Detector Control System                                                          |
| DDU      | Device Dependent Unit                                                            |
| DPM      | Dual Port Memory                                                                 |
| DT       | Drift Tubes                                                                      |
| DTBS     | Drift Tube Barrel Sorter                                                         |
| DTBX     | Drift Tubes with Bunch Crossing identification capability                        |
| DTCCB    | Drift Tubes Chamber Control Board                                                |
| DTCI     | Drift Tubes Control Interface                                                    |
| DTCM     | Drift Tubes Master Control                                                       |
| DTTF     | Drift Tube Track Finder                                                          |
| DTWCB        | Drift Tubes Wheel Control Card                                                |
| DTWS         | Drift Tube Wedge Sorter                                                       |
| EB           | Barrel portion of ECAL covering pseudorapidity below 1.5.                     |
| ECAL         | Electromagnetic CALorimeter                                                   |
| ECAL FG Veto | A bit to veto non electromagnetic showers from ECAL fine grain crystal data.  |
| ECL          | Emitter Coupled Logic                                                         |
| EDC          | Error Detection Code                                                          |
| EE           | Endcap portion of ECAL covering pseudorapidity between 1.5 and 3.             |
| EEPROM       | Electrically-Erasable Programmable Read-Only Memory                           |
| EID Card     | Electron Identification Card                                                  |
| EISO ASIC    | Electron Isolation ASIC                                                       |
| EMC          | ElectroMagnetic Compatibility                                                 |
| ET           | Scalar sum of transverse energy components over the calorimeter systems       |
| ETmiss       | 2-vector sum of transverse energy over the calorimeter systems                |
| EU           | Extrapolation Unit                                                            |
| Ex,Ey        | Components of ETmiss                                                          |
| FBT          | First Best Track                                                              |
| FDL          | Final Decision Logic                                                          |
| FE           | Front End                                                                     |
| FEB          | Front End Board                                                               |
| FEC          | Front End Chip (an ASIC)                                                      |
| FED          | Front End Driver                                                              |
| FG           | Bit characterizing the fine grain profile of energy within the trigger tower. |
| FIFO         | First-In First-Out logic device that can be used to store and retrieve data   |
| FPGA         | Field Programmable Gate Array                                                 |
| FPPA         | Floating Point PreAmplifier.                                                  |
| FSB          | Final Sorter Board                                                            |
| FSU          | Final Selection Unit - sorting element of CSC Sector Processor                |
| FTT          | Fake Track Tagger                                                             |
| GB           | Ghost Buster                                                                  |
| GCT          | Global Calorimeter Trigger                                                    |
| GIF          | Gamma Irradiation Facility: area with high-intensity radioactive gamma source |
| GMT          | Global Muon Trigger                                                           |
| GPS          | Global Positioning System                                                     |
| GT           | Level 1 Global Trigger                                                        |
| GTFE         | Global Trigger Front End                                                      |
| GTL          | Global Trigger Logic board                                                    |
| H/E          | Ratio of energy deposits between HCAL and ECAL trigger towers.                |
| HB           | Barrel portion of HCAL covering pseudorapidity less than 1.2.                 |
| HCAL         | Hadronic CALorimeter                                                          |
| HCAL FG Bit  | "Fine-Grain" bit to indicate if HCAL tower energy is consistent with a MIP.   |
| HE           | Endcap portion of HCAL covering pseudorapidity between 1.2 and 3.             |
| HF           | Very forward portion of HCAL covering pseudorapidity between 3 and 5.         |
| HLT          | High Level Triggers                                                           |
| HTR          | HCAL Trigger and Readout.                                                     |
| HTRG         | BTI High Level Trigger                                                        |
| I/O          | Input/Output                                                                  |
| IM           | Input Module                                                                              |
| ISA          | Industry Standard Architecture PC communication/control bus                               |
| ISAJET       | A program to simulate high energy particle interactions.                                  |
| ISO          | Isolation bit                                                                             |
| JTAG         | Joint Test Action Group; test and diagnostic bus standardized as IEEE 1149.1              |
| L1           | Level-1                                                                                   |
| L1A          | Level-1 Accept                                                                            |
| LAN          | Local Area Network                                                                        |
| LB           | Link Board                                                                                |
| LCT          | (CSC) Local Charged Track, or muon stub                                                   |
| LDEMUX       | Link Demultiplexer (an FPGA or an ASIC)                                                   |
| LFSR         | Linear Feedback Shift Register                                                            |
| LHC          | Large Hadron Collider                                                                     |
| LINX         | Link Test Board                                                                           |
| LMUX         | Link Multiplexer (an FPGA)                                                                |
| LTRG         | BTI Low Level Trigger                                                                     |
| LTS          | Low Trigger Suppression                                                                   |
| LUT          | Look-Up Table (memory)                                                                    |
| LVDS         | Low Voltage Differential Signaling, a specification for differential digital logic        |
| LVPECL       | Low Voltage Positive ECL                                                                  |
| LVTTL        | Low Voltage TTL                                                                           |
| MAD          | Multiple Amplifier and Discriminator                                                      |
| MB1..MB4     | Muon Barrel Stations                                                                      |
| MC           | Mini Crate                                                                                |
| ME           | Muon Endcap                                                                               |
| MIP          | Minimum Ionizing Particle (muon).                                                         |
| MLB          | Master Link Board                                                                         |
| MPC          | (CSC) Muon Port Card                                                                      |
| µPU          | MicroProcessor Unit                                                                       |
| MRB          | Master Readout Board                                                                      |
| MS           | Muon Sorter of CSC trigger                                                                |
| MSL          | Mask and Sort Logic                                                                       |
| MSSM         | Minimal SUSY Standard Model - a simplified theory based on SUSY.                          |
| mSUGRA       | Minimal Supergravity - a simplified model based on SUSY.                                  |
| Muon MIP Bit | A bit to characterize if a 4x4 trigger region is consistent with a MIP.                   |
| OCS          | Optical Communication System                                                              |
| ORCA         | Object-Oriented Reconstruction for CMS Analysis                                           |
| PAC          | Pattern Comparator (RPC trigger ASIC)                                                     |
| PACT         | PAttern Comparator Trigger                                                                |
| PC           | Personal Computer                                                                         |
| PCI          | Peripheral Component Interconnect bus                                                     |
| PHIAU        | phi Assignment Unit                                                                       |
| PHITRB128    | 128 channels Trigger Board in longitudinal CMS plane                                      |
| PHITRB32     | 32 channels Trigger Board in longitudinal CMS plane                                       |
| PLD          | Programmable Logic Device, term used by Altera Corp., similar to FPGA                     |
| PMC          | PCI Mezzanine Card                                                                        |
| Priority Encoder | Logic that selects for output the highest value of a set of input values |
| PSB      | Pipelined Synchronizing Buffer                                                      |
| PSU      | Power Supply Unit                                                                   |
| $p_T$    | Transverse momentum of a physics object                                             |
| PYTHIA   | A program to simulate high energy particle interactions.                            |
| qAU      | Quality Assignment Unit                                                             |
| QCD      | Quantum Chromodynamics - a theory that describes strong interactions.               |
| QIE      | Charge (Q) Integrating and Encoding.                                                |
| RAM      | Random Access Memory                                                                |
| RB       | Readout Board                                                                       |
| RCS      | Run Control System                                                                  |
| RCT      | Regional Calorimeter Trigger                                                        |
| RDPM     | Readout Dual Port Memory                                                            |
| RF       | Radio-Frequency                                                                     |
| RLDEMUX  | Special LDEMUX on a RB (an FPGA)                                                    |
| RO       | Readout                                                                             |
| ROP      | Readout Processor                                                                   |
| ROSE100  | ECAL Readout and Trigger Board                                                      |
| RPC      | Resistive Plate Chamber                                                             |
| RT       | Regional Trigger                                                                    |
| RUI      | Readout Unit Interface                                                              |
| Rx       | Optical Receiver                                                                    |
| SB       | Server Board                                                                        |
| SBT      | Second Best Track                                                                   |
| SC       | Sector Collector                                                                    |
| SCADA    | Supervisory Control And Data Acquisition                                            |
| SCB      | Sector Collector Board                                                              |
| SCSI     | Small Computer System Interface                                                     |
| SCU      | Server and Control Unit                                                             |
| serdes   | serializer/deserializer chip                                                        |
| SEU      | Single Event Upset                                                                  |
| SL       | SuperLayer                                                                          |
| SLB      | Synchronisation and Link Board, also: Slave Link Board                              |
| SP       | Sector Processor                                                                    |
| SR       | Sector Receiver                                                                     |
| SRAM     | Static Random Access Memory: memory that does not need refresh cycles               |
| SRB      | Slave Readout Board                                                                 |
| SSTL     | Stub Series Terminated Logic                                                        |
| STP      | Shielded Twisted Pair                                                               |
| SU       | Synchronization Unit                                                                |
| SUSY     | Supersymmetry - an as yet unobserved symmetry between fermions and bosons.          |
| TA       | Track Assembler                                                                     |
| TAU      | Track Assembler Unit - element of CSC Sector Processor                              |
| Tau Veto | A bit to indicate if energy in a 4x4 region is not consistent with a narrow shower. |
| TB       | Trigger Board                                                                       |
| TC       | Trigger Crate                                                                       |
| TCM      | Trigger Concentrator Module                                                         |
| TCP/IP   | Transmission Control Protocol over Internet Protocol                                |
| TCS       | Trigger Control System                                                         |
| TDC       | Trigger Data Concentrator card, also: Time-to-digital converter                |
| TEP       | Technology Evaluation Platform                                                 |
| TF        | Track Finder                                                                   |
| THETATRB  | Trigger Board in transverse CMS plane                                          |
| TIM       | Timing Module                                                                  |
| TMB       | (CSC) Trigger MotherBoard: performs anode-cathode coincidence                  |
| TPG       | Trigger Primitive Generator                                                    |
| TPM       | Trigger Processor Module, also: Track Pipeline and Multiplexer                 |
| TRACO     | Track Correlator                                                               |
| TRC       | Trigger Readout Crate                                                          |
| TS        | Trigger Server, also: Track Segment                                            |
| TS $\phi$ | Trigger Server in longitudinal CMS plane                                       |
| TSM       | Track Sorter Master                                                            |
| TSMD      | Trigger Sorter Master - Data                                                   |
| TSMS      | Track Sorter Master Sorter                                                     |
| TS $\theta$ | Trigger Server in transverse CMS plane                                       |
| TSS       | Track Sorter Slave, also: Trigger Sorter Slave                                 |
| TST       | Trigger Server Theta                                                           |
| TTC       | Trigger Timing and Control, a system for distribution of clocking and control. |
| TTCex     | TTC Encoder and Transmitter                                                    |
| TTCrx     | TTC Receiver ASIC                                                              |
| TTCvi     | TTC VME Interface                                                              |
| TTL       | Transistor-transistor logic, a common type of digital logic signaling          |
| TTS       | Trigger Throttle System                                                        |
| Tx        | Optical Transmitter                                                            |
| USC55     | Underground Services Cavern; CMS electronics counting house                    |
| VME       | Electronics and mechanics standard for crates, buses and application boards    |
| WS        | Wedge Sorter                                                                   |
| ZCD       | Zero Crossing Discriminator                                                    |



# Appendix B: CMS Trigger and Data Acquisition Membership

Current Participants in the CMS TriDAS Collaboration by Country and Institute

**Institut für Hochenergiephysik der ÖAW, Wien, AUSTRIA**

M. Brugger, J. Erö, M. Fierro, A. Jeitler, P. Porth, H. Rohringer, L. Rurua<sup>1</sup>, A. Taurok, G. Walzel, C.-E. Wulz

**Université Libre de Bruxelles, Brussels, BELGIUM**

G. De Lentdecker, P. Vanlaer

**Université Catholique de Louvain, Louvain-la-Neuve, BELGIUM**

V. Lemaitre, A. Ninane, O. Van der Aa

**Universitaire Instelling Antwerpen, Wilrijk, BELGIUM**

W. Beaumont, E. De Langhe, V. Zhukov<sup>2</sup>

**Helsinki Institute of Physics, Helsinki, FINLAND**

K. Banzuzi, E. Pietarinen, E. Tuominen, D. Ungaro

**Laboratoire de Physique Nucléaire des Hautes Energies, Ecole Polytechnique, IN2P3-CNRS, Palaiseau, FRANCE**

P. Busson

**Institute of Nuclear Physics "Demokritos", Attiki, GREECE**

M. Barone, G. Fanourakis, T. Geralis, C. Markou, N. Mastroiannopoulos, A. Staveris Polykalas, A. Tsirigotis, S. Tzamarias, K. Zachariadou

**University of Ioánnina, Ioánnina, GREECE**

A. Asimidis, I. Evangelou, P. Kokkas, N. Manthos, F.A. Triantis

**Università di Bari, Politecnico di Bari e Sezione dell' INFN, Bari, ITALY**

F. Loddo, A. Ranieri

**Università di Bologna e Sezione dell' INFN, Bologna, ITALY**

G.M. Dallavalle, C. Grandi, S. Marcellini, A. Montanari, F. Odorici, R. Travaglini

**Laboratori Nazionali di Legnaro e Sezione dell' INFN, Legnaro, ITALY (Associated Institute)**

L. Berti, M. Biasotto, U. Gastaldi, M. Gulmini, G. Maron, N. Toniolo

**Università di Padova e Sezione dell' INFN, Padova, ITALY**

M. Bellato, M. De Giorgi, F. Gasparini, U. Gasparini, S. Lacaprara, I. Lippi, A. Meneguzzo, R. Martinelli, L. Ventura, S. Ventura, P. Zotto<sup>3</sup>

**Università di Torino e Sezione dell' INFN, Torino, ITALY**

F. Bertolino, R. Cirio, A. Vitelli

**Cheju National University, Cheju, KOREA**

Y.J. Kim

**Konkuk University, Seoul, KOREA**

J.T. Rhee

**National Centre for Physics, Quaid-I-Azam University, Islamabad, PAKISTAN**

Z. Aftab, M.A. Ahmad, J. Alam Jan, N. Bhatti, K. Hasanain, H.R. Hoorani<sup>4</sup>, M.K. Khan, S.M. Khan, A. Niaz, R. Riazuddin, T. Solaija

**Institute of Experimental Physics, Warsaw, POLAND**

M. Cwiok, M. Kazana, J. Krolikowski, I. Kudla, M. Pietrusinski, K. Pozniak, P. Zych

**Soltan Institute for Nuclear Studies, Warsaw, POLAND**

R. Gokieli, M. Gorski, L. Goscilo, G. Wrochna, P. Zalewski

**Laboratório de Instrumentação e Física Experimental de Partículas, Lisboa, PORTUGAL**

C. Almeida<sup>8</sup>, N. Almeida, T. Barata Monteiro, N. Cardoso<sup>8</sup>, J. Da Silva, M. Santos<sup>8</sup>, S. Silva, I. Teixeira<sup>8</sup>, J.P. Teixeira<sup>8</sup>, J. Varela<sup>4,9</sup>

**Joint Institute for Nuclear Research, Dubna, RUSSIA**

I. Golutvin, V. Karjavin, S. Khabarov, P. Moissenz, S. Movchan

**Petersburg Nuclear Physics Institute, Gatchina (St Petersburg), RUSSIA**

A. Atamanchuk, V. Golovtsov, B. Razmyslovich, V. Sedov

**Moscow State University, Moscow, RUSSIA**

A. Erchov, A. Gribushin, O.L. Kodolova, N.A. Kruglov, I.P. Lokhtin, V. Mikhailin, S. Petrouchanko, L. Sarycheva, A. Snigirev, I. Vardanyan, A. Vassiliev

**CERN, European Organization for Nuclear Research, Geneva, SWITZERLAND**

E. Cano, S. Cittolin, B. Faure, P. Favre, W. Funk, D. Gigi, P. Gras, J. Guteleber, C. Jacobs, M. Konecki, F. Meijers, E. Meschi, N. Neumeister, A. Nikitenko<sup>5</sup>, L. Orsini, L. Pollet, A. Racz, H. Sakulin, D. Samyn, W. Schleifer, C. Schwick, P. Sphicas<sup>6</sup>, F. Szoncszo, B.G. Taylor

**Paul Scherrer Institut, Villigen, SWITZERLAND**

M. Barbero, D. Kotlinski

**Institut für Teilchenphysik, Eidgenössische Technische Hochschule (ETH), Zürich, SWITZERLAND**

G. Antchev<sup>6</sup>, C. Carpanese, A. Rubbia, N. Sinanis

**University of Bristol, Bristol, UNITED KINGDOM**

D.S. Bailey, J.J. Brooke, D. Cussans, G.P. Heath, S.J. Nash, D.M. Newbold, A.D. Presland

**Rutherford Appleton Laboratory, Didcot, UNITED KINGDOM**

S.A. Baird, K.W. Bell, J.A. Coughlan, M. French, R. Halsall, W.J. Haynes, L. Jones, J. Maddox, Q.R. Morrissey, P. Murray, P. Rabbetts, A.A. Shah, P. Thayaparan, I. Tomalin

**Imperial College, University of London, London, UNITED KINGDOM**

C. Seez

**University of California at Davis, Davis, California, USA**

R. Breedon, P.T. Cox, J. Smith

**University of California San Diego, La Jolla, California, USA**

S. Bhattacharya, J.G. Branson, I. Fisk, J.P. Fryckman, E. Hill, M. Mojaver, H.P. Paar, G. Raven, A. White

**University of California at Los Angeles, Los Angeles, California, USA**

A. Attal, R. Cousins, S. Erhan, J. Hauser, M. Lindgren, J. Mumford, P. Schlein, Y. Shi, B. Tannenbaum, M. Von Der Mey

**University of California, Riverside, California, USA**

H. Rick

**University of Florida, Gainesville, Florida, USA**

D. Acosta, L. Gorn, A. Korytov, A. Madorsky, G. Mitselmakher<sup>7</sup>, B. Scurlock, S.M. Wang

**Fermi National Accelerator Laboratory, Batavia, Illinois, USA**

S. Aziz, E. Barsotti, M. Bowden, J. Elias, I. Gaines, M. Litmaath, V. O'Dell, I. Suzuki

**The University of Iowa, Iowa City, Iowa, USA**

U. Akgun, A.S. Ayan, E. McCliment, Y. Onel, I. Schmidt

**University of Maryland, College Park, Maryland, USA**

S. Abdullin<sup>5</sup>, S. Arcelli, D. Baden, R. Bard, S.C. Eno, T. Grassi, N.J. Hadley, S. Kunori, A. Skuja

**Massachusetts Institute of Technology, Cambridge, Massachusetts, USA**

G. Bauer, S. Pavlon, K.S. Sumorok, S. Tether

**Princeton University, Princeton, New Jersey, USA**

W.C. Fisher, V. Gupta, J. Mans, D. Marlow, P. Piroue, D. Stickland, C. Tully, T. Wildish

**Rice University, Houston, Texas, USA**

N. Adams, M. Matveev, T. Nussbaum, P. Padley

**University of Wisconsin, Madison, Wisconsin, USA**

P. Chumney, S. Dasu, M. Jaworski, J. Lackey, W.H. Smith

1. Also at Inst. of Physics Academy of Science, Tbilisi, Georgia
2. Also at Moscow State Univ., Moscow, Russia
3. Also at Politecnico di Milano, Milano, Italy
4. Also at CERN, Geneva, Switzerland
5. Also at Inst. for Theoretical and Exp. Phys., Moscow, Russia
6. Also at MIT, Cambridge, Massachusetts, USA
7. Also at Fermi National Accelerator Lab., Batavia, USA
8. Also at INESC, Lisbon, Portugal
9. Also at IST, Technical University of Lisbon, Portugal

