

### 3.2.1.6 Array trigger and clock board

In order to properly analyze the images taken by the readout systems, it is very important to know with high accuracy when the samples were taken. However, when there are many tens of telescopes spread over an area of more than $1 \text{ km}^2$, and an accuracy of 1 ns is required, distributing the clock signals is not an easy task. To solve this problem, there are currently two hardware developments in CTA. One of them is an ad-hoc system developed by the APC laboratory in Paris, while the other, developed by the DESY institute in Germany, is based on the White Rabbit standard [99][100]. In both cases the solution consists of distributing the clock from a central unit, calibrating the delays in the optical links, and compensating for them in FPGAs. There is always a board in each camera which recovers the synchronized clock and sends it to the central backplane, so that it is distributed throughout the backplane network, as will be explained in section 3.2.6. The distributed clock is not the one used directly by the samplers, but a 10 MHz clock chosen to reduce electromagnetic interference, together with a 1 pulse per second (1PPS) signal. In addition, the clock board in the camera receives the camera trigger signal from the TIB, time-stamps it, and sends it to the central array trigger unit, which performs the array-level trigger described in section 3.2.8.2.

### 3.2.1.7 Calibration box

In order to perform calibration runs, a calibration box will be placed in the center of the reflector of the LSTs. The calibration box, developed by the MPI in Munich, controls one or several lasers or LEDs which can illuminate the whole camera with approximately the same intensity, mimicking the few-nanosecond-wide Cherenkov pulses. When a calibration run is performed, the calibration box sends a trigger signal to the TIB, which in turn will trigger the camera, or at least will mark the event as a calibration run (see section 7.1.1). In addition, the calibration box can also trigger the camera when all the lights are off, in order to sample the background. Triggers of this kind are known as “pedestals” and must also be marked properly by the trigger system.

In the case of the MSTs, the way in which the calibration runs will be taken is still being designed by the LUPM group in Montpellier. It can be done in the same way as in the LSTs, or by means of small LEDs placed on the inner side of the camera lids. Figure 3.14 shows the calibration box used in MAGIC. The one being developed for the LSTs is expected to be quite similar to it.

### 3.2.1.8 Camera Slow Control System

All the aforementioned hardware systems, together with many other sensors and actuators not described in this thesis, need to be slow-controlled. The slow control architecture of every LST and MST is being developed by the LAPP institute in Annecy and comprises two main elements: a slow control board and an industrial PC. The slow control board<sup>4</sup> centralizes the control of small devices, which usually only accept I2C or SPI-based interfaces over dedicated cables. The slow control board gathers all the information and sends it to the PC through the camera Ethernet network. The more complex devices, such as the front-end boards, the TIB or the calibration box, are instead connected directly to the network and therefore to the slow control PC. The communication between the slow control PC and the other elements will be based on the OPC-UA protocol [101] whenever possible.

---

<sup>4</sup>Different from the board which controls the PMT high voltage.



Figure 3.14: Photo of the calibration box used in MAGIC. The CTA one was not ready at the end of this thesis

### 3.2.2 Trigger architecture

Now that the hardware elements related to the trigger have been briefly described, the proposed trigger system itself can be presented [3], [4]. The trigger concept developed by the UCM, IFAE and Ciemat groups is implemented in several stages. The first stage is a pixel-level trigger, also dubbed Level 0, which processes the analog signals of the individual pixels in the cluster. For this processing, two prototypes have been produced in order to test both the Sum and Majority trigger concepts. In the former, the clipped analog sum of the pixel signals is passed to the second stage, a camera-level trigger also dubbed Level 1, where the camera trigger decision is taken. In the latter, the signal of each pixel is compared to a threshold value, so that a count of fired pixels is provided to the Level 1 stage. In either case, a Level 0 fan-out stage is used to send and receive the Level 0 outputs to and from the neighbouring clusters. At Level 1, the analog combination of the Level 0 signals (in both the Majority and Sum modes) is examined for all possible compact regions of a given geometrical size, within the hardware limits. This combination roughly consists of the analog addition of the Level 0 signals for different geometrical patterns. The Level 1 trigger is fired if the output of the combination in any part of the camera exceeds a given threshold. This decision feeds the third stage, the Level 1 distribution system, which guarantees that the decision signal reaches the central cluster, and then the TIB, with the same delay (to within 1 ns) whichever cluster generated the trigger. Once the TIB sends the trigger command back to the central backplane, the Level 1 distribution subsystem broadcasts it in such a way that it reaches the readout electronics associated with all 7-pixel clusters at approximately the same time (again to within 1 ns).



Figure 3.15: Trigger architecture

In its final implementation, the different elements of this trigger system will be placed at different locations on the cluster electronics, as shown in figure 3.15. While the Level 0 and Level 1 subsystems are placed on the front-end board itself, the Level 0 fan-out and Level 1 distribution subsystems should be on the backplane, in order to send and receive the trigger signals efficiently among the neighbouring clusters. Current prototypes have been produced as separate mezzanine boards for easy testing, but the final trigger subsystems will lie directly on the front-end and backplane boards. A cluster prototype with mezzanines can be seen in figure 3.16.



Figure 3.16: Cluster with all the elements connected

### 3.2.3 Level 0

The Level 0 system is responsible for collecting the signals from all pixels in one cluster. These signals are processed and then added together before being replicated and broadcast (in the Level 0 fan-out system) to the Level 1 trigger system of the same cluster and of the surrounding ones. Two different Level 0 mezzanines have been developed by the IFAE group: one for the Majority trigger approach and the other following the Sum trigger concept. In both cases the input and output signals are differential, to minimize the effect of the long connections to the other subsystems.

#### 3.2.3.1 Majority

As was explained in section 3.1.2.1, the majority trigger compares the signal from each pixel in the cluster with a voltage threshold. If the signal is greater than the threshold voltage, a gate, of width proportional to the time the pulse exceeds the threshold, is generated. The gates generated by the 7 pixels are added analogically, so the amplitude of the added signal is proportional to the number of pixels with a signal above the threshold. This addition is sent to the Level 0 fan-out and later to the Level 1.
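The behaviour described above can be illustrated with a minimal Python sketch. It models only the instantaneous amplitude of the Level 0 Majority output, not the gate widths or the analog bandwidth. The value of 80 mV per fired pixel is the one quoted later in this section; the function name, argument names and example values are hypothetical.

```python
# Illustrative model of the Level 0 Majority stage for one 7-pixel cluster.
# MV_PER_FIRED_PIXEL is taken from the text; all names are hypothetical.

MV_PER_FIRED_PIXEL = 80.0  # output amplitude contributed by each fired pixel (mV)


def level0_majority(pixel_mv, thresholds_mv):
    """Return the Level 0 Majority output amplitude (mV) at one instant.

    pixel_mv      -- instantaneous signal of each of the 7 pixels (mV)
    thresholds_mv -- comparator threshold of each of the 7 pixels (mV)
    """
    if len(pixel_mv) != 7 or len(thresholds_mv) != 7:
        raise ValueError("a cluster has exactly 7 pixels")
    # Each comparator opens a gate while its pixel is above threshold.
    gates = [signal > thr for signal, thr in zip(pixel_mv, thresholds_mv)]
    # The analog adder converts the number of open gates into an amplitude.
    return MV_PER_FIRED_PIXEL * sum(gates)


# Example with made-up amplitudes: three pixels exceed a common 40 mV threshold.
example_output = level0_majority([12.0, 55.0, 3.0, 70.0, 41.0, 0.0, 39.0], [40.0] * 7)
print(example_output)  # 240.0
```

Because the output is simply proportional to the number of fired pixels, a downstream threshold on this amplitude is equivalent to requiring a minimum pixel multiplicity.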



Figure 3.17: Level 0 majority scheme

Figure 3.17 shows the block diagram of the Level 0 Majority mezzanine. The first stage in each channel is a differential to single-ended converter based on a typical subtractor circuit, as shown in figure 3.18(a). The single-ended output signal is compared with a threshold in a fast ADCMP604 comparator [102]. The 7 thresholds can be controlled independently by means of an 8-output AD5328 DAC [103], which receives the corresponding DC levels encoded in SPI messages from the front-end FPGA. The outputs of the 7 comparators are added analogically in an adder-inverter circuit, as shown in figure 3.18(b), and, finally, the sum is sent to a single-ended to differential stage (figure 3.18(c)) which sends the Level 0 output to the backplane.

Figure 3.18: Some Level 0 circuit schemes

Although it is formed by the addition of digital signals, the output of the Level 0 is an analog signal and must be handled carefully. In this sense, impedance matching is important, as is reducing the noise and the offset error introduced by the operational amplifiers. For this reason, impedance matching resistors and AC coupling capacitors are commonly used, as can be seen in figure 3.18. The gain of the Level 0 majority is adjusted to give 80 mV/fired-pixel at the output, in order to suit the dynamic range of the Level 1, which adds the Level 0 signals from up to 4 clusters. The adjustment is done in attenuators and other auxiliary subcircuits not shown here and not relevant for this thesis. In any case, 80 mV/fired-pixel is much higher than the electronic noise (around 2 mV). It is also worth noting that the Level 0 majority mezzanine requires 6.1 ns to generate the output from the input, and consumes 1.33 W. Section 3.2.7 presents the tests of several Level 0 parameters, measured together with the other trigger subsystems. A Level 0 majority mezzanine is shown in figure 3.19.



Figure 3.19: Photograph of the Level 0 Majority

#### 3.2.3.2 Sum

The Sum trigger adds the signals from all pixels in the cluster and sends the resulting signal to the Level 1 decision subsystem. Before the signals from the individual pixels are added, each of them goes through attenuator and clipping circuits (both adjustable through slow control). The former allows all pixel gains to be equalised with a precision better than 5%. The latter cuts signals greater than a given value, which limits the influence of afterpulses from the photosensors, as explained in section 3.1.2.1. The clipping and attenuator circuits can also reduce the pixel signal to zero, in case noisy pixels need to be removed from the trigger patterns.
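As a rough illustration of the processing chain just described, the following Python sketch applies a per-pixel attenuation, clips each pixel and adds the results. It is an idealised amplitude model for positive pulses only; the function name, the parameter names and the numerical values are assumptions made for the example and do not correspond to the real circuit settings.

```python
# Idealised amplitude model of the Level 0 Sum stage for one 7-pixel cluster.
# All names and example values are hypothetical.

def level0_sum(pixel_mv, attenuation, clip_mv):
    """Attenuate each pixel, clip it, and return the analog sum (mV).

    pixel_mv    -- positive pulse amplitude of each of the 7 pixels (mV)
    attenuation -- per-pixel factor in [0, 1]; 0.0 removes a noisy pixel
    clip_mv     -- clipping level applied to each pixel after attenuation (mV)
    """
    if len(pixel_mv) != 7 or len(attenuation) != 7:
        raise ValueError("a cluster has exactly 7 pixels")
    total = 0.0
    for signal, factor in zip(pixel_mv, attenuation):
        equalised = signal * factor       # gain equalisation (or masking)
        total += min(equalised, clip_mv)  # clipping limits large afterpulses
    return total


# Pixel 2 carries a large afterpulse-like signal; pixel 3 is masked as noisy.
pixels = [10.0, 20.0, 300.0, 5.0, 0.0, 15.0, 8.0]
factors = [1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0]
print(level0_sum(pixels, factors, clip_mv=50.0))  # 103.0
```

In the example the 300 mV pulse contributes only the 50 mV clipping level, so a single large afterpulse cannot fire the trigger on its own.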



Figure 3.20: Level 0 sum scheme

The block diagram of the Sum trigger design (figure 3.20) is made up of six different elements. The differential to single-ended converters, the analog adder, the single-ended to differential converters and the DACs are similar to those used for the majority design (see figure 3.18), although 2 DACs are required in this case to generate the control voltages for the attenuators and the clipping. These two circuits, the attenuators and the clipping, are exclusive to the Sum modules. The attenuators are based on PIN diodes (see figure 3.21), while the clipping is based on a differential BJT amplifier [104]. Both the attenuators and the clipping circuits are controlled by DC levels generated in the DACs. Figure 3.22 shows one of the manufactured Level 0 sum trigger boards.

The Level 0 sum provides output pulses with 9 mV/phe amplitude, requires 6.8 ns to generate the summing output and consumes 2.7 W, with one third of this power consumption due to the clipping circuits. Precise measurements of the Level 0 sum functions when working together with the other trigger subsystems are presented in section 3.2.7.



Figure 3.21: Level 0 attenuator circuit schematic



Figure 3.22: Photograph of the Level 0 Sum

### 3.2.4 Level 0 fan-out

The Level 0 output signal of a given cluster must be distributed to the neighbouring clusters in order to perform all possible additions of clusters in the Level 1 system. To do so, the Level 0 output signal of each cluster is sent to the backplane, where it is fanned out to several branches corresponding to the backplanes of the neighbouring clusters which have to receive that signal, plus one more copy for the cluster itself. In the same way, every cluster backplane receives Level 0 signals from its neighbours, which are then sent to the Level 1 subsystem along with the Level 0 signal from the cluster itself. These functions are implemented in the Level 0 fan-out subsystem, developed by the author of this thesis at GAE-UCM.

The Level 0 fan-out module entails several technical challenges which were successfully solved:

- Long transmission lines introduce losses and make the signal noisier. To minimize the noise problem, Level 0 lines are implemented using differential pairs and losses are characterized and compensated.
- The Level 0 fan-out must be designed with similar gains for each branch and good matching features. It is difficult to match one input with six outputs, but using lumped element Wilkinson dividers [6] and amplifiers it is possible to achieve good matching and flexibility in the gain to compensate losses. The circuit used to implement the pulse replication can be seen in figure 3.23.

As shown there, the replication is implemented in two steps. First, the input differential pair (`L0_OUT_P` and `L0_OUT_N`) is split directly between two branches, simply adjusting the input impedance of the two output branches to avoid reflections. The original differential impedance is $100 \Omega$, which is equivalent to $50 \Omega$ to ground for each of the positive and negative branches due to the AC coupling. Thus, when the positive and negative inputs each feed two lines, the impedance of these new lines must be $100 \Omega$ single-ended in order to present $50 \Omega$ single-ended at the division for each input, avoiding reflections. In this way, two differential pairs $L0\_OUT\_P - L0\_OUT\_N$ are obtained, which are sent to the inputs of a dual-channel ADA4927 differential amplifier [105]. In fact, the input resistors used to adjust the input impedance are $120 \Omega$ instead of $100 \Omega$, in order to compensate for the input impedance of the amplifier.

The two differential inputs are amplified and then sent to 12 three-branch Wilkinson splitters like the one shown in figure 3.23(b). These splitters are single-ended, so 3 splitters are connected to each positive output and another 3 to each negative one. The Wilkinson splitters, implemented with lumped elements, have a low-pass response, divide the power equally among their branches and provide good impedance matching to $50 \Omega$ in each branch (a detailed explanation can be found in section 4.3 or in [6]). However, as the power is distributed among the branches, the amplitude in each branch is reduced to $1/\sqrt{3}$ of its original value, which is why prior amplification is required.



Figure 3.23: Level 0 fan-out schematics

- The delay in reaching the Level 1 system must be the same for all channels. Every channel has to go from the Level 0 output to the cluster backplane and from there to the Level 1 input, so these delays are naturally equalized. The only difference is between the Level 0 signal from the cluster itself and those coming from the neighbours, which are delayed by travelling through the interconnection between backplanes. These extra lines are relatively short, which limits the delay difference to approximately 1 ns; this difference can be compensated in the Level 1 subsystem with a delay line.
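The impedance and amplitude arguments used for the fan-out can be checked with a few lines of Python. This is a sketch for ideal, lossless and perfectly matched elements: it ignores connector and line losses as well as the amplifier input impedance (which is what the $120 \Omega$ resistors account for in the real board), and the helper names are hypothetical.

```python
import math

# Ideal-element checks for the Level 0 fan-out. All helper names are hypothetical.


def parallel(r1_ohm, r2_ohm):
    """Equivalent resistance of two resistances in parallel (ohm)."""
    return r1_ohm * r2_ohm / (r1_ohm + r2_ohm)


def splitter_voltage_factor(n_branches):
    """Voltage factor per branch of an ideal equal-split power divider."""
    return 1.0 / math.sqrt(n_branches)


def to_db(voltage_ratio):
    """Express a voltage ratio in decibels."""
    return 20.0 * math.log10(voltage_ratio)


# Two 100 ohm single-ended lines in parallel present 50 ohm to each input.
print(parallel(100.0, 100.0))  # 50.0

# A three-branch splitter leaves 1/sqrt(3) of the amplitude in each branch.
factor = splitter_voltage_factor(3)
print(round(factor, 3), round(to_db(factor), 2))  # 0.577 -4.77

# Voltage gain needed before the splitters for an overall gain of 0 dB.
print(round(1.0 / factor, 3))  # 1.732
```

Under these ideal assumptions the amplifier has to provide a voltage gain of about 1.73 per branch just to recover the splitter loss; any extra gain is then available to compensate connector and line losses.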

Figure 3.24(a) shows a Level 0 fan-out board, designed as a backplane mezzanine for the first prototypes and later combined with the Level 1 distribution and the common functions in an integrated backplane, already shown in figure 3.12. The compactness of the design, with only one active element, facilitated the integration. In order to test the first versions of the mezzanine, a test board was developed, which can be seen in figure 3.24(b). This board was required to convert the standard SATA connectors chosen for these first prototypes into SMA connectors, thus facilitating the connection to the test equipment. In addition, the test board allows the unused outputs to be terminated with $100 \Omega$ differential, which is essential for the splitters to work correctly (see section 4.3).



(a) Photograph of the Level 0 fan-out, implemented in a backplane mezzanine (b) Level 0 fan-out mezzanine, connected to its test-bench

Figure 3.24: Photographs of the Level 0 fan-out backplane mezzanine

The gain measured for the different outputs is represented in figure 3.25. It was chosen to be slightly higher than 0 dB in order to compensate for some of the losses in the connectors. However, one of the main advantages of the implemented design is that the gain in all the channels can be adjusted in a very straightforward way, just by changing the four feedback resistors used in the amplifier. The difficulty is to achieve a very similar gain in all the channels, and figure 3.25 shows differences of less than 7% over the range of input amplitudes. In addition, it can be seen that the linearity of the ADA4927 is rather good. Regarding the bandwidth and how it affects the pulse width, some measurements of all the analog stages are shown in section 3.2.7.6.

Apart from the features discussed above, the Level 0 fan-out introduces a latency of only 3 ns, consumes 620 mW and is fairly inexpensive to manufacture, because the amplifier is the only expensive component (see appendix B).



Figure 3.25: Measured voltage gain for different input amplitudes in the 6 output branches of the Level 0 fan-out module

### 3.2.5 Level 1

Chapter 4 will be entirely devoted to explaining the Level 1 circuits in detail, as this is one of the main contributions of the author of this thesis to the Analog Trigger System of the CTA telescope cameras. In this section the Level 1 will therefore be described only from a functional point of view. In each cluster, the Level 1 receives the inputs from the Level 0 of the same cluster and from the neighbours. Then, it performs the addition of several combinations of these inputs and compares the outputs with a threshold voltage in an ADCMP604 LVDS comparator [102]. In this way, every combination corresponds to a trigger region, and several trigger regions are evaluated in the Level 1 of each cluster, achieving complete overlap of the trigger regions and thus detecting the possible shower photons whatever their position and orientation in the camera.

In order to have some flexibility to optimize the sensitivity for different sky brightness conditions, such as those found in observations of the galactic plane or of extragalactic objects, the size of the trigger regions can be selected through slow control. In this way, the Level 1 can work in three different modes, dubbed modes 2, 3 and 4, corresponding to trigger regions composed of all relevant combinations of 2, 3 or 4 clusters (14, 21 and 28 pixels) respectively [5]. Figure 3.26 shows the trigger regions evaluated by the cluster in the central position (numbered as “0” in the figure) for the different modes. It is important to understand that every cluster in the camera, except those at the borders, is the center of a set of seven clusters like the one shown in figure 3.26, and therefore evaluates two or three trigger regions depending on the mode. This means that, for a 265-cluster camera, there will be 795 (or 530 in mode 2) trigger regions being evaluated concurrently. This is a very important step forward compared with previous trigger systems, which had far fewer trigger regions. For instance, in MAGIC there are only 19 trigger regions with only some overlap between them (see figure 3.2). This improvement in both the number of trigger regions and their overlap is expected to allow the detection of more and weaker Cherenkov showers.
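The region counts quoted above follow from simple arithmetic, sketched below in Python. The mapping from mode to regions per cluster (two in mode 2, three in modes 3 and 4) is inferred from the totals given in the text, and, like the text, the sketch ignores the reduced number of regions at the camera borders. The names are hypothetical.

```python
# Arithmetic behind the number of trigger regions. Names are hypothetical and
# the per-mode region counts are inferred from the totals quoted in the text.

PIXELS_PER_CLUSTER = 7
REGIONS_PER_CLUSTER = {2: 2, 3: 3, 4: 3}  # Level 1 mode -> regions per cluster


def pixels_per_region(mode):
    """Number of pixels in a trigger region made of `mode` clusters."""
    return mode * PIXELS_PER_CLUSTER


def total_regions(n_clusters, mode):
    """Trigger regions evaluated concurrently, ignoring border effects."""
    return n_clusters * REGIONS_PER_CLUSTER[mode]


for mode in (2, 3, 4):
    print(mode, pixels_per_region(mode), total_regions(265, mode))
```

For a 265-cluster camera this gives 530 regions of 14 pixels in mode 2, and 795 regions of 21 or 28 pixels in modes 3 and 4.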



Figure 3.26: Level 1 working modes

It is also worth noting that performing only the additions of the clusters shown in figure 3.26 leaves no possible combination of 2, 3 or 4 compact clusters unevaluated. Figure 3.26 also shows that cluster number 4 does not participate in any addition performed by the central cluster. This is not a problem, because cluster 4 contributes to other trigger regions evaluated by other clusters, but it has an interesting consequence: the Level 0 output only needs to be sent to 5 out of 6 neighbours and to the local Level 1, so a 6-branch fan-out is enough instead of 7, as was shown in section 3.2.4.

The other main feature of the Level 1 presented here is that, as it only performs analog sums and comparisons with a threshold voltage, it is compatible with both the majority and sum Level 0 trigger mezzanines. In the former case, exceeding the threshold means that more than a certain number of pixels have fired in the trigger region, while in the latter it means that more than a certain number of photoelectrons have been collected. The adders are based on analog adder-inverters implemented with an AD8003 [106] operational amplifier and, as the linear dynamic range of this device extends from 0 to 2 V, the gain for all the inputs has been adjusted to 0.9, making it possible to count up to 247<sup>5</sup> photoelectrons (in the Sum trigger mode) or 28<sup>6</sup> fired pixels (in the Majority trigger mode). The DAC used to generate the threshold voltage always has a resolution better than 0.01 phe.

A complete characterization of the Level 1 mezzanines can be found in section 4.12. In summary, the added noise is only 1.4 mV RMS, the latency required to take the trigger decision is less than 9 ns and the power consumption is lower than 800 mW. Figure 3.27 shows the third version of the mezzanine, including the above-mentioned functionality. New requirements appeared later, which led to updated versions of the board with improved capabilities. These additional features will be explained in depth in chapters 5 and 6.

<sup>5</sup> $9 \frac{mV}{phe} \cdot 0.9 = 8.1 \frac{mV}{phe} \rightarrow \frac{2V}{8.1 \frac{mV}{phe}} = 247 phe$

<sup>6</sup> $80 \frac{mV}{pixel} \cdot 0.9 = 72 \frac{mV}{pixel} \rightarrow \frac{2V}{72 \frac{mV}{pixel}} = 27.7 \text{ pixels}$



Figure 3.27: Level 1 mezzanine, version 3, in real scale

### 3.2.6 Level 1 distribution

The output of the Level 1 is a digital LVDS signal which is sent to the cluster backplane. The function of the Level 1 distribution (developed by the Ciemat group), which is implemented in the backplane, is to transmit the trigger signal to the central backplane and then to the Trigger Interface Board, taking the same time whatever the position of the cluster which generated the trigger. Later, when the Trigger Interface Board sends a trigger command, the central backplane distributes the trigger to all the clusters, guaranteeing that it arrives at all of them at the same time, with an accuracy of 1 ns (and so stopping the digitizers at the same time).

The implementation of these functions would be straightforward if every cluster had a cable connecting it to the central one. However, in order to reduce the number of cables, the backplanes are only connected to their direct neighbours, so the trigger signal has to jump through several backplanes until it reaches the central one (figure 3.28). As the direction of each jump depends on the particular backplane, every backplane needs to know its position in the camera, expressed as an address. The address is programmable, so if one backplane is broken, it can be replaced by another one configured with the same address.

Figure 3.28: Level 1 distribution connection among neighbour clusters (left) and example of Level 1 distribution paths in the camera (right)

Regarding the delay adjustment, the minimum time required by the Level 1 trigger signal to reach the central cluster depends on the distance to the center and the number of jumps required, that is, on the position of the cluster which fired. So in order to equalize the delays, each backplane has to add an asynchronous delay equal to the difference between its own transmission delay and the latency of the outermost clusters in the camera. In this way, when a backplane receives a trigger from the Level 1 of its cluster, it waits for a certain time, which depends on the cluster position, before sending the trigger to the next backplane in the path [3]. Conversely, exactly the same happens in the trigger signal path starting from the TIB and ending in each specific cluster: the innermost backplanes receive the trigger command earlier than the outermost ones, so they have to wait for an additional delay, which depends on the cluster position, before sending the trigger to the readout subsystem.
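The delay equalization described above can be sketched as follows. The sketch assumes, purely for illustration, that every jump between neighbouring backplanes costs the same latency and that only the originating backplane adds the compensating delay; the per-jump latency of 10 ns, the cluster labels and the function names are hypothetical and are not taken from the real firmware.

```python
# Sketch of the Level 1 distribution delay equalization under the assumption
# of a constant per-jump latency. All names and values are hypothetical.


def equalising_delays(jumps_to_centre, jump_latency_ns):
    """Extra delay (ns) each backplane adds to triggers from its own cluster."""
    max_jumps = max(jumps_to_centre.values())
    return {
        cluster: (max_jumps - jumps) * jump_latency_ns
        for cluster, jumps in jumps_to_centre.items()
    }


def arrival_time_ns(cluster, jumps_to_centre, delays_ns, jump_latency_ns):
    """Time for a trigger from `cluster` to reach the central backplane."""
    return delays_ns[cluster] + jumps_to_centre[cluster] * jump_latency_ns


# Number of jumps from a few example clusters to the central one.
jumps = {"centre": 0, "inner": 1, "middle": 4, "edge": 9}
delays = equalising_delays(jumps, jump_latency_ns=10)
print(delays)

# Every trigger now reaches the centre after the same time.
print({c: arrival_time_ns(c, jumps, delays, 10) for c in jumps})
```

With these example numbers the central cluster waits 90 ns and the outermost one does not wait at all, so all triggers arrive after 90 ns. The same table of delays can be applied to the return path from the TIB to the readout of each cluster.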

The Level 1 distribution system is implemented in a low-cost Xilinx Spartan 6 FPGA [107], which provides the capability to introduce small asynchronous delays in its ports, including fine tap delays of only 25 ps. It is worth remembering that these delays must be asynchronous in order to keep the trigger time information in the leading edge of the trigger pulses. Otherwise, very high clock frequencies would be required. As can be seen in the block diagram of figure 3.29(a), the FPGA also allows regions of interest (ROIs) to be tagged, which can be used to implement partial camera readout algorithms (see chapter 5). Figure 3.29(b) shows the first Level 1 distribution prototype, implemented in a mezzanine which was connected to the Level 0 fan-out board which, in turn, was connected to the common functions backplane, forming a three-floor structure which was later merged into the integrated backplane shown in figure 3.12.

To reproduce the maximum number of jumps between the central cluster and any other one, a demonstrator test bench with 9 Level 1 distribution modules was prepared, as can be seen in figure 3.30. The trigger signal was sent from the first to the last one and back again to simulate the longest round trip in a camera. The result was that 169.4 ns were required for the round trip (figure 3.31(a)), with a jitter slightly lower than 1 ns (figure 3.31(b)). The mean value of the latency measured with the TDC is 173 ns instead of 170 ns due to the length of an additional cable.

In figure 3.31(a), it can be seen that the output pulse is wider than the input one. In fact the output width is programmable, but this is not very important, because the trigger time information is in the leading edge. The trailing edge has much more jitter, because the end of the pulse is synchronous with an internal clock in the FPGA.



Figure 3.29: Level 1 distribution block diagram and circuit implementation



Figure 3.30: Level 1 distribution test bench

### 3.2.7 Performance of the trigger system at camera level

The trigger subsystems described in the previous subsections are the only ones required to trigger the telescope cameras independently. According to the schedule of the CTA project, they should be ready before the stereo and array trigger modules, in order to test them in the first telescope prototypes, when the array will not yet exist. For this reason, the Level 0, Level 0 fan-out, Level 1 and Level 1 distribution systems have been tested exhaustively as independent modules, interconnected with each other, and with the NECTAr and DRAGON front-end boards, as shown in figure 3.32. The next subsections show the results of the integration tests performed at the University of Kyoto, the Laboratoire de Physique Nucléaire et des Hautes Énergies (LPNHE) in Paris and the UCM and Ciemat facilities in Madrid (the author participated in the tests performed at LPNHE, Ciemat and UCM), while chapters 4, 5, 6 and 7 present precise measurements of the functionalities and subsystems which are the core of this thesis.



Figure 3.31: Level 1 distribution measurements



Figure 3.32: Two test benches for integrated tests of trigger subsystems and front-end boards

### 3.2.7.1 Latency and Jitter

The measured latency of the complete trigger, between the arrival of the input pulse at the front-end board and the reception of the trigger pulse in the front-end FPGA, is around 220 ns (figure 3.33), with the contributions of the different subsystems broken down in table 3.1. This measurement was done with nine Level 1 distribution boards to simulate the longest possible delay, sending the trigger signal through all the Level 1 distribution boards and back again with 18 jumps, as was done in the characterization of the Level 1 distribution. The TIB was not taken into account in these measurements, because the delay it adds depends strongly on certain parameters which are not yet fixed (such as the distance between telescopes), and because the TIB was not available until recently.

As was mentioned in section 3.2.6, the width of the output pulses from the Level 1 distribution is programmable in order to make them easy to handle by the front-end FPGA. On the other hand, the falling edge of the distributed trigger signal looks thicker in figure 3.33 because this edge is synchronous with the FPGA clock and therefore suffers from jitter. In contrast, the leading edge is asynchronous and keeps the trigger timing information, introducing very low jitter.

It is also worth commenting on the delay contribution due to the routing of the Level 1 output through the front-end FPGA. It would indeed be faster to route the Level 1 output directly to the backplane. However, routing the signal through the FPGA makes it easy to perform rate scans (see section 3.2.7.2) and allows the cluster to be self-triggered, even without the Level 1 distribution subsystem, which is useful for tests.



Figure 3.33: Global trigger latency [3]

| Subsystem                                                               | Delay                             |
|-------------------------------------------------------------------------|-----------------------------------|
| Front-end analog input to Level 0 input                                 | 15.9 ns                           |
| Level 0                                                                 | 6.8 ns (Sum) or 6.1 ns (Majority) |
| From Level 0 to Level 0 fan-out                                         | 3 ns                              |
| Level 0 fan-out                                                         | 3 ns                              |
| From Level 0 fan-out output to Level 1 input                            | 3 ns                              |
| Level 1                                                                 | 8.3 ns                            |
| From Level 1 output to Level 1 distribution input (through the FE FPGA) | 7 ns                              |
| Level 1 distribution                                                    | 170 ns                            |
| From Level 1 distribution to the front-end FPGA                         | 3 ns                              |
| <b>Total latency</b>                                                    | <b>220 ns</b>                     |

Table 3.1: Delays of the trigger system
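The total in table 3.1 can be cross-checked by adding the individual contributions, as in the short Python sketch below. The delays are copied from the table; the dictionary layout and the function name are illustrative choices.

```python
# Cross-check of the latency budget in table 3.1 (all values in ns).
# The data layout and the function name are illustrative only.

COMMON_DELAYS_NS = {
    "front-end analog input to Level 0 input": 15.9,
    "Level 0 to Level 0 fan-out": 3.0,
    "Level 0 fan-out": 3.0,
    "Level 0 fan-out output to Level 1 input": 3.0,
    "Level 1": 8.3,
    "Level 1 output to Level 1 distribution input (through the FE FPGA)": 7.0,
    "Level 1 distribution": 170.0,
    "Level 1 distribution to the front-end FPGA": 3.0,
}
LEVEL0_DELAY_NS = {"sum": 6.8, "majority": 6.1}


def total_latency_ns(level0_mode):
    """Total trigger latency for the chosen Level 0 flavour, rounded to 0.1 ns."""
    total = sum(COMMON_DELAYS_NS.values()) + LEVEL0_DELAY_NS[level0_mode]
    return round(total, 1)


print(total_latency_ns("sum"))       # 220.0
print(total_latency_ns("majority"))  # 219.3
```

The 220 ns quoted in the table therefore corresponds to the Sum version of the Level 0; the Majority version is 0.7 ns faster, and the Level 1 distribution round trip accounts for most of the budget in both cases.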

### 3.2.7.2 Gain measurements with rate scans

The rate scan is a very useful test in any system which includes a comparator, such as the subsystems of the proposed trigger. It consists of generating bursts of pulses with a fixed input amplitude and increasing the comparator threshold, counting how many events are detected for each threshold value. The trigger efficiency is then computed as the ratio of accepted events to all the input events. If the noise level is low, the trigger system will change from detecting all the events to not detecting any within a narrow margin of a few millivolts.



Figure 3.34: Rate scan changing Level 1 threshold when only one pixel in cluster 0 is fired, with Level 0 majority

From the threshold level corresponding to 50% trigger efficiency, it is possible to determine the amplitude of the signal at the input of the comparator. This is very useful for checking the gain of the analog chain and for testing whether the additions are being done properly. In fact, measuring in this manner is more accurate than just placing an oscilloscope probe at the output of the adder, because it avoids the parasitic effects of the probe and measures exactly the amplitude which the comparator sees. In figure 3.34, for example, the threshold which detects 50% of the events is close to 72 mV, corresponding to $80 \frac{mV}{\text{fired-pixel}}$ at the output of the Level 0 multiplied by the gain of 0.9 in the Level 1.
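The shape of a rate scan and the meaning of the 50% point can be illustrated with a small analytical model in Python. It assumes that the only source of spread is Gaussian noise added to a fixed pulse amplitude, which is a simplification introduced here and not a statement about the real noise distribution. The 72 mV amplitude and the 2 mV noise are the values quoted in the text; the function names are hypothetical.

```python
import math

# Analytical rate-scan model: fixed pulse amplitude plus Gaussian noise.
# The Gaussian assumption and all names are illustrative only.


def trigger_efficiency(threshold_mv, amplitude_mv, noise_rms_mv):
    """Fraction of pulses whose noisy amplitude exceeds the threshold."""
    z = (threshold_mv - amplitude_mv) / (math.sqrt(2.0) * noise_rms_mv)
    return 0.5 * math.erfc(z)


def rate_scan(amplitude_mv, noise_rms_mv, thresholds_mv):
    """Return (threshold, efficiency) pairs for increasing thresholds."""
    return [(t, trigger_efficiency(t, amplitude_mv, noise_rms_mv))
            for t in thresholds_mv]


def threshold_at_half(scan):
    """Interpolate the threshold at which the efficiency crosses 50%."""
    for (t0, e0), (t1, e1) in zip(scan, scan[1:]):
        if e0 >= 0.5 >= e1 and e0 != e1:
            return t0 + (e0 - 0.5) * (t1 - t0) / (e0 - e1)
    raise ValueError("the scan does not cross 50% efficiency")


# One fired pixel: 80 mV/fired-pixel at Level 0 times the Level 1 gain of 0.9.
scan = rate_scan(amplitude_mv=72.0, noise_rms_mv=2.0, thresholds_mv=range(60, 85))
half = threshold_at_half(scan)
print(round(half, 1))        # 72.0
print(round(half / 0.9, 1))  # 80.0, amplitude referred to the Level 0 output
```

In this model the efficiency falls from nearly 100% to nearly 0% within a few times the noise RMS around the pulse amplitude, which is why a low-noise system shows a sharp transition.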

In the case of the sum trigger, figure 3.35 shows the rate scans measured with 2.5 ns wide pulses of 500 mV amplitude (25 phe) at the input of two, three or four pixels in one cluster, or with the same amplitude in one pixel of two, three or four different clusters. These measurements show that the gain of the complete analog path is approximately 0.4, because the threshold at which 50% of the 500 mV pulses are detected when two inputs are added is around 400 mV. When three or four inputs are added, the 50% thresholds are around 600 and 800 mV respectively, which highlights the linearity of the system. Comparing the results when the addition is made in the Level 0 (inputs from the same cluster) and in the Level 1 (inputs from different clusters), the 50% thresholds never differ by more than 7%. Ideally they should be compatible with each other; the small difference arises from the difficulty of adjusting exactly the same gain and latency in all the paths when these paths are long. Figure 3.35 also shows that the slope is sharper when the additions are done in the Level 0, because the signal-to-noise ratio is better due to the shorter distance that the signals travel before being added.
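The linearity argument above amounts to a one-line calculation, sketched here in Python. The function names are illustrative, and the chain is idealized as perfectly linear with the nominal gain of 0.4 quoted in the text.

```python
def expected_half_threshold_mv(n_inputs, amplitude_mv, gain):
    """50% threshold expected when n equal pulses are added by an ideal linear chain."""
    return n_inputs * amplitude_mv * gain

def relative_difference(a_mv, b_mv):
    """Relative mismatch between two measured 50% thresholds."""
    return abs(a_mv - b_mv) / max(a_mv, b_mv)

# 500 mV pulses (25 phe) added from two, three or four inputs, chain gain 0.4.
for n in (2, 3, 4):
    print(n, "inputs:", expected_half_threshold_mv(n, 500.0, 0.4), "mV")
```

The helper `relative_difference` can be applied to a pair of thresholds read from the Level 0 and Level 1 curves to check them against the 7% bound stated above.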



Figure 3.35: Rate scans corresponding to different additions of pixels and clusters

### 3.2.7.3 Noise

Rate scans are also useful for measuring the noise level, and therefore for determining the minimum detectable signal. To make this measurement, a rate scan is performed with all the inputs disconnected, during a fixed time window. While the threshold is below the noise level, the trigger system triggers all the time, but once the threshold exceeds the noise, no more triggers are detected. With this method a noise level of 2 mV has been measured at the input of the Level 1 comparators, for the sum trigger scheme.

Figure 3.36 shows the rate scans measured when there is only a $50\ \Omega$ load at the input (only noise, as explained in the previous paragraph) and when there are 2.5 ns wide pulses with amplitudes corresponding to 1, 2, 4 and 8 phe (20, 40, 80 and 160 mV respectively). For a certain range of thresholds it is possible to detect nearly 100% of the 1 phe inputs while there are almost no detections due to noise. This means that the system is sensitive enough to detect 1 phe, as required for the CTA LSTs and MSTs<sup>7</sup>. As mentioned in the previous subsection, the gain of the complete analog chain is 0.4, so the 50% thresholds for inputs of 1, 2, 4 and 8 photoelectrons should be at 8, 16, 32 and 64 mV, which is compatible with figure 3.36 [3].

<sup>7</sup>Complying with requirements C-LST-CAM.0433 and C-MST-CAM-NC-0184



Figure 3.36: Rate scan for input amplitudes corresponding to 8, 4, 2, 1 phe and only noise, when the Level 1 is working in mode 4

### 3.2.7.4 Attenuation and clipping

Level 0 sum trigger boards contain two other subsystems which can also be characterized with rate scans, namely the attenuators and the clipping circuits. In this case, for a fixed input amplitude and a fixed threshold (low enough to detect the signals when no attenuation or clipping is active), the complete range of attenuation or clipping values is tested, counting the number of events in each case. Figure 3.37 shows that both systems work as expected. By repeating this test for different input amplitudes and threshold values, the attenuators and the clipping circuits can be characterized.



Figure 3.37: Rate scans changing the attenuation or the clipping control voltages

Apart from actually clipping the pulses, the clipping circuit also introduces a certain attenuation which depends on the selected clipping voltage. To compensate for these variations in the attenuation, the channels must be calibrated by finding a setting of the variable attenuator that offsets the parasitic attenuation introduced by the clipping. Figure 3.38 shows the response of one calibrated channel for different clipping values. In this figure each line represents the gain of the trigger chain for a specific clipping value in photoelectrons<sup>8</sup>, taking the 50% threshold voltage from a rate scan as the output amplitude. When the input signal amplitude is lower than the clipping value, the channel response is quite linear, with only some degradation for low clipping values. On the other hand, when the input signal is higher than the clipping voltage, the clipping works as expected, introducing attenuation. The figure shows that the clipping is not able to fix the output amplitude to the clipping voltage for inputs which exceed the clipping value, allowing higher output amplitudes for large input pulses. However, this is not critical: the large afterpulses are strongly attenuated, which is the real purpose of the clipping circuit, while the low-amplitude pulses caused by the Cherenkov photons are not affected.



Figure 3.38: Level 0  $V_{out}$  vs  $V_{in}$  in a calibrated channel, for different clipping values, in photoelectrons

### 3.2.7.5 Dynamic range

The specification of the LSTs and MSTs requires the trigger system to be able to distinguish between 0.25 and 200 phe with the sum trigger scheme. The lower limit is given by the noise power in the Level 1 comparators, while the upper limit is related to the saturation of the amplifiers. In this way, if there are 20 mV/phe<sup>9</sup> at the input of Level 0 and the amplifier used in the Level 1 adder saturates at voltages higher than 2 V, adjusting the gain of the trigger analog chain to 0.4 gives 8 mV/phe at the input of the comparator, allowing up to 247 phe to be detected without saturation. In the same way, a noise level of 2 mV corresponds to 0.25 phe, meeting the specification.

<sup>8</sup>It can be converted to mV simply by multiplying by $8.1\ \frac{\text{mV}}{\text{phe}}$

<sup>9</sup>Requirements C-LST-CAM.0429 for LSTs and C-MST-CAM-NC-0180 for MSTs

With a majority strategy, the objective is to distinguish between 0 and 28 fired pixels. As can be seen in figure 3.34, Level 1 detects each fired pixel with a threshold of 72 mV (well above the noise level), so 28 fired pixels will be detected with a 2016 mV threshold, making full use of the 2 V dynamic range.
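The dynamic-range figures quoted in this subsection can be reproduced with a short Python sketch. All constants come from the text; the variable names are illustrative. Note that the quoted upper limit of 247 phe appears to correspond to the calibrated 8.1 mV/phe of footnote 8, whereas the nominal 8 mV/phe would give 250 phe; this reading is an inference, not a statement from the source.

```python
INPUT_MV_PER_PHE = 20.0      # at the Level 0 input
CHAIN_GAIN = 0.4             # complete analog trigger chain
SATURATION_MV = 2000.0       # Level 1 adder amplifier saturation
NOISE_MV = 2.0               # measured at the Level 1 comparators
MV_PER_FIRED_PIXEL = 72.0    # majority scheme, from figure 3.34
CALIBRATED_MV_PER_PHE = 8.1  # conversion factor given in footnote 8

mv_per_phe = INPUT_MV_PER_PHE * CHAIN_GAIN                   # nominal comparator scale
min_phe = NOISE_MV / mv_per_phe                              # lower limit set by noise
max_phe_nominal = SATURATION_MV / mv_per_phe                 # upper limit, nominal scale
max_phe_calibrated = SATURATION_MV / CALIBRATED_MV_PER_PHE   # upper limit, calibrated scale
majority_full_scale_mv = 28 * MV_PER_FIRED_PIXEL             # threshold for 28 fired pixels

print(f"{mv_per_phe:.1f} mV/phe, {min_phe:.2f} to {max_phe_calibrated:.0f} phe")
print(f"28 fired pixels -> {majority_full_scale_mv:.0f} mV")
```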

### 3.2.7.6 Pulse width

The bandwidth of the trigger analog chain (up to the L1 comparator) should be high enough to avoid stretching the typical 2.5 ns wide pulses from the PMTs. If the pulses were square, a bandwidth of 400 MHz would be required to keep at least the first lobe of the spectrum undistorted. As the pulses are not square but Gaussian, the bandwidth can be somewhat lower, provided that it does not stretch the pulses. If the pulses were stretched, the probability of adding NSB-induced pulses to the pulses generated by the Cherenkov photons would increase in proportion to the pulse width, which would reduce the performance of the trigger system.
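The two bandwidth figures discussed above can be estimated with the following Python sketch. It assumes that the quoted 2.5 ns is the full width of a rectangular pulse in the first case and the FWHM of an ideal Gaussian pulse in the second; both interpretations, and the function names, are assumptions made for illustration.

```python
import math

def first_null_hz(width_s):
    """Frequency of the first spectral null of a rectangular pulse of the given width."""
    return 1.0 / width_s

def gaussian_3db_bandwidth_hz(fwhm_s):
    """Frequency at which the spectrum of a Gaussian pulse falls to 1/sqrt(2) of its peak."""
    return math.sqrt(2.0) * math.log(2.0) / (math.pi * fwhm_s)

width_s = 2.5e-9
print(f"square pulse, first lobe: {first_null_hz(width_s) / 1e6:.0f} MHz")
print(f"Gaussian pulse, -3 dB:    {gaussian_3db_bandwidth_hz(width_s) / 1e6:.0f} MHz")
```

The Gaussian estimate only describes where the pulse spectrum itself has dropped by 3 dB; a chain with exactly that bandwidth would still stretch the pulse, so it should be read as a lower reference rather than a design target.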



Figure 3.39: Pulse width measured at different points in the trigger chain

Figure 3.39 shows the pulses measured with an oscilloscope at different points of the testbench with the sum trigger and the Dragon front-end board. The measurements show an increase in the pulse width from 1.92 ns at the output of the PMT to 3.56 ns at the end of the analog chain, just before the comparators in the Level 1. There are several reasons for this increase. First, the amplifiers that the Dragon front-end board uses to adapt and replicate the input signals for the readout and trigger branches are short of bandwidth. To solve this problem, these amplifiers will be replaced by others with higher bandwidth in the next Dragon FEB version. The next stretching occurs in the Level 0 and Level 0 fan-out modules. Some of this widening is due to imperfect matching in the mezzanine and backplane connectors, and it will be solved by integrating the mezzanines. In addition, the Level 0 had many delay stages with switches due to the implementation of the delay compensation system (see chapter 6), making it difficult to obtain a good bandwidth. Finally, the stretching due to the Level 1 is negligible<sup>10</sup>. According to the measurements shown in figure 3.39 it seems that the Level 1 makes the pulses shorter, but this is an artefact of the measurement caused by imperfect probe positioning. It should also be taken into account that the tests shown in figure 3.39 were performed with a PMT with particularly short pulses, while the PMTs specified for the LSTs produce pulses 2.6 ns wide. In any case, reducing the pulse stretching is one of the motivations for the future improvements (see chapter 8).
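As a rough way to reason about the measured stretching, the sketch below treats both the pulse and the impulse response of the analog chain as Gaussian, in which case their widths add in quadrature. This is a simplifying assumption, not a result from the thesis, and the extrapolation to a 2.6 ns input is purely illustrative rather than a measurement.

```python
import math

def chain_response_width_ns(width_in_ns, width_out_ns):
    """Width of the chain response implied by input and output widths (Gaussian assumption)."""
    return math.sqrt(width_out_ns ** 2 - width_in_ns ** 2)

def output_width_ns(width_in_ns, chain_width_ns):
    """Output width for a given input width and chain response width (Gaussian assumption)."""
    return math.hypot(width_in_ns, chain_width_ns)

chain_ns = chain_response_width_ns(1.92, 3.56)  # widths measured in figure 3.39
print(f"implied chain response width: {chain_ns:.2f} ns")
print(f"estimated output for a 2.6 ns pulse: {output_width_ns(2.6, chain_ns):.2f} ns")
```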

### 3.2.7.7 Power consumption

Keeping the power consumption to a minimum is essential for systems like the presented trigger, which is placed inside the camera. The main problem is not the power consumption itself, but the cooling system required to remove the heat dissipated by the electronics [108]. In this sense, the larger the power consumption, the bigger and heavier the cooling system must be, and the weight of the camera is limited by the mechanical characteristics of the telescope. Table 3.2 gathers the power dissipated in the different elements of the trigger system, for one cluster:

| Subsystem                              | Power (W)                                |
|----------------------------------------|------------------------------------------|
| Level 0                                | 2.7 W (Sum) or 1.33 W (Majority)         |
| Level 0 fan-out                        | 0.62 W                                   |
| Level 1                                | 1.4 W                                    |
| Level 1 distribution                   | 0.8 W                                    |
| <b>Maximum total power consumption</b> | <b>5.52 W (Sum) or 4.15 W (Majority)</b> |

Table 3.2: Power consumption for one cluster
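The totals in table 3.2 can be checked with a few lines of Python. The values are those of the table; the dictionary and function names are illustrative.

```python
# Power per cluster in watts, taken from table 3.2.
LEVEL0_W = {"sum": 2.7, "majority": 1.33}
COMMON_W = {"Level 0 fan-out": 0.62, "Level 1": 1.4, "Level 1 distribution": 0.8}

def cluster_power_w(scheme):
    """Total trigger power for one cluster with the given Level 0 scheme."""
    return LEVEL0_W[scheme] + sum(COMMON_W.values())

def level0_fraction(scheme):
    """Share of the cluster trigger power dissipated in the Level 0."""
    return LEVEL0_W[scheme] / cluster_power_w(scheme)

for scheme in ("sum", "majority"):
    print(f"{scheme}: {cluster_power_w(scheme):.2f} W, "
          f"Level 0 share {100 * level0_fraction(scheme):.0f}%")
```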

It can be seen that the element which consumes the most power is the Level 0, especially in the sum trigger version. This is mainly due to the clipping stage, which uses very power-demanding BJTs in the active region [104]. As the clipping stage is used to reduce the effect of the afterpulses, reducing the afterpulsing in the PMTs would allow the clipping stage to be removed, thus saving a considerable fraction of the power<sup>11</sup>.

### 3.2.7.8 Summary of camera trigger parameters

The measurements of the main parameters are summarized in table 3.3:

<sup>10</sup>In the initial versions, the Level 1 actually stretched the pulses, but after some detailed characterization, the problem was solved. See section 4.12.3

<sup>11</sup>Nevertheless, reducing the afterpulsing without worsening other PMT features is not easy. See [109]

| Parameter                                | Value                             |
|------------------------------------------|-----------------------------------|
| Gain of trigger chain                    | 0.4                               |
| Sum trigger dynamic range                | 0.25 to 247 phe                   |
| Majority dynamic range                   | 0 to 28 pixels                    |
| Noise                                    | 2 mV (0.25 phe in sum trigger)    |
| Latency                                  | 220 ns                            |
| Jitter                                   | < 1 ns                            |
| Minimum pulse width at the L1 comparator | 3.56 ns                           |
| Power consumption                        | 5.52 W (Sum) or 4.15 W (Majority) |

Table 3.3: Summary of camera analog trigger parameters

### 3.2.8 Array trigger

The array level trigger in CTA has several implementations. In the case of telescopes with long memory buffers like the LSTs, it is possible to implement hardware schemes like the one described in section 3.1.3, where a hardware module decides whether an event stored in the analog memory buffer should be digitized or not. On the other hand, in the case of telescopes with short memory buffers, like the SSTs or the MSTs based on NECTAr, it is not possible to keep the data in the analog memories during the time required to make the decision. In these cases a software scheme is used, deciding whether the digitized data should be stored to disk or not.

#### 3.2.8.1 LST hardware stereo trigger

The LST hardware stereo trigger function, developed by the author of this thesis, will be implemented in the trigger interface boards of the LSTs. Its function consists of looking for coincident camera triggers in the LSTs, inside a time window of around 50 ns. If more than one telescope has triggered inside the coincidence window, it is likely that these telescope triggers were fired by a Cherenkov shower rather than by NSB, so the event deserves to be digitized and stored. In this way, many NSB events are discarded before being digitized, avoiding a large fraction of dead time. This allows lower Level 1 thresholds to be used, and thus lower-energy $\gamma$-ray showers to be detected, with a manageable trigger rate which would be much higher without the stereo trigger system due to the NSB events. In summary, the stereo trigger allows more low-energy showers observed by several telescopes to be digitized, rather than losing time storing NSB. The hardware stereo trigger implementation is explained in depth in chapter 7, so only the most relevant ideas about the stereo trigger functionality are presented here.

There will be one trigger interface board in the camera of each LST looking for stereo coincidences. Typically there will be four LSTs forming a square of 100 m side as in figure 3.40.

All the TIBs will send their local camera (i.e., Level 1) triggers to the other LSTs, so that all of them have the information required to look for coincidences. However, the local trigger pulses from the LSTs take different amounts of time to reach the neighbours. As the coincidences must be checked at synchronized times in all the telescopes, the inputs to the stereo trigger must be delayed so that every path matches the delay of the longest one. The time to compensate includes not only the different lengths of the optical fibers between telescopes, but also the fiber length between the camera and the LST base (around 100 m). In addition, the different time of flight of the Cherenkov photons depending on the pointing angle (see figure 3.41) must be taken into account. This last contribution is the most challenging, because it means that the compensating delays for the different inputs must be updated every time the pointing direction changes, which happens continuously when tracking a source.

Figure 3.40: Typical LST layout

The sum of all the delay contributions amounts to around 2500 ns between the Level 1 trigger generation in a cluster and the beginning of the digitization, still well below the 4096 ns of signal stored in the analog buffer of a Dragon front-end board. It is worth mentioning that the readout must occur a fixed time after the local trigger and not just when the coincidence happens, thus guaranteeing that the pulses corresponding to the shower image will be in the recorded samples.
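The pointing-dependent part of the delay compensation described above can be sketched in Python as follows. This is only a geometric illustration, not the implementation of chapter 7: it assumes a plane Cherenkov wavefront travelling at the vacuum speed of light, telescopes at the corners of a flat 100 m square, and an arbitrary azimuth convention (azimuth 0 along the x axis). Fiber delays would have to be added on top of these values.

```python
import math

C_M_PER_NS = 0.299792458  # speed of light in vacuum, metres per nanosecond

def pointing_unit_vector(zenith_deg, azimuth_deg):
    """Unit vector towards the source (azimuth measured from the x axis, assumed convention)."""
    zen = math.radians(zenith_deg)
    az = math.radians(azimuth_deg)
    return (math.sin(zen) * math.cos(az), math.sin(zen) * math.sin(az), math.cos(zen))

def compensation_delays_ns(positions_m, pointing):
    """Delay to add to each telescope so that all align with the one reached last."""
    # A telescope with a larger projection along the pointing direction is hit earlier.
    arrival = [-sum(p * u for p, u in zip(pos, pointing)) / C_M_PER_NS
               for pos in positions_m]
    latest = max(arrival)
    return [latest - t for t in arrival]

# Four LSTs on a square with 100 m sides (hypothetical coordinates).
LST_POSITIONS_M = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0),
                   (100.0, 100.0, 0.0), (0.0, 100.0, 0.0)]
delays = compensation_delays_ns(LST_POSITIONS_M, pointing_unit_vector(30.0, 0.0))
print([round(d, 1) for d in delays])
```

Because the result depends on the pointing vector, the delays have to be recomputed whenever the telescopes move, which is the difficulty highlighted in the text.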

Apart from the time compensation and the coincidence logic, the trigger interface board implements other functions related to collecting triggers from other sources, such as the calibration box (for calibration triggers and pedestals) or the central array counting house. The trigger interface board and the stereo trigger function are described in depth in chapter 7.

### 3.2.8.2 Software array trigger

In a telescope array as large as CTA, not all the IACTs can participate in a hardware stereo trigger scheme like the one described in section 3.2.8.1. The distance between telescopes positioned at opposite ends can exceed 1 km, which would require prohibitively long memory buffers. However, in order to reduce the array data rate to a manageable size, and thus also the huge amount of data to be stored, a software array trigger can be used, as sketched in figure 3.42.

In order to implement such a software array trigger, whenever a camera generates a trigger command, it sends a short data packet with the time stamp of the event to a central array trigger unit. In addition, around $1\ \mu\text{s}$ later, once the full camera image is digitized, it is not immediately written to disk but held in temporary RAM in a telescope computer server. Meanwhile, the array trigger unit analyzes the time stamps coming from all the telescopes in the array, looking for temporal coincidences between neighbouring IACTs in a time window of a few tens of nanoseconds. If a coincidence occurs, the array trigger unit sends the telescope server the command to write the data to disk; otherwise the data are discarded after a certain time [110]. It is worth mentioning that these RAM buffers are large enough to store several camera events (which are recorded at a maximum rate of 10 kHz) before the array trigger command arrives. The only requirement for this scheme to work is a well-synchronized clock in all the telescopes, so that the time stamps are consistent.

Figure 3.41: Different distances the Cherenkov photons have to go through, depending on the pointing direction
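The time stamp matching performed by the array trigger unit can be illustrated with a minimal Python sketch. It is not the CTA implementation: the event format, the symmetric `neighbours` map, the telescope names and the 50 ns window are all hypothetical choices made for the example.

```python
def select_coincident_events(events, window_ns, neighbours):
    """Return the indices of events that coincide with an event in a neighbouring telescope.

    events: list of (telescope_id, timestamp_ns) tuples.
    neighbours: dict mapping a telescope id to the set of its neighbours (assumed symmetric).
    """
    order = sorted(range(len(events)), key=lambda i: events[i][1])
    keep = set()
    start = 0
    for pos, i in enumerate(order):
        tel_i, t_i = events[i]
        # Drop events that are too old to fall inside the coincidence window.
        while events[order[start]][1] < t_i - window_ns:
            start += 1
        for j in order[start:pos]:
            if events[j][0] in neighbours.get(tel_i, ()):
                keep.update((i, j))
    return keep

events = [("T1", 1000), ("T2", 1030), ("T3", 5000), ("T1", 9000), ("T4", 9020)]
neighbours = {"T1": {"T2"}, "T2": {"T1", "T3"}, "T3": {"T2", "T4"}, "T4": {"T3"}}
to_store = select_coincident_events(events, window_ns=50, neighbours=neighbours)
print(sorted(to_store))
```

In this example only the first two events are kept: the last two are close in time but come from telescopes that are not neighbours, and the isolated T3 event has no partner at all.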

This software array trigger can help to reduce the total amount of data stored, but it is important to highlight that, as it works with the already digitized data, it helps neither to reduce the dead time nor to use lower thresholds. In the same way, as the software array trigger only considers the triggers which caused the digitization of an event, the hardware and software array trigger implementations can work together. Some additional rules could be required, like storing to disk all the events which caused an LST hardware stereo trigger, if the hardware stereo triggers are directly considered good enough to be stored. The array trigger is currently being developed by the Array Control Working Package inside the CTA Consortium.



Figure 3.42: Array trigger scheme [110]

# Chapter 4

## The Level 1

As introduced in section 3.2.5, the function of the Level 1 is to evaluate several trigger regions composed of different combinations of neighbouring clusters, generating a trigger output signal if the trigger condition is satisfied. This Level 1 trigger system has been fully designed, prototyped and tested by the author of this thesis. To perform this function, it receives six analog differential input signals coming from the Level 0 outputs of the local cluster and of its neighbours. These analog inputs are adapted and scaled, and then several combinations are added in the analog domain and compared with a threshold voltage. If a sum exceeds the threshold, the Level 1 will trigger; otherwise it will not.

Three different sets of combinations (i.e. trigger regions) can be added in each Level 1, selectable through slow control, defining three working modes as presented in figure 4.1. In this way, it is possible to select trigger regions with 14, 21 or 28 pixels (2, 3 or 4 clusters), which provides enough flexibility to enhance the sensitivity to showers of a certain size in environments with different levels of NSB. The presented Level 1 trigger system is the first one in the history of IACTs to provide this adjustable trigger region size.



Figure 4.1: Level 1 working modes [5]

It is also worth mentioning that the meaning of the L1 analog input signals (photoelectrons or fired pixels, depending on the Level 0 scheme) is transparent to the Level 1, which simply adds analog signals and compares the sums with a threshold. In the following sections the electronic implementation of the Level 1 trigger will be described.

## 4.1 General scheme



Figure 4.2: Level 1 general scheme

Figure 4.2 shows the overall architecture of the Level 1 trigger system, so it can be used as a reference for the following sections. First, the differential analog inputs from the Level 0 of the local cluster and from the neighbouring clusters are scaled and transformed into single-ended signals, to operate with them in a simpler way. Then, the single-ended outputs are split into two or three branches, or not divided at all but attenuated, depending on the specific channel and trigger region to be evaluated. For example, by looking at figure 4.1 it is easy to see that the input from cluster 0 can participate in up to three sums in modes 2 or 4, while the input from cluster 5 only takes part in one sum in mode 4. So, the input from cluster 0 must be divided into 3 branches for 3 adders, while the input from cluster 5 only has to be attenuated to equalize the losses suffered by all the input signals. Additionally, from figure 4.1 it also follows that channel 4 does not take part in any combination, and consequently, it is terminated in figure 4.2.

Whatever the selected working mode (2, 3 or 4), there are never more than 3 sums required, so all the sums for all the working modes can be performed in a cost-effective way with only three four-input adders. For example, when mode 4 is selected, all the adder inputs are connected. On the other hand, if any other mode is active, some of the adder inputs are set to ground with a switch, so only the required signals reach the adder inputs. These signals are then added and scaled in the adders themselves, which finally send the sums to the comparators. There, the three sums are compared with a threshold previously set by the DAC.

The outputs of the comparators, already digital and differential, are finally sent to an OR gate which provides one differential trigger output which will be distributed throughout the camera, activating the readout of all the clusters.

In the following sections the different stages of the Level 1 trigger system are explained in more detail.

## 4.2 Differential to single-ended converter

The purpose of this stage is to transform the six differential analog inputs into single-ended signals, and to amplify and invert them in preparation for the next stages. To do so, six traditional subtractor circuits have been used, like the one shown in figure 4.3, where $V_{out} = (V_{in}^- - V_{in}^+) \cdot \frac{R_f}{R_{in}}$. $R_m$ is only required for impedance matching, and the output capacitor removes any possible offset due to parasitic effects in the amplifier.



Figure 4.3: Schematic of the differential to single-ended stage

The AD8003 operational amplifier from Analog Devices Inc. [106] has been chosen for its large bandwidth ($\approx 1$ GHz) and slew rate (3800 V/$\mu$s), which make it suitable for handling fast pulses. Moreover, as every single chip contains three amplifiers, it is possible to perform all the subtractions with only two chips, in a compact and cost-effective way. Nevertheless, this amplifier, with a $\pm 3.3$ V power supply, saturates at 2 V, which limits the maximum possible gain at this stage and must be taken into account to define the dynamic range.

Three possible gains are defined in the differential to single-ended stage, depending on how many sums the channel participates in. Thus, the gain for channels 0 and 1, which take part in up to three additions, is higher than for channels 2 and 6, which can only take part in a maximum of two sums. At the same time, the gain for channels 2 and 6 is higher than for channels 3 and 5, which can only take part in one sum. Table 4.1 shows the number of sums in which each channel participates depending on the working mode, the values of $R_f$ and $R_{in}$, and the corresponding gain of the differential to single-ended stage, in linear units.

| CH | Contributions in Mode 2 | Contributions in Mode 3 | Contributions in Mode 4 | $R_f(\Omega)$ | $R_{in}(\Omega)$ | Gain |
|----|-------------------------|-------------------------|-------------------------|---------------|------------------|------|
| 0  | 3                       | 2                       | 3                       | 300           | 200              | 1.5  |
| 1  | 1                       | 2                       | 3                       | 300           | 200              | 1.5  |
| 2  | 1                       | 1                       | 2                       | 300           | 261              | 1.15 |
| 3  | 1                       | 0                       | 1                       | 300           | 300              | 1    |
| 5  | 0                       | 0                       | 1                       | 300           | 300              | 1    |
| 6  | 0                       | 1                       | 2                       | 300           | 261              | 1.15 |

Table 4.1: Contributions to the sums for the different channels and modes, and gains in the differential to single-ended stage
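The gains of table 4.1 follow directly from the subtractor expression $V_{out} = (V_{in}^- - V_{in}^+) \cdot R_f / R_{in}$. The Python sketch below reproduces them for an ideal amplifier; the resistor values are those of the table, while the names and the example input voltages are illustrative.

```python
RF_OHM = 300.0
# Input resistor per channel, from table 4.1 (channel 4 is not used).
RIN_OHM = {0: 200.0, 1: 200.0, 2: 261.0, 3: 300.0, 5: 300.0, 6: 261.0}

def subtractor_output_mv(v_minus_mv, v_plus_mv, channel):
    """Single-ended output of the ideal subtractor for one channel."""
    return (v_minus_mv - v_plus_mv) * RF_OHM / RIN_OHM[channel]

gains = {ch: RF_OHM / rin for ch, rin in RIN_OHM.items()}
for ch, gain in gains.items():
    print(f"channel {ch}: gain {gain:.2f}")

# A 20 mV differential pulse (V+ above V-) on channel 0 gives an inverted output.
print(subtractor_output_mv(-10.0, 10.0, 0), "mV")
```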

## 4.3 Wilkinson splitters

The splitters used to divide the signals between two or three branches must achieve equal power division between the outputs, the lowest possible losses and good impedance matching over a bandwidth wide enough to deal with the fast pulses (around 400 MHz). In classic microwave engineering this can be done with a Wilkinson divider [111]. However, the response of a typical Wilkinson divider is bandpass, and its branches have a length of $\lambda/4$, where $\lambda$ is the wavelength at the central frequency. At 400 MHz, this length is longer than 10 cm, which is too large to be integrated in the mezzanine.

The solution to this problem consists of substituting the $\lambda/4$ lines by their $\pi$ equivalent circuits with lumped elements, as sketched in figure 4.4 [112]. This change drastically reduces the space required for the circuit and, even more importantly, makes the response of the divider lowpass, which is exactly the characteristic required for the Wilkinson splitters of the Level 1. In this way two splitters were designed, with two and three output branches. The electrical schematics and the corresponding component values are presented in figure 4.5.
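Both the length argument and the lumped substitution can be explored numerically with the Python sketch below. It relies on textbook relations: a quarter-wave line of impedance $Z$ at angular frequency $\omega$ is equivalent, at that frequency, to a $\pi$ network with a series inductor $L = Z/\omega$ and two shunt capacitors $C = 1/(\omega Z)$, and the branches of an N-way Wilkinson divider have impedance $Z_0\sqrt{N}$. The effective permittivity of 3.4 and the 400 MHz design frequency are assumptions made for illustration, so the printed component values are not those of figure 4.5.

```python
import math

C_M_PER_S = 299_792_458.0

def quarter_wave_length_m(freq_hz, eps_eff):
    """Physical length of a quarter-wave line for a given effective permittivity."""
    return C_M_PER_S / (4.0 * freq_hz * math.sqrt(eps_eff))

def wilkinson_branch_impedance_ohm(n_outputs, z0_ohm=50.0):
    """Characteristic impedance of each quarter-wave branch of an N-way Wilkinson divider."""
    return z0_ohm * math.sqrt(n_outputs)

def lumped_quarter_wave_pi(z_line_ohm, freq_hz):
    """Series inductance (H) and each shunt capacitance (F) of the equivalent pi network."""
    omega = 2.0 * math.pi * freq_hz
    return z_line_ohm / omega, 1.0 / (omega * z_line_ohm)

F0_HZ = 400e6   # hypothetical design frequency
EPS_EFF = 3.4   # assumed effective permittivity of a microstrip line on FR4

print(f"quarter wave in air:   {100 * quarter_wave_length_m(F0_HZ, 1.0):.1f} cm")
print(f"quarter wave on board: {100 * quarter_wave_length_m(F0_HZ, EPS_EFF):.1f} cm")
for n in (2, 3):
    z_branch = wilkinson_branch_impedance_ohm(n)
    l_h, c_f = lumped_quarter_wave_pi(z_branch, F0_HZ)
    print(f"{n} branches: Z = {z_branch:.1f} ohm, L = {l_h * 1e9:.1f} nH, C = {c_f * 1e12:.2f} pF")
```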

Before integrating the splitters in the Level 1, a couple of prototypes were manufactured, using 1.5 mm thick FR4 substrate, with  $\epsilon_r = 4.5$ . Figure 4.6 shows photographs of the two prototypes.

The prototypes were measured with a vector network analyzer at the facilities of the Microwaves and Radar Group (GMR) at ETSIT-UPM, obtaining the measurements shown in figure 4.7 [6]. As can be seen in that figure, the frequency response is rather good up to 1 GHz and 800 MHz for the two- and three-branch splitters respectively. Above those frequencies, $|S_{21}|$ and $|S_{31}|$ fall sharply, attenuating the signal. As most of the power of the pulses is concentrated at lower frequencies, this response is advantageous, as it filters high-frequency noise. On the other hand, the impedance matching is somewhat worse than expected due to the effects of the connectors, and the losses are higher, although this last problem can be compensated by increasing the gain in the differential to single-ended stage or in the adder stage. It is worth mentioning that the Wilkinson splitters need to have all their outputs terminated with $50\ \Omega$; otherwise the matching becomes worse and the power is not equally distributed.

Figure 4.4: Conversion of a Wilkinson splitter made with distributed elements into one made with lumped components

Figure 4.5: Wilkinson splitters for the L1 Trigger system

(a) 2 branches Wilkinson splitter prototype

(b) 3 branches Wilkinson splitter prototype

Figure 4.6: Photographs of the Wilkinson splitter prototypes [6]

(a) Measurements of the 2 branches Wilkinson splitter prototype

(b) Measurements of the 3 branches Wilkinson splitter prototype

Figure 4.7: Measurements of the Wilkinson splitter prototypes [6]

Once the prototypes were validated, the same circuits of figure 4.5 were included in the Level 1. In this case, the absence of SMA connectors and the use of a substrate only 0.13 mm thick improve the performance of the splitters.

## 4.4 Attenuators

Recalling figure 4.2, it can be seen that inputs 3 and 5 do not require any division, which means that they suffer much less attenuation. Although part of this difference has been compensated by adjusting the gains in the differential to single-ended stage, the difference between the channels which are split and the ones which are not is still too large. To compensate for it, channels 3 and 5 are attenuated with a simple resistive attenuator like the one shown in figure 4.8.



Figure 4.8: Attenuator schematic

## 4.5 Switching network

Once all the required signals are available for the sums, a smart distribution of them is required in order to minimize the number of components and to simplify the control logic, which means at the same time reducing the cost and the power consumption. Table 4.2 shows this smart distribution and the control logic.

|         | Adder 1 |     |       |       | Adder 2 |       |       |     | Adder 3 |       |       |     |
|---------|---------|-----|-------|-------|---------|-------|-------|-----|---------|-------|-------|-----|
| Mode 4  | 0       | 1   | 2     | 6     | 0       | 1     | 2     | 3   | 0       | 1     | 5     | 6   |
| Mode 3  | 0       | 1   | 2     | GND   | GND     | GND   | GND   | GND | 0       | 1     | GND   | 6   |
| Mode 2  | 0       | GND | 2     | GND   | 0       | GND   | GND   | 3   | 0       | 1     | GND   | GND |
| Control | Wired   | A1  | Wired | A1&A0 | A0      | A1&A0 | A1&A0 | A0  | Wired   | Wired | A1&A0 | A1  |

Table 4.2: Distribution of the input channels among the adders and control logic for working-mode switching

As there are never more than 3 sums required at the same time, one can therefore perform all the sums of the different modes with only three adders which can be implemented in a compact way with only one AD8003 chip. Mode 4 forces channels 0, 1, 2 and 6 to be connected to a first adder, channels 0, 1, 2 and 3 to a second one and finally, channels 0, 1, 5 and 6 to the last one.

Following the channel distribution of table 4.2, it is possible to obtain the sums required for the other modes simply by connecting or disconnecting some of the input signals to the adders. It is remarkable that it is not necessary to exchange channels at any input of the adders, but only to let the input signal reach the adder or not. Moreover, four channels are hard-wired, so they are always connected without any additional intermediate component. For the others, which can be connected or not, the switching has been implemented with eight ADG901 single-pole single-throw (SPST) chips, which allow us to select between connecting their inputs and outputs or setting both to ground through $50\ \Omega$ resistances, as sketched in figure 4.9. In this way, the splitters are always loaded with $50\ \Omega$, as required to avoid mismatching and signal distortion. In addition, insertion losses in the ADG901 chips are negligible, the bandwidth is around 1 GHz and they consume hardly any power [113]. The only drawback of this switch is that it requires a power supply between 1.65 V and 2.75 V, instead of +3.3 V like most of the components in the Level 1. Fortunately, the front-end board uses +2.5 V for other purposes, and it can also be used to power these switches.



Figure 4.9: ADG901 block diagram [113]

As regards the logic to control the working mode, the selected mode is encoded with two bits, A0 and A1, as shown in table 4.3. Thus, obtaining the control signals for the switches is as simple as connecting them directly to A0, to A1 or to the logic AND of both, as shown in the last row of table 4.2. Besides, mode 0 is a special mode that sets the whole Level 1 into calibration state to perform the delay compensation algorithm described in chapter 6. It is important to clarify that A0, A1 and the control inputs of the switches are standard CMOS signals which are managed using ordinary electronics. To be precise, the A0 and A1 signals are directly driven by the front-end board FPGA.

|               | A1 | A0 |
|---------------|----|----|
| <b>Mode 0</b> | 0  | 0  |
| <b>Mode 2</b> | 0  | 1  |
| <b>Mode 3</b> | 1  | 0  |
| <b>Mode 4</b> | 1  | 1  |

Table 4.3: Mode codification
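The encoding of table 4.3 and the switch-control rule described above can be sketched in a few lines of Python. This is only an illustration: the names `MODE_BITS` and `control_signals` are hypothetical, and the assignment of each ADG901 to A0, A1 or their AND is the one given in table 4.2, so no per-switch wiring is assumed here.

```python
# Mode codification from table 4.3: mode -> (A1, A0).
MODE_BITS = {0: (0, 0), 2: (0, 1), 3: (1, 0), 4: (1, 1)}


def control_signals(mode):
    """Return the three logic levels available to drive the switches.

    Each switch control input is wired to A0, to A1 or to their logic AND;
    which switch uses which signal is defined in table 4.2 (not modelled here).
    """
    a1, a0 = MODE_BITS[mode]
    return {"A0": a0, "A1": a1, "A0_AND_A1": a0 & a1}


for mode in sorted(MODE_BITS):
    print(f"mode {mode}: {control_signals(mode)}")
```

In mode 0 all three signals are low, and only mode 4 raises the AND term.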

## 4.6 Adders

The adders perform the addition, inversion and amplification of their four inputs. To do so, a typical adder-inverter scheme has been used, as shown in figure 4.10. Thanks to the virtual short circuit of this scheme there is always 0 V at the negative input of the amplifier, and therefore the same input impedance, slightly lower than  $50\ \Omega$ , is present at the four inputs, whatever the selected working mode. In addition, the fixed 0 V at the inverting input of the amplifier prevents the input signals from going back through another channel input. These advantages are the reason why the adder-inverter scheme was chosen and why a pre-inversion was performed at the differential to single-ended stage.



Figure 4.10: Adder schematic



Figure 4.11: Spice simulation of the addition of 3 input pulses (green, yellow and pink). The dark blue line is the result of the addition with ideal electronics, while the red one is the expected output of the real circuit

Additionally, the gain of the adders is defined by  $R_f/R_{in}$ , so it is possible to adjust the gain for all the inputs simply by changing  $R_f$ . In this way the dynamic range of the Level 1 trigger chain can be adjusted by changing just three resistors. In fact, it is adjusted to obtain an overall gain of 0.9 in the entire Level 1 trigger chain ( $R_f = 275\ \Omega$  and  $R_{in} = 100\ \Omega$ ), so there are 8 mV/phe at the outputs of the adders. Taking into account that the AD8003 saturates for outputs larger than 2 V, it is straightforward to see that the limit of the linear dynamic range is 247 phe, which is more than enough for our requirements. In addition,  $R_m$  resistors (currently  $100\ \Omega$ ) are placed at the input of the adders for proper matching (when the channels are connected to the adders; otherwise the signal only sees the  $50\ \Omega$  resistors of the ADG901), and one  $10\ \mu\text{F}$  capacitor is placed at the output to remove any DC offset. In order to study the suitability of the AD8003 to perform the additions, several SPICE simulations were carried out with successful results, such as the one shown in figure 4.11. Later measurements like the ones shown in figure 4.12 confirmed the expected results.
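The gain and dynamic-range figures quoted above can be cross-checked with a short calculation. The sketch below uses only the values given in the text; the variable names are illustrative, and the 8 mV/phe factor is taken as the rounded value quoted above.

```python
R_F_OHM = 275.0      # feedback resistor, from the text
R_IN_OHM = 100.0     # input resistor, from the text
MV_PER_PHE = 8.0     # adder output per photoelectron (rounded value from the text)
V_SAT_MV = 2000.0    # AD8003 output saturation level, from the text

# Magnitude of the inverting-adder gain applied to every input.
adder_gain = R_F_OHM / R_IN_OHM

# Largest signal, in photoelectrons, that stays below saturation.
linear_range_phe = V_SAT_MV / MV_PER_PHE

print(f"adder gain magnitude: {adder_gain:.2f}")
print(f"linear dynamic range: {linear_range_phe:.0f} phe")
```

With the rounded conversion factor this gives roughly 250 phe, consistent with the 247 phe quoted in the text, which presumably relies on the unrounded factor.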



(a) Additions of 1, 2, 3 and 4 coincident pulses



(b) Additions of 3 and 4 coincident inputs (blue), and 3 coincident inputs plus other one delayed 3 ns

Figure 4.12: Measurements of the adder outputs

Last but not least, the AD8003 includes 3 operational amplifiers in each chip, so all the required additions can be done in a cost effective way using only one AD8003 chip. The required space is also very small and the power consumption can be reduced by switching off the amplifiers not in use by using the power down pins.

## 4.7 Comparators

The outputs of the adders are sent to three fast LVDS ADCMP604 comparators [102], which compare them with a voltage threshold and provide a differential LVDS output in less than 1.5 ns. There are other fast differential standards like ECL or PECL with a similar speed, but LVDS is preferred because of its lower power consumption.

One of the main limitations of this comparator is the time for which the input signal must exceed the threshold before the comparator changes its output to “1”. This phenomenon, known as “time over threshold”, was characterized for this comparator by the CTA IFAE group during the development of the Level 0 majority trigger [114], obtaining the result presented in figure 4.13. The time over threshold changes slightly depending on the input amplitude, being shorter if the threshold is exceeded by a large margin. Nevertheless, this dependence on the margin is weak and, generally speaking, it can be said that the ADCMP604 always needs around 1 ns before generating a positive output. Additionally, due to the finite slew rate of the comparator, another nanosecond is required to ensure that the comparator output voltage has reached its highest level. The combined effect is that the output pulse of the comparator is around 1 ns narrower than the analog input, at the 50% amplitude level. This makes short pulses difficult to detect, a problem that was solved in the OR gate (see section 4.9.5).

Another effect typically associated with comparators is the time-walk, which consists of the variation of the propagation delay depending on the size of the input signal. This effect has been measured for the whole Level 1 (not only the comparator), and the results are shown in section 4.12.5.

## 4.8 Digital to Analog Converter

The threshold levels for the comparators were originally generated with an AD5060 DAC [115], with a resolution of 16 bits between 0 and +3.3 V, so it was possible to generate voltage steps of 50.35  $\mu$V (0.0063 phe). At the same time, the measured offset for this DAC is 0.9 mV (0.11 phe), so it complied with the lowest detectable signal specification of 0.2 phe. In the first prototypes the +3.3 V voltage reference was directly connected to the +3.3 V power supply, which proved very problematic: slight offset variations of the power supply, and even some switching noise coming from the DC/DC converters, degraded the stability of the voltage reference.

These problems were solved later, when the AD5060 was replaced by the current AD5663R [116]. This second DAC contains a 2.5 V internal voltage reference, which is very stable and independent of the power supply<sup>1</sup>. Additionally, the AD5663R has two independent outputs, which are very useful to implement the improvements required for the COLIBRI (see chapter 5), and it is also cheaper than the AD5060.

---

<sup>1</sup>Anyway, the power supply of the DAC is carefully filtered in the last prototypes, as a precaution.



Figure 4.13: Measurement of the time over threshold effect [114], considering a 100 mV threshold.

Figure 4.14: Measured output voltage vs input code for the AD5663R. The output voltages are nearly identical both in the same DAC and in different chips, and linearity is nearly perfect

As the reference voltage (2.5 V) is now closer to the dynamic range of the adder outputs (2 V), the resolution has improved: with the AD5663R the minimum voltage step is  $38.15\ \mu\text{V}$  (0.0048 phe). Regarding the offset, the AD5663R is not as good as the previous DAC, and an offset error of up to 2 mV has been measured for the zero input code. This corresponds to 0.25 phe, which is outside the initial specification, but the AD5663R has been retained, considering that 0.25 phe is still acceptable from the physics point of view and given the advantages of the new DAC. Figure 4.14 presents the laboratory measurements of the DACs of two Level 1 mezzanines, showing very good linearity and no noticeable voltage differences either between the two outputs of each DAC or between different chips.
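The resolution and offset figures quoted for both DACs follow from the full-scale voltage, the 16-bit word length and the 8 mV/phe conversion factor of section 4.6. The following sketch reproduces the arithmetic; the function names are illustrative and an ideal DAC transfer function is assumed.

```python
MV_PER_PHE = 8.0  # adder output per photoelectron, mV (value quoted in section 4.6)


def lsb_microvolts(full_scale_v, bits=16):
    """Smallest output step of an ideal DAC, in microvolts."""
    return full_scale_v / 2**bits * 1e6


def millivolts_to_phe(mv):
    """Convert a voltage at the comparator threshold into photoelectrons."""
    return mv / MV_PER_PHE


# (name, full-scale voltage in V, measured offset in mV), all taken from the text.
for name, full_scale_v, offset_mv in [("AD5060", 3.3, 0.9), ("AD5663R", 2.5, 2.0)]:
    step_uv = lsb_microvolts(full_scale_v)
    print(f"{name}: step {step_uv:.2f} uV = {millivolts_to_phe(step_uv / 1000):.4f} phe, "
          f"offset {offset_mv} mV = {millivolts_to_phe(offset_mv):.2f} phe")
```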

Regarding the slow control, the DAC is configured through a standard SPI bus, mastered by the front-end FPGA. The AD5663R must receive the configuration messages according to the format described in [116]. During the initialization the DAC needs the following sequence of messages:

1. 0x 2F 00 00: Reset the DAC.
2. 0x 3F XX X1: Switch the internal reference ON.
3. 0x 18 XX XX: Set voltage X in DAC A.
4. 0x 19 XX XX: Set voltage X in DAC B.
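As an illustration, the four messages above can be assembled programmatically. The sketch below assumes the 24-bit frame layout implied by the listed bytes (two unused bits, a 3-bit command, a 3-bit address and 16 data bits) and a 2.5 V full-scale output; the constant and function names are hypothetical, and [116] remains the authoritative description of the format.

```python
# Command and address codes inferred from the message bytes listed above.
CMD_WRITE_UPDATE = 0b011   # set the output of one channel (0x18 / 0x19)
CMD_RESET = 0b101          # reset the DAC (0x2F)
CMD_REFERENCE = 0b111      # internal reference setup (0x3F)
ADDR_DAC_A, ADDR_DAC_B, ADDR_ALL = 0b000, 0b001, 0b111

V_FULL_SCALE = 2.5  # volts; assumed from the 38.15 uV step quoted in the text


def frame(command, address, data):
    """Pack one 24-bit SPI word as three bytes, most significant byte first."""
    word = (command << 19) | (address << 16) | (data & 0xFFFF)
    return word.to_bytes(3, "big")


def voltage_to_code(volts):
    """Convert a threshold voltage into the nearest 16-bit code (clamped)."""
    code = round(volts / V_FULL_SCALE * 65536)
    return max(0, min(0xFFFF, code))


def init_sequence(v_a, v_b):
    """Build the four initialization messages for thresholds v_a and v_b."""
    return [
        frame(CMD_RESET, ADDR_ALL, 0x0000),
        frame(CMD_REFERENCE, ADDR_ALL, 0x0001),
        frame(CMD_WRITE_UPDATE, ADDR_DAC_A, voltage_to_code(v_a)),
        frame(CMD_WRITE_UPDATE, ADDR_DAC_B, voltage_to_code(v_b)),
    ]


# Example: both thresholds at 100 mV (an arbitrary illustrative value).
for message in init_sequence(0.100, 0.100):
    print(message.hex(" "))
```

The resulting byte strings would then be handed to whatever SPI master is in use, the front-end FPGA in the final system or a host adapter during tests.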

During the Level 1 tests, the SPI communication with the DAC was implemented with an Aardvark I2C/SPI Host Adapter [117], connected to a PC running a LabVIEW program [118].

## 4.9 OR Gate

At the output of the comparators there are 3 digital LVDS signals, which must be combined in a single LVDS trigger signal at the output of the Level 1 system. Therefore, a three-input LVDS OR gate is required. For other fast differential logic families like ECL or PECL there is a range of commercial components to perform logic operations with the data, but this is not the case for LVDS. For LVDS there are components for transmitting, receiving or regenerating the signal, but the only components able to perform logic functions with LVDS signals are FPGAs (after translating to CMOS inside the chip) or expensive general purpose logic components like the NBSG86A, from ON Semiconductor [119]. As the cost of these chips is too high (at least 30\$), and the other logic families consume much more power, the author of this thesis preferred to develop his own family of logic gates, which were patented in December 2012 [7].

### 4.9.1 Basic Principles

The proposed circuits are based on the so-called “Diode Logic”, well known since the 1960s [120]-[122]. This is one of the easiest ways to build a logic gate, requiring no more than two diodes and one resistor, as shown in figure 4.15.



Figure 4.15: OR and AND logic gates with diode logic

The principle of operation involves one diode for each input, which either connects the output to the input or isolates it, with the resistor tying the output to ground or to the supply depending on the kind of gate required. However, the limitations are clear:

- The voltage across a silicon diode drops by around 0.7 V between anode and cathode, so the digital output for “1” in the OR gate is 0.7 V lower than the input, and the digital output for “0” in the AND gate is 0.7 V higher than the input. This limits the possibility of chaining many gates and makes diode logic useful only for logic standards with large voltage differences between logic states.
- The diodes need some time to change their state, and this delay can be critical if high speed is required.

By replacing the standard silicon diodes with modern zero bias Schottky diodes (fast and with a threshold voltage of only a few mV) and regenerating the signal with a comparator, it is possible to develop logic gates for nearly all standards, including LVDS.
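The first limitation can be made concrete with an idealized model in which each diode OR stage simply subtracts one fixed forward drop from the highest input. The sketch below is a simplification under that assumption: the 3.3 V logic level and the 10 mV Schottky drop are illustrative values, not taken from the text, and the function names are hypothetical.

```python
def diode_or(inputs_v, drop_v):
    """Idealized diode OR: highest input minus one diode drop (never below 0 V)."""
    return max(max(inputs_v) - drop_v, 0.0)


def chained_high_level(v_high, stages, drop_v):
    """Logic '1' level after passing through several cascaded diode OR gates."""
    level = v_high
    for _ in range(stages):
        level = diode_or([level, 0.0], drop_v)
    return level


SILICON_DROP_V = 0.7    # from the text
SCHOTTKY_DROP_V = 0.01  # assumed "a few mV" zero-bias Schottky drop

for stages in (1, 2, 3):
    si = chained_high_level(3.3, stages, SILICON_DROP_V)
    sk = chained_high_level(3.3, stages, SCHOTTKY_DROP_V)
    print(f"{stages} stage(s): silicon {si:.2f} V, Schottky {sk:.2f} V")
```

The same 0.7 V drop applied to the 400 mV swing of LVDS would erase the logic level entirely, which is why low-drop diodes and a regenerating comparator are needed.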

### 4.9.2 Generic Gates

A differential OR gate with  $n$  inputs has been designed with  $2n$  zero bias Schottky diodes and one comparator with differential output, connected as in figure 4.16(a).



Figure 4.16: Generic differential gates using zero-bias Schottky diodes

Positive inputs  $input1+$ ,  $input2+$ , ...,  $inputn+$  are connected to the anodes of the diodes 1, 2, ...,  $n$ , which have all their cathodes tied together and connected to the positive input of the comparator. On the other hand, negative inputs  $input1-$ ,  $input2-$ , ...,  $inputn-$  are connected to the cathodes of the diodes  $n + 1$ ,  $n + 2$ , ...,  $2n$ , which have all their anodes connected to the negative input of the comparator. If all inputs carry a logic “0”, for example in the LVDS standard, there will be 1 V on all positive inputs and 1.4 V on the negative ones. With zero bias Schottky diodes the voltage hardly drops across them, so the comparator will see nearly 1 V at its positive input and almost 1.4 V at its negative one, giving a logic “0” at its output. On the contrary, if at least one of the differential inputs carries a logic “1”, there will be 1.4 V at the positive input of the comparator and 1 V at the negative one, so the comparator will give a logic “1” at its output, performing the logic function OR. Obtaining the NOR function is trivial, requiring no more than inverting the output of the comparator.

The AND function can be performed with a similar circuit topology by changing only the connection of the diodes, as can be seen in figure 4.16(b). It works in a way very similar to the OR gate. If all inputs carry a logic “1”, in the LVDS standard there will be 1.4 V on all positive inputs and 1 V on the negative ones. Thanks to the zero bias Schottky diodes there will be almost the same voltages at the inputs of the comparator, which will give a logic “1” at its output. However, if at least one of the inputs changes to logic “0”, there will be 1 V at the positive input of the comparator and 1.4 V at the negative one, so there will be a logic “0” at the output of the comparator, as the logic AND function requires. As in the case of the OR gate, a NAND gate can be obtained simply by inverting the outputs of the comparator.
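The static behaviour of both gates can be checked with an idealized model in which the diodes have no voltage drop, so each diode bank passes either the highest or the lowest of its input voltages. This is a minimal sketch under that assumption; the function names are illustrative and no timing effects are modelled.

```python
from itertools import product

V_LOW, V_HIGH = 1.0, 1.4  # LVDS line voltages used in the text, in volts


def lvds(bit):
    """Return the (positive, negative) line voltages for one logic bit."""
    return (V_HIGH, V_LOW) if bit else (V_LOW, V_HIGH)


def or_gate(bits):
    """Differential OR: common-cathode diodes on the positive lines,
    common-anode diodes on the negative lines."""
    pairs = [lvds(b) for b in bits]
    comp_pos = max(p for p, _ in pairs)  # highest positive line wins
    comp_neg = min(n for _, n in pairs)  # lowest negative line wins
    return int(comp_pos > comp_neg)


def and_gate(bits):
    """Differential AND: same topology with the diode connections reversed."""
    pairs = [lvds(b) for b in bits]
    comp_pos = min(p for p, _ in pairs)  # lowest positive line wins
    comp_neg = max(n for _, n in pairs)  # highest negative line wins
    return int(comp_pos > comp_neg)


# Truth table for three inputs, as in the Level 1 OR gate.
for bits in product((0, 1), repeat=3):
    print(bits, "OR =", or_gate(bits), "AND =", and_gate(bits))
```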

### 4.9.3 Transition Times

In order to study the expected behaviour of the proposed gates before manufacturing prototypes, several SPICE simulations were performed. An OR gate with 2, 3 and 6 inputs was simulated, using the SPICE models provided by Avago Technologies for the HSMS2855 Zero-Bias Schottky diode [123] and an ideal comparator. The results, which can be seen in figure 4.17, show that the transition from “0” to “1” is almost instantaneous, while the transition from “1” to “0” is fast until approximately half amplitude and then decreases (increases for *input*–) slowly to recover the typical voltage values for “0”.



Figure 4.17: SPICE simulation of 2, 3 and 6 inputs OR gates

According to the simulations, the effect of the slow recovery becomes more apparent as the number of inputs increases; in figure 4.17 it can be seen that, for 6 inputs, the width of the trigger pulse at the output of the comparator can be twice the width of the positive input. The reason for this difference between the “0” to “1” and the “1” to “0” transition times can be understood by observing the equivalent circuit of the Schottky diode in figure 4.18.

According to the HSMS2855 datasheet [123], the values of the components of the equivalent circuit are:

- $R_s \approx 25\ \Omega$
- $C_j \approx 0.18\ \text{pF}$
- $R_j = \frac{8.33 \cdot 10^{-5}\, nT}{I_a + I_b}$ , with a few ohms in forward bias and several hundreds of  $k\Omega$  in reverse bias.

Figure 4.18: Equivalent circuit of a Schottky diode

So, it is easy to distinguish two different situations:

#### Reverse to Forward Bias

In this case, when the input voltage at the anode becomes higher than the voltage at the cathode, the value of  $R_j$  is reduced very quickly, so the two terminals are virtually shortcircuited and the voltage at the output of the diodes changes almost immediately.

#### Forward to Reverse Bias

On the contrary, when the voltage at the anode becomes lower than the one at the cathode, the resistance  $R_j$  becomes very large and the capacitance  $C_j$  has to be discharged slowly through this resistance. When there are several branches in the gate, the capacitances of the diodes approximately add up to a larger global capacitance, which takes longer to discharge and therefore lengthens the transition time.

Once the problem is properly understood, there are two possible strategies to mitigate its effect:

#### Reducing $C_j$ capacitance

This is quite difficult to do, as the value of  $C_j$  in a Schottky diode is always very small. Nevertheless, diodes developed to detect microwave or millimetre wave signals have especially low capacitances, although they are more expensive and difficult to work with [124]. Figure 4.19 compares the simulation results of a 4-input OR gate with the HSMS2855 diode ( $C_j = 0.18\ \text{pF}$ ) [123] and with the HSCH9161 ( $C_j = 0.035\ \text{pF}$ ) [125]. It is clear that the gate with the millimetre wave diodes has a shorter transition time than the first one.

#### Reducing $R_j$ in reverse bias

This can be accomplished very easily by placing a resistor of a suitable value in parallel with the diode. In the reverse-to-forward transition it has no influence, while in the forward-to-reverse transition it lowers the effective resistance through which  $C_j$  discharges, improving the transition time. Figure 4.20 shows the simulated voltages at the inputs of the comparator (output of the diodes) for different values of the resistance in an OR gate with five inputs and a positive input pulse of 2 ns width. According to the simulations the technique works well: the lower the resistance, the smaller the pulse stretching. Nevertheless, it is important to avoid resistance values that are too low, in order to keep the diode impedance high enough to prevent increased power consumption, load effects and impedance mismatching.

Placing the additional resistors is a simple and cheap technique which can be used to control the output width.



Figure 4.19: SPICE simulation of a 4 input OR gate, with two different Schottky diodes



Figure 4.20: Spice simulated voltages at the comparator inputs of an OR gate with five inputs, for different values of the resistor in parallel with the diode

### 4.9.4 First gate prototypes

The first prototypes of the differential LVDS logic gates were manufactured using HSMS2855 Zero Bias Schottky diodes, an ADCMP604 comparator and resistors to control the output pulse width, all of them mounted on a 1.5 mm FR4 substrate with SMA connectors, as shown in figure 4.21. This board includes not only a three-input OR gate, but also three ADCMP604 comparators used to generate the LVDS signals by comparing the input pulses, generated in an Agilent 81110A, with a threshold. The complete schematic is shown in figure 4.22.



Figure 4.21: Photograph of the manufactured gate test board



Figure 4.22: Schematic of the gate test board

After discounting the effect of the cable length (5.07 ns), we can see in figure 4.23 that the delay between the input and the output is only around 3.85 ns, taking into account both the comparator and the gate. As the comparator introduces a delay of ca. 1.5 ns, the time required by the OR gate can be estimated at around 2.35 ns.



Figure 4.23: Measured delay between test board input and output

Regarding the pulse width, several resistors of different values were tested in a configuration like the one discussed in section 4.9.3. Figure 4.24 shows the measured output pulses of the OR gate for the different resistors, when there is a 2 ns wide pulse at the input. It can be seen that, for a resistor of  $3.3\text{ k}\Omega$ , the stretching effect disappears. Although the qualitative behaviour is well understood,  $3.3\text{ k}\Omega$  is considerably smaller than expected from the simulations in figure 4.20, which means that the SPICE models were not fully accurate. In any case, no mismatch effects were observed and the power consumption remains low, so  $3.3\text{ k}\Omega$  is still acceptable. The total power consumption of the test board was 231 mW and, considering that each ADCMP604 consumes 57 mW (according to the datasheet [102]), the power consumption of the OR gate is roughly 60 mW, which is less than a third of the consumption of a typical PECL gate.
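The delay and power estimates of this section are simple subtractions, which the sketch below reproduces from the values quoted in the text. The variable names are illustrative, and the power estimate assumes that only the three input comparators are subtracted from the board total, so the result includes the comparator inside the OR gate itself.

```python
# Delay budget (values from the text, in ns).
INPUT_TO_OUTPUT_DELAY_NS = 3.85  # measured delay after removing the cable delay
COMPARATOR_DELAY_NS = 1.5        # approximate ADCMP604 propagation delay

or_gate_delay_ns = INPUT_TO_OUTPUT_DELAY_NS - COMPARATOR_DELAY_NS

# Power budget (values from the text, in mW).
BOARD_POWER_MW = 231
COMPARATOR_POWER_MW = 57         # per ADCMP604, from the datasheet [102]
N_INPUT_COMPARATORS = 3          # assumed to be the ones outside the OR gate

or_gate_power_mw = BOARD_POWER_MW - N_INPUT_COMPARATORS * COMPARATOR_POWER_MW

print(f"OR gate delay: {or_gate_delay_ns:.2f} ns")
print(f"OR gate power: {or_gate_power_mw} mW")
```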

### 4.9.5 Level 1 OR Gate

Once the OR gate design was validated in the prototype, it was included in the Level 1 with HSMS2855 Zero-bias Schottky diodes, an ADCMP604 comparator and resistors of  $3.3\text{ k}\Omega$  (see figure 4.25), to have a positive output width as long as the duration of the inputs.

Figure 4.24: Measurements of the OR gate differential output pulses, for different resistor values, when there is a 2 ns width pulse at the input

Figure 4.25: Schematic of the OR gate originally used in the Level 1.

However, this design had to be changed to compensate for the time over threshold of the comparators. As discussed in section 4.7, the time over threshold of the ADCMP604 makes the output signal 1 ns narrower than the input. This means that, for the typical 2.5 ns<sup>2</sup> wide pulse at the input of the comparator stage, a 1.5 ns wide LVDS pulse enters the OR gate. As the OR gate contains another ADCMP604 with its corresponding time over threshold, the output of the OR gate would be only 0.5 ns wide. This time is not enough for the comparator output to change from 1 V to 1.4 V (or from 1.4 V to 1 V for the negative signal). As a consequence, the output of the comparator cannot reach the full amplitude corresponding to the LVDS standard and some pulses would be lost.

<sup>2</sup>2.5 ns was the initially expected width of the analog pulses but, in fact, 3 ns is much more realistic due to the bandwidth limitations, see section 3.2.7.6.

The solution was to remove the resistors used to control the output pulse width. Without these resistors, even for 1.5 ns wide pulses at the input of the OR gate, the voltage at the inputs of the comparator in the OR gate exceeds the threshold for a longer time, as was shown in section 4.9.3, so the output of the OR gate lasts long enough to reach the correct LVDS voltage levels. This is a nice example of how a parasitic effect can be exploited to fix another problem. The only drawback is that, without the resistors, the output pulses can last up to 10 ns when, in fact, the threshold of the comparator stage was only exceeded for 2.5 ns. This is not very relevant, because the important information is that the trigger condition was fulfilled at the time of the leading edge of the pulse. In fact, the Level 1 distribution reshapes the pulses, making them wider or narrower by moving the falling edge depending on its position with respect to the Level 1 distribution clock (see section 3.2.6).
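The pulse-width problem that motivated this change can be summarized as a simple budget: each ADCMP604 in the chain removes about 1 ns from the pulse. The sketch below reproduces that arithmetic for the original design with the resistors fitted; the function name is illustrative and the constant 1 ns narrowing per comparator is the approximation given in section 4.7.

```python
NARROWING_PER_COMPARATOR_NS = 1.0  # approximate time-over-threshold effect (section 4.7)


def widths_along_chain(analog_width_ns, n_comparators):
    """Pulse width at the analog input and after each successive comparator."""
    widths = [analog_width_ns]
    for _ in range(n_comparators):
        widths.append(max(widths[-1] - NARROWING_PER_COMPARATOR_NS, 0.0))
    return widths


# Comparator stage followed by the comparator inside the OR gate.
analog, after_comparator, after_or_gate = widths_along_chain(2.5, 2)
print(f"analog pulse:          {analog} ns")
print(f"after comparator:      {after_comparator} ns")
print(f"after OR gate (ideal): {after_or_gate} ns")
```

Removing the resistors lengthens the pulse seen by the second comparator, so the final width is no longer limited by this budget.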

It is also worth mentioning that, as the input impedance of the diodes is quite high,  $100\ \Omega$  resistors are required between the positive and negative lines from each of the 3 comparators to obtain the best and fastest response.

## 4.10 Other technological issues

All the subsystems making up the basic version of the Level 1 have been described in the previous sections. However, apart from how the required functionality is implemented in hardware, there are other technological questions related to the development of the mezzanine which also deserve comment.

### 4.10.1 Dimensions

The dimensions of the Level 1 mezzanine are  $118 \times 41$  mm ( $4635 \times 1625$  mils). These dimensions were standardized for both NECTAr and DRAGON front-end boards at the beginning of the development, and thus taken into account in the mechanical drawings. In this way, the mezzanines and the front-end boards can evolve independently while remaining mechanically compatible. Figure 4.26 shows the Level 0 and Level 1 mezzanines connected to Dragon and NECTAr front-end boards, at different development stages.

### 4.10.2 Connectors

In the same way as the board dimensions, the connectors and the pin-out were standardized. The chosen connectors were the QMSS-016-6.75-L-D-DP-A and QMSS-016-6.75-L-D-DP-PC4, provided by SAMTEC [126] (figure 4.27). They were chosen because of their bandwidth (17.5 GHz), their high density, their good shielding and because they are specifically optimized to carry differential signals. In addition, the QMSS-016-6.75-L-D-DP-PC4 has some pins for the power supply. On the negative side, these connectors are difficult to solder by hand and are not very robust, so, from the point of view of the writer, they are good for prototyping but they must disappear in the final design, after the integration of the mezzanines in the front-end board.

(a) Dragon front-end board with Level 0 sum and Level 1 mezzanines

(b) NECTAr front-end board with Level 0 sum and Level 1 mezzanines

Figure 4.26: Different front-end boards with similar mezzanines in different positions



(a) Samtec QMSS connector



(b) Samtec QMSS...-PC4 connector

Figure 4.27: Samtec QMSS connectors

Regarding the pin-out, it was also fixed at the beginning of the design, although in this case the meaning of some signals was changed to include new functionalities. Some of the lines are directly connected to the front-end FPGA, so their functionality can be changed very simply by changing the firmware, which makes it possible to manage most of the new Level 1 features. Figure 4.28 shows the latest version of the pin-out. It is also worth mentioning that, as the input from cluster 4 in figure 4.1 is not used, the lines originally assigned to that signal are connected to a  $100\ \Omega$  load; the name AN\_L1\_IN4 is used for the signals coming from cluster 5, and the signals named AN\_L1\_IN5 come from cluster 6 in figure 4.1.



(a) QMSS-016-6.75-L-D-DP-A



(b) QMSS-016-6.75-L-D-DP-A-PC4

Figure 4.28: Level 1 pin-out