



## Full Length Article



## The upgrade of the general-purpose digital data acquisition system (GDDAQ)

H.Y. Wu <sup>a,b</sup>, Z.H. Li <sup>b,\*</sup>, M. Venaruzzo <sup>c</sup>, L. Colombini <sup>c</sup>, D.W. Luo <sup>b</sup>, H. Hua <sup>b</sup>, S. Nishimura <sup>d</sup>, A. Abba <sup>e</sup>, Y. Venturini <sup>c</sup>, C. Tintori <sup>c</sup>, M. Bianchini <sup>c</sup>

<sup>a</sup> Key Laboratory of Nuclear Data, China Institute of Atomic Energy, Beijing 102413, China

<sup>b</sup> School of Physics and State Key Laboratory of Nuclear Physics and Technology, Peking University, Beijing 100871, China

<sup>c</sup> Costruzioni Apparecchiature Elettroniche Nucleari (CAEN) S.p.A., Viareggio 55049, Italy

<sup>d</sup> RIKEN Nishina Center, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan

<sup>e</sup> Nuclear Instruments srl, Via Lecco 3, Lambrugo 22045, Italy

## ARTICLE INFO

## Keywords:

Digital data acquisition system  
Trigger system  
Open FPGA  
Digital pulse processing  
Waveform digitizer

## ABSTRACT

Modern nuclear-physics experiments increasingly demand user-programmable triggering and real-time digital pulse processing under high channel density and high counting rates, where closed or fixed vendor firmware often becomes the limiting factor. We present a general-purpose digital data acquisition platform built around an open-FPGA firmware framework that enables users to develop and deploy custom trigger and pulse-processing algorithms on commercial waveform digitizers, while preserving a unified system-level control and monitoring workflow.

The system combines digitizers covering 125 MS/s–1 GS/s with a programmable logic module for crate-level coincidence/validation triggers, providing a hierarchical trigger architecture that supports both per-channel discrimination and external multi-board coincidences. Leveraging the open-FPGA approach, we implement and validate representative real-time algorithms, including (i) a five-segment summation energy filter designed to improve pile-up resilience at high rates, and (ii) pulse-shape-discrimination processing. A multi-threaded C++ software framework with a Qt-based GUI integrates configuration, high-throughput readout, real-time monitoring, and online analysis.

Performance evaluations demonstrate excellent energy resolution, stability at high count rates, and effective pulse shape discrimination. The proposed framework provides a flexible and reproducible path to algorithm-driven DAQ customization for a wide range of detector systems.

## 1. Introduction

Digital data acquisition systems (DDAQs) have been increasingly implemented in nuclear physics experiments, offering unparalleled flexibility and performance over traditional analog systems [1–3]. Their ability to provide higher data throughput, better stability, and simplified integration has made them the standard solution for modern detector arrays, ranging from high-purity germanium detectors to complex particle identification setups.

In our previous work, we developed a General-purpose Digital Data Acquisition system (GDDAQ) based on Pixie-16 modules [4]. While this system was successfully deployed in various experiments [5–9], it highlighted a fundamental limitation inherent in many commercial DDAQs: the “black-box” nature of the firmware. In such systems, the signal processing logic is fixed by the manufacturer. Users are restricted to adjusting predefined parameters (e.g., trigger thresholds, shaping times) but cannot modify the underlying algorithms. This rigidity becomes a bottleneck when dealing with specialized experimental requirements, such as resolving severe pulse pile-up in high-rate environments or implementing non-standard pulse shape discrimination (PSD) logic.

To overcome these limitations without incurring the prohibitive time and engineering costs of developing fully custom electronics, a methodological shift is required. The emerging class of digitizers with Open-FPGA architecture offers a solution, bridging the gap between standard commercial instruments and fully custom designs. However, the hardware serves only as a foundation; the true scientific value lies in the development and verification of custom algorithms tailored to specific physics goals.

In this work, we present a significant evolution of the GDDAQ. Unlike the incremental integration of standard modules, this upgrade represents a transition to a fully programmable signal processing platform. Leveraging the open FPGA architecture of CAEN 27xx series digitizers [10] and V2495 programmable logic modules, we have designed and implemented custom digital pulse processing algorithms that were previously not possible in our fixed-firmware system. These include a trapezoidal trigger filter, a novel five-segment summation energy filter for enhanced pile-up resilience, and specialized PSD algorithms. This paper details the system architecture and, more importantly, validates the performance of these custom algorithms, demonstrating a flexible pathway for advanced nuclear signal processing.

\* Corresponding author.

E-mail address: [zhli@pku.edu.cn](mailto:zhli@pku.edu.cn) (Z.H. Li).

**Fig. 1.** (a) V2495 with three A395D mezzanine boards; (b) A typical GDDAQ setup.

The present paper is organized as follows: Section 2 presents the DAQ system, Section 3 describes the software framework, Section 4 presents the trigger system, Section 5 describes the open FPGA development, and Section 6 gives a summary.

## 2. Description of DAQ system

### 2.1. Hardware

The principal hardware components of the upgraded GDDAQ are a VME64X crate, 27xx series digitizers, and V2495/DT5495 [11] programmable logic modules.

The 27xx series digitizers (models 2740, 2745, 2730, and 2751) were designed to meet specific requirements for different sampling rates and dynamic ranges. Models 2740 and 2745 support up to 64 channels and are both equipped with 16-bit ADCs operating at 125 MS/s; programmable input gain is available only on the 2745. The model 2730 provides up to 32 channels with 14-bit resolution and a sampling rate of 500 MS/s, also featuring programmable input gain. For even higher-speed applications, the model 2751 offers up to 16 channels with 14-bit resolution at 1 GS/s, and likewise includes programmable input gain. The core processing unit of the 27xx series digitizers is a Xilinx Zynq UltraScale+ Multiprocessor System-on-Chip (MPSoC), model XCZU19EG. In addition, CAEN provides an open FPGA framework for this series, allowing users to develop and deploy custom firmware tailored to specific experimental requirements.

The V2495 is a general-purpose programmable logic and I/O unit based on the User FPGA architecture. It has six mezzanine board slots, three of which are fixed and provide 64 LVDS input channels and 32 LVDS output channels. The other three slots are user-configurable and support a variety of mezzanine boards with different connectors. As shown in Fig. 1(a), the module can be equipped with three A395D mezzanine boards, each offering programmable NIM/TTL inputs or outputs. The V2495 serves as a versatile platform for implementing digital logic functions such as coincidence triggering, trigger logic operations, gate and delay generation, and input/output signal registration, making it well-suited for advanced trigger and control applications in nuclear and particle physics experiments.

Fig. 1(b) shows a typical GDDAQ setup with 5 modules installed in a VME64X crate. Each 27xx series digitizer features USB 3.0 and Ethernet ports on the front panel for data transfer. The SFP+ receptacle supports either copper RJ45 or optical LC connections, for 1 GbE or 10 GbE communication, enabling high-performance data readout. Each module operates with an independent data transmission link, which ensures non-blocking data transfer and parallel readout capability across the entire system. The short black cables that connect every two modules are used for clock distribution and synchronization of acquisition start/stop commands. The V2495 is used to route signals between different 27xx digitizers and to perform logical operations among them using the programmable logic resources within its FPGA.

### 2.2. Firmware

CAEN provides two primary categories of firmware for its digitizers: the scope firmware and the Digital Pulse Processing (DPP) firmware.

The scope firmware is mainly intended for waveform acquisition, acting as a digital oscilloscope. All channels acquire data simultaneously with a common trigger, which can be supplied externally or generated from a combination of individual channel discriminators. Options for zero suppression are available to remove non-relevant or unwanted data.

The DPP firmware is divided into four main types, each optimized for specific detection scenarios:

- **Pulse Height Analysis (PHA):** This firmware is designed for nuclear spectroscopy and processes signals from detectors such as high-purity germanium (HPGe), silicon (Si), and scintillators coupled with charge-sensitive preamplifiers. It operates on a per-channel, event-driven basis and outputs data in a timestamped list mode.
- **Pulse Shape Discrimination (PSD):** This firmware is suitable for current signals from scintillators, gas-filled tubes, silicon photomultipliers (SiPMs), and photomultiplier tubes (PMTs). It supports gated charge integration and particle discrimination, performing event-by-event acquisition for each channel. Energy and timing data are output in list mode. Key features include digital Constant Fraction Discrimination (CFD), high-resolution timing interpolation, and pulse shape analysis.
- **Zero Length Encoding (ZLE):** This firmware is intended for advanced zero suppression. It requires a common trigger and performs simultaneous acquisition across all channels. It compresses the digitized waveforms by suppressing baseline segments and empty channels.

- **Dynamic Acquisition Window (DAW):** This firmware is aimed at triggerless acquisition systems that employ zero suppression. It operates in waveform mode with independent acquisition on each channel and dynamically adjusts the length of the acquisition window to match the duration of the input pulse.

For users who wish to customize the acquisition process by implementing their own pulse processing algorithms on the Open FPGA, CAEN provides two development frameworks, Open-Scope and Open-DPP, which enable the creation of user-defined firmware. A more detailed description of these frameworks is given in Section 5.

## 3. Software framework

The data acquisition software has been developed in C++ using Qt [12] and CERN ROOT [13] libraries for designing the Graphical User Interface (GUI). The offline data decoding and analysis are implemented by using ROOT libraries. It has been thoroughly tested on multiple Linux distributions, including Ubuntu, Fedora, and Rocky Linux.

During the early stages of an experiment, frequent detector debugging is often necessary. In such scenarios, real-time monitoring of key parameters, such as trigger rates, energy spectra, time-difference spectra, and acquired waveforms, is essential. A well-designed GUI is highly beneficial and provides intuitive access to relevant information while concealing underlying complexities.

Prior to initiating data acquisition, users must modify the configuration file to specify the number of modules to be used, along with their serial numbers and IP addresses. When the acquisition program is launched, it loads the specified configuration file and then initializes. The main acquisition program consists of four threads: a main thread and three subordinate threads. It integrates a comprehensive set of functionalities including experiment parameter configuration, data readout, real-time monitoring, and online debugging. This program supports the simultaneous operation of multiple firmware types, such as Scope, DPP, Open-Scope, and Open-DPP, within a unified system. It is designed to deliver enhanced flexibility, seamless integration, and efficient control, making it suitable for a wide range of experimental physics applications.

### 3.1. Front-end GUI

The main thread is responsible for constructing the GUI for human-computer interaction. Its primary functions are parameter configuration and receiving signals from the subordinate threads for data visualization and curve plotting. The GUI is designed to display the necessary information promptly while hiding the underlying complexity from the user.

**Fig. 2(a)** shows the main interface of the GUI. Features that are used less frequently, such as loading configuration files, setting data output options, and saving register parameters, are organized within a drop-down menu bar at the top of the interface.

**Fig. 2(b)** shows the sub-interface accessible via the “Basic” button on the main screen, which is used to set basic acquisition parameters. Users can switch between parameters of different modules by clicking the module labels on the left. The interface automatically displays relevant parameters based on the module type and loaded firmware. Similarly, the “Logic” button provides access to settings of trigger and acquisition logic configurations.

The real-time debugging window is shown in **Fig. 2(c)**. For PHA/PSD firmware, it displays filtered waveforms and trigger timing information for acquisition. For other firmware types, such as Scope and ZLE, it shows the waveforms from different channels. This interface is refreshed only when the “Send” button is clicked.

### 3.2. Back-end threads

The backend of the data acquisition software uses a multi-threaded architecture to efficiently execute multiple tasks concurrently, allowing for seamless integration of data collection, real-time plotting, and online analysis. The threads are implemented using the QThread class from the Qt framework, which facilitates inter-thread and GUI communication through its built-in ‘signals and slots’ mechanism. Three dedicated threads operate in the background during acquisition:

- The first subordinate thread is responsible for data readout and local storage. It polls each acquisition module, compresses the retrieved data, and writes them directly to disk in binary format. Simultaneously, it forwards the data to a shared memory space for real-time analysis. This thread also responds to data requests originating from the online debugging interface.
- The second thread monitors the count rates across all channels. Every three seconds, it reads the monitoring registers of each channel to obtain real-time threshold trigger rates and event output rates. **Figs. 2(d)** and **2(e)** illustrate the real-time count rates for different channels of a module. This functionality is essential for monitoring detector status and validating the coincidence logic during operation.
- The third thread receives and processes data from the first thread via shared memory. It launches a ROOT THttpServer instance, which processes the incoming data and fills the corresponding histograms. Users can conveniently access energy spectra, count rates, PSD distributions, waveforms, and other relevant information for each channel through a web browser.
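The division of labor between the readout thread and the online-analysis thread can be illustrated with a minimal sketch. It is written in Python rather than the C++/Qt used by the actual software, and every name in it (`readout_thread`, `analysis_thread`, `run_acquisition`, the in-memory `raw_log`) is hypothetical: a queue stands in for the shared memory, a list stands in for the binary file on disk, and the rate-monitoring thread is omitted.

```python
import queue
import threading
from collections import Counter

def readout_thread(modules, out_q, raw_log):
    """Poll each (hypothetical) module once, store the data, and forward it."""
    for module_id, events in modules.items():
        for energy in events:
            raw_log.append((module_id, energy))  # stands in for the disk write
            out_q.put((module_id, energy))       # stands in for shared memory
    out_q.put(None)                              # sentinel: acquisition stopped

def analysis_thread(in_q, histograms):
    """Fill one energy histogram per module from the forwarded events."""
    while True:
        item = in_q.get()
        if item is None:
            break
        module_id, energy = item
        histograms.setdefault(module_id, Counter())[energy] += 1

def run_acquisition(modules):
    """Run both threads to completion and return the stored data and histograms."""
    q, raw_log, histograms = queue.Queue(), [], {}
    threads = [
        threading.Thread(target=readout_thread, args=(modules, q, raw_log)),
        threading.Thread(target=analysis_thread, args=(q, histograms)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return raw_log, histograms
```

Decoupling storage from histogram filling in this way means that a slow online analysis does not stall the write to disk, which is the motivation for the separate threads described above.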

## 4. Trigger system

In previous work [4], we proposed a highly flexible trigger system architecture designed to accommodate diverse experimental acquisition requirements. The core concept involves implementing advanced trigger logic which cannot be handled within a single module, such as coincidence and multiplicity selection, by using an external programmable logic module. This module processes the trigger conditions and then returns validated trigger gate signals to each channel of the acquisition modules. In the present work, we have upgraded the system by replacing the original MZTIO programmable logic module with a CAEN V2495 unit. The trigger system of the present GDDAQ consists of two interdependent parts: internal trigger and external trigger.

### 4.1. Internal trigger

In DPP mode, the channels acquire data independently, so each channel's self-trigger is used locally to acquire pulse information. Each channel of the digitizer features a digital triangular filter with programmable rise time and threshold, which is applied to the input pulses to generate the self-trigger signal.

Each module has two independent Individual Trigger Logic (ITL) blocks, whose outputs can be combined in a second-level trigger logic for more complex trigger schemes.

**Fig. 3** shows the ITL scheme. Each ITL consists of an input enable mask, an optional pairing logic that combines the self-triggers of two consecutive channels (e.g., paired coincidence), and the main trigger logic that combines all self-triggers with OR, AND, or Majority logic. The output can be linear (without stretching) or reshaped by a programmable gate generator, which can be either re-triggerable or non-re-triggerable, and its polarity can finally be set to direct or inverted.
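The combinational part of the ITL can be sketched in a few lines of Python. This is an illustration only, not the firmware logic: the function name `itl_output`, its parameters, and the assumption that a channel pair is enabled only when both of its channels are enabled are hypothetical, and the gate generator and polarity stages are left out.

```python
def itl_output(self_triggers, enable_mask, mode="OR", majority=1, pairing=False):
    """Combine per-channel self-triggers into a single ITL decision.

    self_triggers and enable_mask are equal-length sequences of booleans.
    """
    bits = [bool(b) for b in self_triggers]
    mask = [bool(m) for m in enable_mask]
    if pairing:
        # Paired coincidence: channels (0,1), (2,3), ... must both fire.
        bits = [bits[i] and bits[i + 1] for i in range(0, len(bits) - 1, 2)]
        # Assumption: a pair is enabled only if both of its channels are enabled.
        mask = [mask[i] and mask[i + 1] for i in range(0, len(mask) - 1, 2)]
    selected = [b for b, m in zip(bits, mask) if m]
    if mode == "OR":
        return any(selected)
    if mode == "AND":
        return bool(selected) and all(selected)
    if mode == "MAJORITY":
        return sum(selected) >= majority
    raise ValueError(f"unknown mode: {mode}")
```

For example, with self-triggers `[True, False, True, True]` and mask `[True, True, True, False]`, the OR mode fires, the AND mode does not, and a majority level of 2 is satisfied by the two enabled channels that triggered.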



Fig. 2. The graphical interface and some of its windows.



Fig. 3. Individual Trigger logic scheme.

### 4.2. External trigger

The ITLs can propagate the signal outside via the TRGOUT or GPIO, thus enabling the combination of triggers from multiple boards in an external trigger logic, which eventually feeds back the TRGIN of the digitizers.

Each LVDS line can be assigned a combination of all channel self-triggers, implemented as a masked OR logic with the mask set by a user parameter. The LVDS I/Os are arranged in groups of four, and a single group can meet the requirements of various types of experiments. In our setup, one group comprising four lines, labeled OR 1, OR 2, OR 3, and OR 4, is typically sufficient to implement the necessary trigger.
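The masked OR assigned to each LVDS line amounts to a bitwise AND followed by a non-zero test. The sketch below assumes, purely for illustration, that the self-triggers of a 64-channel module are packed into one integer (bit n for channel n) and that the four lines are mapped to four blocks of 16 channels; the function name and this mapping are hypothetical.

```python
def lvds_or_lines(self_trigger_bits, line_masks):
    """Return one boolean per LVDS line: the OR of the self-triggers its mask selects."""
    return [(self_trigger_bits & mask) != 0 for mask in line_masks]

# Hypothetical mapping: OR 1..OR 4 each cover 16 consecutive channels.
masks = [0xFFFF << (16 * i) for i in range(4)]

# Channels 3 and 40 fired in this clock cycle.
fired = (1 << 3) | (1 << 40)
lines = lvds_or_lines(fired, masks)
```

Here `lines` evaluates to `[True, False, True, False]`, since channel 3 belongs to the first block and channel 40 to the third.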

The triggering mechanism is shown in Fig. 4. For each selected module, the ITL output is sent to the V2495 through the TRGOUT port via LEMO connectors. The OR triggers (OR 1/2/3/4) are generated for each selected module and sent via the LVDS connectors to the LVDS inputs of the V2495. In addition, trigger signals from other systems, i.e., external inputs, are also fed into the V2495. In the V2495, the corresponding trigger signal is generated according to user-customized logic and then sent back as the second-level trigger of the 27xx series digitizers through either the TRGIN port or the GPIO port. By routing the logic signals of the V2495 to an oscilloscope, their timing relationship can be easily checked.

Fig. 4. System logic block diagram in the GDDAQ.

It is worth noting that the trigger from each digitizer to the central V2495 logic unit is implemented via four LVDS lines and TRGOUT. This physical constraint implies a limitation on the granularity of the global trigger logic, as the V2495 cannot perceive the individual state of every channel within a high-density digitizer (e.g., 64 channels). Instead, these five lines typically transmit aggregated signals, such as the logical OR of channel groups or specific pre-selected trigger flags configured within the digitizer's FPGA. While this architecture may limit flexibility for experiments requiring complex global logic based on single-channel granularity, it effectively balances cabling complexity and functionality. For the majority of nuclear physics applications, such as coincidence measurements between detector arrays, this group-level triggering is sufficient and highly effective.

### 4.3. Second level trigger and monitoring

For each channel, the self-trigger does not automatically lead to event readout. The final acquisition decision is subject to a user-selectable “control logic”, which implements secondary trigger conditions such as coincidence or anti-coincidence. The source of this second-level trigger can be chosen flexibly from several options, which are broadly categorized into two types: internal logical combinations of channels within the module, and external trigger inputs as described earlier.
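The effect of the control logic can be pictured as a filter on the self-trigger timestamps. The following sketch is a simplified model, not the firmware implementation: the function name `validate`, the representation of the second-level trigger as a list of half-open `(start, stop)` gate windows, and the time units are all assumptions.

```python
def validate(trigger_times, gate_windows, mode="coincidence"):
    """Keep the self-triggers that satisfy the second-level condition.

    gate_windows is a list of (start, stop) intervals, start inclusive and
    stop exclusive, during which the validation gate is active.
    """
    def in_gate(t):
        return any(start <= t < stop for start, stop in gate_windows)

    if mode == "coincidence":
        return [t for t in trigger_times if in_gate(t)]
    if mode == "anticoincidence":
        return [t for t in trigger_times if not in_gate(t)]
    raise ValueError(f"unknown mode: {mode}")
```

With self-triggers at times 100, 250, and 400 and gates `(90, 150)` and `(380, 420)`, the coincidence mode keeps the events at 100 and 400, while the anti-coincidence mode keeps only the event at 250.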

In most commercial digital data acquisition systems, adjusting coincidence and anti-coincidence logic is often not intuitive. Users typically struggle to visualize the timing relationships between coincidence gates and trigger signals, making parameter optimization particularly challenging. In analog electronics, such tuning is conventionally performed by observing the relevant logic signals with an oscilloscope. However, digital acquisition modules seldom provide sufficient monitoring output channels to facilitate this type of optimization via external oscilloscopes.

To address this issue, we have integrated the coincidence signals into the WaveDigitalProbe tool within the DPP firmware, enabling intuitive visualization of timing relationships among signals and greatly simplifying the parameter adjustment process. Fig. 2(c) illustrates the timing relationship between the trigger signal and the coincidence signal, as displayed in the Wave Digital Probe at the bottom of the panel.

For the Open-DPP firmware, a time-difference measurement function has been developed. Fig. 2(f) displays the time difference measured between the HPGe detector and the BGO detector, which served as a veto detector. A similar feature is being developed for the V2495, which will allow the time-difference spectrum of any two logic signals to be measured.

## 5. Open FPGA

The open FPGA architecture offers a balance of performance, flexibility, transparency, sustainability, and cost-effectiveness for nuclear physics experiments. These advantages make it a mainstream choice and the preferred direction for future data acquisition systems in the field. Among commercial solutions, CAEN is currently a leading provider in this domain. The entire 27xx series of digitizers supports open FPGA capabilities, allowing users to develop customized firmware to accommodate specific experimental requirements.

For firmware development, CAEN provides two distinct approaches. The first approach employs the Sci-Compiler [14] software, which utilizes a graphical block-diagram interface instead of traditional VHDL/Verilog programming. This method significantly accelerates the development of custom firmware, making it accessible even to users with limited FPGA expertise. The second approach is designed for experienced developers skilled in Verilog or VHDL. For these users, CAEN provides a Firmware Development Kit (FDK), which includes a firmware template, a simulation test bench and other development resources. This dual-approach framework ensures that both novice and expert users can effectively implement custom acquisition logic tailored to their experimental needs. Below, we present some of the development work completed based on the FDK framework.

The following sections detail the digital pulse processing algorithms implemented in the FPGA. First, the trigger filter (Section 5.1) is introduced for identifying valid physics events. Next, the energy filter (Section 5.2) and pulse shape discrimination (PSD) (Section 5.3) are derived; for mathematical clarity, these derivations initially assume an ideal signal with a zero baseline. The algorithm for baseline determination is then independently described in Section 5.4. Finally, Implementation and Results (Section 5.5) integrates these components, demonstrating how the specific baseline strategy is applied on an event-by-event basis to the energy and PSD algorithms to achieve optimal measurement performance in practice.

### 5.1. Trigger filter

Pulse signals from different types of detectors, read out through a preamplifier or a PMT, are generated by different mechanisms. However, in most cases these signals can be characterized by a fast rise followed by an exponential decay. A digital trigger is required to detect the presence of detector pulses in the stream of digital data from the ADC.

Different trigger filter algorithms exhibit significant differences in noise suppression and the ability to discriminate close-in-time events. In this work, we use the fast trapezoidal filter algorithm for triggering, which is defined as:

$$s[k] = \sum_{i=k-L+1}^k x[i] - \sum_{i=k-2L-G+1}^{k-L-G} x[i] \quad (1)$$

where  $s[k]$  represents the filter output amplitude at the current time index  $k$ , and  $x[i]$  denotes the digitized raw input sample at index  $i$ . The parameter  $L$  corresponds to the length of the summing window (determining the rise time of the trapezoid), while  $G$  represents the gap between the two summing windows (determining the duration of the trapezoid's flat top).
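As an illustration of Eq. (1), the sketch below evaluates the two sums directly for every index $k$. It is a reference implementation for checking the filter shape, not the FPGA code; the function name is hypothetical and samples before the start of the record are assumed to be zero.

```python
def trigger_filter(x, L, G):
    """Direct evaluation of Eq. (1); samples before the start of x are taken as 0."""
    def sample(i):
        return x[i] if i >= 0 else 0

    out = []
    for k in range(len(x)):
        recent = sum(sample(i) for i in range(k - L + 1, k + 1))
        older = sum(sample(i) for i in range(k - 2 * L - G + 1, k - L - G + 1))
        out.append(recent - older)
    return out

# Unit step at index 10, filtered with L = 3 and G = 2.
step = [0] * 10 + [1] * 20
s = trigger_filter(step, L=3, G=2)
```

For this unit step the output rises over $L$ samples to the value $L = 3$, stays there for $G + 1$ samples (indices 12 to 14), and returns to zero by index 17, which is the trapezoidal shape described above.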

It is useful to compare the proposed method with the triggering techniques typically found in standard CAEN firmware, which utilize a Leading Edge Discriminator (LED) or a simple triangular filter (equivalent to Eq. (1) with $G = 0$). While the LED is fast, it is highly susceptible to high-frequency noise, limiting the lowest achievable threshold. Similarly, the triangular filter, while reducing noise, suffers from ballistic deficit when applied to detectors with slow or varying charge collection times (such as gas detectors). The optimization in this work lies in the implementation of the fast trapezoidal filter with a configurable flat top ($G > 0$). Unlike the triangular filter, the introduction of the gap $G$ allows the filter to integrate the full charge of slow-rising signals before decaying. This optimization significantly improves the stability of the trigger amplitude against rise-time variations, thereby enhancing trigger efficiency and timing precision compared to the standard LED and triangular methods.

### 5.2. Energy filter

For pulse signals from a PMT, the energy information of pulse is determined by integrating the digital samples of the individual pulse. A simple integration of the current pulse is sufficient to determine the energy of the incident particle. The charge integration is given by:

$$Q = \sum_{n=0}^{\infty} i[n] \quad (2)$$

Fig. 5. 3-sum filter to a pulse with decay time  $\tau$ .

In high-resolution measurement systems that employ charge-sensitive preamplifiers, the shape of the voltage pulse is related to the input current. The voltage rises during the duration of the current pulse and reaches its maximum at the end of the current injection, followed by an exponential decay with the form  $e^{-t/\tau}$ . The impulse response of a charge-sensitive preamplifier is given by:

$$h[n] = -\frac{1}{C_f} e^{-n/\tau} \quad (3)$$

Here $C_f$ is the feedback capacitance and $\tau$ is the decay time constant, expressed in Eq. (3) in units of the sampling interval. For an arbitrary input current $i[n]$ (detector output), the output voltage pulse $v[n]$ from the preamplifier can be obtained through the discrete convolution operation $v = i * h$, that is,

$$v[n] = -\frac{1}{C_f} \sum_{k=0}^n i[k] e^{-(n-k)/\tau} \quad (4)$$

The area of the voltage pulse is obtained by summing  $v[n]$  as follows:

$$A_v = \sum_{n=0}^{\infty} v[n] = -\frac{1}{C_f} \frac{1}{1 - e^{-1/\tau}} Q \quad (5)$$

It should be noted that Eq. (5) is strictly valid only when the pulse shape is characterized by a single decay time constant $\tau$. This approximation holds for most standard resistive-feedback preamplifiers used in nuclear spectroscopy. Under this condition, for an arbitrary detector current pulse, the area $A_v$ of the preamplifier output voltage pulse is proportional to the total charge $Q$. The integrated pulse area is therefore also proportional to the energy deposited by the incident particle in the detector ($E \propto Q$).
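The proportionality in Eq. (5) can be checked numerically. The sketch below builds $v[n]$ from Eq. (4) for an arbitrary current pulse and compares the summed area with the closed form. The current samples, $\tau$, and $C_f$ are arbitrary illustrative values, and the infinite sum is truncated at 2000 samples, where the remaining tail is negligible.

```python
import math

def preamp_output(current, tau, c_f, n_out):
    """Eq. (4): v[n] = -(1/C_f) * sum_k i[k] * exp(-(n - k)/tau)."""
    v = []
    for n in range(n_out):
        k_max = min(n, len(current) - 1)
        acc = sum(current[k] * math.exp(-(n - k) / tau) for k in range(k_max + 1))
        v.append(-acc / c_f)
    return v

current = [0.0, 1.0, 3.0, 2.0, 0.5]  # arbitrary detector current samples
tau, c_f = 20.0, 2.0                 # decay constant (in samples), feedback capacitance
v = preamp_output(current, tau, c_f, n_out=2000)

area = sum(v)                                                    # left side of Eq. (5)
expected = -sum(current) / (c_f * (1.0 - math.exp(-1.0 / tau)))  # right side of Eq. (5)
```

The two values agree to within floating-point precision regardless of how the charge is distributed over the current samples, which is why the pulse area can serve as an energy estimator.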

As previously discussed, the energy measurement of particles boils down to extracting the integrated pulse area, whether dealing with current pulses from a PMT or voltage pulses from a charge-sensitive preamplifier. A recursive implementation of the trapezoidal filter algorithm was introduced in [15], which is well-suited for efficient FPGA deployment and has been widely adopted in real-time systems. While it was reported in [16] that a 'directive' method (which calculates energy by directly comparing post-rise samples with the projected decay of pre-rise samples) outperforms the recursive approach in certain scenarios, this proposed method does not account for ballistic deficit effects.

As reported in [17], a method for estimating pulse area through local three-segment integration has been proposed. The approach is based on a mathematically rigorous derivation and is capable of processing overlapping pulses, thereby supporting higher count rates. A schematic diagram of this method is presented in Fig. 5. The pulse area can be derived from the three contiguous segments ( $L_0$ ,  $L_g$ ,  $L_1$ ) along the rising edge of the pulse, using their corresponding integrated values  $S'_0$ ,  $S'_g$ , and  $S'_1$ ,

$$A_v = -\frac{b^{L_0}}{1 - b^{L_0}} S'_0 + S'_g + \frac{1}{1 - b^{L_1}} S'_1 \quad (6)$$

where  $b = \exp(-\Delta t/\tau)$ ,  $\Delta t$  is the time between samples.
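Eq. (6) can be verified on a synthetic pile-up case. In the sketch below, a new pulse with a three-sample rise sits on the decaying tail of an earlier pulse; the baseline is zero, as assumed in this section, and positive amplitudes are used for readability (the sign in Eq. (3) only changes the overall scale). The helper names and all numerical values are illustrative assumptions.

```python
import math

def three_sum_area(v, start, L0, Lg, L1, b):
    """Eq. (6): pulse area from three contiguous sums beginning at index start."""
    s0 = sum(v[start:start + L0])
    sg = sum(v[start + L0:start + L0 + Lg])
    s1 = sum(v[start + L0 + Lg:start + L0 + Lg + L1])
    return -b**L0 / (1 - b**L0) * s0 + sg + s1 / (1 - b**L1)

def exp_response(current, onset, b, n):
    """Zero-baseline pulse: each current sample launches a decay b**(m - k)."""
    v = [0.0] * n
    for j, amp in enumerate(current):
        for m in range(onset + j, n):
            v[m] += amp * b ** (m - onset - j)
    return v

b = math.exp(-1.0 / 40.0)                        # decay per sample, tau = 40 samples
n = 200
old = exp_response([5.0], 0, b, n)               # earlier pulse, still decaying
new = exp_response([1.0, 2.0, 1.0], 60, b, n)    # pulse of interest, onset at 60
v = [p + q for p, q in zip(old, new)]

# L0 = 20 samples before the onset, Lg = 4 covering the rise, L1 = 30 after it.
area = three_sum_area(v, start=40, L0=20, Lg=4, L1=30, b=b)
true_area = sum([1.0, 2.0, 1.0]) / (1 - b)       # area of the new pulse alone
```

The recovered `area` matches `true_area` even though the new pulse rides on the tail of the earlier one, because the $S'_0$ term subtracts the extrapolated contribution of that tail.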



Fig. 6. Schematic diagram of baseline evaluation algorithm.



Fig. 7. Schematic diagram of five consecutive summation regions on a pulse.

### 5.3. Pulse shape discrimination

Digital PSD is widely used to extract various types of information from the output pulses of detectors. Among these techniques, the charge comparison method, based on long- and short-gate integration, has been extensively adopted due to its straightforward implementation in FPGA architectures. This method can be expressed as follows:

$$PSD_{Q_s/Q_l} = \frac{\sum_{i=0}^{L_{short}} x[i]}{\sum_{i=0}^{L_{long}} x[i]} \quad (7)$$

where  $L_{short}$  and  $L_{long}$  are the two different integration times.
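A minimal sketch of Eq. (7) is given below, assuming a baseline-subtracted waveform whose first sample coincides with the start of both gates. The two test pulses are synthetic: one has a single fast decay, the other adds a slow component, and the decay factors and gate lengths are arbitrary illustrative choices.

```python
def psd_charge_ratio(x, l_short, l_long):
    """Eq. (7): short-gate charge divided by long-gate charge (inclusive upper limits)."""
    q_short = sum(x[:l_short + 1])
    q_long = sum(x[:l_long + 1])
    return q_short / q_long

# Synthetic pulses: purely fast decay versus fast decay plus a slow component.
fast_pulse = [0.5 ** n for n in range(100)]
slow_pulse = [0.7 * 0.5 ** n + 0.3 * 0.95 ** n for n in range(100)]

ratio_fast = psd_charge_ratio(fast_pulse, l_short=5, l_long=99)
ratio_slow = psd_charge_ratio(slow_pulse, l_short=5, l_long=99)
```

The pulse with the slow component carries a larger fraction of its charge outside the short gate, so `ratio_slow` comes out clearly smaller than `ratio_fast`, and this separation is what the discrimination relies on.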

In Ref. [18], a general-purpose PSD algorithm based on the cosine similarity measure was proposed. This method effectively integrates the principles of three established PSD techniques: rise-time discrimination, charge comparison, and signal power analysis. The algorithm is expressed as follows:

$$PSD_{\cos \theta} = \frac{\sum_{i=0}^L x[i]}{\sqrt{L} \sqrt{\sum_{i=0}^L x[i]^2}} \quad (8)$$

where $L$ is the number of samples in the time window. The numerator corresponds to the area under the waveform, a quantity that is also used in the charge-comparison method for PSD. The denominator contains the square root of the sum of the squared signal samples, which reflects the signal power and is likewise dependent on pulse shape characteristics.

It is evident that this algorithm is also well-suited for implementation on an FPGA. Specifically, it can be efficiently realized using two sum values: the first one is the cumulative sum of the waveform samples, and the second one is the cumulative sum of their squares.
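The two-accumulator structure can be sketched as follows. The loop mirrors what would run sample by sample in hardware, while the square roots and the division are left to the end, as they could be offline. The function name is hypothetical, and the window length is taken as the number of samples passed in.

```python
import math

def psd_cosine(x):
    """Eq. (8) computed from two running sums over the window x."""
    sum_x = 0.0    # accumulator 1: sum of samples
    sum_x2 = 0.0   # accumulator 2: sum of squared samples
    for sample in x:
        sum_x += sample
        sum_x2 += sample * sample
    return sum_x / (math.sqrt(len(x)) * math.sqrt(sum_x2))

flat = psd_cosine([2.0] * 16)            # rectangular pulse filling the window
spike = psd_cosine([1.0] + [0.0] * 15)   # all charge in a single sample
```

A rectangular pulse gives the maximum value of 1, while a single-sample spike in a 16-sample window gives 0.25, so the quantity ranges between these extremes according to how the charge is spread over the window.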



**Fig. 8.** The energy resolution at 1836 keV as a function of count rate for the four channels of the CLOVER detector.

### 5.4. Baseline determination

The baseline restoration is not merely a monitoring tool but a critical prerequisite for the subsequent signal processing stages. Owing to factors such as thermal drift and detector leakage current, the output pulses from detectors are often superimposed on a typically unstable baseline. A stable and accurate baseline value is dynamically subtracted from the raw signal to eliminate low-frequency noise and thermal drift. This step is indispensable for the energy filter (described in Section 5.2) to determine the correct pulse energy, and for the PSD algorithm (Section 5.3) to perform precise charge integration, as any baseline deviation would directly distort the calculated energy and particle identification parameters.

A common method for baseline estimation involves calculating the mean values of samples within a time window before each pulse. However, this approach fails to handle cases where pulses overlap.

As illustrated in Fig. 6, a more versatile baseline estimation method is based on the integration of two consecutive pulse segments ( $L_M$ ,  $L_N$ ) selected from regions outside the pulse rising edge:

$$DC = \frac{S_M - A \cdot S_N}{L_M - A \cdot L_N} \quad (9)$$

where $A = (1 - b^{L_M})/(b^{L_M} - b^{L_M + L_N})$, and $S_M$ and $S_N$ are the sums of the samples in the two segments.
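Eq. (9) can be checked numerically on a synthetic waveform. The sketch below assumes that $b$ is the per-sample decay factor of an exponential pulse tail and that the two segments lie on that tail, so each sample has the form $DC + C\,b^{i}$. The function name and all numerical values are hypothetical.

```python
def baseline_two_segments(s_m, s_n, l_m, l_n, b):
    """Baseline from two consecutive segment sums, following Eq. (9)."""
    a = (1 - b**l_m) / (b**l_m - b**(l_m + l_n))
    return (s_m - a * s_n) / (l_m - a * l_n)


# Synthetic tail: baseline of 100 counts plus an exponential of amplitude 500.
dc_true, amplitude, b = 100.0, 500.0, 0.99
l_m, l_n = 40, 20
wave = [dc_true + amplitude * b**i for i in range(l_m + l_n)]

s_m = sum(wave[:l_m])   # sum over the first segment (length L_M)
s_n = sum(wave[l_m:])   # sum over the second segment (length L_N)
dc_est = baseline_two_segments(s_m, s_n, l_m, l_n, b)
print(abs(dc_est - dc_true) < 1e-6)
```

The factor $A$ is the ratio of the exponential contributions to the two sums, so the tail cancels in the numerator and only the baseline term survives. The same expression also returns the baseline when no tail is present.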

#### 5.5. Implementation and results

On an FPGA, each block or arithmetic operation consumes part of the available hardware resources, and floating-point operations require significantly more resources than fixed-point operations.

The FPGA embedded in these devices operates at 125 MHz. The number of ADC samples received in each clock cycle depends on the digitizer model: one sample for the 2470/2745 modules, four simultaneous samples for the 2730 module, and eight for the 2751 module. To simplify the processing architecture while maintaining acceptable measurement resolution, the multiple ADC samples arriving within a single FPGA clock cycle are summed and treated as a single combined sample.
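The per-clock summation can be emulated in software as follows. This is a minimal sketch with a hypothetical function name; the handling of an incomplete trailing group (dropped here) is an assumption made only for this example.

```python
def combine_per_clock(adc_samples, samples_per_clock):
    """Sum the ADC samples that arrive within one FPGA clock cycle."""
    # Assumption: an incomplete trailing group is dropped.
    usable = len(adc_samples) - len(adc_samples) % samples_per_clock
    return [
        sum(adc_samples[i:i + samples_per_clock])
        for i in range(0, usable, samples_per_clock)
    ]


raw = [1, 2, 3, 4, 5, 6, 7, 8]
print(combine_per_clock(raw, 4))  # four samples per clock, as for the 2730
```

With one sample per clock cycle the waveform is unchanged; with four or eight, every downstream filter sees one combined sample per 125 MHz cycle.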

In our design philosophy, floating-point operations are offloaded to an offline Central Processing Unit (CPU) whenever possible, offering greater flexibility for subsequent data analysis. We avoid recursive methods and instead use simple accumulators in the algorithms described above.

The trigger filter consists of two accumulators and requires only five addition operations per clock cycle, which keeps it computationally efficient and well suited to FPGA implementation. A trigger is generated when the output of the trigger filter exceeds a preset threshold. The timestamp of this threshold crossing serves as the reference point for subsequent pulse information extraction.
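The trigger-filter equations are not restated in this section, so the Python sketch below assumes one common form that is consistent with the operation count given above: the difference between two moving sums of equal length separated by a gap. Each moving sum costs two additions per sample and the difference costs one, five in total. The function name, the window parameters, and the warm-up handling are illustrative assumptions, not the firmware implementation.

```python
def trigger_indices(samples, length, gap, threshold):
    """Indices at which an assumed two-accumulator filter crosses threshold."""

    def at(i):
        # Samples before the start of the record are treated as zero.
        return samples[i] if i >= 0 else 0

    lead = 0  # moving sum of the most recent `length` samples
    lag = 0   # moving sum of the `length` samples preceding the gap
    above = False
    triggers = []
    for n in range(len(samples)):
        lead += at(n) - at(n - length)                           # 2 additions
        lag += at(n - length - gap) - at(n - 2 * length - gap)   # 2 additions
        output = lead - lag                                      # 1 addition
        if n < 2 * length + gap - 1:
            continue  # both windows are not yet filled with real samples
        if output > threshold and not above:
            triggers.append(n)  # rising crossing defines the trigger time
        above = output > threshold
    return triggers


# Step of 100 counts on a baseline of 10 counts, starting at sample 20.
step = [10] * 20 + [110] * 20
print(trigger_indices(step, length=4, gap=1, threshold=150))
```

Because both sums are updated incrementally, the cost per sample does not depend on the window length, which is the property that makes accumulator-based filters attractive in hardware.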

For each valid trigger, the firmware records five partial sums of the waveform, denoted as  $S_M$ ,  $S_N$ ,  $S_0$ ,  $S_g$ , and  $S_1$ , along with two timestamps, one of which corresponds to the preceding trigger. These five integration regions are illustrated in Fig. 7.

During offline processing, the baseline (DC) is estimated under the following conditional logic:

- If the time interval  $\Delta T$  between two consecutive triggers exceeds  $L_M + L_N + L_0 + L_g$ , the baseline is derived from the partial sums  $S_M$  and  $S_N$ ;
- If $\Delta T$ is greater than $L_N + L_0 + L_g$ but does not exceed $L_M + L_N + L_0 + L_g$, the baseline is estimated from the segments $L_N$ and $L_0$, i.e. from the partial sums $S_N$ and $S_0$;
- If $\Delta T$ exceeds only $L_0 + L_g$, the baseline value from the previous event is reused for the current event.
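The conditional logic above can be sketched as a small offline routine. Two points are assumptions of this example rather than statements from the text: the second branch applies the functional form of Eq. (9) to the segment pair ($L_N$, $L_0$), and intervals shorter than $L_0 + L_g$ return `None` so that the caller decides how to treat them. All names and numerical values are hypothetical.

```python
def two_segment_baseline(s_a, s_b, l_a, l_b, b):
    """Eq. (9) applied to two consecutive segments of lengths l_a and l_b."""
    a = (1 - b**l_a) / (b**l_a - b**(l_a + l_b))
    return (s_a - a * s_b) / (l_a - a * l_b)


def select_baseline(delta_t, sums, lengths, b, previous_dc):
    """Choose the baseline source from the interval to the preceding trigger."""
    l_m, l_n, l_0, l_g = (lengths[k] for k in ("M", "N", "0", "g"))
    if delta_t > l_m + l_n + l_0 + l_g:
        return two_segment_baseline(sums["M"], sums["N"], l_m, l_n, b)
    if delta_t > l_n + l_0 + l_g:
        # Assumption: same two-segment formula, shifted to segments N and 0.
        return two_segment_baseline(sums["N"], sums["0"], l_n, l_0, b)
    if delta_t > l_0 + l_g:
        return previous_dc
    return None  # Assumption: shorter intervals are left to the caller.


# Synthetic check: a pure exponential tail on a baseline of 100 counts.
b = 0.99
lengths = {"M": 30, "N": 20, "0": 25, "g": 5}
tail = [100.0 + 400.0 * b**i for i in range(75)]
sums = {"M": sum(tail[:30]), "N": sum(tail[30:50]), "0": sum(tail[50:75])}

dc_long = select_baseline(200, sums, lengths, b, previous_dc=None)
dc_mid = select_baseline(60, sums, lengths, b, previous_dc=None)
dc_short = select_baseline(40, sums, lengths, b, previous_dc=99.5)
print(round(dc_long, 6), round(dc_mid, 6), dc_short)
```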

The energy can then be obtained using the following expression:

$$E = -\frac{b^{L_0}}{1 - b^{L_0}}(S_0 - DC \cdot L_0) + (S_g - DC \cdot L_g) + \frac{1}{1 - b^{L_1}}(S_1 - DC \cdot L_1) \quad (10)$$
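Eq. (10) can be verified on a synthetic pile-up event. The sketch below assumes an idealized pulse with an instantaneous rise at the start of the gap region, sitting on the exponential tail of an earlier pulse, with $b$ taken as the per-sample decay factor. Under these assumptions the expression evaluates to the pulse amplitude divided by $(1 - b)$, independent of the tail and of the baseline. Function and variable names are hypothetical.

```python
def energy_from_sums(s_0, s_g, s_1, l_0, l_g, l_1, b, dc):
    """Energy from three partial sums and the baseline, following Eq. (10)."""
    c_0 = -b**l_0 / (1 - b**l_0)
    c_1 = 1 / (1 - b**l_1)
    return (c_0 * (s_0 - dc * l_0)
            + (s_g - dc * l_g)
            + c_1 * (s_1 - dc * l_1))


# Synthetic event: baseline + tail of an earlier pulse + new pulse at the gap.
b, dc = 0.995, 100.0
tail_amp, pulse_amp = 300.0, 1000.0
l_0, l_g, l_1 = 50, 10, 50

wave = []
for i in range(l_0 + l_g + l_1):
    value = dc + tail_amp * b**i
    if i >= l_0:  # idealized instantaneous rise at the start of the gap
        value += pulse_amp * b**(i - l_0)
    wave.append(value)

s_0 = sum(wave[:l_0])
s_g = sum(wave[l_0:l_0 + l_g])
s_1 = sum(wave[l_0 + l_g:])
energy = energy_from_sums(s_0, s_g, s_1, l_0, l_g, l_1, b, dc)
print(abs(energy * (1 - b) - pulse_amp) < 1e-6)
```

The leading-sum coefficient removes the contribution of the preceding tail, which is why the result does not depend on `tail_amp`. Real pulses have a finite rise time, which the gap region is there to absorb.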

Similarly, for each valid trigger, the firmware records the short-gate ( $S_s$ ) and long-gate ( $S_l$ ) integrals of the pulse, both of which include the baseline contribution, for particle identification via the charge-comparison method. The firmware also records the integrated pulse value ( $S_p$ ) and the integrated squared pulse value ( $S_{pp}$ ) over a time window of length  $L$  (also including the baseline contribution), which are used in the cosine similarity measure for pulse shape discrimination.



**Fig. 9.** The environmental background measured with the BaF<sub>2</sub> detector.

The two PSD metrics can be calculated using the expressions below:

$$PSD_{Q_s/Q_l} = \frac{S_s - DC \cdot L_s}{S_l - DC \cdot L_l} \quad (11)$$

$$PSD_{\cos \theta} = \frac{S_p - DC \cdot L}{\sqrt{L} \sqrt{S_{pp} - 2 \cdot DC \cdot S_p + L \cdot DC^2}} \quad (12)$$
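Both expressions follow from removing the baseline after integration. For Eq. (12), the denominator uses the expansion $\sum (x[i] - DC)^2 = S_{pp} - 2 \cdot DC \cdot S_p + L \cdot DC^2$. The Python sketch below applies Eqs. (11) and (12) to sums that include the baseline and compares the cosine metric with a direct evaluation on the baseline-free pulse. The function names, gate lengths, and toy waveform are hypothetical.

```python
import math


def psd_charge_ratio(s_s, s_l, l_s, l_l, dc):
    """Short-gate to long-gate charge ratio, following Eq. (11)."""
    return (s_s - dc * l_s) / (s_l - dc * l_l)


def psd_cos_theta(s_p, s_pp, length, dc):
    """Cosine-similarity PSD from raw sums, following Eq. (12)."""
    area = s_p - dc * length
    power = s_pp - 2 * dc * s_p + length * dc * dc
    return area / (math.sqrt(length) * math.sqrt(power))


dc = 100
pulse = [40, 30, 20, 10, 5, 0, 0, 0]   # baseline-free toy pulse
raw = [dc + p for p in pulse]          # what the firmware integrates

l_s, l_l = 2, len(raw)                 # hypothetical short and long gates
s_s, s_l = sum(raw[:l_s]), sum(raw)
s_p, s_pp = sum(raw), sum(x * x for x in raw)

ratio = psd_charge_ratio(s_s, s_l, l_s, l_l, dc)
cos_raw = psd_cos_theta(s_p, s_pp, len(raw), dc)
cos_direct = sum(pulse) / (math.sqrt(len(pulse))
                           * math.sqrt(sum(p * p for p in pulse)))
print(abs(cos_raw - cos_direct) < 1e-12)
```

Keeping the raw sums in the event data and applying the baseline offline means that the baseline estimate can be refined later without reprocessing waveforms.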

Although these computations involve a considerable number of multiplication and square-root operations, they can be efficiently handled by a CPU.

The aforementioned algorithms have been implemented across the 27xx series digitizers. Due to resource constraints, the range of adjustable parameters varies among different digitizers. To evaluate the energy resolution of the algorithms, a CANBERRA CLOVER detector (4 × 60 × 60 BC) [19] and a high-activity <sup>88</sup>Y gamma source were used for testing.

**Fig. 8** shows the energy resolution of $\gamma$ rays at 1836 keV as a function of count rate for the four channels of the CLOVER detector, measured using different digitizers and firmware configurations. All systems employed identical trapezoidal filter rise and flat-top parameters. The results indicate that the algorithm proposed in this work achieves slightly better energy resolution than the DPP-PHA firmware. Its performance is comparable to that of the Pixie-16 500 MSPS module, although a slight degradation in resolution is observed at low count rates compared to the Pixie-16 100 MSPS module. This difference is primarily attributed to variations in the analog bandwidth of the front-end circuits across the digitizer models. The Pixie-16 100 MSPS module typically employs a lower cutoff frequency for its anti-aliasing filter than the higher-speed digitizers. This narrower analog bandwidth suppresses high-frequency noise components before digitization, resulting in a superior signal-to-noise ratio and thus better energy resolution in the low-count-rate regime, where electronic noise is the dominant factor.

To evaluate the implementation of the PSD algorithm, a BaF<sub>2</sub> detector manufactured by SCIONIX [20] was used. The detector crystal is of tapered cone geometry, with a large diameter of 25 mm, a small diameter of 19 mm, and a polished height of 25 mm. It is coupled to a Hamamatsu R3378-51 PMT.

**Fig. 9** presents the background spectrum measured with the BaF<sub>2</sub> detector, comparing the direct area integration method ($Q_l$) and the five-segment summation energy filter algorithm, which are equivalent in principle. In the high-energy region of the spectrum, distinct peaks are clearly visible. These features are attributed to the intrinsic $\alpha$ contamination within the BaF<sub>2</sub> crystal, rather than external environmental radiation. Due to the chemical similarity between radium and barium, trace amounts of <sup>226</sup>Ra form a substitutional solid solution within the crystal lattice during the growth process. While the precursor elements of the uranium series (such as <sup>238</sup>U) are effectively removed during chemical purification, <sup>226</sup>Ra persists. Consequently, the observed alpha peaks correspond to the decay of <sup>226</sup>Ra and its shorter-lived progeny (<sup>222</sup>Rn, <sup>218</sup>Po, etc.). The detection of these internal alpha events serves as a validation of the pulse processing algorithms in handling signals with different characteristics.

**Fig. 10.** The identification effectiveness of two particle identification methods for BaF<sub>2</sub> detectors.

Fig. 10 shows the performance evaluation of the two particle discrimination algorithms, with clear separation between  $\gamma$  and  $\alpha$  events.

## 6. Summary

In this paper, we have presented the design, implementation, and performance verification of a next-generation General-purpose Digital Data Acquisition system (GDDAQ). Moving beyond the fixed-firmware limitations of the system described in our previous work [4], this study validates a methodological shift toward an open-FPGA architecture using commercial off-the-shelf (COTS) hardware.

The primary contribution of this work is not merely the integration of updated digitizers, but the establishment of a flexible development framework that allows for the injection of custom physics algorithms into standard hardware. We successfully implemented and verified advanced real-time processing logic within the FPGA fabric, most notably a five-segment summation energy filter. This custom algorithm specifically addresses the challenge of pulse pile-up in high-rate scenarios.

Performance evaluations confirm that this software-defined approach achieves excellent energy resolution and particle identification capabilities while retaining the stability of industrial-grade hardware. By decoupling the complexity of hardware design from the flexibility of algorithm development, this work provides a reproducible pathway for nuclear physics laboratories. It demonstrates that researchers can achieve highly customized, ASIC-level signal processing performance required for cutting-edge experiments without the prohibitive cost and development time of fully custom electronics.

This project has been made public on GitHub (<https://github.com/wuhongyi/PKUCAENDAQ>).

## CRediT authorship contribution statement

**H.Y. Wu:** Writing – original draft, Software, Funding acquisition, Conceptualization. **Z.H. Li:** Writing – review & editing, Funding acquisition, Conceptualization. **M. Venaruzzo:** Writing – review & editing, Resources. **L. Colombini:** Software, Resources. **D.W. Luo:** Writing – review & editing. **H. Hua:** Writing – review & editing, Funding acquisition. **S. Nishimura:** Resources, Funding acquisition. **A. Abba:** Validation, Resources. **Y. Venturini:** Validation, Resources. **C. Tintori:** Validation, Resources. **M. Bianchini:** Validation, Resources.

## Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

## Acknowledgments

This work was supported by the National Key R&D Program of China (Grant No. 2022YFA1602302), the National Natural Science Foundation of China (Grants No. 12535008, No. 12035001, No. U2167201), the Japan Society for the Promotion of Science KAKENHI (Grant No. 25H01273), and the Continuous-Support Basic Scientific Research Project, China (Grant No. BJ010261223282).

## Data availability

Data will be made available on request.

## References

- [1] W. Warburton, P. Grudberg, Current trends in developing digital signal processing electronics for semiconductor detectors, Nucl. Instrum. Methods Phys. Res. A 568 (1) (2006) 350–358, <http://dx.doi.org/10.1016/j.nima.2006.07.021>, New Developments in Radiation Detectors.
- [2] A. Al-Adili, F.-J. Hambsch, S. Oberstedt, et al., Comparison of digital and analogue data acquisition systems for nuclear spectroscopy, Nucl. Instrum. Methods Phys. Res. A 624 (3) (2010) 684–690, <http://dx.doi.org/10.1016/j.nima.2010.09.126>.
- [3] R. Grzywacz, C. Gross, A. Korgul, et al., Rare isotope discoveries with digital electronics, Nucl. Instrum. Methods Phys. Res. B 261 (1) (2007) 1103–1106, <http://dx.doi.org/10.1016/j.nimb.2007.04.234>.
- [4] H.Y. Wu, Z.H. Li, H. Tan, et al., A general-purpose digital data acquisition system (GDDAQ) at Peking University, Nucl. Instrum. Methods Phys. Res. A 975 (2020) 164200, <http://dx.doi.org/10.1016/j.nima.2020.164200>.
- [5] D. Luo, H. Wu, Z. Li, et al., Performance of digital data acquisition system in gamma-ray spectroscopy, Nucl. Sci. Tech. 32 (2021) 79, <http://dx.doi.org/10.1007/s41365-021-00917-8>.
- [6] D. Luo, C. Guo, S. Zhang, et al., The implementation of a focal plane detector system at the gas-filled recoil separator for the decay studies of heavy nuclei, Nucl. Instrum. Methods Phys. Res. A 1075 (2025) 170333, <http://dx.doi.org/10.1016/j.nima.2025.170333>.
- [7] H. Jian, X. Xu, K. Wang, et al., Detector array with digital data acquisition system for charged-particle decay studies, Nucl. Sci. Tech. 36 (2025) 73, <http://dx.doi.org/10.1007/s41365-025-01667-7>.
- [8] J. Zhang, H. Wu, W. Jiang, et al., Development of a pile-up pulse recovery algorithm for the LaBr<sub>3</sub> detector, Nucl. Instrum. Methods Phys. Res. A 1063 (2024) 169273, <http://dx.doi.org/10.1016/j.nima.2024.169273>.
- [9] M. Kang, J. Zhang, H. Wu, et al., Commissioning of the fast neutron detector array at China Institute of Atomic Energy, Nucl. Sci. Tech. 36 (2025) 86, <http://dx.doi.org/10.1007/s41365-025-01649-9>.
- [10] <https://www.caen.it/>.
- [11] <https://www.caen.it/products/v2495/>.
- [12] <https://www.qt.io/>.
- [13] <https://root.cern.ch/>.
- [14] <https://www.sci-compiler.com/>.
- [15] V.T. Jordanov, G.F. Knoll, Digital synthesis of pulse shapes in real time for high resolution radiation spectroscopy, Nucl. Instrum. Methods Phys. Res. A 345 (2) (1994) 337–345, <http://dx.doi.org/10.1016/0168-9002(94)91011-1>.
- [16] L. Begley, S. Zhu, M. Carpenter, et al., Algorithms of pulse shape analysis for Gammasphere under high count rate conditions, Nucl. Instrum. Methods Phys. Res. A 1040 (2022) 167113, <http://dx.doi.org/10.1016/j.nima.2022.167113>.
- [17] H. Tan, M. Momayezi, A. Fallu-Labruyere, et al., A fast digital filter algorithm for gamma-ray spectroscopy with double-exponential decaying scintillators, IEEE Trans. Nucl. Sci. 51 (4) (2004) 1541–1545, <http://dx.doi.org/10.1109/TNS.2004.832984>.
- [18] M. Nakhostin, A general-purpose digital pulse shape discrimination algorithm, IEEE Trans. Nucl. Sci. 66 (5) (2019) 838–845, <http://dx.doi.org/10.1109/TNS.2019.2910153>.
- [19] <https://www.mirion.com/>.
- [20] <https://scionix.nl/>.