

# A Neuromorphic Analog RRAM Resistive crossbar for Matrix Multiplication with Photonic Interconnects

By: Aneesh Arnav Chikkala

## Abstract

As digital computation approaches fundamental limits, with transistor scaling no longer keeping pace with memory-speed and energy constraints, analog computing offers a promising alternative for arithmetic operations such as matrix multiplication and other highly parallel workloads. Because matrix multiplication is the primary computational demand in most AI training and modeling frameworks, this work presents a $32 \times 32$ analog resistive crossbar with neuromorphic properties, built with peripheral op-amp circuitry specifically for matrix multiplication and supported by a custom measurement pipeline.

While resistive crossbars offer strong theoretical advantages in efficiency and parallelism, their practical behavior remains insufficiently characterized for reliable large-scale deployment. Key electrical factors affecting accuracy, stability, and reliability, such as conductance variation, offset errors, noise, and crosstalk, are still not well understood, which has limited most prior work to device-level demonstrations. This work implements the full computing chain with custom drivers to directly measure and evaluate these parameters, advancing research on crossbar-based analog computation.

To address the memory-access and bandwidth limitations affecting digital solutions, photonic interconnects were added as a communication layer between compute units, with these optical paths also evaluated for coupling efficiency, phase skew, and SNR. In effect, the data collected across the crossbar and surrounding analog circuitry confirm many behaviors predicted in prior literature while also enabling new empirically derived relations. Together, these results strengthen the theoretical foundation of analog crossbar computing and extend it with hardware-validated insights for future system-level design.

## **Technical Review**

### Digital Limits and the Need for Analog

Modern digital computing faces a fundamental slowdown caused by the limits of logic scaling, stagnating memory-access performance, and the plateau of Moore's law. The memory side of this problem is commonly referred to as the 'memory wall' [21]. Although transistor switching efficiency has improved, data-movement speed and energy have not kept pace. This restricts matrix-heavy AI workloads because, in the digital domain, the cost of matrix multiplication is dominated by memory traffic rather than arithmetic. Studies show that 60-90% of the energy in a neural network accelerator is spent on memory access, not MAC operations [22]. This architectural limitation is driving a shift in computing research towards analog, conductance-based devices and arrays.

### Resistive Crossbars for Matrix Multiplication

Analog resistive crossbars, unlike digital alternatives, perform matrix-vector multiplication (MVM) in a single physical step. For a given voltage vector, applying $V_i$ to each row $i$ produces a current in each column $j$ given by:

$$I_j = \sum_i G_{ij} V_i$$

Here $i$ and $j$ index the row and column respectively, and $G_{ij}$ is the conductance of the resistor at position $(i, j)$. This relation, established in [1], is the primary reason crossbars achieve $O(1)$ physical latency and high efficiency for MVM.
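The column-current relation above can be sketched numerically. This is a minimal illustration with NumPy, not the measurement code used in this work: the function name `crossbar_mvm` and the small 2x3 demo array are assumptions made for the example, with every cell set to a 10 kΩ resistor.

```python
import numpy as np

def crossbar_mvm(G, v_rows):
    """Ideal column currents I_j = sum_i G[i, j] * V[i], in amperes."""
    G = np.asarray(G, dtype=float)            # conductances in siemens, shape (rows, cols)
    v_rows = np.asarray(v_rows, dtype=float)  # row voltages in volts, shape (rows,)
    return v_rows @ G                         # one current per column

# Illustrative 2x3 array (hypothetical): every cell is 10 kOhm, so G = 1e-4 S.
G_demo = np.full((2, 3), 1e-4)
v_demo = np.array([1.0, 2.0])
i_demo = crossbar_mvm(G_demo, v_demo)         # each column sums 1e-4*1 + 1e-4*2
```

Each column current here is simply the sum of the two row contributions, which is the same summation the physical array performs on its column wires.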

### Device Nonlinearities and Variation

Analog computing, although highly efficient, is highly vulnerable to divergence from ideal behaviour for several reasons, such as:

- Nonlinear I-V curves [3], [17]
- Conductance drift and stochasticity [18]
- Finite ON/OFF ratio

Among other factors, the work of Li et al. [2], [4] shows that without calibration, analog NVM accuracy collapses rapidly with increasing vector and array size.

### Neuromorphic and Analog for AI

Resistive crossbars are considered neuromorphic because:

- Their conductance states act like synapses, storing weights
- Their parallel current summation resembles the way the human brain accumulates context
- Their resistive memory mimics RAM-like contextual computing

This work uses local op-amps, TIAs (transimpedance amplifiers), peripheral drivers, and analog signal conditioning to implement the neuromorphic compute block used in [7] and many other crossbar accelerators.

### Analog Peripherals: The Bottleneck

Most research focuses on RRAM devices, but the primary bottleneck for analog computing that is not error-related is not the crossbar itself but its surrounding peripheral components, specifically:

- TIA settling time and bandwidth limits
- Output swing limits
- Offset mismatches
- Electronic and thermal noise
- Stability and phase margin of op-amp stages

These issues are well documented in analog circuit fundamentals [8], [9] and arise primarily from the lack of high-bandwidth signal-generation and conditioning ICs.

### TIA Settling Dynamics

The TIA converts current to voltage and plays a crucial part in resistive sensing in particular. Its step response is modelled as:

$$V_{out}(t) = V_f(1 - e^{-t/\tau})$$

Here $\tau$ denotes the RC time constant, approximately 9 ns, and $V_f$ is the final settled output. The settling time is approximately $5\tau$, in this case 45 ns. The corresponding bandwidth is:

$$BW \approx \frac{1}{2\pi\tau}$$

This estimate puts the bandwidth close to 18.5 MHz, which caps MAC throughput and makes the TIA the primary peripheral limit on output rate. From the data collected through the measurement pipeline, $V_f$ was estimated to be approximately 3.2 V per output rail, without accounting for offsets and other parasitics.
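The settling and bandwidth figures above follow directly from the first-order model and can be reproduced with a few lines of arithmetic. The sketch below assumes $\tau$ is exactly 9 ns, which yields about 17.7 MHz; the 18.5 MHz figure quoted in the text corresponds to a slightly smaller $\tau$ of about 8.6 ns, consistent with $\tau$ being stated only approximately. The function names are hypothetical.

```python
import math

def tia_step_response(t, v_final, tau):
    """First-order step response V_out(t) = V_f * (1 - exp(-t / tau))."""
    return v_final * (1.0 - math.exp(-t / tau))

def settling_time(tau, n_tau=5.0):
    """Settling time taken as a multiple of the time constant (5*tau here)."""
    return n_tau * tau

def bandwidth_hz(tau):
    """-3 dB bandwidth of a single-pole response, BW = 1 / (2*pi*tau)."""
    return 1.0 / (2.0 * math.pi * tau)

tau = 9e-9                                   # assumed exact value of the ~9 ns constant
t_settle = settling_time(tau)                # 45 ns
bw = bandwidth_hz(tau)                       # roughly 17.7 MHz
settled_fraction = tia_step_response(t_settle, 1.0, tau)  # fraction of V_f reached at 5*tau
```

At $5\tau$ the output has reached a little over 99% of $V_f$, which is why $5\tau$ is a common working definition of settling time.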

### Crosstalk and Internal Parasitics

When individual rows and columns are treated as channels, RRAM crossbars exhibit interference between channels in the form of crosstalk, line/trace resistance, and parasitic capacitance. Applying the exponential decay model proposed in [2], [18] gives:

$$C(d) = Ae^{-\alpha d} + C_o$$

Here $C(d)$ is the crosstalk expressed as a current ratio at channel distance $d$, $\alpha$ is the rate of exponential decay, and $A$ and $C_o$ are constants. These parameters are used to model 1D and 2D crosstalk, which is then compensated mathematically by adaptive error correction [2], [18]. The derived models and outcomes are shown in the Results.

### Calibration, Linearity Correction and Bias modelling

Analog crossbars require linearity correction because their output depends strongly on the relationship between duty cycle and output current. This relation also means that the most promising calibration method is inverse signal correction with adaptive offset correction. The current in these systems is given by:

$$I = ad^2 + bd + c \quad \text{where } d = \frac{d_o}{\max(d_o)}$$

Here $a, b, c$ are constants, $d$ is the normalized duty ratio, and $d_o$ is the raw duty-cycle setting of the signal/voltage generator driving the crossbar.
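One way to apply this quadratic model is to fit $a, b, c$ from a duty sweep and then invert the fitted curve to find the duty ratio that produces a desired current. The sketch below is an illustration under assumptions, not the calibration code used in this work: the function names, the synthetic sweep, and the coefficient values (0.02, 0.30, 0.003) are hypothetical.

```python
import numpy as np

def fit_duty_curve(duty_raw, current):
    """Fit I = a*d^2 + b*d + c with d = d_o / max(d_o); returns (a, b, c)."""
    d = np.asarray(duty_raw, dtype=float)
    d = d / d.max()
    a, b, c = np.polyfit(d, np.asarray(current, dtype=float), 2)
    return a, b, c

def duty_for_current(i_target, a, b, c):
    """Invert the fitted curve: the duty ratio in [0, 1] that gives i_target."""
    if abs(a) < 1e-12:                      # effectively linear driver
        return (i_target - c) / b
    disc = b * b - 4.0 * a * (c - i_target)
    if disc < 0.0:
        raise ValueError("target current is outside the fitted curve")
    for sign in (1.0, -1.0):
        root = (-b + sign * np.sqrt(disc)) / (2.0 * a)
        if -1e-9 <= root <= 1.0 + 1e-9:     # keep the physically valid root
            return float(root)
    raise ValueError("no duty ratio in [0, 1] reaches the target current")

# Synthetic sweep (hypothetical coefficients, arbitrary current units).
duty_counts = np.arange(0, 4096, 256)
d_norm = duty_counts / duty_counts.max()
i_meas = 0.02 * d_norm**2 + 0.30 * d_norm + 0.003
a_fit, b_fit, c_fit = fit_duty_curve(duty_counts, i_meas)
d_half = duty_for_current(0.158, a_fit, b_fit, c_fit)   # current at d = 0.5
```

Solving the quadratic rather than assuming a linear driver is what makes this an inverse correction: the commanded duty is pre-distorted so the delivered current lands on the target.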

The linearity error is another parameter to examine, since any adjustment that minimizes this function directly improves output resolution. The error in 2D, based on [1], [2], is given by:

$$E_{ij} = \sqrt{\frac{1}{M} \sum_k (I_{jk} - G_{ij} V_{ik} - b_{ij})^2}$$

Here $M$ is the number of samples indexed by $k$, and $E_{ij}$ resolves to a per-element error that indicates where in the array the errors concentrate. Additional details on the evaluation and application of this error are given in the Results.

### Photonic Interconnects: Beyond Electronic Bandwidth

Another major constraint of digital electronics is poor scalability across multiple compute cores. Implementing photonic interconnects on this board allows for:

- bandwidth scaling with wavelength-division multiplexing (WDM) [11], [12]
- near speed-of-light propagation
- reduced crosstalk and EM interference
- extremely high aggregate throughput [13], [14]

Integrating photonics also introduces additional physical parameters, such as:

- Optical Coupling Efficiency

Mathematically defined as:

$$\eta = \frac{P_{rx}}{P_{tx}}$$

The propagation loss over a link of length $L$ is modelled exponentially in terms of the attenuation coefficient $\alpha$ as:

$$P_{rx} = \eta P_{tx} e^{-\alpha L}$$

This model is based on [15], [16], [19] and is elaborated on in the Results section.

- Channel-to-channel skew phase

The skew phase is derived from the time difference of arrival (TDOA) between the two modules. Because the modules act as communication layers, their timing must be standardised, as discussed in other optical timing research [20]. It is modelled as:

$$\phi = \omega \Delta t , \quad V_g = \frac{L}{\Delta t}$$

Here $\phi$ represents the phase shift, $V_g$ the group velocity, $L$ the link length, and $\Delta t$ the time interval between communications.

- Optical SNR (Signal to noise ratio)

The SNR is an important accuracy metric for the crossbar, since any noise directly degrades the results. Accounting for crossbar and peripheral parasitics and for signal-trace resistances, the SNR for this system is estimated by:

$$SNR \approx RIN + Shot + T_n(t_o)$$

Here RIN represents the relative intensity noise, Shot represents the shot (quantum) noise of the light, and $T_n(t_o)$ represents the thermal noise at temperature $t_o$. Further constants and parameters are determined from the models presented in [20] and are elaborated on in the Results section.
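The coupling and skew relations above can be evaluated numerically as a quick sanity check on link parameters. The sketch below is illustrative only: the coupling efficiency, attenuation coefficient, group index, and modulation frequency are hypothetical values chosen for the example, while the 80 mm link length is the one quoted later in the Results. The function names are assumptions.

```python
import math

C_LIGHT = 299_792_458.0   # speed of light in vacuum, m/s

def received_power(p_tx, eta, alpha, length):
    """P_rx = eta * P_tx * exp(-alpha * L); alpha in 1/m, length in m."""
    return eta * p_tx * math.exp(-alpha * length)

def link_loss_db(p_tx, p_rx):
    """Total link loss in dB (positive number means loss)."""
    return -10.0 * math.log10(p_rx / p_tx)

def skew_phase(freq_hz, delta_t):
    """phi = omega * delta_t, with omega = 2*pi*f, in radians."""
    return 2.0 * math.pi * freq_hz * delta_t

def group_velocity(length, delta_t):
    """V_g = L / delta_t."""
    return length / delta_t

L_LINK = 0.080                                   # 80 mm link length
p_rx = received_power(1e-3, 0.8, 2.0, L_LINK)    # hypothetical: 1 mW, eta = 0.8, alpha = 2 /m
loss_db = link_loss_db(1e-3, p_rx)               # roughly 1.66 dB
delta_t = L_LINK / (C_LIGHT / 1.5)               # hypothetical group index of 1.5
phi = skew_phase(100e6, delta_t)                 # hypothetical 100 MHz modulation
v_g = group_velocity(L_LINK, delta_t)
```

With these example numbers the transit time is about 0.4 ns, so the skew phase at 100 MHz stays well under one radian.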

## Methodology

### System-Level Design Philosophy

The primary goal of the system is to create a working resistive crossbar capable of matrix multiplication in which every major error source can be measured and accounted for directly in hardware.

To support this, the architecture is broken down into:

- PCA9685-based PWM signal generator

Although not a true signal-generating IC, the PCA9685 with its PWM duty control can produce RMS voltages ranging from 30 mV to 3.2 V, with a current output from 500 nA to 400 mA per channel, while often being more accurate than most signal generators. The only drawback is that it cannot generate sinusoidal waveforms. Since each PCA9685 has 16 channels, two ICs share the same I2C bus at addresses 0x40 and 0x41 respectively, wired as shown in Figure 1.



Figure 1. PCA9685 Implementation

- Resistive Crossbar

The crossbar is a 32×32 grid of resistors, with each row $i$ and each column $j$ placed perpendicular to each other and fitted with measuring hooks for data collection. The layout is shown in Figure 2.



Figure 2. 10 k$\Omega$ 32×32 resistive crossbar implementation

- Column Readout: 32-Channel TIA Array

Each column of the resistive crossbar is terminated with a TIA (transimpedance amplifier) that converts column currents to voltages. Using datasheet values and manual calibration, this conversion is modelled as:

$$I_j \rightarrow V_{out_j} = V_{ref} - I_j R_f$$

This conversion lets us measure key parameters such as settling time and overshoot, and helps model noise mathematically.

- Measurement and Control Pipeline

The PCA9685s and all other chips on the board are controlled by the microcontroller, in this case an ESP32-S3. It supports the measurement pipeline by reading analog inputs from preconfigured unity-gain buffers placed at important circuit nodes across the board.

- Measurement Setup

All signals were measured using a 100 MHz, 1 GSa/s oscilloscope, which more than satisfied the data requirements for accurate evaluation of most parameters.
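The drive and readout stages described above can be sketched numerically. This is a minimal illustration, not the firmware used on the board: the feedback resistance `R_F`, the reference voltage `V_REF`, and the helper names are assumptions chosen for the example. The only device fact relied on is the PCA9685's 12-bit PWM resolution (4096 steps per period).

```python
PWM_STEPS = 4096     # PCA9685 PWM resolution (12-bit)
R_F = 10_000.0       # assumed TIA feedback resistance in ohms (hypothetical)
V_REF = 1.65         # assumed TIA reference voltage in volts (hypothetical)

def duty_to_counts(duty):
    """Convert a duty ratio in [0, 1] to a 12-bit PWM on-time count."""
    if not 0.0 <= duty <= 1.0:
        raise ValueError("duty must be within [0, 1]")
    return min(PWM_STEPS - 1, round(duty * PWM_STEPS))

def tia_output(i_col, v_ref=V_REF, r_f=R_F):
    """Column readout model: V_out = V_ref - I_j * R_f."""
    return v_ref - i_col * r_f

def column_current(v_out, v_ref=V_REF, r_f=R_F):
    """Recover the column current from a measured TIA output voltage."""
    return (v_ref - v_out) / r_f

counts_half = duty_to_counts(0.5)    # half-scale duty
v_tia = tia_output(50e-6)            # 50 uA column current
i_back = column_current(v_tia)       # round trip back to the current
```

Inverting the readout model in software is what lets oscilloscope voltage captures be interpreted as column currents.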

## Results

The electrical behaviour of the row-driver subsystem was first characterised to determine the mapping between PWM duty cycle and delivered analog row voltage. Prior crossbar literature, including Li et al. [4], typically assumes linear DAC behaviour; however, our measurements show that the PCA-based driver exhibits a weak quadratic non-linearity. The applied duty cycle was swept across the full range, and the measured output was fitted to the mathematical model :

$$V_{row}(d) = ad^2 + bd + c , \quad d = \frac{d_o}{\max(d_o)}$$

as shown in Figure 3.



Figure 3. Fitted Coefficient Averages through 32 rows

The fitted coefficients for all 32 rows, which reveal row-wise driver mismatch, are summarised in Figure 4.



Figure 4. Row to Row Solution deviation

To observe the uncalibrated analog behaviour directly, a pre-calibration oscilloscope capture of the row-driver idle offset was recorded (constant input, no activity), shown in Figure 5.



Figure 5. Row driver Offset/Noise (Pre Calibration)

This directly corresponds to the static offset term referenced in previous research such as [8],[9] and matches the offset extracted in the statistical analysis of Figure 6, which shows measured offset versus ideal response and highlights the consistent additive bias.



Figure 6. Measured offset to Ideal

After accounting for offsets, it was determined that each channel carries a constant additive offset ranging from 27 to 36 mV due to system parasitics.

After row-level calibration, the complete matrix–vector multiplication behaviour of the system was evaluated. Following Di Ventra et al. [1], the predicted ideal column current is:

$$I_j = \sum_i G_{ij} V_{row_i} + b_{ij}$$

Here $b_{ij}$ represents the bias contributed by each row to a column. A scatter plot comparing measured and predicted outputs is shown in Figure 7, demonstrating a strong linear relationship in the resistive error across all tested inputs.



Figure 7. Resistive value to bias based error

The per-row ($\alpha, \beta$) parameters extracted from the linear fits appear in Figure 8. As the observed deviation resembles a standard deviation, it is reasonable to infer that the observed parasitic error is not random but can be mathematically compensated for as a deterministic function.



Figure 8. Column-wise Linear Distortion Parameters ( $\alpha, \beta$ )

The maximum per-element linearity error and its spatial exponential decay, consistent with trends noted by Li et al. [2], are shown in Figure 9.



Figure 9. Exponential decay during Matrix multiplication

This physical summation process is better illustrated through oscilloscope captures: a “pre-summation” dual-row drive signal and the corresponding “post-summation” TIA output were recorded. These appear in Figure 10 (two independent row drive signals) and Figure 11 (TIA-summed output). These time-domain signals provide direct hardware evidence of instantaneous analog summation as predicted by crossbar theory. Note that the signals have been scaled down for visual representation.



Figure 10. Pre summation Signal Response (1/100 Scale)



Figure 11. Post summation signal Response (1/500 and 1/200 Scale)

Beyond matrix multiplication behaviour, the peripheral analog components, which were often overlooked in previous studies, have been extensively studied and characterized.

A conductance map was reconstructed from manually measured and predictively estimated resistor values across various physical conditions. The extracted data was used to create a heatmap of $G_{ij}$ (the conductance of the resistor at row $i$ and column $j$), shown in Figure 12.



Figure 12. Conductance map around the crossbar

Note that the variance in conductance between the first column and the others is not due to parasitics; it arises because column 1 is the link to our PCA9685, giving it a lower resistance.

Crosstalk behaviour through the peripheral electronics was evaluated by stimulating a single row while measuring passive columns, validating the exponential model

$$C(d) = Ae^{-\alpha d} + C_o$$

As shown in Figure 13, the measured decay is consistent with the behaviour reported by Li et al. [4].



Figure 13. Peripheral Crosstalk decay
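The exponential crosstalk model can be fitted to single-row stimulation data with a simple two-stage procedure. The sketch below is one possible approach, not necessarily the fitting method used for Figure 13: it assumes the distances are sorted in ascending order, estimates $C_o$ from the far columns where the exponential term has died out, and then fits a straight line to the logarithm of the near-column excess. The function names, window sizes, and synthetic parameter values are hypothetical.

```python
import numpy as np

def crosstalk_model(d, amp, alpha, c_o):
    """C(d) = A * exp(-alpha * d) + C_o."""
    return amp * np.exp(-alpha * np.asarray(d, dtype=float)) + c_o

def fit_crosstalk(d, c_meas, n_tail=5, n_near=8):
    """Estimate (A, alpha, C_o) from crosstalk ratios at ascending distances d."""
    d = np.asarray(d, dtype=float)
    c_meas = np.asarray(c_meas, dtype=float)
    c_o = c_meas[-n_tail:].mean()                    # floor taken from the far columns
    excess = c_meas[:n_near] - c_o                   # decaying part on the near columns
    slope, intercept = np.polyfit(d[:n_near], np.log(excess), 1)
    return float(np.exp(intercept)), float(-slope), float(c_o)

# Synthetic, noise-free example with hypothetical parameters.
d_demo = np.arange(1, 32)
c_demo = crosstalk_model(d_demo, 0.05, 0.8, 0.002)
amp_fit, alpha_fit, c_o_fit = fit_crosstalk(d_demo, c_demo)
```

On real, noisy captures the near-column excess can become non-positive, in which case a nonlinear least-squares fit would be a more robust choice than the log-linear step used here.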

To measure dynamic response, a derivative-based small-signal method was used. A slow triangular drive allowed estimation of the full partial-derivative tensor represented by :

$$G_{ij} = \frac{\partial I_j}{\partial V_{row_i}}$$

As inferred in [1], the oscilloscope capture of such a waveform appears harmonic. A curve-fitting simulation in MATLAB decomposed the signal into:

$$V(t) = C \sin(2\pi f t + \phi)$$

$$V(t) = \frac{2A}{\pi} \sin^{-1}\left(\cos\left(\left|\frac{\pi}{2} - 2\pi f t\right|\right)\right)$$

This is shown in Figure 14.



Figure 14. Dynamic Response Tensor Frequency Response
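The closed-form triangular expression given above can be checked numerically: since $\cos(|\pi/2 - \theta|) = \sin\theta$, the expression reduces to a scaled $\sin^{-1}(\sin(2\pi f t))$, which is a triangle wave of amplitude $A$. The sketch below evaluates it at a few points; the function name and the example amplitude and frequency are hypothetical.

```python
import math

def triangle_wave(t, amplitude, freq):
    """V(t) = (2A/pi) * asin(cos(|pi/2 - 2*pi*f*t|)), a triangle wave of peak A."""
    x = abs(math.pi / 2.0 - 2.0 * math.pi * freq * t)
    return (2.0 * amplitude / math.pi) * math.asin(math.cos(x))

A_DEMO, F_DEMO = 3.2, 1_000.0                  # hypothetical: 3.2 V peak, 1 kHz
period = 1.0 / F_DEMO
v_zero = triangle_wave(0.0, A_DEMO, F_DEMO)              # zero crossing
v_eighth = triangle_wave(period / 8.0, A_DEMO, F_DEMO)   # halfway up the ramp
v_peak = triangle_wave(period / 4.0, A_DEMO, F_DEMO)     # positive peak
v_trough = triangle_wave(3.0 * period / 4.0, A_DEMO, F_DEMO)  # negative peak
```

The value at one eighth of a period being exactly half the peak confirms that the ramp is linear, which is what makes the slow triangular drive suitable for derivative estimation.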

The gain fits for all 32 rows derived from this method are plotted in Figure 15.



Figure 15. Row wise Dynamic Response Fit

The derived and reconstructed $G_{ij}$ matrix is represented as a heatmap in Figure 16, with a vector-field representation in Figure 17.



Figure 16. Vector heat map showing limits of Dynamic Range



Figure 17. Vector field representing  $G_{ij}$

A 3D reconstruction of sensitivity around the model was computed with offsets and row-mapping parameters accounted for, resulting in the surface shown in Figure 18.



Figure 18. Calibrated  $G_{ij}$  derivative Representation

Further analysis of the sensitivity (loss ratio) of the TIAs was used to compute an error estimate, resolving to a maximum error of 2.5 mV, as seen in Figure 19.



Figure 19. Peripheral Error Threshold Heatmap

To capture TIA noise and residual currents, precision electrometers were used to model noise addition and residual current, as seen in Figure 20.



Figure 20. TIA noise floor and summation-node deviation w.r.t. time

This gives an estimated average settling time of 20 µs and a residual current of 200 µA. Additionally, modelling this noise in the time domain gives an approximate noise function, evaluated by the following equation:

$$n(t) = 10^{-6}(9.2t^9 - 11.55t^8 - 61t^7 + 63.9t^6 + 143.9t^5 - 110.3t^4 - 144.6t^3 + 50.6t^2 + 68.7t + 5.3)$$

This noise approximation holds for most of the measured time domain.

Combining the row-drive characterisation and the TIA noise model, and referencing ideal simulations, gives the following ideal equation:

$$Y_j = V_{out,j} = -R_f I_j = -R_f \sum_{i=1}^{32} G_{ij} V_{row,i}$$

Accounting for the row characterisation, where $X_i$ is the input applied to row $i$:

$$Y_j = -R_f \sum_{i=1}^{32} G_{ij} (aX_i^2 + bX_i + c) , \quad j = 1, 2, 3, \dots, 32$$

Accounting for parasitics, we can infer that

$$Y_{raw} = C_{parasitic} Y + n(t)1 + V_{off}1 \quad , \text{ where } Y = AX$$

Here $A$ in $Y = AX$ denotes the crossbar array (distinct from the crosstalk amplitude $A$ of the decay model), and $1$ is the all-ones vector. Implementing an inversion-based correction algorithm, we can state:

$$Y_{corrected} = C_{parasitic}^{-1} (Y_{raw} - n(t)1 - V_{off}1)$$

For each column $j$, with the offset written as $V_{off} = R_f I_{off}$, this is given by:

$$Y_{corr,j} = \sum_{k=1}^{32} [C_{parasitic}^{-1}]_{jk} (Y_{raw,k} - n(t) - R_f I_{off})$$

The parasitic coupling between columns $i$ and $j$ can be stated as follows, with $\lambda$ representing the rate of crosstalk decay:

$$C_{parasitic}[i,j] = Ae^{-\lambda|i-j|} + C_o$$

Using the Toeplitz matrix $T$ with entries $T_{ij} = r^{|i-j|}$, where $r = e^{-\lambda}$, this can be simplified into:

$$C_{parasitic} = AT + C_o 1 1^T$$

The inverse of this Toeplitz matrix is tridiagonal, with the following interior diagonal and off-diagonal entries respectively:

$$[T^{-1}]_{i,i} = \frac{1+r^2}{1-r^2} \quad \text{and} \quad [T^{-1}]_{i,i+1} = [T^{-1}]_{i+1,i} = -\frac{r}{1-r^2}$$

The two corner diagonal entries are $\frac{1}{1-r^2}$, and all remaining entries are zero.

Applying the Sherman-Morrison formula to the parasitic matrix, we get:

$$(C_{parasitic})^{-1} = (AT + C_o 1 1^T)^{-1} = (AT)^{-1} - \frac{(AT)^{-1} (C_o 1 1^T) (AT)^{-1}}{1 + C_o 1^T (AT)^{-1} 1}$$

Written out in terms of $T^{-1}$, this is:

$$(C_{parasitic})^{-1} = \frac{1}{A} T^{-1} - \frac{C_o}{A^2} \frac{T^{-1} 1 1^T T^{-1}}{1 + \frac{C_o}{A} 1^T T^{-1} 1}$$

Applying this to the original equations gives a calibrated expression for $Y$:

$$Y_{corrected} = \frac{1}{A} T^{-1} (Y_{raw} - n(t)1 - V_{off}1) - \frac{C_o T^{-1} 1 1^T T^{-1} (Y_{raw} - n(t)1 - V_{off}1)}{A^2 \left(1 + \frac{C_o}{A} 1^T T^{-1} 1\right)}$$
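The inversion-based correction derived above can be checked numerically. The sketch below builds the decay Toeplitz matrix, forms the rank-one-updated inverse, and confirms that it undoes a simulated parasitic distortion. It is an illustration under assumptions, not the calibration code used in this work: the parameter values (`AMP`, `LAM`, `C_O`, `NOISE`) and function names are hypothetical, the noise is treated as a known constant rather than $n(t)$, and `V_OFF` is simply picked from the measured 27-36 mV offset range.

```python
import numpy as np

def decay_toeplitz(n, lam):
    """T[i, j] = r**|i - j| with r = exp(-lam)."""
    idx = np.arange(n)
    return np.exp(-lam * np.abs(idx[:, None] - idx[None, :]))

def parasitic_inverse(n, amp, lam, c_o):
    """Inverse of amp*T + c_o*1*1^T via the rank-one (Sherman-Morrison) update."""
    t_inv = np.linalg.inv(decay_toeplitz(n, lam))
    u = t_inv @ np.ones(n)                       # T^{-1} 1
    denom = 1.0 + (c_o / amp) * u.sum()          # 1 + (C_o/A) 1^T T^{-1} 1
    # T is symmetric, so T^{-1} 1 1^T T^{-1} equals outer(u, u).
    return t_inv / amp - (c_o / amp**2) * np.outer(u, u) / denom

def correct_output(y_raw, c_inv, noise, v_off):
    """Y_corrected = C^{-1} (Y_raw - n*1 - V_off*1)."""
    return c_inv @ (np.asarray(y_raw, dtype=float) - noise - v_off)

N, AMP, LAM, C_O = 32, 1.0, 1.2, 0.01            # hypothetical crosstalk parameters
V_OFF, NOISE = 0.030, 5e-6                       # offset in volts, constant noise stand-in
c_par = AMP * decay_toeplitz(N, LAM) + C_O * np.ones((N, N))
c_inv = parasitic_inverse(N, AMP, LAM, C_O)

rng = np.random.default_rng(0)
y_true = rng.uniform(0.0, 3.2, N)                # ideal column outputs
y_raw = c_par @ y_true + NOISE + V_OFF           # simulated distorted readout
y_corr = correct_output(y_raw, c_inv, NOISE, V_OFF)

r = np.exp(-LAM)
t_inv = np.linalg.inv(decay_toeplitz(N, LAM))    # used to inspect the tridiagonal entries
```

In this sketch $T^{-1}$ is obtained with a general-purpose inverse for clarity; on hardware-constrained targets the tridiagonal closed form would avoid the dense inversion entirely.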

This is verified via a reproducibility test that attempts to reproduce an image (in this case the logo of the IRIS National Fair). The image is first deconstructed into a vector of colours, and an arbitrary voltage-to-colour map is defined, with 0 V being black and 3.2 V being white.

The board attempts to reach each element of the image over several iterations, reconstructing the vector into an image. This can be seen in Figure 21, which shows reconstruction attempts without inversion-based calibration, and Figure 22, which shows the same attempts with inversion-based calibration.



Figure 21. Attempts at reconstruction without inversion-based calibration



Figure 22. Attempts at reconstruction with inversion-based calibration

Finally, the parameters of the photonic interconnect were characterised using the models proposed by Nahmias [11], Sun [15], Shen [13], and Miller [16].

The optical SNR was evaluated using the previously stated SNR equation and plotted against optical power, as shown in Figure 23.



Figure 23. Optical SNR to Optical Power

The data from Figure 23 allowed the computation of phase contrast and jitter using the relation

$$\langle \cos \phi \rangle \approx e^{-\sigma_\phi^2/2}$$

This is plotted with respect to contrast in Figure 24.



Figure 24. Phase contrast to Jitter
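The contrast relation used for Figure 24 holds exactly when the phase jitter is zero-mean and Gaussian, and it can be inverted to recover the jitter from a measured contrast. The sketch below shows both directions; the function names and the 0.3 rad example value are hypothetical.

```python
import math

def contrast_from_jitter(sigma_phi):
    """<cos(phi)> = exp(-sigma_phi^2 / 2) for zero-mean Gaussian phase jitter."""
    return math.exp(-sigma_phi ** 2 / 2.0)

def jitter_from_contrast(contrast):
    """Invert the relation: sigma_phi = sqrt(-2 * ln(contrast))."""
    if not 0.0 < contrast <= 1.0:
        raise ValueError("contrast must be within (0, 1]")
    return math.sqrt(-2.0 * math.log(contrast))

c_jit = contrast_from_jitter(0.3)        # hypothetical RMS jitter of 0.3 rad
sigma_back = jitter_from_contrast(c_jit) # recovers the same jitter
```

Because the dependence on $\sigma_\phi$ is quadratic in the exponent, small jitter costs very little contrast, while contrast falls off quickly once the jitter approaches one radian.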

Similarly, relating the phase and group velocity to skew gives an inferred inverse relationship, as seen in Figure 25.



Figure 25. Phase & Group Velocity to Skew

The coupling efficiency was then evaluated by curve fitting to the model:

$$P_{rx} = \eta P_{tx} e^{-\alpha L}$$

This was evaluated over the link length of 80 mm, with the resulting 3D efficiency surface shown in Figure 26. This linear scaling also means that future extensions scale at a power expense modelled by $O(n)$ instead of $O(n^k)$, allowing for sustainable scaling.



Figure 26. Coupling efficiency to Loss and link length

## **Conclusions**

This work showcases a fully integrated 32×32 resistive crossbar with complete peripheral circuitry enabling analog matrix multiplication. Through controlled PWM-based duty voltages, oscilloscope-based calibration and capture, and column-based current measurement, the system allows the quantification of row-driver linearity, static offset, noise floors, TIA settling behaviour, conductance variation, crosstalk decay, and tensors for dynamic-range evaluation. The measurements give a maximum total voltage error of 70 mV against a net RMS of 4.2 V, corresponding to a maximum error of about 1.7% (70 mV / 4.2 V) and an average error closer to 0.9%.

Beyond characterizing the hardware, this system experimentally verifies several behaviours that have been predicted but never confirmed in prior research. The row-voltage quadratic distortion directly validates the nonlinear DAC-to-voltage mapping assumed in Li et al. [4], while the measured current relation confirms the linear mixed-signal model derived in Di Ventra et al. [1]. The extracted conductance matrix and its spatial variability match the device-level variability patterns reported in Yang et al. [18], and the exponential crosstalk decay matches the leakage-coupling model referenced in Li et al. [2]. The full partial-derivative tensor, its inverted-triangle sensitivity pattern, and its spatial gradients reproduce characteristics predicted in Kvatinsky et al. [3] and the analog-error propagation behaviour described in Ambrogio et al. [6]. Taken together, this system confirms theoretical principles from more than seven major papers, all within a single integrated experimental platform.

## **References**

- [1] M. Di Ventra, Y. V. Pershin, and L. O. Chua, “Memristive Linear Algebra,” IEEE Trans. Circuits Syst. I, vol. 57, no. 8, pp. 2051–2060, 2010.

- [2] C. Li et al., “Precise and Scalable Analogue Matrix Equation Solving Using Resistive Random Access Memory Chips,” *Nature Communications*, vol. 10, no. 1, p. 4904, 2019.
- [3] S. Kvatinsky et al., “Hardware Implementation of Memristor-Based Artificial Neural Networks,” *IEEE Trans. Circuits Syst. I*, vol. 61, no. 6, pp. 1734–1745, 2014.
- [4] C. Li et al., “Analogue signal and image processing with large memristor crossbars,” *Nature Electronics*, vol. 1, pp. 52–59, 2018.
- [5] S. Yu, “Neuro-inspired computing with emerging non-volatile memory,” *Proceedings of the IEEE*, vol. 106, no. 2, pp. 260–285, 2018.
- [6] S. Ambrogio et al., “Equivalent-accuracy accelerated neural-network training with analogue-memory-based computing,” *Nature*, vol. 558, pp. 60–64, 2018.
- [7] A. Shafiee et al., “ISAAC: A convolutional neural network accelerator with in-situ analog arithmetic in crossbars,” in *Proc. ISCA*, 2016, pp. 14–26.
- [8] P. R. Gray et al., *Analysis and Design of Analog Integrated Circuits*, 5th ed. Wiley, 2009.
- [9] B. Razavi, *Design of Analog CMOS Integrated Circuits*, 2nd ed. McGraw-Hill, 2016.
- [10] R. Schreier and G. C. Temes, *Understanding Delta-Sigma Data Converters*, 2nd ed. Wiley-IEEE Press, 2005.
- [11] M. A. Nahmias et al., “Neuromorphic Photonic Networks Using Silicon Photonic Weight Banks,” *Optica*, vol. 7, no. 9, pp. 1159–1166, 2020.
- [12] J. Feldmann et al., “Silicon Photonic Integrated Circuits With Electrically Programmable Non-Volatile Memory Functions,” *Nature Communications*, vol. 11, no. 1, p. 1291, 2020.

- [13] Y. Shen et al., "High-Speed Photonic Neuromorphic Computing Using Recurrent Optical Spectrum Slicing Neural Networks," *Nature Photonics*, vol. 14, pp. 451–456, 2020.
- [14] M. Miscuglio et al., "Reconfigurable All-Optical Nonlinear Activation Functions for Neuromorphic Photonics," *Optics Express*, vol. 28, no. 10, pp. 14817–14827, 2020.
- [15] C. Sun et al., "Single-chip microprocessor that communicates directly using light," *Nature*, vol. 528, pp. 534–538, 2015.
- [16] D. A. B. Miller, "Silicon photonics: Meshing optics with applications," *Nature Photonics*, vol. 11, pp. 322–330, 2017.
- [17] G. Wetzstein et al., "Inference in Analog Crossbar Arrays," *IEEE JETCAS*, vol. 10, no. 1, pp. 4–19, 2020.
- [18] J. J. Yang et al., "Crossbar Device-Level Variability Studies," *IEEE Trans. Electron Devices*, vol. 57, no. 10, pp. 2564–2570, 2010.
- [19] M. Haurylau et al., "Electrical-Photonic Co-packaged Interconnect Architecture Papers," *IEEE J. Sel. Top. Quantum Electron.*, vol. 12, no. 5, pp. 1037–1046, 2006.
- [20] J. P. Gordon and L. F. Mollenauer, "Optical Communications and Coherence Noise Sources," *Optical Fiber Telecommunications*, vol. IV, pp. 1–50, 2002.
- [21] W. A. Wulf and S. A. McKee, "Hitting the Memory Wall," *ACM SIGARCH Computer Architecture News*, 1995.
- [22] M. Horowitz, "1.1 Computing's Energy Problem," *ISSCC*, 2014.
- [23] C. Mead, *Analog VLSI and Neural Systems*, Addison-Wesley, 1989.
- [24] B. Goswami and M. Suri, "Single Cycle XOR (SCXOR) and Stateful n-bit Parallel Adder Implementation...," in *Proc. NANOARCH '22*, 2022.

- [25] F. M. Bayat et al., “Implementation of multilayer perceptron network with highly uniform passive memristive crossbar circuits,” *Nature Communications*, vol. 9, 2331, 2018.
- [26] G. C. Adam et al., “3-D Memristor Crossbars for Analog and Neuromorphic Computing Applications,” *IEEE Trans. Electron Devices*, vol. 64, no. 1, 2017.
- [27] S. Satyam et al., “Energy-Efficient Implementation of Generative Adversarial Networks on Passive RRAM Crossbar Arrays,” *arXiv*, 2021.
- [28] H. Nikam et al., “Long Short-Term Memory Implementation Exploiting Passive RRAM Crossbar Array,” *IEEE Trans. Electron Devices*, vol. 69, no. 4, 2022.
- [29] V. B. Desai et al., *Neuromorphic. Comput. Eng.*, 2, 024006, 2022.
- [30] D. Kaushik et al., *Nanotechnology*, 31, 364004, 2020.