Skip to content

fit.lda

gadorlhiac edited this page Jan 16, 2017 · 2 revisions

__init__(data)

Reads the working data parameters (wavelengths, time point indices, irf values) from the data object using updateData(). 100 taus are set up by default on a log scale from 1e-1 to 1e4 using numpy's logspace. The values of rho in the event that the elastic net regularization is used are set up from 0.1 to 0.9. Regularization type (stored in self.reg) is set by default to the string "L2", referring to Tikhonov regularization.

run_LDA(GA_taus=None)

Runs MATRIX regularization. Uses the routines corresponding to Tikhonov, LASSO, or elastic net depending on the value of self.reg. Plots the LDM using _plot_LDM(GA_taus). The GA_taus can be a list of lifetimes calculated from a global analysis which will be plotted as horizontal lines on the LDM.

replot(GA_taus=None, num_c=10)

Will close the current plot of the LDM and replot it using _plot_LDM and a new number of contour levels corresponding to num_c.

updateData(data)

Retrieves the working dataset, wavelengths, time points and IRF parameters from the data object storing these in object variables. Then runs genD() to create the matrix of exponential decays.

updateParams(taus, alphas, reg, L, simfit)

Updates parameters corresponding to regularization, and fitting routines. Taus is a list of lifetimes. Alphas, is a list of alpha values used for the matrix regularization routines. reg is a string taking the values of "L2" for Tikhonov regularization, "L1" for LASSO regularization, and "elnet" for elastic net regularization. L is the regularization matrix, which by default (during initialization) is an identity matrix, whose size is given by the number of lifetimes. simfit is a boolean, which if true means that fit statistics are calculated for all wavelengths simultaneously. (Note: Do not set to false at the moment or fit statistics will not be displayed.)

genD()

Constructs the matrix of exponential decays. If all IRF parameters are set to 0, the decays are constructed without IRF convolution. Otherwise the IRF parameters are used with the explicit analytical expression for the convolution, not using a convolution routine.

_L2()

Runs Tikhonov regularization for each alpha independently. The resultant LDMs are stored in a 3D array self.x_opts. The Tikhonov regularized solution at a particular alpha value is found using the _solve_L2(alpha) routine. The Hat or Projection matrix, along with a similar S matrix for the Cp statistic are also calculated by means of an SVD, through the routine _calc_H_and_S(). These matrices are used in the calculation of the GCV and Cp parameter at each alpha value. Both the GCVs and Cps are are stored in a 1D array (if simfit is true). If simfit is false they are stored in a 2D array whose dimensions are the [len(wls), len(alphas)]. The GCVs and Cps arrays are returned by the function.

_solve_L2(alpha)

An augmented decay matrix is constructed using the alpha value and regularization matrix L. The resultant ordinary least squares problem is solved by using the SVD of the augmented matrix. As described in the paper.

_calc_H_and_S(alpha)

Calculates and returns the H and S matrices used for the GCV and Cp statistics respectively. The SVD of (D^T)D + alpha*(L^T)L is used for the construction of these matrices.

_calc_GCV(alpha, H)

Using the hat matrix H, alpha value, and the routine _calc_res(), calculates and returns the GCV value. This is either a single value if simfit is true or an array of values corresponding to each wavelength if simfit is false.

_calc_Cp(alpha, S, wl=None)

Calculates and returns the Cp value, using the trace of S as the measure of degrees of freedom and the error variance of the lowest alpha solution as an estimate for the true error variance.

_lcurve()

Calculates and returns lcurve_x, lcurve_y, and the curvature k. lcurve_x is the residual between the fit and data. lcurve_y is the L2 smoothing norm. The curvature k is the curvature of the lcurve_y vs lcurve_x curve. As with GCVs and Cps, this will return 2D arrays corresponding to curves for each wavelength if simfit is set to False.

_calc_k(lx, ly)

Calculates and returns the curvature k for the curve ly vs lx.

_calc_res(alpha, wl=None)

Calculates the residual between the fitted solution at a given alpha value. If wl=None, this is done for the entire map; if wl=True, this residual is calculated wavelength by wavelength and returned as an array.

_calc_smoothNorm(alpha, wl=None)

Calculates the smoothing norm at a given alpha value. If wl=None, this is done for the entire map; if wl=True, this residual is calculated wavelength by wavelength and returned as an array.

_L1()

Runs the LASSO regularization using the orthogonalization of design matrix algorithm as described in the paper and implemented in the routine _L1_min(). This is performed for each alpha value. The Cps values are also calculated and are returned by the function.

_L1_min(D, A, alpha)

The implementation of the design matrix orthogonalization algorithm. D corresponds to the matrix of decays, or augmented matrix in the case of elastic net regularization. A is the data (or augmented data), and alpha is the regularization parameter.

_calc_L1_Cp(alpha, wl=None)

Calculates the Cp value for the LASSO regularized solutions. (Note: Approximate solution, will be fixed shortly. Do not set wl to True) Uses the error variance of the lowest alpha value solution as the estimate for the true error variance.

_l1curve()

Calculates and returns the l1x, l1y and the curvature k in analogy to the lcurve() for Tikhonov. Uses the same _calc_res() and _calc_k() routines. There is a separate _calc_L1Norm() for calculation of the L1 smoothing norm. Again, will return 1D arrays if simfit is set to True, and 2D arrays if it is set to False.

_elnet()

Runs the elastic net regularization. First constructs augmented matrices to remove the L2 penalties. Then for each alpha and rho value it runs _L1_min() to calculate the LASSO solutions for the augmented decay and data matrices.

run_tsvd(k, t1, t2, nt, GA_taus)

Runs a truncated SVD regularization, via _tsvd(k), to solve for the LDM. k is the parameter for design matrix truncation. t1, t2, and nt correspond to the lower bound, upper bound and number of lifetimes tau. The lifetimes are calculated using numpy's logspace(t1, t2, nt). GA_taus can be a list of lifetimes which will be displayed as horizontal dashed lines on the output LDM. After the LDM is calculated is plotted and displayed.

_tsvd(k)

Calculates and returns the LDM. Uses _tsvdInv(k) to calculate the truncated SVD inverse.

_tsvdInv(k)

Uses the SVD and k to calculate the truncated SVD inverse.

_picard()

Pass

display()

Pass

_plot_lcurve(lx, ly, k)

Plots a two subplot graph of the L-curve where ly are the y-values (Smoothing Norm) and lx are the x-values (Residuals). k is the array of curvatures. The plot on the left is the L-curve with the point of maximum curvature labelled and annotated with the corresponding alpha value. The plot on the right is the plot of the curvatures, again labelled and annotated.

_plot_GCV_Cp(self, Cps, GCVs=None)

Plots either the Cp values as a single plot, or both the GCVs and Cps as two subplots if GCVs are passed as a parameter. The minima of the graphs are labelled, and the points are annotated with the corresponding alpha value.

_plot_LDM(GA_taus=None, num_c=10)

Plots a two subplot graph. On the left is the LDM and on the right is a single vertical slice of the LDM at a particular wavelength. The LDM corresponding to different alpha values can be selected using a slider bar, as can the wavelength slice displayed on the right subplot. num_c is a parameter determining the number of contour levels above and below zero. The maximum and minimum contour level is determined by the maximum of the absolute values of the LDM. GA_taus is an array of lifetimes which are plotted horizontally on the LDM if provided.