Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Velocity estimation by piecewise linear regression #569

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from
Draft

Conversation

rpauszek
Copy link
Contributor

@rpauszek rpauszek commented Sep 1, 2023

work in progress

Comment on lines +174 to +186
diff_term = x - breakpoints[:, np.newaxis]
heavi = np.vstack([np.heaviside(d, -1) for d in diff_term])
u = diff_term * heavi
v = -heavi
design_matrix = np.vstack((np.ones(x.size), x, u, v)).T

xtx = np.linalg.pinv(np.matmul(design_matrix.T, design_matrix))
coeffs = np.dot(
xtx,
np.dot(design_matrix.T, y),
)

fit = np.dot(design_matrix, coeffs)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if moving the linear fit part and the parameter remapping inside PiecewiseModel would make sense. Since it's a linear fit, the fit is fully determined from just the breakpoints and the data (which could be its constructor arguments).

The benefit would be that you could add a simple .plot() function to that model to see it (since you'd have all the pieces there to plot it already).

Then you have a self-contained PiecewiseModel model with everything needed to fit (which can just happen on construction making it immutable) and assess its quality for a given set of breakpoints in a single dataclass.

bic, prediction, rss, breaks_in_range and external parameters could be properties.

The only thing that seems like it wouldn't fit quite as nicely is the exitflag; but that feels a bit like a property of the overarching algorithm rather than the PiecewiseLinearModel itself.

from a uniform distribution. Ignored if `n_breakpoints == 0`.
"""
n_samples = len(x)
n_coeffs = 2 + 2 * n_breakpoints
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do wonder if BIC treating the breakpoint as just another parameter is sufficient penalty. In a sense, it is a very different type of parameter. See also the answer on this for example: https://stats.stackexchange.com/questions/337852/aic-bic-for-a-segmented-regression-model

When you run this with many simulations, how consistently does the BIC selection procedure recover the correct model?

v = -heavi
design_matrix = np.vstack((np.ones(x.size), x, u, v)).T

xtx = np.linalg.pinv(np.matmul(design_matrix.T, design_matrix))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a shorthand for matmul.

Suggested change
xtx = np.linalg.pinv(np.matmul(design_matrix.T, design_matrix))
xtx = np.linalg.pinv(design_matrix.T @ design_matrix)

slope_terms = np.hstack((alpha, beta))
slope_block = cov[1 : n_terms + 1, 1 : n_terms + 1]
slopes = np.array([np.sum(slope_terms[:j]) for j in range(1, n_terms + 1)])
slopes_std = np.array([np.sqrt(np.sum(slope_block[:j, :j])) for j in range(1, n_terms + 1)])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How close are these estimates to say a bootstrap with different realizations?

break

# update breakpoints and check in bounds
breakpoints = gamma / beta + breakpoints
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why we're not just doing:

Suggested change
breakpoints = gamma / beta + breakpoints
breakpoints += gamma / beta

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants