Improve weights docstrings #757

nalimilan · 2022-01-18T21:36:21Z

`AnalyticWeights` have a precise definition, on which we rely in several functions. Also make docstrings more consistent across types.

ParadaCarleton · 2022-01-18T22:13:46Z

I think we should make an important note for both `AnalyticWeights` and `FrequencyWeights` that the scale of the weights matters, i.e. `f(x, weights)` will usually not equal `f(x, 2 .* weights)`.

It would also help to be very specific about the following, to avoid people making the mistake of trying to use normalized weights:
- For `AnalyticWeights`, `w[i]` must be equal to `1/var(x[i])`, and `var(x[i])` must be known ahead of time.
- For `FrequencyWeights`, `w[i]` must be equal to the number of observations for `x[i]`.

I believe the docstring for ProbabilityWeights is already clear enough, because probability weights are scale-invariant. (For now; in theory we might want to make a distinction between self-normalized and unnormalized weights in the future, because unnormalized weights can give unbiased estimators, but ATM we only use self-normalized weights).
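
To make the scale point concrete, here is a minimal sketch in plain Python (it does not call StatsBase). It assumes the usual textbook formulas: the self-normalized weighted mean divides by `sum(w)`, and the bias-corrected frequency-weighted variance divides by `sum(w) - 1`. The helper names `weighted_mean` and `fweights_var` are hypothetical and exist only for this illustration.

```python
from statistics import variance


def weighted_mean(x, w):
    """Self-normalized weighted mean: sum(w * x) / sum(w)."""
    return sum(wi * xi for wi, xi in zip(w, x)) / sum(w)


def fweights_var(x, w):
    """Bias-corrected variance, treating w[i] as the number of times x[i] was observed."""
    m = weighted_mean(x, w)
    ss = sum(wi * (xi - m) ** 2 for wi, xi in zip(w, x))
    return ss / (sum(w) - 1)


x = [1.0, 2.0, 4.0]
w = [2, 1, 1]                 # x[0] was observed twice
w2 = [2 * wi for wi in w]     # same relative weights, doubled scale

# The weighted mean is scale-invariant: the factor 2 cancels.
print(weighted_mean(x, w), weighted_mean(x, w2))

# The frequency-weighted variance is not: the "- 1" correction does not scale.
print(fweights_var(x, w), fweights_var(x, w2))

# With true counts, the result matches the variance of the expanded sample.
expanded = [1.0, 1.0, 2.0, 4.0]
print(variance(expanded))
```

Doubling the weights tells the estimator that there are twice as many observations, so the corrected variance shrinks even though the data are the same. This is why normalized weights give misleading results when passed as frequency weights.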

rofinn

LGTM. We already mentioned inverse variance weights, so maybe rewording that docstring to be less redundant would be good?

rofinn · 2022-01-19T18:50:57Z

src/weights.jl

-Analytic weights describe a non-random relative importance (usually between 0 and 1)
-for each observation. These weights may also be referred to as reliability weights,
+Analytic weights represent the inverse of the variance for each case.
+These weights may also be referred to as reliability weights,
 precision weights or inverse variance weights. These are typically used when the observations


Suggested change

-precision weights or inverse variance weights. These are typically used when the observations
+precision weights or regression weights. These are typically used when the observations

nalimilan · 2022-01-20

Are you sure the expression "regression weights" is commonly used? I'm afraid it could confuse users, as you can use any kind of weights in regression (e.g. Stata supports all three types).

rofinn · 2022-01-20

I think SAS uses that term for them, but I don't know how common it is elsewhere. I'm also fine with just removing the redundant "inverse variance" and only listing two other common names (e.g., reliability weights, precision weights).

https://blogs.sas.com/content/iml/2017/10/02/weight-variables-in-statistics-sas.html

nalimilan · 2022-01-23

OK, let's do that.

nalimilan · 2022-01-20T08:56:32Z

> I think we should make an important note for both AnalyticWeights and FrequencyWeights that the scale of the weights matters, i.e. f(x, weights) will usually not equal f(x, 2 .* weights).
>
> It would also help to be very specific about the following, to avoid people making the mistake of trying to use normalized weights: For AnalyticWeights, w[i] must be equal to 1/var(x[i]), and var(x[i]) must be known ahead of time. For FrequencyWeights, w[i] must be equal to the number of observations for x[i].

I've added mentions regarding scale-invariance of frequency and probability weights. For analytic weights I'd rather wait until #758 is settled as we may want to adjust the definition a bit (maybe in the next breaking release). AFAICT currently they are scale-invariant, right?

nalimilan · 2022-01-23T21:45:54Z

I've updated the description of analytic weights in the light of #758. Does that sound correct?

ParadaCarleton · 2022-01-24T01:54:54Z

> I've updated the description of analytic weights in the light of #758. Does that sound correct?

I think we should hold off until #758 is resolved.

ParadaCarleton · 2022-01-27T02:20:55Z

Oh, brief note that I think could be useful for users -- currently, all our methods for ProbabilityWeights normalize the weights before calculating an estimator. This is probably the best default, but sometimes it's useful to use the Hansen-Hurwitz estimators for means, variances, etc.; these estimators use the unnormalized weights, which makes them unbiased (but usually results in higher variance). The behavior of these estimators can be replicated by setting sum=1, in which case the weights won't be normalized.
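
As a rough illustration of the difference described above, the sketch below compares a self-normalized weighted mean with a Hansen-Hurwitz style mean in plain Python. It is not StatsBase code. It assumes sampling with replacement with known per-draw selection probabilities `p[i]` and a known population size; the function names, the sample values and the population size are all hypothetical.

```python
def self_normalized_mean(y, w):
    """Ratio estimator: divides by the sum of the weights actually drawn."""
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)


def hansen_hurwitz_mean(y, p, pop_size):
    """Hansen-Hurwitz style mean: estimated total (1/n) * sum(y / p), divided by the known N."""
    n = len(y)
    total = sum(yi / pi for yi, pi in zip(y, p)) / n
    return total / pop_size


# Hypothetical sample: 3 draws from a population of 10 units.
pop_size = 10
y = [3.0, 5.0, 8.0]
p = [0.05, 0.1, 0.2]                      # per-draw selection probabilities

# Expansion weights; their sum estimates pop_size but need not equal it.
w = [1.0 / (len(y) * pi) for pi in p]

print(sum(w))                             # estimate of the population size
print(self_normalized_mean(y, w))         # divides by sum(w)
print(hansen_hurwitz_mean(y, p, pop_size))  # divides by the known pop_size
```

The two estimators agree only when the drawn weights happen to sum to the population size. Dividing by the fixed, known size keeps the estimator unbiased, while dividing by the realized weight sum usually reduces variance at the cost of a small bias.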

nalimilan · 2022-03-20T16:28:44Z

@ParadaCarleton Don't you think that this PR is a strict improvement over the current situation, even if we decide to split AnalyticWeights into several types?

ParadaCarleton · 2022-03-21T02:26:44Z

I think it's an improvement, yeah, but:

- I'd clarify that the weights refer specifically to sample sizes for each observation.
- I'd add a warning about `std` doing something different from what you'd expect.

Improve weigths docstrings

c03d404

nalimilan requested a review from rofinn January 18, 2022 21:36

nalimilan mentioned this pull request Jan 18, 2022

Weighted sem #754

Merged

rofinn approved these changes Jan 19, 2022

Add mentions about scale-invariance

538399f

Update weights.jl

0350e4e
