Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Too loose ASSERT_NEAR and wrong test values in compute_potential_scale_reduction_test.cpp #3273

Open
aleksgorica opened this issue Feb 26, 2024 · 3 comments

Comments

@aleksgorica
Copy link
Collaborator

Description:

I have tried reducing the accepted error in ASSERT_NEAR from 1 to 1e-4, but then the tests failed for ComputeRhat.compute_potential_scale_reduction and ComputeRhat.compute_potential_scale_reduction_convenience. Upon inspection, the test values are the same as for compute_split_potential_scale_reduction and compute_split_potential_scale_reduction_convenience, for which the corresponding functions output different values.

Current Version:

v2.34.1

@bob-carpenter
Copy link
Contributor

Any suggestion on how we get values with which to test? Are there tests in the R or Python packages that compute these quantities?

@aleksgorica
Copy link
Collaborator Author

I used Arviz to generate test values for split-rank, no-split-rank, split-no-rank, no-split-no-rank.
Here is the code I used:

iimport arviz
import numpy as np
from arviz.stats.diagnostics import _rhat, _backtransform_ranks, _z_scale, _split_chains, _rhat_rank
from arviz.stats.stats_utils import wrap_xarray_ufunc

posterior_glob = "blocker.*.csv"
cmdstan_data = arviz.from_cmdstan(posterior=posterior_glob)



def _rhat_rank_no_split(ary):
 
    ary = np.asarray(ary)
    rhat_bulk = _rhat(_z_scale(ary))

    split_ary_folded = abs(ary - np.median(ary))
    rhat_tail = _rhat(_z_scale(split_ary_folded))

    rhat_rank = max(rhat_bulk, rhat_tail)
    return rhat_rank

rhat_no_split_rank = wrap_xarray_ufunc(
        _rhat_rank_no_split,
        cmdstan_data.posterior,
        ufunc_kwargs={"ravel": False},
        func_kwargs={},
        dask_kwargs=None
    )
rhat_split_rank = arviz.rhat(cmdstan_data.posterior)
rhat_split_no_rank = arviz.rhat(cmdstan_data.posterior, method="split")
rhat_no_split_no_rank = arviz.rhat(cmdstan_data.posterior, method="identity")

def print_rhat(rhat_results):
    rhat_values = {var_name: rhat_results[var_name].values for var_name in rhat_results.data_vars}
    all_rhat_values = np.concatenate([values.ravel() for values in rhat_values.values()])
    print(",".join([str(el) for el in np.round(all_rhat_values, decimals=5)]))
print_rhat(rhat_split_rank)
print_rhat(rhat_split_no_rank)
print_rhat(rhat_no_split_no_rank)
print_rhat(rhat_no_split_rank)

@bob-carpenter
Copy link
Contributor

Great. I'm curious to see whether tests pass. Getting the details on this kind of thing exactly matching can be painful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants