Fixing some error metrics in profiling and network simulations #53

Open
1 of 3 tasks
mnarayan opened this issue Sep 30, 2016 · 4 comments

@mnarayan
Member

mnarayan commented Sep 30, 2016

  • Frobenius norm sometimes counts diagonal info (like in AveragePower). Make sure this is changed across the package; see the sketch after this list.
error_fro = np.linalg.norm(np.triu(adj - new_estimator.precision_, k=1), ord='fro')
  • Numerical issues in model selection error with banded networks.
    Why do these give inconsistent numbers, particularly exact_support versus count_support_diff?
print('Exact Recovery (T/F): {}, (TPR,FPR): {}, Count Diff: {}'.format(
    exact_support(adjacency, prec_hat),
    approx_support(adjacency, prec_hat, prob=0.1)[1:2],
    count_support_diff(adjacency, prec_hat),
))
  • Issues in approx_support(). To be fixed by re-using other functions from AveragePower.
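
For the first item, a minimal sketch of keeping the diagonal out of the error, assuming symmetric matrices so the strictly upper triangle captures each off-diagonal entry once (off_diagonal_frobenius_error is a hypothetical helper, not an existing skggm function):

import numpy as np

def off_diagonal_frobenius_error(adj, prec_hat):
    # zero out the diagonal and lower triangle, then take the Frobenius norm,
    # so diagonal entries never contribute to the error (hypothetical helper)
    diff = np.triu(adj - prec_hat, k=1)
    return np.linalg.norm(diff, ord='fro')

# usage with the names from the snippet above
# error_fro = off_diagonal_frobenius_error(adj, new_estimator.precision_)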
@mnarayan mnarayan added the bug label Sep 30, 2016
mnarayan added a commit that referenced this issue Sep 30, 2016
…pport seems reasonable. Sample sizes changed for AveragePower.
@jasonlaska
Member

jasonlaska commented Dec 11, 2016

Numerical issues with exact_support and approx_support persist in the .profiling submodule. It's odd because I have tests on these metrics in inverse_covariance/profiling/metrics and they pass.

https://github.com/skggm/skggm/blob/develop/inverse_covariance/profiling/tests/metrics_test.py#L187-L227

@mnarayan
Member Author

@jasonlaska Does this mean we still have to do some artificial thresholding? I'll take a look at this again too.

@jasonlaska
Member

As you play around with the tools, please advise and I'll integrate fixes in.

@mnarayan
Member Author

So I have at least one potential clue regarding numerical issues. They might be related to using 3-fold CV.

Using cv=(3,10) gives the following support

array([  1.04618665e+00,   0.00000000e+00,   0.00000000e+00,
         0.00000000e+00,   0.00000000e+00,   8.41176079e-04,
         0.00000000e+00,   0.00000000e+00,   0.00000000e+00,
        -2.19816011e-01])

Whereas using cv=(2,10) gives the following

array([ 1.04241831,  0.        ,  0.        ,  0.        ,  0.        ,
        0.        ,  0.        ,  0.        ,  0.        , -0.21027987])

This makes sense: in most regimes for covariance estimation, 3-fold CV leaves too little data for model re-fitting on the hold-out folds, which creates more false-positive selections. The spurious values are tiny, however. We could consider changing the default fold number to 2, or adding an example to show that, unlike prediction problems, we may be more sensitive to the choice of fold number. The sketch below shows how a small tolerance on these tiny entries changes the recovered support.
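
For concreteness, here is a minimal sketch of that sensitivity, reusing the two estimates above (support_mask is a made-up helper, not the package's exact_support/approx_support):

import numpy as np

def support_mask(prec, tol=0.0):
    # entries with absolute value at or below tol are treated as zero
    return np.abs(prec) > tol

cv3 = np.array([1.04618665e+00, 0, 0, 0, 0, 8.41176079e-04, 0, 0, 0, -2.19816011e-01])
cv2 = np.array([1.04241831, 0, 0, 0, 0, 0, 0, 0, 0, -0.21027987])

print(np.array_equal(support_mask(cv3), support_mask(cv2)))            # False: the 8.4e-04 entry counts as support
print(np.array_equal(support_mask(cv3, tol=1e-3), support_mask(cv2)))  # True once tiny entries are thresholded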

@mnarayan mnarayan added this to the Public Version 0.3 milestone Jul 10, 2017