Feature request: goodness-of-fit tests for copulas #376

Open
extabl opened this issue Oct 7, 2018 · 4 comments
Comments

@extabl

extabl commented Oct 7, 2018

This would be a very nice feature to have. I'm thinking of something like what is in the R copula package (or in the gofCopula package), but with much better performance, capable of handling inputs of 500k+ observations.

By the way, great work, guys!

@tnagler
Collaborator

tnagler commented Oct 7, 2018

Not a bad idea considering their popularity (although I personally don't like GoF-tests).

In any case, before we do this, we will first need derivatives in the library, which is a larger project. We have considered adding derivatives from the very beginning, but had other priorities and couldn't come up with a satisfying proof of concept. I'm quite sure this feature will come at some point and I opened an issue for further discussion (#377). However, I don't think this is going to happen soon.

@extabl
Author

extabl commented Oct 7, 2018

Yes, sure. First things first, there's already quite a lot to come in this project. Keep going!

However, it makes me wonder why you don't like GoF-tests. I've always thought that they are indispensable in the statistical analysis of copulas.

@tnagler
Collaborator

tnagler commented Oct 7, 2018

This has a somewhat philosophical reason. First of all, null-hypothesis significance testing (NHST) has many flaws, as summarized here or here. There are many recent papers on this subject; the American Statistical Association even issued a statement. I am not as strongly opposed as many of the prominent critics and use significance tests myself every now and then.

Regarding GoF-tests specifically, I think they are widely misused to solve problems they weren't designed for (especially model selection).

There is a valid use of GoF-tests though: If you really have the hypothesis that your parametric model is the one and only truth and you want to test for that. However, I have never seen GoF-tests used like this in the copula world and it's a rather odd hypothesis to have. Essentially, we already know that any parametric hypothesis is false. Then the GoF-test only tells you whether the sample size was large enough to detect this. I believe that this is rarely what people want to know.
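The sample-size point can be illustrated with a minimal sketch. It is a univariate stand-in, not a copula GoF test and not anything from this library: data are drawn from a normal distribution whose mean is slightly off (the `shift` of 0.05 is an arbitrary assumption) and tested against a standard normal with a hand-rolled Kolmogorov-Smirnov statistic. The helper names are hypothetical, and the 5% critical value uses the usual asymptotic approximation 1.358 / sqrt(n).

```python
import math
import random
from statistics import NormalDist


def ks_statistic(sample, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


def ks_critical_value(n, c_alpha=1.358):
    """Asymptotic 5% critical value for a fully specified null distribution."""
    return c_alpha / math.sqrt(n)


def run_experiment(sizes=(100, 1_000, 10_000, 100_000), shift=0.05, seed=0):
    """Test slightly misspecified data against N(0, 1) at growing sample sizes."""
    rng = random.Random(seed)
    null_cdf = NormalDist(0.0, 1.0).cdf  # the "parametric hypothesis"
    results = []
    for n in sizes:
        # True model is N(shift, 1), so the null is false for every n.
        sample = [rng.gauss(shift, 1.0) for _ in range(n)]
        d = ks_statistic(sample, null_cdf)
        crit = ks_critical_value(n)
        results.append((n, d, crit, d > crit))
    return results


if __name__ == "__main__":
    for n, d, crit, reject in run_experiment():
        print(f"n={n:>7}  D={d:.4f}  critical={crit:.4f}  reject={reject}")
```

The null hypothesis is equally wrong at every sample size here. The critical value shrinks like 1 / sqrt(n) while the statistic settles near the true (small) distance between the two distributions, so whether the test rejects is mostly a statement about n.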

Usually, the end goal is entirely different, like prediction or simulation. Predictions should be as accurate as possible and simulations as realistic as possible. For these goals it's pretty much irrelevant whether the selected copula is the one and only truth or not.

@extabl
Author

extabl commented Oct 7, 2018

You certainly have a valid point regarding the questionable usefulness of GoF-tests in the context of copulas.

I was not aware of these aspects, but that is good news, as I can allocate more resources to other parts of the research (e.g. survival copulas). I no longer need to worry about the compute performance of GoF-tests, which took about 20 hours to calculate p-values on the sample bivariate data set of 500k observations (raw data set in the range of 25-36 m). The code in the R copula package is evidently not yet parallelized, as CPU utilization was only about 6-7%. But as you wrote, in the case of simulations, the goodness-of-fit is not the main point.

Thanks again for the clarification above! Vielen Dank!
