Alternative model summaries #54

Open
erex opened this issue Apr 14, 2020 · 1 comment

Comments

@erex
Member

erex commented Apr 14, 2020

From Pat Lorch to the list on 11 Apr:

I run (and often re-run) Distance on data A LOT. So for me the detailed output, while useful for understanding one analysis, slows down comparisons across years. I am working with data from 2-4 parks per year, where we estimate density trends using spotlighting. We use the same methods on the same routes every 2-3 years. I am working my way back through 20 years of this data. I find errors or decide to look more closely at truncation or binning, etc., and I invariably need to run the model again. Then I have to extract the same 20 numbers that allow me to look at density estimates, abundance, and group size across years.

This year, I finally broke down and wrote a function to extract these numbers. As you will see, if you are interested enough to read my clunky function, the values I need are not easy to find in the output. This is one of the things that kept me from writing the function. Don't get me wrong, the Distance package is great. It does a ton of work and has really useful data in its output, serving a lot of different needs. However, writing the function made me think that there may be demand out there for something that pulls the kind of data out that I have been gathering.

I thought I would ask people whether they could use something like this.

How can this be done? There are already summary and print methods and a summarize_ds_models function that does some of this, but these mostly produce formatted text output. I think that, currently, one way to do this would be to create broom::tidy and glance methods for R Distance. If you are not familiar, tidy methods grab the parameters that are commonly recorded or reported for an analysis. glance methods return a one-row summary of the critical outputs, perhaps ones that tell you whether your model fits or not, so for us, the GOF p-value and maybe the CV of the D, A, S, and p estimates. broom also typically implements an augment method to add data to the original data table. I cannot see a use for this, but maybe someone else can.
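To make the tidy/glance distinction concrete, here is a minimal sketch in base R. Everything in it is an assumption for illustration: the `toy_ds_fit` class, its `estimates`, `gof_p` and `aic` fields, and the numbers are made up and are not the real structure of a Distance fit. The `tidy` and `glance` generics are defined locally as stand-ins so the sketch runs without broom installed.

```r
# Stand-in generics so the sketch runs without broom; with broom loaded,
# its own tidy() and glance() generics would be used instead.
tidy <- function(x, ...) UseMethod("tidy")
glance <- function(x, ...) UseMethod("glance")

# Hypothetical fitted-model object with made-up values. These field names
# are NOT the real layout of a Distance model object.
fit <- structure(
  list(
    estimates = data.frame(
      term      = c("D", "N", "S", "p"),
      estimate  = c(12.4, 310, 2.1, 0.45),
      std.error = c(2.5, 62, 0.2, 0.05),
      stringsAsFactors = FALSE
    ),
    gof_p = 0.62,
    aic   = 415.3
  ),
  class = "toy_ds_fit"
)

# tidy(): one row per reported quantity, with the CV computed from the SE.
tidy.toy_ds_fit <- function(x, ...) {
  out <- x$estimates
  out$cv <- out$std.error / out$estimate
  out
}

# glance(): exactly one row describing the model as a whole.
glance.toy_ds_fit <- function(x, ...) {
  est <- tidy(x)
  data.frame(
    gof_p = x$gof_p,
    AIC   = x$aic,
    cv_D  = est$cv[est$term == "D"]
  )
}

tidy_out   <- tidy(fit)
glance_out <- glance(fit)
```

Real methods would have to pull these values out of the fitted model object, which is where most of the work lies.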

What I need is "tidy" output with a row for each whole model output for a year and a row for each route within year. My function does what I need for now.
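As a rough illustration of that layout (not Pat's function, which is in the archived post), the sketch below stacks per-year results into one data frame with a "total" row per year and one row per route. The `results` list, its `total` and `routes` elements, the `stack_year` helper, and all the numbers are hypothetical; in practice the values would be extracted from each year's fitted model.

```r
# Hypothetical per-year results with made-up values.
results <- list(
  "2018" = list(
    total  = data.frame(D = 10.0, N = 250),
    routes = data.frame(route = c("A", "B"), D = c(8.0, 12.0),
                        N = c(100, 150), stringsAsFactors = FALSE)
  ),
  "2020" = list(
    total  = data.frame(D = 11.0, N = 275),
    routes = data.frame(route = c("A", "B"), D = c(9.0, 13.0),
                        N = c(110, 165), stringsAsFactors = FALSE)
  )
)

# Build the rows for one year: one whole-model row, then one row per route.
stack_year <- function(year, res) {
  total <- data.frame(year = as.integer(year), level = "total",
                      route = NA_character_, res$total,
                      stringsAsFactors = FALSE)
  routes <- data.frame(year = as.integer(year), level = "route",
                       res$routes, stringsAsFactors = FALSE)
  # Align column order before binding.
  rbind(total, routes[names(total)])
}

# Apply to every year and bind into a single tidy table.
tidy_all <- do.call(rbind, Map(stack_year, names(results), results))
rownames(tidy_all) <- NULL
```

With everything in one table, comparing density or abundance across years becomes a matter of filtering on `level` and `route` rather than re-reading printed output.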

Let me know what you think about all this. If there is interest, maybe we can turn this into a set of "tidy" summaries.

@dill
Contributor

dill commented Apr 21, 2020

For future reference the post is archived here with code from Pat.

@dill dill self-assigned this Jan 12, 2021
@dill dill removed their assignment Nov 22, 2022