How to Get Good with R? | Credibly Curious #71

Open
utterances-bot opened this issue Nov 13, 2023 · 6 comments

Comments

@utterances-bot

How to Get Good with R? | Credibly Curious

https://www.njtierney.com/post/2023/11/10/how-to-get-good-with-r/

Copy link

Thank you for this :)
I am a beginner in R, and it's helped me a lot!

Regards from 🇧🇷

@njtierney
Collaborator

Glad to hear it! Let me know if there's anything you'd like to hear more about or what could be clearer :)

Copy link

I really liked this post! I think it's so important that you mention debugging. It is definitely one of those skills that self-taught R users often don't pick up on the go (though more learning materials mention it now).

I am really looking forward to part two! Beyond typing speed, I've found that not being able to touch type can be a real barrier (e.g. people miss auto-complete prompts because they're looking at their keyboard, so they are more likely to make typos, etc.). I was really surprised when I moved to France to realise just how many people were not taught how to touch type at school, even in younger generations...

@njtierney
Collaborator

Thanks for the kind words! Great point about typing speed - I think speed must first come from accuracy, so touch typing is necessary to develop useful speed.

I'm currently chunking the second part of the blog post into smaller pieces, as it turned into a pretty big post and I was worried I'd never finish it all. So there should be one about typing and keyboard shortcuts soon enough :)


dhduncan commented Dec 7, 2023

Thanks for sharing, Nick.

I'm a long-time user who has rarely had the sense of having fledged beyond the resources of Stack Overflow to get me around roadblocks and sticky puddles. Most problems posted on that and other platforms will typically attract solutions in base, tidyverse, and maybe data.table, and I for one have never really committed to one style. I switch between base and tidyverse style solutions, and - getting to my question now - do you think that as an extension of curating a consistent style to try and help reflex understanding of your own work, that one of the keys to success might be "joining a gang"?

@njtierney
Collaborator

Thanks, @dhduncan !

There are a lot of really good solutions on Stack Overflow and co. For me, I find that sometimes the best solution is in base, and sometimes it's in data.table or tidyverse.

do you think that as an extension of curating a consistent style to try and help reflex understanding of your own work, that one of the keys to success might be "joining a gang"?

Yes, I do! But there are caveats. data.table is hands down the fastest way to do a lot of data munging tasks and more in R. It's faster than Python, it's just like, really good. Personally I prefer to use the tidyverse, as I find that for what I'm doing, I don't need to worry about the memory/time problems that data.table would solve. I personally find the data.table syntax too brief, and as a result harder to understand.

There's a balance with joining a team. You don't have to use only tidyverse or base, but using data.table in the middle of some tidyverse code might cause some friction. So I would say, pick tidyverse or data.table. Although you can do both with dtplyr - https://dtplyr.tidyverse.org/ - which lets you write dplyr code while using data.table as a backend for speed.
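A minimal sketch of what that can look like, assuming dplyr and dtplyr are installed. The data frame and its column names (g, v) are made up for illustration:

```r
library(dplyr)
library(dtplyr)

# Made-up example data (hypothetical column names)
df <- data.frame(g = c("a", "a", "b"), v = c(1, 2, 3))

result <- lazy_dt(df) %>%        # wrap the data so dplyr verbs are translated to data.table
  group_by(g) %>%
  summarise(total = sum(v)) %>%
  as_tibble()                    # nothing is computed until the result is collected here
```

Calling show_query() on the pipeline before as_tibble() prints the data.table code that dtplyr generates, which is a handy way to learn the data.table equivalent of a dplyr pipeline.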

I wouldn't say that base solutions are mutually exclusive with either of these. But I think as you get more experience with these packages you will see places where staying in one group keeps the document in a consistent style.

Anyway, that's a long-winded way of saying, yes, I think it's a useful thing to stick with a consistent set of packages. In general I would avoid mixing up data.table code with dplyr/tidyverse code. They have different semantics, as shown below - in the tidyverse, mutate() returns a new data frame that you need to assign to a variable, base assigns new columns into the existing data frame with $, and data.table's := adds the columns by reference without needing to create a new data frame. But I think base code can be mixed in to some extent.

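Base

# assumes `data` is an existing data frame with 10 rows
data$x <- 1:10
data$y <- runif(10)

Tidyverse

library(dplyr)
data_x_vars <- data %>%
  mutate(
    x = 1:10,
    y = runif(10)
  )

data.table

# assumes `dt` is an existing data.table with 10 rows
dt[ , x := 1:10]
dt[ , y := runif(10)]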
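To make the difference in semantics concrete, here is a small sketch, assuming dplyr and data.table are installed. The objects df, df_new and dt_example are made-up names for illustration. It checks whether the original object changed after each operation:

```r
library(dplyr)
library(data.table)

# tidyverse: mutate() returns a new data frame, the input is left untouched
df <- data.frame(id = 1:3)
df_new <- df %>% mutate(x = id * 2)
"x" %in% names(df)          # FALSE
"x" %in% names(df_new)      # TRUE

# data.table: := adds the column by reference, so no assignment is needed
dt_example <- data.table(id = 1:3)
dt_example[, x := id * 2]
"x" %in% names(dt_example)  # TRUE
```

This is why mixing the two styles in one script can cause friction: one style expects you to capture a result, the other changes the object you already have.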


5 participants