June 17th, 2019

dplyr

dplyr improves on capabilities already in R:

  • redefines data “wrangling” in R
  • simpler, more intuitive syntax
  • faster, better performance for large data tables
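To make the syntax point concrete, here is a minimal sketch comparing one base R way of computing a grouped mean with the dplyr equivalent. The built-in mtcars data set and the column names mpg and cyl are stand-ins chosen for illustration; they are not part of the practical.

```r
library(dplyr)

# Base R: formula interface of aggregate()
base_result <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)

# dplyr: the verbs name each step (group, then summarise)
dplyr_result <- summarise(group_by(mtcars, cyl),
  mean_mpg = mean(mpg, na.rm = TRUE))

# Both approaches should give the same group means
all.equal(base_result$mpg, dplyr_result$mean_mpg)
```

Both versions are short here; the dplyr form tends to pay off once several steps (filtering, grouping, summarising, sorting) are chained together.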

dplyr practical

Open BADAS3_practical.html and complete exercises 1 to 3

Naming things is (still) hard

A classic example

dat <- group_by(mydata, group)
dat2 <- summarise(dat,
  mean_var = mean(var, na.rm = TRUE))

The pipe %>%

Forward pipe operator in R: %>%

From the magrittr package by Stefan Milton Bache

Forward pipe operator

data_summarized <- mydata %>% 
  group_by(group) %>% 
  summarise(mean_var = mean(var, na.rm = TRUE))
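The pipe passes its left-hand side as the first argument of the function on its right, so x %>% f(y) is evaluated as f(x, y). The sketch below, which assumes the built-in mtcars data set in place of mydata, writes the same computation as a nested call and as a pipe to show that only the reading order changes.

```r
library(dplyr)

# Nested call: has to be read from the inside out
nested <- summarise(group_by(mtcars, cyl),
  mean_mpg = mean(mpg, na.rm = TRUE))

# Piped call: reads top to bottom, no intermediate names needed
piped <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg, na.rm = TRUE))

# The two results should be the same object
identical(nested, piped)
```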

When not to use %>%

Piping is adequate for short sequences of function calls

  • avoid pipes that are too long => create intermediate results (with good names!)
  • the pipe does not deal well with multiple inputs or multiple outputs
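As a rough sketch of both points, the example below splits a longer pipeline into named intermediate results and then combines two inputs with an ordinary function call rather than a pipe. The mtcars data set and the names manual_cars, mpg_by_cylinder and cars_with_group_mean are hypothetical choices for illustration.

```r
library(dplyr)

# Step 1: keep the cars with a manual transmission (am == 1)
manual_cars <- mtcars %>%
  filter(am == 1)

# Step 2: summarise fuel consumption per number of cylinders
mpg_by_cylinder <- manual_cars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg, na.rm = TRUE), n_cars = n())

# Two inputs: a plain function call makes both tables explicit,
# instead of hiding one of them at the start of a pipe
cars_with_group_mean <- left_join(manual_cars, mpg_by_cylinder, by = "cyl")
```

Each intermediate object can be inspected on its own, which makes a failing step easier to locate than in one long chain.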

More resources

dplyr & %>% practical

Open BADAS3_practical.html and complete exercises 4 to 10