Key Points

Before We Start


  • Use RStudio to write and run R programs.
  • Use install.packages() to install packages (libraries).

Introduction to R


  • Access individual values by location using [].
  • Access arbitrary sets of data using [c(...)].
  • Use logical operations and logical vectors to access subsets of data.
  • Use proper date types (Date and POSIXct) instead of strings for date arithmetic.
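
The indexing and date points above can be sketched in base R. The vector `times` and the two dates are made-up illustrative values, not data from the lesson.

```r
# Hypothetical check-in durations in minutes (illustrative values only).
times <- c(12, 45, 7, 30, 61)

second_value <- times[2]            # one value by position (R counts from 1)
picked       <- times[c(1, 3, 5)]   # an arbitrary set of positions
long_visits  <- times[times > 20]   # logical vector selects matching values

# Date arithmetic needs Date (or POSIXct) objects, not character strings.
start <- as.Date("2024-01-15")
end   <- as.Date("2024-03-01")
days_between <- as.numeric(end - start)
```

Subtracting the two strings directly would fail with an error, which is why converting with as.Date() first matters.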

Starting with Data


  • Use read_csv() to read tabular data in R.
  • Access rows and columns in a tibble in R.
  • Use factors to represent categorical data in R.
  • Use date-time types to represent dates and times in R.
  • Output an updated data set to CSV in R.
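
A minimal sketch of the points above, assuming the readr package is installed. The file contents and the column names (`user`, `device`, `checkin_time`) are hypothetical stand-ins for the lesson's data set, and the files are written to temporary paths.

```r
library(readr)

# Create a small hypothetical CSV file so the example is self-contained.
path <- tempfile(fileext = ".csv")
writeLines(c(
  "user,device,checkin_time",
  "a,phone,2024-01-15 08:30:00",
  "b,tablet,2024-01-15 09:05:00",
  "c,phone,2024-01-16 17:45:00"
), path)

# Read the file into a tibble, parsing the time column as a date-time.
checkins <- read_csv(
  path,
  col_types = cols(
    user = col_character(),
    device = col_character(),
    checkin_time = col_datetime(format = "%Y-%m-%d %H:%M:%S")
  )
)

device_column <- checkins$device   # one column as a vector
second_row    <- checkins[2, ]     # one row, still a tibble

# Store the categorical column as a factor.
checkins$device <- factor(checkins$device)

# Write the updated data set back out as CSV.
out_path <- tempfile(fileext = ".csv")
write_csv(checkins, out_path)
```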

Data Wrangling with dplyr


  • Use the dplyr package to manipulate tibbles.
  • Use select() to choose variables from a tibble.
  • Use filter() to choose data based on values.
  • Use group_by() and summarize() to work with subsets of data.
  • Use mutate() to create new variables.
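
The verbs above are usually chained into a pipeline. This is a sketch on a small made-up tibble, and the column names are hypothetical.

```r
library(dplyr)

# Hypothetical check-in data (illustrative values only).
checkins <- tibble(
  user    = c("a", "b", "c", "d"),
  device  = c("phone", "tablet", "phone", "tablet"),
  minutes = c(10, 20, 30, 40)
)

summary_tbl <- checkins %>%
  select(device, minutes) %>%          # keep only the variables needed
  filter(minutes > 10) %>%             # keep rows based on values
  mutate(hours = minutes / 60) %>%     # create a new variable
  group_by(device) %>%                 # split into per-device groups
  summarize(
    mean_minutes = mean(minutes),      # one summary row per group
    n = n()
  )
```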

Data Wrangling with tidyr


  • Use the tidyr package to change the layout of tibbles.
  • Use pivot_wider() to go from long to wide format.
  • Use pivot_longer() to go from wide to long format.
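
A small round trip illustrates both directions. The tibble below is made up for illustration, and its column names are hypothetical.

```r
library(tidyr)

# Hypothetical long-format data: one row per user and device.
long <- tibble(
  user     = c("a", "a", "b", "b"),
  device   = c("phone", "tablet", "phone", "tablet"),
  checkins = c(3, 1, 0, 5)
)

# Long to wide: each device becomes its own column.
wide <- pivot_wider(long, names_from = device, values_from = checkins)

# Wide to long: gather the device columns back into name/value pairs.
back <- pivot_longer(
  wide,
  cols = c(phone, tablet),
  names_to = "device",
  values_to = "checkins"
)
```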

Data Visualisation with ggplot2


  • ggplot2 is a flexible and useful tool for creating plots in R.
  • The data set and coordinate system can be defined using the ggplot() function.
  • Additional layers, including geoms, are added using the + operator.
  • Time-series data can be visualized using geom_line() and geom_point().
  • Box plots are useful for visualizing the distribution of check-in times by location.
  • Bar plots are useful for visualizing counts of check-ins by categorical variables.
  • Faceting allows you to generate multiple plots based on a categorical variable like device.
  • Spatial data can be visualized on maps using the sf and ggplot2 packages.
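
The layering idea can be sketched with a faceted box plot, assuming ggplot2 is installed. The data frame and its columns (`location`, `device`, `minutes`) are hypothetical, and the map example with sf is left out to keep the sketch free of extra dependencies.

```r
library(ggplot2)

# Hypothetical check-in data (illustrative values only).
checkins <- data.frame(
  location = rep(c("library", "gym"), each = 6),
  device   = rep(c("phone", "tablet"), times = 6),
  minutes  = c(10, 25, 15, 30, 20, 35, 40, 55, 45, 60, 50, 65)
)

# ggplot() sets the data and aesthetic mappings; layers are added with +.
p <- ggplot(checkins, aes(x = location, y = minutes)) +
  geom_boxplot() +          # distribution of minutes per location
  facet_wrap(~ device)      # one panel per device

# print(p) draws the plot in an interactive session.
```

Swapping geom_boxplot() for geom_bar() (with only an x aesthetic) or for geom_line() plus geom_point() follows the same pattern.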

Getting Started with R Markdown (optional)


  • R Markdown is a useful format for creating reproducible documents that combine text and executable R code.
  • You can specify chunk options to control formatting of the output document.