Summary and Setup

This is a new lesson built with The Carpentries Workbench.

Setup Instructions


First, it’s important to understand that R and RStudio are two different programs that need to be downloaded and installed separately. R is the underlying statistical computing environment, but working in R on its own can be cumbersome. RStudio, a graphical integrated development environment (IDE), makes R simpler and more interactive to use. Because RStudio depends on R to do the actual processing, you must install R before you install RStudio. Once both are installed, you do not need to start R yourself: RStudio starts it automatically and runs it in the background.

Once both programs are installed, you will need to install the tidyverse and here packages from within RStudio. The tidyverse package provides a powerful collection of data science tools for R (see the tidyverse website for more details), and the here package simplifies file access.

Follow the instructions below to install/update R and RStudio for your operating system, and then follow the instructions at the end to install tidyverse and here.

After installing R and RStudio:


  • If you are running Ubuntu or a related Linux distribution, you may need to install the following dependencies before installing the tidyverse package: libcurl4-openssl-dev, libssl-dev, and libxml2-dev (e.g. sudo apt install libcurl4-openssl-dev libssl-dev libxml2-dev).
  • To install the tidyverse package, type install.packages("tidyverse") in the console, then press the Enter key.
  • To install the here package, type install.packages("here") in the console, then press the Enter key.
  • To confirm that both packages are installed, select the Packages tab on the right and check that tidyverse and here are listed under User Library.
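
The same check can also be done from the console. The following is a minimal sketch using only base R; the variable names (needed, is_installed, still_missing) are illustrative and not part of the lesson.

```r
# Packages this lesson asks you to install.
needed <- c("tidyverse", "here")

# requireNamespace() returns TRUE when a package is installed and can be
# loaded, and FALSE otherwise. It does not install anything.
is_installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)
print(is_installed)

# Report anything that still needs install.packages().
still_missing <- needed[!is_installed]
if (length(still_missing) > 0) {
  message("Still to install: ", paste(still_missing, collapse = ", "))
} else {
  message("Both packages are installed.")
}
```

If a package is reported as missing, run the matching install.packages() command from the list above and repeat the check.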

Datasets


Throughout this workshop, we use four primary data sets: the Check-In Dataset, the Messy Dataset, the GoT Dataset, and the Check-In Snippet.

We recommend that you download a single zip file with all of the files and then unzip it. Move the unzipped folder to a location on your system that is easy to find (e.g. Desktop or Documents).
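
As a rough illustration of how the here package simplifies file access, the sketch below builds a path to one of the data files. It assumes, hypothetically, that the unzipped files sit in a folder named data inside your RStudio project directory; adjust the folder name to match your own setup.

```r
# Hypothetical folder name; change it to wherever you placed the files.
data_dir <- "data"

if (requireNamespace("here", quietly = TRUE)) {
  # here::here() builds the path starting from the project's top-level
  # directory, so it does not depend on the current working directory.
  checkin_path <- here::here(data_dir, "checkin_data.csv")
  print(checkin_path)

  # TRUE only if the file actually exists at that location.
  print(file.exists(checkin_path))
} else {
  message("The here package is not installed yet.")
}
```

If file.exists() reports FALSE, the folder is probably somewhere other than where this sketch assumes.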

The Check-In Dataset is based on a 2018 state election. The data set tracks check-in times and durations at ballot scanners across various locations, as well as the precinct that each device belongs to. All identifiable information has been replaced through pseudonymization.

  • Direct download link for the data file: https://raw.githubusercontent.com/EngineeringForDemocracy/r-election-workers/main/episodes/data/checkin_data.csv
  • Direct download link for the sampled data file (for ggplot2): https://raw.githubusercontent.com/EngineeringForDemocracy/r-election-workers/main/episodes/data/checkin_sample_plotting.csv

The Messy Dataset is based on a real-life election example and tracks how long individuals took to check in at a voting location. For check-ins that took longer than average, an explanation is given. The direct download link for the data file is: https://raw.githubusercontent.com/EngineeringForDemocracy/r-election-workers/main/episodes/data/messy_data.csv

The GoT Dataset is a fictional data set based on the Game of Thrones universe. It consists of polygons for graphing and voting data giving the percentage of voters who voted for Jon Snow or Daenerys Targaryen.

  • Direct download link for the CSV file: https://raw.githubusercontent.com/EngineeringForDemocracy/r-election-workers/main/episodes/data/voting_GoT.csv
  • Direct download link for the GeoJSON file: https://raw.githubusercontent.com/EngineeringForDemocracy/r-election-workers/main/episodes/data/polygons_GoT.json

The Check-In Snippet is a JSON representation of a fictional data set based on the Anonymized Dataset. It records which precinct, polling location, and scanner were used, as well as the number of arrivals and the times of the first and last arrival. The direct download link for the data file is: https://raw.githubusercontent.com/EngineeringForDemocracy/r-election-workers/main/episodes/data/checkin_snippet.json
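
If you prefer to fetch the individual files from within R rather than through a browser, the direct links listed in this section share a common base URL. The following is a minimal sketch using base R only; the function name download_lesson_data and the default folder name data are hypothetical, not part of the lesson materials.

```r
# Base URL and file names taken from the direct download links above.
base_url <- "https://raw.githubusercontent.com/EngineeringForDemocracy/r-election-workers/main/episodes/data"
data_files <- c(
  "checkin_data.csv",
  "checkin_sample_plotting.csv",
  "messy_data.csv",
  "voting_GoT.csv",
  "polygons_GoT.json",
  "checkin_snippet.json"
)
data_urls <- paste(base_url, data_files, sep = "/")

# Hypothetical helper: downloads every file into dest_dir (created if needed).
# Nothing is downloaded until you call the function.
download_lesson_data <- function(dest_dir = "data") {
  dir.create(dest_dir, showWarnings = FALSE, recursive = TRUE)
  for (i in seq_along(data_urls)) {
    # mode = "wb" keeps the files byte-for-byte identical on Windows.
    download.file(data_urls[i], file.path(dest_dir, data_files[i]), mode = "wb")
  }
  invisible(file.path(dest_dir, data_files))
}
```

With an internet connection, calling download_lesson_data() in the console would place the six files in a data folder under the current working directory.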