R for Music Research


Meet Alex


Getting started with R and RStudio


  • Write your code in an R script so that you can save it
  • Run code in an R script using Command + Return on Mac, Ctrl + Return on Windows/Linux, or by pressing the Run button
  • Use install.packages() to download and install a package
  • Use library() to load an installed package into your R session
  • Use help(), help.search() and the ? and ?? help operators to look up documentation on commands and packages
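
As a rough illustration of the points above, the following sketch shows the install, load, and help steps in one script. The package names are only examples: "dplyr" stands in for any package you might install, and "tools" is used for library() because it ships with R.

```r
# Install a package once per machine. The line is commented out so the
# script does not reinstall on every run; "dplyr" is only an example.
# install.packages("dplyr")

# Load an installed package for this session. "tools" ships with R, so this
# line runs without installing anything.
library(tools)

# Look up documentation for a function you know by name
help("mean")    # the same as typing ?mean

# Search all installed documentation for a phrase
help.search("standard deviation")    # the same as ??"standard deviation"
```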

Creating a directory structure


  • The term ‘directory’ in R has the same meaning as the common term ‘folder’
  • Use getwd() to check your current working directory
  • Establish a new working directory with setwd()
  • Create new directories using dir.create("nameofnewdirectoryhere") or through the navigation pane
  • Use list.files() to view all files in your working directory
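
The directory functions above might be combined as in this sketch. It assumes a hypothetical project folder called "music_project" and creates it inside tempdir(), so nothing is written to your real project folders.

```r
# Show the current working directory
getwd()

# Build a path for a new (hypothetical) project folder in a temporary location
project_dir <- file.path(tempdir(), "music_project")

# Create the project folder and a "data" folder inside it
dir.create(project_dir)
dir.create(file.path(project_dir, "data"))

# Switch to the new folder; setwd() returns the previous directory invisibly
old_wd <- setwd(project_dir)

# List everything in the working directory
list.files()

# Return to where we started
setwd(old_wd)
```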

Reading survey data in R


  • Use read.csv() to import a CSV data file into R
  • Use read_excel() from the readxl package to import an Excel data file in R
  • Use the assignment operator <- to give a name to your data set
  • Specify which values should be treated as missing with the na.strings argument of read.csv() when importing a CSV file
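
A minimal sketch of importing a CSV file with explicit missing-value codes follows. The file contents, the column names, and the use of -999 as a missing-value code are made up for illustration; the file is written to a temporary location so the example is self-contained.

```r
# Write a tiny example CSV to a temporary file (hypothetical survey data)
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "participant,instrument,practice_hours",
  "1,piano,5",
  "2,violin,-999",
  "3,,3"
), csv_path)

# Import the file and name the result "survey"; empty cells and -999 are
# both treated as missing values (NA)
survey <- read.csv(csv_path, na.strings = c("", "-999"))
survey

# An Excel file would be read in a similar way with the readxl package
# (the path below is a placeholder, so the line is commented out):
# survey_xlsx <- readxl::read_excel("path/to/file.xlsx")
```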

Inspecting your data in R


  • Inspect the dimensions of a data frame using dim()
  • Find out the number of rows and columns of a data frame using nrow() and ncol()
  • Use colnames() and rownames() to display column and row names respectively
  • Use head() to view the first six rows of a data frame (the default)
  • Look at data in a specific column by using the $ operator
  • Use [] to subset data from a data frame
  • Use str() to inspect the internal structure of the data frame
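
The inspection functions above can be tried on a small data frame, as in this sketch. The data frame and its column names are invented for the example.

```r
# A small made-up data frame to inspect
survey <- data.frame(
  participant = 1:8,
  instrument = c("piano", "violin", "piano", "guitar",
                 "voice", "piano", "violin", "guitar"),
  practice_hours = c(5, 2, 7, 3, 4, 6, 1, 8)
)

dim(survey)         # number of rows, then number of columns
nrow(survey)        # number of rows only
ncol(survey)        # number of columns only
colnames(survey)    # column names
rownames(survey)    # row names
head(survey)        # first six rows

# One column, selected by name with $
survey$instrument

# Rows 1 to 3 of two named columns, selected with []
survey[1:3, c("participant", "practice_hours")]

# Column types and a preview of the values
str(survey)
```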

Cleaning your data


  • Use the select() function from the dplyr package to keep only the columns you need and drop unnecessary ones
  • Use the filter() function from the dplyr package to keep only the rows that meet a specific condition
  • Use is.na() to identify missing values in the data
  • Use na.omit() to exclude rows with any missing data
  • Use write.csv() to save the cleaned data as a new data file
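
One possible cleaning sequence using these functions is sketched below. It assumes the dplyr package is installed; the data frame, its column names, and the output file name are invented for the example.

```r
library(dplyr)

# A small made-up data set with an unneeded column and some missing values
survey <- data.frame(
  participant = 1:5,
  timestamp = c("t1", "t2", "t3", "t4", "t5"),
  instrument = c("piano", "violin", NA, "guitar", "piano"),
  practice_hours = c(5, NA, 3, 8, 0)
)

# Drop a column that is not needed by prefixing its name with a minus sign
survey_small <- select(survey, -timestamp)

# Keep only rows where practice_hours is greater than 0
# (rows where the condition is NA are dropped as well)
survey_active <- filter(survey_small, practice_hours > 0)

# Count the missing values remaining in each column
colSums(is.na(survey_active))

# Remove every row that still contains a missing value
survey_clean <- na.omit(survey_active)

# Save the cleaned data as a new file (a temporary location is used here)
out_path <- file.path(tempdir(), "survey_clean.csv")
write.csv(survey_clean, out_path, row.names = FALSE)
```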

Analysing survey data


  • Use mean() and sd() to calculate the mean and standard deviation of a variable
  • Use min() to identify the minimum value of a particular variable
  • Use max() to identify the maximum value of a particular variable
  • Include the na.rm = TRUE argument in functions that support it so that the calculation ignores missing values in the data
  • Use filter() to subset your data by a specific variable
  • Use the t.test() function with the syntax t.test(DependentVariable ~ IndependentVariable, data = YourData) to test whether the means of two groups differ significantly
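
The following sketch runs the descriptive statistics and a two-group t-test on invented data. The variable names (group, score) and the values are assumptions made for illustration. Base R indexing is used for the subsetting step so the example needs no extra packages.

```r
# Made-up scores for two groups, with one missing value
survey <- data.frame(
  group = rep(c("musician", "non_musician"), each = 5),
  score = c(12, 15, NA, 14, 16, 9, 11, 10, 8, 12)
)

# Descriptive statistics, ignoring the missing value
mean_score <- mean(survey$score, na.rm = TRUE)
sd_score   <- sd(survey$score, na.rm = TRUE)
min_score  <- min(survey$score, na.rm = TRUE)
max_score  <- max(survey$score, na.rm = TRUE)

# Subset to one group; dplyr::filter(survey, group == "musician") would
# select the same rows
musicians <- survey[survey$group == "musician", ]
mean(musicians$score, na.rm = TRUE)

# Compare the mean score of the two groups; rows with a missing score are
# dropped by the formula interface
result <- t.test(score ~ group, data = survey)
result
```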

Visualising survey data with ggplot2


  • A ggplot has three main components: data, aesthetics, and geoms
  • A ggplot may be customised by adding layers of elements
  • Use geom_point() to create a scatterplot, geom_boxplot() for a boxplot, and geom_bar() for bar graphs
  • Use facet_wrap(~variable) to create separate plots simultaneously based on the unique values of a variable
  • Give your plot a title with ggtitle("title here") and label your axes with ylab() and xlab()
  • Save your plot with ggsave()
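
A faceted scatterplot built from these pieces might look like the sketch below. It assumes the ggplot2 package is installed; the data, labels, and output file name are invented for the example.

```r
library(ggplot2)

# Made-up data: weekly practice hours and a score for two groups
survey <- data.frame(
  practice_hours = c(1, 2, 3, 4, 5, 6, 7, 8),
  score = c(8, 9, 11, 10, 13, 14, 15, 17),
  group = rep(c("musician", "non_musician"), times = 4)
)

# Data and aesthetics go in ggplot(); each further layer is added with +
p <- ggplot(data = survey, aes(x = practice_hours, y = score)) +
  geom_point() +                          # scatterplot layer
  facet_wrap(~group) +                    # one panel per group
  ggtitle("Score by weekly practice hours") +
  xlab("Practice hours per week") +
  ylab("Score")

# Save the plot to a file (a temporary location is used here)
plot_path <- file.path(tempdir(), "practice_vs_score.png")
ggsave(plot_path, plot = p, width = 6, height = 4)
```

Swapping geom_point() for geom_boxplot() or geom_bar() changes the plot type while the rest of the code stays the same, although the aesthetics usually need adjusting to suit the new geom.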

A second case study for music research


  • An advantage of using R to work with data is that the same code can be run on data sets of different sizes, provided the data sets share a similar format
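
One way to picture this reuse is to wrap the analysis steps in a function and call it on two data sets of different sizes. The function name summarise_scores and the column name score are hypothetical, chosen only for this sketch.

```r
# A hypothetical helper that summarises any data frame with a "score" column
summarise_scores <- function(data) {
  c(
    n    = nrow(data),
    mean = mean(data$score, na.rm = TRUE),
    sd   = sd(data$score, na.rm = TRUE)
  )
}

# Two made-up data sets in the same format but of different sizes
small_study <- data.frame(score = c(10, 12, 14))
large_study <- data.frame(score = c(1:100, NA))

# The same code runs unchanged on both
small_summary <- summarise_scores(small_study)
large_summary <- summarise_scores(large_study)
small_summary
large_summary
```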

Some best practices when writing code in R


  • Keep your files organised in your working directory
  • Consider your working directory and what is required to reproduce your code (e.g., packages)
  • Be consistent in your naming conventions
  • Use # to add comments to your code