In this worksheet, we’ll be looking at some erroneous plots and fixing them.

You may not have the following two packages installed yet; if so, install them first:

install.packages("ggridges")
install.packages("scales")
library(tidyverse)
library(gapminder)
library(ggridges)
library(scales)

Exercise 1: Overlapping Points

After fixing the error, fix the overlapping problem in the following plot (attribution: “R for data science”).

# BEFORE
ggplot(mpg, aes(cty, hwy)) +
  geom_point()

# AFTER
# Use geom_jitter to spread the points out
ggplot(mpg, aes(cty, hwy)) +
  geom_jitter()

# Add a trend line with geom_smooth(); method = "lm" fits a linear regression
ggplot(mpg, aes(cty, hwy)) +
  geom_jitter(alpha = 0.5, size = 1) +
  geom_smooth(method = "lm") +
  theme_bw()
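To see why the points overlap in the first place, it can help to count how many rows share the same (cty, hwy) pair. The following is a minimal sketch, not part of the original exercise; the object name `pairs` is only illustrative, and `geom_count()` is shown as one possible alternative to jittering.

```r
library(ggplot2)
library(dplyr)

# cty and hwy are whole numbers, so many cars land on exactly the same spot
pairs <- count(mpg, cty, hwy)

# Fewer distinct positions than rows means some points are hidden behind others
nrow(mpg)
nrow(pairs)

# Alternative to jittering: map the number of overlapping rows to point size
ggplot(mpg, aes(cty, hwy)) +
  geom_count(alpha = 0.5)
```

Jittering keeps one point per car but moves it slightly, whereas `geom_count()` keeps the exact positions and encodes the overlap as size.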

Exercise 2: Line for each Country

Fix this plot so that it shows life expectancy over time for each country. Notice that ggplot2 ignores the grouping of a tibble!

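# The two plots below are identical: ggplot2 ignores group_by(),
# so the grouping has to be given through the group aesthetic.
# group_by() is mainly useful before dplyr verbs such as summarise().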
gapminder %>% 
  group_by(country) %>% 
  ggplot(aes(year, lifeExp, group = country, colour = country == "Canada")) +
  geom_line(alpha = 0.2) +
  scale_colour_discrete("", labels = c("Other", "Canada"))

gapminder %>% 
#  group_by(country) %>% 
  ggplot(aes(year, lifeExp, group = country, colour = country == "Canada")) +
  geom_line(alpha = 0.2) +
  scale_colour_discrete("", labels = c("Other", "Canada"))
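One way to check that ggplot2 ignores the tibble's grouping is to inspect the data that a layer actually draws, using `layer_data()`. This is only a sketch; the object names (`p_no_group`, `p_group`, and so on) are made up for the illustration.

```r
library(ggplot2)
library(dplyr)
library(gapminder)

# Grouped tibble, but no group aesthetic: all rows are drawn as a single line
p_no_group <- gapminder %>%
  group_by(country) %>%
  ggplot(aes(year, lifeExp)) +
  geom_line()

# Ungrouped tibble with an explicit group aesthetic: one line per country
p_group <- ggplot(gapminder, aes(year, lifeExp, group = country)) +
  geom_line()

# Number of distinct groups ggplot2 computed for each plot
n_groups_without <- n_distinct(layer_data(p_no_group)$group)
n_groups_with <- n_distinct(layer_data(p_group)$group)
```

Here `n_groups_without` should be 1, while `n_groups_with` should equal the number of countries in `gapminder`.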

Exercise 3: More gdpPercap vs lifeExp

3(a) Facets

  • Change the x-axis text to be in “comma format” with scales::comma_format().
  • Separate each continent into sub-panels.
# To give every panel its own axis ranges, use facet_wrap(~ continent, scales = "free");
# scales = "free_x" or scales = "free_y" frees only one axis.
ggplot(gapminder, aes(gdpPercap, lifeExp)) +
  geom_point(alpha = 0.2) +
  scale_x_log10(labels = comma_format()) +
  facet_wrap(~ continent)
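As a quick illustration of what "comma format" means, `comma_format()` returns a function that turns numbers into labels, and that function is what the `labels` argument receives. The sketch below uses made-up input values; the names `fmt` and `labels` are only illustrative.

```r
library(scales)

# comma_format() returns a labelling function
fmt <- comma_format()

# Applying it to a few example values
labels <- fmt(c(1000, 25000, 1e6))
labels
# "1,000" "25,000" "1,000,000"
```

Because the formatter is an ordinary function, it can be tried out on its own like this before being passed to `scale_x_log10()`.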

3(b) Bubble Plot

  • Put the plots in one row, and free up the axes.
  • Make a bubble plot by making the size of the points proportional to population.
    • Try adding a scale_size_area() layer too (could also try scale_radius()).
  • Use shape=21 to distinguish between fill (interior) and colour (exterior).
# colour controls the outline (perimeter) of the shape
# fill controls the interior of the shape
gapminder %>% 
  filter(continent != "Africa") %>% 
  ggplot(aes(gdpPercap, lifeExp, size = pop, fill = continent)) +
  # nrow = 1 puts the panels in one row; scales = "free" frees up the axes
  facet_wrap(~ continent, nrow = 1, scales = "free") +
  geom_point(alpha = 0.5, shape = 21) +
  scale_x_log10(labels = scales::comma_format()) +
  # the trailing + above is needed so that this scale is added to the plot
  scale_size_area()

A list of shapes can be found at the bottom of the scale_shape documentation.
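The difference between fill and colour for shape 21 can be seen on a tiny made-up data set. This is a minimal sketch; the data frame `demo` and the chosen colours are assumptions made purely for illustration.

```r
library(ggplot2)

# Three made-up points
demo <- data.frame(x = 1:3, y = 1:3)

# Shape 21 is a circle with separate interior (fill) and outline (colour)
p <- ggplot(demo, aes(x, y)) +
  geom_point(shape = 21, size = 5, fill = "orange", colour = "black")

# The built layer data records both aesthetics separately
built <- layer_data(p)
built[, c("shape", "fill", "colour")]
```

With the default point shape, `fill` has no visible effect, which is why the exercise asks for `shape = 21`.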

3(c) Size “not working”

Suppose that, instead of using alpha transparency, you want to fix the overplotting issue by plotting small points. Why is this not working? Fix it.

# size = 0.1 is a constant, not a variable in the data, so move it out of aes()
ggplot(gapminder) +
  geom_point(aes(gdpPercap, lifeExp), size = 0.1) +
  scale_x_log10(labels = scales::dollar_format())
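To see what goes wrong when the constant sits inside `aes()`, the drawn sizes of both versions can be compared with `layer_data()`. This is a sketch under the assumption that the broken plot had `size = 0.1` inside `aes()`; the names `p_inside` and `p_outside` are illustrative only.

```r
library(ggplot2)
library(gapminder)

# size inside aes(): 0.1 is treated as data and passed through the size scale
p_inside <- ggplot(gapminder) +
  geom_point(aes(gdpPercap, lifeExp, size = 0.1)) +
  scale_x_log10()

# size outside aes(): 0.1 is used directly as the point size
p_outside <- ggplot(gapminder) +
  geom_point(aes(gdpPercap, lifeExp), size = 0.1) +
  scale_x_log10()

size_inside <- unique(layer_data(p_inside)$size)
size_outside <- unique(layer_data(p_outside)$size)
```

In the first plot the size scale rescales the value into its own output range, so the points do not come out at size 0.1 and a legend for "0.1" appears; in the second, `size_outside` is exactly 0.1.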