We have pulled some data on platypuses from the Atlas of Living Australia using the “ALA4R” package, and done some subsetting of the data to only look at a few of the key variables, and for the state of Victoria.
Let’s read in data with read_csv:
library(readr)
platypus <- read_csv(here::here("data/platypus.csv"))
## Parsed with column specification:
## cols(
## id = col_character(),
## commonName = col_character(),
## scientificName = col_character(),
## state = col_character(),
## latitude = col_double(),
## longitude = col_double(),
## eventDate = col_date(format = ""),
## sex = col_character()
## )
# platypus <- read_csv(here::here("exercises/1b/data/platypus.csv"))
This code can be read as: “read in this .csv file from the data folder”.
(We’ll talk more about what here::here means in an upcoming lecture.)
There are observations of platypus that we have seen.
Task: - Add a section header called “About the data” to your document. Write a paragraph about the “Atlas of Living Australia” (use your internet search skills!). - You could even try adding a picture of a platypus into your report. You can do this using markdown syntax or include_graphics() - try looking online for the rstudio markdown cheatsheet, or typing ?include_graphics() into the console.
Let’s make some plots.
Two of the variables in the data set are the latitude and longitude indicating where the animal was spotted. This is going to be the first plot, made using the ggplot2 package from tidyverse suite.
Use View(platypus) to open a “spreadsheet” view of the data.
What do you see?
What was the year with the most platypus?
How do you think you would work that out?
What steps would you need to take?
We will cover how to answer these kinds of questions and more in class!
While we can look at the data in a spreadsheet, we can’t answer those questions.
We need to do some data manipulation and summarising using tools from R!
Let’s subset the data to only look at platypus sightings from Victoria
platypus_vic <- platypus %>%
filter(state == "Victoria")
Now try plotting this data, using some of the code from above
Below contains the code used to download the data
# load the ALA4R package, which contains the platypus data
library(ALA4R)
# Take a look at what the package does using the code `help(package="ALA4R")`
# Look up the scientific name for platypus using:
specieslist("platypus")
# This returns a lot of different organisms with "platypus" in the name, but you should be able to find one line with the relevant information, that its scientific name is "Ornithorhynchus anatinus".
platypus <- occurrences("Ornithorhynchus anatinus", download_reason_id=10)
# 518426.7 is NOWHERE near Australia. Let's filter it out
# subset the data using dplyr filter and select commands
platypus <- platypus$data %>%
filter(longitude < 518426) %>%
select(id,
commonName,
scientificName,
state,
latitude,
longitude,
eventDate,
sex) %>%
as_tibble()
# write_csv(platypus, here::here("exercises/1b/data/platypus.csv"))
write_csv(platypus, here::here("data/platypus.csv"))
# You can learn more about `read_csv` and `write_csv` by typing `?write_csv`.