Photo by Madeleine Kohler on Unsplash
Once upon a time, people traveled all over the world, and some stayed in hotels and others chose to stay in other people’s houses that they booked through Airbnb. As in many cities, Airbnb has had an impact on the housing market of New York. Using data provided by Airbnb, we can explore how Airbnb availability and prices vary by neighborhood.
Your repository is named hw-01-airbnb-YOUR_GITHUB_USERNAME. Open hw-01.Rmd and Knit it. Make sure it compiles without errors. The output will be a Markdown (.md) file with the same name. (You’ll turn in the .md file and the plots generated along with it.)
Before we introduce the data, let’s warm up with some simple exercises. Keep an eye out in the instructions for where you are instructed to: 🧶 knit ✅ commit ⬆️ push
We’ll use the tidyverse package for much of the data wrangling and visualisation. The data lives on the course website and is loaded below. (The tidyverse packages are already installed for you as long as you are working in the BST430-fall2021 rstudio.cloud workspace.) You can load them, along with the nycbnb data, by running the following in your Console:
library(tidyverse)
nycbnb = read_csv("https://urmc-bst.github.io/bst430-fall2021-site/hw_lab_instruction/hw-01-airbnb/data/nylistings.csv")
The data is loaded in the first code chunk in your template into an object called nycbnb.
You can view the dataset as a spreadsheet using the View() function. Note that you should not put this function in your R Markdown document, but instead type it directly in the Console, as it pops open a new window (and the concept of popping open a window in a static document doesn’t really make sense…). When you run this in the Console, you’ll see a data viewer window pop up.
View(nycbnb)
You can find out more about the dataset by inspecting its data dictionary, available here: https://docs.google.com/spreadsheets/d/1iWCNJcSutYqpULSQHlNyGInUvHg2BoUGoNRIGa6Szc4/edit#gid=982310896, and you can read more about the project that collected it here: http://insideairbnb.com/.
Hint: The Markdown Quick Reference sheet has an example of inline R code. You can access it from the Help menu in RStudio. You can also look at the markdown cheatsheet available on the course website.
Run View(nycbnb) in your Console to view the data in the data viewer. What does each row in the dataset represent?
🧶 ✅ ⬆️ Knit, commit, and push your changes to GitHub with an appropriate commit message. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.
Each column represents a variable. We can get a list of the variables in the data frame using the names() function.
names(nycbnb)
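To see what these inspection functions return without touching the real data, here is a minimal sketch on a made-up data frame. The object toy_listings and its columns are hypothetical stand-ins, not the actual nycbnb variables.

```r
library(tibble)

# Hypothetical stand-in for nycbnb; the real variable names come from names(nycbnb)
toy_listings <- tibble(
  id    = c(1, 2, 3),
  area  = c("A", "B", "A"),
  price = c(120, 85, 200)
)

names(toy_listings)  # character vector of column (variable) names
ncol(toy_listings)   # number of variables
nrow(toy_listings)   # number of rows
```

The same three calls work unchanged on nycbnb once it is loaded.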
ggplot(data = ___, mapping = aes(x = ___)) +
  geom_histogram(binwidth = ___) +
  facet_wrap(~___) # or facet_grid...
Let’s de-construct this code:
ggplot() is the function we are using to build our plot, in layers.
🧶 ✅ ⬆️ Knit, commit, and push your changes to GitHub with an appropriate commit message. Make sure to commit and push all changed files (including the hw-01_files folder) so that your Git pane is cleared up afterwards.
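To make the layered structure of the histogram template concrete, here is one way it might look when filled in for a made-up data frame. The data, the column names area and price, and the binwidth of 50 are all assumptions chosen for illustration; pick the variables and binwidth that suit nycbnb yourself.

```r
library(ggplot2)

# Hypothetical toy data; `area` and `price` are illustrative column names
toy_listings <- data.frame(
  area  = rep(c("A", "B"), each = 6),
  price = c(50, 60, 75, 90, 120, 200, 80, 95, 110, 150, 180, 300)
)

p <- ggplot(data = toy_listings, mapping = aes(x = price)) +
  geom_histogram(binwidth = 50) +  # each bar spans a 50-unit price range
  facet_wrap(~area)                # one panel per area

p  # printing the object draws the plot
```

Each `+` adds a layer or a modifier to the plot started by ggplot(), which is why the pieces can be swapped independently (for example, facet_grid() in place of facet_wrap()).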
… (geom_density) of the distributions of listing prices in these five neighborhoods. In a third pipeline, calculate the minimum, mean, median, standard deviation, IQR, and maximum listing price, and the number of listings, in each of these neighborhoods. Use the visualization and the summary statistics to describe the distribution of listing prices in the neighborhoods. (Your answer will include three pipelines, one of which ends in a visualization, and a narrative.)
… (review_scores_rating) across neighborhoods. You get to decide what type of visualization to create, and there is more than one correct answer! In your answer, include a brief interpretation of how Airbnb guests rate properties in general and how the neighborhoods compare to each other in terms of their ratings.
🧶 ✅ ⬆️ Knit, commit, and push your changes to GitHub with an appropriate commit message. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards.
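The grouped summary statistics asked for in the listing-price exercise (minimum, mean, median, standard deviation, IQR, maximum, and count) can be sketched with group_by() and summarise(). The data frame below is made up, and the column names area and price are assumptions standing in for whatever nycbnb actually calls them.

```r
library(dplyr)

# Hypothetical toy data; not the real nycbnb columns
toy_listings <- data.frame(
  area  = c("A", "A", "A", "B", "B", "B"),
  price = c(100, 150, 200, 50, 60, 400)
)

price_summary <- toy_listings %>%
  group_by(area) %>%              # one group per area
  summarise(
    min_price    = min(price),
    mean_price   = mean(price),
    median_price = median(price),
    sd_price     = sd(price),
    iqr_price    = IQR(price),
    max_price    = max(price),
    n_listings   = n()            # number of rows in each group
  )

price_summary
```

In area B the single expensive listing pulls the mean well above the median, which is the kind of skew worth mentioning in a narrative.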
… (review_scores_rating) and price? Make a plot and explain what you think might be going on.
🧶 ✅ ⬆️ Knit, commit, and push your changes to GitHub with an appropriate commit message. Make sure to commit and push all changed files so that your Git pane is cleared up afterwards, and review the .md document on GitHub to make sure you’re happy with the final state of your work.
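One common way to look at two numeric variables together is a scatter plot. The sketch below uses made-up values and hypothetical column names rating and price; it is only one of several reasonable plot types, and it says nothing about what the real data show.

```r
library(ggplot2)

# Hypothetical toy data standing in for review scores and listing prices
toy_listings <- data.frame(
  rating = c(4.2, 4.5, 4.7, 4.8, 4.9, 5.0),
  price  = c(60, 150, 90, 220, 130, 110)
)

p_scatter <- ggplot(toy_listings, aes(x = rating, y = price)) +
  geom_point(alpha = 0.6)  # partial transparency helps when points overlap

p_scatter
```

With many listings, transparency (alpha) or a log-scaled price axis can make dense regions easier to read.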