class: ur-title, center, middle, title-slide # BST430 Lecture 12 ## Functions (i) ### Andrew McDavid ### U of Rochester ### 2021-10-13 (updated: 2021-10-26) --- [cran]: https://cloud.r-project.org [cran-faq]: https://cran.r-project.org/faqs.html [cran-R-admin]: http://cran.r-project.org/doc/manuals/R-admin.html [cran-add-ons]: https://cran.r-project.org/doc/manuals/R-admin.html#Add_002don-packages [r-proj]: https://www.r-project.org [stat-545]: https://stat545.com [software-carpentry]: https://software-carpentry.org [cran-r-extensions]: https://cran.r-project.org/doc/manuals/r-release/R-exts.html <!--RStudio Links--> [rstudio-preview]: https://www.rstudio.com/products/rstudio/download/preview/ [rstudio-official]: https://www.rstudio.com/products/rstudio/#Desktop [rstudio-workbench]: https://www.rstudio.com/wp-content/uploads/2014/04/rstudio-workbench.png [rstudio-support]: https://support.rstudio.com/hc/en-us [rstudio-R-help]: https://support.rstudio.com/hc/en-us/articles/200552336-Getting-Help-with-R [rstudio-customizing]: https://support.rstudio.com/hc/en-us/articles/200549016-Customizing-RStudio [rstudio-key-shortcuts]: https://support.rstudio.com/hc/en-us/articles/200711853-Keyboard-Shortcuts [rstudio-command-history]: https://support.rstudio.com/hc/en-us/articles/200526217-Command-History [rstudio-using-projects]: https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects [rstudio-code-snippets]: https://support.rstudio.com/hc/en-us/articles/204463668-Code-Snippets [rstudio-dplyr-cheatsheet-download]: https://github.com/rstudio/cheatsheets/raw/master/data-transformation.pdf [rstudio-regex-cheatsheet]: https://www.rstudio.com/wp-content/uploads/2016/09/RegExCheatsheet.pdf [rstudio-devtools]: https://www.rstudio.com/products/rpackages/devtools/ <!--HappyGitWithR Links--> [happy-git]: https://happygitwithr.com [hg-install-git]: https://happygitwithr.com/install-git.html [hg-git-client]: https://happygitwithr.com/git-client.html [hg-github-account]: 
https://happygitwithr.com/github-acct.html [hg-install-r-rstudio]: https://happygitwithr.com/install-r-rstudio.html [hg-connect-intro]: https://happygitwithr.com/connect-intro.html [hg-browsability]: https://happygitwithr.com/workflows-browsability.html [hg-shell]: https://happygitwithr.com/shell.html <!--Package Links--> [rmarkdown]: https://rmarkdown.rstudio.com [knitr-faq]: https://yihui.name/knitr/faq/ [tidyverse-main-page]: https://www.tidyverse.org [tidyverse-web]: https://tidyverse.tidyverse.org [tidyverse-github]: https://github.com/hadley/tidyverse [dplyr-web]: https://dplyr.tidyverse.org [dplyr-cran]: https://CRAN.R-project.org/package=dplyr [dplyr-github]: https://github.com/hadley/dplyr [dplyr-vignette-intro]: https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html [dplyr-vignette-window-fxns]: https://cran.r-project.org/web/packages/dplyr/vignettes/window-functions.html [dplyr-vignette-two-table]: https://dplyr.tidyverse.org/articles/two-table.html [lubridate-web]: https://lubridate.tidyverse.org [lubridate-cran]: https://CRAN.R-project.org/package=lubridate [lubridate-github]: https://github.com/tidyverse/lubridate [lubridate-vignette]: https://cran.r-project.org/web/packages/lubridate/vignettes/lubridate.html [tidyr-web]: https://tidyr.tidyverse.org [tidyr-cran]: https://CRAN.R-project.org/package=tidyr [readr-web]: https://readr.tidyverse.org [readr-vignette-intro]: https://cran.r-project.org/web/packages/readr/vignettes/readr.html [stringr-web]: https://stringr.tidyverse.org [stringr-cran]: https://CRAN.R-project.org/package=stringr [ggplot2-web]: https://ggplot2.tidyverse.org [ggplot2-tutorial]: https://github.com/jennybc/ggplot2-tutorial [ggplot2-reference]: https://docs.ggplot2.org/current/ [ggplot2-cran]: https://CRAN.R-project.org/package=ggplot2 [ggplot2-github]: https://github.com/tidyverse/ggplot2 [ggplot2-theme-args]: https://ggplot2.tidyverse.org/reference/ggtheme.html#arguments [gapminder-web]: https://www.gapminder.org 
[gapminder-cran]: https://CRAN.R-project.org/package=gapminder [assertthat-cran]: https://CRAN.R-project.org/package=assertthat [assertthat-github]: https://github.com/hadley/assertthat [ensurer-cran]: https://CRAN.R-project.org/package=ensurer [ensurer-github]: https://github.com/smbache/ensurer [assertr-cran]: https://CRAN.R-project.org/package=assertr [assertr-github]: https://github.com/ropensci/assertr [assertive-cran]: https://CRAN.R-project.org/package=assertive [assertive-bitbucket]: https://bitbucket.org/richierocks/assertive/src/master/ [testthat-cran]: https://CRAN.R-project.org/package=testthat [testthat-github]: https://github.com/r-lib/testthat [testthat-web]: https://testthat.r-lib.org [viridis-cran]: https://CRAN.R-project.org/package=viridis [viridis-github]: https://github.com/sjmgarnier/viridis [viridis-vignette]: https://cran.r-project.org/web/packages/viridis/vignettes/intro-to-viridis.html [colorspace-cran]: https://CRAN.R-project.org/package=colorspace [colorspace-vignette]: https://cran.r-project.org/web/packages/colorspace/vignettes/hcl-colors.pdf [cowplot-cran]: https://CRAN.R-project.org/package=cowplot [cowplot-github]: https://github.com/wilkelab/cowplot [cowplot-vignette]: https://cran.r-project.org/web/packages/cowplot/vignettes/introduction.html [devtools-cran]: https://CRAN.R-project.org/package=devtools [devtools-github]: https://github.com/r-lib/devtools [devtools-web]: https://devtools.r-lib.org [devtools-cheatsheet]: https://www.rstudio.com/wp-content/uploads/2015/03/devtools-cheatsheet.pdf [devtools-cheatsheet-old]: https://rawgit.com/rstudio/cheatsheets/master/package-development.pdf [devtools-1-6]: https://blog.rstudio.com/2014/10/02/devtools-1-6/ [devtools-1-8]: https://blog.rstudio.com/2015/05/11/devtools-1-9-0/ [devtools-1-9-1]: https://blog.rstudio.com/2015/09/13/devtools-1-9-1/ [googlesheets-cran]: https://CRAN.R-project.org/package=googlesheets [googlesheets-github]: https://github.com/jennybc/googlesheets 
[tidycensus-cran]: https://CRAN.R-project.org/package=tidycensus [tidycensus-github]: https://github.com/walkerke/tidycensus [tidycensus-web]: https://walkerke.github.io/tidycensus/index.html [fs-web]: https://fs.r-lib.org/index.html [fs-cran]: https://CRAN.R-project.org/package=fs [fs-github]: https://github.com/r-lib/fs [plumber-web]: https://www.rplumber.io [plumber-docs]: https://www.rplumber.io/docs/ [plumber-github]: https://github.com/trestletech/plumber [plumber-cran]: https://CRAN.R-project.org/package=plumber [plyr-web]: http://plyr.had.co.nz [magrittr-web]: https://magrittr.tidyverse.org [forcats-web]: https://forcats.tidyverse.org [glue-web]: https://glue.tidyverse.org [stringi-cran]: https://CRAN.R-project.org/package=stringi [rex-github]: https://github.com/kevinushey/rex [rcolorbrewer-cran]: https://CRAN.R-project.org/package=RColorBrewer [dichromat-cran]: https://CRAN.R-project.org/package=dichromat [rdryad-web]: https://docs.ropensci.org/rdryad/ [rdryad-cran]: https://CRAN.R-project.org/package=rdryad [rdryad-github]: https://github.com/ropensci/rdryad [roxygen2-cran]: https://CRAN.R-project.org/package=roxygen2 [roxygen2-vignette]: https://cran.r-project.org/web/packages/roxygen2/vignettes/rd.html [shinythemes-web]: https://rstudio.github.io/shinythemes/ [shinythemes-cran]: https://CRAN.R-project.org/package=shinythemes [shinyjs-web]: https://deanattali.com/shinyjs/ [shinyjs-cran]: https://CRAN.R-project.org/package=shinyjs [shinyjs-github]: https://github.com/daattali/shinyjs [leaflet-web]: https://rstudio.github.io/leaflet/ [leaflet-cran]: https://CRAN.R-project.org/package=leaflet [leaflet-github]: https://github.com/rstudio/leaflet [ggvis-web]: https://ggvis.rstudio.com [ggvis-cran]: https://CRAN.R-project.org/package=ggvis [usethis-web]: https://usethis.r-lib.org [usethis-cran]: https://CRAN.R-project.org/package=usethis [usethis-github]: https://github.com/r-lib/usethis [pkgdown-web]: https://pkgdown.r-lib.org [gh-github]: 
https://github.com/r-lib/gh [httr-web]: https://httr.r-lib.org [httr-cran]: https://CRAN.R-project.org/package=httr [httr-github]: https://github.com/r-lib/httr [gistr-web]: https://docs.ropensci.org/gistr [gistr-cran]: https://CRAN.R-project.org/package=gistr [gistr-github]: https://github.com/ropensci/gistr [rvest-web]: https://rvest.tidyverse.org [rvest-cran]: https://CRAN.R-project.org/package=rvest [rvest-github]: https://github.com/tidyverse/rvest [xml2-web]: https://xml2.r-lib.org [xml2-cran]: https://CRAN.R-project.org/package=xml2 [xml2-github]: https://github.com/r-lib/xml2 [jsonlite-paper]: https://arxiv.org/abs/1403.2805 [jsonlite-cran]: https://CRAN.R-project.org/package=jsonlite [jsonlite-github]: https://github.com/jeroen/jsonlite [readxl-web]: https://readxl.tidyverse.org [readxl-github]: https://github.com/tidyverse/readxl [readxl-cran]: https://CRAN.R-project.org/package=readxl [janitor-web]: http://sfirke.github.io/janitor/ [janitor-cran]: https://CRAN.R-project.org/package=janitor [janitor-github]: https://github.com/sfirke/janitor [purrr-web]: https://purrr.tidyverse.org [curl-cran]: https://CRAN.R-project.org/package=curl <!--Shiny links--> [shinydashboard-web]: https://rstudio.github.io/shinydashboard/ [shinydashboard-cran]: https://CRAN.R-project.org/package=shinydashboard [shinydashboard-github]: https://github.com/rstudio/shinydashboard [shiny-official-web]: https://shiny.rstudio.com [shiny-official-tutorial]: https://shiny.rstudio.com/tutorial/ [shiny-cheatsheet]: https://shiny.rstudio.com/images/shiny-cheatsheet.pdf [shiny-articles]: https://shiny.rstudio.com/articles/ [shiny-bookdown]: https://bookdown.org/yihui/rmarkdown/shiny-documents.html [shiny-google-groups]: https://groups.google.com/forum/#!forum/shiny-discuss [shiny-stack-overflow]: https://stackoverflow.com/questions/tagged/shiny [shinyapps-web]: https://www.shinyapps.io [shiny-server-setup]: https://deanattali.com/2015/05/09/setup-rstudio-shiny-server-digital-ocean/ 
[shiny-reactivity]: https://shiny.rstudio.com/articles/understanding-reactivity.html [shiny-debugging]: https://shiny.rstudio.com/articles/debugging.html [shiny-server]: https://www.rstudio.com/products/shiny/shiny-server/ <!--Publications--> [adv-r]: http://adv-r.had.co.nz [adv-r-fxns]: http://adv-r.had.co.nz/Functions.html [adv-r-dsl]: http://adv-r.had.co.nz/dsl.html [adv-r-defensive-programming]: http://adv-r.had.co.nz/Exceptions-Debugging.html#defensive-programming [adv-r-fxn-args]: http://adv-r.had.co.nz/Functions.html#function-arguments [adv-r-return-values]: http://adv-r.had.co.nz/Functions.html#return-values [adv-r-closures]: http://adv-r.had.co.nz/Functional-programming.html#closures [r4ds]: https://r4ds.had.co.nz [r4ds-transform]: https://r4ds.had.co.nz/transform.html [r4ds-strings]: https://r4ds.had.co.nz/strings.html [r4ds-readr-strings]: https://r4ds.had.co.nz/data-import.html#readr-strings [r4ds-dates-times]: https://r4ds.had.co.nz/dates-and-times.html [r4ds-data-import]: http://r4ds.had.co.nz/data-import.html [r4ds-relational-data]: https://r4ds.had.co.nz/relational-data.html [r4ds-pepper-shaker]: https://r4ds.had.co.nz/vectors.html#lists-of-condiments [r-pkgs2]: https://r-pkgs.org/index.html [r-pkgs2-whole-game]: https://r-pkgs.org/whole-game.html [r-pkgs2-description]: https://r-pkgs.org/description.html [r-pkgs2-man]: https://r-pkgs.org/man.htm [r-pkgs2-tests]: https://r-pkgs.org/tests.html [r-pkgs2-namespace]: https://r-pkgs.org/namespace.html [r-pkgs2-vignettes]: https://r-pkgs.org/vignettes.html [r-pkgs2-release]: https://r-pkgs.org/release.html [r-pkgs2-r-code]: https://r-pkgs.org/r.html#r [r-graphics-cookbook]: http://shop.oreilly.com/product/0636920023135.do [cookbook-for-r]: http://www.cookbook-r.com [cookbook-for-r-graphs]: http://www.cookbook-r.com/Graphs/ [cookbook-for-r-multigraphs]: http://www.cookbook-r.com/Graphs/Multiple_graphs_on_one_page_(ggplot2)/ [elegant-graphics-springer]: https://www.springer.com/gp/book/9780387981413 
[testthat-article]: https://journal.r-project.org/archive/2011-1/RJournal_2011-1_Wickham.pdf [worry-about-color]: https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&cad=rja&uact=8&ved=2ahUKEwi0xYqJ8JbjAhWNvp4KHViYDxsQFjABegQIABAC&url=https%3A%2F%2Fwww.researchgate.net%2Fprofile%2FAhmed_Elhattab2%2Fpost%2FPlease_suggest_some_good_3D_plot_tool_Software_for_surface_plot%2Fattachment%2F5c05ba35cfe4a7645506948e%2FAS%253A699894335557644%25401543879221725%2Fdownload%2FWhy%2BShould%2BEngineers%2Band%2BScientists%2BBe%2BWorried%2BAbout%2BColor_.pdf&usg=AOvVaw1qwjjGMd7h_z6TLUjzu7Nb [escaping-rgbland-pdf]: https://eeecon.uibk.ac.at/~zeileis/papers/Zeileis+Hornik+Murrell-2009.pdf [escaping-rgbland-doi]: https://doi.org/10.1016/j.csda.2008.11.033 <!--R Documentation--> [rdocs-extremes]: https://rdrr.io/r/base/Extremes.html [rdocs-range]: https://rdrr.io/r/base/range.html [rdocs-quantile]: https://rdrr.io/r/stats/quantile.html [rdocs-c]: https://rdrr.io/r/base/c.html [rdocs-list]: https://rdrr.io/r/base/list.html [rdocs-lm]: https://rdrr.io/r/stats/lm.html [rdocs-coef]: https://rdrr.io/r/stats/coef.html [rdocs-devices]: https://rdrr.io/r/grDevices/Devices.html [rdocs-ggsave]: https://rdrr.io/cran/ggplot2/man/ggsave.html [rdocs-dev]: https://rdrr.io/r/grDevices/dev.html <!--Wikipedia Links--> [wiki-snake-case]: https://en.wikipedia.org/wiki/Snake_case [wiki-hello-world]: https://en.wikipedia.org/wiki/%22Hello,_world!%22_program [wiki-janus]: https://en.wikipedia.org/wiki/Janus [wiki-nesting-dolls]: https://en.wikipedia.org/wiki/Matryoshka_doll [wiki-pure-fxns]: https://en.wikipedia.org/wiki/Pure_function [wiki-camel-case]: https://en.wikipedia.org/wiki/Camel_case [wiki-mojibake]: https://en.wikipedia.org/wiki/Mojibake [wiki-row-col-major-order]: https://en.wikipedia.org/wiki/Row-_and_column-major_order [wiki-boxplot]: https://en.wikipedia.org/wiki/Box_plot [wiki-brewer]: https://en.wikipedia.org/wiki/Cynthia_Brewer [wiki-vector-graphics]: 
https://en.wikipedia.org/wiki/Vector_graphics [wiki-raster-graphics]: https://en.wikipedia.org/wiki/Raster_graphics [wiki-dry]: https://en.wikipedia.org/wiki/Don%27t_repeat_yourself [wiki-web-scraping]: https://en.wikipedia.org/wiki/Web_scraping [wiki-xpath]: https://en.wikipedia.org/wiki/XPath [wiki-css-selector]: https://en.wikipedia.org/wiki/Cascading_Style_Sheets#Selector <!--Misc. Links--> [split-apply-combine]: https://www.jstatsoft.org/article/view/v040i01 [useR-2014-dropbox]: https://www.dropbox.com/sh/i8qnluwmuieicxc/AAAgt9tIKoIm7WZKIyK25lh6a [gh-pages]: https://pages.github.com [html-preview]: http://htmlpreview.github.io [tj-mahr-slides]: https://github.com/tjmahr/MadR_Pipelines [dataschool-dplyr]: https://www.dataschool.io/dplyr-tutorial-for-faster-data-manipulation-in-r/ [xckd-randall-munroe]: https://fivethirtyeight.com/features/xkcd-randall-munroe-qanda-what-if/ [athena-zeus-forehead]: https://tinyurl.com/athenaforehead [tidydata-lotr]: https://github.com/jennybc/lotr-tidy#readme [minimal-make]: https://kbroman.org/minimal_make/ [write-data-tweet]: https://twitter.com/vsbuffalo/statuses/358699162679787521 [belt-and-suspenders]: https://www.wisegeek.com/what-does-it-mean-to-wear-belt-and-suspenders.htm [research-workflow]: https://www.carlboettiger.info/2012/05/06/research-workflow.html [yak-shaving]: https://seths.blog/2005/03/dont_shave_that/ [yaml-with-csv]: https://blog.datacite.org/using-yaml-frontmatter-with-csv/ [reproducible-examples]: https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example [blog-strings-as-factors]: https://notstatschat.tumblr.com/post/124987394001/stringsasfactors-sigh [bio-strings-as-factors]: https://simplystatistics.org/2015/07/24/stringsasfactors-an-unauthorized-biography [stackexchange-outage]: https://stackstatus.net/post/147710624694/outage-postmortem-july-20-2016 [email-regex]: https://emailregex.com [fix-atom-bug]: https://davidvgalbraith.com/how-i-fixed-atom/ [icu-regex]: 
http://userguide.icu-project.org/strings/regexp [regex101]: https://regex101.com [regexr]: https://regexr.com [utf8-debug]: http://www.i18nqa.com/debug/utf8-debug.html [unicode-no-excuses]: https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/ [programmers-encoding]: http://kunststube.net/encoding/ [encoding-probs-ruby]: https://www.justinweiss.com/articles/3-steps-to-fix-encoding-problems-in-ruby/ [theyre-to-theyre]: https://www.justinweiss.com/articles/how-to-get-from-theyre-to-theyre/ [lubridate-ex1]: https://www.r-exercises.com/2016/08/15/dates-and-times-simple-and-easy-with-lubridate-part-1/ [lubridate-ex2]: https://www.r-exercises.com/2016/08/29/dates-and-times-simple-and-easy-with-lubridate-exercises-part-2/ [lubridate-ex3]: https://www.r-exercises.com/2016/10/04/dates-and-times-simple-and-easy-with-lubridate-exercises-part-3/ [google-sql-join]: https://www.google.com/search?q=sql+join&tbm=isch [min-viable-product]: https://blog.fastmonkeys.com/?utm_content=bufferc2d6e&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer [telescope-rule]: http://c2.com/cgi/wiki?TelescopeRule [unix-philosophy]: http://www.faqs.org/docs/artu/ch01s06.html [twitter-wrathematics]: https://twitter.com/wrathematics [robbins-effective-graphs]: https://www.amazon.com/Creating-Effective-Graphs-Naomi-Robbins/dp/0985911123 [r-graph-catalog-github]: https://github.com/jennybc/r-graph-catalog [google-pie-charts]: https://www.google.com/search?q=pie+charts+suck [why-pie-charts-suck]: https://www.richardhollins.com/blog/why-pie-charts-suck/ [worst-figure]: https://robjhyndman.com/hyndsight/worst-figure/ [naomi-robbins]: http://www.nbr-graphs.com [hadley-github-index]: https://hadley.github.io [scipy-2015-matplotlib-colors]: https://www.youtube.com/watch?v=xAoljeRJ3lU&feature=youtu.be [winston-chang-github]: https://github.com/wch [favorite-rgb-color]: 
https://manyworldstheory.com/2013/01/15/my-favorite-rgb-color/
[stowers-color-chart]: https://web.archive.org/web/20121022044903/http://research.stowers-institute.org/efg/R/Color/Chart/
[stowers-using-color-in-R]: https://www.uv.es/conesa/CursoR/material/UsingColorInR.pdf
[zombie-project]: https://imgur.com/ewmBeQG
[tweet-project-resurfacing]: https://twitter.com/JohnDCook/status/522377493417033728
[rgraphics-looks-tips]: https://blog.revolutionanalytics.com/2009/01/10-tips-for-making-your-r-graphics-look-their-best.html
[rgraphics-svg-tips]: https://blog.revolutionanalytics.com/2011/07/r-svg-graphics.html
[zev-ross-cheatsheet]: http://zevross.com/blog/2014/08/04/beautiful-plotting-in-r-a-ggplot2-cheatsheet-3/
[parker-writing-r-packages]: https://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/
[broman-r-packages]: https://kbroman.org/pkg_primer/
[broman-tools4rr]: https://kbroman.org/Tools4RR/
[leeks-r-packages]: https://github.com/jtleek/rpackages
[build-maintain-r-packages]: https://thepoliticalmethodologist.com/2014/08/14/building-and-maintaining-r-packages-with-devtools-and-roxygen2/
[murdoch-package-vignette-slides]: https://web.archive.org/web/20160824010213/http://www.stats.uwo.ca/faculty/murdoch/ism2013/5Vignettes.pdf
[how-r-searches]: http://blog.obeautifulcode.com/R/How-R-Searches-And-Finds-Stuff/

# Motivation

* Logic (and data...) should live in only one place
* Abstraction and isolation
* Clarity

--

* R is an interpreted, interactive language, so the __process__ for writing functions is as important as the laws and syntax of the language

---

## Load the Gapminder data

[Gapminder](https://www.gapminder.org/about/) is a Swedish foundation that combats misconceptions about global development. This is just a tiny excerpt of the full data available there, containing population, life expectancy, and inflation-adjusted GDP per capita in US dollars.
```r
library(tidyverse)  # provides glimpse(), ggplot(), and filter()
library(gapminder)
glimpse(gapminder)
```

```
## Rows: 1,704
## Columns: 6
## $ country   <fct> "Afghanistan", "Afghanistan", "Afghanistan", …
## $ continent <fct> Asia, Asia, Asia, Asia, Asia, Asia, Asia, Asi…
## $ year      <int> 1952, 1957, 1962, 1967, 1972, 1977, 1982, 198…
## $ lifeExp   <dbl> 28.801, 30.332, 31.997, 34.020, 36.088, 38.43…
## $ pop       <int> 8425333, 9240934, 10267083, 11537966, 1307946…
## $ gdpPercap <dbl> 779.4453, 820.8530, 853.1007, 836.1971, 739.9…
```

---

```r
ggplot(gapminder, aes(x = year, y = lifeExp, color = continent)) +
  geom_line(aes(group = country), alpha = .5) +
  scale_color_brewer(type = 'qual') +
  geom_boxplot(data = filter(gapminder, year %in% seq(from = 1952, to = 2002, by = 20)),
               aes(group = interaction(year, continent)),
               width = 8, outlier.shape = NA, position = 'dodge')
```

<img src="l12-functions_files/figure-html/unnamed-chunk-3-1.png" width="60%" style="display: block; margin: auto;" />

---

## Max - min

You've got a numeric vector (`lifeExp`, `pop`, or `gdpPercap`) and you want to compute the difference between its max and min. Perhaps you want to do this after you slice up the Gapminder data by year, country, continent, or combinations thereof.

---

## Get something that works

First, develop some working .alert[code for interactive use, using a representative input], say `lifeExp`.
R functions that will be useful: `min()`, `max()`, and `range()`.

```r
min(gapminder$lifeExp)
```

```
## [1] 23.599
```

```r
max(gapminder$lifeExp)
```

```
## [1] 82.603
```

```r
range(gapminder$lifeExp)
```

```
## [1] 23.599 82.603
```

---

## Some natural solutions

```r
max(gapminder$lifeExp) - min(gapminder$lifeExp)
```

```
## [1] 59.004
```

```r
range(gapminder$lifeExp)[2] - range(gapminder$lifeExp)[1]
```

```
## [1] 59.004
```

```r
range(gapminder[['lifeExp']])[2] - range(gapminder[['lifeExp']])[1]
```

```
## [1] 59.004
```

```r
diff(range(gapminder$lifeExp))
```

```
## [1] 59.004
```

Internalize this "answer" because our informal testing relies on you noticing departures from this.

---

### Skateboard `\(\gg\)` perfectly formed rear-view mirror

.pull-left[
Build [that skateboard](https://en.wikipedia.org/wiki/Minimum_viable_product) before you build the car or some fancy car part. A limited-but-functioning thing is useful to learn the dimensions of a problem.

This is related to the [Telescope Rule](https://wiki.c2.com/?TelescopeRule):

> It is faster to make a four-inch mirror then a six-inch mirror than to make a six-inch mirror.
]

.pull-right[
<img src="l12/img/spotify-howtobuildmvp.jpg" width="100%" style="display: block; margin: auto;" />

.small[[From your ultimate guide to Minimum Viable Product](https://blog.fastmonkeys.com/2014/06/18/minimum-viable-product-your-ultimate-guide-to-mvp-great-examples). Image attributed to the Spotify team]
]

---

## Turn the working interactive code into a function

Add NO new functionality! Just write your very first R function.

```r
max_minus_min = function(x){
  max(x) - min(x)
}
max_minus_min(gapminder$lifeExp)
```

```
## [1] 59.004
```

Check that you're getting the same answer as you did with your interactive code. Test it eyeball-o-metrically at this point.

---

## Test your function

### Test on new inputs

Pick some new artificial inputs where you know (at least approximately) what your function should return.
```r
max_minus_min(1:10)
```

```
## [1] 9
```

```r
max_minus_min(runif(1000))
```

```
## [1] 0.9989612
```

I know that 10 minus 1 is 9. I know that random uniform [0, 1] variates will be between 0 and 1. Therefore max - min should be less than 1. If I take LOTS of them, max - min should be pretty close to 1.

It is intentional that I tested on integer input as well as floating point. Likewise, I like to use valid-but-random data for this sort of check.

---

### Test on real data but *different* real data

Back to the real world now. Two other quantitative variables are lying around: `gdpPercap` and `pop`. Let's have a go.

```r
max_minus_min(gapminder$gdpPercap)
```

```
## [1] 113282
```

```r
max_minus_min(gapminder$pop)
```

```
## [1] 1318623085
```

Either check these results "by hand" or apply the "does that even make sense?" test.

---

### Test on weird stuff

Don't get truly diabolical (yet). Just make the kind of mistakes you can imagine making at 2am when, 3 years from now, you rediscover this useful function you wrote.

```r
max_minus_min(gapminder) ## hey sometimes things "just work" on data.frames!
```

```
## Error in FUN(X[[i]], ...): only defined on a data frame with all numeric-alike variables
```

```r
max_minus_min(gapminder$country) ## factors are kind of like integer vectors, no?
```

```
## Error in Summary.factor(structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, : 'max' not meaningful for factors
```

```r
max_minus_min("eggplants are purple") ## i have no excuse for this one
```

```
## Error in max(x) - min(x): non-numeric argument to binary operator
```

---

How happy are you with those error messages? You must imagine that some entire __script__ has failed and that you were hoping to just knit it without re-reading it. If a colleague or future you encountered these errors, how hard is it to pinpoint the usage problem?
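---

### Informal tests, written down

The eyeball checks from the last few slides can also be written as assertions, so that a script stops when the function drifts from the expected answers. This is only a minimal sketch using base R: the seed and the 0.9 threshold are arbitrary choices of mine, not part of the original checks, and `msg` is just an illustrative name.

```r
# Same definition as on the earlier slide
max_minus_min = function(x) {
  max(x) - min(x)
}

# Known answer: 10 minus 1 is 9
stopifnot(max_minus_min(1:10) == 9)

# Uniform [0, 1] variates: the spread is below 1 and, with many draws, close to 1
set.seed(430)  # arbitrary seed, only so the check is repeatable
spread = max_minus_min(runif(1000))
stopifnot(spread < 1, spread > 0.9)  # 0.9 is an assumed "close enough" cutoff

# Capture the error message instead of letting it stop the script
msg = tryCatch(
  max_minus_min("eggplants are purple"),
  error = function(e) conditionMessage(e)
)
msg
```

Capturing the message with `tryCatch()` is one way to read what a colleague would see without halting a knit.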
---

### Application exercise

Browse to [rstudio cloud](https://rstudio.cloud/spaces/162296/project/3099211) and take a few minutes with a partner to try to break `max_minus_min`.

---

### I will scare you now

Here are some great examples that students in UBC's STAT545 devised, where the function __should break but does not.__

```r
max_minus_min(gapminder[c('lifeExp', 'gdpPercap', 'pop')])
```

```
## [1] 1318683072
```

```r
max_minus_min(c(TRUE, TRUE, FALSE, TRUE, TRUE))
```

```
## [1] 1
```

In the first case, a data.frame containing just the quantitative variables is eventually coerced into a numeric vector. The second case is less odious: a logical vector is converted to zeroes and ones.

---

## Check the validity of arguments

For functions that will be used again -- which is not all of them! -- it is good to check the validity of arguments. This implements a rule from [the Unix philosophy](https://homepage.cs.uri.edu/~thenry/resources/unix_art/ch01s06.html):

> Rule of Repair: When you must fail, fail noisily and as soon as possible.

---

### `stopifnot()`

I use it here to make sure the input `x` is a numeric vector.

```r
max_minus_min = function(x) {
  stopifnot(is.numeric(x))
  max(x) - min(x)
}
max_minus_min(gapminder)
```

```
## Error in max_minus_min(gapminder): is.numeric(x) is not TRUE
```

```r
max_minus_min(gapminder$country)
```

```
## Error in max_minus_min(gapminder$country): is.numeric(x) is not TRUE
```

```r
max_minus_min("eggplants are purple")
```

```
## Error in max_minus_min("eggplants are purple"): is.numeric(x) is not TRUE
```

---

```r
max_minus_min(gapminder[c('lifeExp', 'gdpPercap', 'pop')])
```

```
## Error in max_minus_min(gapminder[c("lifeExp", "gdpPercap", "pop")]): is.numeric(x) is not TRUE
```

```r
max_minus_min(c(TRUE, TRUE, FALSE, TRUE, TRUE))
```

```
## Error in max_minus_min(c(TRUE, TRUE, FALSE, TRUE, TRUE)): is.numeric(x) is not TRUE
```

And we see that it catches all of the self-inflicted damage we would like to avoid.
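---

### Which inputs does the check reject?

To see the `is.numeric()` gate at work without loading Gapminder, here is a small base-R sketch. The helper `fails()` and the data frame `toy` are hypothetical names and made-up values introduced only for this illustration.

```r
max_minus_min = function(x) {
  stopifnot(is.numeric(x))
  max(x) - min(x)
}

# Hypothetical helper: TRUE if max_minus_min(x) raises an error, FALSE otherwise
fails = function(x) {
  tryCatch({
    max_minus_min(x)
    FALSE
  }, error = function(e) TRUE)
}

# Made-up stand-in for a data frame holding only quantitative columns
toy = data.frame(lifeExp = c(28.8, 82.6), gdpPercap = c(779.4, 113523.1))

checks = c(
  data_frame = fails(toy),
  logical    = fails(c(TRUE, FALSE, TRUE)),
  factor     = fails(factor(c("a", "b"))),
  character  = fails("eggplants are purple"),
  integer    = fails(1:10),
  double     = fails(c(0.5, 2.5))
)
checks
```

The first four entries should come back `TRUE` (rejected) and the last two `FALSE` (accepted), since `is.numeric()` is only `TRUE` for integer and double vectors here.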
---

### if then stop

`stopifnot()` doesn't provide a very good error message. The next approach is very widely used. Put your validity check inside an `if()` statement and call `stop()` yourself, with a custom error message, in the body.

```r
max_minus_min = function(x) {
  if(!is.numeric(x)) {
    stop('I am so sorry, but this function only works for numeric input!\n',
         'You have provided an object of class: ', class(x)[1])
  }
  max(x) - min(x)
}
max_minus_min(gapminder)
```

```
## Error in max_minus_min(gapminder): I am so sorry, but this function only works for numeric input!
## You have provided an object of class: tbl_df
```

---

## Error checking

In addition to a gratuitous apology, the error raised also contains two more pieces of helpful info:

* *Which* function threw the error.
* Hints on how to fix things: expected class of input vs actual class.

If it is easy to do so, I highly recommend this template: "you gave me THIS, but I need THAT".

The tidyverse style guide has a very useful [chapter on how to construct error messages](https://style.tidyverse.org/error-messages.html).

---

### `if()`: a first taste of flow control

```
if( <BOOLEAN 1> ){
  <STATEMENT 1>
} else if ( <BOOLEAN 2> ) {
  <STATEMENT 2>
} ... # else if
else{
  <FINAL STATEMENT>
}
```

This lets us alter the program's behavior depending on conditions. We'll see other examples of flow control when we discuss iteration with `for`, `while`, and `repeat`.

---

```r
life = function(condition) {
  good_things = c("skiing", "cats", "health", "wilderness", "coffee")
  if (sum(condition %in% good_things) >= 3) {
    condition = work_without_question(condition)
  } else{
    condition = life_crisis(condition)
  }
}
```

```r
life(c("cats", "health"))
```

```
## Warning in life_crisis(condition): Joining a commune.
```

```r
life(c("cats", "skiing", "health"))
```

```
## [1] "Good capitalistic worker bee 🍯"
```

---

### `if` vs `ifelse`/`if_else`/`case_when`

* Use `if_else`/`ifelse` or `case_when` to selectively alter portions of a vector
* Use `if` to change program logic depending on a **length one boolean**
  - a condition of length > 1 triggers a warning in older versions of R (an error since R 4.2.0); treat it as an error either way
* Rule of thumb: use `case_when`/`if_else`/`ifelse` in a `mutate` statement. Otherwise think.

---

## Assertions: not just for functions!

Also use `stopifnot` and other types of .alert[assertions] for checks in data analysis scripts! E.g.:

- No duplicates in primary keys: `stopifnot(all(!duplicated(data$primary_key)))`
- Data are of the correct type (numeric not character): `stopifnot(is.numeric(data$column))`
- Same number of rows after joining on primary keys: `stopifnot(nrow(data)==nrow(data_join))`

---

class: middle

.hand[Another example of an R function]

---

## Read the page for the 23 Oct speech

```r
library(rvest)  # provides read_html(), html_node(), html_nodes(), html_text()
url = "https://www.gov.scot/publications/coronavirus-covid-19-update-first-ministers-speech-23-october/"
speech_page = read_html(url)
```

```r
speech_page
```

```
## {html_document}
## <html dir="ltr" lang="en">
## [1] <head>\n<meta http-equiv="Content-Type" content="text/html ...
## [2] <body class="fontawesome site-header__container">\n\n\n\n\ ...
```

---

## Extract components of the 23 Oct speech

```r
library(lubridate)  # provides dmy()

title = speech_page %>%
  html_node(".article-header__title") %>%
  html_text()

date = speech_page %>%
  html_node(".content-data__list:nth-child(1) strong") %>%
  html_text() %>%
  dmy()

location = speech_page %>%
  html_node(".content-data__list+ .content-data__list strong") %>%
  html_text()

abstract = speech_page %>%
  html_node(".leader--first-para p") %>%
  html_text()

text = speech_page %>%
  html_nodes("#preamble p") %>%
  html_text() %>%
  list()
```

---

## Put it all in a data frame

.pull-left[

```r
oct_23_speech = tibble(
  title = title,
  date = date,
  location = location,
  abstract = abstract,
  text = text,
  url = url
)
oct_23_speech
```

```
## # A tibble: 1 × 6
##   title       date       location   abstract     text  url
##   <chr>       <date>     <chr>      <chr>        <lis> <chr>
## 1 Coronaviru… 2020-10-23 St Andrew… Statement g… <chr… https://w…
```

]

.pull-right[
<img src="l12/img/fm-speech-oct-23.png" width="75%" style="display: block; margin: auto;" />
]

---

## When should you write a function?

--

.pull-left[
<img src="l12/img/funct-all-things.png" width="100%" style="display: block; margin: auto;" />
]

--

.pull-right[
When you’ve copied and pasted a block of code more than twice.
]

---

## Turn your code into a function

- Pick a short but informative **name**, preferably a verb.

<br>
<br>
<br>
<br>

```r
scrape_speech =
```

---

## Turn your code into a function

- Pick a short but informative **name**, preferably a verb.
- List inputs, or **arguments**, to the function inside `function`. If we had more than one, the call would look like `function(x, y, z)`.

<br>

```r
scrape_speech = function(x){

}
```

---

## Turn your code into a function

- Pick a short but informative **name**, preferably a verb.
- List inputs, or **arguments**, to the function inside `function`. If we had more than one, the call would look like `function(x, y, z)`.
- Place the **code** you have developed in the body of the function, a `{` block that immediately follows `function(...)`.
```r
scrape_speech = function(url){

  # code we developed earlier to scrape info
  # on a single speech goes here

}
```

---

class: code50

## `scrape_speech()`

```r
scrape_speech = function(url) {

  speech_page = read_html(url)

  title = speech_page %>%
    html_node(".article-header__title") %>%
    html_text()

  date = speech_page %>%
    html_node(".content-data__list:nth-child(1) strong") %>%
    html_text() %>%
    dmy()

  location = speech_page %>%
    html_node(".content-data__list+ .content-data__list strong") %>%
    html_text()

  abstract = speech_page %>%
    html_node(".leader--first-para p") %>%
    html_text()

  text = speech_page %>%
    html_nodes("#preamble p") %>%
    html_text() %>%
    list()

  tibble(
    title = title, date = date, location = location,
    abstract = abstract, text = text, url = url
  )
}
```

---

class: middle

# Writing functions

---

## What goes in / what comes out?

- Functions take input(s) defined in the function definition

```r
function([inputs separated by commas]){
  # what to do with those inputs
}
```

- By default they return the last value computed in the function

```r
scrape_page = function(x){
  # do a bunch of stuff with the input...

  # return a tibble
  tibble(...)
}
```

- You can define more outputs to be returned in a list as well as nice print methods (but we won't go there for now...)

---

.question[
What is going on here?
]

```r
add_2 = function(x){
  x + 2
  1000
}
```

```r
add_2(3)
```

```
## [1] 1000
```

```r
add_2(10)
```

```
## [1] 1000
```

---

## Naming functions

> "There are only two hard things in Computer Science: cache invalidation and naming things." - Phil Karlton

---

## Naming functions

- Names should be short but clearly evoke what the function does

--

- Names should be verbs, not nouns

--

- Multi-word names should be separated by underscores (`snake_case` as opposed to `camelCase`)

--

- A family of functions should be named similarly (`scrape_page()`, `scrape_speech()` OR `str_remove()`, `str_replace()` etc.)
--

- Avoid overwriting existing (especially widely used) functions

```r
# JUST DON'T
mean = function(x){
  x * 3
}
```

---

## Resources / reading

Reading: the [R4DS chapter on functions](https://r4ds.had.co.nz/functions.html) and the [Advanced R][adv-r] section on [defensive programming][adv-r-defensive-programming].

Packages for runtime assertions (the last 3 seem to be under more active development than `assertthat`):

* assertthat on [CRAN][assertthat-cran] and [GitHub][assertthat-github] - *the Tidyverse option*
* ensurer on [CRAN][ensurer-cran] and [GitHub][ensurer-github] - *general purpose, pipe-friendly*
* assertr on [CRAN][assertr-cran] and [GitHub][assertr-github] - *explicitly data pipeline oriented*
* assertive on [CRAN][assertive-cran] and [Bitbucket][assertive-bitbucket] - *rich set of built-in functions*

---

# Acknowledgments

Adapted from Jenny Bryan's STAT545 https://stat545.com/functions-part1.html and [Data Science in a Box](https://rstudio-education.github.io/datascience-box/course-materials/slides/u2-d21-functions/u2-d21-functions.html#1)