The goal of the DSPG package is to provide access to commonly used publicly available data for the DSPG projects at ISU.

Installation

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("DSPG-ISU/DSPG")

The above code might not work as supposed to because the repo is private. If the installation fails with an error along the lines that the repo doesn’t exist, make sure to clone the repo onto your machine. Unzip the folder. Open DSPG.Rpoj in RStudio (by double-clicking the file). Install the package using the command

# install.packages("devtools")
devtools::install()

Example

This is a basic example which shows you how to solve a common problem:

library(DSPG)

All of the data sets have an extensive example section with code that should help you get started.

The datasets available are

#>  [1] "acs"                          "asac_locations"              
#>  [3] "cf_resources"                 "churches"                    
#>  [5] "colleges"                     "cross_mental_health"         
#>  [7] "gambling"                     "health.clinics"              
#>  [9] "hospital_buildings"           "hospitals"                   
#> [11] "ia_cities"                    "ia_counties"                 
#> [13] "ia_election_2016"             "ia_features"                 
#> [15] "ia_precincts"                 "iowa_211"                    
#> [17] "iowaworks"                    "mat_locations"               
#> [19] "meetings"                     "parks"                       
#> [21] "regional_MHDS"                "regional_substance_treatment"
#> [23] "Rx_Drop_Off_Locations"        "scbhr_mhds"                  
#> [25] "southwest_mhds"               "sud"

Check each file with the R help, e.g. ?ia_cities, for more details.

How to contribute

This repository is organized in form of an R package. The structure of folders and files is sensitive, so be sure to first read the following instructions before making any changes (more on data in R packages):

Assume you have some csv file (or some other data object) that you want to include:

  1. Go to the folder stuff. Take a look at one of the .R files and copy the one that is the closest to what you would like to do. Let’s assume you call that file reading-my-file.R.

  2. Modify the file reading-my-file.R to read the csv file. Check that the resulting object looks like you expected it to look like. Make sure that your data set adheres to the naming conventions listed below.

  3. The name of the object will be the name of the data set that will be included. Do not use a name of an object that is already included in the package. Let’s assume that you called the object mydata.

  4. Use the command usethis::use_data(mydata). This creates a file mydata.rda in the data folder of the package. Check that it is there now.

  5. Now the documentation starts: open the file data.R in the folder R. We are using Roxygen2 for documentation. Take a look at the structure: each dataset has a separate section. The easiest will be to copy one of these sections and adjust for your own dataset. Make sure to describe each variable in the dataset; follow along the same order as in the data set. Use names(mydata) to double check on the order of the variables.

  6. Almost done! Use the command devtools::document() to make sure that the documentation worked. This command creates the file mydata.Rd in the man folder.

  7. Check your package with the command devtools::check(). This command will run a series of tests and should result in 0 errors, 0 warnings, and 0 notes. If that is not the case, make sure to address each one of these issues.

  8. Add all newly created files to your github repo: the .rda file, the .Rd file and the .R file in stuff. Push to the master.

  9. The package is placed under Travis-CI, i.e. on the server side a check on the package will be run. In case of an error during this check, an email will be sent to you with the log resulting in the error. Make sure to address the error and push changes to fix the repo again.

Naming conventions

We are trying to ensure maximum consistency between data sets and use the following conventions:

  1. lower case for all variable names
  2. name describes the name of a facility/organization/meeting
  3. longitude, latitude or geometry write out geographic locations as longitude and latitude
  4. use _ to separate between multiple words in a name