It is highly recommended to use R as well as RStudio. RStudio is a separate piece of software used to write and run R code.
This will only need to be run once, ever. It will install the necessary packages and functions to your computer.
|id||Unique identifier for each person|
|gender||Sex of person (“M” = male / “F” = female)||numeric|
|race||Race of person (“W” = white / “N” = nonwhite)||numeric|
|dob||date of birth||character|
|pybegin||date to begin follow-up||character|
|dlo||date last observed. Minimum of end of study, date of death, date lost to follow-up||character|
|vs||an indicator of ‘D’ for those who are deceased. All other values will be treated as alive or censored as of dlo.||character|
|code||ICD code of death; missing if person is censored or alive at study end date||character|
|rev||revision of ICD code (7-10); missing if person is censored or alive at study end date||numeric|
Note: R variable names are case sensitive! All variable names must be lower case. This can be done in any software. Below is an example person file format:
NOTE: Paths in R use forward slashes (/) instead of the standard back slashes () used by windows.
The date columns will need to be converted to R date values. This can be done using the as.Date() function. This function requires the user to specify the format of the dates being converted. Below gives a key in how to describe the date format:
So, the above date columns in the above example person file have the
general format of “%m/%d/%Y”.
Below gives example code to read in the person file as well as code to convert the date columns to date values:
library(LTASR) library(tidyverse) person <- read_csv('C:/person.csv') %>% mutate(dob = as.Date(dob, format='%m/%d/%Y'), pybegin = as.Date(pybegin, format='%m/%d/%Y'), dlo = as.Date(dlo, format='%m/%d/%Y'))
The LTASR package comes with an example person file, called person_example, that can be used for testing. For this example, it will be read in, its dates will be converted and will be saved as ‘person’ similar to the above code:
person <- person_example %>% mutate(dob = as.Date(dob, format='%m/%d/%Y'), pybegin = as.Date(pybegin, format='%m/%d/%Y'), dlo = as.Date(dlo, format='%m/%d/%Y'))
rateobj <- parseRate('C:/UCOD_WashingtonState012018.xml')
Alternatively, the LTASR package comes pre-installed with a rate object for the 119 LTAS minors of the US population from 1960-2020. This can be used in any analyses, but must be explicitly called first:
rateobj <- us_119ucod_19602020
py_table <- get_table(person, rateobj)
Once the py_table has been created, it can be saved using the write_csv() function. This file contains person day counts (pdays) and outcome counts for each minor (_o1, _o2, …..).
With your stratified data, you can quickly create a table with of all SMRs for each minor found in the rate file using the smr_minor() function as below:
smr_minor_table <- smr_minor(py_table, rateobj)
View(smr_minor_table) write_csv(smr_minor_table, "C:/SMR_Minors.csv")
smr_major_table <- smr_major(smr_minor_table, rateobj) View(smr_major_table) write_csv(smr_major_table, "C:/SMR_Majors")
Additionally, any custom grouping can be calculated as well using the smr_custom() function where grouping are defined as a vector of numbers. Below defines groupings for all deaths and all cancers:
minor_grouping <- 1:119 all_causes <- smr_custom(smr_minor_table, minor_grouping) View(all_causes) minor_grouping <- 4:40 all_cancers <- smr_custom(smr_minor_table, minor_grouping) View(all_cancers)