Registers managed by the Swedish regional cancer centres (RCC), i.e. the quality registers and the cancer register, store date variables in different formats. This package can recognise and handle such dates.
library(incadata)
RCC dates are usually in the form %Y-%m-%d, such as “2016-06-17”. These are recognised by ordinary R functions such as as.Date, provided there are no missing values or missing values are coded as NA. It is, however, common for RCC data that missing dates are coded as empty strings. Then:
d <- c("", "2016-06-17")
as.Date(d)
## Error in charToDate(x): character string is not in a standard unambiguous format
The as.Dates function (note the plural) might then be easier to use.
as.Dates(d)
## [1] NA "2016-06-17"
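For comparison, a base R workaround without the package is to recode empty strings as NA before calling as.Date. This is only a minimal sketch of that idea, not a description of how as.Dates works internally:

```r
# Base R only: recode empty strings as NA, then parse with an explicit format
d <- c("", "2016-06-17")
d[!nzchar(d)] <- NA                    # "" becomes NA_character_
parsed <- as.Date(d, format = "%Y-%m-%d")
parsed
```

Passing an explicit format also means that unparseable values become NA instead of raising an error.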
The original motivation for the package was to handle old date variables from the cancer register. Days and even months are sometimes coded as “00” (unknown). In that case, as.Dates (note the plural) might still recognise the date and will replace “00” by an approximate date:
as.Date("2000-01-00") # as.Date fails!
## Error in charToDate(x): character string is not in a standard unambiguous format
as.Dates("2000-01-00") # Missing day
## [1] "2000-01-15"
as.Dates("2000-00-00") # Missing month and day
## [1] "2000-07-01"
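The replacement rule seen in these outputs (unknown day becomes the 15th, unknown month and day become 1 July) can be imitated in base R. The helper approx_date below is hypothetical and not part of incadata; it is a sketch that reproduces the outputs above, and the package's own implementation may differ:

```r
# Hypothetical helper (not part of incadata): approximate dates with "00" parts
approx_date <- function(x) {
  x <- sub("-00-00$", "-07-01", x)   # unknown month and day -> middle of the year
  x <- sub("-00$", "-15", x)         # unknown day -> middle of the month
  as.Date(x, format = "%Y-%m-%d")    # unparseable values become NA
}

approx <- approx_date(c("2000-01-00", "2000-00-00", "2000-01-01"))
approx
# [1] "2000-01-15" "2000-07-01" "2000-01-01"
```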
Some old dates might also be in the format %Y%V (see ?strptime), such as “7403” for week 3 in 1974. This is tricky for several reasons, one being that as.Date does not recognise the format:
as.Date("7403")
## Error in charToDate(x): character string is not in a standard unambiguous format
as.Dates("7403")
## [1] "1974-01-17"
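To illustrate what such a conversion involves, the sketch below turns a year-week string into the Thursday of the corresponding ISO 8601 week using base R only. The helper week_to_date is hypothetical and not part of incadata. It assumes that two-digit years belong to the 1900s and that the middle of the week (Thursday) is a reasonable approximation; the result agrees with the output above, but the package may compute it differently:

```r
# Hypothetical helper (not part of incadata): "yyww" -> Thursday of that ISO week
week_to_date <- function(x, century = 1900L) {
  year <- century + as.integer(substr(x, 1, 2))   # assumption: years are 19yy
  week <- as.integer(substr(x, 3, 4))
  jan4 <- as.Date(sprintf("%d-01-04", year))      # 4 January is always in ISO week 1
  monday_w1 <- jan4 - (as.integer(format(jan4, "%u")) - 1)  # Monday of week 1
  monday_w1 + 7 * (week - 1) + 3                  # Thursday of the requested week
}

wk <- week_to_date("7403")
wk
# [1] "1974-01-17"
```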
It is also possible to have a mixture of different dates within the same vector:
as.Dates(c("", NA, "2000-01-01", "20000101", "20000000", "7403"))
## [1] NA NA "2000-01-01" "2000-01-01" "2000-07-01"
## [6] "1974-01-17"