| Title: | Analyzing Population-Based Cancer Registry Data |
|---|---|
| Description: | Tools for cleaning, analyzing, visualizing, and reporting data from Population-Based Cancer Registries (PBCRs), with standardized workflows for filtering, selecting, and restructuring data, designed for routine registry operations. |
| Authors: | Qiong Chen [aut, cre, cph] (ORCID: <https://orcid.org/0000-0003-2401-0046>, affiliation: Department of Cancer Epidemiology, The Affiliated Cancer Hospital of Zhengzhou University & Henan Cancer Hospital, Henan Cancer Registry in China) |
| Maintainer: | Qiong Chen <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.2.10 |
| Built: | 2026-06-04 09:07:14 UTC |
| Source: | https://github.com/gigu003/canregtools |
add_labels() adds labels for selected variables such as sex, cancer,
and areacode in the dataset based on the specified language and label type.
add_labels(x, vars = c("sex", "cancer", "areacode"), names = NULL, label_type = "full", lang = "zh", sep = " ", as_factor = TRUE)

| x | A data frame or tibble to be labeled. |
|---|---|
| vars | A character vector of variable names to label. Default includes "sex", "cancer", and "areacode". |
| names | A character vector that contains the labels added. |
| label_type | Type of label to use ("full" or "abbr"). |
| lang | Character string specifying the output language. The usage and examples use "zh" and "en"; the default shown in the usage is "zh". |
| sep | String used to separate labels when multiple languages are specified. |
| as_factor | Logical; whether to return the labeled values as factors. |

A data frame or tibble with labeled variables and reordered columns.
data("canregs")
asr <- create_asr(canregs[[1]], year, sex, cancer)
asr <- add_labels(asr, label_type = "full", lang = "zh")
asr <- add_labels(asr, label_type = "full", lang = "en")
Returns formatted labels (with optional units) for commonly used statistical variables in cancer registry reporting, such as incidence, mortality, crude rate, age-standardized rate, and proportion.
add_var_labels(x, label_type = "abbr", lang = "cn", break_line = TRUE)

| x | A character vector of variable codes to be labeled (e.g., |
|---|---|
| label_type | Label style: |
| lang | Language for the labels. One of |
| break_line | Logical. If |

This function looks up the corresponding label for each input variable code
in the internal dictionary tidy_var_maps[["stats"]], and optionally adds
a unit suffix (e.g., (1/10<sup>5</sup>) or (%)) depending on the variable type.
It supports Chinese and English, full names or abbreviations, and can format
the unit with or without a line break.
A character vector of the same length as x, where each element is a formatted label.
add_var_labels(c("cr", "asr"))
add_var_labels(c("cr", "prop"), label_type = "full", lang = "en")
add_var_labels("asr", break_line = FALSE)
This function calculates the age-standardized rate (ASR) using the direct method of standardization. It allows for adjustment of crude disease or mortality rates to a standard population structure, enabling valid comparisons across populations with different age distributions. The function also supports computation of confidence intervals using gamma, normal, or log-normal methods, and returns both crude rate (CR) and ASR with associated variances and interval estimates.
ageadjust(count, pop, rate = NULL, stdpop = NULL, method = "gamma", conf_level = 0.95, mp = 1e+05)

| count | The number of cases of a specific disease or condition. |
|---|---|
| pop | The total population of the group or region in which the cases (count) were observed. |
| rate | Disease rate, that is, the number of cases (count) per unit of population (pop). |
| stdpop | Standard population used for age standardization. |
| method | Method used to calculate the confidence interval of the age-standardized rate; options are 'gamma', 'normal', or 'lognormal'. Default is 'gamma'. |
| conf_level | Confidence level for the confidence intervals, a value between 0 and 1. Default is 0.95. |
| mp | A multiplier used to scale the calculated rates. Default is 100000. |

Age standardized rate and its confidence interval.
cases <- c(50, 60, 45, 70)
pop <- c(1000, 1200, 1100, 900)
spop <- c(800, 1000, 1100, 900)
ageadjust(cases, pop, stdpop = spop, mp = 100000)
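To make the direct method concrete, the following base-R sketch recomputes the crude and age-standardized rates for the example data by hand. It is an illustration under stated assumptions, not the package's implementation: `direct_asr()` is a hypothetical helper, and it uses only the normal-approximation interval, whereas `ageadjust()` defaults to the gamma method.

```r
# Hypothetical helper illustrating direct standardization in base R.
# Assumption: Poisson counts and a normal-approximation confidence interval.
direct_asr <- function(count, pop, stdpop, mp = 1e5, conf_level = 0.95) {
  rate <- count / pop                    # age-specific rates
  w <- stdpop / sum(stdpop)              # standard-population weights
  cr <- sum(count) / sum(pop)            # crude rate
  asr <- sum(w * rate)                   # directly standardized rate
  asr_var <- sum(w^2 * count / pop^2)    # variance of the standardized rate
  z <- qnorm(1 - (1 - conf_level) / 2)
  c(
    cr = cr * mp,
    asr = asr * mp,
    lower = (asr - z * sqrt(asr_var)) * mp,
    upper = (asr + z * sqrt(asr_var)) * mp
  )
}

cases <- c(50, 60, 45, 70)
pop <- c(1000, 1200, 1100, 900)
spop <- c(800, 1000, 1100, 900)
res <- direct_asr(cases, pop, spop)
print(round(res, 2))
```

The standardized rate is a weighted mean of the age-specific rates, so it changes only when the weights (the standard population) or the age-specific rates change, not when the observed age structure shifts.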
Computes the exact age in full years at the time of an event (e.g., diagnosis, death, or survey) by comparing a person's birth date to the date of the event.
calc_age(birth_date, onset_date)

| birth_date | A vector of birth dates in |
|---|---|
| onset_date | A vector of corresponding event dates in |

This function calculates age in completed years, taking into account
whether the birthday has occurred before the event date in the given year.
If birth_date or onset_date contains NA,
the corresponding result will be NA.
A numeric vector of ages in years, with the same length
as birth_date and onset_date.
# Generate random birth dates
set.seed(123)
sdate <- as.Date("1960-01-01")
edate <- as.Date("1980-12-31")
bdate <- sample(seq(sdate, edate, by = "1 day"), 100, replace = TRUE)
# Generate random event dates
sdate <- as.Date("2020-01-01")
edate <- as.Date("2023-07-08")
event <- sample(seq(sdate, edate, by = "1 day"), 100, replace = TRUE)
# Calculate ages
ages <- calc_age(bdate, event)
head(ages)
# Handle missing values
bdate[1] <- NA
event[2] <- NA
calc_age(bdate, event)[1:5]
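The "completed years" rule described above can be sketched in base R as follows. This is a minimal illustration, not the package's code: `completed_years()` is a hypothetical name, and the sketch assumes both inputs are `Date` vectors of equal length.

```r
# Hypothetical base-R sketch of age in completed years.
# Assumption: `birth` and `event` are Date vectors of the same length.
completed_years <- function(birth, event) {
  b <- as.POSIXlt(birth)
  e <- as.POSIXlt(event)
  age <- e$year - b$year
  # Subtract one year if the birthday has not yet occurred in the event year
  before_birthday <- (e$mon < b$mon) | (e$mon == b$mon & e$mday < b$mday)
  age - as.integer(before_birthday)
}

birth <- as.Date(c("1980-06-15", "1980-06-15", NA))
event <- as.Date(c("2020-06-14", "2020-06-15", "2020-01-01"))
ages <- completed_years(birth, event)
print(ages)
```

Missing dates propagate naturally here because every arithmetic and comparison step returns NA when either input is NA.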
calc_ax() calculates the average number of person-years lived in the interval
by those dying in the interval (ax) based on age-specific mortality rates (mx),
starting ages (sage), and sex.
calc_ax(mx, sage, sex = "male")

| mx | Numeric vector of age-specific mortality rates. |
|---|---|
| sage | Numeric vector of starting ages for the age groups. |
| sex | Character string specifying the sex: "male", "female", or "total" (default is "male"). |

For most age groups, ax = n / 2, where n is the width
of the age interval.
For the infant group (age 0), ax is adjusted based on sex and m0:
Male: if m0 >= 0.1, ax=0.33; else 0.045 + 2.684 * m0
Female: if m0 >= 0.1, ax=0.35; else 0.053 + 2.8 * m0
Total: if m0 >= 0.1, ax=0.34; else 0.049 + 2.742 * m0
A numeric vector of ax values.
Coale, A. J., Demeny, P., & Vaughan, B. (1983). Regional Model Life Tables and Stable Populations (2nd ed.). New York: Academic Press.
Preston, S. H., Heuveline, P., & Guillot, M. (2001). Demography: Measuring and Modeling Population Processes. Malden, MA: Blackwell Publishers.
mx <- c(0.05, 0.01, 0.005)
sage <- c(0, 1, 5)
calc_ax(mx, sage, "male")
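The rules above can be written out as a short base-R sketch. It is an illustration only: `ax_sketch()` is a hypothetical name, the midpoint rule ax = n / 2 is assumed for closed non-infant intervals, and the treatment of the open-ended last interval (ax = 1 / mx, a common life-table convention) is an assumption that the package may handle differently.

```r
# Hypothetical sketch of the ax rules; not the package's implementation.
ax_sketch <- function(mx, sage, sex = c("male", "female", "total")) {
  sex <- match.arg(sex)
  n <- c(diff(sage), NA)          # interval widths; the last interval is open
  ax <- n / 2                     # assumed midpoint rule for closed intervals
  last <- length(mx)
  ax[last] <- 1 / mx[last]        # assumption for the open-ended interval
  # Infant coefficients: value when m0 >= 0.1, then intercept and slope
  coef <- switch(sex,
    male = c(0.33, 0.045, 2.684),
    female = c(0.35, 0.053, 2.8),
    total = c(0.34, 0.049, 2.742)
  )
  if (sage[1] == 0) {
    ax[1] <- if (mx[1] >= 0.1) coef[1] else coef[2] + coef[3] * mx[1]
  }
  ax
}

ax <- ax_sketch(c(0.05, 0.01, 0.005), c(0, 1, 5), "male")
print(ax)
```

With m0 = 0.05 for males, the infant value is 0.045 + 2.684 * 0.05 = 0.1792, and the 1 to 4 interval (width 4) gets 2 under the midpoint assumption.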
This dataset provides example data from population-based cancer registries
(PBCRs), structured using the canregs class. It can be used for
demonstration, testing, or development purposes within the canregtools
package.
canregs
A list of PBCR datasets with class canregs.
Henan Province Cancer Registry, China.
data("canregs")
summary(canregs[[1]])
Categorizes six-digit administrative division codes of the People's Republic of China (as per GB/T 2260-2007) into several structured components, including province, city, area type (urban/rural), registry code, and regional classification.
classify_areacode(x)

| x | A vector of six-digit Chinese administrative area codes, given as numeric values or character strings. |
|---|---|

This function standardizes and validates area codes, identifies their
administrative levels, and attaches metadata used in cancer registration
systems, including the 'province', 'city', 'area_type', and 'registry'
attributes. It also supports external dictionaries (from the canregtools
configuration folder) to provide more accurate classification of area types
and registry mapping.
A list with the following named elements:
areacode: Validated area codes. Invalid entries are replaced with NA.
registry: Registry codes corresponding to each area, using a built-in or cached dictionary.
province: Province-level codes formed by taking the first two digits and appending "0000".
city: City-level codes formed by taking the first four digits and appending "00".
area_type: Urban-rural classification codes: "910000" for urban, "920000" for rural. The mapping can be updated with the write_registry() function, which stores the dictionary in area_type_dict.rds.
region: Region classification codes derived from province codes, ending in "0000".
classify_areacode(c("110000", "320500", "440300"))
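The province and city elements follow a simple string rule, which the base-R sketch below reproduces. It covers only the parts that can be derived from the code itself; `derive_area_levels()` is a hypothetical name, and the registry, area_type, and region lookups are omitted because they depend on the package's dictionaries.

```r
# Hypothetical sketch of the province/city derivation (base R only).
# Assumption: a valid code is exactly six digits, supplied as character.
derive_area_levels <- function(x) {
  code <- as.character(x)
  valid <- !is.na(code) & grepl("^[0-9]{6}$", code)
  code[!valid] <- NA_character_          # invalid entries become NA
  list(
    areacode = code,
    province = ifelse(valid, paste0(substr(code, 1, 2), "0000"), NA_character_),
    city = ifelse(valid, paste0(substr(code, 1, 4), "00"), NA_character_)
  )
}

res <- derive_area_levels(c("110000", "320500", "440300", "12345"))
str(res)
```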
Categorizes six-digit administrative division codes of the People's Republic of China (as per GB/T 2260-2007) into several structured components, including province, city, area type (urban/rural), registry code, and regional classification.
classify_areacode2(x, attr = "registry")

| x | A vector of six-digit Chinese administrative area codes, given as numeric values or character strings. |
|---|---|
| attr | A character vector of attributes to return. Options include: |

This function standardizes and validates area codes, identifies their
administrative levels, and attaches metadata used in cancer registration
systems, including the 'province', 'city', 'area_type', and 'registry'
attributes. It also supports external dictionaries (from the canregtools
configuration folder) to provide more accurate classification of area types
and registry mapping.
A list with the following named elements:
areacode: Validated area codes. Invalid entries are replaced with NA.
registry: Registry codes corresponding to each area, using a built-in or cached dictionary.
province: Province-level codes formed by taking the first two digits and appending "0000".
city: City-level codes formed by taking the first four digits and appending "00".
area_type: Urban-rural classification codes: "910000" for urban, "920000" for rural. The mapping can be updated with the write_registry() function, which stores the dictionary in area_type_dict.rds.
region: Region classification codes derived from province codes, ending in "0000".
classify_areacode(c("110000", "320500", "440300"))
classify_childhood() classifies childhood cancers based on ICD-O-3 codes
(topography, morphology, and behavior) using the
International Classification of Childhood Cancer, Third Edition (ICCC-3).
classify_childhood(topo, morp, beha, type = "sub", version = "v2005")

| topo | A character vector of ICD-O-3 topography codes (e.g., |
|---|---|
| morp | A character vector of ICD-O-3 morphology codes (e.g., |
| beha | A numeric or character vector representing ICD-O-3 behavior codes. |
| type | A string specifying the type of classification to return: |
| version | A string specifying the version of the ICCC-3 rules to use: either |

A numeric vector of ICCC-3 classification codes. If type = "sub",
returns subgroup codes; if type = "main", returns main group codes.
Steliarova-Foucher, E., Stiller, C., Lacour, B., & Kaatsch, P. (2005). International Classification of Childhood Cancer, third edition. Cancer, 103, 1457-1467. doi:10.1002/cncr.20910
topo <- c("C15.2", "C16.2", "C34.2")
morp <- c("8000", "8040", "8170")
beha <- c("3", "3", "3")
child_code <- classify_childhood(topo, morp, beha, type = "main")
Classifies ICD-10 codes into cancer categories according to the specified category type and language.
classify_icd10(x, cancer_type = "big", lang = "code", label_type = "abbr", as_factor = FALSE)

| x | ICD-10 codes from the cancer chapter ('C00-C98' and 'D00-D48') as used in population-based cancer registration (PBCR). |
|---|---|
| cancer_type | A character string specifying the classification method used to categorize ICD-10 codes. Options include |
| lang | Character string specifying the output language. The options listed are 'cn' or 'en'; the usage above sets the default to "code". |
| label_type | Type of label to use ("full" or "abbr"). |
| as_factor | Logical; whether to return the output as a factor. |

Cancer code.
icd10 <- c("C15.2", "C33.4", "C80.9", "C26.2", "C16.3")
classify_icd10(icd10, cancer_type = "big")
classify_icd10(icd10, cancer_type = "small")
classify_icd10(icd10, cancer_type = "system")
classify_icd10(icd10, cancer_type = "gco")
Classify ICD-O-3 morphology codes into categories
classify_morp(x)

| x | Character vector of ICD-O-3 morphology codes. Values not present in the internal dictionary will be returned as |
|---|---|

Character vector of categories corresponding to x.
morps <- c("8140", "8070", "8050", "8051", "9900", "9800", "9993")
classify_morp(morps)
Classify ICD-O-3 topography codes into categories
classify_topo(x, cancer_type = "big")

| x | Character vector of ICD-O-3 topography codes. Values not present in the internal dictionary will be returned as |
|---|---|
| cancer_type | Cancer type. |

Character vector of categories corresponding to x.
Takes a list of named numeric vectors and sums the values that share a name.
combine_tp(object)

| object | A list in which each element is a named numeric vector. |
|---|---|

A named integer vector with names sorted alphabetically and values summed across vectors.
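The name-wise summation can be illustrated with a few lines of base R. This is a sketch of the idea rather than the package's code; `sum_by_name()` and the example names are hypothetical.

```r
# Hypothetical base-R sketch: sum values that share a name across vectors.
sum_by_name <- function(object) {
  values <- unlist(unname(object))   # keep each element's own names
  totals <- tapply(values, names(values), sum)
  out <- setNames(as.vector(totals), names(totals))
  out[order(names(out))]             # names sorted alphabetically
}

tp <- list(c(C16 = 3, C34 = 5), c(C34 = 2, C50 = 4), c(C16 = 1))
res <- sum_by_name(tp)
print(res)
```

Dropping the list's outer names with `unname()` before `unlist()` matters: otherwise R prefixes each value's name with its list element's name and identical inner names no longer match.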
The count_canreg() function is a generic method used to summarize
population-based cancer registry data. It supports both single (canreg)
and multiple (canregs) registry objects. The function aggregates cancer
cases by age group and classifies cancer types using standardized coding
systems.
count_canreg(x, age_breaks = c(0, 1, seq(5, 85, 5)), label_tail = NULL, cancer_type = "big")
## S3 method for class 'canregs'
count_canreg(x, ...)
## S3 method for class 'canreg'
count_canreg(x, age_breaks = c(0, 1, seq(5, 85, 5)), label_tail = NULL, cancer_type = "big")

| x | An object of class 'canreg' or 'canregs'. |
|---|---|
| age_breaks | A numeric vector specifying the breakpoints for age grouping. Defaults to c(0, 1, seq(5, 85, 5)). |
| label_tail | Optional. A string to append to age group labels (e.g., |
| cancer_type | A character string specifying the classification method used to categorize ICD-10 codes. Options include |
| ... | Additional arguments passed to the method for individual |

Object with class of 'fbswicd' or 'fbswicds'.
cr_clean(), classify_icd10(), create_asr()
data("canregs")
fbsw <- count_canreg(canregs, age_breaks = c(0, 15, 65), cancer_type = "big")
fbsw <- count_canreg(canregs, cancer_type = "gco")
# Count object with class of `canregs`
fbsw <- count_canreg(canregs, cancer_type = "small")
# Count object with class of `canreg`
fbsw <- count_canreg(canregs[[1]], cancer_type = "big")
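To show what the default `age_breaks` imply, the base-R sketch below groups a few ages with `cut()` and counts them. It is illustrative only: the ages are made up, and the label format ("1-4", "85+") is an assumption, since the package's own age-group labels are not shown here.

```r
# Hypothetical sketch of age grouping with the default breakpoints.
age_breaks <- c(0, 1, seq(5, 85, 5))
ages <- c(0, 3, 5, 42, 84, 90)       # made-up ages at diagnosis

# Assumed label format: "lower-upper" for closed groups, "85+" for the last
labels <- c(
  paste0(head(age_breaks, -1), "-", age_breaks[-1] - 1),
  paste0(max(age_breaks), "+")
)
labels[1] <- "0"                     # the infant group covers age 0 only

# Left-closed intervals [lower, upper); the last group is open-ended
age_group <- cut(ages, breaks = c(age_breaks, Inf), right = FALSE, labels = labels)
counts <- table(age_group)
print(counts[counts > 0])
```

The default breakpoints yield 19 groups: age 0, ages 1 to 4, then five-year groups up to 80 to 84, and an open-ended 85+ group.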
Clean canreg data.
cr_clean(x, age_breaks = c(0, 1, seq(5, 85, 5)), label_tail = NULL, cancer_type = "big")
## S3 method for class 'canregs'
cr_clean(x, age_breaks = c(0, 1, seq(5, 85, 5)), label_tail = NULL, cancer_type = "big")
## S3 method for class 'canreg'
cr_clean(x, age_breaks = c(0, 1, seq(5, 85, 5)), label_tail = NULL, cancer_type = "big")
## S3 method for class 'FBcases'
cr_clean(x, age_breaks = c(0, 1, seq(5, 85, 5)), label_tail = NULL, cancer_type = "big")
## S3 method for class 'SWcases'
cr_clean(x, age_breaks = c(0, 1, seq(5, 85, 5)), label_tail = NULL, cancer_type = "big")
## S3 method for class 'POP'
cr_clean(x, age_breaks = c(0, 1, seq(5, 85, 5)), label_tail = NULL, cancer_type = "big")

| x | Data of class 'canreg', 'canregs', 'FBcases', 'SWcases', or 'POP'. |
|---|---|
| age_breaks | Cut points for age groups. Default is c(0, 1, seq(5, 85, 5)). |
| label_tail | Tail of the labels. |
| cancer_type | A character string specifying the classification method used to categorize ICD-10 codes. Options include |

Class 'canreg'.
data("canregs")
data <- cr_clean(canregs)
data <- cr_clean(canregs, cancer_type = "small")
data <- cr_clean(canregs[[1]], cancer_type = "big")
fbcases <- purrr::pluck(canregs[[1]], "FBcases")
fbcases <- cr_clean(fbcases)
swcases <- purrr::pluck(canregs[[1]], "SWcases")
swcases <- cr_clean(swcases)
pop <- purrr::pluck(canregs[[1]], "POP")
pop <- cr_clean(pop)
Filter cases from objects of class canreg or canregs
cr_filter(.data, ..., drop = c("none"), part = "all")
## S3 method for class 'canregs'
cr_filter(.data, ..., part = "all")
## S3 method for class 'canreg'
cr_filter(.data, ..., part = "all")
## S3 method for class 'asrs'
cr_filter(.data, ..., drop = c("none"))
## S3 method for class 'age_rates'
cr_filter(.data, ..., drop = c("none"))
## S3 method for class 'qualities'
cr_filter(.data, ..., drop = c("none"))
## Default S3 method:
cr_filter(.data, ..., drop = c("none"))

| .data | An object with class of |
|---|---|
| ... | Filtering conditions passed to |
| drop | Drop specific cancer categories. |
| part | A character vector specifying which components of |

An object of the same class as .data (canreg or canregs) with
the specified components filtered accordingly. The structure and component
order are preserved.
Merges the elements of an object of class canregs, fbswicds, asrs,
qualities, age_rates, or summaries into a single object of class canreg,
fbswicd, asr, quality, age_rate, or summary, respectively.
cr_merge(data)
## S3 method for class 'canregs'
cr_merge(data)
## S3 method for class 'fbswicds'
cr_merge(data)
## S3 method for class 'asrs'
cr_merge(data)
## S3 method for class 'qualities'
cr_merge(data)
## S3 method for class 'age_rates'
cr_merge(data)
## S3 method for class 'summaries'
cr_merge(data)

| data | An object with class of |
|---|---|

An object with merged elements.
data("canregs")
canreg <- cr_merge(canregs)
class(canreg)
# Merge object with class of `fbswicds` into object with class of `fbswicd`
fbsws <- count_canreg(canregs)
fbsw <- cr_merge(fbsws)
# Merge object with class of `asrs` into object with class of `asr`
asrs <- create_asr(canregs, year, sex, cancer, collapse = FALSE)
asr <- cr_merge(asrs)
# Merge object with class of `qualities` into object with class of `quality`
quas <- create_quality(canregs, year, sex, cancer, collapse = FALSE)
qua <- cr_merge(quas)
# Merge object with class of `age_rates` into object with class of `age_rate`
agerates <- create_age_rate(canregs, year, sex, cancer, collapse = FALSE)
agerate <- cr_merge(agerates)
# Merge object with class of `summaries` into object with class of `summary`
summs <- summary(canregs, collapse = FALSE)
summ <- cr_merge(summs)
Reframe data of class canregs or fbswicds
cr_reframe(x, strat = "registry")
## S3 method for class 'canregs'
cr_reframe(x, strat = "registry")
## S3 method for class 'fbswicds'
cr_reframe(x, strat = "registry")

| x | Object with class of canregs or fbswicds. |
|---|---|
| strat | Stratification variables used to reframe 'canregs' or 'fbswicds'. |

Reframed canregs or fbswicds.
# List the reframe variables that can be used in the `strat` parameter
ls_vars("reframe")
data("canregs")
# Reframe the `canregs` data according to the `area_type` attribute
city <- cr_reframe(canregs, strat = "area_type")
# Reframe object with class of `canregs`
# Reframe the `canregs` according to the `province` attribute
province <- cr_reframe(canregs, strat = "province")
class(province)
names(province)
# Reframe object with class of `fbswicds`
# Convert object with class of `canregs` into object with class of `fbswicds`
fbsw <- count_canreg(canregs, cancer_type = "small")
# Reframe the `fbswicds` according to the `city` attribute
city <- cr_reframe(fbsw, strat = "city")
This function allows you to select specific elements from objects of class 'canregs', 'fbswicds', or 'asrs' based on provided indices, logical conditions, or expressions. The selected elements are returned while preserving the class of the input object.
cr_select(data, ..., index = names(data))
## S3 method for class 'canregs'
cr_select(data, ..., index = names(data))
## S3 method for class 'asrs'
cr_select(data, ..., index = names(data))
## S3 method for class 'age_rates'
cr_select(data, ..., index = names(data))
## S3 method for class 'fbswicds'
cr_select(data, ..., index = names(data))
## S3 method for class 'summaries'
cr_select(data, ..., index = names(data))

| data | An object of class 'canregs', 'fbswicds', or 'asrs' from which elements will be selected. |
|---|---|
| ... | Optional conditions or expressions used to filter elements within the list or data frame. Conditions are evaluated for each element of the input object. |
| index | A vector of indices specifying the elements to select. This can be a character vector (matching element names), a numeric vector (specifying positions), or a logical vector (indicating inclusion). |

An object of the same class as the input object, containing only the selected elements that meet the specified indices or conditions.
data("canregs")
# Select elements with `mi` greater than 0.5 from `canregs`
canregs_mi <- cr_select(canregs, mi > 0.5)
# Select elements from object with class of `fbswicds`
fbsws <- count_canreg(canregs)
# Select elements with `inci` greater than 250 per 100000 population
fbsws_inci <- cr_select(fbsws, inci > 250)
# Select elements from object with class of `summaries`
summ <- summary(canregs, collapse = FALSE)
# Select elements for which `mi` is greater than 0.5
summ_mi <- cr_select(summ, mi > 0.5)
names(summ_mi)
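The selection behaviour (keep the elements that satisfy a condition and preserve the input class) can be sketched in base R as below. Everything here is hypothetical: the toy `summ` object only mimics a classed, named list, and `select_elements()` takes an explicit predicate function instead of the unquoted expressions that `cr_select()` accepts.

```r
# Hypothetical stand-in for a `summaries`-like object: a classed, named list.
summ <- structure(
  list(
    reg_a = list(mi = 0.62, inci = 280),
    reg_b = list(mi = 0.41, inci = 310),
    reg_c = list(mi = 0.55, inci = 190)
  ),
  class = c("summaries", "list")
)

# Keep elements named in `index` for which `predicate` is TRUE,
# then restore the original class on the result.
select_elements <- function(data, predicate, index = names(data)) {
  cls <- class(data)
  candidates <- unclass(data)[index]
  keep <- vapply(candidates, predicate, logical(1))
  structure(candidates[keep], class = cls)
}

summ_mi <- select_elements(summ, function(el) el$mi > 0.5)
names(summ_mi)
```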
Clean canreg data.
cr_write(x)
## S3 method for class 'canregs'
cr_write(x)
## S3 method for class 'canreg'
cr_write(x)

| x | Object with class of |
|---|---|

Object with class of canreg or canregs.
create_age_rate() computes age-specific rates from objects of class
canreg, canregs, fbswicd, or fbswicds. It calculates the rates for the
specified event (e.g., fbs) across age groups, stratified by variables
such as year, sex, or cancer type.
create_age_rate(x, ..., event = "fbs", cancer_type = "big", format = "long", mp = 1e+05, decimal = 6, show_pop = FALSE, collapse = TRUE)
## S3 method for class 'canreg'
create_age_rate(x, ..., cancer_type = "big")
## S3 method for class 'canregs'
create_age_rate(x, ..., cancer_type = "big", collapse = TRUE)
## S3 method for class 'fbswicds'
create_age_rate(x, ..., event = "fbs", format = "long", mp = 1e+05, decimal = 6, show_pop = FALSE, collapse = TRUE)
## S3 method for class 'fbswicd'
create_age_rate(x, ..., event = "fbs", format = "long", mp = 1e+05, decimal = 6, show_pop = FALSE)

| x | The input data, object with class of |
|---|---|
| ... | One or more variables used for stratification. For example, you can stratify by |
| event | The type of event to calculate: "fbs" for cancer incidence or "sws" for cancer mortality. |
| cancer_type | A character string specifying the classification method used to categorize ICD-10 codes. Options include |
| format | Format of the output data frame, either "long" or "wide". |
| mp | A constant to multiply rates by (e.g., mp = 1000 for rates per 1000). |
| decimal | Number of decimal places to which results are rounded. The usage above sets the default to 6. |
| show_pop | Logical; whether to include the population in the output. |
| collapse | Logical; whether to return the result as age_rate or age_rates. |

A data frame of age-specific rates.
data("canregs")
agerate <- create_age_rate(canregs, year, sex, cancer)
data <- canregs[[1]]
agerate <- create_age_rate(data, year, sex)
agerate <- create_age_rate(canregs, year, cancer_type = "system")
fbsws <- count_canreg(canregs)
agerate <- create_age_rate(fbsws, year, sex)
data <- canregs[[2]]
fbsw <- count_canreg(data, cancer_type = "small")
agerate <- create_age_rate(fbsw, year, sex, cancer)
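The arithmetic behind an age-specific rate is simply events divided by population, scaled by `mp` and rounded to `decimal` places. The base-R sketch below shows this on a made-up table; the column names (`agegrp`, `pop`) and all numbers are hypothetical and do not reflect the package's internal data layout.

```r
# Hypothetical counts and population by sex and age group.
d <- data.frame(
  sex = c(1, 1, 2, 2),
  agegrp = c("0-64", "65+", "0-64", "65+"),
  fbs = c(120, 340, 95, 210),
  pop = c(480000, 60000, 470000, 75000)
)

mp <- 1e5       # rates per 100000
decimal <- 6    # rounding, matching the default in the usage above

# Long format: one rate per sex and age group
d$rate <- round(d$fbs / d$pop * mp, decimal)
print(d)

# Wide format: one column per age group
wide <- xtabs(rate ~ sex + agegrp, data = d)
print(wide)
```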
create_asr() calculates age-standardized rates (ASRs) from objects of
class canreg, canregs, fbswicd, or fbswicds. It supports
stratification by multiple variables, allows the specification of different
standard population structures, and provides flexibility in the inclusion
of variance, confidence intervals, and population data.
create_asr( x, ..., event = "fbs", std = c("cn2000", "wld85"), cancer_type = "big", mp = 1e+05, decimal = 2, show_var = FALSE, show_ci = FALSE, collapse = TRUE ) ## S3 method for class 'canregs' create_asr(x, ..., cancer_type = "big", collapse = TRUE) ## S3 method for class 'canreg' create_asr(x, ..., cancer_type = "big") ## S3 method for class 'fbswicds' create_asr( x, ..., event = "fbs", std = c("cn2000", "wld85"), mp = 1e+05, decimal = 2, show_pop = FALSE, show_var = FALSE, show_ci = FALSE, collapse = TRUE ) ## S3 method for class 'fbswicd' create_asr( x, ..., event = "fbs", std = c("cn2000", "wld85"), mp = 1e+05, decimal = 2, show_pop = FALSE, show_var = FALSE, show_ci = FALSE )create_asr( x, ..., event = "fbs", std = c("cn2000", "wld85"), cancer_type = "big", mp = 1e+05, decimal = 2, show_var = FALSE, show_ci = FALSE, collapse = TRUE ) ## S3 method for class 'canregs' create_asr(x, ..., cancer_type = "big", collapse = TRUE) ## S3 method for class 'canreg' create_asr(x, ..., cancer_type = "big") ## S3 method for class 'fbswicds' create_asr( x, ..., event = "fbs", std = c("cn2000", "wld85"), mp = 1e+05, decimal = 2, show_pop = FALSE, show_var = FALSE, show_ci = FALSE, collapse = TRUE ) ## S3 method for class 'fbswicd' create_asr( x, ..., event = "fbs", std = c("cn2000", "wld85"), mp = 1e+05, decimal = 2, show_pop = FALSE, show_var = FALSE, show_ci = FALSE )
x |
The input data, object with class of |
... |
One or more variables used for stratification. For example, you
can stratify by |
event |
Character string specifying the event type: "fbs" for cancer incidence or "sws" for cancer mortality. |
std |
The standard population(s) in the 'std_pop' data frame used for calculating standardized rates. To use several standard populations, pass a character vector, e.g. std = c("cn2000", "wld85"). |
cancer_type |
A character string specifying the classification method
used to categorize ICD-10 codes. This determines how ICD-10 codes are
classified. Options include |
mp |
A constant to multiply rates by (e.g. mp=1000 for rates per 1000). |
decimal |
This parameter specifies the number of decimal places to round the results. The default is 2, which means rates will be rounded to two decimal places. |
show_var |
Logical. Whether to include the variance in the output. |
show_ci |
Logical. Whether to include the confidence interval (lower and upper bounds) in the output. |
collapse |
Logical. Whether the result is returned as class asr or asrs. |
show_pop |
Logical. Whether to include the population in the output. |
A data frame or tibble containing the age-standardized rates and, if requested, their confidence intervals.
ageadjust for age-adjusted rate calculations.
truncrate for truncated rate calculations.
    data("canregs")
    asr_inci <- create_asr(canregs, event = "fbs", year, sex, cancer)
    asr_mort <- create_asr(canregs, event = "sws", year, sex, cancer)

    # calculate ASR based on object with class of `canregs`
    asr <- create_asr(canregs, event = "sws", year, sex, cancer)

    # calculate ASR based on object with class of `canreg`
    data <- canregs[[1]]

    # calculate ASR using default parameter
    asr <- create_asr(data, year, sex, cancer)
    head(asr)

    # calculate ASR using multiple standard population
    asr_multi_std <- create_asr(data, year, sex, cancer,
      std = c("cn82", "cn2000", "wld85")
    )
    head(asr_multi_std)

    # calculate ASR with confidence interval
    asr_with_ci <- create_asr(data, year, sex, cancer, show_ci = TRUE)
    head(asr_with_ci)

    # calculate ASR with population at risk
    asr_with_pop <- create_asr(data, year, sex, cancer, show_pop = TRUE)
    head(asr_with_pop)

    # calculate ASR with variance
    asr_with_var <- create_asr(data, year, sex, cancer, show_var = TRUE)
    head(asr_with_var)

    # calculate ASR based on object with class of `fbswicds`
    # convert object with class of `canregs` to object with class of `fbswicds`
    fbsws <- count_canreg(canregs)
    asrs <- create_asr(fbsws, event = "sws", year, sex, cancer)

    # calculate ASR based on object with class of `fbswicd`
    # convert object with class of `canreg` to object with class of `fbswicd`
    fbsw <- count_canreg(canregs[[1]])
    asr <- create_asr(fbsw, event = "sws", year, sex, cancer)
create_quality() calculates quality indicators from an object of class
canreg, canregs, fbswicd, or fbswicds. The quality indicators for
population-based cancer registries (PBCRs) include:
fbs: Number of incident cases.
inci: Cancer incidence rate.
sws: Number of death cases.
mort: Mortality rate.
mv: Percentage of cases with microscopic verification.
mi: Mortality-to-incidence ratio.
And other relevant quality metrics for cancer data evaluation.
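To show how two of these indicators relate to the underlying counts, here is a small base R sketch using their standard epidemiological definitions. It is not taken from the package source. The counts are made up, and the column name mv_n (number of microscopically verified cases) is hypothetical.

```r
# Made-up counts for two cancer sites (illustration only)
counts <- data.frame(
  cancer = c("A", "B"),
  fbs = c(200, 50),   # incident cases
  sws = c(120, 10),   # deaths
  mv_n = c(150, 45),  # microscopically verified cases (hypothetical column)
  stringsAsFactors = FALSE
)

# Mortality-to-incidence ratio: deaths divided by incident cases
counts$mi <- round(counts$sws / counts$fbs, 2)

# Percentage of incident cases with microscopic verification
counts$mv <- round(counts$mv_n / counts$fbs * 100, 2)

print(counts)
```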
    create_quality(x, ..., decimal = 2, collapse = TRUE)

    ## S3 method for class 'canreg'
    create_quality(x, ..., cancer_type = "big")

    ## S3 method for class 'canregs'
    create_quality(x, ..., cancer_type = "big", collapse = TRUE)

    ## S3 method for class 'fbswicds'
    create_quality(x, ..., decimal = 2, collapse = TRUE)

    ## S3 method for class 'fbswicd'
    create_quality(x, ..., decimal = 2)
x |
The input data, object with class of |
... |
One or more variables used for stratification. For example, you
can stratify by |
decimal |
The number of decimal places to include in the resulting quality indicator values. Defaults to 2. |
collapse |
Logical. Whether the result is returned as class quality or qualities. |
cancer_type |
A character string specifying the classification method
used to categorize ICD-10 codes. This determines how ICD-10 codes are
classified. Options include |
A data frame (if applied to a single registry
object, 'canreg' or 'fbswicd') or a list of data frames (if applied to a
grouped registry object, 'canregs' or 'fbswicds') with a class of either
'quality' or 'qualities'.
    data("canregs")
    fbsws <- count_canreg(canregs, cancer_type = "system")
    qua2 <- create_quality(fbsws, year, sex, cancer)
    head(qua2)

    # Calculate the quality indicators based on object with class of `canreg`
    data <- canregs[[1]]
    qua <- create_quality(data, year, sex, cancer, cancer_type = "big")
    head(qua)

    # Calculate the quality indicators based on object with class of `canregs`
    qua <- create_quality(canregs, year, sex, cancer, cancer_type = "big")
    head(qua)

    # Calculate the quality indicators based on object with class of `fbswicds`
    fbsws <- count_canreg(canregs, cancer_type = "small")
    qua <- create_quality(fbsws, year, sex, cancer)
    head(qua)

    # Calculate the quality indicators based on object with class of `fbswicd`
    fbsw <- count_canreg(canregs[[1]], cancer_type = "big")
    qua <- create_quality(fbsw, year, sex, cancer)
    head(qua)
The create_report() function generates a report from objects with class of
canreg or canregs using pre-defined rmarkdown templates.
    create_report(
      data,
      template = "annual",
      title = "Cancer Registry Report",
      output_format = "html_document",
      output_dir = NULL,
      ...
    )

    ## S3 method for class 'canregs'
    create_report(
      data,
      template = "annual",
      title = "Cancer Registry Report",
      output_format = "html_document",
      output_dir = NULL,
      ...
    )

    ## S3 method for class 'canreg'
    create_report(
      data,
      template = "annual",
      title = "Cancer Registry Report",
      output_format = "html_document",
      output_dir = NULL,
      ...
    )
data |
An object of class |
template |
Character string specifying the report template to use.
Options include |
title |
Character. Title of the generated report. Default is
|
output_format |
Character. Format of the rendered report.
Options are |
output_dir |
Character. Directory where the report will be saved. |
... |
Additional arguments passed to |
No return value; generates a report as a side effect.
    ## Not run:
    data("canregs")
    create_report(canregs, template = "quality", title = "QC Report")
    ## End(Not run)

    ## Not run:
    create_report(canregs, template = "annual", title = "Annual Report")
    ## End(Not run)

    ## Not run:
    data <- canregs[[1]]
    create_report(data, template = "annual", title = "Annual Report")
    ## End(Not run)
Count
    create_site(x, ..., wrap_subsite = FALSE, class_morp = TRUE, drop_nos = TRUE)
x |
Fbswicd |
... |
Strata |
wrap_subsite |
Logical |
class_morp |
Logical |
drop_nos |
Logical |
Data frame
Computes the cumulative rate up to a specified age limit, typically used in cancer epidemiology to estimate the probability of developing or dying from a disease over a lifetime or up to a target age.
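The calculation can be sketched in base R as a sum of age-specific rates weighted by the width of each age band, up to the age limit. This is the textbook definition of a cumulative rate and is only assumed to match what cumrate() does internally. The helper cumrate_sketch() is hypothetical, and it assumes eage is a multiple of agewidth.

```r
# Hypothetical helper (not part of canregtools): cumulative rate up to `eage`.
# rate: age-specific rates ordered from the youngest age band upwards
cumrate_sketch <- function(rate, eage = 70, agewidth = 5, sep_zero = TRUE) {
  n_bands <- eage / agewidth          # number of full-width bands below eage
  widths <- rep(agewidth, n_bands)
  if (sep_zero) {
    # first band split into age 0 and ages 1 to (agewidth - 1)
    widths <- c(1, agewidth - 1, widths[-1])
  }
  sum(rate[seq_along(widths)] * widths)
}

# A constant rate of 0.001 per person-year over ages 0-69 (made-up input)
cr <- cumrate_sketch(rep(0.001, 19), eage = 70)
print(cr)
```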
    cumrate(
      count,
      pop,
      rate = NULL,
      eage = 70,
      agewidth = 5,
      sep_zero = TRUE,
      mp = 1,
      decimal = 6
    )
count |
Numeric vector, number of incident cases or deaths in each age group. |
pop |
Numeric vector, corresponding population at risk for each age group. |
rate |
Numeric vector, age-specific incidence or mortality rates. If not
supplied, it will be calculated as |
eage |
Integer, the upper age limit (e.g., 70) up to which the cumulative rate is calculated. |
agewidth |
Integer, width of the age intervals (e.g., 5 for 5-year bands). |
sep_zero |
Logical, whether the 0–1 age group is separated (i.e.,
age groups are 0, 1–4, 5–9, ...). Default is |
mp |
Numeric. A multiplier used to scale the final cumulative rate
(e.g., 100,000 or 1). Default is |
decimal |
Integer, number of decimal places to round the result.
Default is |
A named numeric value representing the cumulative rate, scaled
by mp.
    px <- c(
      20005, 86920, 102502, 151494, 182932, 203107, 240289, 247076, 199665,
      163820, 145382, 86789, 69368, 51207, 39112, 20509, 12301, 6586, 1909
    )
    dx <- c(
      156, 58, 47, 49, 48, 68, 120, 162, 160, 294, 417, 522, 546, 628, 891,
      831, 926, 731, 269
    )
    # pass counts and population; rate is left NULL and derived from them
    cumrate(dx, px, eage = 70)
Converts a cumulative rate to a cumulative risk using the standard exponential formula. This is commonly used in cancer epidemiology to estimate the probability of developing or dying from cancer up to a certain age, under the assumption of constant rates.
    cumrisk(cumrate, mp = 100, decimal = 2)
cumrate |
Numeric. The cumulative incidence or mortality rate,
typically calculated using |
mp |
Numeric. The rate multiplier used in |
decimal |
Integer. Number of decimal places to round the result.
Default is |
The cumulative risk is calculated as:
\[ \text{cumulative risk} = 1 - \exp(-\text{cumulative rate}) \]
This converts the cumulative rate to a probability, assuming the event rate is constant within each age interval and ignoring competing risks.
A named numeric value representing the cumulative risk (as a percentage).
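A minimal base R sketch of the standard exponential conversion, reported as a percentage, is shown below. The helper cumrisk_sketch() is hypothetical. It assumes the cumulative rate is unscaled (a multiplier of 1), and it does not attempt to reproduce how cumrisk() handles its mp argument.

```r
# Hypothetical helper (not part of canregtools): rate-to-risk conversion.
# cumrate: unscaled cumulative rate, e.g. 0.07
cumrisk_sketch <- function(cumrate, decimal = 2) {
  round((1 - exp(-cumrate)) * 100, decimal)  # percentage
}

risk <- cumrisk_sketch(0.07)
print(risk)
```

For small cumulative rates the risk is close to the rate itself, because 1 - exp(-x) is approximately x when x is near zero.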
    px <- c(
      20005, 86920, 102502, 151494, 182932, 203107, 240289, 247076, 199665,
      163820, 145382, 86789, 69368, 51207, 39112, 20509, 12301, 6586, 1909
    )
    dx <- c(
      156, 58, 47, 49, 48, 68, 120, 162, 160, 294, 417, 522, 546, 628, 891,
      831, 926, 731, 269
    )
    # pass counts and population; rate is left NULL and derived from them
    cumrate(dx, px, eage = 70)
    cumrisk(cumrate(dx, px, eage = 70))
Groups numeric age values into categorized age bands using one of three
methods: fixed interval ("interval"), equal distance ("distance"),
or quantile-based grouping ("quantile"). Supports flexible labeling and
language-specific suffixes.
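For comparison, the same kind of grouping can be sketched with base R's cut(). This only illustrates the idea of separating age 0, using 5-year bands, and keeping an open-ended top group. It makes no claim about how cutage() builds its breaks or labels, and the labels here are written by hand.

```r
ages <- c(0, 3, 7, 42, 90)

# Breaks for age 0, ages 1-4, then 5-year bands up to an open-ended 85+ group
breaks <- c(0, 1, seq(5, 85, 5), Inf)
labels <- c(
  "0", "1-4",
  paste(seq(5, 80, 5), seq(9, 84, 5), sep = "-"),
  "85+"
)

# right = FALSE gives left-closed intervals such as [5, 10)
agegrp <- cut(ages, breaks = breaks, labels = labels, right = FALSE)
print(agegrp)
```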
    cutage(
      x,
      method = "distance",
      length = 5,
      maxage = 85,
      sep_zero = TRUE,
      breaks = c(seq(0, 85, 5)),
      labels = NULL,
      lang = "cn",
      label_tail = NULL,
      right = FALSE
    )
x |
Numeric vector of ages. |
method |
Character. Grouping method: |
length |
Integer. Width of age bands (used only
if |
maxage |
Numeric. Upper limit of the age range (used in |
sep_zero |
Logical. Whether to separate age 0 into its own group
(only used if |
breaks |
Numeric vector of breakpoints (required if
|
labels |
Character vector of labels for resulting age groups.
If |
lang |
Output language for default labels, |
label_tail |
Character string appended to labels, e.g., |
right |
Logical. Whether intervals are right-closed. Passed to |
A factor variable of age groups with labeled levels.
    ages <- sample(0:101, 200, replace = TRUE)
    cutage(ages, method = "distance", length = 5, maxage = 60, sep_zero = TRUE)

    # Custom breaks
    cutage(ages, method = "interval", breaks = c(0, 15, 30, 45, 60, 75, Inf))

    # Quantile-based grouping
    cutage(ages, method = "quantile")
Removes a specified dictionary file (e.g., "registry", "area_type", "custom") from the user's local configuration directory.
    del_dict_files(dict = "custom")
dict |
A character string specifying the dictionary name. Default is "custom". |
Invisibly returns TRUE if deleted, FALSE otherwise.
    del_dict_files("custom")
This function creates grouped bar charts with optional faceting. It is useful for comparing category-specific values across different groups and panels.
    draw_barchart(
      data,
      x,
      y,
      group = NULL,
      facet = NULL,
      facet_label = NULL,
      y_label = NULL,
      rev_group = FALSE,
      grid = NULL,
      topn = NULL,
      axis = NULL,
      bar_side = NULL,
      bar_way = NULL,
      gap = NULL,
      csize = 0.8,
      space = 0.9,
      adj = -0.01,
      gl = NULL,
      cols = NULL,
      palette = "Peach",
      x_label_side = 1,
      legend = FALSE,
      legend_label = NULL,
      legend_pos = c(0.4, 0.2),
      dens = c(-1, -1),
      overlay = FALSE
    )
data |
A |
x |
The categorical variable (unquoted) for the x-axis. |
y |
The numerical variable (unquoted) for the y-axis (bar height). |
group |
Optional grouping variable (unquoted) used for color/fill aesthetics. |
facet |
Optional faceting variable (unquoted) for splitting the data into panels. |
facet_label |
Optional character vector of labels corresponding to each facet. |
y_label |
Optional y-axis label or vector of labels for facets. |
rev_group |
Reverse the bar position of group variables. |
grid |
A numeric vector of length 2 indicating the layout (rows, columns) of the plot grid. |
topn |
Number of top |
axis |
Optional vector of breaks for the y-axis. If |
bar_side |
Integer indicating the side of bars: 1 = left, 2 = right. |
bar_way |
Integer for bar layout: 1 = one-sided, 2 = mirrored. |
gap |
Horizontal spacing between bars and y-axis (default depends on |
csize |
Character size scaling factor (default = 0.8). |
space |
Vertical space between bars (default = 0.9). |
adj |
Vertical adjustment of x-label text (default = -0.01). |
gl |
Integer. Indicating the line type of the grids. |
cols |
Optional vector of colors for bars. |
palette |
Character. Name of palette colors. |
x_label_side |
Side to place x-labels: 1 = inside, 2 = outside (default = 1). |
legend |
Logical. Whether to show a legend (default = FALSE). |
legend_label |
Optional character vector for legend labels. |
legend_pos |
Optional, numeric vector of length 2 for legend position. |
dens |
A numeric vector of two density values for bar shading (default = c(-1, -1)). |
overlay |
Logical. Whether to overlay bars from two groups in the same panel. |
A base R plot with grouped bar charts, optionally faceted.
    data("canregs")
    asr <- create_asr(canregs[[1]], year, sex, cancer, event = "fbs")
    asr <- cr_filter(asr, drop = c("total", "others"))
    draw_barchart(asr, x = cancer, y = cr, group = year, facet = sex)
Plots a dumbbell chart.
    draw_dumbbell(
      data,
      x = NULL,
      y1 = NULL,
      y2 = NULL,
      topn = 20,
      sort = "insc",
      legend = NULL,
      cols = c("#006400", "gray", "#b32134"),
      gl = NULL,
      gl_col = c("gray"),
      main = ""
    )
data |
A data frame contains data to be plotted. |
x |
A category variable in data. |
y1 |
Variable indicate start point. |
y2 |
Variable indicate end point. |
topn |
Top n values to be plotted. |
sort |
Sort options. |
legend |
Legends. |
cols |
Colors of the start and end points. |
gl |
Integer. Indicating the line type of the grids. |
gl_col |
Color of the background grid. |
main |
Main title of the plot. |
A dumbbell plot.
    asr <- create_asr(canregs[[1]], year, cancer, show_ci = TRUE) |>
      drop_others() |>
      drop_total() |>
      add_labels(vars = "cancer", lang = "en", label_type = "abbr")
    draw_dumbbell(asr, "cancer_en", asr_lower_cn2000, asr_upper_cn2000, topn = 15)
This function draws a line chart from a data frame, optionally grouped by a categorical variable. It uses base R graphics and supports custom axis ticks, labels, and styles.
    draw_linechart(
      data,
      x,
      y,
      group = NULL,
      facet = NULL,
      grid = c(1, 1),
      x_axis = NULL,
      y_axis = NULL,
      x_label = NULL,
      y_label = NULL,
      axis_title = c("Age (years)", "Age specific rate"),
      cols = NULL,
      palette = "Peach",
      line_type = "l",
      lwd = 2,
      adj = 0.02,
      srt = 60,
      main = NULL,
      sub = NULL,
      legend_pos = c(0.05, 0.95),
      mar = c(1, 0, 1, 0),
      add = TRUE,
      offset = 0.01,
      ...
    )
data |
A data frame containing the variables to plot. |
x, y
|
Bare column names for the x and y axis variables. |
group |
Optional bare column name used to group and color lines. |
facet |
Optional bare column name used for faceting. If provided, the data will be split by this variable and plotted in a multi-panel layout. |
grid |
A vector of length 2 specifying number of rows and columns for facets. Default is c(1, 1). |
x_axis |
Optional numeric vector specifying x-axis tick locations. |
y_axis |
Optional numeric vector specifying y-axis tick locations. |
x_label, y_label
|
Optional labels for x and y axis ticks. If |
axis_title |
Character vector of length 2 giving the axis titles: c("x axis label", "y axis label"). |
cols |
Character vector of line colors. Defaults to c("darkgreen", "darkred", "gray"). |
palette |
Character, palette name indicate group of colors. |
line_type |
1-character string giving the type of plot desired. The following values are possible, for details, see plot: "p" for points, "l" for lines, "b" for both points and lines, "c" for empty points joined by lines, "o" for overplotted points and lines, "s" and "S" for stair steps and "h" for histogram-like vertical lines. Finally, "n" does not produce any points or lines. |
lwd |
Line width. Default is 2. |
adj |
Adjustment for axis text placement. Default is 0.02. |
srt |
String rotation angle for x-axis labels. Default is 60 degrees. |
main, sub
|
Main title and subtitle of the plot. |
legend_pos |
Position of the legend. |
mar |
Margin of the sub plot. |
add |
Logical. If |
offset |
Axis offset used for spacing ticks. Default is 0.01. |
... |
Additional arguments (currently unused). |
    data("canregs")
    fbsw <- count_canreg(canregs[[1]], label_tail = "yrs")
    agerate <- create_age_rate(fbsw, year, sex)
    agerate <- add_labels(agerate, lang = "en")
    draw_linechart(agerate, agegrp, rate, sex)

    agerate <- create_age_rate(fbsw, year, sex, cancer)
    agerate <- add_labels(agerate, lang = "en")
    agerate <- dplyr::filter(agerate, cancer %in% as.character(c(103:106)))
    draw_linechart(agerate, agegrp, rate, sex, cancer, grid = c(2, 2))
This function draws a population pyramid using either raw population numbers or proportions, displaying age groups in the center and population counts (e.g., by sex) on each side.
    draw_pyramid(
      data,
      x,
      y,
      group,
      facet = NULL,
      facet_label = NULL,
      grid = NULL,
      show_value = FALSE,
      show_prop = TRUE,
      left_axis = NULL,
      right_axis = NULL,
      left_label = NULL,
      right_label = NULL,
      cgap = 0.3,
      cstep = 1,
      csize = 1,
      labs = c("Males", "Ages", "Females"),
      gl = 2,
      cadj = 0,
      cols = c("#006400", "#b32134"),
      dens = c(-1, -1),
      main = "",
      ...
    )
data |
A data.frame containing the variables for age group ( |
x |
A variable indicating age groups (quoted or unquoted). |
y |
A variable indicating population counts (quoted or unquoted). |
group |
A grouping variable, typically representing sex (quoted or unquoted). |
facet |
Optional unquoted variable to facet the data by (e.g., year, region). A separate pyramid will be drawn for each unique value. |
facet_label |
Optional character vector of labels for each facet.
If |
grid |
Optional vector of two integers specifying the layout of the
facet grid (number of rows, number of columns). If |
show_value |
Logical. If TRUE, displays the actual population values beside the bars. Default is FALSE. |
show_prop |
Logical. If TRUE, the bars represent proportions (%) rather than absolute values. Default is TRUE. |
left_axis |
Numeric vector of tick marks for the left side
(e.g., males). If NULL, it will be generated using |
right_axis |
Numeric vector of tick marks for the right side (e.g.,
females). If NULL, it will use |
left_label |
Character vector to customize axis labels on the left side.
If NULL, generated using |
right_label |
Character vector for axis labels on the right side.
Same rules as |
cgap |
Numeric. Width of the central gap (relative to axis length). Default is 0.3. |
cstep |
Integer. Step interval between age group labels. Default is 1 (every label shown). |
csize |
Numeric. Scaling factor for text and lines. Default is 1. |
labs |
A character vector of three labels: left side (e.g., "Males"),
center (e.g., "Ages"), and right side (e.g., "Females"). Default is
|
gl |
Integer. Indicating the line type of the grids. |
cadj |
Numeric. Vertical adjustment for center age labels. Default is 0. |
cols |
A character vector of two colors for the left and right bars.
Default is |
dens |
A numeric vector indicating shading densities (lines per inch)
for bars. Use -1 to fill solid bars. Default is |
main |
A character string for the main plot title. Default is an empty string. |
... |
Additional graphical parameters passed to the base |
A base R graphics pyramid plot. It does not return a value.
    data("canregs")
    pop <- canregs[[1]]$POP
    draw_pyramid(pop, agegrp, rks, sex)
Filters out cases classified as "other" cancer types based on specific codes.
    drop_others(x)
x |
A data frame containing a |
A filtered data frame excluding non-specific cancer codes.
    data("canregs")
    asr <- create_quality(canregs, year, sex, cancer)
    asr2 <- asr |> drop_others()
Filters out the rows that summarize all cancers combined (totals) in the output of create_asr(),
create_quality(), or create_age_rate().
    drop_total(x)
x |
A data frame containing a |
A filtered data frame excluding cancer in c("60", "61").
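The effect can be sketched with plain base R subsetting on a made-up result table. This only illustrates the filter described above, dropping rows whose cancer code is "60" or "61", and is not the package's own implementation.

```r
# Made-up result table: codes "60" and "61" stand for the total rows
res <- data.frame(
  cancer = c("60", "61", "103", "104"),
  cr = c(250.1, 240.3, 12.5, 30.2),  # illustrative rates only
  stringsAsFactors = FALSE
)

# Keep every row whose cancer code is not one of the total codes
res_no_total <- res[!res$cancer %in% c("60", "61"), ]
print(res_no_total)
```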
    data("canregs")
    qua <- create_quality(canregs, year, sex, cancer)
    qua2 <- qua |> drop_total()
This function estimates a new fbswicd object from an existing fbswicd
object using total population data.
    esti_fbswicd(obj, pop = NULL)
obj |
An object with class of |
pop |
A population dataset to override the population in |
The estimation proceeds in the following steps:
Calculate age-specific incidence and mortality rates for different sex and cancer sites.
Derive estimated counts of fbs and sws using the
rate × population formula.
Recalculate proportions (e.g., mv, dco) back to
estimated case counts.
Join all information and return a new object of the same class as obj.
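The first three steps can be sketched with plain vectors for a single sex and cancer site. All numbers and variable names below are made up for illustration. The real function works on fbswicd objects and may differ in detail.

```r
# Made-up data for three age groups in one sex/site stratum
fbs <- c(10, 40, 90)              # observed incident cases
pop_obs <- c(10000, 8000, 6000)   # population that produced those cases
pop_new <- c(12000, 8000, 3000)   # population to estimate for
mv_prop <- 0.8                    # observed proportion microscopically verified

rate <- fbs / pop_obs             # step 1: age-specific rates
fbs_est <- rate * pop_new         # step 2: rate x population
mv_est <- fbs_est * mv_prop       # step 3: proportion back to case counts

print(data.frame(rate, fbs_est, mv_est))
```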
An estimated object with class of fbswicd.
create_age_rate(), cr_filter()
Estimates the population structure for each year in a period by interpolation.
    esti_pop(pop1, pop2, period)
pop1 |
Population or population proportion in each age group for the start year. |
pop2 |
Population or population proportion in each age group for the end year. |
period |
Vector containing the start year and the end year. |
A data frame containing the estimated population proportion for each year in the period, with one column per year and one row per age group.
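The documentation does not say which interpolation is used. As one possible reading, the sketch below interpolates linearly between the start-year and end-year values for each age group. The helper interp_pop() is hypothetical and returns a plain matrix rather than the data frame described above.

```r
# Hypothetical helper (not part of canregtools): linear interpolation by year.
# pop1, pop2: values per age group (length > 1) for the start and end year
# period:     c(start_year, end_year)
interp_pop <- function(pop1, pop2, period) {
  years <- seq(period[1], period[2])
  frac <- (years - period[1]) / (period[2] - period[1])
  # one column per year, one row per age group
  est <- sapply(frac, function(f) pop1 + (pop2 - pop1) * f)
  colnames(est) <- years
  est
}

# Two made-up age groups, interpolated from 2000 to 2010
res <- interp_pop(c(100, 200), c(200, 100), c(2000, 2010))
print(res[, c("2000", "2005", "2010")])
```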
    pop1 <- c(
      59546, 294129, 472511, 552549, 821119, 996436, 805635, 1004506, 989357,
      1056612, 986559, 792270, 544544, 452297, 473579, 350802, 212614, 109598,
      61990
    )
    pop2 <- c(
      75641, 377276, 327116, 380338, 539034, 1158852, 1152329, 881443, 903484,
      1011164, 1238871, 1137832, 1022787, 645441, 464777, 482941, 406144,
      227977, 144526
    )
    esti_pop(pop1, pop2, c(2000, 2010))
expand_age_pop transforms population data aggregated in age groups into
estimates for single-year ages. It utilizes interpolation methods to
distribute the grouped data across individual ages, ensuring consistency
with the original totals.
    expand_age_pop(x, method = "linear")
x |
A numeric vector representing the population counts for each age group. The vector should have 19 elements corresponding to the following age groups: 0, 1–4, 5–9, ..., 85+. |
method |
A character string specifying the interpolation method to use. Options include:
The default is |
A data frame with two columns:
x: Integer ages from 0 to 92.
y: Estimated population counts for each single-year age.
    # Example population data for 19 age groups: 0, 1–4, 5–9, ..., 85+
    ages <- c(
      5053, 17743, 25541, 32509, 30530, 34806, 36846, 38691, 40056, 39252,
      37349, 30507, 26363, 21684, 15362, 11725, 7461, 3260, 915
    )
    eages <- expand_age_pop(ages)
    head(eages)
expand_lx() transforms a five-year abridged life table into a
one-year complete life table using the Elandt–Johnson method.
    expand_lx(lx, sage = c(0, 1, seq(5, 85, 5)), max_age = 100)
lx |
A numeric vector representing the number of survivors ( |
sage |
Numeric vector of starting ages for the age groups,
default is |
max_age |
Numeric value specifying the max age of the output estimated lx(fitlx) and mx (fitmx). |
A list containing:
A numeric vector representing the estimated number of survivors at each single year of age from 0 to 100.
A numeric vector representing the estimated central death rates (mx) for each single year of age from 0 to 100.
Baili, P., Micheli, A., Montanari, A., & Capocaccia, R. (2005). Comparison of Four Methods for Estimating Complete Life Tables from Abridged Life Tables Using Mortality Data Supplied to EUROCARE-3. Mathematical Population Studies, 12(4), 183–198. https://doi.org/10.1080/08898480500301751
    # Example abridged life table data (normalized to a radix of 1)
    lx <- c(
      100000, 99498.39, 99294.62, 99173.88, 99047.59, 98840.46, 98521.16,
      98161.25, 97636.99, 96900.13, 95718.96, 93930.91, 91463.21, 87131.41,
      80525.02, 70907.59, 58090.75, 41630.48, 24019.33
    )
    lx <- lx / 100000
    expand_lx(lx)
Extracts population data from canreg-style objects and optionally summarizes it by specified grouping variables.
get_pop(data, sum_by = NULL, collapse = FALSE)

## S3 method for class 'canreg'
get_pop(data, sum_by = NULL, collapse = FALSE)

## S3 method for class 'canregs'
get_pop(data, sum_by = NULL, collapse = FALSE)

## S3 method for class 'fbswicd'
get_pop(data, sum_by = NULL, collapse = FALSE)

## S3 method for class 'fbswicds'
get_pop(data, sum_by = NULL, collapse = FALSE)

| data | An object of class canreg, canregs, fbswicd, or fbswicds. |
|---|---|
| sum_by | Character vector of grouping variables used to summarize the population. |
| collapse | Logical; controls whether the result is returned as a single data frame or as a list of data frames. |

A data frame or a list of data frames depending on collapse.
Retrieves standardized population data for a specified standard from dict_maps.
get_stdpop(std = "wld85", sep_zero = TRUE)

| std | Character string specifying the population standard. Supported values are "cn64", "cn82", "cn2000", "wld85", "wld2000". Defaults to "wld85". |
|---|---|
| sep_zero | Logical value indicating whether age 0 should be treated as a separate group. |

A vector or data structure containing the standardized population data for the specified standard, or NULL if the standard is not supported.
## Not run:
get_stdpop("cn64")
get_stdpop("wld2000")
## End(Not run)
Lists the names of the built-in attributes used in cr_reframe().
ls_attrs()
A character vector.
ls_attrs()
Lists a dictionary used in the package.
ls_dict(dict = "registry")

| dict | Character, name of the dictionary. |
|---|---|

A tibble containing the dictionary.
ls_dict("registry")
ls_dict("area_type")
Lists all dictionary .rds files stored in the user's config directory
for canregtools.
ls_dict_files(full.names = FALSE, with_info = FALSE)

| full.names | Logical. Whether to return full file paths. Default is FALSE. |
|---|---|
| with_info | Logical. Whether to return file size and last modified time. Default is FALSE. |

A character vector of file names, or a tibble with file info
if with_info = TRUE.
ls_dict_files()
ls_dict_files(full.names = TRUE)
ls_dict_files(with_info = TRUE)
This function returns a tibble containing variable names and their
descriptions based on the specified category.
ls_vars(type = "std")

| type | A character string specifying the type of variables to list. Values used in the examples are "std", "summary", and "reframe"; default is "std". |
|---|---|

A tibble with five columns:
code: The name of the variable.
cname: A detailed description of the variable in Chinese.
ename: A detailed description of the variable in English.
abbr_cn: An abbreviated description of the code label in Chinese.
abbr_en: An abbreviated description of the code label in English.
cr_reframe(), summary(), create_asr()
ls_vars("std")
ls_vars("summary")
ls_vars("reframe")
lt() constructs a life table based on a vector of age-specific mortality
rates (mx), starting ages of age groups, and specified sex. It calculates
standard life table columns including the probability of dying (qx),
number of survivors (lx), number of deaths (dx), person-years lived (Lx),
total person-years remaining (Tx), and life expectancy (ex).
It can also compute a cause-deleted life table if cancer_death is provided.
lt(
  death = NULL,
  cancer_death = NULL,
  pop = NULL,
  mx = NULL,
  sage = c(0, 1, seq(5, 85, 5)),
  sex = "total",
  cohort = 1e+05,
  qx_method = "constant"
)

| death | Number of deaths from vital statistics. |
|---|---|
| cancer_death | Number of cancer-related deaths. |
| pop | Average population size. |
| mx | Numeric vector of age-specific mortality rates. |
| sage | Numeric vector of starting ages for the age groups; default is c(0, 1, seq(5, 85, 5)). |
| sex | Character string specifying the sex: "male", "female", or "total" (default is "total"). |
| cohort | The size of the initial cohort (default is 100000). |
| qx_method | Character string specifying the method used to estimate the probability of dying between age x and x + n (qx); default is "constant". |

The function uses standard demographic formulas to compute life table values. The average number of person-years lived in the interval by those dying in the interval (ax) is estimated based on standard formulas for infant ages, and n/2 for other ages. The calculations assume a starting population (radix) of cohort.
If cancer_death is provided, a cause-deleted life table is computed by adjusting mx to exclude cancer deaths.
A data frame with the following columns:
Starting age of each age group.
Age-specific mortality rate (mx).
Probability of dying between age x and x+n (qx).
Number of survivors at exact age x (lx), starting from a radix of cohort.
Number of deaths between ages x and x+n (dx).
Person-years lived between ages x and x+n (Lx).
Total person-years remaining after age x (Tx).
Life expectancy at exact age x (ex).
Optionally includes pop, death, and cancer_death if provided.
pop <- c(
  3605201, 14795034, 41758253, 44202275, 44834666, 42184137,
  40868806, 43408408, 33111965, 16344101, 6336435
)
death <- c(
  19538, 2027, 9730, 37218, 71511, 104616,
  193514, 450972, 686178, 816715, 963781
)
cancer_death <- c(
  80, 328, 924, 1893, 2841, 10917,
  33965, 87945, 152843, 169826, 144958
)
lt(death = death, pop = pop, sage = c(0, 1, seq(5, 85, 10)))
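As a rough illustration of the formulas described for lt(), the following base R sketch builds an abridged life table from death and population counts. It is not the canregtools implementation: life_table_sketch is a hypothetical helper, ax is assumed to be n/2 in every closed interval (including infant ages, where lt() is described as using separate formulas), and the open-ended last interval is closed with Lx = lx / mx.

```r
# Hypothetical helper: abridged life table from age-specific mortality rates.
# Assumes ax = n / 2 in all closed intervals (a simplification).
life_table_sketch <- function(mx, sage, radix = 1e5) {
  k <- length(mx)
  n <- c(diff(sage), NA)               # interval widths; last interval is open
  ax <- n / 2                          # mean years lived by those who die
  qx <- n * mx / (1 + (n - ax) * mx)   # probability of dying in the interval
  qx[k] <- 1                           # everyone dies in the open interval
  lx <- radix * cumprod(c(1, 1 - qx[-k]))
  dx <- lx * qx
  Lx <- n * (lx - dx) + ax * dx
  Lx[k] <- lx[k] / mx[k]               # person-years in the open interval
  Tx <- rev(cumsum(rev(Lx)))           # person-years remaining after age x
  ex <- Tx / lx
  data.frame(age = sage, mx = mx, qx = qx, lx = lx,
             dx = dx, Lx = Lx, Tx = Tx, ex = ex)
}

pop <- c(3605201, 14795034, 41758253, 44202275, 44834666, 42184137,
         40868806, 43408408, 33111965, 16344101, 6336435)
death <- c(19538, 2027, 9730, 37218, 71511, 104616,
           193514, 450972, 686178, 816715, 963781)
sage <- c(0, 1, seq(5, 85, 10))

lt_out <- life_table_sketch(death / pop, sage)
lt_out
```

Because infant ax is simplified here, the values at the youngest ages would not be expected to match the output of lt() exactly.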
This dataset contains key quality indicators of population-based cancer registry (PBCR) data, as published in the China Cancer Registry Annual Report by the National Cancer Center of China.
quality
A data frame with quality indicators for various cancer types across different years and area types.
year: Calendar year of the PBCR data.
area_type: Area classification code: 910000 for urban areas, 920000 for rural areas.
cancer: Cancer type code.
mv: Proportion of morphologically verified (MV) cases.
dco: Proportion of cases identified through death certificates only (DCO).
mi: Mortality-to-incidence (MI) ratio.
China Cancer Registry Annual Report, National Cancer Center, China.
data("quality")
Reads cancer registry data from one or more Excel files or from
a directory containing Excel files. It extracts case and population data and
returns objects of class "canreg" or a list of such objects ("canregs").
read_canreg(
  x,
  pop_type = "long",
  age_var = "agegroup",
  pop_var = "popu",
  death_var = "death"
)

| x | A path to an Excel file, a character vector of file paths, or a directory containing Excel files. |
|---|---|
| pop_type | A character string specifying the format of the population sheet. Default is "long". |
| age_var | Name of the age group column (used only for a specific pop_type setting). Default is "agegroup". |
| pop_var | Name of the population count column (used only for a specific pop_type setting). Default is "popu". |
| death_var | Name of the death count column (used only for a specific pop_type setting). Default is "death". |

A canreg object (a list with components: areacode, FBcases,
SWcases, POP) if a single file is read. A named list of such objects
with class "canregs" if multiple files or a directory is provided.
## Not run:
file_address <- "410302.xlsx"
canreg <- read_canreg(file_address, pop_type = "long")
## End(Not run)
Query the area codes corresponding to a given registry type.
show_registry(regi_type = 1:4)

| regi_type | Numeric or character vector indicating registry types. Defaults to 1:4. |
|---|---|

A character vector of area codes.
show_registry(1:4)
show_registry(c("1", "2"))
Summarizes an object of class 'canreg' or 'canregs'.
## S3 method for class 'canreg'
summary(object, collapse = FALSE, ...)

## S3 method for class 'canregs'
summary(object, collapse = TRUE, ...)

| object | An object of class 'canreg' or 'canregs'. |
|---|---|
| collapse | Logical; whether to collapse the data. |
| ... | Other filter expressions. |

A data frame containing summary statistics of the canreg data.
data("canregs")
data <- canregs[[1]]
summary(data)
summary(canregs)
Parses age descriptions written in Chinese and converts them into numeric values expressed in years, months, or days. It interprets age strings containing Chinese characters such as 岁 (years) and 月 (months), together with the corresponding character for days, and converts them to a numeric vector representing age in the specified unit.
tidy_age(x, unit = "year")

| x | A character vector containing age descriptions in Chinese. |
|---|---|
| unit | A character string specifying the unit of the returned values. Options are "year", "month", or "day"; default is "year". |

A numeric vector representing ages in the specified unit:
year: Truncated age in years.
month: Truncated age in months.
day: Rounded age in days.
agedes <- c(
  "50\u5c8110\u67083\u6708", "19\u5c815\u6708", "1\u5c8130\u6708",
  "3\u670820\u6708", "30\u6708"
)
tidy_age(agedes, unit = "year")
tidy_age(agedes, unit = "month")
tidy_age(agedes, unit = "day")
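To make the parsing idea concrete, here is a minimal base R sketch that pulls the number in front of a unit marker out of each string and combines the parts into months and truncated years. It is only an illustration and not how tidy_age() is implemented: grab_number and age_in_months are hypothetical helpers, only the year marker (\u5c81) and month marker (\u6708) are handled, and each marker is assumed to appear at most once per string.

```r
# Hypothetical helper: number that directly precedes `marker`, or 0 if absent.
grab_number <- function(x, marker) {
  hit <- grepl(paste0("[0-9]+", marker), x)
  out <- numeric(length(x))
  out[hit] <- as.numeric(
    sub(paste0(".*?([0-9]+)", marker, ".*"), "\\1", x[hit], perl = TRUE)
  )
  out
}

# Hypothetical helper: total age in months from the year and month parts.
age_in_months <- function(x) {
  grab_number(x, "\u5c81") * 12 + grab_number(x, "\u6708")
}

ages <- c("50\u5c8110\u6708", "19\u5c815\u6708", "30\u6708")
months <- age_in_months(ages)   # 610 233 30
years <- months %/% 12          # truncated age in years: 50 19 2
```

Converting everything to the smallest handled unit first and truncating afterwards keeps inputs such as "30 months" consistent with inputs that also state years.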
Standardizes gender-related values into consistent numeric codes or factors.
This function maps various gender-related character strings (e.g., "male",
"female", "man", "woman", "1", "2", etc.) to standardized numeric values:
1 for male, 2 for female, and 0 for total. It supports both Chinese
and English labels. Optionally, the result can be returned as a factor with
appropriate labels.
tidy_sex(x, lang = "cn", as_factor = FALSE)

| x | A character or numeric vector containing gender information. |
|---|---|
| lang | Character; specifies the output language. Options are 'cn' or 'en'; default is 'cn'. |
| as_factor | Logical; whether to return the output as a factor. |

A numeric vector or a factor representing gender:
0: Total
1: Male
2: Female
If as_factor = TRUE, a factor is returned with labels in the specified
language (lang).
gender <- c("male", "men", "women", "female", "women", "man", "1", "2")
tidy_sex(gender)
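The mapping described above can be pictured as a lookup table. The base R sketch below is an assumption-laden illustration, not the tidy_sex() implementation: sex_code_sketch is a hypothetical helper, it covers only a handful of English labels, and unrecognized inputs become NA.

```r
# Hypothetical helper: map a few English sex labels to codes 0, 1, 2.
sex_code_sketch <- function(x) {
  key <- tolower(trimws(as.character(x)))
  lookup <- c(
    male = 1, man = 1, men = 1, "1" = 1,
    female = 2, woman = 2, women = 2, "2" = 2,
    total = 0, "0" = 0
  )
  unname(lookup[key])   # names not found in the lookup yield NA
}

gender <- c("male", "men", "women", "female", "women", "man", "1", "2")
codes <- sex_code_sketch(gender)   # 1 1 2 2 2 1 1 2

# Optional factor form with English labels
sex_factor <- factor(codes, levels = c(0, 1, 2),
                     labels = c("Total", "Male", "Female"))
```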
Standardizes and labels values of a specified variable according to the national cancer registration standard of China: T/CHIA 18-2021.
tidy_var(
  x,
  var_name = "occu",
  label_type = "full",
  lang = "code",
  sep = "",
  as_factor = FALSE
)

| x | A character vector containing raw values of a variable used in cancer registry data. |
|---|---|
| var_name | A character string indicating the name of the variable to reformat (e.g., "occu"). |
| label_type | Type of the label used ("full" or "abbr"). |
| lang | Character; specifies the output type. Options are "cn", "en", "code", or "icd10"; default is "code". |
| sep | A character string used to separate the label. |
| as_factor | Logical; whether to return the output as a factor. |

tidy_var() converts raw character inputs into standardized labels, codes,
or abbreviations based on reference mappings defined for each variable (e.g.,
occupation, basis of diagnosis, etc.). It supports both Chinese and English
outputs and can return values as factors with labeled levels.
A character or factor vector of reformatted values. The output
depends on the settings for label_type, lang, and as_factor:
If as_factor = FALSE, returns a character vector.
If as_factor = TRUE, returns a factor with sorted unique levels.
The labels used depend on lang ("cn", "en", "code", or "icd10")
and label_type ("full" or "abbr").
occu <- c("11", "13", "17", "21", "24", "27", "31", "37", "51", "80", "90")
tidy_var(occu, var_name = "occu", lang = "cn")
tidy_var(occu, var_name = "occu", lang = "en")
tidy_var(occu, var_name = "occu", lang = "cn", label_type = "abbr")
tidy_var(occu, var_name = "occu", lang = "en", label_type = "abbr")
Calculates the truncated age-standardized rate (ASR) over a specified age range (e.g., 35–64 years). It uses the direct method of standardization by applying age-specific rates to a standard population within the truncated age group. This is particularly useful when comparing disease rates in middle-aged or other focused subgroups.
truncrate(
  cases,
  pop,
  stdpop = NULL,
  trunc_age = c(35, 64),
  agewidth = 5,
  sep_zero = TRUE,
  mp = 100,
  decimal = 2
)

| cases | Number of cases. |
|---|---|
| pop | Size of the population at risk. |
| stdpop | The standard population. |
| trunc_age | The truncated age range. |
| agewidth | Width of the age groups; default is 5. |
| sep_zero | Logical; whether the age 0 group is a separate group. |
| mp | A multiplier used to scale the calculated rates. Default is 100. |
| decimal | Number of decimal places in the calculated rates; default is 2. |

Truncated age standardized rate.
px <- c(
  20005, 86920, 102502, 151494, 182932, 203107, 240289, 247076, 199665,
  163820, 145382, 86789, 69368, 51207, 39112, 20509, 12301, 6586, 1909
)
dx <- c(
  156, 58, 47, 49, 48, 68, 120, 162, 160, 294, 417, 522, 546, 628, 891,
  831, 926, 731, 269
)
stdpop <- c(2.4, 9.6, 10, 9, 9, 8, 8, 6, 6, 6, 6, 5, 4, 4, 3, 2, 1, 0.5, 0.5)
truncrate(dx, px, stdpop, trunc_age = c(35, 64))
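The direct method described for truncrate() can be written out by hand: compute age-specific rates inside the truncated range and weight them by the standard population, rescaled to sum to 1 within that range. The base R sketch below is illustrative only and not the package's implementation: trunc_asr_sketch is a hypothetical helper, the multiplier of 100000 is an arbitrary choice for the illustration, and the age groups are selected by position rather than through trunc_age.

```r
# Hypothetical helper: direct truncated age-standardized rate.
# idx gives the positions of the age groups inside the truncated range.
trunc_asr_sketch <- function(cases, pop, stdpop, idx, mp = 100000) {
  rates <- cases[idx] / pop[idx]        # age-specific rates in the range
  w <- stdpop[idx] / sum(stdpop[idx])   # standard weights rescaled to sum to 1
  sum(rates * w) * mp
}

px <- c(20005, 86920, 102502, 151494, 182932, 203107, 240289, 247076, 199665,
        163820, 145382, 86789, 69368, 51207, 39112, 20509, 12301, 6586, 1909)
dx <- c(156, 58, 47, 49, 48, 68, 120, 162, 160, 294, 417, 522, 546, 628, 891,
        831, 926, 731, 269)
stdpop <- c(2.4, 9.6, 10, 9, 9, 8, 8, 6, 6, 6, 6, 5, 4, 4, 3, 2, 1, 0.5, 0.5)

# With age 0 as a separate group, positions 9 to 14 are 35-39, ..., 60-64
idx <- 9:14
asr_35_64 <- trunc_asr_sketch(dx, px, stdpop, idx)
asr_35_64
```

Because the weights are rescaled within the range, the result always lies between the smallest and largest age-specific rate of the selected groups.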
Saves user-defined administrative division codes and their associated labels
(in both Chinese and English) to the local dictionary used by canregtools.
write_areacode(x = NULL, cache_refresh = FALSE)

| x | A data frame containing at least the columns used in the example: areacode, cname, ename, abbr_cn, and abbr_en. |
|---|---|
| cache_refresh | Logical. If TRUE, refresh the dictionary to default values before updating. |

Invisibly returns NULL. The function is called for its side effect.
## Not run:
dict <- data.frame(
  areacode = c("410302"),
  cname = c("\u8001\u57CE\u533A"),
  ename = c("Laocheng District"),
  abbr_cn = c("\u8001\u57CE"),
  abbr_en = c("Laocheng")
)
write_areacode(dict)
## End(Not run)
Stores the mapping between six-digit administrative division codes (areacode) and their corresponding attributes (e.g., registry names or area types) in a dictionary file. This supports consistent labeling and downstream processing of cancer registry data.
write_registry(x = NULL, dict = "registry", cache_refresh = FALSE, quiet = TRUE)

## S3 method for class 'data.frame'
write_registry(x = NULL, dict = "registry", cache_refresh = FALSE, quiet = TRUE)

## S3 method for class 'list'
write_registry(x = NULL, dict = "registry", cache_refresh = FALSE, quiet = TRUE)

## S3 method for class 'NULL'
write_registry(x = NULL, dict = "registry", cache_refresh = FALSE, quiet = TRUE)

## S3 method for class 'character'
write_registry(x = NULL, dict = "registry", cache_refresh = FALSE, quiet = TRUE)

| x | A named list, a named character vector, or a data frame that includes an areacode column and a column holding the attribute values (for example registry or area_type). |
|---|---|
| dict | A character string specifying the type of dictionary to update. Supported values are "registry" and "area_type". |
| cache_refresh | If TRUE, refresh the dictionary to default values before updating. |
| quiet | If TRUE, messages are suppressed. |

This function allows users to build or update local mapping dictionaries for area-level attributes. It supports multiple input formats and updates internal files saved in the user-specific R cache directory.
A tibble with two columns: areacode and the corresponding
attribute values.
write_registry(list('410302' = '410301'))

# Registry attributes stored in data frame
registry_dict <- data.frame(
  areacode = c("410302", "410303", "410304", "410305", "410306", "410307"),
  registry = c(rep("410301", 5), "410300"),
  area_type = rep("urban", 6)
)
write_registry(registry_dict)

# Registry attributes stored in list
dict <- list(
  '410302' = '410301',
  '410303' = '410301',
  '410304' = '410301',
  '410305' = '410301',
  '410306' = '410301',
  '410307' = '410301'
)
write_registry(dict)

# Registry attributes using built-in information
dict <- NULL
write_registry(dict)

# Registry attributes stored in named character vector with areacode as
# name and attributes as values
dict <- rep("410301", 5)
names(dict) <- c("410302", "410303", "410304", "410305", "410306")
write_registry(dict)