Last updated: 2022-03-16

Checks: 5 passed, 2 warnings

Knit directory: codemapper/

This reproducible R Markdown analysis was created with workflowr (version 1.6.2). The Checks tab describes the reproducibility checks that were applied when the results were created. The Past versions tab lists the development history.


The R Markdown file has unstaged changes. To know which version of the R Markdown file created these results, you’ll want to first commit it to the Git repo. If you’re still working on the analysis, you can ignore this warning. When you’re finished, you can run wflow_publish to commit the R Markdown file and build the HTML.

The global environment had objects present when the code in the R Markdown file was run. These objects can affect the analysis in your R Markdown file in unknown ways. For reproducibility, it’s best to always run the code in an empty environment. Use wflow_publish or wflow_build to ensure that the code is always run in an empty environment.

The following objects were defined in the global environment when these results were created:

Name Class Size
install_codemapper function 1.2 Kb

The command set.seed(20210923) was run prior to running the code in the R Markdown file. Setting a seed ensures that any results that rely on randomness, e.g. subsampling or permutations, are reproducible.

Great job! Recording the operating system, R version, and package versions is critical for reproducibility.

Nice! There were no cached chunks for this analysis, so you can be confident that you successfully produced the results during this run.

Great job! Using relative paths to the files within your workflowr project makes it easier to run your code on other machines.

Great! You are using Git for version control. Tracking code development and connecting the code version to the results is critical for reproducibility.

The results in this page were generated with repository version b425304. See the Past versions tab to see a history of the changes made to the R Markdown and HTML files.

Note that you need to be careful to ensure that all relevant files for the analysis have been committed to Git prior to generating the results (you can use wflow_publish or wflow_git_commit). workflowr only checks the R Markdown file, but you know if there are other scripts or data files that it depends on. Below is the status of the Git repository when the results were generated:


Ignored files:
    Ignored:    .Renviron
    Ignored:    .Rhistory
    Ignored:    .Rproj.user/
    Ignored:    _targets/
    Ignored:    all_lkps_maps.db
    Ignored:    all_lkps_maps.db.gz
    Ignored:    renv/library/
    Ignored:    renv/staging/
    Ignored:    tar_make.R

Unstaged changes:
    Modified:   R/clinical_codes.R
    Modified:   R/lookups_and_mappings.R
    Modified:   R/utils.R
    Modified:   _targets.R
    Modified:   analysis/read2_icd10_mapping.Rmd
    Modified:   analysis/read3_icd10_mapping.Rmd

Note that any generated files, e.g. HTML, png, CSS, etc., are not included in this status report because it is ok for generated content to have uncommitted changes.


These are the previous versions of the repository in which changes were made to the R Markdown (analysis/read3_icd10_mapping.Rmd) and HTML (public/read3_icd10_mapping.html) files. If you’ve configured a remote Git repository (see ?wflow_git_remote), click on the hyperlinks in the table below to view the files as they were in that past version.

File Version Author Date Message
Rmd b425304 rmgpanw 2022-03-15 add code and tests to reformat read2 to icd10 mapping table
Rmd a81c1e7 rmgpanw 2022-03-14 add notes; add filter_cols function plus tests (all passing)
html e5c5381 rmgpanw 2022-03-10 update mainly icd10-related codes.
Rmd ae02335 Chuin Ying Ung 2022-02-22 update notes for read3_icd10
Rmd dfbc621 Chuin Ying Ung 2022-02-22 add mapping notes

library(tidyverse)
library(reactable)
library(readxl)
library(crosstalk)
library(targets)
library(codemapper)
library(flextable)

all_lkps_maps <- tar_read(all_lkps_maps_raw) %>% 
  purrr::map(codemapper:::rm_footer_rows_all_lkps_maps_df) %>%
  purrr::map(~ tibble::rowid_to_column(.data = .x,
                                       var = ".rowid"))

read_ctv3_icd10 <- all_lkps_maps$read_ctv3_icd10
icd10_lkp <- all_lkps_maps$icd10_lkp
# utility functions
append_read_icd10_descriptions <- function(df,
                                            read_type = "read3") {
  match.arg(read_type,
            c("read2", "read3"))
  
  # get read and icd10 descriptions
  read_df <- df$read_code %>%
    lookup_codes(code_type = read_type, 
                 unrecognised_codes = "warning") %>%
    select(read_code = code,
           description_read3 = description)
  
  icd10_df <- df$icd10_code %>%
    lookup_codes(code_type = "icd10",
                 unrecognised_codes = "warning") %>%
    select(icd10_code = code,
           description_icd10 = description)
  
  # append descriptions
  list(df,
       read_df,
       icd10_df) %>%
    reduce(full_join) %>%
    select(contains("read"),
           contains("icd10"),
           everything())
}

Key points

Unrecognised ICD10 codes

[1] TRUE

All Read 3 codes in the read_ctv3_icd10 mapping table are present in the read_ctv3_lkp lookup table.

# identify unrecognised ICD10 codes
unrecognised_icd10_codes <- subset(read_ctv3_icd10$icd10_code,
       !read_ctv3_icd10$icd10_code %in% icd10_lkp$ALT_CODE) %>% 
  unique()

There are 919 ICD10 codes in the read_ctv3_icd10 mapping table that are not present in the icd10_lkp table (ALT_CODE format).

[1] TRUE

These are all either ‘asterisk’ or ‘dagger’ codes, which have been appended with ‘A’ or ‘D’ respectively.

[1] TRUE

After removing ‘A’ or ‘D’ from the ICD10 codes in mapping table read_ctv3_icd10, all ICD10 codes are also present in lookup table icd10_lkp.
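
The checks summarised above can be sketched roughly as follows. This is a minimal illustration on made-up codes, not the hidden chunk that produced the `[1] TRUE` outputs: `mapping_codes` and `lkp_alt_codes` are hypothetical stand-ins for read_ctv3_icd10$icd10_code and icd10_lkp$ALT_CODE, and the sketch assumes the marker is always a single trailing ‘A’ or ‘D’.

```r
# Hypothetical stand-ins for read_ctv3_icd10$icd10_code and icd10_lkp$ALT_CODE
mapping_codes <- c("Z831", "G01XA", "A170D", "E109")
lkp_alt_codes <- c("Z831", "G01X", "A170", "E109")

# codes in the mapping table that are absent from the lookup table
unrecognised <- unique(mapping_codes[!mapping_codes %in% lkp_alt_codes])

# assumption: the marker is a single trailing 'A' (asterisk) or 'D' (dagger)
has_marker <- grepl("[AD]$", unrecognised)
stripped <- sub("[AD]$", "", unrecognised)

# TRUE only if every unrecognised code carries a marker and is found in the
# lookup table once that marker is removed
all_recognised <- all(has_marker) && all(stripped %in% lkp_alt_codes)
all_recognised
```

Anchoring the pattern with `$` means only the final character is removed, so an ‘A’ or ‘D’ in the first position of a code is left untouched.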

Mapping status and refine flag

The Read 3 to ICD10 mapping table has a number of columns with additional information:

read_ctv3_icd10 %>% 
  head() %>% 
  select(-.rowid) %>% 
  knitr::kable(caption = "First few rows of the Read 3 to ICD10 mapping table")
First few rows of the Read 3 to ICD10 mapping table
read_code icd10_code mapping_status refine_flag add_code_flag element_num block_num
123.. Z831 D C M 0 0
123.. Z830 A C M 0 0
1231. Z831 G C M 0 0
1232. Z831 G C M 0 0
1233. Z831 G C M 0 0
1244. Z848 G C M 0 0

These columns are described in UK Biobank resource 592. The following tables show how refine_flag varies by mapping_status. Mappings with mapping_status ‘E’, ‘D’ and ‘G’ have the lowest proportion of Read 3 codes labelled with refine_flag ‘M’ (refine_flag ‘M’ means that it is mandatory to check the mapping against the default).

Counts:

refine_flag_by_mapping_status <- read_ctv3_icd10 %>% 
  distinct(mapping_status) %>% 
  filter(!is.na(mapping_status)) %>% 
  pull(mapping_status) %>% 
  set_names() %>% 
  map(~ {
    read_ctv3_icd10 %>% 
        filter(mapping_status == .x) %>% 
        group_by(refine_flag) %>% 
      summarise(n_unique_read3 = length(unique(read_code)))
    }) %>% 
  bind_rows(.id = "mapping_status") %>% 
  mutate(refine_flag = paste0("refine_flag_", refine_flag)) %>% 
  pivot_wider(names_from = refine_flag,
              values_from = n_unique_read3) %>% 
  mutate(Total_unique_read3 = rowSums(across(refine_flag_C:refine_flag_P)))

knitr::kable(refine_flag_by_mapping_status)
mapping_status refine_flag_C refine_flag_M refine_flag_P Total_unique_read3
D 27626 708 1950 30284
A 2891 8671 103 11665
G 31066 1916 1192 34174
R 5859 3598 268 9725
E 3660 41 3157 6858

Percentages:

refine_flag_by_mapping_status_pct <- refine_flag_by_mapping_status %>% 
  mutate(across(refine_flag_C:refine_flag_P,
                ~ round(.x / Total_unique_read3 * 100, digits = 3)))

knitr::kable(refine_flag_by_mapping_status_pct)
mapping_status refine_flag_C refine_flag_M refine_flag_P Total_unique_read3
D 91.223 2.338 6.439 30284
A 24.784 74.333 0.883 11665
G 90.905 5.607 3.488 34174
R 60.247 36.997 2.756 9725
E 53.368 0.598 46.034 6858
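
As a quick arithmetic check, the percentages can be recomputed from the counts table in base R. This is only a sketch: the numbers are copied by hand from the counts shown above, and the object names (`counts`, `pct`, `ranked`) are illustrative rather than taken from the analysis.

```r
# Counts copied from the table above (unique Read 3 codes per refine_flag)
counts <- data.frame(
  mapping_status = c("D", "A", "G", "R", "E"),
  refine_flag_C = c(27626, 2891, 31066, 5859, 3660),
  refine_flag_M = c(708, 8671, 1916, 3598, 41),
  refine_flag_P = c(1950, 103, 1192, 268, 3157),
  stringsAsFactors = FALSE
)

flag_cols <- c("refine_flag_C", "refine_flag_M", "refine_flag_P")

# row totals, then each count as a percentage of its row total
counts$Total_unique_read3 <- rowSums(counts[, flag_cols])
pct <- round(counts[, flag_cols] / counts$Total_unique_read3 * 100, digits = 3)

# mapping statuses ordered from lowest to highest share of refine_flag 'M'
ranked <- counts$mapping_status[order(pct$refine_flag_M)]
ranked
```

Ordering by the ‘M’ percentage puts ‘E’, ‘D’ and ‘G’ first, which matches the statement that these three mapping statuses have the lowest proportion flagged ‘M’.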

Mappings are categorised under the following refine_flag labels:

refine_flag Description
C Completely refined
M Mandatory to refine further
P Possible but not mandatory to refine further

The refine_flag denotes whether or not the target code is sufficiently detailed to be acceptable. For example, 3-character ICD codes are usually not acceptable. The flag covers the addition of 4th and 5th digit extensions in ICD, and of the 4th character in OPCS-4.


# get all possible combinations of `mapping_status`, `refine_flag` and
# `add_code_flag`
mapping_metadata_vars <- c("mapping_status",
  "refine_flag",
  "add_code_flag")

mapping_metadata_vars_combinations <- mapping_metadata_vars %>% 
  set_names() %>% 
  
  # get unique values for each variable
  map(~ read_ctv3_icd10 %>% 
        filter(!is.na(.data[[.x]])) %>% 
        pull(.data[[.x]]) %>% 
               unique()) %>% 
  
  # generate all possible combinations
  crossing(!!!.) %>% 
  
  # perform filtering join for only combinations that are actually present in
  # mapping table
  semi_join(read_ctv3_icd10,
            by = mapping_metadata_vars)

# convert this to a named list (where names describe the combination)
mapping_metadata_vars_combinations_list <- mapping_metadata_vars_combinations %>% 
  unite(col = "combination_label",
        everything(),
        remove = FALSE) %>% 
  mutate(combination_label = paste0("mapping_status/refine_flag/add_code_flag: ",
                                    combination_label))

mapping_metadata_vars_combinations_list <-
  split(
    mapping_metadata_vars_combinations_list,
    mapping_metadata_vars_combinations_list$combination_label
  ) %>%
  map(~ select(.x, -combination_label))
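
Because only combinations that actually occur in the mapping table are kept, the same set could in principle also be obtained by taking the distinct rows of the three metadata columns directly. The following base R sketch illustrates this on a made-up data frame; `toy_map` and its values are hypothetical and the chunk is not part of the analysis pipeline.

```r
# Hypothetical toy mapping table (values are illustrative only)
toy_map <- data.frame(
  read_code = c("a", "a", "b", "c", "d"),
  mapping_status = c("D", "A", "D", "G", NA),
  refine_flag = c("C", "M", "C", "C", "P"),
  add_code_flag = c("M", "M", "M", "M", "M"),
  stringsAsFactors = FALSE
)

metadata_vars <- c("mapping_status", "refine_flag", "add_code_flag")

# keep rows where all three metadata columns are non-missing, then de-duplicate
complete_rows <- stats::complete.cases(toy_map[, metadata_vars])
observed_combinations <- unique(toy_map[complete_rows, metadata_vars])
rownames(observed_combinations) <- NULL

# one label per combination, similar in spirit to the unite() step
combination_labels <- do.call(paste, c(observed_combinations, sep = "_"))
combination_labels
```

Rows with a missing value in any of the three columns are dropped first, mirroring the `!is.na()` filter applied before the combinations are generated.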

Here are example Read 3 codes for each observed combination of mapping_status, refine_flag and add_code_flag:

# df of example mappings
mapping_metadata_vars_combinations_list_example_codes <-
  mapping_metadata_vars_combinations_list %>%
  map( ~ right_join(.x,
                    read_ctv3_icd10,
                    by = mapping_metadata_vars) %>%
         head(1)) %>%
  bind_rows() %>%
  pull(read_code)

mapping_metadata_vars_combinations_list_examples <-
  read_ctv3_icd10 %>%
  filter(read_code %in% mapping_metadata_vars_combinations_list_example_codes) %>%
  codemapper:::reformat_read_ctv3_icd10() %>%
  select(-.rowid) %>%
  append_read_icd10_descriptions() %>%
  bind_rows()

# crosstalk reactable table
mapping_metadata_vars_combinations_list_examples_crosstalk <-
  SharedData$new(mapping_metadata_vars_combinations_list_examples)

bscols(
  widths = c(3, 9),
  list(
    crosstalk::filter_select(
      "read_code",
      "Read code",
      mapping_metadata_vars_combinations_list_examples_crosstalk,
      ~ read_code
    ),
    crosstalk::filter_checkbox(
      "mapping_status",
      "Mapping status",
      mapping_metadata_vars_combinations_list_examples_crosstalk,
      ~ mapping_status
    ),
    crosstalk::filter_checkbox(
      "refine_flag",
      "Refine flag",
      mapping_metadata_vars_combinations_list_examples_crosstalk,
      ~ refine_flag
    ),
    crosstalk::filter_checkbox(
      "add_code_flag",
      "Add code flag",
      mapping_metadata_vars_combinations_list_examples_crosstalk,
      ~ add_code_flag
    )
  ),
  reactable(
    mapping_metadata_vars_combinations_list_examples_crosstalk,
    filterable = TRUE,
    searchable = TRUE,
    showPageSizeOptions = TRUE,
    pageSizeOptions = c(5, 25, 50, 100),
    defaultPageSize = 5,
    resizable = TRUE,
    paginationType = 'jump'
  )
)

Examples of non-specific Read 3 to sex-specific ICD10 code mappings

The Read 3 code ‘XaIP9’ (‘sebaceous cyst’) maps to various ICD10 codes, including some that are sex-specific. A similar issue arises for Read 3 code ‘M262.’.

# df of example mappings
non_specific_read3_to_sex_specific_icd10_examples <-
  read_ctv3_icd10 %>%
  filter(read_code %in% c('XaIP9', 'M262.')) %>%
  codemapper:::reformat_read_ctv3_icd10() %>%
  select(-.rowid) %>%
  append_read_icd10_descriptions()

# crosstalk reactable table
non_specific_read3_to_sex_specific_icd10_examples_crosstalk <-
  SharedData$new(non_specific_read3_to_sex_specific_icd10_examples)

bscols(
  widths = c(3, 9),
  list(
    crosstalk::filter_checkbox(
      "read_code",
      "Read code",
      non_specific_read3_to_sex_specific_icd10_examples_crosstalk,
      ~ read_code
    ),
    crosstalk::filter_checkbox(
      "mapping_status",
      "Mapping status",
      non_specific_read3_to_sex_specific_icd10_examples_crosstalk,
      ~ mapping_status
    ),
    crosstalk::filter_checkbox(
      "refine_flag",
      "Refine flag",
      non_specific_read3_to_sex_specific_icd10_examples_crosstalk,
      ~ refine_flag
    ),
    crosstalk::filter_checkbox(
      "add_code_flag",
      "Add code flag",
      non_specific_read3_to_sex_specific_icd10_examples_crosstalk,
      ~ add_code_flag
    )
  ),
  reactable(
    non_specific_read3_to_sex_specific_icd10_examples_crosstalk,
    filterable = TRUE,
    searchable = TRUE,
    showPageSizeOptions = TRUE,
    pageSizeOptions = c(5, 25, 50, 100),
    defaultPageSize = 5,
    resizable = TRUE,
    paginationType = 'jump'
  )
)

Example: type 2 diabetes

The description for Read 3 code ‘C10..’ is ‘Diabetes mellitus’. This is an example of a non-specific Read 3 code that can potentially map to multiple, more specific ICD10 codes. Note that the default mapping (flagged as ‘D’ under mapping_status) is the most appropriate.

read_ctv3_icd10 %>%
  filter(read_code == 'C10..') %>%
  codemapper:::reformat_read_ctv3_icd10() %>%
  select(-.rowid) %>%
  append_read_icd10_descriptions() %>% 
  select(read_code:add_code_flag,
         -icd10_dagger_asterisk) %>% 
  flextable()
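
If only the default mapping is wanted for each Read 3 code, one option is to keep the rows flagged ‘D’. The sketch below shows the idea on a made-up table: `toy_map`, its codes and the use of ‘A’ for alternative mappings are assumptions for illustration, not values from read_ctv3_icd10.

```r
# Hypothetical toy mapping table: 'D' = default; 'A' is assumed here to mean
# an alternative mapping
toy_map <- data.frame(
  read_code = c("R1", "R1", "R1", "R2"),
  icd10_code = c("X10", "X11", "X12", "Y20"),
  mapping_status = c("A", "D", "A", "D"),
  stringsAsFactors = FALSE
)

# %in% (rather than ==) so that a missing mapping_status drops the row
# instead of producing an NA row
is_default <- toy_map$mapping_status %in% "D"
default_map <- toy_map[is_default, c("read_code", "icd10_code")]
rownames(default_map) <- NULL

# number of non-default mappings set aside per Read code
n_non_default <- table(toy_map$read_code[!is_default])
default_map
```

Whether restricting to default mappings is appropriate depends on the use case; for a broad code such as ‘C10..’ the non-default rows may still be worth reviewing.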

Compare this with the Read 2 to ICD10 mapping, which does not label mappings as ‘default’/‘alternative’/‘requires checking’ but simply maps to all ICD10 codes ‘E10-E14’:

all_lkps_maps$read_v2_icd10 %>%
  filter(read_code == 'C10..') %>%
  codemapper:::reformat_read_v2_icd10(icd10_lkp = all_lkps_maps$icd10_lkp) %>%
  select(-.rowid) %>%
  append_read_icd10_descriptions(read_type = "read2") %>% 
  select(-icd10_dagger_asterisk) %>% 
  flextable()

sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 10.16

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRblas.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
 [1] flextable_0.6.10      codemapper_0.0.0.9000 ukbwranglr_0.0.0.9000
 [4] targets_0.8.0         crosstalk_1.1.1       readxl_1.3.1         
 [7] reactable_0.2.3       forcats_0.5.1         stringr_1.4.0        
[10] dplyr_1.0.7           purrr_0.3.4           readr_2.0.2          
[13] tidyr_1.1.4           tibble_3.1.4          ggplot2_3.3.5        
[16] tidyverse_1.3.1       workflowr_1.6.2      

loaded via a namespace (and not attached):
 [1] fs_1.5.0          bit64_4.0.5       lubridate_1.7.10  httr_1.4.2       
 [5] rprojroot_2.0.2   tools_4.1.0       backports_1.2.1   bslib_0.3.1      
 [9] utf8_1.2.2        R6_2.5.1          lazyeval_0.2.2    DBI_1.1.1        
[13] colorspace_2.0-2  withr_2.4.3       tidyselect_1.1.1  processx_3.5.2   
[17] bit_4.0.4         compiler_4.1.0    git2r_0.28.0      cli_3.0.1        
[21] rvest_1.0.1       xml2_1.3.2        officer_0.4.1     sass_0.4.0       
[25] scales_1.1.1      callr_3.7.0       systemfonts_1.0.4 digest_0.6.28    
[29] rmarkdown_2.11    base64enc_0.1-3   pkgconfig_2.0.3   htmltools_0.5.2  
[33] dbplyr_2.1.1      fastmap_1.1.0     highr_0.9         htmlwidgets_1.5.4
[37] rlang_0.4.11      RSQLite_2.2.9     rstudioapi_0.13   shiny_1.7.0      
[41] jquerylib_0.1.4   generics_0.1.0    jsonlite_1.7.2    zip_2.2.0        
[45] magrittr_2.0.1    Rcpp_1.0.7        munsell_0.5.0     fansi_0.5.0      
[49] gdtools_0.2.4     lifecycle_1.0.1   stringi_1.7.4     whisker_0.4      
[53] yaml_2.2.1        blob_1.2.2        grid_4.1.0        promises_1.2.0.1 
[57] crayon_1.4.1      haven_2.4.3       hms_1.1.1         knitr_1.34       
[61] ps_1.6.0          pillar_1.6.3      uuid_0.1-4        igraph_1.2.6     
[65] codetools_0.2-18  reprex_2.0.1      glue_1.4.2        evaluate_0.14    
[69] data.table_1.14.2 renv_0.13.2       modelr_0.1.8      vctrs_0.3.8      
[73] tzdb_0.1.2        httpuv_1.6.3      cellranger_1.1.0  gtable_0.3.0     
[77] reactR_0.4.4      assertthat_0.2.1  cachem_1.0.6      xfun_0.24        
[81] mime_0.12         xtable_1.8-4      broom_0.7.9       later_1.3.0      
[85] memoise_2.0.0     ellipsis_0.3.2