Inclusion rules - how to interpret and use them #616

gowthamrao · 2021-10-12T23:55:11Z

There are three inclusion rule tables, plus a fourth derived table described at the end:

cohort_inclusion = "#cohort_inclusion"

Holds the rule names and descriptions for each cohort. Useful for knowing which rule we are looking at.

cohort_inclusion_stats = "#cohort_inc_stats"

Has person_count, gain_count, person_total

select *
from cohort_inclusion_stats
where cohort_definition_id = 2511

would give something like this:

mode_id = 0 means all events; mode_id = 1 means best event. The best event is the single event per person that matched the most inclusion criteria. Because person_total is 37.6k for mode_id = 0 but only 12.8k for mode_id = 1, we can tell that this cohort has multiple events per person.

cohort_inclusion_result = "#cohort_inc_result"

select *
from results_optum_extended_dod_v1707.cohort_inclusion_result
where cohort_definition_id = 2511

In this table, inclusion_rule_mask is a bitstring (0-based index) recording which combination of inclusion rules was met, and each row gives the count for that combination. So the first row says 26 entry events met mask = 7, which is 111 in binary, i.e. inclusion rules 1, 2 and 3. 29072 people met no criteria (mask = 0), and 57 people had mask = 5, which is 101 in binary, i.e. inclusion rules 1 and 3 (bits at index 0 and 2 are set).

You can use bitwise operators to find, for example, people who met inclusion rule 3 (bit value 2^2 = 4), so it's something like:

WHERE inclusion_rule_mask & 4 = 4

This returns all the rows where that bit is set; you can then take SUM(person_count) over those rows (grouped by cohort and mode as needed) to get the number of people who met that inclusion rule.

If you wanted both rule 3 and rule 1, that would be 4 + 1 = 5, so: WHERE inclusion_rule_mask & 5 = 5.

If you wanted people who met either of those rules, it would just be mask & 5 > 0, because if any of the bits in 5 is set you get a result greater than 0.
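The three mask tests above can be tried out in R with base `bitwAnd()`. This is only a minimal sketch: the first three rows reuse the masks and counts quoted above, the last two rows and the data frame name `inclusionResult` are made up for illustration, and it assumes masks small enough to fit in R's 32-bit integers.

```r
# Toy stand-in for cohort_inclusion_result (one cohort, one mode).
# Rows 1-3 come from the text above; rows 4-5 are hypothetical.
inclusionResult <- data.frame(
  inclusion_rule_mask = c(7L, 0L, 5L, 1L, 4L),
  person_count = c(26, 29072, 57, 10, 5)
)

mask <- inclusionResult$inclusion_rule_mask
count <- inclusionResult$person_count

# Rule 3 is met (bit value 4 is set)
metRule3 <- sum(count[bitwAnd(mask, 4L) == 4L])

# Rules 1 and 3 are both met (1 + 4 = 5)
metRules1And3 <- sum(count[bitwAnd(mask, 5L) == 5L])

# At least one of rules 1 and 3 is met
metRule1Or3 <- sum(count[bitwAnd(mask, 5L) > 0L])

print(c(metRule3, metRules1And3, metRule1Or3))
```

Adding rows that meet only rule 1 or only rule 3 is what makes the three tests give different totals; with just the rows quoted in the text they would all agree.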

cohort_summary_stats = "#cohort_summary_stats"

This is a fourth, derived table that is only present in Cohort Diagnostics.

All four tables are part of the Cohort Diagnostics version 3 results data model.


gowthamrao · 2021-10-13T00:06:54Z

as.integer(intToBits(5))

[1] 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

as.integer(intToBits(7))

[1] 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
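Building on `intToBits()`, a mask can be turned back into the list of rule numbers it represents. A small sketch: `maskToRules` is a hypothetical helper name, rule numbers are 1-based as in the description above, and it only works for masks within R's 32-bit integer range.

```r
maskToRules <- function(mask) {
  # intToBits() returns 32 bits, least significant bit first,
  # so position k in the result corresponds to inclusion rule k.
  which(as.integer(intToBits(mask)) == 1L)
}

maskToRules(5L) # rules 1 and 3
maskToRules(7L) # rules 1, 2 and 3
maskToRules(0L) # no rules met
```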

gowthamrao · 2024-04-04T17:40:55Z

This function lets you build a cohort attrition view from the inclusion result table:

getCohortAttritionViewResults <- function(inclusionResultTable,
                                          maxRuleId) {
  numberToBitString <- function(numbers) {
    vapply(numbers, function(number) {
      if (number == 0) {
        return("0")
      }
      
      bitString <- character()
      while (number > 0) {
        bitString <- c(as.character(number %% 2), bitString)
        number <- number %/% 2
      }
      
      paste(bitString, collapse = "")
    }, character(1))
  }
  
  # Attrition view: step i counts people who met rules 1..i,
  # so we need the mask whose lowest i bits are all set (2^i - 1)
  bitsToMask <- function(bits) {
    positions <- seq_along(bits) - 1
    number <- sum(bits * 2 ^ positions)
    return(number)
  }
  
  ruleToMask <- function(ruleId) {
    bits <- rep(1, ruleId)
    mask <- bitsToMask(bits)
    return(mask)
  }
  
  inclusionResultTable <- inclusionResultTable |>
    dplyr::mutate(inclusionRuleMaskBitString = numberToBitString(inclusionRuleMask))
  
  output <- list()
  
  for (i in seq_len(maxRuleId)) {
    suffixString <- numberToBitString(ruleToMask(i))
    output[[i]] <- inclusionResultTable |>
      dplyr::filter(endsWith(x = inclusionRuleMaskBitString,
                             suffix = suffixString)) |>
      dplyr::group_by(cohortDefinitionId,
                      modeId) |>
      dplyr::summarise(personCount = sum(personCount), .groups = "drop") |>
      dplyr::mutate(id = i)
  }
  
  output <- dplyr::bind_rows(output)
  
  return(output)
}

gowthamrao · 2024-04-04T17:45:51Z

@chrisknoll and I worked on this problem for many hours today. The key learning is how to handle large numbers. We used the same strategy that WebAPI currently uses to process inclusionRuleMask in the inclusionResultTable, i.e. convert the mask to a string and use string matching instead of bit matching.

A simpler way to solve it would be the code below, but it fails in base R when the mask goes beyond the integer range, because bitwAnd() only supports values in R's 32-bit integer range. This matters when a cohort has a lot of inclusion rules, e.g. more than 32.

ruleToMask <- function(ruleId) {
  bits <- rep(1, ruleId)
  
  bitsToMask <- function(bits) {
    positions <- seq_along(bits) - 1
    number <- sum(bits * 2 ^ positions)
    return(number)
  }
  
  mask <- bitsToMask(bits)
  return(mask)
}

a <- dplyr::tibble(inclusionRuleMask = c(15, 11, 7, 1),
                   personCount = c(20, 20, 20, 20))

ruleId <- 3
maskId <- ruleToMask(ruleId = ruleId)
a |>
  dplyr::filter(bitwAnd(inclusionRuleMask, maskId) == maskId) |>
  dplyr::summarise(personCount = sum(personCount))
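One possible alternative, which is not what WebAPI or the function above does, is to replace `bitwAnd()` with modular arithmetic on doubles. Checking that the lowest `n` bits are all set is the same as checking `mask %% 2^n == 2^n - 1`, and this should stay exact as long as the masks are below 2^53, the limit for whole numbers stored as doubles. `lowBitsAllSet` is a hypothetical helper name used only for this sketch.

```r
lowBitsAllSet <- function(mask, n) {
  # mask %% 2^n keeps only the lowest n bits of the mask;
  # they are all set exactly when that remainder equals 2^n - 1
  mask %% 2^n == 2^n - 1
}

masks <- c(15, 11, 7, 1)
personCount <- c(20, 20, 20, 20)

# Same question as the bitwAnd() example: who met rules 1 to 3?
total <- sum(personCount[lowBitsAllSet(masks, 3)])

# A mask with 40 rules set is beyond the integer range,
# but the check on its lowest 35 bits still works
bigMaskOk <- lowBitsAllSet(2^40 - 1, 35)

print(total)
print(bigMaskOk)
```

This only helps up to about 53 inclusion rules; beyond that the string approach is still needed.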

gowthamrao mentioned this issue Oct 12, 2021

Inclusion rule statistics - visualizaiton #409

Closed

azimov added the enhancement New feature or request label Jan 19, 2022

gowthamrao added a commit that referenced this issue Apr 11, 2022

Export the full set of inclusion rules generated during cohort genera…

d64e547

…tion #616 #409

gowthamrao mentioned this issue Apr 11, 2022

Export the full set of inclusion rules generated during cohort generation #798

Merged

gowthamrao added a commit that referenced this issue Apr 11, 2022

Export the full set of inclusion rules generated during cohort genera…

30f7271

…tion (#798) #616 #409

gowthamrao mentioned this issue Aug 8, 2022

Develop branch - inclusion rule statistics does not show percentage #896

Closed
