CollectHsMetrics AT_DROPOUT #1973

Closed
jcharlton67 opened this issue Aug 5, 2024 · 1 comment

Comments

@jcharlton67

Documentation request

Tool involved: CollectHsMetrics AT_DROPOUT

Description
The AT_DROPOUT calculation is described in the documentation as "For each GC bin [0..50] we calculate a = % of target territory, and b = % of aligned reads that align to these targets. AT DROPOUT is then abs(sum(a-b when a-b < 0))."

I have a few questions to help clarify:

  1. Is each "bin" a target region?
  2. Is "a" equivalent to the expected coverage of reads?
  3. Is "b" equivalent to the observed coverage of reads?
  4. If we only sum "a-b" when "a-b < 0", does this mean we only sum regions where the observed coverage is greater than expected? Would this measure an enrichment in coverage, not a depletion?
  5. If (4) is true, is the explanation "if the value is 5% this implies that 5% of total reads that should have mapped to GC<=50% regions mapped elsewhere" correct? As written, the calculation seems to measure enrichment, so maybe the condition should be "when a-b > 0" instead?

Thank you so much!

@kockan
Contributor

kockan commented Aug 19, 2024

Hi @jcharlton67, I'll try to provide some clarifications to the best of my knowledge:

  1. Not exactly, but it is related. Each bin tracks the number of windows, and the number of read starts within those windows, for one GC percentage, which is why there are 101 bins (from 0 to 100 inclusive). For example, GC bin[0] would contain the number of windows in the target regions with 0% GC content and the number of aligned reads whose start positions fall within these windows. The default window size should be 100 if I'm not mistaken.

  2. Yes, I believe it would be fair to say that.

  3. Same as above, I believe that's one way to look at it.

  4 and 5. I believe you are correct in your observation. I've checked the code, and the calculation itself seems to be correct, but the documentation seems to be wrong: it should indeed say "when a - b > 0".

I'll open a PR for the documentation issue, and if it is indeed a typo, we'll change it.
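
To make the sign question concrete, here is a minimal sketch of the calculation as described in this thread, using the corrected condition `a - b > 0`. It is not Picard's code: the function name `at_dropout` and its two input lists are hypothetical, and it assumes the per-bin tallies (target territory and aligned reads, one entry per GC percentage 0..100) have already been collected.

```python
from typing import Sequence


def at_dropout(target_by_gc: Sequence[float], reads_by_gc: Sequence[float]) -> float:
    """Hypothetical helper: AT dropout (in percent) from per-GC-bin tallies.

    target_by_gc[i]: amount of target territory with GC content i% (i = 0..100)
    reads_by_gc[i]:  amount of aligned reads falling in that territory
    """
    if len(target_by_gc) != 101 or len(reads_by_gc) != 101:
        raise ValueError("expected 101 bins, one per GC percentage 0..100")

    total_target = sum(target_by_gc)
    total_reads = sum(reads_by_gc)
    if total_target == 0 or total_reads == 0:
        return 0.0  # nothing to compare

    dropout = 0.0
    for gc in range(0, 51):  # AT-rich half: GC bins 0..50 inclusive
        a = 100.0 * target_by_gc[gc] / total_target  # expected share (%)
        b = 100.0 * reads_by_gc[gc] / total_reads    # observed share (%)
        if a - b > 0:  # bin received fewer reads than its share of territory
            dropout += a - b
    return dropout


# Toy data: half the territory sits at 30% GC, half at 70% GC,
# but only 45% of the reads land in the 30% GC bin.
target = [0.0] * 101
reads = [0.0] * 101
target[30], target[70] = 50.0, 50.0
reads[30], reads[70] = 45.0, 55.0

print(at_dropout(target, reads))  # 5.0
```

In this toy case the AT-rich bin is short by 5 percentage points, which matches the "5% of total reads that should have mapped to GC<=50% regions mapped elsewhere" reading. With the condition as currently documented (`a - b < 0`), the same data would give 0, because only over-covered bins would be summed.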
