-
Notifications
You must be signed in to change notification settings - Fork 369
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Yf stratify chimeric #1269
Yf stratify chimeric #1269
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@yfarjoun
Minor changes requested, mostly in descriptions.
@@ -74,7 +74,7 @@ | |||
"<p>" + | |||
"The resulting metric file will be named according to a provided prefix and a suffix which is generated " + | |||
" automatically according to the error metric. " + | |||
"The tool cal collect multiple metrics in a single pass and there should be hardly any " + | |||
"The tool can collect multiple metrics in a single pass and there [ be hardly any " + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
replace "[" with "should"
return -1; | ||
} | ||
return cigar.getCigarElements().stream() | ||
.filter(e->e.getOperator()== CigarOperator.S) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spaces around operators -> and ==
POST_DINUC(() -> postDiNucleotideStratifier, "Stratifies bases by the read base at the previous cycle, and the current reference base."), | ||
HOMOPOLYMER_LENGTH(() -> homoPolymerLengthStratifier, "Stratifies bases based on the length of homopolymer they are part of (only accounts for bases that were read prior to the current base)."), | ||
HOMOPOLYMER(() -> homopolymerStratifier, "Stratifies bases based on the length of homopolymer, the base that the homopolymer is comprised of, and the reference base."), | ||
GC_CONTENT(() -> gcContentStratifier, "The gc content of their read."), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
their -> the
Also, gc should be "GC"
READ_ORDINALITY(() -> readOrdinalityStratifier, "The read ordinality (i.e. first or second)."), | ||
READ_BASE(() -> currentReadBaseStratifier, "the base in the original reading direction."), | ||
READ_DIRECTION(() -> readDirectionStratifier, "The alignment direction of the read (encoded as + or -)."), | ||
PAIR_ORIENTATION(() -> readOrientationStratifier, "The reads orientation and ordinality. (into F1R2 or F2R1) Assumes reads are \"innies\"."), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
either "reads" should have a possessive or be singular. I would vote for "read".
Not sure I understand what "into" means.
READ_BASE(() -> currentReadBaseStratifier, "the base in the original reading direction."), | ||
READ_DIRECTION(() -> readDirectionStratifier, "The alignment direction of the read (encoded as + or -)."), | ||
PAIR_ORIENTATION(() -> readOrientationStratifier, "The reads orientation and ordinality. (into F1R2 or F2R1) Assumes reads are \"innies\"."), | ||
PAIR_PROPERNESS(() -> readPairednessStratifier, "The properness of the reads alignment. Looks for indications of chimerism."), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"reads" should be possessive or singular. I vote for "read".
private static Integer stratifySoftClippedBases(final SAMRecord sam) { | ||
final Cigar cigar = sam.getCigar(); | ||
if (cigar == null) { | ||
return -1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you declare a constant for this?
PAIR_PROPERNESS(() -> readPairednessStratifier, "The properness of the reads alignment. Looks for indications of chimerism."), | ||
REFERENCE_BASE(() -> referenceBaseStratifier, "The reference base in the read's direction."), | ||
PRE_DINUC(() -> preDiNucleotideStratifier, "The read base at the previous cycle, and the current reference base."), | ||
POST_DINUC(() -> postDiNucleotideStratifier, "The read base at the previous cycle, and the current reference base."), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "previous" should be "subsequent"
READ_GROUP(() -> readgroupStratifier, "The read-group id of the read."), | ||
CYCLE(() -> baseCycleStratifier, "The machine cycle during which the base was read."), | ||
BINNED_CYCLE(() -> binnedReadCycleStratifier, "The binned machine cycle. Similar to CYCLE, but binned into 5 evenly spaced ranges across the size of the read. This stratifier may produce confusing results when used on datasets with variable sized reads."), | ||
SOFT_CLIPS(() -> softClipsLengthStratifier, "The number of softclipped bases their read has."), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"their" -> "the"
Thanks for the review @fleharty. I think I addressed all your points. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just have a question about whether or not F1F2 and R1R2 should be included in the documentation, and a final.
Otherwise I approve. You'll need to find another reviewer anyway.
Description
Adds two stratifiers to CollectSamErrorMetrics that allow exploring the data with respect to Chimeric reads and soft-clipped bases.
Checklist (never delete this)
Never delete this, it is our record that procedure was followed. If you find that for whatever reason one of the checklist points doesn't apply to your PR, you can leave it unchecked but please add an explanation below.
Content
Review
For more detailed guidelines, see https:/broadinstitute/picard/wiki/Guidelines-for-pull-requests