liftovervcf without sorting the variants #1306
Nice addition @lindenb! thanks!
Lifting gnomad is indeed a pain....it's interesting that you think that sorting it with a different tool will be better...
I've put in a few comments, but overall it looks great! thanks again.
//try to open with / without index
try (final VCFFileReader liftReader = new VCFFileReader(liftOutputFile, !disableSort)) {
    // nothing
at least count the variants and make sure that you got the right number....
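One way to act on this comment, sketched without htsjdk so that it stays self-contained: count the data lines of the lifted VCF and compare the result against the expected number. The class and method names are hypothetical; in the real test the count would more likely come from iterating the VCFFileReader opened above.

```java
final class VcfRecordCounter {
    /**
     * Counts VCF data lines: every non-empty line that does not start with '#'.
     * Hypothetical helper for illustration; not part of Picard or htsjdk.
     */
    static long countRecords(final String vcfText) {
        return vcfText.lines()
                .filter(line -> !line.isEmpty() && !line.startsWith("#"))
                .count();
    }

    public static void main(final String[] args) {
        final String vcf =
                "##fileformat=VCFv4.2\n"
                + "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
                + "chr1\t100\t.\tA\tG\t.\t.\t.\n"
                + "chr1\t50\t.\tC\tT\t.\t.\t.\n";
        System.out.println(countRecords(vcf)); // prints 2
    }
}
```

Because the count ignores record order, the same assertion would work for both the sorted and the unsorted output.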
@@ -250,7 +253,10 @@
);

private VariantContextWriter rejects;
/** the output VariantContextWriter */
private VariantContextWriter accept;
For consistency, can you rename this accepts (with s) or change this and rejects to acceptedRecords and rejectedRecords?
.modifyOption(Options.ALLOW_MISSING_FIELDS_IN_HEADER, ALLOW_MISSING_FIELDS_IN_HEADER)
.setOutputFile(OUTPUT)
.setReferenceDictionary(walker.getSequenceDictionary())
;
Keep the ; on the same line.
@@ -477,7 +508,11 @@ private void trackLiftedVariantContig(Map<String, Long> map, String contig) {
     private void addAndTrack(final VariantContext toAdd, final VariantContext source) {
         trackLiftedVariantContig(liftedBySourceContig, source.getContig());
         trackLiftedVariantContig(liftedByDestContig, toAdd.getContig());
-        sorter.add(toAdd);
+        if (sorter != null) { //we're sorting the variants
Suggested change:
- if (sorter != null) { //we're sorting the variants
+ if (!DISABLE_SORT) { //we're sorting the variants
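A minimal sketch of the control flow under discussion, assuming plain integer positions in place of VariantContext records and in-memory lists in place of the sorting collection and the VCF writer. LiftedRecordRouter and its members are hypothetical names, not Picard code; the point is only that the branch keys off the DISABLE_SORT flag rather than off a null check.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class LiftedRecordRouter {
    private final boolean disableSort;
    private final List<Integer> buffer = new ArrayList<>(); // stand-in for the sorting collection
    private final List<Integer> output = new ArrayList<>(); // stand-in for the VCF writer

    LiftedRecordRouter(final boolean disableSort) {
        this.disableSort = disableSort;
    }

    /** Either buffers the record for a later sort or writes it straight through. */
    void add(final int position) {
        if (!disableSort) { // we're sorting the variants
            buffer.add(position);
        } else {            // written on the fly, in input order
            output.add(position);
        }
    }

    /** Flushes the sorted buffer (if any) and returns everything written. */
    List<Integer> finish() {
        if (!disableSort) {
            Collections.sort(buffer);
            output.addAll(buffer);
            buffer.clear();
        }
        return output;
    }

    public static void main(final String[] args) {
        final LiftedRecordRouter unsorted = new LiftedRecordRouter(true);
        final LiftedRecordRouter sorted = new LiftedRecordRouter(false);
        for (final int pos : new int[] {30, 10, 20}) {
            unsorted.add(pos);
            sorted.add(pos);
        }
        System.out.println(unsorted.finish()); // prints [30, 10, 20]
        System.out.println(sorted.finish());   // prints [10, 20, 30]
    }
}
```

Testing the flag keeps the intent visible at the call site; a null check would behave the same way only as long as the sorter is created exactly when sorting is enabled.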
        // nothing
    }
}

too many NLs
for (final VariantContext ctx : sorter) {
    out.add(ctx);
    progress.record(ctx.getContig(), ctx.getStart());

extra nl
log.info("Writing out sorted records to final VCF.");

for (final VariantContext ctx : sorter) {
this.accept.add(ctx);
indentation is off.
@@ -477,7 +508,11 @@ private void trackLiftedVariantContig(Map<String, Long> map, String contig) {
     private void addAndTrack(final VariantContext toAdd, final VariantContext source) {
         trackLiftedVariantContig(liftedBySourceContig, source.getContig());
         trackLiftedVariantContig(liftedByDestContig, toAdd.getContig());
-        sorter.add(toAdd);
+        if (sorter != null) { //we're sorting the variants
+            sorter.add(toAdd);
please use spaces (4) not tabs for indent.
@yfarjoun back to you. Thank you for the review; I hope I removed all the formatting problems.
Well, one can imagine that you want a sorted VCF without an index. I added a test.
What I meant is that your test only tests the explicit invocation, and not the implicit one...
almost, a few extra newlines and one misformatted } else {
@@ -287,6 +287,11 @@ protected int doWork() {
IOUtil.assertFileIsWritable(OUTPUT);
IOUtil.assertFileIsWritable(REJECT);


if (CREATE_INDEX && DISABLE_SORT) {
this type of input validation should be in customCommandLineValidation()
In fact, I realized that CREATE_INDEX is only for BAM-related tools; for LiftoverVCF, the index is always created even if CREATE_INDEX=false.
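For reference, a hedged sketch of what moving such a check into customCommandLineValidation() could look like. It follows the contract of Picard's CommandLineProgram hook as I understand it (return null when the arguments are valid, otherwise an array of error messages), but LiftoverArgsSketch is a hypothetical stand-alone class, and whether this particular check is still needed is exactly what the comment above questions.

```java
import java.util.ArrayList;
import java.util.List;

final class LiftoverArgsSketch {
    // Stand-ins for the tool's command-line arguments.
    boolean CREATE_INDEX = false;
    boolean DISABLE_SORT = false;

    /** Returns null when the arguments are consistent, otherwise the error messages. */
    String[] customCommandLineValidation() {
        final List<String> errors = new ArrayList<>();
        if (CREATE_INDEX && DISABLE_SORT) {
            errors.add("CREATE_INDEX and DISABLE_SORT cannot both be true: an unsorted VCF cannot be indexed.");
        }
        return errors.isEmpty() ? null : errors.toArray(new String[0]);
    }

    public static void main(final String[] args) {
        final LiftoverArgsSketch tool = new LiftoverArgsSketch();
        tool.CREATE_INDEX = true;
        tool.DISABLE_SORT = true;
        for (final String message : tool.customCommandLineValidation()) {
            System.err.println(message);
        }
    }
}
```

Validating here rather than in doWork() means a bad combination is reported before any file is opened.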
@@ -181,6 +181,9 @@
@Argument(doc = "INFO field annotations that should be deleted when swapping reference with variant alleles.", optional = true)
public Collection<String> TAGS_TO_DROP = new ArrayList<>(LiftoverUtils.DEFAULT_TAGS_TO_DROP);

@Argument(doc = "Output VCF file will be written on the fly but it won't be sorted and indexed.", optional = true)
Suggested change:
- @Argument(doc = "Output VCF file will be written on the fly but it won't be sorted and indexed.", optional = true)
+ @Argument(doc = "Output VCF file will be written on the fly but it won't be sorted nor indexed.", optional = true)
“or” actually 😄
I'm not native english speaker, nor do I care that much :-D
out.writeHeader(outHeader);
this.acceptedRecords = new VariantContextWriterBuilder()
.modifyOption(Options.ALLOW_MISSING_FIELDS_IN_HEADER, ALLOW_MISSING_FIELDS_IN_HEADER)
.modifyOption(Options.INDEX_ON_THE_FLY,!DISABLE_SORT)
Suggested change:
- .modifyOption(Options.INDEX_ON_THE_FLY,!DISABLE_SORT)
+ .modifyOption(Options.INDEX_ON_THE_FLY, !DISABLE_SORT)
Sorry, I'm not sure I understand 🤔. For example, in a test like LiftoverVcfTest/testWriteOriginalPosition, how is the implicit/explicit condition tested?
I guess I mean that you explicitly told the program to CREATE_INDEX rather than letting it do whatever it thinks.
Description
I'm trying to liftover gnomad/genome to hg38, but it takes too much time and memory. I'd rather have the resulting VCF file written without sorting and then possibly sort it later with bcftools.
I added an option DISABLE_SORT to liftover vcf:
- sorter might be null if DISABLE_SORT=true
- out moved to a class member, accept
now vcfliftover can write to stdout, variants are not sorted:
Checklist (never delete this)
Never delete this, it is our record that procedure was followed. If you find that for whatever reason one of the checklist points doesn't apply to your PR, you can leave it unchecked but please add an explanation below.
Content
Review
For more detailed guidelines, see https:/broadinstitute/picard/wiki/Guidelines-for-pull-requests