Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

liftovervcf without sorting the variants #1306

Merged
merged 2 commits into from
Apr 22, 2019
Merged
Show file tree
Hide file tree
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
87 changes: 61 additions & 26 deletions src/main/java/picard/vcf/LiftoverVcf.java
Original file line number Diff line number Diff line change
Expand Up @@ -181,6 +181,9 @@ public class LiftoverVcf extends CommandLineProgram {
@Argument(doc = "INFO field annotations that should be deleted when swapping reference with variant alleles.", optional = true)
public Collection<String> TAGS_TO_DROP = new ArrayList<>(LiftoverUtils.DEFAULT_TAGS_TO_DROP);

@Argument(doc = "Output VCF file will be written on the fly but it won't be sorted and indexed.", optional = true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@Argument(doc = "Output VCF file will be written on the fly but it won't be sorted and indexed.", optional = true)
@Argument(doc = "Output VCF file will be written on the fly but it won't be sorted nor indexed.", optional = true)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

“or” actually 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not native english speaker, nor do I care that much :-D

public boolean DISABLE_SORT = false;

// When a contig used in the chain is not in the reference, exit with this value instead of 0.
public static int EXIT_CODE_WHEN_CONTIG_NOT_IN_REFERENCE = 1;

Expand Down Expand Up @@ -250,7 +253,10 @@ public class LiftoverVcf extends CommandLineProgram {
);

private VariantContextWriter rejects;
/** the output VariantContextWriter */
private VariantContextWriter accept;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, can you rename this accepts (with s) or change this and rejects to acceptedRecords and rejectedRecords ?

private final Log log = Log.getInstance(LiftoverVcf.class);
/** the Variant sorter, may be null if DISABLE_SORT = null */
private SortingCollection<VariantContext> sorter;

private long failedLiftover = 0, failedAlleleCheck = 0, totalTrackedAsSwapRefAlt = 0;
Expand Down Expand Up @@ -321,11 +327,23 @@ protected int doWork() {
"The REF and the ALT alleles have been reverse complemented in liftover since the mapping from the " +
"previous reference to the current one was on the negative strand."));

final VariantContextWriter out = new VariantContextWriterBuilder()
.setOption(Options.INDEX_ON_THE_FLY)
.modifyOption(Options.ALLOW_MISSING_FIELDS_IN_HEADER, ALLOW_MISSING_FIELDS_IN_HEADER)
.setOutputFile(OUTPUT).setReferenceDictionary(walker.getSequenceDictionary()).build();
out.writeHeader(outHeader);
final VariantContextWriterBuilder variantContextWriterBuilder = new VariantContextWriterBuilder()
.modifyOption(Options.ALLOW_MISSING_FIELDS_IN_HEADER, ALLOW_MISSING_FIELDS_IN_HEADER)
.setOutputFile(OUTPUT)
.setReferenceDictionary(walker.getSequenceDictionary())
;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep ; in same line.


if (DISABLE_SORT) {
lindenb marked this conversation as resolved.
Show resolved Hide resolved
this.accept = variantContextWriterBuilder
.unsetOption(Options.INDEX_ON_THE_FLY)
.build();
} else {
this.accept = variantContextWriterBuilder
.setOption(Options.INDEX_ON_THE_FLY)
.build();
}

this.accept.writeHeader(outHeader);

rejects = new VariantContextWriterBuilder().setOutputFile(REJECT)
.unsetOption(Options.INDEX_ON_THE_FLY)
Expand All @@ -346,13 +364,21 @@ protected int doWork() {
// collection.
////////////////////////////////////////////////////////////////////////
long total = 0;
log.info("Lifting variants over and sorting (not yet writing the output file.)");

sorter = SortingCollection.newInstance(VariantContext.class,
new VCFRecordCodec(outHeader, ALLOW_MISSING_FIELDS_IN_HEADER || VALIDATION_STRINGENCY != ValidationStringency.STRICT),
outHeader.getVCFRecordComparator(),
MAX_RECORDS_IN_RAM,
TMP_DIR);

if (DISABLE_SORT) {
log.info("Lifting variants over and writing the output file. Variants will not be sorted");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
log.info("Lifting variants over and writing the output file. Variants will not be sorted");
log.info("Lifting variants over and writing the output file. Variants will not be sorted.");


sorter = null;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make one line :} else {

else {
log.info("Lifting variants over and sorting (not yet writing the output file.)");

sorter = SortingCollection.newInstance(VariantContext.class,
new VCFRecordCodec(outHeader, ALLOW_MISSING_FIELDS_IN_HEADER || VALIDATION_STRINGENCY != ValidationStringency.STRICT),
outHeader.getVCFRecordComparator(),
MAX_RECORDS_IN_RAM,
TMP_DIR);
}

ProgressLogger progress = new ProgressLogger(log, 1000000, "read");

Expand Down Expand Up @@ -441,21 +467,26 @@ protected int doWork() {
rejects.close();
in.close();

////////////////////////////////////////////////////////////////////////
// Write the sorted outputs to the final output file
////////////////////////////////////////////////////////////////////////
sorter.doneAdding();
progress = new ProgressLogger(log, 1000000, "written");
log.info("Writing out sorted records to final VCF.");

for (final VariantContext ctx : sorter) {
out.add(ctx);
progress.record(ctx.getContig(), ctx.getStart());

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra nl

if (sorter != null) {
////////////////////////////////////////////////////////////////////////
// Write the sorted outputs to the final output file
////////////////////////////////////////////////////////////////////////
sorter.doneAdding();
progress = new ProgressLogger(log, 1000000, "written");
log.info("Writing out sorted records to final VCF.");

for (final VariantContext ctx : sorter) {
this.accept.add(ctx);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation is off.

progress.record(ctx.getContig(), ctx.getStart());
}


sorter.cleanup();
}
out.close();

sorter.cleanup();

this.accept.close();

return 0;
}

Expand All @@ -477,7 +508,11 @@ private void trackLiftedVariantContig(Map<String, Long> map, String contig) {
private void addAndTrack(final VariantContext toAdd, final VariantContext source) {
trackLiftedVariantContig(liftedBySourceContig, source.getContig());
trackLiftedVariantContig(liftedByDestContig, toAdd.getContig());
sorter.add(toAdd);
if (sorter != null) { //we're sorting the variants
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (sorter != null) { //we're sorting the variants
if (!DISABLE_SORT) { //we're sorting the variants

sorter.add(toAdd);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use spaces (4) not tabs for indent.

} else {
this.accept.add(toAdd);
}
}

/**
Expand Down
38 changes: 38 additions & 0 deletions src/test/java/picard/util/LiftoverVcfTest.java
Original file line number Diff line number Diff line change
Expand Up @@ -248,6 +248,44 @@ public void testWriteOriginalPosition(final boolean shouldWriteOriginalPosition)
}
}

@DataProvider(name = "dataTestSort")
public Object[][] dataTestVcfSorted() {
return new Object[][]{
{false},
{true}
};
}

@Test(dataProvider = "dataTestSort")
public void testVcfSorted(final boolean disableSort) {
final File liftOutputFile = new File(OUTPUT_DATA_PATH, "lift-delete-me.vcf");
final File rejectOutputFile = new File(OUTPUT_DATA_PATH, "reject-delete-me.vcf");
final File input = new File(TEST_DATA_PATH, "testLiftoverBiallelicIndels.vcf");

liftOutputFile.deleteOnExit();
rejectOutputFile.deleteOnExit();

final String[] args = new String[]{
"INPUT=" + input.getAbsolutePath(),
"OUTPUT=" + liftOutputFile.getAbsolutePath(),
"REJECT=" + rejectOutputFile.getAbsolutePath(),
"CHAIN=" + CHAIN_FILE,
"REFERENCE_SEQUENCE=" + REFERENCE_FILE,
"CREATE_INDEX=" + (!disableSort),
yfarjoun marked this conversation as resolved.
Show resolved Hide resolved
"DISABLE_SORT=" + disableSort
};

Assert.assertEquals(runPicardCommandLine(args), 0);

//try to open with / without index
try (final VCFFileReader liftReader = new VCFFileReader(liftOutputFile, !disableSort)) {
// nothing
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at least count the variants and make sure that you got the right number....

}
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

too many NLs




@DataProvider
Iterator<Object[]> testWriteVcfData() {

Expand Down