GroupReadsByUmi and optical duplicates #1013

Open
miwalter opened this issue Oct 14, 2024 · 1 comment

Comments

@miwalter

Hi.

In a recent experiment we sequenced the same libraries on a MiSeq (random FC) and a NovaSeq (patterned FC). Both runs produced a similar number of reads, but the NovaSeq run had a 10x higher number of duplicate reads. So I'm wondering if there is a way to deal with optical duplicates (OD) from Illumina patterned flow cells when creating the UMI groups?

If I understand the documentation correctly, all reads with the same coordinates and UMI sequence are grouped regardless of whether they are PCR or optical duplicates, and the group is later used to create a consensus call. In the attached example, there is a tag family with 14 read pairs. However, looking at their locations on the flow cell, several copies lie within a pixel distance of 2500 of each other, which is considered to indicate ODs on a patterned FC. Some OD clusters have 3-4 copies, while other members of the same UMI family have no OD. This skews the representation of PCR/library prep errors, and the overall size of the UMI family is overestimated (after accounting for ODs, only 7 unique copies of the same UMI are left). Or do I need to remove optical duplicates first (e.g. with Picard) and then create my UMI consensus?

Thank you very much for your comments.

image

@miwalter
Author

Here's the same UMI family after accounting for ODs:

image
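
For reference, the OD accounting described above can be roughly sketched in Python. This is only an illustration under several assumptions, not how fgbio or Picard implement it: read names are assumed to follow the standard Illumina layout ending in `lane:tile:x:y`, two reads are treated as optical duplicates when they share a flow cell, lane, and tile and both their x and y offsets are within the pixel distance, and nearby reads are chained transitively into one cluster. The function names `parse_location` and `optical_clusters` and the read names in the usage note are hypothetical.

```python
from collections import defaultdict


def parse_location(read_name):
    """Split an Illumina-style read name into (flowcell, lane, tile), x, y.

    Assumes the name ends in flowcell:lane:tile:x:y, as stored in a BAM QNAME.
    """
    fields = read_name.split(":")
    flowcell, lane, tile, x, y = fields[-5:]
    return (flowcell, lane, tile), int(x), int(y)


def optical_clusters(read_names, pixel_distance=2500):
    """Group the reads of one UMI family into putative optical-duplicate clusters.

    Reads on the same tile whose x and y offsets are both within
    pixel_distance are linked; links are chained (single linkage).
    Returns a list of clusters, each a list of read names.
    """
    parent = list(range(len(read_names)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Only reads on the same flow cell, lane, and tile can be compared.
    by_tile = defaultdict(list)
    for idx, name in enumerate(read_names):
        key, x, y = parse_location(name)
        by_tile[key].append((idx, x, y))

    for members in by_tile.values():
        for a in range(len(members)):
            for b in range(a + 1, len(members)):
                i, xi, yi = members[a]
                j, xj, yj = members[b]
                if abs(xi - xj) <= pixel_distance and abs(yi - yj) <= pixel_distance:
                    parent[find(i)] = find(j)

    clusters = defaultdict(list)
    for idx, name in enumerate(read_names):
        clusters[find(idx)].append(name)
    return list(clusters.values())
```

Calling `optical_clusters(names)` on the read names of a single UMI family returns one list per putative unique molecule, so `len(optical_clusters(names))` gives the family size after collapsing ODs. For example, with the made-up names `["A:1:FC:1:1101:1000:1000", "A:1:FC:1:1101:1500:2000", "A:1:FC:1:1101:20000:30000"]` the first two reads fall into one cluster and the third stays on its own. Picard MarkDuplicates exposes an `OPTICAL_DUPLICATE_PIXEL_DISTANCE` setting for the same threshold; whether its distance test matches this sketch exactly is not something I have verified.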
