Feedback on integration of other cancer knowledgebases in Scout for cancer #4879

Open
muabnezor opened this issue Sep 19, 2024 · 8 comments

Comments

@muabnezor

Hi all.

As has been discussed previously, we in Umeå have started to work on an extension that will enable integration of 3rd-party variant interpretation tools in Scout. But we have also received funds for improving cancer variant assessment directly in Scout. An obstacle for this is that most cancer knowledgebases are not in the public domain, at least not for clinical use. They are usually licensed CC BY-NC 4.0, with two exceptions: CIViC and the Cancer Genome Interpreter (CGI) database, which are CC0 (for CGI this covers the database, but not the frontend / hosted solution).
The Personal Cancer Genome Reporter (PCGR) has included those and more, such as ClinVar, CancerMine and Cancer Hotspots, in its annotation. We were thinking about including some or all of the information from PCGR in the Scout cancer view. It might also be possible to include, for example, intOGen.

But before commencing this we would like your feedback and thoughts.

/Adam Rosenbaum

@dnil
Collaborator

dnil commented Sep 19, 2024

Hi Adam! 😸
We are very happy to see new annotation sources, and the cancer ones are well known to be lacking. The best solution we have found so far is to have variant-level key information appear as an item in the VCF from pipeline annotation (e.g. a ClinVar or COSMIC id appears for variants that have relevant annotation), in combination with any values or qualitative statements that may be relevant for ranking or filtering ("Likely pathogenic"/"Assessed by expert panel" from ClinVar, 100 obs in CoCa from Cosmic etc). These can be stored as INFO or CSQ on variants and parsed by downstream tools and Scout as needed. Then we can link out to deeper info on sites using those keys from Scout, allowing the user to drill down a bit.
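
A minimal sketch of the parsing side, to make the idea concrete. It uses plain string handling rather than Scout's actual parser, and the INFO keys (`CLINVAR_ID`, `CLNSIG`) and the CSQ field layout are assumptions for illustration only; a real VCF declares its own keys and CSQ format in the header.

```python
# Hypothetical CSQ layout. A real VCF states its own layout in the
# "Format:" part of the CSQ INFO header line.
CSQ_FORMAT = "Allele|Consequence|SYMBOL|CLIN_SIG"


def parse_info(info: str) -> dict:
    """Split a VCF INFO column into a dict; flags without a value map to True."""
    parsed = {}
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        parsed[key] = value if sep else True
    return parsed


def parse_csq(csq_value: str, csq_format: str = CSQ_FORMAT) -> list:
    """Turn a CSQ value into one dict per comma-separated transcript entry."""
    fields = csq_format.split("|")
    return [dict(zip(fields, entry.split("|"))) for entry in csq_value.split(",")]


def clinvar_link(variation_id: str) -> str:
    """Build a link-out to the ClinVar variation page for a stored key."""
    return f"https://www.ncbi.nlm.nih.gov/clinvar/variation/{variation_id}/"


# Made-up INFO string, only to show the shape of the data.
info = parse_info(
    "DP=100;CLINVAR_ID=12345;CLNSIG=Likely_pathogenic;SOMATIC;"
    "CSQ=T|missense_variant|GENE1|likely_pathogenic,T|downstream_gene_variant|GENE2|"
)
transcripts = parse_csq(info["CSQ"])
link = clinvar_link(info["CLINVAR_ID"])
```

The key (`CLINVAR_ID` here) is what the link-out is built from, while the qualitative value (`CLNSIG`) stays available for ranking or filtering.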

@northwestwitch
Member

Hello! Long time no see, hope everything is fine with you! 😃

In addition to what Daniel has written, we should check if it's possible to create IGV tracks that can be loaded in the IGV browser of Scout. Possibly with links to the resources as well!
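
As a rough sketch of what such a track could look like: annotated variants written out as a BED file, which IGV can load. The function names and the input layout (dicts with `chrom`, `pos`, `ref`, `name`) are made up for illustration, and links to the resources are not covered here. Note that VCF positions are 1-based while BED starts are 0-based.

```python
def variant_to_bed_line(chrom: str, pos: int, ref: str, name: str) -> str:
    """Convert a 1-based VCF position into a 0-based, half-open BED interval."""
    start = pos - 1
    end = start + len(ref)  # the interval spans the reference allele
    return f"{chrom}\t{start}\t{end}\t{name}"


def build_bed_track(variants: list, track_name: str = "cancer_kb") -> str:
    """Return BED text: a track line followed by one line per variant."""
    lines = [f'track name="{track_name}"']
    for variant in variants:
        lines.append(
            variant_to_bed_line(
                variant["chrom"], variant["pos"], variant["ref"], variant["name"]
            )
        )
    return "\n".join(lines) + "\n"


# Illustrative input: coordinates and names are placeholders.
track = build_bed_track(
    [
        {"chrom": "7", "pos": 1000, "ref": "A", "name": "kb_entry_1"},
        {"chrom": "12", "pos": 2000, "ref": "CT", "name": "kb_entry_2"},
    ]
)
```

The `name` column is where a knowledgebase key could go, so the track label matches the key used for link-outs.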

@parlar

parlar commented Sep 19, 2024

Thanks for responding quickly!

Regarding Cosmic. I'm a bit uncertain how Cosmic can be used from now on. For non-academic use, I believe that hospitals must now pay a license fee to Qiagen. Not cheap. Perhaps it's okay to keep cosmic ids but the database as a whole will probably not be possible to use any longer.

But as I interpret your answer, you feel that it is a good idea to improve the cancer view and that we should not develop a separate application linked to Scout?

There is one slight complication that I can see, which is that tier classifications are disease-dependent. So, a variant could be tier I for breast cancer but tier IIb for colorectal. In principle, the disease could be added before loading into Scout, but if we wish to change that, the data needs to be re-ranked/re-annotated and re-imported. One possibility might be to include all variants that have any kind of cancer annotation and then do the last bit dynamically in Scout itself.

And we are very fortunate to have Adam here with us now! :)

cheers

@parlar

parlar commented Sep 19, 2024

Just realized that muabnezor is rosenbaum in reverse, with a z instead of an s :)

@dnil
Collaborator

dnil commented Sep 19, 2024

> Just realized that muabnezor is rosenbaum in reverse, with a z instead of an s :)

Now we know how he chooses passwords as well?

@dnil
Collaborator

dnil commented Sep 19, 2024

> Thanks for responding quickly!
>
> Regarding Cosmic. I'm a bit uncertain how Cosmic can be used from now on. For non-academic use, I believe that hospitals must now pay a license fee to Qiagen. Not cheap. Perhaps it's okay to keep cosmic ids but the database as a whole will probably not be possible to use any longer.

Right, Cosmic ids were not the best example. ClinVar still holds. 😜 For the record, I do not agree with this kind of late-breaking sellout of academic data - it's not like the Cosmic data was produced by one individual/center and without the input of a lot of individual consented samples - but that is beyond our control, I'm afraid. We can try to avoid greed in our lives, and keep promoting an open society in every area possible.

> But as I interpret your answer, you feel that it is a good idea to improve the cancer view and that we should not develop a separate application linked to Scout?

Improving the cancer view is great - contributions will be most welcome. If PCGR and/or its annotations could be introduced as a generic annotation tool, to be added to the likes of Balsamic or Sarek, that would fit the current model best. Also, as long as the APIs are good, interacting with other apps is not off the table. I'm not familiar enough with that app to say how one would best go about it!

> There is one slight complication that I can see, which is that tier classifications are disease-dependent. So, a variant could be tier I for breast cancer but tier IIb for colorectal. In principle, the disease could be added before loading into Scout, but if we wish to change that, the data needs to be re-ranked/re-annotated and re-imported. One possibility might be to include all variants that have any kind of cancer annotation and then do the last bit dynamically in Scout itself.

Right, we do sort of have the same thing in rare disease, where e.g. ACMG classifications are not really valid without a given disorder. We let the users tick through a list of criteria to finalise it. It is also not unlikely that a high-tier variant for one tumor type will be relevant for another tumor category as well, possibly with a different tier, even if we haven't got so many of that kind yet. Presumably going for the worst case and letting the user modify it is going to be ok. Perhaps you could provide a list of calls conditional on disease type?
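
To illustrate, here is a small sketch of how a list of calls conditional on disease type could be resolved at display time: use the call for the case's disease if there is one, otherwise fall back to the worst case, and let a user override win over both. The tier labels and their ordering are assumptions made up for this example, not an existing Scout data model.

```python
# Illustrative ordering, from most to least clinically significant.
TIER_ORDER = ["I", "IIa", "IIb", "III", "IV"]


def resolve_tier(calls: dict, disease: str = None, override: str = None):
    """Pick the tier to show for a variant.

    calls maps disease type -> tier, as loaded with the variant.
    """
    if override is not None:
        # A user's manual assessment always wins.
        return override
    if disease in calls:
        # Disease-specific call for the case at hand.
        return calls[disease]
    if not calls:
        return None
    # No matching disease: default to the worst case (most significant tier).
    return min(calls.values(), key=TIER_ORDER.index)


calls = {"breast": "I", "colorectal": "IIb"}
colorectal_tier = resolve_tier(calls, disease="colorectal")
unknown_disease_tier = resolve_tier(calls, disease="lung")
```

Because all per-disease calls travel with the variant, changing the disease for a case would only change which entry is picked, without re-annotating or re-importing.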

@muabnezor
Author

Hi Chiara and Daniel, great to hear from you both again! I’m slowly starting to dig into Scout for cancer here up north, so expect to hear from me more in the future!

If it is a matter of annotation before loading variants into Scout, then this should be pretty straightforward. Whether PCGR is a reliable tool for pre-Scout annotation I'll have to look into. Has there been any previous discussion of a Scout-specific annotation pipeline? It might be worth considering if we want a more tailored approach that fits into Scout's existing structure, rather than adapting Scout to work with the annotations from several different tools. Having a dedicated pipeline could also offer more flexibility for future customizations as needed. But maybe this has been discussed before.

@dnil
Collaborator

dnil commented Sep 20, 2024

Very good to have you onboard again!

It has definitely been discussed, many times, and to some small extent implemented - not least in Umeå! It has certainly not been ruled out, but the general feeling is that since what you mostly need is VEP and genmod, it is not too difficult to get to a working stage, and then the rest usually needs some customisation anyway. (Like, did you want to run gens? Oh, you will need to make a PoN. Oh did you want local frequencies? Ah, then you need to add loqusdb to your automatisation etc. Chromosome images? Ah, just add Chromograph to your pipeline!) But it would be super convenient to be able to point to a specific tool/pipe, so don't let that discourage you if that is what looks fun and doable to you!

4 participants