Query about using locator within marine environments #32

Open
wpearman1996 opened this issue Aug 19, 2022 · 1 comment

Comments

@wpearman1996

Hi there,
I'm hoping to use locator to predict the origin of some kelp rafts I've collected. I have RADseq data spanning the range of the species, plus 40 samples that were found floating on the ocean, and I'd like to identify the source locations of these samples. However, I was wondering how well locator would fare with this task, as the kelp only grows on rocky shores. Is there a suggested approach for handling predicted locations that fall in the middle of the ocean? Can I provide a map file to restrict where locations can be predicted?
Thanks!
William

@cjbattey
Collaborator

Hi William,

The short answer is no, locator doesn't have a map input to limit predictions to the theoretically suitable area for a species. How well it works will depend on how strong the population structure is for your species and how good the spatial sampling of the training set is. Our eLife paper has all the reliable details I can provide.

I'd recommend starting with something a little simpler like fitting a PCA to the genotypes of samples with known locations and then projecting the unknown samples into that PC space (or, alternately, just fitting a PCA to all the samples at once). If the PCA doesn't have any interpretable patterns at all, locator probably won't be much help.
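For reference, here is a minimal sketch of the first option (fit on known-location samples, then project the unknowns) using plain NumPy. Everything in it is an assumption made for illustration: the toy genotype matrices are simulated, the names `fit_pca` and `project` are hypothetical, and it assumes a samples x SNPs matrix of 0/1/2 allele counts with no missing calls, so real RADseq data would need filtering or imputation first.

```python
import numpy as np

def fit_pca(genotypes, n_components=2):
    """Fit a PCA to a samples x SNPs matrix of allele counts (0/1/2)."""
    mean = genotypes.mean(axis=0)
    centered = genotypes - mean
    # Rows of vt are the principal axes, ordered by variance explained.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def project(genotypes, mean, components):
    """Project samples into a PC space that was fitted on other samples."""
    return (genotypes - mean) @ components.T

# Simulated toy data (assumption): two reference populations whose
# allele frequencies differ, plus five "unknown" samples drawn from B.
rng = np.random.default_rng(0)
n_snps = 200
freq_a = rng.uniform(0.1, 0.9, n_snps)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.3, n_snps), 0.05, 0.95)
known = np.vstack([
    rng.binomial(2, freq_a, (20, n_snps)),
    rng.binomial(2, freq_b, (20, n_snps)),
]).astype(float)
unknown = rng.binomial(2, freq_b, (5, n_snps)).astype(float)

mean, components = fit_pca(known)
known_pcs = project(known, mean, components)      # shape (40, 2)
unknown_pcs = project(unknown, mean, components)  # shape (5, 2)
```

Plotting `known_pcs` coloured by sampling site with `unknown_pcs` overlaid is the quick check: if the known samples don't group by geography, there is little spatial signal for locator to learn from. Note that projected samples tend to sit slightly closer to the origin than the samples used to fit the axes.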

It's a good question though. In principle it's doable to add a suitability map. You would want to add a term to the loss function that accounted for the habitat suitability at a predicted location, maybe by doing a grid lookup on a raster map. The trick would be implementing it in keras and making it fast enough to run in reasonable time.
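To make the idea concrete, here is a rough NumPy sketch of what such a penalty term could look like. This is not locator's code: the function names, the `weight` parameter, the toy raster, and the use of a Euclidean distance as the base term are all assumptions for illustration. It uses bilinear interpolation rather than a nearest-cell lookup because an interpolated value changes smoothly with the predicted coordinates, which is what a gradient-based optimizer would need. A real version would have to express the same arithmetic with the backend's tensor operations.

```python
import numpy as np

def bilinear_suitability(raster, lon, lat, lon_min, lon_max, lat_min, lat_max):
    """Interpolate habitat suitability (0..1) at each (lon, lat) point.

    Assumes raster[i, j] holds row i (latitude, increasing with i) and
    column j (longitude), with the corner cells sitting on the bounds.
    """
    n_rows, n_cols = raster.shape
    # Fractional grid indices, clamped so out-of-range points use the edge.
    x = np.clip((lon - lon_min) / (lon_max - lon_min) * (n_cols - 1), 0, n_cols - 1)
    y = np.clip((lat - lat_min) / (lat_max - lat_min) * (n_rows - 1), 0, n_rows - 1)
    x0 = np.clip(np.floor(x).astype(int), 0, n_cols - 2)
    y0 = np.clip(np.floor(y).astype(int), 0, n_rows - 2)
    x1, y1 = x0 + 1, y0 + 1
    fx, fy = x - x0, y - y0
    return (raster[y0, x0] * (1 - fx) * (1 - fy)
            + raster[y0, x1] * fx * (1 - fy)
            + raster[y1, x0] * (1 - fx) * fy
            + raster[y1, x1] * fx * fy)

def penalized_loss(y_true, y_pred, raster, bounds, weight=1.0):
    """Mean of (distance to true location + weight * unsuitability)."""
    distance = np.sqrt(np.sum((y_pred - y_true) ** 2, axis=-1))
    suitability = bilinear_suitability(raster, y_pred[:, 0], y_pred[:, 1], *bounds)
    return np.mean(distance + weight * (1.0 - suitability))

# Toy 3x3 map (assumption): suitable "coast" in the first column, open
# ocean elsewhere, covering longitude 0..2 and latitude 0..2.
raster = np.zeros((3, 3))
raster[:, 0] = 1.0
bounds = (0.0, 2.0, 0.0, 2.0)

on_coast = np.array([[0.0, 1.0]])
in_ocean = np.array([[2.0, 1.0]])
loss_coast = penalized_loss(on_coast, on_coast, raster, bounds)
loss_ocean = penalized_loss(in_ocean, in_ocean, raster, bounds)
```

The distance is in coordinate units while the penalty is unitless, so `weight` sets how much distance error is traded for landing on suitable habitat and would need tuning. Coarse rasters also give flat regions with zero gradient far from any coast, which is one reason this is harder than it looks.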

If you're interested in diving in and trying to implement it yourself, you're welcome to fork locator and make it your own thing, or file a PR to this repo. The loss function is defined here: https://github.com/kr-colab/locator/blob/master/scripts/locator.py#L224. Otherwise you could file a feature request under the issues tab here and I could take a look (fair warning that at this point I'm essentially doing this as an after-work hobby so it could take a very long time).

CJ
