jennzhuge committed · Commit 2b207de · 1 Parent(s): e86736e
readme
README.md CHANGED
@@ -13,16 +13,16 @@ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-
 
 
 # Welcome to Lofi Amazon Rainforest Beats to Hack/AI to's DNA Identifier Tool.
+This tool is intended to help conservationists and biologists identify unmatched eDNA samples or verify known samples by predicting genus from DNA sequences. When the prediction is uncertain, the tool can also visualize the DNA embedding space to help users hypothesize which species the sequence could belong to.
 
 ## Genus Prediction
-To get started, upload DNA
+To get started, upload a DNA sequence and the coordinates where you sampled it. (The tool could easily be extended to handle multiple DNA sequences via CSV upload.)
 Our tool will output the three most probable genera for your sample, based on its DNA and on environmental factors such as the elevation, annual precipitation, or human activity level of the sample location. You can also see the three most probable genera based on DNA similarity alone.
 
 ## DNA Embedding Space Visualization
 Suppose the highest genus probability for a DNA sequence is very low. This could happen because scientists have not managed to directly sample any specimens of the genus, so our training dataset, BOLD, doesn't contain any examples. We can still examine the DNA embedding of the sequence in relation to known samples. The t-SNE plots show the embedding space of the top N most common species in the area surrounding the given coordinate. We can see clear group distinctions between species. The following t-SNE plot shows where the sample sequence embedding sits in that space and identifies the nearest species clusters.
 
 # Downstream Tasks
-
 Potential downstream tasks include:
 - Identifying invasive species.
 - Reclassifying wrongly classified species. For example, the red panda is called a panda, but it is actually more genetically similar to a raccoon.
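The README text in this diff describes two outputs: a top-three genus ranking and the species clusters nearest to a sample in embedding space. The sketch below illustrates both ideas under stated assumptions; it is not the Space's actual code. The function names, the genus probabilities, and the toy 2-D embeddings are all hypothetical, and the nearest-cluster step uses plain Euclidean distance to per-species centroids rather than t-SNE, which the README uses only for plotting.

```python
import math
from collections import defaultdict


def top_k_genera(probabilities, k=3):
    """Return the k (genus, probability) pairs with the highest probability."""
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


def nearest_clusters(sample, labelled_embeddings, k=3):
    """Rank species by Euclidean distance from `sample` to each species centroid."""
    grouped = defaultdict(list)
    for species, vector in labelled_embeddings:
        grouped[species].append(vector)

    distances = []
    for species, vectors in grouped.items():
        # Centroid = coordinate-wise mean of that species' embeddings.
        centroid = [sum(column) / len(vectors) for column in zip(*vectors)]
        distances.append((species, math.dist(sample, centroid)))
    return sorted(distances, key=lambda item: item[1])[:k]


# Hypothetical model output and toy 2-D embeddings, for illustration only.
genus_probabilities = {
    "Genus A": 0.05,
    "Genus B": 0.62,
    "Genus C": 0.21,
    "Genus D": 0.12,
}
known = [
    ("species_1", [0.0, 0.0]),
    ("species_1", [0.0, 2.0]),
    ("species_2", [10.0, 10.0]),
]

top_three = top_k_genera(genus_probabilities)
closest = nearest_clusters([1.0, 1.0], known, k=1)
print(top_three)
print(closest)
```

A real pipeline would presumably take the probabilities from the trained classifier and the embeddings from the DNA encoder. A low top probability together with a clearly nearest cluster is the situation the visualization section is meant to help with.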