jennzhuge committed on
Commit
2b207de
·
1 Parent(s): e86736e
Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -13,16 +13,16 @@ Check out the configuration reference at https://huggingface.co/docs/hub/spaces-
13
 
14
 
15
  # Welcome to Lofi Amazon Rainforest Beats to Hack/AI to's DNA Identifier Tool.
 
16
 
17
  ## Genus Prediction
18
- To get started, upload DNA sequences and the coordinates where you sampled them.
19
  Our tool will output the top three most probable genera that your sample belongs to based on DNA and environmental factors such as elevation, annual precipitation, or human activity levels of the sample location. You can also see the top three most probable genera based on DNA similarity alone.
20
 
21
  ## DNA Embedding Space Visualization
22
  Even if we have a DNA sequence for which the highest genus probability is very low (this could be because scientists have not managed to directly sample any specimens of the genus, so our training dataset, BOLD, doesn't contain any examples), we can still examine the DNA embedding of the sequence in relation to known samples. The t-SNE plots show the embedding space of the top N most common species in the area surrounding the given coordinates. We can see clear group distinctions between species. The following t-SNE plot shows how the sample sequence embedding is positioned in the space and identifies the nearest species clusters.
23
 
24
  # Downstream Tasks
25
-
26
  Potential downstream tasks include:
27
  - Identifying invasive species.
28
  - Reclassifying wrongly classified species. For example, the red panda is called a panda, but it's actually more genetically similar to a raccoon.
 
13
 
14
 
15
  # Welcome to Lofi Amazon Rainforest Beats to Hack/AI to's DNA Identifier Tool.
16
+ This tool is intended to help conservationists and biologists identify unmatched eDNA samples or verify known samples by predicting genus from DNA sequences. If the prediction is uncertain, the tool can also visualize the DNA embedding space to help users hypothesize about which species the sequence could belong to.
17
 
18
  ## Genus Prediction
19
+ To get started, upload a DNA sequence and the coordinates where you sampled it. (We can easily extend this tool to handle multiple DNA sequences with CSV upload.)
20
  Our tool will output the top three most probable genera that your sample belongs to based on DNA and environmental factors such as elevation, annual precipitation, or human activity levels of the sample location. You can also see the top three most probable genera based on DNA similarity alone.
21
 
22
  ## DNA Embedding Space Visualization
23
  Even if we have a DNA sequence for which the highest genus probability is very low (this could be because scientists have not managed to directly sample any specimens of the genus, so our training dataset, BOLD, doesn't contain any examples), we can still examine the DNA embedding of the sequence in relation to known samples. The t-SNE plots show the embedding space of the top N most common species in the area surrounding the given coordinates. We can see clear group distinctions between species. The following t-SNE plot shows how the sample sequence embedding is positioned in the space and identifies the nearest species clusters.
24
 
25
  # Downstream Tasks
 
26
  Potential downstream tasks include:
27
  - Identifying invasive species.
28
  - Reclassifying wrongly classified species. For example, the red panda is called a panda, but it's actually more genetically similar to a raccoon.