---
datasets:
- EvaKlimentova/knots_AF
license: apache-2.0
---
# M2 - small CNN trained on embeddings

The model is a small convolutional neural network trained on [ProtBert-BFD](https://huggingface.co/Rostlab/prot_bert_bfd) embeddings of the [knots_AF dataset](https://huggingface.co/datasets/EvaKlimentova/knots_AF). It classifies proteins as knotted or unknotted based on their amino acid sequence.
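
As a rough illustration of the embedding step, the sketch below computes per-residue ProtBert-BFD embeddings for one sequence with the `transformers` library. The preprocessing (upper-casing, mapping the rare residues U, Z, O, B to X, and space-separating residues) follows the ProtBert-BFD model card. The function names `prepare_sequence` and `embed_sequence` are hypothetical, and this card does not state the exact preprocessing or embedding shape that M2 expects, so treat this as an assumption rather than the authors' pipeline.

```python
import re

MODEL_NAME = "Rostlab/prot_bert_bfd"


def prepare_sequence(sequence: str) -> str:
    """Upper-case, map rare residues U/Z/O/B to X, and space-separate residues."""
    cleaned = re.sub(r"[UZOB]", "X", sequence.strip().upper())
    return " ".join(cleaned)


def embed_sequence(sequence: str):
    """Return per-residue embeddings as a tensor of shape (length, hidden_size).

    Hypothetical helper: whether M2 consumes per-residue or pooled
    embeddings is an assumption, not something stated in this card.
    """
    # Imported lazily so the preprocessing helper works without these packages.
    import torch
    from transformers import BertModel, BertTokenizer

    tokenizer = BertTokenizer.from_pretrained(MODEL_NAME, do_lower_case=False)
    model = BertModel.from_pretrained(MODEL_NAME)
    model.eval()

    inputs = tokenizer(prepare_sequence(sequence), return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]

    # Drop the [CLS] and [SEP] positions so one row remains per residue.
    return hidden[1:-1]
```

Calling `embed_sequence("MKTAYIAKQR")` downloads the ProtBert-BFD weights on first use and returns one embedding vector per residue.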

Performance on the test set, overall and by protein family (TPR and TNR treat knotted proteins as the positive class):

|            Subset            | Dataset size | Unknotted set size | Accuracy |   TPR  |   TNR  |
|:----------------------------:|:------------:|:------------------:|:--------:|:------:|:------:|
|              All             |     39412    |        19718       |  0.9690  | 0.9569 | 0.9811 |
|             SPOUT            |     7371     |         550        |  0.9712  | 0.9815 | 0.8436 |
|              TDD             |      612     |         24         |  0.9673  | 0.9796 | 0.6667 |
|              DUF             |      716     |         429        |  0.9413  | 0.8955 | 0.9720 |
|        AdoMet synthase       |     1794     |         240        |  0.9727  | 0.9755 | 0.9542 |
|      Carbonic anhydrase      |     1531     |         539        |  0.8870  | 0.8619 | 0.9332 |
|              UCH             |      477     |         125        |  0.8700  | 0.8892 | 0.8160 |
|         ATCase/OTCase        |     3799     |        3352        |  0.9932  | 0.9418 | 1.0000 |
|    ribosomal-mitochondrial   |      147     |         41         |  0.8163  | 0.8319 | 0.7805 |
|           membrane           |     8309     |        1577        |  0.9740  | 0.9857 | 0.9239 |
|              VIT             |     14347    |        12639       |  0.9742  | 0.8214 | 0.9948 |
| biosynthesis of lantibiotics |      392     |         286        |  0.9388  | 0.8019 | 0.9895 |
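
The columns of the table are related: accuracy is the class-size-weighted average of TPR and TNR. The short sketch below recomputes accuracy from the other columns for three rows, assuming that the knotted set size is the dataset size minus the unknotted set size and that knotted is the positive class. The function name `accuracy_from_rates` is hypothetical and is used only for this check.

```python
def accuracy_from_rates(total: int, unknotted: int, tpr: float, tnr: float) -> float:
    """Combine TPR and TNR into accuracy, weighting by class size.

    Assumes knotted proteins are the positive class, so the number of
    positives is total - unknotted.
    """
    knotted = total - unknotted
    return (tpr * knotted + tnr * unknotted) / total


# (subset, dataset size, unknotted set size, reported accuracy, TPR, TNR)
rows = [
    ("All", 39412, 19718, 0.9690, 0.9569, 0.9811),
    ("SPOUT", 7371, 550, 0.9712, 0.9815, 0.8436),
    ("VIT", 14347, 12639, 0.9742, 0.8214, 0.9948),
]

for name, total, unknotted, reported, tpr, tnr in rows:
    recomputed = accuracy_from_rates(total, unknotted, tpr, tnr)
    print(f"{name}: reported={reported:.4f} recomputed={recomputed:.4f}")
```

This also shows why a high accuracy can hide a weak class: in the VIT row most proteins are unknotted, so the high TNR dominates the accuracy even though the TPR is noticeably lower.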