eachanjohnson committed on
Commit
dee10df
·
verified ·
1 Parent(s): 5a9d45c

Upload folder using huggingface_hub

README.md ADDED
@@ -0,0 +1,140 @@
1
+ ---
2
+ license: mit
3
+ pipeline_tag: tabular-regression
4
+ tags:
5
+ - chemistry
6
+ - microbiology
7
+ - antibiotics
8
+ library_name: duvida
9
+ datasets:
10
+ - scbirlab/thomas-2018-spark-wt
11
+ ---
12
+
13
+ # Predictor of _Yersinia pestis_ MICs
14
+
15
+ _Updated:_ Fri 28 Mar 18:11:51 GMT 2025
16
+
17
+ Trained on the _Yersinia pestis_, WT accumulator phenotype subset of the [human-curated SPARK dataset](https://doi.org/10.1021/acsinfecdis.8b00193) ( rows in total for _Yersinia pestis_).
18
+
19
+ ## Model details
20
+
21
+ This model was trained using [our Duvida framework](https://github.com/scbirlab/duvida),
22
+ by running a hyperparameter search and selecting the model that performed best on unseen test data
23
+ (from a scaffold split).
24
+
25
+ Duvida also saves the training data in this checkpoint to allow the calculation of uncertainty metrics
26
+ based on that training data.
27
+
28
+ This model is the best regression model from a hyperparameter search, determined
29
+ by Spearman's $\rho$ on a held-out test set not used in training or early stopping.
30
+
31
+ ### Model architecture
32
+
33
+ - **Regression**
34
+
35
+ ```json
36
+
37
+ {
38
+ "dropout": 0.2,
39
+ "ensemble_size": 10,
40
+ "extra_featurizers": null,
41
+ "learning_rate": 0.0001,
42
+ "model_class": "FPMLPModelBox",
43
+ "n_hidden": 2,
44
+ "n_units": 16,
45
+ "use_2d": true,
46
+ "use_fp": true
47
+ }
48
+ ```
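As a rough check on the size of this architecture, the sketch below counts the weights of a plain fully connected network. It relies on assumptions that the README does not state: an input width of 2248 (the `n_input` value logged in `hparams.yaml`), a layout of input layer, one hidden-to-hidden layer and a scalar output layer, and a bias term on every layer. Under those assumptions the total agrees with the `n_parameters` value of 362730 reported in `metrics.csv`, although other layouts are not ruled out.

```python
def mlp_param_count(n_input: int, n_units: int, n_hidden: int, n_out: int = 1) -> int:
    """Count weights and biases of a plain MLP (assumed layout, not confirmed by Duvida)."""
    # Layer widths: input -> n_hidden layers of n_units -> output.
    widths = [n_input] + [n_units] * n_hidden + [n_out]
    # Each dense layer has w_in * w_out weights plus w_out biases.
    return sum(w_in * w_out + w_out for w_in, w_out in zip(widths, widths[1:]))


per_member = mlp_param_count(n_input=2248, n_units=16, n_hidden=2)
total = per_member * 10  # ensemble_size = 10
print(per_member, total)  # 36273 362730
```

Almost all of the parameters sit in the first layer (2248 x 16 weights), so the input featurization dominates the model size.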
49
+
50
+ ### Model usage
51
+
52
+ You can use this model with:
53
+
54
+ ```python
55
+ from duvida.autoclasses import AutoModelBox
56
+ modelbox = AutoModelBox.from_pretrained("hf://scbirlab/spark-dv-2503-ypes")
57
+ modelbox.predict(filename=..., inputs=[...], columns=[...]) # make predictions on your own data
58
+ ```
59
+
60
+ ## Training details
61
+
62
+ - **Dataset:** [SPARK, WT accumulator, _Yersinia pestis_ subset](https://huggingface.co/datasets/scbirlab/thomas-2018-spark-wt)
63
+ - **Input column:** smiles
64
+ - **Output column:** pmic
65
+ - **Split type:** Murcko scaffold
66
+ - **Split proportions:**
67
+ - 70% training (7002 rows)
68
+ - 15% validation (for early stopping) (1499 rows)
69
+ - 15% test (for selecting hyperparameters) (1501 rows)
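The row counts above can be checked against the stated proportions with a few lines of arithmetic. This is only a sanity check on the numbers in the list; the variable names are illustrative.

```python
# Row counts as listed in the split proportions above.
split_rows = {"training": 7002, "validation": 1499, "test": 1501}

total_rows = sum(split_rows.values())
fractions = {name: n / total_rows for name, n in split_rows.items()}

for name, frac in fractions.items():
    print(f"{name}: {split_rows[name]} rows, {frac:.1%} of {total_rows}")
```

The fractions are not exactly 70/15/15 because a scaffold split assigns whole scaffold groups to one split, so the boundaries cannot fall at arbitrary row counts.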
70
+
71
+ Here is the training log:
72
+
73
+ <img src="training-log.png" width=450>
74
+
75
+ And these are the evaluation scores.
76
+
77
+ Train (7002 rows):
78
+
79
+ ```json
80
+
81
+ {
82
+ "Pearson r": 0.8516616817450599,
83
+ "RMSE": 0.037040311843156815,
84
+ "Spearman rho": 0.97379069361196
85
+ }
86
+ ```
87
+
88
+ Validation (1499 rows):
89
+
90
+ ```json
91
+
92
+ {
93
+ "Pearson r": 0.7403999692649684,
94
+ "RMSE": 0.041353218257427216,
95
+ "Spearman rho": 0.9332247081913623
96
+ }
97
+ ```
98
+
99
+
100
+ Test (1501 rows):
101
+
102
+ ```json
103
+
104
+ {
105
+ "Pearson r": 0.7870136836550153,
106
+ "RMSE": 0.042944587767124176,
107
+ "Spearman rho": 0.9723500084144376
108
+ }
109
+ ```
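The three scores reported for each split can be reproduced from a predictions file with standard definitions. The sketch below is a dependency-free version of those definitions, not Duvida's own evaluation code, and the toy values are hypothetical. It also illustrates one way Spearman's rho can be higher than Pearson's r, as in the scores above: predictions that rank compounds correctly but are not linearly related to the labels.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson_r(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical labels and predictions: monotonic but non-linear.
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.0, 1.1, 1.3, 2.0, 5.0]

print("RMSE:", rmse(y_true, y_pred))
print("Pearson r:", pearson_r(y_true, y_pred))
print("Spearman rho:", spearman_rho(y_true, y_pred))
```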
110
+
111
+ ## Training data details
112
+
113
+ The training data were collated by the authors of:
114
+
115
+ > Joe Thomas, Marc Navre, Aileen Rubio, and Allan Coukell
116
+ > Shared Platform for Antibiotic Research and Knowledge: A Collaborative Tool to SPARK Antibiotic Discovery
117
+ > ACS Infectious Diseases 2018 4 (11), 1536-1539
118
+ > DOI: 10.1021/acsinfecdis.8b00193
119
+
120
+ We cleaned the original SPARK dataset to subset the most relevant columns, remove empty values,
121
+ give succinct column titles, and split by species.
122
+
123
+ This particular dataset retains only measurements on bacteria with wild-type accumulation phenotypes.
124
+
125
+ ### Dataset Sources
126
+
127
+ - **Repository:** https://www.collaborativedrug.com/spark-data-downloads
128
+ - **Paper:** https://doi.org/10.1021/acsinfecdis.8b00193
129
+
130
+ ### Data Collection and Processing
131
+
132
+ Data were processed using [schemist](https://github.com/scbirlab/schemist), a tool for processing chemical datasets.
133
+
134
+ The SMILES strings have been canonicalized, and split into training (70%), validation (15%), and test (15%) sets
135
+ by Murcko scaffold for each species with more than 1000 entries. Additional features like molecular weight and
136
+ topological polar surface area have also been calculated.
137
+
138
+ ### Who are the source data producers?
139
+
140
+ Joe Thomas, Marc Navre, Aileen Rubio, and Allan Coukell
data-config.json ADDED
@@ -0,0 +1,17 @@
1
+ {
2
+ "_default_cache": "cache/duvida/data",
3
+ "_in_key": "inputs",
4
+ "_input_cols": [
5
+ "smiles"
6
+ ],
7
+ "_label_cols": [
8
+ "pmic"
9
+ ],
10
+ "_out_key": "labels",
11
+ "input_shape": [
12
+ 2248
13
+ ],
14
+ "output_shape": [
15
+ 1
16
+ ]
17
+ }
data-load-args.json ADDED
@@ -0,0 +1,10 @@
1
+ {
2
+ "cache": "/nemo/lab/johnsone/home/users/johnsoe/projects/abx-discovery-strategy/models/spark/Yersinia-pestis/11/cache",
3
+ "features": [
4
+ "smiles"
5
+ ],
6
+ "filename": "/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Yersinia-pestis/scaffold-split-train.csv.gz",
7
+ "labels": [
8
+ "pmic"
9
+ ]
10
+ }
eval-metrics_test.json ADDED
@@ -0,0 +1,5 @@
1
+ {
2
+ "Pearson r": 0.7870136836550153,
3
+ "RMSE": 0.042944587767124176,
4
+ "Spearman rho": 0.9723500084144376
5
+ }
eval-metrics_train.json ADDED
@@ -0,0 +1,5 @@
1
+ {
2
+ "Pearson r": 0.8516616817450599,
3
+ "RMSE": 0.037040311843156815,
4
+ "Spearman rho": 0.97379069361196
5
+ }
eval-metrics_validation.json ADDED
@@ -0,0 +1,5 @@
1
+ {
2
+ "Pearson r": 0.7403999692649684,
3
+ "RMSE": 0.041353218257427216,
4
+ "Spearman rho": 0.9332247081913623
5
+ }
input-data.hf/data-00000-of-00001.arrow ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:87cacd4bbececee8e029375f15a02f8b79032b9d286247d482e8eaa39b62bbf6
3
+ size 714472
input-data.hf/dataset_info.json ADDED
@@ -0,0 +1,52 @@
1
+ {
2
+ "builder_name": "csv",
3
+ "citation": "",
4
+ "config_name": "default",
5
+ "dataset_name": "csv",
6
+ "dataset_size": 2760245,
7
+ "description": "",
8
+ "download_checksums": {
9
+ "/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Yersinia-pestis/scaffold-split-train.csv.gz": {
10
+ "num_bytes": 559702,
11
+ "checksum": null
12
+ }
13
+ },
14
+ "download_size": 559702,
15
+ "features": {
16
+ "smiles": {
17
+ "dtype": "string",
18
+ "_type": "Value"
19
+ },
20
+ "inputs": {
21
+ "feature": {
22
+ "dtype": "string",
23
+ "_type": "Value"
24
+ },
25
+ "_type": "Sequence"
26
+ },
27
+ "labels": {
28
+ "feature": {
29
+ "dtype": "float64",
30
+ "_type": "Value"
31
+ },
32
+ "_type": "Sequence"
33
+ }
34
+ },
35
+ "homepage": "",
36
+ "license": "",
37
+ "size_in_bytes": 3319947,
38
+ "splits": {
39
+ "train": {
40
+ "name": "train",
41
+ "num_bytes": 2760245,
42
+ "num_examples": 7002,
43
+ "dataset_name": "csv"
44
+ }
45
+ },
46
+ "version": {
47
+ "version_str": "0.0.0",
48
+ "major": 0,
49
+ "minor": 0,
50
+ "patch": 0
51
+ }
52
+ }
input-data.hf/state.json ADDED
@@ -0,0 +1,13 @@
1
+ {
2
+ "_data_files": [
3
+ {
4
+ "filename": "data-00000-of-00001.arrow"
5
+ }
6
+ ],
7
+ "_fingerprint": "754d1b2639f55a04",
8
+ "_format_columns": null,
9
+ "_format_kwargs": {},
10
+ "_format_type": null,
11
+ "_output_all_columns": false,
12
+ "_split": "train"
13
+ }
logs-csv/lightning_logs/version_0/hparams.yaml ADDED
@@ -0,0 +1,13 @@
1
+ dropout: 0.2
2
+ ensemble_size: 10
3
+ extra_featurizers: null
4
+ learning_rate: 0.0001
5
+ n_hidden: 2
6
+ n_input: 2248
7
+ n_out: 1
8
+ n_units: 16
9
+ optimizer: !!python/name:torch.optim.adam.Adam ''
10
+ reduce_lr_on_plateau: true
11
+ reduce_lr_patience: 10
12
+ use_2d: true
13
+ use_fp: true
logs-csv/lightning_logs/version_0/metrics.csv ADDED
@@ -0,0 +1,153 @@
1
+ epoch,learning_rate,loss,step,val_loss
2
+ 0,9.999999747378752e-05,,437,0.1413220316171646
3
+ 0,,4.066796779632568,437,
4
+ 1,9.999999747378752e-05,,875,0.1224312037229538
5
+ 1,,1.0922460556030273,875,
6
+ 2,9.999999747378752e-05,,1313,0.0954282358288765
7
+ 2,,0.9123325943946838,1313,
8
+ 3,9.999999747378752e-05,,1751,0.07629581540822983
9
+ 3,,0.7958776950836182,1751,
10
+ 4,9.999999747378752e-05,,2189,0.07736650854349136
11
+ 4,,0.7250118255615234,2189,
12
+ 5,9.999999747378752e-05,,2627,0.06728645414113998
13
+ 5,,0.6619632244110107,2627,
14
+ 6,9.999999747378752e-05,,3065,0.0589933842420578
15
+ 6,,0.6077855825424194,3065,
16
+ 7,9.999999747378752e-05,,3503,0.056337159126996994
17
+ 7,,0.5694469213485718,3503,
18
+ 8,9.999999747378752e-05,,3941,0.057812344282865524
19
+ 8,,0.5191640257835388,3941,
20
+ 9,9.999999747378752e-05,,4379,0.05062902346253395
21
+ 9,,0.48335176706314087,4379,
22
+ 10,9.999999747378752e-05,,4817,0.0412386953830719
23
+ 10,,0.44524967670440674,4817,
24
+ 11,9.999999747378752e-05,,5255,0.037790026515722275
25
+ 11,,0.41807180643081665,5255,
26
+ 12,9.999999747378752e-05,,5693,0.036534860730171204
27
+ 12,,0.39038118720054626,5693,
28
+ 13,9.999999747378752e-05,,6131,0.037033967673778534
29
+ 13,,0.35996440052986145,6131,
30
+ 14,9.999999747378752e-05,,6569,0.03006904385983944
31
+ 14,,0.3394574820995331,6569,
32
+ 15,9.999999747378752e-05,,7007,0.026293819770216942
33
+ 15,,0.31450673937797546,7007,
34
+ 16,9.999999747378752e-05,,7445,0.02689516358077526
35
+ 16,,0.29431262612342834,7445,
36
+ 17,9.999999747378752e-05,,7883,0.02622925490140915
37
+ 17,,0.28134000301361084,7883,
38
+ 18,9.999999747378752e-05,,8321,0.021103335544466972
39
+ 18,,0.26751601696014404,8321,
40
+ 19,9.999999747378752e-05,,8759,0.023117465898394585
41
+ 19,,0.2504330575466156,8759,
42
+ 20,9.999999747378752e-05,,9197,0.020669564604759216
43
+ 20,,0.24217894673347473,9197,
44
+ 21,9.999999747378752e-05,,9635,0.017657894641160965
45
+ 21,,0.2293732464313507,9635,
46
+ 22,9.999999747378752e-05,,10073,0.017272211611270905
47
+ 22,,0.21885405480861664,10073,
48
+ 23,9.999999747378752e-05,,10511,0.014207125641405582
49
+ 23,,0.2088192105293274,10511,
50
+ 24,9.999999747378752e-05,,10949,0.013398121111094952
51
+ 24,,0.20257121324539185,10949,
52
+ 25,9.999999747378752e-05,,11387,0.014862680807709694
53
+ 25,,0.1938035935163498,11387,
54
+ 26,9.999999747378752e-05,,11825,0.012240545824170113
55
+ 26,,0.1880243569612503,11825,
56
+ 27,9.999999747378752e-05,,12263,0.010983969084918499
57
+ 27,,0.1807633489370346,12263,
58
+ 28,9.999999747378752e-05,,12701,0.011664721183478832
59
+ 28,,0.17542214691638947,12701,
60
+ 29,9.999999747378752e-05,,13139,0.011810495518147945
61
+ 29,,0.17104730010032654,13139,
62
+ 30,9.999999747378752e-05,,13577,0.01122882030904293
63
+ 30,,0.16505487263202667,13577,
64
+ 31,9.999999747378752e-05,,14015,0.009850334376096725
65
+ 31,,0.16016913950443268,14015,
66
+ 32,9.999999747378752e-05,,14453,0.010203495621681213
67
+ 32,,0.15770508348941803,14453,
68
+ 33,9.999999747378752e-05,,14891,0.008621348068118095
69
+ 33,,0.15496382117271423,14891,
70
+ 34,9.999999747378752e-05,,15329,0.00886634923517704
71
+ 34,,0.15054166316986084,15329,
72
+ 35,9.999999747378752e-05,,15767,0.008278611116111279
73
+ 35,,0.14714494347572327,15767,
74
+ 36,9.999999747378752e-05,,16205,0.01034875400364399
75
+ 36,,0.14546401798725128,16205,
76
+ 37,9.999999747378752e-05,,16643,0.008283929899334908
77
+ 37,,0.14061588048934937,16643,
78
+ 38,9.999999747378752e-05,,17081,0.008863476105034351
79
+ 38,,0.13884156942367554,17081,
80
+ 39,9.999999747378752e-05,,17519,0.008464700542390347
81
+ 39,,0.1370924860239029,17519,
82
+ 40,9.999999747378752e-05,,17957,0.008587003685534
83
+ 40,,0.13470688462257385,17957,
84
+ 41,9.999999747378752e-05,,18395,0.007929904386401176
85
+ 41,,0.1306893229484558,18395,
86
+ 42,9.999999747378752e-05,,18833,0.008624639362096786
87
+ 42,,0.13056859374046326,18833,
88
+ 43,9.999999747378752e-05,,19271,0.00847290363162756
89
+ 43,,0.12831832468509674,19271,
90
+ 44,9.999999747378752e-05,,19709,0.007803838234394789
91
+ 44,,0.12601053714752197,19709,
92
+ 45,9.999999747378752e-05,,20147,0.008512926287949085
93
+ 45,,0.12347382307052612,20147,
94
+ 46,9.999999747378752e-05,,20585,0.008274545893073082
95
+ 46,,0.12048626691102982,20585,
96
+ 47,9.999999747378752e-05,,21023,0.008215994574129581
97
+ 47,,0.11959672719240189,21023,
98
+ 48,9.999999747378752e-05,,21461,0.007745087146759033
99
+ 48,,0.11766844987869263,21461,
100
+ 49,9.999999747378752e-05,,21899,0.007633403409272432
101
+ 49,,0.11644507199525833,21899,
102
+ 50,9.999999747378752e-05,,22337,0.007164951413869858
103
+ 50,,0.11457140743732452,22337,
104
+ 51,9.999999747378752e-05,,22775,0.007660590577870607
105
+ 51,,0.11204110085964203,22775,
106
+ 52,9.999999747378752e-05,,23213,0.006850301753729582
107
+ 52,,0.11089475452899933,23213,
108
+ 53,9.999999747378752e-05,,23651,0.007705322001129389
109
+ 53,,0.10870277881622314,23651,
110
+ 54,9.999999747378752e-05,,24089,0.007404465693980455
111
+ 54,,0.10835524648427963,24089,
112
+ 55,9.999999747378752e-05,,24527,0.006458517163991928
113
+ 55,,0.10650914907455444,24527,
114
+ 56,9.999999747378752e-05,,24965,0.007545735687017441
115
+ 56,,0.10408639162778854,24965,
116
+ 57,9.999999747378752e-05,,25403,0.007189831230789423
117
+ 57,,0.10364826023578644,25403,
118
+ 58,9.999999747378752e-05,,25841,0.006769783794879913
119
+ 58,,0.10162495821714401,25841,
120
+ 59,9.999999747378752e-05,,26279,0.007359679788351059
121
+ 59,,0.1005307286977768,26279,
122
+ 60,9.999999747378752e-05,,26717,0.007181616500020027
123
+ 60,,0.09889473021030426,26717,
124
+ 61,9.999999747378752e-05,,27155,0.007292558439075947
125
+ 61,,0.09785006195306778,27155,
126
+ 62,9.999999747378752e-05,,27593,0.006553602870553732
127
+ 62,,0.09617737680673599,27593,
128
+ 63,9.999999747378752e-05,,28031,0.007114957086741924
129
+ 63,,0.09512084722518921,28031,
130
+ 64,9.999999747378752e-05,,28469,0.007685838732868433
131
+ 64,,0.09308712184429169,28469,
132
+ 65,9.999999747378752e-05,,28907,0.006641875021159649
133
+ 65,,0.09165126830339432,28907,
134
+ 66,9.999999747378752e-05,,29345,0.0069791534915566444
135
+ 66,,0.09131074696779251,29345,
136
+ 67,9.999999747378752e-06,,29783,0.006699174176901579
137
+ 67,,0.09000696986913681,29783,
138
+ 68,9.999999747378752e-06,,30221,0.006658358033746481
139
+ 68,,0.08921317756175995,30221,
140
+ 69,9.999999747378752e-06,,30659,0.006615882273763418
141
+ 69,,0.0899544283747673,30659,
142
+ 70,9.999999747378752e-06,,31097,0.006855498533695936
143
+ 70,,0.0888020247220993,31097,
144
+ 71,9.999999747378752e-06,,31535,0.0065613421611487865
145
+ 71,,0.08810604363679886,31535,
146
+ 72,9.999999747378752e-06,,31973,0.006737003102898598
147
+ 72,,0.08877792209386826,31973,
148
+ 73,9.999999747378752e-06,,32411,0.006567845121026039
149
+ 73,,0.08856819570064545,32411,
150
+ 74,9.999999747378752e-06,,32849,0.006733382120728493
151
+ 74,,0.08776818960905075,32849,
152
+ 75,9.999999747378752e-06,,33287,0.006751921493560076
153
+ 75,,0.08756214380264282,33287,
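In the log above, the learning rate drops from about 1e-4 to about 1e-5 at epoch 67, which fits the `reduce_lr_on_plateau: true` and `reduce_lr_patience: 10` entries in `hparams.yaml`. The sketch below is a minimal pure-Python plateau rule, not Duvida's scheduler. The reduction factor of 0.1 and the relative improvement threshold of 1e-4 are assumptions (they match the defaults of PyTorch's `ReduceLROnPlateau`), and the validation losses are rounded copies of the logged values for epochs 52 to 66.

```python
def plateau_schedule(val_losses, lr, patience=10, factor=0.1, threshold=1e-4):
    """Return the learning rate in force after each epoch's validation loss."""
    best = float("inf")
    bad_epochs = 0
    lrs = []
    for loss in val_losses:
        if loss < best * (1.0 - threshold):  # relative improvement over the best so far
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
        if bad_epochs > patience:  # more than `patience` epochs without improvement
            lr *= factor
            bad_epochs = 0
        lrs.append(lr)
    return lrs


# Rounded val_loss values from the log, epochs 52..66 (best value is at epoch 55).
epochs = list(range(52, 67))
val_losses = [
    0.0068503, 0.0077053, 0.0074045, 0.0064585, 0.0075457,
    0.0071898, 0.0067698, 0.0073597, 0.0071816, 0.0072926,
    0.0065536, 0.0071150, 0.0076858, 0.0066419, 0.0069792,
]

lrs = plateau_schedule(val_losses, lr=1e-4)
first_reduced = next(e for e, rate in zip(epochs, lrs) if rate < 1e-4)
print(first_reduced, lrs[-1])
```

Under these assumptions the reduction is triggered after epoch 66, the eleventh epoch in a row without improvement on epoch 55, so the lower rate first appears in the row for epoch 67.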
logs/lightning_logs/version_0/events.out.tfevents.1743098444.cn039.616674.0 ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5aa8a7f2bfd9173f39192aa9ada6622131f9af524cc7736eeff2a71f2a1cc65f
3
+ size 18621
logs/lightning_logs/version_0/hparams.yaml ADDED
@@ -0,0 +1,13 @@
1
+ dropout: 0.2
2
+ ensemble_size: 10
3
+ extra_featurizers: null
4
+ learning_rate: 0.0001
5
+ n_hidden: 2
6
+ n_input: 2248
7
+ n_out: 1
8
+ n_units: 16
9
+ optimizer: !!python/name:torch.optim.adam.Adam ''
10
+ reduce_lr_on_plateau: true
11
+ reduce_lr_patience: 10
12
+ use_2d: true
13
+ use_fp: true
metrics.csv ADDED
@@ -0,0 +1,4 @@
1
+ split,split_filename,config_i,model_class,n_parameters,filename,features,labels,cache,extra_featurizers,use_2d,use_fp,dropout,ensemble_size,learning_rate,n_hidden,n_units,val_filename,epochs,batch_size,RMSE,Pearson r,Spearman rho
2
+ train,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Yersinia-pestis/scaffold-split-train.csv.gz,11,FPMLPModelBox,362730,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Yersinia-pestis/scaffold-split-train.csv.gz,['smiles'],['pmic'],/nemo/lab/johnsone/home/users/johnsoe/projects/abx-discovery-strategy/models/spark/Yersinia-pestis/11/cache,,True,True,0.2,10,0.0001,2,16,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Yersinia-pestis/scaffold-split-validation.csv.gz,2000,16,0.037040311843156815,0.8516616817450599,0.97379069361196
3
+ validation,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Yersinia-pestis/scaffold-split-validation.csv.gz,11,FPMLPModelBox,362730,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Yersinia-pestis/scaffold-split-train.csv.gz,['smiles'],['pmic'],/nemo/lab/johnsone/home/users/johnsoe/projects/abx-discovery-strategy/models/spark/Yersinia-pestis/11/cache,,True,True,0.2,10,0.0001,2,16,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Yersinia-pestis/scaffold-split-validation.csv.gz,2000,16,0.041353218257427216,0.7403999692649684,0.9332247081913623
4
+ test,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Yersinia-pestis/scaffold-split-test.csv.gz,11,FPMLPModelBox,362730,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Yersinia-pestis/scaffold-split-train.csv.gz,['smiles'],['pmic'],/nemo/lab/johnsone/home/users/johnsoe/projects/abx-discovery-strategy/models/spark/Yersinia-pestis/11/cache,,True,True,0.2,10,0.0001,2,16,/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Yersinia-pestis/scaffold-split-validation.csv.gz,2000,16,0.042944587767124176,0.7870136836550153,0.9723500084144376
modelbox-config.json ADDED
@@ -0,0 +1,11 @@
1
+ {
2
+ "dropout": 0.2,
3
+ "ensemble_size": 10,
4
+ "extra_featurizers": null,
5
+ "learning_rate": 0.0001,
6
+ "model_class": "FPMLPModelBox",
7
+ "n_hidden": 2,
8
+ "n_units": 16,
9
+ "use_2d": true,
10
+ "use_fp": true
11
+ }
params.pt ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a9144b2299a6f26a2ed9447251427551ec1181dfea0c04e5929b258979fc80b6
3
+ size 1482978
predictions_test.csv.gz ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:095491ac1071c3c9079d3b95600ed1134e685a1c11801cb7cb535c2ff882b061
3
+ size 1629104
predictions_train.csv.gz ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:f3b511605ace094d76531dfb80e25a1593bae4f00a76845bfc9485d75dccf742
3
+ size 6731796
predictions_validation.csv.gz ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a02adfb354bcb1a2fc6a124c7b0b65aa903562b0a1c449349a2575ca69018669
3
+ size 1561499
repo-name.txt ADDED
@@ -0,0 +1 @@
1
+ scbirlab/spark-dv-2503-ypes
training-args.json ADDED
@@ -0,0 +1,5 @@
1
+ {
2
+ "batch_size": 16,
3
+ "epochs": 2000,
4
+ "val_filename": "/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Yersinia-pestis/scaffold-split-validation.csv.gz"
5
+ }
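The `batch_size` of 16 ties in with the `step` column of the training logs. The sketch below assumes that the final, partial batch of each epoch is kept and that the logged step is the zero-based index of the last optimizer step in the epoch. Neither is stated in this commit, but both are consistent with the logged values (437 at epoch 0 and 33287 at epoch 75).

```python
import math

n_train_rows = 7002  # training split size from the README
batch_size = 16      # from training-args.json

# Number of batches per epoch when the last partial batch is kept.
steps_per_epoch = math.ceil(n_train_rows / batch_size)


def last_step_of_epoch(epoch: int) -> int:
    """Zero-based index of the final step in a zero-based epoch."""
    return steps_per_epoch * (epoch + 1) - 1


print(steps_per_epoch, last_step_of_epoch(0), last_step_of_epoch(75))
```

Although `epochs` is set to 2000, the logs end at epoch 75, which suggests that 2000 acts as an upper bound and that early stopping on the validation set ended the run.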
training-data.hf/cache-c576e31571e711ad.arrow ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:959a614450be7989a79900ff4684261077ef4e69646ce081b02e9bb8ad3e1551
3
+ size 126862848
training-data.hf/data-00000-of-00001.arrow ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:db76da8bf189fcd4b2421108c5534d504b8d09606b4361f62d45689219923beb
3
+ size 126338936
training-data.hf/dataset_info.json ADDED
@@ -0,0 +1,52 @@
1
+ {
2
+ "builder_name": "csv",
3
+ "citation": "",
4
+ "config_name": "default",
5
+ "dataset_name": "csv",
6
+ "dataset_size": 2760245,
7
+ "description": "",
8
+ "download_checksums": {
9
+ "/nemo/lab/johnsone/home/users/johnsoe/data/datasets/thomas-2018-spark-wt/Yersinia-pestis/scaffold-split-train.csv.gz": {
10
+ "num_bytes": 559702,
11
+ "checksum": null
12
+ }
13
+ },
14
+ "download_size": 559702,
15
+ "features": {
16
+ "smiles": {
17
+ "dtype": "string",
18
+ "_type": "Value"
19
+ },
20
+ "inputs": {
21
+ "feature": {
22
+ "dtype": "float64",
23
+ "_type": "Value"
24
+ },
25
+ "_type": "Sequence"
26
+ },
27
+ "labels": {
28
+ "feature": {
29
+ "dtype": "float64",
30
+ "_type": "Value"
31
+ },
32
+ "_type": "Sequence"
33
+ }
34
+ },
35
+ "homepage": "",
36
+ "license": "",
37
+ "size_in_bytes": 3319947,
38
+ "splits": {
39
+ "train": {
40
+ "name": "train",
41
+ "num_bytes": 2760245,
42
+ "num_examples": 7002,
43
+ "dataset_name": "csv"
44
+ }
45
+ },
46
+ "version": {
47
+ "version_str": "0.0.0",
48
+ "major": 0,
49
+ "minor": 0,
50
+ "patch": 0
51
+ }
52
+ }
training-data.hf/state.json ADDED
@@ -0,0 +1,15 @@
1
+ {
2
+ "_data_files": [
3
+ {
4
+ "filename": "data-00000-of-00001.arrow"
5
+ }
6
+ ],
7
+ "_fingerprint": "187fee60d2df35f6",
8
+ "_format_columns": null,
9
+ "_format_kwargs": {
10
+ "dtype": "float"
11
+ },
12
+ "_format_type": "numpy",
13
+ "_output_all_columns": false,
14
+ "_split": "train"
15
+ }
training-log.csv ADDED
@@ -0,0 +1,77 @@
1
+ epoch,step,learning_rate,loss,val_loss
2
+ 0,437,9.999999747378752e-05,4.066796779632568,0.1413220316171646
3
+ 1,875,9.999999747378752e-05,1.0922460556030271,0.1224312037229538
4
+ 2,1313,9.999999747378752e-05,0.9123325943946838,0.0954282358288765
5
+ 3,1751,9.999999747378752e-05,0.7958776950836182,0.0762958154082298
6
+ 4,2189,9.999999747378752e-05,0.7250118255615234,0.0773665085434913
7
+ 5,2627,9.999999747378752e-05,0.6619632244110107,0.0672864541411399
8
+ 6,3065,9.999999747378752e-05,0.6077855825424194,0.0589933842420578
9
+ 7,3503,9.999999747378752e-05,0.5694469213485718,0.0563371591269969
10
+ 8,3941,9.999999747378752e-05,0.5191640257835388,0.0578123442828655
11
+ 9,4379,9.999999747378752e-05,0.4833517670631408,0.0506290234625339
12
+ 10,4817,9.999999747378752e-05,0.4452496767044067,0.0412386953830719
13
+ 11,5255,9.999999747378752e-05,0.4180718064308166,0.0377900265157222
14
+ 12,5693,9.999999747378752e-05,0.3903811872005462,0.0365348607301712
15
+ 13,6131,9.999999747378752e-05,0.3599644005298614,0.0370339676737785
16
+ 14,6569,9.999999747378752e-05,0.3394574820995331,0.0300690438598394
17
+ 15,7007,9.999999747378752e-05,0.3145067393779754,0.0262938197702169
18
+ 16,7445,9.999999747378752e-05,0.2943126261234283,0.0268951635807752
19
+ 17,7883,9.999999747378752e-05,0.2813400030136108,0.0262292549014091
20
+ 18,8321,9.999999747378752e-05,0.267516016960144,0.0211033355444669
21
+ 19,8759,9.999999747378752e-05,0.2504330575466156,0.0231174658983945
22
+ 20,9197,9.999999747378752e-05,0.2421789467334747,0.0206695646047592
23
+ 21,9635,9.999999747378752e-05,0.2293732464313507,0.0176578946411609
24
+ 22,10073,9.999999747378752e-05,0.2188540548086166,0.0172722116112709
25
+ 23,10511,9.999999747378752e-05,0.2088192105293274,0.0142071256414055
26
+ 24,10949,9.999999747378752e-05,0.2025712132453918,0.0133981211110949
27
+ 25,11387,9.999999747378752e-05,0.1938035935163498,0.0148626808077096
28
+ 26,11825,9.999999747378752e-05,0.1880243569612503,0.0122405458241701
29
+ 27,12263,9.999999747378752e-05,0.1807633489370346,0.0109839690849184
30
+ 28,12701,9.999999747378752e-05,0.1754221469163894,0.0116647211834788
31
+ 29,13139,9.999999747378752e-05,0.1710473001003265,0.0118104955181479
32
+ 30,13577,9.999999747378752e-05,0.1650548726320266,0.0112288203090429
33
+ 31,14015,9.999999747378752e-05,0.1601691395044326,0.0098503343760967
34
+ 32,14453,9.999999747378752e-05,0.157705083489418,0.0102034956216812
35
+ 33,14891,9.999999747378752e-05,0.1549638211727142,0.008621348068118
36
+ 34,15329,9.999999747378752e-05,0.1505416631698608,0.008866349235177
37
+ 35,15767,9.999999747378752e-05,0.1471449434757232,0.0082786111161112
38
+ 36,16205,9.999999747378752e-05,0.1454640179872512,0.0103487540036439
39
+ 37,16643,9.999999747378752e-05,0.1406158804893493,0.0082839298993349
40
+ 38,17081,9.999999747378752e-05,0.1388415694236755,0.0088634761050343
41
+ 39,17519,9.999999747378752e-05,0.1370924860239029,0.0084647005423903
42
+ 40,17957,9.999999747378752e-05,0.1347068846225738,0.008587003685534
43
+ 41,18395,9.999999747378752e-05,0.1306893229484558,0.0079299043864011
44
+ 42,18833,9.999999747378752e-05,0.1305685937404632,0.0086246393620967
45
+ 43,19271,9.999999747378752e-05,0.1283183246850967,0.0084729036316275
46
+ 44,19709,9.999999747378752e-05,0.1260105371475219,0.0078038382343947
47
+ 45,20147,9.999999747378752e-05,0.1234738230705261,0.008512926287949
48
+ 46,20585,9.999999747378752e-05,0.1204862669110298,0.008274545893073
49
+ 47,21023,9.999999747378752e-05,0.1195967271924018,0.0082159945741295
50
+ 48,21461,9.999999747378752e-05,0.1176684498786926,0.007745087146759
51
+ 49,21899,9.999999747378752e-05,0.1164450719952583,0.0076334034092724
52
+ 50,22337,9.999999747378752e-05,0.1145714074373245,0.0071649514138698
53
+ 51,22775,9.999999747378752e-05,0.112041100859642,0.0076605905778706
54
+ 52,23213,9.999999747378752e-05,0.1108947545289993,0.0068503017537295
55
+ 53,23651,9.999999747378752e-05,0.1087027788162231,0.0077053220011293
56
+ 54,24089,9.999999747378752e-05,0.1083552464842796,0.0074044656939804
57
+ 55,24527,9.999999747378752e-05,0.1065091490745544,0.0064585171639919
58
+ 56,24965,9.999999747378752e-05,0.1040863916277885,0.0075457356870174
59
+ 57,25403,9.999999747378752e-05,0.1036482602357864,0.0071898312307894
60
+ 58,25841,9.999999747378752e-05,0.101624958217144,0.0067697837948799
61
+ 59,26279,9.999999747378752e-05,0.1005307286977768,0.007359679788351
62
+ 60,26717,9.999999747378752e-05,0.0988947302103042,0.00718161650002
63
+ 61,27155,9.999999747378752e-05,0.0978500619530677,0.0072925584390759
64
+ 62,27593,9.999999747378752e-05,0.0961773768067359,0.0065536028705537
65
+ 63,28031,9.999999747378752e-05,0.0951208472251892,0.0071149570867419
66
+ 64,28469,9.999999747378752e-05,0.0930871218442916,0.0076858387328684
67
+ 65,28907,9.999999747378752e-05,0.0916512683033943,0.0066418750211596
68
+ 66,29345,9.999999747378752e-05,0.0913107469677925,0.0069791534915566
69
+ 67,29783,9.999999747378752e-06,0.0900069698691368,0.0066991741769015
70
+ 68,30221,9.999999747378752e-06,0.0892131775617599,0.0066583580337464
71
+ 69,30659,9.999999747378752e-06,0.0899544283747673,0.0066158822737634
72
+ 70,31097,9.999999747378752e-06,0.0888020247220993,0.0068554985336959
73
+ 71,31535,9.999999747378752e-06,0.0881060436367988,0.0065613421611487
74
+ 72,31973,9.999999747378752e-06,0.0887779220938682,0.0067370031028985
75
+ 73,32411,9.999999747378752e-06,0.0885681957006454,0.006567845121026
76
+ 74,32849,9.999999747378752e-06,0.0877681896090507,0.0067333821207284
77
+ 75,33287,9.999999747378752e-06,0.0875621438026428,0.00675192149356
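This file looks like a one-row-per-epoch version of `logs-csv/lightning_logs/version_0/metrics.csv`, in which each epoch is spread over a validation row and a training row. How it was actually produced is not shown in the commit. The sketch below is one way to do such a merge with the standard library alone, using the first four data rows of `metrics.csv` as input.

```python
import csv
import io

# First rows of the Lightning metrics.csv: two partial rows per epoch.
raw = """epoch,learning_rate,loss,step,val_loss
0,9.999999747378752e-05,,437,0.1413220316171646
0,,4.066796779632568,437,
1,9.999999747378752e-05,,875,0.1224312037229538
1,,1.0922460556030273,875,
"""

# Collect the non-empty fields of every row that shares an (epoch, step) key.
merged = {}
for row in csv.DictReader(io.StringIO(raw)):
    key = (int(row["epoch"]), int(row["step"]))
    target = merged.setdefault(key, {})
    for column, value in row.items():
        if value != "":
            target[column] = value

# Write one row per epoch, in the column order used by training-log.csv.
columns = ["epoch", "step", "learning_rate", "loss", "val_loss"]
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
writer.writeheader()
for key in sorted(merged):
    writer.writerow(merged[key])

print(out.getvalue())
```

The sketch keeps values as strings, so it does not reproduce the small last-digit differences between the two files (for example `1.0922460556030273` against `1.0922460556030271`), which are consistent with the original values having passed through a floating-point round trip.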
training-log.png ADDED