Results after fine-tuning DistilBERT on 80% of the 15189 instances
**(Model still under development)**
20it [00:11, 1.85it/s]Train: wpb=2121, num_updates=20, accuracy=44.1, loss=0.00\
50it [00:28, 1.76it/s]Train: wpb=2121, num_updates=50, accuracy=55.4, loss=0.00\
100it [00:55, 1.88it/s]Train: wpb=2117, num_updates=100, accuracy=64.5, loss=0.00\
200it [01:48, 1.85it/s]Train: wpb=2132, num_updates=200, accuracy=71.6, loss=0.00\
300it [02:42, 1.88it/s]Train: wpb=2147, num_updates=300, accuracy=75.1, loss=0.00\
380it [03:24, 1.86it/s]\
Train: wpb=2142, num_updates=380, accuracy=76.9, loss=0.00\
| epoch 000 | train accuracy=76.9%, train loss=0.00\
| epoch 000 | valid accuracy=85.7%, valid loss=0.00\
20it [00:10, 1.85it/s]Train: wpb=2121, num_updates=20, accuracy=84.6, loss=0.00\
50it [00:27, 1.77it/s]Train: wpb=2121, num_updates=50, accuracy=84.6, loss=0.00\
100it [00:54, 1.87it/s]Train: wpb=2117, num_updates=100, accuracy=85.1, loss=0.00\
200it [01:47, 1.86it/s]Train: wpb=2132, num_updates=200, accuracy=85.4, loss=0.00\
300it [02:41, 1.88it/s]Train: wpb=2147, num_updates=300, accuracy=85.6, loss=0.00\
380it [03:24, 1.86it/s]\
Train: wpb=2142, num_updates=380, accuracy=85.8, loss=0.00\
| epoch 001 | train accuracy=85.8%, train loss=0.00\
| epoch 001 | valid accuracy=88.3%, valid loss=0.00\
20it [00:10, 1.86it/s]Train: wpb=2121, num_updates=20, accuracy=87.3, loss=0.00\
50it [00:27, 1.77it/s]Train: wpb=2121, num_updates=50, accuracy=87.0, loss=0.00\
100it [00:54, 1.88it/s]Train: wpb=2117, num_updates=100, accuracy=87.2, loss=0.00\
200it [01:47, 1.85it/s]Train: wpb=2132, num_updates=200, accuracy=87.2, loss=0.00\
300it [02:41, 1.88it/s]Train: wpb=2147, num_updates=300, accuracy=87.2, loss=0.00\
380it [03:23, 1.86it/s]\
Train: wpb=2142, num_updates=380, accuracy=87.3, loss=0.00\
| epoch 002 | train accuracy=87.3%, train loss=0.00\
| epoch 002 | valid accuracy=89.3%, valid loss=0.00
The reported loss stays at 0.00 in every epoch, which looks like a problem. We have to change the loss function.
**You can check the model's predictions by running it on the following example:**
*"google chrome before 18. 0. 1025. 142 does not properly validate the renderer's navigation requests, which has unspecified impact and remote attack vectors."*
The predicted labels, one per token, should be similar to:
['B-vendor', 'B-application', 'B-version', 'I-version', 'I-version', 'I-version', 'I-version', 'I-version', 'I-version', 'I-version', 'I-version', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'O', 'B-relevant_term', 'O', 'O', 'O', 'O', 'O', 'O', 'B-relevant_term', 'B-relevant_term', 'O', 'O']
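The per-token labels above follow the B-/I-/O tagging scheme, so they can be merged into entity spans. The following is a minimal sketch of that grouping, not part of the model itself: the helper name `bio_spans` is hypothetical, and the spans are expressed as token indices because the exact tokenization of the example sentence is not shown here.

```python
def bio_spans(labels):
    """Group BIO labels into (entity_type, start, end) spans, end exclusive.

    Assumption: an I- tag that does not continue a span of the same type
    is treated leniently as the start of a new span.
    """
    spans = []
    current = None  # (entity_type, start_index) of the open span, if any
    for i, label in enumerate(labels):
        prefix, _, etype = label.partition("-")
        if prefix == "I" and current is not None and current[0] == etype:
            continue  # still inside the open span
        if current is not None:
            spans.append((current[0], current[1], i))
            current = None
        if prefix in ("B", "I"):
            current = (etype, i)
    if current is not None:
        spans.append((current[0], current[1], len(labels)))
    return spans


# The label sequence from the example above, written compactly.
labels = (
    ["B-vendor", "B-application", "B-version"]
    + ["I-version"] * 8
    + ["O"] * 11
    + ["B-relevant_term"]
    + ["O"] * 6
    + ["B-relevant_term"] * 2
    + ["O"] * 2
)

for entity_type, start, end in bio_spans(labels):
    print(entity_type, start, end)
```

Slicing the model's token list with each `start:end` pair gives the tokens that belong to that entity.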
The model can detect the following classes:
['I-update', 'I-version', 'B-programming language', 'B-relevant_term', 'B-parameter', 'I-relevant_term', 'B-vendor', 'B-function', 'B-version', 'B-hardware', 'I-application', 'B-os', 'O', 'B-cve id', 'B-update', 'I-edition', 'I-hardware', 'I-os', 'B-edition', 'B-application', 'B-language', 'B-file', 'B-method', 'I-vendor']
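To make the class list easier to read, the tags can be grouped by entity type. This is only an illustrative sketch: the helper `prefixes_by_entity` is hypothetical, and the list order below is copied from this README, which is not necessarily the label-id order used by the model.

```python
from collections import defaultdict

# Class list copied from above (order is NOT assumed to match model label ids).
CLASSES = [
    'I-update', 'I-version', 'B-programming language', 'B-relevant_term',
    'B-parameter', 'I-relevant_term', 'B-vendor', 'B-function', 'B-version',
    'B-hardware', 'I-application', 'B-os', 'O', 'B-cve id', 'B-update',
    'I-edition', 'I-hardware', 'I-os', 'B-edition', 'B-application',
    'B-language', 'B-file', 'B-method', 'I-vendor',
]


def prefixes_by_entity(classes):
    """Map each entity type to the set of tag prefixes (B, I) it appears with."""
    table = defaultdict(set)
    for label in classes:
        if label == "O":
            continue  # 'O' marks tokens outside any entity
        prefix, _, etype = label.partition("-")
        table[etype].add(prefix)
    return dict(table)


table = prefixes_by_entity(CLASSES)
# Entity types that only ever appear with a B- tag in the class list.
b_only = sorted(etype for etype, prefixes in table.items() if prefixes == {"B"})

print(len(CLASSES), "classes,", len(table), "entity types")
print("B- only:", b_only)
```

Entity types listed as B- only have no continuation tag in the class list, so the model can only label them one token at a time.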