Commit 467c5c9 (parent: 8bd3d31): Update README.md

README.md (changed)
- **Architecture**: roberta-base

## Description

In order to improve the performance of the [RoBERTa-large-bne](https://huggingface.co/PlanTL-GOB-ES/roberta-large-bne) encoder, this model has been trained on the generated corpus ([in this repository](https://huggingface.co/oeg/RoBERTa-CelebA-Sp/)), following the strategy of a Siamese network together with a cosine-similarity loss function. The following steps were followed:

- Use the [sentence-transformers](https://www.sbert.net/) and torch libraries to implement the encoder.
- Divide the training corpus into two parts: 249,999 sentences for training and 10,000 sentences for validation.
- Load the training/validation data for the model. Two lists are generated to store the information; in each list, an entry consists of a pair of descriptive sentences and their similarity value.
- Implement [RoBERTa-large-bne](https://huggingface.co/PlanTL-GOB-ES/roberta-large-bne) as the baseline model for transformer training.
- Train with a Siamese network in which, for each pair of sentences _A_ and _B_ from the training corpus, the similarity of their embedding vectors _u_ and _v_ is evaluated using the cosine-similarity metric (_CosineSimilarityLoss()_).

The total training time using the [sentence-transformers](https://www.sbert.net/) library in Python was 42 days, using all of the server's available GPUs with exclusive dedication.

## How to use

To use the model, run the following Python code:

```python
from sentence_transformers import SentenceTransformer

# Load the trained encoder
model_sbert = SentenceTransformer('roberta-large-bne-celebAEs-UNI')

# A Spanish face description ("The woman has high cheekbones. Her hair is
# black. She has arched eyebrows and a slightly open mouth. The young,
# attractive, smiling woman wears heavy makeup. She wears earrings, a
# necklace and lipstick.")
captions = ['La mujer tiene pomulos altos. Su cabello es de color negro. Tiene las cejas arqueadas y la boca ligeramente abierta. La joven y atractiva mujer sonriente tiene mucho maquillaje. Lleva aretes, collar y lapiz labial.']

vectors = model_sbert.encode(captions)
print(vectors)
```

For more detailed information about the implementation, visit the [following link](https://github.com/eduar03yauri/DCGAN-text2face-forSpanish/Data/encoder-models/RoBERTa_model_trained.md).

## Licensing information

This model is available under the [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).