---
license: apache-2.0
datasets:
- oeg/CelebA_RoBERTa_Sp
language:
- es
tags:
- Spanish
- CelebA
- Roberta-base-bne
- celebFaces Attributes
pipeline_tag: text-to-image
---
# RoBERTa base BNE trained with data from the descriptive text corpus of the CelebA dataset

## Overview
- Language: Spanish
- Data: CelebA_RoBERTa_Sp
- Architecture: roberta-base
## Description

To improve the performance of the RoBERTa encoder, this model was trained on the corpus generated in this repository, using a Siamese network together with a cosine-similarity loss function. Training followed these steps:
- Import the sentence-transformers and torch libraries used to implement the encoder.
- Split the training corpus into two parts: 249,999 sentences for training and 10,000 for validation.
- Load the training and validation data for the model. Two lists are generated to store the information; each entry consists of a pair of descriptive sentences and their similarity value.
- Use RoBERTa as the baseline model for the transformer training.
- Train with a Siamese network in which, for each pair of sentences A and B from the training corpus, the similarity of their embedding vectors u and v is evaluated with the cosine-similarity loss (CosineSimilarityLoss()); see the sketch after this list.
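The sketch below illustrates this training setup with the sentence-transformers API. It is a minimal example, not the exact training script: the file names (train.tsv, validation.tsv), the column names, and the base checkpoint id (PlanTL-GOB-ES/roberta-base-bne) are assumptions made for illustration.

```python
# Sketch of the Siamese training described above (hypothetical file/column names).
import csv

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, InputExample, losses
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

# Build the encoder: the Spanish RoBERTa baseline plus a mean-pooling layer.
word_embedding = models.Transformer("PlanTL-GOB-ES/roberta-base-bne")  # assumed base checkpoint
pooling = models.Pooling(word_embedding.get_word_embedding_dimension())
model = SentenceTransformer(modules=[word_embedding, pooling])

def load_examples(path):
    """Load pairs of descriptive sentences and their similarity value (assumed in [0, 1])."""
    examples = []
    with open(path, encoding="utf-8") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            examples.append(InputExample(
                texts=[row["sentence_a"], row["sentence_b"]],
                label=float(row["score"]),
            ))
    return examples

train_examples = load_examples("train.tsv")       # 249,999 training entries
dev_examples = load_examples("validation.tsv")    # 10,000 validation entries

train_dataloader = DataLoader(train_examples, shuffle=True, batch_size=16)
# Cosine similarity between the embeddings u and v of each sentence pair.
train_loss = losses.CosineSimilarityLoss(model)
evaluator = EmbeddingSimilarityEvaluator.from_input_examples(dev_examples)

model.fit(
    train_objectives=[(train_dataloader, train_loss)],
    evaluator=evaluator,
    epochs=1,
    warmup_steps=100,
    output_path="roberta-celeba-sp",
)
```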
## How to use
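A minimal usage sketch with the sentence-transformers library. The model id below is a placeholder; replace it with this repository's actual id on the Hugging Face Hub.

```python
from sentence_transformers import SentenceTransformer, util

# Hypothetical model id; substitute the real repository id.
model = SentenceTransformer("oeg/RoBERTa-CelebA-Sp")

sentences = [
    "La mujer tiene el pelo castaño y una gran sonrisa.",
    "La chica sonríe y su cabello es marrón.",
]
embeddings = model.encode(sentences)

# Cosine similarity between the two descriptive sentences.
print(util.cos_sim(embeddings[0], embeddings[1]))
```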
## Licensing information
This model is available under the Apache License 2.0.
## Citation information

If you use the RoBERTa+CelebA model in your work, please cite the following paper:
```bibtex
@article{inffus_TINTO,
  title   = {A novel deep learning approach using blurring image techniques for Bluetooth-based indoor localisation},
  journal = {Information Fusion},
  author  = {Reewos Talla-Chumpitaz and Manuel Castillo-Cara and Luis Orozco-Barbosa and Raúl García-Castro},
  volume  = {91},
  pages   = {173--186},
  year    = {2023},
  issn    = {1566-2535},
  doi     = {10.1016/j.inffus.2022.10.011}
}
```
## Authors
Universidad Nacional de Ingeniería, Ontology Engineering Group, Universidad Politécnica de Madrid.
## Contributors
See the full list of contributors here.