Model Card for Varine/opus-mt-zh-en-model

This is a Chinese-English translation model based on the Transformer architecture (beta version 0.0.1).

Model Details

Model Description

This model provides translation between Chinese and English (only Chinese-to-English is available for now) and was trained on the wmt19 dataset.

The model handles grammar well in zh-en translation. If you don't want to clone this repository, just try the inference API beside this page ^_^

  • Developed by: Varine
  • Shared by: TianQi Xu
  • Model type: Transformer
  • Language(s) (NLP): Chinese, English
  • License: MIT
  • Finetuned from model: opus-mt-zh-en-fintuned-zhen-checkpoints

Model Sources

Uses

This model can be used for translation tasks between Chinese and English.

Direct Use

As a conventional translation model, it can be used in many circumstances, including translating academic papers, news, and even some literary works (thanks to the model's strong performance on grammar and multi-context cases).

Bias, Risks, and Limitations

1. Remember that this is a beta version of the translation model, so we limit the number of input tokens. Please make sure your input text does not exceed the limit.
2. DO NOT USE THIS MODEL FOR ILLEGAL PURPOSES.
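The exact input limit from item 1 is not stated in this card, so the sketch below treats it as a hypothetical `max_tokens` parameter. It shows one possible way to keep long Chinese text under such a limit by packing whole sentences into chunks. The function name `split_into_chunks` and the default of counting characters instead of real tokens are assumptions for illustration only.

```python
import re
from typing import Callable, List


def split_into_chunks(
    text: str,
    max_tokens: int,  # hypothetical limit; the real value is not given in this card
    count_tokens: Callable[[str], int] = len,  # assumption: characters as a rough proxy
) -> List[str]:
    """Greedily pack sentences into chunks whose size stays within max_tokens."""
    # Split right after sentence-ending punctuation, keeping the punctuation.
    sentences = [s for s in re.split(r"(?<=[。！？!?])", text) if s.strip()]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = current + sentence
        if current and count_tokens(candidate) > max_tokens:
            # Adding this sentence would overflow, so close the current chunk.
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    # Note: a single sentence longer than max_tokens is kept as one oversized chunk.
    return chunks


text = "今天天气很好。我们去公园吧！你觉得呢？"
chunks = split_into_chunks(text, max_tokens=14)
```

For a closer estimate, `count_tokens` could be replaced with a function that measures the length of the `input_ids` produced by the model's own tokenizer. Each chunk can then be translated separately and the results joined.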

Recommendations

Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recommendations.

How to Get Started with the Model

Before enjoying this model, please follow the steps below to make sure your environment is set up properly. Use the code below to get started with the model.

1. Use git to clone this repository (if you don't want to torture yourself configuring an environment, feel free to use the API instead!):

git clone "https://huggingface.co/Varine/opus-mt-zh-en-model"

2. After cloning, please make sure the modules below are installed in your Jupyter Notebook or other IDE:

! pip install transformers datasets numpy

3. After checking the packages, please run the code in translation.ipynb; a Jupyter Notebook environment is recommended for this step.
4. Finally, you can enjoy the whole model by loading it in code and running it through a translation pipeline. Don't forget to input your text!
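Step 4 might look roughly like the following. This is a minimal sketch, not the contents of translation.ipynb: it assumes the repository named in the clone URL from step 1 can be loaded directly by the `transformers` translation pipeline, and that a backend such as PyTorch plus the `sentencepiece` package are installed alongside the modules from step 2. The helper names `build_translator` and `translate` are hypothetical.

```python
from typing import List

# Repository name taken from the clone URL in step 1.
MODEL_ID = "Varine/opus-mt-zh-en-model"


def build_translator(model_id: str = MODEL_ID):
    """Create a translation pipeline (downloads the weights on first use)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    return pipeline("translation", model=model_id)


def translate(translator, texts: List[str]) -> List[str]:
    """Run the pipeline on a batch and return only the translated strings."""
    outputs = translator(texts)  # list of dicts with a "translation_text" key
    return [item["translation_text"] for item in outputs]


# Example (needs network access to download the model):
# translator = build_translator()
# print(translate(translator, ["今天天气很好。"]))
```

Keeping the pipeline construction separate from the `translate` helper means the model is loaded once and reused for every batch of text.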

Training Details

Training Data

  • wmt/wmt19

Training Procedure

Because the training dataset is very large, we decided after analysis to train on only 4% of it. We trained on this 4% subset for 10 epochs and evaluated the training loss and validation loss in every epoch.
Moreover, the training data takes the form of Chinese-English sentence pairs, so that both sides can be embedded and compared in the Transformer's higher-dimensional space.
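As an illustration of the data selection described above, the sketch below shows one possible way to take 4% of the training split and pull source and target sentences out of the pairs. It assumes the `zh-en` configuration of `wmt/wmt19` and assumes the subset is the first 4% of the split; this card does not say how the 4% was actually chosen. The helper names are hypothetical.

```python
from typing import Dict, List, Tuple

# Hugging Face datasets slicing syntax for the first 4% of the training split.
# Assumption: the card does not say whether the subset was a prefix or a random sample.
SUBSET_SPLIT = "train[:4%]"


def load_subset():
    """Load the 4% training subset (downloads the dataset on first use)."""
    from datasets import load_dataset  # imported lazily; needs the datasets package

    return load_dataset("wmt/wmt19", "zh-en", split=SUBSET_SPLIT)


def to_source_target(batch: Dict[str, List[Dict[str, str]]]) -> Tuple[List[str], List[str]]:
    """Split a batch of sentence pairs into Chinese sources and English targets."""
    pairs = batch["translation"]  # each pair looks like {"zh": ..., "en": ...}
    sources = [pair["zh"] for pair in pairs]
    targets = [pair["en"] for pair in pairs]
    return sources, targets


example_batch = {
    "translation": [
        {"zh": "你好", "en": "Hello"},
        {"zh": "谢谢", "en": "Thank you"},
    ]
}
sources, targets = to_source_target(example_batch)
```

The two lists can then be passed to the model's tokenizer as inputs and labels before training.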

Training Hyperparameters

  • Training regime: fp32

Evaluation

Testing Data, Factors & Metrics

Testing Data

  • wmt/wmt19

Hardware used in the training

  • Hardware Type: 1x Nvidia A10 GPU with 30 vCPUs, 200 GiB RAM, 1 TiB SSD storage
  • Hours used: 4.08 hrs (rough estimate)
  • Cloud Provider: Lambda Cloud
  • Compute Region: California, USA
  • Carbon Emitted: N/A

Model Architecture and Objective

We use the Transformer architecture (Hugging Face implementation) for this model; it is a general-purpose architecture widely used in machine translation tasks.

Compute Infrastructure

Due to the limited computing power of a personal PC and the scale of the dataset, we decided to train our model on a cloud GPU, which proved to be effective.

Hardware

Thanks to Lambda Cloud, we used an Nvidia A10 GPU to complete the project.

Software

We used Jupyter Notebook in the cloud to run our code.

Model Card Authors

Varine Xie

Model Card Contact

Please contact me by email: https://[email protected]. I'm glad to receive feedback from y'all! 😊

Model Size

  • Format: Safetensors
  • Model size: 77.5M params
  • Tensor type: F32