VNSpellCorrection
Environment setup using Conda
conda create -n spcr python=3.9.5
conda activate spcr
Install libraries
pip install -r requirements.txt
To Use Our Trained Model
Download the vocab and weights files.
Set up the data folder as follows:
.
├── ...
├── models
├── data
│   ├── binhvq
│   │   └── binhvq.vocab.pkl
│   ├── checkpoints
│   │   └── tfmwtr
│   │       └── locdx.weights.pth
├── utils
└── ...
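Before starting the server, it can help to confirm that both downloaded files are where the layout above puts them. The following is a minimal sketch, not part of the repository: the helper name missing_files is hypothetical, and the paths assume that checkpoints sits inside data and that the script runs from the repository root.

```python
from pathlib import Path

# Paths read off the folder layout above, relative to the repository root.
# Assumption: "checkpoints" is a subfolder of "data".
EXPECTED_FILES = [
    Path("data/binhvq/binhvq.vocab.pkl"),
    Path("data/checkpoints/tfmwtr/locdx.weights.pth"),
]


def missing_files(root="."):
    """Return the expected files that do not exist under root."""
    root = Path(root)
    return [p for p in EXPECTED_FILES if not (root / p).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        for p in missing:
            print(f"missing: {p}")
    else:
        print("vocab and weights are in place")
```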
Then start the Flask server:
python server.py
Go to localhost:8000 to use the website
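If the page does not load, a quick check from Python can tell whether anything is answering on that port. This is a rough sketch using only the standard library; the function name server_is_up is hypothetical, and the URL assumes the server listens on localhost:8000 as stated above.

```python
import urllib.error
import urllib.request


def server_is_up(url="http://localhost:8000", timeout=3.0):
    """Return True if an HTTP response of any status comes back from url."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with an error status such as 404 or 500.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or name lookup failure.
        return False


if __name__ == "__main__":
    print("server reachable:", server_is_up())
```

HTTPError is caught before URLError because it is a subclass of it; an error status still means the server process is running.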
To Train Model From Scratch
Prepare a corpus file corpus.txt and place it in the following structure. A sample file is in the folder sample.
.
├── ...
├── models
├── data
│   ├── binhvq
│   │   └── corpus.txt
├── utils
└── ...
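The layout above can also be created from Python. The sketch below copies an existing text file into data/binhvq/corpus.txt; the helper name place_corpus and its parameters are hypothetical, and it makes no assumption about the corpus contents.

```python
import shutil
from pathlib import Path


def place_corpus(source, root=".", corpus="binhvq"):
    """Copy source to <root>/data/<corpus>/corpus.txt and return the new path."""
    target_dir = Path(root) / "data" / corpus
    target_dir.mkdir(parents=True, exist_ok=True)  # create data/<corpus> if needed
    target = target_dir / "corpus.txt"
    shutil.copyfile(source, target)
    return target
```

For example, place_corpus("my_corpus.txt") run from the repository root would produce data/binhvq/corpus.txt, where my_corpus.txt stands in for your own file.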
Prepare the data by running:
cd dataset
python prepare_dataset.py --corpus binhvq --file corpus.txt
cleandata.sh binhvq
Start training with:
python train.py
To Evaluate Model
Evaluate on the generated dataset:
python correct.py
Evaluate on the VSEC public dataset. First download VSEC.jsonl from https://github.com/VSEC2021/VSEC and set up the folder as follows:
.
├── ...
├── models
├── data
│   ├── vsec
│   │   └── VSEC.jsonl
├── utils
└── ...
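To confirm the download is intact before preparing it, the first record can be inspected. This sketch assumes only that VSEC.jsonl is a JSON Lines file (one JSON value per line), as the extension suggests; it does not assume any field names, and peek_jsonl is a hypothetical helper.

```python
import json
from pathlib import Path


def peek_jsonl(path, limit=1):
    """Return the first `limit` records of a JSON Lines file as Python objects."""
    records = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            records.append(json.loads(line))
            if len(records) >= limit:
                break
    return records


if __name__ == "__main__":
    path = Path("data/vsec/VSEC.jsonl")
    if path.is_file():
        for record in peek_jsonl(path):
            # Show the top-level keys, or the type name for non-dict records.
            print(sorted(record) if isinstance(record, dict) else type(record).__name__)
    else:
        print(f"not found: {path}")
```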
Prepare the VSEC data:
cd dataset
python prepare_vsec.py
python correct.py --test_data vsec