norwegian-gpt2-vgd / README.md
language: 'no'
license: cc-by-4.0
tags:
  - norwegian
  - GPT2
  - causal language modeling

Norwegian GPT-2 - Social

Description

Private test of GPT-2 fine-tuning on the vgd dataset.

The following sub-corpora are used for the base model:

wikipedia_download_nb.jsonl
wikipedia_download_nn.jsonl
newspapers_online_nb.jsonl
newspapers_online_nn.jsonl
twitter_2016_2018_no.jsonl
twitter_news_2016_2018_no.jsonl
open_subtitles_no.jsonl
facebook_no.jsonl
reddit_no.jsonl
vgdebatt_no.jsonl
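The file names above suggest JSON Lines input, one JSON object per line. As a rough sketch of how such files could be read, the snippet below iterates over whichever of the listed files exist in a local directory. The directory layout and the `text` field name are assumptions for illustration; the actual schema of the corpora is not documented here.

```python
import json
from pathlib import Path
from typing import Iterator

# File names as listed in this README.
SUB_CORPORA = [
    "wikipedia_download_nb.jsonl",
    "wikipedia_download_nn.jsonl",
    "newspapers_online_nb.jsonl",
    "newspapers_online_nn.jsonl",
    "twitter_2016_2018_no.jsonl",
    "twitter_news_2016_2018_no.jsonl",
    "open_subtitles_no.jsonl",
    "facebook_no.jsonl",
    "reddit_no.jsonl",
    "vgdebatt_no.jsonl",
]


def iter_texts(corpus_dir: Path, field: str = "text") -> Iterator[str]:
    """Yield the `field` value of each JSON line, file by file.

    ASSUMPTION: each record stores its document under `field`
    (hypothetical name; adjust to the real schema).
    """
    for name in SUB_CORPORA:
        path = corpus_dir / name
        if not path.exists():
            continue  # skip sub-corpora that are not available locally
        with path.open(encoding="utf-8") as handle:
            for line in handle:
                line = line.strip()
                if not line:
                    continue  # tolerate blank lines
                record = json.loads(line)
                if field in record:
                    yield record[field]
```

Files are visited in the order listed, and records without the assumed field are skipped rather than raising an error.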

The base model is then fine-tuned on the private dataset at NbAiLab/vgd.
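For readers with access to the model, text generation might look like the sketch below. It uses the standard `transformers` causal-LM classes. The `model_id` argument is left to the caller because the full repository id is not stated in this README, and the sampling settings are illustrative defaults rather than values recommended by the author.

```python
def generate_text(prompt: str, model_id: str, max_new_tokens: int = 40) -> str:
    """Sample a continuation of `prompt` from a GPT-2 style checkpoint.

    ASSUMPTION: `model_id` is a Hub repository id or local path you can
    access; this README does not give the full id.
    """
    # Imported lazily so the function can be defined without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,  # illustrative sampling settings, not tuned
        top_p=0.95,
        pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no pad token by default
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `generate_text("Jeg mener at", model_id="<owner>/norwegian-gpt2-vgd")` would return the prompt followed by a sampled continuation, where `<owner>` is a placeholder for the actual account name.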