	---
	license: apache-2.0
	tags:
	- distilgpt2
	- hearthstone
	metrics:
	- bleu
	- dvitel/codebleu
	- exact_match
	- chrf
	datasets:
	- dvitel/hearthstone
	model-index:
	- name: h0
	  results:
	  - task:
	      type: text-generation
	      name: Python Code Synthesis
	    dataset:
	      type: dvitel/hearthstone
	      name: HearthStone
	      split: test
	    metrics:
	    - type: exact_match
	      value: 0.30303030303030304
	      name: Exact Match
	    - type: bleu
	      value: 0.8850182403024257
	      name: BLEU
	    - type: dvitel/codebleu
	      value: 0.677852377992836
	      name: CodeBLEU
	    - type: chrf
	      value: 91.00848749530383
	      name: chrF
	---

	# h3

	This model is a fine-tuned version of [distilgpt2](https://huggingface.co/distilgpt2) on the [hearthstone](https://huggingface.co/datasets/dvitel/hearthstone) dataset.
	[GitHub repo](https://github.com/dvitel/nlp-sem-parsing/blob/master/h3.py).
	It achieves the following results on the evaluation set:
	- Loss: 0.2782
	- Exact Match: 0.2879
	- Bleu: 0.9121
	- Codebleu: 0.7482
	- Ngram Match Score: 0.7504
	- Weighted Ngram Match Score: 0.7583
	- Syntax Match Score: 0.7673
	- Dataflow Match Score: 0.7169
	- Chrf: 93.1064
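
The Codebleu value above is consistent with an equally weighted mean of the four component scores listed with it. The sketch below checks that arithmetic. The 0.25 weights are an assumption based on the usual CodeBLEU default; this card does not state which weights `dvitel/codebleu` uses, and `combine_codebleu` is a hypothetical helper name.

```python
def combine_codebleu(ngram, weighted_ngram, syntax, dataflow,
                     weights=(0.25, 0.25, 0.25, 0.25)):
    """Weighted sum of the four CodeBLEU component scores.

    The equal weights are an assumption (the usual CodeBLEU default);
    the card does not say which weights dvitel/codebleu applies.
    """
    parts = (ngram, weighted_ngram, syntax, dataflow)
    return sum(w * p for w, p in zip(weights, parts))


# Component scores copied from the evaluation results listed above.
score = combine_codebleu(0.7504, 0.7583, 0.7673, 0.7169)
print(f"{score:.4f}")
```

Under these assumed weights the result agrees with the reported 0.7482 to four decimal places.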

	## Model description

	DistilGPT2 fine-tuned on the HearthStone dataset for 200 epochs. \
	Related to [dvitel/h0](https://huggingface.co/dvitel/h0), but with preprocessing that anonymizes class names and function variables (local renaming). \
	[dvitel/h2](https://huggingface.co/dvitel/h2) implements global renaming, where all names are removed. Global renaming gave worse results than local renaming.
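
To illustrate what local renaming could look like, here is a minimal sketch built on Python's `ast` module. It is not the preprocessing from the linked repository: the `LocalRenamer` class, the `local_rename` helper, and the example card source are all hypothetical, and the `CLS<i>` / `v<i>` naming scheme is inferred from the examples in this card.

```python
import ast


class LocalRenamer(ast.NodeTransformer):
    """Rename classes to CLS<i> and per-function parameters/locals to v<i>."""

    def __init__(self):
        self.class_count = 0
        self.scope = None  # name -> alias map for the function being visited

    def _alias(self, name):
        # Assign the next free v<i> the first time a name is seen.
        return self.scope.setdefault(name, f"v{len(self.scope)}")

    def visit_ClassDef(self, node):
        node.name = f"CLS{self.class_count}"
        self.class_count += 1
        self.generic_visit(node)
        return node

    def visit_FunctionDef(self, node):
        outer, self.scope = self.scope, {}  # numbering restarts per function
        for arg in node.args.args:
            if arg.arg != "self":
                arg.arg = self._alias(arg.arg)
        self.generic_visit(node)
        self.scope = outer
        return node

    def visit_Name(self, node):
        if self.scope is not None:
            if node.id in self.scope:
                node.id = self.scope[node.id]
            elif isinstance(node.ctx, ast.Store):
                node.id = self._alias(node.id)  # newly assigned local
        return node


def local_rename(source):
    """Return `source` with local renaming applied (needs Python 3.9+)."""
    return ast.unparse(LocalRenamer().visit(ast.parse(source)))


# Hypothetical un-anonymized card, for illustration only.
EXAMPLE = '''
class Hellfire(SpellCard):
    def __init__(self):
        super().__init__('Hellfire', 4, CHARACTER_CLASS.WARLOCK, CARD_RARITY.FREE)
    def use(self, player, game):
        super().use(player, game)
        targets = copy.copy(game.other_player.minions)
        for minion in targets:
            minion.damage(player.effective_spell_damage(3), self)
'''

renamed = local_rename(EXAMPLE)
print(renamed)
```

Names that are never bound inside a function, such as `copy`, `super`, or `CARD_RARITY`, are left untouched, as are attribute names. That matches the local flavor of renaming described above, as opposed to the global variant, which removes all names.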

	Examples of generated code with mistakes from the last evaluation iteration (EV L - gold labels, EV P - predictions):
	```python
	EV L class CLS0(MinionCard):
	    def __init__(self):
	        super().__init__('Darkscale Healer', 5, CHARACTER_CLASS.ALL, CARD_RARITY.COMMON, battlecry=Battlecry(Heal(2), CharacterSelector()))
	    def create_minion(self, v0):
	        return Minion(4, 5)
	EV P class CLS0(MinionCard):
	    def __init__(self):
	        super().__init__('Darkscale Healer', 5, CHARACTER_CLASS.ALL, CARD_RARITY.COMMON, battlecry=Battlecry(Heal(2), CharacterSelector())
	    def create_minion(self, v0):
	        return Minion(4, 5)

	EV L class CLS0(WeaponCard):
	    def __init__(self):
	        super().__init__('Fiery War Axe', 2, CHARACTER_CLASS.WARRIOR, CARD_RARITY.FREE)
	    def create_weapon(self, v0):
	        return Weapon(3, 2)
	EV P class CLS0(WeaponCard):
	    def __init__(self):
	        super().__init__('Fiery War Axe', 2, CHARACTER_CLASS.WARRIOR, CARD_RARITY.FREE,
	    def create_weapon(self, v0):
	        return Weapon(3, 2)

	EV L class CLS0(MinionCard):
	    def __init__(self):
	        super().__init__('Frostwolf Warlord', 5, CHARACTER_CLASS.ALL, CARD_RARITY.COMMON, battlecry=Battlecry(Give([Buff(ChangeAttack(Count(MinionSelector()))), Buff(ChangeHealth(Count(MinionSelector())))]), SelfSelector()))
	    def create_minion(self, v0):
	        return Minion(4, 4)
	EV P class CLS0(MinionCard):
	    def __init__(self):
	        super().__init__('Frostwolf Warlord', 5, CHARACTER_CLASS.ALL, CARD_RARITY.COMMON, battlecry=Battlecry(Give([Buff(ChangeAttack(Count(MinionSelector(),), Buff(ChangeHealth(Count(MinionSelector()))))]),), SelfSelector()))
	    def create_minion(self, v0):
	        return Minion(4, 4)

	EV L class CLS0(SpellCard):
	    def __init__(self):
	        super().__init__('Hellfire', 4, CHARACTER_CLASS.WARLOCK, CARD_RARITY.FREE)
	    def use(self, v0, v1):
	        super().use(v0, v1)
	        v2 = copy.copy(v1.other_player.minions)
	        v2.extend(v1.current_player.minions)
	        v2.append(v1.other_player.hero)
	        v2.append(v1.current_player.hero)
	        for v3 in v2:
	            v3.damage(v0.effective_spell_damage(3), self)
	EV P class CLS0(SpellCard):
	    def __init__(self):
	        super().__init__('Hellfire', 4, CHARACTER_CLASS.WARLOCK, CARD_RARITY.FREE,
	    def use(self, v0, v1):
	        super().use(v0, v1)
	        v2 = copy.copy(v1.other_player.minions)
	        v2.extend(v1.current_player.minions)
	        for.append(v1.other_player.hero)
	        for.append(v1.other_player.hero)
	        for v3 in v2:
	            .damage(v0.effective_spell_damage(3), self)
	```
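
Several of the predictions above differ from the gold code by an unbalanced parenthesis, so they are not valid Python at all. One hedged way to triage such cases is to try parsing each prediction with the standard `ast` module, as sketched below. The helper name `parses_as_python` is hypothetical, and the two strings reproduce the Fiery War Axe pair with standard Python indentation.

```python
import ast


def parses_as_python(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


# Gold label for the Fiery War Axe example.
GOLD = """\
class CLS0(WeaponCard):
    def __init__(self):
        super().__init__('Fiery War Axe', 2, CHARACTER_CLASS.WARRIOR, CARD_RARITY.FREE)
    def create_weapon(self, v0):
        return Weapon(3, 2)
"""

# Model prediction for the same card: the call in __init__ is never closed.
PREDICTION = """\
class CLS0(WeaponCard):
    def __init__(self):
        super().__init__('Fiery War Axe', 2, CHARACTER_CLASS.WARRIOR, CARD_RARITY.FREE,
    def create_weapon(self, v0):
        return Weapon(3, 2)
"""

print(parses_as_python(GOLD), parses_as_python(PREDICTION))
```

Parsing only checks syntax; undefined names such as `WeaponCard` do not matter here, so the check can run without the game engine installed.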

	## Intended uses & limitations

	Synthesizing Python code for HearthStone cards.

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 2e-05
	- train_batch_size: 4
	- eval_batch_size: 4
	- seed: 17
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: cosine
	- num_epochs: 200
	- mixed_precision_training: Native AMP
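
For intuition about the `cosine` scheduler setting, the sketch below evaluates a half-cosine decay from the listed learning rate down to zero. It rests on two assumptions not stated in this card: no warmup steps, and 134 optimizer steps per epoch, inferred from the training results table (step 1600 falls at epoch 11.94). `cosine_lr` is a hypothetical helper, not a library function.

```python
import math

BASE_LR = 2e-05          # learning_rate from the list above
STEPS_PER_EPOCH = 134    # assumption: 1600 steps / 11.94 epochs, rounded
TOTAL_STEPS = STEPS_PER_EPOCH * 200  # num_epochs: 200


def cosine_lr(step, total_steps=TOTAL_STEPS, base_lr=BASE_LR):
    """Half-cosine decay from base_lr to 0, assuming zero warmup steps."""
    progress = min(step, total_steps) / total_steps
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Learning rate at the start, midpoint, and end of training.
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, f"{cosine_lr(step):.2e}")
```

With this shape the learning rate is halved at the midpoint of training and approaches zero by the final step.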

	### Training results

	| Training Loss | Epoch  | Step  | Validation Loss | Exact Match | Bleu   | Codebleu | Ngram Match Score | Weighted Ngram Match Score | Syntax Match Score | Dataflow Match Score | Chrf    |
	|:-------------:|:------:|:-----:|:---------------:|:-----------:|:------:|:--------:|:-----------------:|:--------------------------:|:------------------:|:--------------------:|:-------:|
	| 0.8612        | 11.94  | 1600  | 0.2725          | 0.0455      | 0.8477 | 0.6050   | 0.6229            | 0.6335                     | 0.6203             | 0.5431               | 88.7010 |
	| 0.175         | 23.88  | 3200  | 0.2311          | 0.0909      | 0.8739 | 0.6304   | 0.6566            | 0.6656                     | 0.6484             | 0.5508               | 90.7364 |
	| 0.1036        | 35.82  | 4800  | 0.2172          | 0.1818      | 0.8930 | 0.6905   | 0.6976            | 0.7062                     | 0.7172             | 0.6409               | 91.9702 |
	| 0.0695        | 47.76  | 6400  | 0.2233          | 0.2424      | 0.8944 | 0.7017   | 0.7148            | 0.7232                     | 0.7187             | 0.6499               | 92.0340 |
	| 0.0482        | 59.7   | 8000  | 0.2407          | 0.2879      | 0.9046 | 0.7301   | 0.7387            | 0.7456                     | 0.7475             | 0.6885               | 92.6219 |
	| 0.0352        | 71.64  | 9600  | 0.2407          | 0.2424      | 0.9074 | 0.7255   | 0.7371            | 0.7448                     | 0.7482             | 0.6718               | 92.8281 |
	| 0.0262        | 83.58  | 11200 | 0.2596          | 0.3030      | 0.9061 | 0.7445   | 0.7415            | 0.7500                     | 0.7774             | 0.7091               | 92.6737 |
	| 0.0213        | 95.52  | 12800 | 0.2589          | 0.2879      | 0.9061 | 0.7308   | 0.7409            | 0.7488                     | 0.7464             | 0.6873               | 92.7814 |
	| 0.0164        | 107.46 | 14400 | 0.2679          | 0.2879      | 0.9096 | 0.7452   | 0.7510            | 0.7592                     | 0.7626             | 0.7079               | 92.9900 |
	| 0.0131        | 119.4  | 16000 | 0.2660          | 0.2879      | 0.9096 | 0.7447   | 0.7480            | 0.7564                     | 0.7666             | 0.7079               | 93.0122 |
	| 0.0116        | 131.34 | 17600 | 0.2669          | 0.2727      | 0.9092 | 0.7463   | 0.7445            | 0.7529                     | 0.7684             | 0.7194               | 92.9256 |
	| 0.0093        | 143.28 | 19200 | 0.2678          | 0.2879      | 0.9113 | 0.7531   | 0.7496            | 0.7581                     | 0.7709             | 0.7336               | 93.0406 |
	| 0.0083        | 155.22 | 20800 | 0.2728          | 0.2879      | 0.9103 | 0.7407   | 0.7462            | 0.7540                     | 0.7702             | 0.6924               | 92.9302 |
	| 0.0077        | 167.16 | 22400 | 0.2774          | 0.2879      | 0.9103 | 0.7449   | 0.7449            | 0.7532                     | 0.7659             | 0.7156               | 92.9742 |
	| 0.0069        | 179.1  | 24000 | 0.2774          | 0.2879      | 0.9120 | 0.7396   | 0.7463            | 0.7539                     | 0.7633             | 0.6950               | 93.1057 |
	| 0.0069        | 191.04 | 25600 | 0.2782          | 0.2879      | 0.9121 | 0.7482   | 0.7504            | 0.7583                     | 0.7673             | 0.7169               | 93.1064 |
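
The metrics in this table do not all peak at the same step, which matters when choosing a checkpoint. The sketch below copies four columns from the table and finds the best row for each criterion. It is only an illustration of reading the table; the card does not say how the published checkpoint was chosen.

```python
# (step, validation_loss, exact_match, codebleu), copied from the table above.
ROWS = [
    (1600, 0.2725, 0.0455, 0.6050),
    (3200, 0.2311, 0.0909, 0.6304),
    (4800, 0.2172, 0.1818, 0.6905),
    (6400, 0.2233, 0.2424, 0.7017),
    (8000, 0.2407, 0.2879, 0.7301),
    (9600, 0.2407, 0.2424, 0.7255),
    (11200, 0.2596, 0.3030, 0.7445),
    (12800, 0.2589, 0.2879, 0.7308),
    (14400, 0.2679, 0.2879, 0.7452),
    (16000, 0.2660, 0.2879, 0.7447),
    (17600, 0.2669, 0.2727, 0.7463),
    (19200, 0.2678, 0.2879, 0.7531),
    (20800, 0.2728, 0.2879, 0.7407),
    (22400, 0.2774, 0.2879, 0.7449),
    (24000, 0.2774, 0.2879, 0.7396),
    (25600, 0.2782, 0.2879, 0.7482),
]

best_loss = min(ROWS, key=lambda row: row[1])      # lowest validation loss
best_em = max(ROWS, key=lambda row: row[2])        # highest exact match
best_codebleu = max(ROWS, key=lambda row: row[3])  # highest CodeBLEU

print("lowest validation loss at step", best_loss[0])
print("highest exact match at step", best_em[0])
print("highest CodeBLEU at step", best_codebleu[0])
```

Validation loss bottoms out early (step 4800), while exact match and CodeBLEU keep improving well after that point, so selecting a checkpoint by loss alone would give up accuracy on the code-level metrics.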


	### Framework versions

	- Transformers 4.24.0
	- PyTorch 1.13.0
	- Datasets 2.6.1
	- Tokenizers 0.13.1