---
license: mit
datasets:
- AmelieSchreiber/cafa5_pickle_split
language:
- en
metrics:
- accuracy
- f1
- precision
- recall
- roc_auc
library_name: transformers
tags:
- esm
- esm2
- protein language model
- biology
- cafa5
---
# ESM-2 Pre-finetuned on CAFA-5 for Protein Function Prediction

This model was pre-finetuned on the CAFA-5 protein function prediction task for four epochs.

It is meant to be fine-tuned in a second stage of training with Low-Rank Adaptation (LoRA).

The training script for both the pre-finetuning and the second-stage LoRA fine-tuning is
[available here](https://huggingface.co/AmelieSchreiber/esm2_t6_8M_lora_cafa5/blob/main/cafa_5_finetune_v2.ipynb).

This notebook allows you to pre-finetune the base model and then train a LoRA in the second stage.
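For orientation, here is a minimal sketch of what that second-stage setup could look like with the `peft` library. The checkpoint path, the label count, and the LoRA hyperparameters below are placeholders for illustration, not values taken from the linked notebook.

```python
# Minimal sketch of the second-stage LoRA setup (placeholder values, not the notebook's).
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

NUM_GO_TERMS = 1500  # placeholder: size of your GO-term label set

tokenizer = AutoTokenizer.from_pretrained("facebook/esm2_t6_8M_UR50D")
model = AutoModelForSequenceClassification.from_pretrained(
    "path/to/this-pre-finetuned-checkpoint",  # placeholder for this model's repo ID
    num_labels=NUM_GO_TERMS,
    problem_type="multi_label_classification",
)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=8,                                # assumed rank, tune for your setup
    lora_alpha=16,
    lora_dropout=0.1,
    target_modules=["query", "value"],  # ESM-2 attention projection layers
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA weights remain trainable
```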
Note that the second stage of training is a harder curriculum for the model: it uses class weights so that the
model better captures the hierarchical (weighted) structure of the Gene Ontology (GO) terms that serve as
the labels in the multilabel sequence classification task of predicting a protein's functions.
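To illustrate the class-weighting idea, here is a hedged sketch using PyTorch's `BCEWithLogitsLoss`. The toy counts and the inverse-frequency weighting scheme are assumptions for illustration; the actual weights are computed in the notebook.

```python
# Sketch of a class-weighted multilabel loss over GO terms.
# The counts below are toy values; the notebook derives the real weights.
import torch
import torch.nn as nn

num_proteins = 5000.0
label_counts = torch.tensor([1200.0, 45.0, 300.0])  # proteins annotated with each GO term

# Up-weight rare GO terms: pos_weight_i = negatives_i / positives_i
pos_weight = (num_proteins - label_counts) / label_counts
loss_fn = nn.BCEWithLogitsLoss(pos_weight=pos_weight)

logits = torch.randn(4, 3)                     # (batch, num_go_terms) model outputs
targets = torch.randint(0, 2, (4, 3)).float()  # multi-hot GO-term labels
loss = loss_fn(logits, targets)                # errors on rare terms cost more
```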
|