JacobLinCool
/

IELTS_essay_scoring_safetensors


	---
	license: mit
	language:
	- en
	---

	We trained a language model to automatically score IELTS (International English Language Testing System) essays, using a large training dataset of essays scored by human raters.

	On the test dataset, the model achieves Accuracy = 0.82 and F1 Score = 0.81.
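The card does not state how Accuracy and F1 were computed. One plausible reading, offered here only as an assumption, treats each half-band score as a class label and compares the model's rounded scores with the human scores. The sketch below shows that arithmetic in plain Python with a macro-averaged F1; the function names and the score lists are hypothetical and are not taken from the model's test set.

```python
def accuracy(y_true, y_pred):
    """Fraction of essays whose predicted band equals the human band."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-band F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_values = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        f1_values.append(2 * tp / denom if denom else 0.0)
    return sum(f1_values) / len(f1_values)


# Made-up band scores for six essays (illustration only).
human_bands = [6.0, 6.5, 7.0, 7.0, 8.5, 6.0]
predicted_bands = [6.0, 6.5, 7.0, 6.5, 8.5, 6.0]

acc = accuracy(human_bands, predicted_bands)
f1 = macro_f1(human_bands, predicted_bands)
print(f"Accuracy = {acc:.2f}, F1 Score = {f1:.2f}")
```
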

	The following code loads the model and scores a new IELTS essay.

	In this example, the essay is taken from the test dataset, with an overall score of 8.

	```python

	from transformers import AutoModelForSequenceClassification, AutoTokenizer
	import torch
	import numpy as np

	# Load the pre-trained model and tokenizer from a local directory
	model_path = "./ielts_scoring_model"
	model = AutoModelForSequenceClassification.from_pretrained(model_path)
	tokenizer = AutoTokenizer.from_pretrained(model_path)

	# Example essay from the test dataset; the human rater's score is 8.5.
	# The text starts with the essay prompt, followed by the essay itself.
	new_text = (
	"It is important for all towns and cities to have large public spaces such as squares and parks. "
	"Do you agree or disagree with this statement? It is crucial for all metropolitan cities and towns to "
	"have some recreational facilities like parks and squares because of their numerous benefits. A number of "
	"arguments surround my opinion, and I will discuss it in upcoming paragraphs. To commence with, the first "
	"and the foremost merit is that it is beneficial for the health of people because in morning time they can "
	"go for walking as well as in the evenings, also older people can spend their free time with their loved ones, "
	"and they can discuss about their daily happenings. In addition, young people do lot of exercise in parks and "
	"gardens to keep their health fit and healthy, otherwise if there is no park they glue with electronic gadgets "
	"like mobile phones and computers and many more. Furthermore, little children get best place to play, they play "
	"with their friends in parks if any garden or square is not available for kids then they use roads and streets "
	"for playing it can lead to serious incidents. Moreover, parks have some educational value too, in schools, "
	"students learn about environment protection in their studies and teachers can take their pupils to parks because "
	"students can see those pictures so lively which they see in their school books and they know about importance "
	"and protection of trees and flowers. In recapitulate, parks holds immense importance regarding education, health "
	"for people of every society, so government should build parks in every city and town."
	)


	# Tokenize the essay; input longer than 512 tokens is truncated
	encoded_input = tokenizer(new_text, return_tensors='pt', padding=True, truncation=True, max_length=512)

	# Switch to evaluation mode (disables dropout)
	model.eval()

	# Perform the prediction without tracking gradients
	with torch.no_grad():
	    outputs = model(**encoded_input)

	# Drop the batch dimension: one value per scoring criterion
	predictions = outputs.logits.squeeze()
	predicted_scores = predictions.numpy()

	# Normalize the scores so that the highest one maps to 9
	normalized_scores = (predicted_scores / predicted_scores.max()) * 9

	# Round to the nearest half band (steps of 0.5)
	rounded_scores = np.round(normalized_scores * 2) / 2

	item_names = ["Task Achievement", "Coherence and Cohesion", "Vocabulary", "Grammar", "Overall"]

	for item, score in zip(item_names, rounded_scores):
	    print(f"{item}: {score:.1f}")

	# Output:
	# Task Achievement: 9.0
	# Coherence and Cohesion: 7.5
	# Vocabulary: 8.0
	# Grammar: 7.5
	# Overall: 8.5
	```
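The post-processing step in the example (divide by the largest raw score, scale to 9, round to the nearest 0.5) can be tried on its own without loading the model. The following is a minimal sketch in plain Python that mirrors that arithmetic; the helper name `to_band_scores` and the raw values are hypothetical, chosen only for illustration, and are not actual model outputs.

```python
ITEM_NAMES = ["Task Achievement", "Coherence and Cohesion", "Vocabulary", "Grammar", "Overall"]


def to_band_scores(raw_scores, max_band=9.0):
    """Rescale so the largest raw score maps to max_band, then round to the nearest 0.5."""
    top = max(raw_scores)
    if top <= 0:
        raise ValueError("expected at least one positive raw score")
    # Multiplying by 2, rounding, and halving snaps each value to a half band.
    return [round(score / top * max_band * 2) / 2 for score in raw_scores]


# Made-up raw scores standing in for the five model outputs (assumption).
raw = [0.90, 0.75, 0.80, 0.74, 0.85]
bands = to_band_scores(raw)

for item, band in zip(ITEM_NAMES, bands):
    print(f"{item}: {band:.1f}")
```

Because every score is divided by the largest one, the highest-scoring criterion always comes out as 9.0 under this scheme, so the printed bands are relative to each other within one essay.
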