language: en
- spotify-podcast-dataset
- bert
- classification
- pytorch
- text-classification
- text: >-
__START__ [SEP] This is the first podcast on natural language processing
applied to spoken language.
- text: >-
This is the first podcast on natural language processing applied to spoken
language. [SEP] You can find us on
- text: >-
You can find us on [SEP] You
can also subscribe to our newsletter
General Information
This is a bert-base-cased binary classification model, fine-tuned to classify whether a given sentence contains advertising content. It uses the previous sentence as context to make more accurate predictions.
The model is used in the paper 'Leveraging multimodal content for podcast summarization' published at ACM SAC 2022.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained('morenolq/spotify-podcast-advertising-classification')
tokenizer = AutoTokenizer.from_pretrained('morenolq/spotify-podcast-advertising-classification')

desc_sentences = ["Sentence 1", "Sentence 2", "Sentence 3"]
for i, s in enumerate(desc_sentences):
    # The first sentence has no predecessor, so the special start marker is its context
    if i == 0:
        context = "__START__"
    else:
        context = desc_sentences[i-1]
    # Encode (context, sentence) as a pair; the tokenizer places [SEP] between them
    out = tokenizer(context, s, padding="max_length",
                    max_length=256,
                    return_tensors='pt')
    outputs = model(**out)
    print(f"{s},{outputs}")
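The loop above prints the raw model output. As a minimal sketch of how the pairing and post-processing could be factored out, the helpers below build the (context, sentence) pairs and turn a pair of logits into an advertising probability. The names `build_pairs`, `softmax`, and `ad_probability` are hypothetical, the example logits are made up, and the assumption that index 1 is the advertising class should be checked against `model.config.id2label`.

```python
import math

START_TOKEN = "__START__"  # context marker for the first sentence, as in the snippet above


def build_pairs(sentences):
    """Pair each sentence with its predecessor (START_TOKEN for the first one)."""
    contexts = [START_TOKEN] + list(sentences[:-1])
    return list(zip(contexts, sentences))


def softmax(logits):
    """Numerically stable softmax over a plain list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ad_probability(logits, ad_index=1):
    # ASSUMPTION: index 1 is the advertising class; verify with model.config.id2label.
    return softmax(logits)[ad_index]


pairs = build_pairs(["Sentence 1", "Sentence 2", "Sentence 3"])
example_logits = [-1.2, 2.3]  # made-up values for illustration, not real model output
print(pairs)
print(round(ad_probability(example_logits), 3))
```

With the real model, the list of logits for one sentence can be obtained with `outputs.logits[0].tolist()` and passed to `ad_probability`.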
The manually annotated data used for model fine-tuning are available here
The classification report for the model evaluation on the test split is shown below:
class         precision  recall  f1-score  support
0             0.95       0.93    0.94      256
1             0.88       0.91    0.89      140
accuracy                         0.92      396
macro avg     0.91       0.92    0.92      396
weighted avg  0.92       0.92    0.92      396
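As a rough check on how the two average rows relate to the per-class rows, the sketch below recomputes them from the values in the report. The macro average is the unweighted mean over classes, while the weighted average weights each class by its support. Because the per-class scores are already rounded to two decimals, the recomputed averages may differ from the reported ones in the last digit. The `report` dictionary and function names are illustrative only.

```python
# Per-class scores copied from the report above (already rounded to two decimals).
report = {
    "0": {"precision": 0.95, "recall": 0.93, "f1": 0.94, "support": 256},
    "1": {"precision": 0.88, "recall": 0.91, "f1": 0.89, "support": 140},
}


def macro_avg(report, metric):
    """Unweighted mean of a metric across classes."""
    return sum(row[metric] for row in report.values()) / len(report)


def weighted_avg(report, metric):
    """Mean of a metric across classes, weighted by class support."""
    total = sum(row["support"] for row in report.values())
    return sum(row[metric] * row["support"] for row in report.values()) / total


for metric in ("precision", "recall", "f1"):
    print(metric,
          round(macro_avg(report, metric), 3),
          round(weighted_avg(report, metric), 3))
```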
If you find it useful, please cite the following paper:
author = {Vaiani, Lorenzo and La Quatra, Moreno and Cagliero, Luca and Garza, Paolo},
title = {Leveraging Multimodal Content for Podcast Summarization},
year = {2022},
isbn = {9781450387132},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {},
doi = {10.1145/3477314.3507106},
booktitle = {Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing},
pages = {863–870},
numpages = {8},
keywords = {multimodal learning, multimodal features fusion, extractive summarization, deep learning, podcast summarization},
location = {Virtual Event},
series = {SAC '22}