CTI specifics: Threat landscape inferred vocab expansion

#4
by Crowler - opened

Hello,
First of all, thank you very much for sharing such a concentration of great achievements.

I was wondering about your take on how to handle an expansion of the vocabulary at no cost, especially for the V_{distinct} set (see §4 of the introductory paper). CTI has a specificity that other specialized fields do not: its actual vocabulary keeps expanding, or even shifting (like a moving average), as the threat landscape evolves at the tactical level: new (dubbed) malware, threat actors, campaigns, tools, intrusion sets, etc.
The contextual nature of BERT could of course yield objective embeddings, but even the possibility of generating those embeddings is conditioned on a suitable "basis" in the vocabulary.

What is your take on this? Which solution would you recommend?

Crowler changed discussion title from CTI specifics: Threat landscape inferred vocab extension to CTI specifics: Threat landscape inferred vocab expansion
CyberPeace Institute org

Hello,
I am not a cybersecurity expert myself, but given the dynamic nature of threat landscapes, adapting to new terminology such as malware variants or intrusion sets is probably crucial, as you pointed out. While BERT's contextual nature helps produce objective embeddings, establishing a robust vocabulary basis and fine-tuning the model on up-to-date data are essential for a high-performing model.
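
As a rough illustration of what a vocabulary "basis" means here, below is a minimal sketch of greedy longest-match-first subword splitting in the style of WordPiece. It is not this model's actual tokenizer: the toy vocabulary, the malware name `darkgateloader`, and the helper `wordpiece_tokenize` are all made up for the example. It only shows that an unseen term is either fragmented into known pieces or mapped to `[UNK]`, and that adding the term to the vocabulary makes it a single token.

```python
# Toy WordPiece-style tokenizer (greedy longest-match-first), for illustration only.
# The vocabulary and the malware name below are invented for this sketch.

def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Split one lowercase word into subword pieces using greedy longest match."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits: the whole word becomes unknown
        pieces.append(match)
        start = end
    return pieces


base_vocab = {"[UNK]", "dark", "##gate", "##loader", "malware"}
new_term = "darkgateloader"  # hypothetical malware name

# Before expansion: the term is fragmented into existing pieces.
before = wordpiece_tokenize(new_term, base_vocab)

# After expansion: the term is a single token (its embedding is still untrained).
expanded_vocab = base_vocab | {new_term}
after = wordpiece_tokenize(new_term, expanded_vocab)

# A term with no usable pieces at all falls back to the unknown token.
unknown = wordpiece_tokenize("qakbot", base_vocab)

print(before)   # ['dark', '##gate', '##loader']
print(after)    # ['darkgateloader']
print(unknown)  # ['[UNK]']
```

With the Hugging Face `transformers` library, the analogous steps would typically be `tokenizer.add_tokens([...])` followed by `model.resize_token_embeddings(len(tokenizer))`. The new embedding rows start out untrained, which is why fine-tuning on up-to-date data is still needed afterwards.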
