
Around 30 000 sentences from around 10 000 YouTube comments. Each sentence manually annotated as either being a violent threat or not.

erikve/YouTube-Threat-Corpus


YouTube-Threat-Corpus

This corpus consists of a total of around 30 000 sentences from around 10 000 YouTube comments. Each sentence is manually annotated as either being a threat of (or sympathy with) violence or not.

The corpus is described in the paper "THREAT: A Large Annotated Corpus for Detection of Violent Threats" (Hammer et al. 2019). A previous version of the corpus was thoroughly evaluated in "Threat detection in online discussions" (Wester et al. 2016) and represents a natural benchmark for future research. The version of the corpus used in Wester et al. (2016) is not publicly available, but can be obtained by contacting the authors. Both articles are included in the folder 'Articles', along with the bib-files for referencing the articles. Please cite both papers in any work using the corpus.

The corpus can be downloaded by following the link below. By clicking the download link, you accept that the corpus is for academic use only and that you will delete the dataset on request.

>>I accept terms of use, proceed to download<<

Sincerely, Hugo Lewi Hammer, Michael Riegler, Lilja Øvrelid and Erik Velldal
