This folder contains implementations of toxicity detection benchmarks for LLM360 models. The benchmarks measure a model's ability to identify toxic text.
Here's a list of toxicity detection benchmarks we have implemented so far.
single_ckpt_toxic_detection.py is the main entrypoint for evaluating toxicity detection on a single model. It uses the Python modules in the utils/ folder.
The utils/ folder contains helper functions for model/dataset IO:
- data_utils.py: Dataset preparation for all benchmarks
- model_utils.py: Model loader
By default, the evaluation results are saved in ./{model_name}_results.jsonl.
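Since the results are written as JSON Lines, they can be inspected with a few lines of Python. The following is a minimal sketch, not part of this repository: the helper names `results_path` and `load_results` are hypothetical, and it assumes `model_name` is a plain name without path separators and that each non-empty line of the file holds one JSON object. The fields inside each record depend on the evaluation script and are not assumed here.

```python
import json
from pathlib import Path


def results_path(model_name: str, out_dir: str = ".") -> Path:
    """Build the default results path, {out_dir}/{model_name}_results.jsonl.

    Hypothetical helper: it mirrors the naming pattern described above and
    assumes model_name contains no path separators.
    """
    return Path(out_dir) / f"{model_name}_results.jsonl"


def load_results(path) -> list:
    """Read a JSON Lines file into a list, one parsed object per non-empty line."""
    records = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records
```

For example, `load_results(results_path("my_model"))` would return the saved records as a list of dictionaries, where `my_model` stands in for whatever model name the evaluation was run with.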
- Clone the repository and enter the folder:
  git clone https://github.com/LLM360/Analysis360.git
  cd Analysis360/analysis/safety360/toxic_detection
- Install dependencies:
  pip install -r requirements.txt
An example usage is provided in demo.ipynb, which can be run on a single A100 80GB GPU.