Benchmark

Benchmark Dataset

This section covers benchmark datasets such as SQuAD and WNLI.


List

  • Dataset Name: the name of the dataset (with a link to the related page)
  • Benchmark name, or FLAN / T0: the task the dataset is used for in that benchmark, or in FLAN or T0 (marked "-" if it is not used)
| Dataset Name | GLUE | SuperGLUE | KILT | BEIR | T0 | FLAN |
| --- | --- | --- | --- | --- | --- | --- |
| Adversarial QA | - | - | - | - | Extractive QA | - |
| AG News | - | - | - | - | Topic Classification | Summarization |
| AIDA CoNLL-YAGO | - | - | Entity Linking | - | - | - |
| ANLI | - | - | - | - | NLI | NLI |
| ARC | - | - | - | - | - | Closed-Book QA |
| BoolQ | - | QA | - | - | - | Reading Comprehension |
| CB | - | NLI | - | - | NLI | NLI |
| CNN-DM | - | - | - | - | Summarization | Summarization |
| CoLA | Sentence Acceptability | - | - | - | - | Misc. |
| Common Gen | - | - | - | - | Structure-To-Text | Structure-To-Text |
| CommonsenseQA | - | - | - | - | Multiple-Choice QA | - |
| COPA | - | - | - | - | Sentence Completion | Commonsense |
| CoQA | - | QA | - | - | - | Misc. |
| Cosmos QA | - | - | - | - | Multiple-Choice QA | Reading Comprehension/Commonsense |
| DREAM | - | - | - | - | Multiple-Choice QA | - |
| DROP | - | - | - | - | - | Reading Comprehension |
| DuoRC | - | - | - | - | Extractive QA | - |
| E2ENLG | - | - | - | - | - | Structure-To-Text |
| ELI5 | - | - | Open-domain QA | - | - | - |
| Fever | - | - | Fact Checking | Fact Checking | - | - |
| Gigaword | - | - | - | - | Summarization | Summarization |
| HellaSwag | - | - | - | - | Sentence Completion | Commonsense |
| Hotpot QA | - | - | Open-domain QA | Open-domain QA | Closed-Book QA | - |
| IMDB | - | - | - | - | Sentiment | Sentiment |
| Math | - | - | - | - | - | Misc. |
| MNLI | NLI | - | - | - | - | NLI |
| MRPC | Paraphrase Identification | - | - | - | Paraphrase Identification | Paraphrase Identification |
| MS-MARCO | - | - | - | Passage Retrieval | - | - |
| MultiNews | - | - | - | - | Summarization | Summarization |
| MultiRC | - | QA | - | - | - | Reading Comprehension |
| Newsroom | - | - | - | - | - | Summarization |
| NQ (Natural Questions) | - | - | - | - | - | Closed-Book QA |
| OBQA | - | - | - | - | - | Reading Comprehension |
| PAWS | - | - | - | - | Paraphrase Identification | Paraphrase Identification |
| PiQA | - | - | - | - |  | Commonsense |
| QASC | - | - | - | - | Multiple-Choice QA | - |
| QNLI | QA/NLI | - | - | - | - | NLI |
| QQP | Paraphrase Identification | - | - | - | Paraphrase Identification | Paraphrase Identification |
| QuAC | - | - | - | - | - | Misc. |
| QuAIL | - | - | - | - | Multiple-Choice QA | - |
| QuaRel | - | - | - | - | Multiple-Choice QA | - |
| QuaRTz | - | - | - | - | Multiple-Choice QA | - |
| Quoref | - | - | - | - | Extractive QA | - |
| ReCoRD | - | QA | - | - |  | Reading Comprehension/Commonsense |
| ROPES | - | - | - | - | Extractive QA | - |
| Rotten Tomatoes | - | - | - | - | Sentiment | - |
| RTE | NLI | NLI | - | - | NLI | NLI |
| SamSum | - | - | - | - | Summarization | Summarization |
| SciQ | - | - | - | - | Multiple-Choice QA | - |
| Sent140 | - | - | - | - | - | Sentiment |
| Social IQA | - | - | - | - | Multiple-Choice QA | - |
| SNLI | - | - | - | - | - | NLI |
| SQuAD | - | - | - | - | - | Reading Comprehension |
| SST-2 | Sentiment | - | - | - | - | - |
| Story Cloze | - | - | - | - | Sentence Completion | Commonsense |
| STS-B | Sentence Similarity | - | - | - | - | Paraphrase Identification |
| T-REx | - | - | Slot Filling | - | - | - |
| TQA | - | - | Open-domain QA | - | - | Closed-Book QA |
| TREC | - | - | - | - | Topic Classification | Misc. |
| WEBNLG | - | - | - | - | - | Structure-To-Text |
| WiC | - | Word Sense Disambiguation | - | - | Word Sense Disambiguation | Misc. |
| Wiki Hop | - | - | - | - | Multiple-Choice QA | - |
| Wiki QA | - | - | - | - | Closed-Book QA | - |
| WikiBio | - | - | - | - | Structure-To-Text | - |
| Winogrande | - | - | - | - | Coreference Resolution | - |
| WiQA | - | - | - | - | Multiple-Choice QA | - |
| Wizard of Wikipedia | - | - | Dialogue | - | - | - |
| WNLI | NLI | - | - | - | - | NLI |
| WSC | - | Coreference Resolution | - | - | Coreference Resolution | - |
| Xsum | - | - | - | - | Summarization | Summarization |
| Yelp | - | - | - | - | Sentiment | Sentiment |
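
To query the list programmatically, it can be held as a small lookup structure. The sketch below is only an illustration and not part of this repository: the names `COLUMNS`, `ROWS`, `tasks_for`, and `datasets_in` are hypothetical, and only five rows are copied from the list to keep it short.

```python
# Column order follows the header of the list: one cell per benchmark/model.
COLUMNS = ("GLUE", "SuperGLUE", "KILT", "BEIR", "T0", "FLAN")

# A few rows copied from the list; "-" means the dataset is not used there.
ROWS = {
    "BoolQ": ("-", "QA", "-", "-", "-", "Reading Comprehension"),
    "Fever": ("-", "-", "Fact Checking", "Fact Checking", "-", "-"),
    "MRPC": ("Paraphrase Identification", "-", "-", "-",
             "Paraphrase Identification", "Paraphrase Identification"),
    "RTE": ("NLI", "NLI", "-", "-", "NLI", "NLI"),
    "Xsum": ("-", "-", "-", "-", "Summarization", "Summarization"),
}


def tasks_for(dataset):
    """Return {column: task} for every column in which the dataset is used."""
    return {col: task for col, task in zip(COLUMNS, ROWS[dataset]) if task != "-"}


def datasets_in(column):
    """Return the sorted names of the datasets used by one benchmark or model."""
    idx = COLUMNS.index(column)
    return sorted(name for name, cells in ROWS.items() if cells[idx] != "-")


print(tasks_for("Fever"))
print(datasets_in("T0"))
```

Keeping one tuple per dataset mirrors the row layout of the list, so adding a dataset is a matter of copying its row.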