Name
atis
atis2/nlu_joint/v1
cnn_dailymail/seq2seq/v1
conll2003
hkust/asr/v1
iemocap
mini_an4
mock_text_cls_data/text_cls/v1
mock_text_match_data/text_match/v1
mock_text_nlu_joint_data/nlu-joint/v1
mock_text_seq2seq_data/seq2seq/v1
mock_text_seq_label_data/seq-label/v1
msra_ner
quora_qp
snli
sre16/v1
trec
voxceleb
wmt14_en_de/nlp1
yahoo_answer
README.md

Examples

All examples live under the `egs` directory, and each one is named after its dataset. Datasets whose names start with "mock" are used for testing.
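
The naming convention above can be checked with a short script. This is a minimal sketch, not part of the repository: the helper name `split_examples` is hypothetical, and it assumes only that each dataset is a top-level directory under `egs` and that test datasets are the ones whose names begin with "mock".

```python
from pathlib import Path


def split_examples(egs_root):
    """Return (real, mock) lists of dataset directory names under egs_root.

    Hypothetical helper: relies only on the convention that every example
    is a directory named after its dataset and that test-only datasets
    have names starting with "mock".
    """
    root = Path(egs_root)
    # Only directories count as examples; files such as README.md are skipped.
    names = sorted(p.name for p in root.iterdir() if p.is_dir())
    mock = [n for n in names if n.startswith("mock")]
    real = [n for n in names if not n.startswith("mock")]
    return real, mock


if __name__ == "__main__":
    egs = Path("egs")  # assumed location, relative to the current directory
    if egs.is_dir():
        real, mock = split_examples(egs)
        print("datasets:", real)
        print("test-only (mock) datasets:", mock)
```

Run from the repository root, this prints the real datasets and the mock datasets as two separate lists.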

Examples for NLP

DataSet	Supported Tasks	Description
ATIS	Sequence labeling / Text classification / NLU joint learning	Air Travel Information System (ATIS) pilot corpus.
CoNLL2003	Sequence labeling	The CoNLL 2003 NER task consists of newswire text from the Reuters RCV1 corpus tagged with four entity types (PER, LOC, ORG, MISC).
MSRA_NER	Sequence labeling	The MSRA dataset is a news-domain corpus for named entity recognition (NER).
SNLI	Sentence Matching	The Stanford Natural Language Inference (SNLI) corpus is a freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning.
Quora_QP	Sentence Matching	Data collected from the Quora platform, a site for gaining and sharing knowledge about anything.
Yahoo_Answer	Document Classification	Yahoo answers are obtained from (Zhang et al., 2015). This is a topic classification task with 10 classes. The documents we use include question titles, question contexts, and best answers.
Trec	Document Classification	This data collection contains all the data used in the learning question classification experiments, including the question class definitions.

Examples for Speech

DataSet	Supported Tasks	Description
hkust	ASR	HKUST Mandarin Telephone Speech
voxceleb	Speaker Verification	VoxCeleb is an audio-visual dataset consisting of short clips of human speech, extracted from interview videos uploaded to YouTube.
iemocap	Emotion	The Interactive Emotional Dyadic Motion Capture (IEMOCAP) database is an acted, multimodal, and multispeaker database collected at the SAIL lab at USC.
