LDC Catalog Information for 2001
Catalog ID | Catalog Name | Requested By | Invoice Date | Location | Inv #
LDC2001S91 1997 HUB-4 Broadcast News Evaluation Non English Test Material Gina-Anne Levow Jan 16, 2002 /data/lorien1/data/HUB4_1997NE/ 7584
LDC2001S97 2000 NIST Speaker Recognition Evaluation        
LDC2001T55 Arabic Newswire Part 1        
LDC2001T61 CALLHOME Spanish Dialogue Act Annotation Gina-Anne Levow Jan 16, 2002 merry:/export/data/1/Data/LDCdownloads/levowb7.LDC2001T61.tar 7584
LDC2001T62 Cetempublico        
LDC2001T11 Chinese Treebank Version 2.0 Gina-Anne Levow Jan 16, 2002 merry:/export/data/1/Data/LDCdownloads/levow28.LDC2001T11.tar 7584
LDC2001S16 Grassfields Bantu Fieldwork: Ngomba Tone Paradigms Nadine Di Vito Mar 15, 2005 /data/lorien1/Data/Tone/Ngomba 13272
LDC2001T02 Message Understanding Conference (MUC) 7        
LDC2001T10 Prague Dependency Treebank 1.0        
LDC2001S04 Speech in Noisy Environments (SPINE2) Part 1 Audio        
LDC2001T05 Speech in Noisy Environments (SPINE2) Part 1 Transcripts        
LDC2001S06 Speech in Noisy Environments (SPINE2) Part 2 Audio        
LDC2001T07 Speech in Noisy Environments (SPINE2) Part 2 Transcripts        
LDC2001S08 Speech in Noisy Environments (SPINE2) Part 3 Audio        
LDC2001T09 Speech in Noisy Environments (SPINE2) Part 3 Transcripts        
LDC2001S99 Speech in Noisy Environments 1 (SPINE1 CODED) Coded Audio        
LDC2001S13 Switchboard Cellular Part 1 Audio        
LDC2001S15 Switchboard Cellular Part 1 Transcribed Audio        
LDC2001T14 Switchboard Cellular Part 1 Transcription Barbara Need Feb 26, 2004   11067
LDC2001T60 Syllable-Final /s/ Lenition        
LDC2001S93 TDT2 Mandarin Audio Corpus Gina-Anne Levow Jan 16, 2002 /data/lorien1/Data/TDT2-Audio 7584
LDC99T38 TDT2 Mandarin Text     /data/lorien1/Data/TDT2-Text  
LDC99T39 TDT2 Multilanguage Text Version 3.0        
LDC2001T57 TDT2 Multilanguage Text Version 4.0 Gina-Anne Levow Jan 16, 2002 /data/lorien1/Data/TDT2-Text-4.0 7584
LDC2001S94 TDT3 English Audio Gina-Anne Levow Jan 23, 2002 /data/lorien1/Data/TDT3-Audio 7613
LDC2001S95 TDT3 Mandarin Audio Gina-Anne Levow Jan 16, 2002 /data/lorien1/Data/TDT3-Audio 7584
LDC2001T58 TDT3 Multilanguage Text Version 2.0 Gina-Anne Levow Jan 16, 2002 /data/lorien1/Data/TDT3-Text 7584


LDC Catalog Information for 2002
Catalog ID | Catalog Name | Requested By | Invoice Date | Location | Inv #
LDC2002S11 1997 HUB4 English Evaluation Speech and Transcripts        
LDC2002S22 1997 HUB5 Arabic Evaluation        
LDC2002T39 1997 HUB5 Arabic Transcripts        
LDC2002S24 1997 HUB5 German Evaluation        
LDC2003T03 1997 HUB5 German Transcripts        
LDC2002S25 1997 HUB5 Spanish Evaluation        
LDC2003T04 1997 HUB5 Spanish Transcripts        
LDC2002S10 1998 HUB5 English Evaluation        
LDC2003T02 1998 HUB5 English Transcripts Barbara Need Feb 26, 2004   11067
LDC2002S56 2000 Communicator Evaluation Gina-Anne Levow Jan 21, 2003 /data/lorien1/Data/Communicator2000 9044
LDC2002S13 2001 HUB5 English Evaluation Gina-Anne Levow May 16, 2002 /data/lorien1/data/HUB5E_01 8107
LDC2002S12 2001 HUB5 Mandarin Evaluation Gina-Anne Levow May 16, 2002 /data/lorien1/Data/HUB5 8107
LDC2003T01 2001 HUB5 Mandarin Transcripts Gina-Anne Levow Jun 11, 2003 /data/lorien1/Data/HUB5 9702
LDC2003T01 2001 HUB5 Mandarin Transcripts Jun Yang May 25, 2004   11609
LDC2002S34 2001 NIST Speaker Recognition Evaluation Corpus        
LDC2002E36 2002 DUC Evaluation Version 0.1        
LDC2002E33 ACE Phase 2 Training Data Version 6        
LDC2002E55 Arabic Treebank: Part 1 v1.0        
LDC2002E49 Buckwalter Arabic Morphological Analyzer        
LDC2002S37 Callhome Egyptian Arabic Speech Supplement        
LDC2002T38 Callhome Egyptian Arabic Transcripts Supplement        
LDC2002E27 Chinese English Translation Dictionary v3.0        
LDC2002E14 Chinese English Translation Lexicon Version 3-beta        
LDC2002S28 Emotional Prosody Speech and Transcripts Gina-Anne Levow Jan 21, 2003 /data/lorien1/Data/EMOTIONAL_PROSODY 9044
LDC2002E17 English Translation of Chinese Treebank Version 1 beta        
LDC2001S16 Grassfields Bantu Fieldwork: Ngomba Tone Paradigms Nadine Di Vito Mar 15, 2005 /data/lorien1/Data/Tone/Ngomba 13272
LDC2002E19 Hong Kong Hansard Parallel Text Version 2 beta        
LDC2002E16 Hong Kong News Parallel Text Version 2 beta        
LDC2002T26 Korean English Treebank Annotations        
LDC2002E54 Multiple-Translation Arabic Corpus        
LDC2002T01 Multiple-Translation Chinese Corpus Gina-Anne Levow May 16, 2002 merry:/export/data/1/Data/Chinese_translations 8107
LDC2002E53 Multiple-Translation Chinese Corpus 2.0        
LDC2002E50 Name-Annotated TDT Corpus Supplement for ACE        
LDC2002T07 RST Discourse Treebank Gina-Anne Levow May 16, 2002 /data/lorien1/Data/discourse_treebank 8107
LDC2002E58 Sinorama Chinese English Parallel Text        
LDC2001S08 Speech in Noisy Environments (SPINE2) Part 3 Audio        
LDC2001T09 Speech in Noisy Environments (SPINE2) Part 3 Transcripts        
LDC2002S06 Switchboard-2 Phase III Audio Gina-Anne Levow May 16, 2002 on shelf 8107
LDC2002E32 TDT3 Arabic Text Version 0.1        
LDC2002E52 TDT4 Multilanguage Text Corpus        
LDC2002T31 The AQUAINT Corpus of English News Text Gina-Anne Levow Oct 15, 2003 /data/lorien1/Data/NTCIR4 10301
LDC2002S04 Translanguage English Database (TED) Speech Gina-Anne Levow May 16, 2002 /data/lorien1/Data/TED* 8107
LDC2002T03 Translanguage English Database (TED) Transcripts Gina-Anne Levow May 16, 2002 /data/lorien1/Data/Translanguage 8107
LDC2002E15 UN Arabic English Parallel Text Version 1 beta        
LDC2002E48 Ummah Arabic English Parallel News Text        
LDC2002S35 Voicemail Corpus Part II Gina-Anne Levow Jan 21, 2003 /data/lorien1/data/VOICEMAIL 9044
LDC2002S02 West Point Arabic Speech Corpus        
LDC2002E18 Xinhua Chinese English Parallel News Text Version 1 beta        


LDC Catalog Information for 2003
Catalog ID | Catalog Name | Requested By | Invoice Date | Location | Inv #
LDC2003T03 1997 HUB5 German Transcripts        
LDC2002T42 1997 HUB5 Spanish Transcripts        
LDC2003T04 1997 HUB5 Spanish Transcripts        
LDC2003T02 1998 HUB5 English Transcripts Barbara Need Feb 26, 2004   11067
LDC2003S01 2001 Communicator Evaluation Gina-Anne Levow Jun 11, 2003 /data/lorien1/Data/Communicator2001 9702
LDC2003T01 2001 HUB5 Mandarin Transcripts Gina-Anne Levow Jun 11, 2003 /data/lorien1/Data/HUB5 9702
LDC2003T01 2001 HUB5 Mandarin Transcripts Jun Yang May 25, 2004   11609
LDC2003E26 ACE 2004 Pilot Corpus V1.0        
LDC2003T11 ACE-2 Version 1.0        
LDC2003E18 ACE3-V1.3        
LDC2003T20 ANC First Release Gina-Anne Levow Dec 02, 2003   10538
LDC2003E10 Aquaint Xinhua for NTCIR Evaluation        
LDC2003T12 Arabic Gigaword        
LDC2003E05 Arabic Translation Corpus Part 1        
LDC2003T07 Arabic Treebank: Part 1 - 10K-word English Translation Gina-Anne Levow Jun 11, 2003 /data/lorien1/Data/Arabic_treebank 9702
LDC2003T06 Arabic Treebank: Part 1 v 2.0 Gina-Anne Levow Jun 11, 2003 /data/lorien1/Data/Arabic_treebank 9702
LDC2003E17 Arabic Treebank: Part 2 v 1.0        
LDC2003E24 Arabic Treebank: Part 2 v 1.1        
LDC2004E14 Articulation Index Speech V1.0        
LDC2003E01 Chinese <-> English Name Entity Lists Version 1.0 beta        
LDC2003T09 Chinese Gigaword Gina-Anne Levow Jun 11, 2003 /data/lorien1/data/GIGAWORD_MAN// 9702
LDC2003E06 Chinese Treebank 3.0        
LDC2003E07 Chinese Treebank English Parallel Corpus        
LDC2003S04 Cross-Channel Forensic Speech for Automatic Speaker Recognition        
LDC2003E27 EARS MDE RT-03 DevTest and Evaluation Corpus        
LDC2003E19 EARS MDE RT-03F Training Corpus        
LDC2003T05 English Gigaword Gina-Anne Levow Jun 11, 2003 /data/lorien1/data/GIGAWORD_ENG/ 9702
LDC2003E14 FBIS Multilanguage Texts        
LDC2003V01 FORM2 Kinematic Gesture        
LDC2003E13 Fisher Quick Transcription Part 1 Version 1.0        
LDC2003E13C Fisher Quick Transcription Part 3 Version 1.0        
LDC2003E12D Fisher Training Speech Data, Part 4        
LDC2003E12 Fisher Training Speech Part 1        
LDC2003E12B Fisher Training Speech Part 2        
LDC2003E12C Fisher Training Speech Part 3        
LDC2003E13D Fisher Training Transcripts Part 4, v1.0        
LDC2003L01 Grassfields Bantu Fieldwork: Dschang Lexicon Barbara Need Nov 07, 2003   10445
LDC2003S02 Grassfields Bantu Fieldwork: Dschang Tone Paradigms Barbara Need Nov 07, 2003 /data/lorien1/Data/Tone/Dschang 10445
LDC2003E15 HARD GovDocs        
LDC2003E25 Hong Kong News Parallel Text        
LDC2003P01 Korean Telephone Conversations Complete Set        
LDC2003L02 Korean Telephone Conversations Lexicon Gina-Anne Levow Jun 11, 2003 /data/lorien1/Data/Korean 9702
LDC2003S03 Korean Telephone Conversations Speech Gina-Anne Levow Jun 11, 2003 /data/lorien1/data/KOR_SPEECH_1,2,3 9702
LDC2003T08 Korean Telephone Conversations Transcripts Gina-Anne Levow Jun 11, 2003 /data/lorien1/Data/Korean 9702
LDC2003T13 Message Understanding Conference (MUC) 6        
LDC2003E04 Multiple Translation Chinese Corpus Part 3        
LDC2003T18 Multiple-Translation Arabic (MTA) Part 1        
LDC2003T17 Multiple-Translation Chinese (MTC) Part 2        
LDC2003T10 SAID        
LDC2003E16 SIGHAN Bakeoff        
LDC2003T15 SLX Corpus of Classic Sociolinguistic Interviews        
LDC2003S06 Santa Barbara Corpus of Spoken American English Part-II Gina-Anne Levow Dec 02, 2003 /data/lorien1/data/SBCSAE_P2/ 10538
LDC2003T16 SummBank 1.0        
LDC2003E02 TDT4 Multilanguage Speech        
LDC2003E20 TDT4 Multilanguage Text Subset for TIDES Extraction 2003        
LDC2003E21 TDT4 Multilanguage Text Version 1.1        
LDC2003E03 TDT4 Multilanguage Transcripts        
LDC2003E22 The SLX Corpus of Classic Sociolinguistic Interviews        
LDC2003E11 UN Chinese English Parallel Text Version 1.0 beta        
LDC2003S05 West Point Russian Speech        


LDC Catalog Information for 2004
Catalog ID | Catalog Name | Requested By | Invoice Date | Location | Inv #
LDC2004T15 2000 Communicator Dialogue Act Tagged Gina-Anne Levow Aug 24, 2004 /data/lorien1/Data/CommDA 12259
LDC2004T16 2001 Communicator Dialogue Act Tagged Gina-Anne Levow Aug 24, 2004 /data/lorien1/data/2001_COMM_DIALOG_ACT 12259
LDC2004S04 2002 NIST Speaker Recognition Evaluation        
LDC2004S11 2002 Rich Transcription Broadcast News and Conversational Telephone Speech        
LDC2004E27 ACE 2004 English and Chinese Training Data Superset        
LDC2004E03 ACE 2004 Pilot Corpus V1.3        
LDC2004E06 AQUAINT Supplement for DUC2004        
LDC2004E71 ATB Part 3 (a) v.1.1        
LDC2004E22 Arabic CTS Levantine Fisher Training Data Set 1, Transcriptions        
LDC2004E21 Arabic CTS Levantine Fisher Training Data Set 1: Speech        
LDC2004E65 Arabic CTS Levantine Fisher Training Data Set 2, Speech        
LDC2004E66 Arabic CTS Levantine Fisher Training Data Set 2, Transcripts        
LDC2004T18 Arabic English Parallel News Part 1        
LDC2004E08 Arabic English Parallel News Text Part 1        
LDC2004E07 Arabic News Translation Corpus Part 3        
LDC2004E11 Arabic News Translation Corpus Part 4        
LDC2004T17 Arabic News Translation Text Part 1        
LDC2004T02 Arabic Treebank: Part 2 v 2.0        
LDC2004T11 Arabic Treebank: Part 3 v 1.0        
LDC2004E14 Articulation Index Speech V1.0        
LDC2004T27 Buckwalter(Bad catalog No. do not use) Arabic. Morph Analyzer        
LDC2004T05 Chinese Treebank Version 4.0 Jun Yang May 25, 2004   11609
LDC2004T05 Chinese Treebank Version 4.0 Gina-Anne Levow Aug 24, 2004   12259
LDC2004S01 Czech Broadcast News Speech        
LDC2004T01 Czech Broadcast News Transcripts        
LDC2004V01 FORM1 Kinematic Gesture        
LDC2004S13 Fisher English Training Speech Part 1 Speech        
LDC2004T19 Fisher English Training Speech Part 1, Transcripts        
LDC2004E30 HARD 2004 Corpus Gina-Anne Levow Jun 29, 2004 /data/lorien1/Data/HARD2004 11875
LDC2004E34 HARD 2004 Evaluation Topics V1.1 Gina-Anne Levow Jun 29, 2004 /data/lorien1/Data/HARD2004 11875
LDC2004E34 HARD 2004 Evaluation Topics V1.1 Gina-Anne Levow Jul 09, 2004 /data/lorien1/Data/HARD2004 11999
LDC2004E34 HARD 2004 Evaluation Topics V1.1 Gina-Anne Levow Jul 21, 2004 /data/lorien1/Data/HARD2004 12101
LDC2004E32 HARD 2004 Training Data Gina-Anne Levow Jun 29, 2004 /data/lorien1/Data/HARD2004 11875
LDC2004E09 Hong Kong Hansard Parallel Text        
LDC2004T08 Hong Kong Parallel Text        
LDC2004S02 ICSI Meeting Speech Gina-Anne Levow Mar 16, 2004 /data/lorien1/Data/icsi/ 11180
LDC2004T04 ICSI Meeting Transcripts Gina-Anne Levow Mar 16, 2004 /data/lorien1/Data/icsi 11180
LDC2004E04 ISL Meeting Corpus Speech     /data/lorien1/data/ISL_MEETING_SPEECH_2//  
LDC2004E05 ISL Meeting Corpus Transcripts     /data/lorien1/Data/isl  
LDC2004S05 ISL Meeting Speech Part 1 Gina-Anne Levow Aug 24, 2004 /data/lorien1/data/ISL_MEETING_SPEECH_2 12259
LDC2004T10 ISL Meeting Transcripts Part 1 Gina-Anne Levow Aug 24, 2004 /data/lorien1/Data/isl 12259
LDC2004L01 Klex: Finite-State Lexical Transducer for Korean        
LDC2004S08 MDE RT-03 Training Data Speech Gina-Anne Levow Aug 24, 2004 /data/lorien1/data/MDE_RT03_TRAIN_SP_EECH_1,2 12259
LDC2004T12 MDE RT-03 Training Data Text and Annotations Gina-Anne Levow Aug 24, 2004 /data/lorien1/data/MDE_RT03_TRAIN_TE_XT/ 12259
LDC2004T03 Morphologically Annotated Korean Text Gina-Anne Levow Mar 16, 2004   11180
LDC2004T07 Multiple-Translation Chinese (MTC) Part 3        
LDC2004E15 NIST Meeting Evaluation Corpus        
LDC2004S09 NIST Meeting Pilot Corpus Speech Gina-Anne Levow Aug 24, 2004 /data/lorien1/data/NIST_MEET_PILOT_SP_8,9/ 12259
LDC2004T13 NIST Meeting Pilot Corpus Transcripts and Metadata Gina-Anne Levow Aug 24, 2004 /data/lorien1/data/NIST_MEET_PILOT/ 12259
LDC2004E01 NIST Pilot Meeting Corpus Speech        
LDC2004E02 NIST Pilot Meeting Corpus Transcripts V1.4        
LDC2004T23 Prague Arabic Dependency Treebank 1.0        
LDC2004T25 Prague Czech-English Dependency Treebank Version 1.0        
LDC2004E26 Proposition Bank 1 V1.0        
LDC2004T14 Proposition Bank I        
LDC2004E24 RT-04 MDE Annotation Consistency Study        
LDC2004E16 RT-04 MDE DevTest Set #1 Version 1.2        
LDC2004E29 RT-04 MDE DevTest Set #2 V1.2        
LDC2004E31 RT-04 MDE Training Data V1.2        
LDC2004E28 RT-04 STT Transcription Consistency Study        
LDC2004E67 RT-04F STT Chinese CTS Development Data Speech        
LDC2004E68 RT-04F STT Chinese CTS Development Data Transcripts        
LDC2004E69 RT-04F STT Chinese CTS Training Data Speech        
LDC2004E70 RT-04F STT Chinese CTS Training Data Transcripts        
LDC2004E10 RT-04F STT Multilingual Speech Development Data - Supplement        
LDC2004E18 RT-04F STT Multilingual Speech Development Data V1.1 Re-release        
LDC2004E19 RT-04F STT Multilingual Transcripts Development Data V1.2        
LDC2004S10 Santa Barbara Corpus of Spoken American English III        
LDC2004S07 Switchboard Cellular Part 2 Audio        
LDC2004E20 TDT-4 Annotations        
LDC2004E36 TDT4 (Chinese, Arabic) Reformatted for MT Processing        
LDC2004E35 TDT5 (Chinese, Arabic) Reformatted for MT Processing        
LDC2004E23 TERN 2004 Training Data V1.3        
LDC2004T09 TIDES Extraction (ACE) 2003 Multilingual Training Data Gina-Anne Levow Aug 24, 2004 /data/lorien1/data/ace_tides_multling_train/ 12259
LDC2004E17 TIDES Extraction ACE 2004 Training Data V1.4        
LDC2004S12 Talkbank Ethology Data: Field Recordings of Vervet Monkey Calls        
LDC2004E13 UN Arabic English Parallel Text        
LDC2004E12 UN Chinese English Parallel Text        
LDC2004E72 eTIRR Arabic English News Text        


LDC Catalog Information for 2005
Catalog ID | Catalog Name | Requested By | Invoice Date | Location | Inv #
LDC2005E12 2005 MSE Arabic-English Clusters V1.2        
LDC2005T09 ACE 2004 Multilingual Training Corpus Michael Berger Mar 16, 2005 /data/lorien1/data/ace_tides_multling_train/ 13293
LDC2005E22 ACE 2005 Arabic Unsupervised Training Data        
LDC2005E21 ACE 2005 Chinese Unsupervised Training Data        
LDC2005E20 ACE 2005 English Unsupervised Training Data        
LDC2005E18 ACE 2005 Multilingual Training Data V6.0        
LDC2005T07 ACE Time Normalization (TERN) 2004 English Training Data V1.0 Michael Berger Feb 18, 2005 /data/lorien1/data/TERN 13132
LDC2005T35 ANC 2nd Release        
LDC2005S07 Arabic CTS Levantine Fisher Training Data Set 3, Speech Michael Berger Feb 18, 2005 /data/lorien1/data/cts_arabic_la_td3_speech/ 13132
LDC2005T03 Arabic CTS Levantine Fisher Training Data Set 3, Transcripts Michael Berger Feb 18, 2005 /data/lorien1/data/cts_arabic_la_td3_trans/ 13132
LDC2005E46 Arabic Treebank English Translation        
LDC2005T02 Arabic Treebank: Part 1 v 3.0 (POS with full vocalization + syntactic analysis) Michael Berger Feb 18, 2005 /data/lorien1/data/ATB_PT1_VER3/ 13132
LDC2005T20 Arabic Treebank: Part 3 (full corpus) v2.0 (MPG + Syntactic Analysis) Michael Berger Jun 20, 2005 /data/lorien1/data/ATB_PT1_VER3/ 13812
LDC2005T30 Arabic Treebank: Part 4 v1.0 (MPG Annotation) Michael Berger Oct 14, 2005 /data/lorien1/data/atb_p4_v1/ 14458
LDC2005S22 Articulation Index Michael Berger Oct 14, 2005 /data/lorien1/data/artic_index/ 14458
LDC2005T33 BBN Pronoun Coreference and Entity Type Corpus Michael Berger Sep 16, 2005 /data/lorien1/data/bbn-pcet/ 14266
LDC2005S08 BBN/AUB DARPA Babylon Levantine Arabic Speech and Transcripts Michael Berger Feb 18, 2005 /data/lorien1/data/bablyon_arabic/ 13132
LDC2005T13 CCGbank Michael Berger May 23, 2005 /data/lorien1/data/ccgbank 13702
LDC2005T34 Chinese <-> English Name Entity Lists (v1.beta)        
LDC2005E47 Chinese English News Magazine Parallel Text        
LDC2005T10 Chinese English News Magazine Parallel Text Michael Berger Jun 20, 2005   13812
LDC2005T14 Chinese Gigaword Second Edition Michael Berger Aug 18, 2005 /data/lorien1/data/gigaword_cmn_v2// 14123
LDC2005T06 Chinese News Translation Text Part 1 Michael Berger Mar 16, 2005 /data/lorien1/data/CH_NEWS_TRANS/ 13293
LDC2005T23 Chinese Proposition Bank 1.0 Michael Berger Sep 16, 2005 /data/lorien1/data/cpb_ver1/ 14266
LDC2005T01 Chinese Treebank 5.0 Michael Berger Feb 18, 2005 /data/lorien1/data/ctb5 13132
LDC2005T01U01 Chinese Treebank 5.1        
LDC2005E43 CoNLL-2005 Shared Task Datasets        
LDC2005T08 Discourse Graphbank Michael Berger Mar 16, 2005 /data/lorien1/data/DISCOURSE_GRPH_B/ 13293
LDC2005T12 English Gigaword Second Edition Michael Berger Jul 22, 2005 /data/lorien1/data/gigaword_end_v2/ 14005
LDC2005E40B Fisher English Phase 2, Part 2, Training Speech        
LDC2005E39B Fisher English Phase 2, Part 2, Training Transcripts        
LDC2005S13 Fisher English Training Part 2, Speech Michael Berger Apr 14, 2005 /data/lorien1/data/FE_03_SP/ 13460
LDC2005T19 Fisher English Training Part 2, Transcripts Michael Berger Apr 14, 2005 /data/lorien1/data/FE_03_P2_TRAN/ 13460
LDC3001X01 Frank Test Corpus I        
LDC3001X10 Frank Test Corpus XV1        
LDC2005E69 GALE Kickoff Release - English-Arabic Parallel Treebank V1.0        
LDC2005S15 HKUST Mandarin Telephone Speech, Part 1 Michael Berger Jul 22, 2005 /data/lorien1/data/hkust_mcts_p1/ 14005
LDC2005T32 HKUST Mandarin Telephone Transcript Data, Part 1 Michael Berger Jul 22, 2005 /data/lorien1/data/hkust_mcts_p1tr/ 14005
LDC2005S14 Levantine Arabic QT Training Data Set 4 (Speech + Transcripts) Michael Berger Jun 20, 2005   13812
LDC2005T24 MDE RT-04 Training Data Text/Annotations Michael Berger Aug 18, 2005 /data/lorien1/data/mde_04_text_annot/ 14123
LDC2005S16 MDE RT04 Training Data Speech Michael Berger Aug 18, 2005 /data/lorien1/data/mde_04_speech_bnews/ 14123
LDC2005E14 MSE 2005 Sample Summary Topic        
LDC2005L01 Mawukakan Lexicon Michael Berger Apr 14, 2005 /data/lorien1/data/MAWU_LEXICON/ 13460
LDC2005T05 Multiple-Translation Arabic (MTA) Part 2 Michael Berger Feb 18, 2005 /data/lorien1/data/MTA_P2 13132
LDC2005S27 NIST 2003 Language Recognition Development Data II        
LDC2005E50 NTCIR Evaluation        
LDC2005S25 Santa Barbara Corpus of Spoken American English Part-IV Michael Berger Sep 16, 2005 /data/lorien1/data/sbcsae_4 14266
LDC2005S11 TDT4 Multilingual Broadcast News Speech Corpus Michael Berger May 23, 2005 /data/lorien1/data/TDT4* 13702
LDC2005T16 TDT4 Multilingual Text and Annotations Michael Berger May 23, 2005 /data/lorien1/data/tdt4_aem_txt/ 13702
LDC2005E44 TIDES MT 2003 Arabic Evaluation Set        
LDC2005E45 TIDES MT 2003 Chinese Evaluation Set        
LDC2005S28 West Point Croatian Speech Corpus Michael Berger Oct 14, 2005 /data/lorien1/data/wp_croatian/ 14458


Catalog Information for Non-Member Years
Catalog ID | Catalog Name | Requested By | Invoice Date | Location | Inv #
LDC98T24 1997 Mandarin Broadcast News Transcripts (Hub-4NE) Gina-Anne Levow Apr 01, 2005 /data/lorien1/data/HUB4_1997NE/ 13392
LDC2004E25 2003 HARD Annotations Gina-Anne Levow Jun 29, 2004 /data/lorien1/Data/HARD2004 11875
LDC96S36 Boston University Radio Speech Corpus Barbara Need Nov 16, 1998 /data/lorien1/Data/Tone/BURadio 4451
LDC96L14 CELEX2 Partha Niyogi Feb 06, 2001   6461
LDC96S30 CTIMIT Partha Niyogi Jun 20, 2002 Z399061 8268
LDC2002L27 Chinese-English Translation Lexicon Version 3.0 Gina-Anne Levow Jan 21, 2003   9044
LDC2004E42 HARD 2004 Reference Annotations Gina-Anne Levow Oct 18, 2004 /data/lorien1/Data/HARD2004 12522
LDC93S12 HCRC Map Task Corpus Gina-Anne Levow Apr 23, 2004 /data/lorien1/Data/Maptask 11412
LDC93S2 NTIMIT Partha Niyogi Jun 20, 2002 Z399061 8268
LDC93S10 TIDIGITS Professor A. Murua Feb 19, 1998 Z895150 3738
LDC93S1 TIMIT Acoustic-Phonetic Continuous Speech Corpus Partha Niyogi Jun 20, 2002 /data/lorien1/Data/TIMIT 8268
LDC98S72 Taiwanese Putonghua Speech and Transcripts Gina-Anne Levow Apr 01, 2005 /data/lorien1/Data/TWPTH 13392
LDC0000 To be filled Gina-Anne Levow Dec 01, 2004   12736
LDC99T42 Treebank-3 Derrick Higgins Apr 26, 2001   6735
N/A 20 Newsgroups Gina-Anne Levow   /data/lorien1/Data/20_newsgroups  
N/A Cross-language Evaluation Forum -2000; E/F/G/I Gina-Anne Levow   /data/lorien1/Data/CLEF  
N/A Cross-language Evaluation Forum -2004; E/Fr/Fi/Ru Gina-Anne Levow   /data/lorien1/Data/CLEF2004  
N/A CUHK Broadcast Cantonese-CUSENT Gina-Anne Levow   /data/lorien1/Data/CUSENT  
N/A ICSI Switchboard Close Transcription Gina-Anne Levow   /data/lorien1/Data/ICSI97  
N/A ICSI Meeting Recorder Dialogue Acts Gina-Anne Levow   /data/lorien1/Data/MRDA  
N/A LEAP Learners' English Gina-Anne Levow   /data/lorien1/data/LEAP_ENG  
N/A LEAP Learners' German Gina-Anne Levow   /data/lorien1/data/LEAP_GERMAN  
N/A NTCIR 4 Data: CJKE Gina-Anne Levow   /data/lorien1/Data/NTCIR4  
N/A Sun SpeechActs Data Gina-Anne Levow   /data/lorien1/Data/SpeechActs  
N/A TRAINS Dialogue Data Gina-Anne Levow   /data/lorien1/Data/Trains  
N/A Xu Focus Speech Data - Pitch tracks Gina-Anne Levow   /data/lorien1/Data/XuFocus1999  
N/A Xu Focus Raw Speech Gina-Anne Levow   /data/lorien1/Data/XuFocus1999_audio  
N/A Mini-newsgroups Gina-Anne Levow   /data/lorien1/Data/mini_newsgroups  
N/A Reuters Text Classification Data Gina-Anne Levow   /data/lorien1/Data/reuters  
N/A Switchboard Dialogue Acts Gina-Anne Levow   /data/lorien1/Data/swbda  
N/A McNeill Wombat Dialogues - DSP Gina-Anne Levow   /data/lorien1/Data/wombat