SPC - Stockholm Parallel Corpora
This is a collection of parallel corpora compiled by Hercules Dalianis and his research group for bilingual dictionary construction.
More information in:
- Hercules Dalianis, Hao-chun Xing, Xin Zhang: Creating a Reusable English-Chinese Parallel Corpus for Bilingual Dictionary Construction, In Proceedings of LREC2010 (source: http://people.dsv.su.se/~hercules/SEC/)
- Konstantinos Charitakis (2007): Using Parallel Corpora to Create a Greek-English Dictionary with UPLUG, In Proceedings of NODALIDA 2007
- Afrikaans-English: Aldin Draghoender and Mattias Kanhov: Creating a reusable English – Afrikaans parallel corpora for bilingual dictionary construction
4 languages, 3 bitexts
total number of files: 6
total number of tokens: 1.32M
total number of sentence fragments: 0.15M
Please cite the following article if you use any part of the corpus in your own work:
Jörg Tiedemann, 2009, News from OPUS - A Collection of Multilingual Parallel Corpora with Tools and Interfaces. In N. Nicolov and K. Bontcheva and G. Angelova and R. Mitkov (eds.) Recent Advances in Natural Language Processing (vol V), pages 237-248, John Benjamins, Amsterdam/Philadelphia