QTLP English-Portuguese Corpus for the AUTOMOTIVE domain


This dataset was acquired in the framework of QTLP (http://www.qt21.eu/launchpad/), an EU FP7-funded project under Grant Agreement 296347. The dataset contains automatically detected pairs of parallel documents that were acquired from the web (i.e. from multilingual sites which contain content in the targeted languages and domain). The majority of the crawled sites were: i) websites of automobile manufacturers and ii) websites of companies that produce car accessories or car parts. In addition, this dataset includes automatically aligned sentences that were extracted from the pairs of parallel documents. The pairs of parallel documents have been classified (based on specific patterns detected in the URL or the title of the documents) into one of the following genre categories: "Reference", "News/Journalism", "Discussion", "Commercial" and "Information".

If you want your webpage/website to be removed from these corpora, please contact us.
