MTRoget: Roget's Thesaurus Machine-Translated

Researchers: Gerard de Melo, Gerhard Weikum

Introduction

MTRoget is a collection of thesauri in different languages, obtained by machine-translating Roget's Thesaurus, the best-known thesaurus in the English-speaking world. Rather than relying on a conventional machine translation engine, a thesaurus-specific translation technique was used.
Translating a thesaurus differs from translating prose in several ways. Most importantly, the translator may omit words that cannot be translated reliably, so a higher overall accuracy is possible. Additionally, a single source word may be translated to more than one target word.
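
The two differences described above (omitting uncertain words and allowing several target words per source word) can be illustrated with a toy sketch. This is not the method used to build MTRoget: the candidate dictionary, the confidence scores, the threshold, and the function name translate_entry are all invented for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical candidate translations with made-up confidence scores.
# In a real system these would come from bilingual resources and a scoring model.
CANDIDATES: Dict[str, List[Tuple[str, float]]] = {
    "house": [("maison", 0.95), ("demeure", 0.80), ("chambre", 0.30)],
    "abode": [("demeure", 0.85)],
    "digs": [("fouilles", 0.20)],
}


def translate_entry(
    words: List[str],
    candidates: Dict[str, List[Tuple[str, float]]],
    threshold: float = 0.5,
) -> List[str]:
    """Translate one thesaurus entry (a list of near-synonyms).

    Every candidate scoring at or above the threshold is kept, so a source
    word may yield several target words. Source words with no candidate
    above the threshold are simply omitted.
    """
    result: List[str] = []
    for word in words:
        for target, score in candidates.get(word, []):
            # Keep confident candidates only, without duplicates.
            if score >= threshold and target not in result:
                result.append(target)
    return result


entry = ["house", "abode", "digs"]
print(translate_entry(entry, CANDIDATES))  # ['maison', 'demeure']
```

Here "house" contributes two target words, while "digs" is dropped because its only candidate falls below the assumed threshold. Raising the threshold trades coverage for accuracy.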

Browse Online

A portion of the data can be browsed at lexvo.com, though without the sense-specific listings and the topic hierarchy included in the downloads offered below.

References

For academic use, please cite the following publication:

Gerard de Melo, Gerhard Weikum.
Mapping Roget's Thesaurus and WordNet to French.
In: Proceedings of the 6th Language Resources and Evaluation Conference (LREC 2008), ELRA, Paris, France.

Downloads

MTRoget 2010-05-18 Bulgarian Thesaurus
MTRoget 2010-05-18 Catalan Thesaurus
MTRoget 2010-05-18 Czech Thesaurus
MTRoget 2010-05-18 Mandarin Chinese Thesaurus
MTRoget 2010-05-18 Danish Thesaurus
MTRoget 2010-05-18 German Thesaurus
MTRoget 2010-05-18 Modern Greek (1453-) Thesaurus
MTRoget 2010-05-18 Esperanto Thesaurus
MTRoget 2010-05-18 Finnish Thesaurus
MTRoget 2010-05-18 French Thesaurus
MTRoget 2010-05-18 Croatian Thesaurus
MTRoget 2010-05-18 Hungarian Thesaurus
MTRoget 2010-05-18 Armenian Thesaurus
MTRoget 2010-05-18 Indonesian Thesaurus
MTRoget 2010-05-18 Italian Thesaurus
MTRoget 2010-05-18 Japanese Thesaurus
MTRoget 2010-05-18 Georgian Thesaurus
MTRoget 2010-05-18 Dutch Thesaurus
MTRoget 2010-05-18 Polish Thesaurus
MTRoget 2010-05-18 Portuguese Thesaurus
MTRoget 2010-05-18 Romanian Thesaurus
MTRoget 2010-05-18 Russian Thesaurus
MTRoget 2010-05-18 Slovak Thesaurus
MTRoget 2010-05-18 Spanish Thesaurus
MTRoget 2010-05-18 Swedish Thesaurus
MTRoget 2010-05-18 Thai Thesaurus
MTRoget 2010-05-18 Turkish Thesaurus
MTRoget 2010-05-18 Ukrainian Thesaurus

License

Available under a CC-BY-SA 3.0 license. The original source material was compiled by Peter Mark Roget. It was later updated and released as an electronic text by Patrick Cassidy, and is distributed by Project Gutenberg (Version 1.02 Edition 15a).

Contact

Please get in touch with Gerard de Melo if you have any questions, suggestions, or research ideas.

 



© 2020 Gerard de Melo