
Copyright(C)2002-2004 National Institute of Information and Communications Technology. All Rights Reserved.

The Structure of the EDR Electronic Dictionary

The EDR Electronic Dictionary is composed of five types of dictionaries (Word, Bilingual, Concept, Co-occurrence, and Technical Terminology), as well as the EDR Corpus.


The Word Dictionary has two basic roles: it relates words to the concepts they express, and it provides the grammatical attributes that apply to these word-concept relations. The Japanese Word Dictionary contains approximately 270,000 words, and the English Word Dictionary contains approximately 190,000 words.
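The word-to-concept mapping described above can be pictured with a small Python sketch. It is an illustration only: the `WordEntry` fields, the concept identifiers, and the `concepts_for` helper are hypothetical and do not reproduce the actual EDR record layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordEntry:
    """One word-concept pairing plus a grammatical attribute (hypothetical layout)."""
    headword: str
    concept_id: str       # made-up identifier, not a real EDR concept ID
    part_of_speech: str


# Illustrative entries: one headword can be linked to several concepts.
ENTRIES = [
    WordEntry("bank", "c_financial_institution", "noun"),
    WordEntry("bank", "c_river_edge", "noun"),
    WordEntry("bank", "c_deposit_money", "verb"),
]


def concepts_for(headword, entries, part_of_speech=None):
    """Return the concept IDs linked to a headword, optionally filtered by part of speech."""
    return [
        entry.concept_id
        for entry in entries
        if entry.headword == headword
        and (part_of_speech is None or entry.part_of_speech == part_of_speech)
    ]
```

Calling `concepts_for("bank", ENTRIES)` returns all three concept IDs, while passing `part_of_speech="verb"` narrows the result to the single verbal sense.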

The Bilingual Dictionary lists the correspondences between headwords in the different languages. The Japanese-English Bilingual Dictionary contains approximately 230,000 words, and the English-Japanese Bilingual Dictionary contains approximately 160,000 words.

The Concept Dictionary contains information on the approximately 410,000 concepts listed in the Word Dictionary and is divided according to information type into the Headconcept Dictionary, the Concept Classification Dictionary, and the Concept Description Dictionary. The Headconcept Dictionary describes information on the concepts themselves. The Concept Classification Dictionary describes the super-sub relations among the approximately 410,000 concepts. The "super-sub" relation refers to the inclusion relation between concepts, and the set of interlinked concepts can be regarded as a type of thesaurus. The Concept Description Dictionary describes the semantic (binary) relations, such as 'agent,' 'implement,' and 'place,' between concepts that co-occur in a sentence.
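As a rough illustration of the two kinds of relations described above, the following Python sketch stores super-sub links as (super, sub) pairs and semantic relations as (relation, source, target) triples. The concept identifiers, the direction of the triples, and the helper names are all assumptions made for this example; the real dictionaries use their own record formats.

```python
from collections import defaultdict

# Hypothetical (super-concept, sub-concept) pairs, as in a classification dictionary.
SUPER_SUB = [
    ("c_animal", "c_bird"),
    ("c_animal", "c_dog"),
    ("c_bird", "c_sparrow"),
]

# Hypothetical (relation, source concept, target concept) triples,
# as in a description dictionary. The direction is an assumption.
DESCRIPTIONS = [
    ("agent", "c_eat", "c_bird"),
    ("place", "c_eat", "c_garden"),
]


def build_super_index(pairs):
    """Map each sub-concept to the set of its direct super-concepts."""
    supers = defaultdict(set)
    for sup, sub in pairs:
        supers[sub].add(sup)
    return supers


def ancestors(concept, supers):
    """Collect every super-concept reachable by following super-sub links upward."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for sup in supers.get(current, ()):
            if sup not in seen:
                seen.add(sup)
                stack.append(sup)
    return seen


def relations_from(concept, triples):
    """List the (relation, target) pairs recorded for a source concept."""
    return [(rel, target) for rel, source, target in triples if source == concept]


supers = build_super_index(SUPER_SUB)
```

Following the links upward from `c_sparrow` reaches `c_bird` and then `c_animal`, which is the thesaurus-like view of the hierarchy. Using a set per concept allows a concept to have more than one super-concept, which is an assumption of this sketch rather than a statement about the dictionary itself.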

The Co-occurrence Dictionary describes collocational information in the form of binary relations. The Japanese Co-occurrence Dictionary contains approximately 900,000 phrases, and the English Co-occurrence Dictionary contains approximately 460,000 phrases.

The Technical Terminology Dictionary covers the field of information processing and is split into four dictionaries: Word, Bilingual, Concept (Classification), and Co-occurrence.

The linguistic data in the EDR Corpus was obtained by collecting a large number of example sentences and analyzing them at the morphological, syntactic, and semantic levels. The Japanese Corpus contains approximately 200,000 sentences, and the English Corpus contains approximately 120,000 sentences.