On Recent Activities of Oriental COCOSDA

Aug. 28, 1995

To those who are interested in Oriental COCOSDA

A long time has passed since I sent you a proposal for Oriental COCOSDA in 
October 1994 by E-mail or fax.  I have not, however, received any response 
from China or Korea except an E-mail from Dr. C-W Jo asking about the recent 
situation of Oriental COCOSDA.

I would like to tell you about the recent situation regarding speech and 
text corpora in Japan.

1. A CD-ROM containing spoken dialogues was created under the Priority Area 
   Project "Spoken Dialogue" led by Prof. S. Doshita of Kyoto University. 
   This CD-ROM contains 32 simulated dialogues on an electronic secretary 
   system, geographical and tourist guide, schedule management, crossword 
   puzzle, telephone shopping and the MAP TASK. The sampling rate is 16 kHz.  
   This project is sponsored by the Ministry of Education, Science and 
   Culture during the period from 1993 to 1995.

2. The Speech Database Committee of the Acoustical Society of Japan plans to 
   collect a read-speech corpus based on the Nikkei Shimbun, an economic 
   newspaper, modeled on the WSJ corpus in the U.S.A.

3. The Nikkei Shimbun has released text corpora of newspaper articles.  
   There are 5 CD-ROMs, each of which contains one year of newspaper 
   articles, covering 1990 to 1994.  The price is Y130,000 per CD-ROM. 

4. EDR (Japan Electronic Dictionary Research Institute, Ltd.) has released 
   various dictionaries in electronic form. They are the Japanese word 
   dictionary (250k words), English word dictionary (190k words), Concept 
   dictionary (400k items), Japanese-English dictionary (230k words), 
   English-Japanese dictionary (160k words), Japanese co-occurrence 
   dictionary (900k items) with Japanese text corpus (220 sentences), 
   English co-occurrence dictionary (460k items) with English text corpus 
   (160k sentences), and Technical term dictionary (120k Japanese words, 80k 
   English words).  You will find more detailed information on the WWW at:

5. The Mainichi Shimbun newspaper also plans to release CD-ROMs.

Oriental COCOSDA Activities:

1. I am going to attend Eurospeech'95 and the COCOSDA meeting in Madrid.
   Please let me know who will attend the meetings.

2. I am planning to report on the recent situation of speech/text corpora in 
   Oriental countries. I would appreciate it if you could tell me about 
   speech/text corpora created recently, or about the recent situation 
   regarding speech/text corpora, in your country.

3. Please revise your address, telephone/fax number and E-mail address 
   attached below, if necessary.  Prof. C-W Jo's new E-mail address is
   shown in the list.

Existing organizations related to spoken language processing:

1)Chinese COCOSDA
  Prof. Jialu Zhang and several members
2)KCCSLP: Korean Coordinating Committee for Spoken Language Processing
  Prof. Souguil Ann and several members
3)Speech Database Committee, Acoustical Society of Japan
  Prof. S. Itahashi and 29 members
4)Speech Input/Output Systems Expert Committee, JEIDA
  Prof. S. Itahashi and 19 members
  JEIDA: Japan Electronic Industry Development Association
5)LRSI: Linguistic Resources Sharing Initiative
  Dr. T. Yokoi and 24 members
6)Database Workshop of RWCP (Real World Computing Partnership) 
  Prof. S. Itahashi and 14 members
7)ATR Interpreting Telecommunications Research Laboratories
  Dr. Yamazaki and many members
9)Grant-in-Aid for Scientific Research on Priority Areas Project 
  "Spoken Dialogue"    
  Prof. S. Doshita and 100 members
10)Monbusho International Scientific Research Program: Joint Research on 
   "Spoken Language Database"
  Prof. H. Fujisaki and 14 members