Newsgroups: sci.lang.japan,soc.culture.japan,fj.questions.misc,fj.kanji,fj.kanakan.wnn,fj.kanakan.misc
Path: galaxy.trc.rwcp.or.jp!jaist-news!cs.titech!wnoc-tyo-news!sh.wide!news.Hawaii.Edu!ames!elroy.jpl.nasa.gov!usc!math.ohio-state.edu!darwin.sura.net!sgiblab!adagio.panasonic.com!monu6!capek.rdt.monash.edu.au!jwb
From: jwb@capek.rdt.monash.edu.au (Jim Breen)
Subject: V93-005 of EDICT released [67,428 entries]
Message-ID: <jwb.744436437@capek.rdt.monash.edu.au>
Sender: news@monu6.cc.monash.edu.au (Usenet system)
Organization: Monash University, Melb., Australia.
Date: Wed, 4 Aug 1993 03:53:57 GMT
Lines: 43
Xref: galaxy.trc.rwcp.or.jp fj.questions.misc:5252 fj.kanji:1161 fj.kanakan.wnn:649 fj.kanakan.misc:241
X-originally-archived-at: http://galaxy.rwcp.or.jp/text/cgi-bin/newsarticle2?ng=fj.kanji&nb=1161&hd=a
X-reformat-date: Mon, 18 Oct 2004 15:18:22 +0900
X-reformat-comment: Tabs were expanded into 4 column tabstops by the Galaxy's archiver. See http://katsu.watanabe.name/ancientfj/galaxy-format.html for more info.


Yes, the next version of EDICT is now available on pub/nihongo
on monu6.cc.monash.edu.au. And yes, it really is about 22,000
entries bigger than the last version.

The main growth in EDICT since V93-004 has been the incorporation
of about 15,000 entries from the wnn gerodic project. I converted
all the jinmei in their files into EDICT format, added the Romaji
(by program, of course 8-)}), filtered out the entries that were
already in EDICT, and there they were, lots of new entries. About
3,000 of the gerodic entries were already in EDICT, but without the
"pn" marker, so the existing entries were updated. Curiously, about
2,000 of the "pn" entries in EDICT were not in the gerodic set.
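The merge step described above can be sketched roughly as follows. This is
an illustrative Python sketch, not the program actually used. It assumes the
usual one-entry-per-line layout "KANJI [KANA] /gloss/gloss/" and that the
name marker appears as the text "(pn)" inside a gloss; the function names
parse_key, add_marker and merge_names are made up for the example.

```python
def parse_key(line):
    """Return (headword, reading) for a line like 'KANJI [KANA] /gloss/'."""
    head, _, _ = line.partition(" /")
    if " [" in head:
        word, _, reading = head.partition(" [")
        return word.strip(), reading.rstrip("] ").strip()
    # Kana-only entries have no bracketed reading.
    return head.strip(), head.strip()


def add_marker(line, marker):
    """Append the marker to the first gloss of an entry."""
    parts = line.rstrip().rstrip("/").split("/")
    parts[1] = parts[1] + " " + marker
    return "/".join(parts) + "/"


def merge_names(edict_lines, name_lines, marker="(pn)"):
    """Add unseen name entries; mark entries that exist without the marker."""
    merged = {parse_key(line): line for line in edict_lines}
    added = updated = 0
    for line in name_lines:
        key = parse_key(line)
        if key not in merged:
            merged[key] = line          # genuinely new entry
            added += 1
        elif marker not in merged[key]:
            merged[key] = add_marker(merged[key], marker)
            updated += 1                # already present, but unmarked
    return list(merged.values()), added, updated
```

Keying on the (headword, reading) pair is a design choice for the sketch:
two names written with the same kanji but read differently stay separate.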

All this happened some months ago. I have delayed the release until
the latest versions of JDIC (2.3) and XJDIC (1.1) were released,
because these include the option of suppressing "name" entries.

I have also released a little utility program, ESPLIT (.c & .exe),
which splits EDICT into name and non-name files for users who wish
to avoid the names altogether.
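For readers without a C compiler, the kind of split described above can be
approximated in a few lines of Python. This is only a sketch, not the ESPLIT
source: it assumes name entries can be recognised by the text "(pn)" in the
line and that the file is EUC-JP encoded. The names split_names and
split_file, and the file paths passed in, are hypothetical.

```python
def split_names(lines, marker="(pn)"):
    """Partition entry lines into (name_entries, other_entries)."""
    names, others = [], []
    for line in lines:
        (names if marker in line else others).append(line)
    return names, others


def split_file(src, names_path, others_path, encoding="euc_jp"):
    """Read src and write the two partitions to separate files."""
    with open(src, encoding=encoding) as f:
        names, others = split_names(f.readlines())
    with open(names_path, "w", encoding=encoding) as f:
        f.writelines(names)
    with open(others_path, "w", encoding=encoding) as f:
        f.writelines(others)
```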

I did not include some of the more bizarre entries from gerodic,
and I had quite a bit of trouble with some entries which were
obviously Chinese, and where the kana appeared to be an attempt to
map the PinYin. Needless to say, my kana->romaji routine had acute
indigestion with some of the moras.
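To show why unusual kana sequences upset a table-driven converter, here is
a deliberately tiny Python sketch of the general technique. It is not the
routine actually used: the tables cover only a handful of kana, the romaji
is a simplified Hepburn-like spelling, and the names DIGRAPHS, KANA and
kana_to_romaji are invented for the example. Anything not in the tables
comes out as "?".

```python
# Two-kana combinations are tried before single kana.
DIGRAPHS = {"きゃ": "kya", "きゅ": "kyu", "きょ": "kyo",
            "しゃ": "sha", "しゅ": "shu", "しょ": "sho"}
KANA = {"あ": "a", "い": "i", "う": "u", "え": "e", "お": "o",
        "か": "ka", "き": "ki", "く": "ku", "け": "ke", "こ": "ko",
        "さ": "sa", "し": "shi", "す": "su", "せ": "se", "そ": "so",
        "た": "ta", "ち": "chi", "つ": "tsu", "て": "te", "と": "to",
        "な": "na", "に": "ni", "ぬ": "nu", "ね": "ne", "の": "no",
        "ん": "n"}


def kana_to_romaji(kana):
    """Convert hiragana to romaji; unknown kana become '?'."""
    out = []
    i = 0
    double = False
    while i < len(kana):
        if kana[i] == "っ":            # small tsu doubles the next consonant
            double = True
            i += 1
            continue
        pair = kana[i:i + 2]
        if pair in DIGRAPHS:
            syllable = DIGRAPHS[pair]
            i += 2
        elif kana[i] in KANA:
            syllable = KANA[kana[i]]
            i += 1
        else:
            syllable = "?"             # no table entry: the "indigestion" case
            i += 1
        if double and syllable[0] not in "aeiou?":
            syllable = syllable[0] + syllable
        double = False
        out.append(syllable)
    return "".join(out)
```

With these tables, ordinary readings convert cleanly, while a sequence such
as "てぃ" (て followed by a small ぃ) falls through to "?" for the second
character.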

The other main source of new entries has been the indefatigable
Kurt Stueber, who has delivered many thousands of entries in the
past months. To Kurt, and the other contributors, my deep 
gratitude.

This will probably be the second-last update to EDICT this year.
Late next month I will be going to France for 4 months (sabbatical),
and I am most unlikely to be able to use the Unix facilities which
are essential to my EDICT maintenance. If you are hatching a batch of
new entries, don't delay too long in mailing them to me.
-- 
Jim Breen          [ジム.ブリーン@モナシュ大学]
Department of Robotics & Digital Technology. Monash University. 
Wellington Rd, Clayton VIC 3168 Australia (ph) +61 3 565 3298 
(fax) +61 3 565 3574  AARNet/Internet:j.breen@rdt.monash.edu.au  
