Newsgroups: fj.editor.emacs
Path: galaxy.trc.rwcp.or.jp!sparky!uunet!ccut!sh.wide!cc-keio!cs-keio!fukumoto
From: fukumoto@aa.cs.keio.ac.jp (FUKUMOTO Atsushi)
Subject: Re: Does a thesaurus program exist?
In-Reply-To: ike@bi.a.u-tokyo.ac.jp's message of 8 Apr 92 08:17:24 GMT
Message-ID: <FUKUMOTO.92Apr8220548@lyra.aa.cs.keio.ac.jp>
Sender: news@sync.cs.keio.ac.jp
Nntp-Posting-Host: lyra.aa.cs.keio.ac.jp
Organization: Anzai Lab., Keio Univ., Yokohama, Japan.
References: <IKE.92Apr8171724@presto.bi.a.u-tokyo.ac.jp>
Distribution: fj
Date: Wed, 8 Apr 1992 13:05:48 GMT
Lines: 184
Xref: galaxy.trc.rwcp.or.jp fj.editor.emacs:3008
X-originally-archived-at: http://galaxy.rwcp.or.jp/text/cgi-bin/newsarticle2?ng=fj.editor.emacs&nb=3008&hd=a
X-reformat-date: Mon, 18 Oct 2004 15:18:22 +0900
X-reformat-comment: Tabs were expanded into 4 column tabstops by the Galaxy's archiver. See http://katsu.watanabe.name/ancientfj/galaxy-format.html for more info.


In article <IKE.92Apr8171724@presto.bi.a.u-tokyo.ac.jp>,
ike@bi.a.u-tokyo.ac.jp (Mitsunori Ikeguchi) writes:
> As for spell checkers, there is a wonderful program, ispell, which I
> find very handy; a thesaurus that works the same way would be useful.


  The following exists.  I have been meaning to try it sometime, but I
have not used it yet, so I cannot say how usable it actually is.


        Fukumoto @ Anzai-Amano Lab., Dept. of Computer Science, Keio Univ.
                      (This account will disappear soon (I think))



From: darrylo@hpnmxx.sr.hp.com (Darryl Okahata)
Newsgroups: gnu.emacs.sources
Subject: A thesaurus for GNU Emacs!
Message-ID: <9112190324.AA05734@hpnmxx.sr.hp.com>
Date: 19 Dec 91 03:24:03 GMT
Distribution: gnu
Organization: Source only  Discussion and requests in gnu.emacs.help.
Lines: 681


     Here's a holiday present -- a thesaurus interface for GNU Emacs.
The people at Project Gutenberg were nice enough to release a copy of
the 1911 Roget's Thesaurus (yes, that's "nineteen-eleven"), and so I'm
releasing a simple interface to it.

     The first part of the README file appears below, followed by a
shar file containing the interface and indexing routines.

     -- Darryl Okahata
Internet: darrylo@sr.hp.com

DISCLAIMER: this message is the author's personal opinion and does not
constitute the support, opinion or policy of Hewlett-Packard or of the
little green men that have been following him all day.

===============================================================================
[ IMPORTANT --> Be sure to read the section on requirements! ]

     This is a very ALPHA-TEST implementation of a thesaurus for GNU
Emacs.  Although it is not complete, I'm not sure when or if I'll have
the time to spiff it up.  As a result, I'm posting what I have here (is
anyone else working on something similar?).  It's copyrighted and is
being released under the GNU General Public License (see the end of
this file for more details).  Note that only this interface falls under
the GNU General Public License; the thesaurus itself has a completely
separate and independent "copyright".

     The Emacs-Lisp functions in this package allow you to query a
thesaurus for synonyms of a word.  For example, you can ask Emacs to
quickly display a thesaurus entry for "editor":

-------------------------------------------------------------------------------
***** Word: editor

     #593. Book. -- N. booklet; writing, work, volume, tome, opuscule;
tract, tractate; livret; brochure, libretto, handbook, codex, manual,
pamphjlet, enchiridion, circular, publication; chap book.
     part, issue, number livraison; album, portfolio; periodical, serial,
magazine, ephemeris, annual, journal.
     paper, bill, sheet, broadsheet; leaf, leaflet; fly leaf, page; quire,
ream
     chapter, section head, article paragraph, passage, clause.
     folio, quarto, octavo; duodecimo, sextodecimo, octodecimo.
     encyclopedia; encompilation;  library, bibliotheca; press &c.
(publication) 531.
     writer, author, litterateur, essayist, journalism; pen, scribbler, the
scribbling race; literary hack, Grub-street writer; writerr for the press,
gentleman of the press, representative of the press; adjective jerker,
diaskeaus, ghost, hack writer, ink slinger; publicist; reporter, penny a
liner; editor, subeditor; playwright &c. 599; powt &c. 597.
     bookseller, publisher; bibliopole, bibliopolist; librarian; bookstore,
bookseller's shop.
     knowledge of books, bibliography; book learning &c. (knowledge) 490.
     Phr. "among the giant fossils of my past" [E. B. Browning]; craignez
tout d'un auteur en courroux; "for authors nobler palms remain" [Pope]; "I
lived to write and wrote to live" [Rogers]; "look in thy heart and write"
[Sidney]; "there is no Past so long as Books shall live" [Bulwer Lytton);
"the public mind is the creation of the Master-Writers" [Disraeli]; volumes
that I prize above my dukedom" [Tempest].
-------------------------------------------------------------------------------


*******************************************************************************
***** REQUIREMENTS:

     To use this, you need the following (besides the files that came
with this README file):

* A copy of the thesaurus itself (which is not included with this README
  file).  Thanks to Project Gutenberg, a copy of the 1911 Roget's
  Thesaurus has been made available via anonymous ftp from
  mrcnext.cso.uiuc.edu [ 128.174.201.12 ] (please ftp the file during
  off-hours -- at times OTHER THAN 10:00 AM to 6:00 PM Central Standard
  Time (Daylight in summer)).  It's in the directory "/etext":

-rw-r--r--  1 24       micro    1377400 Jun 19 18:08 roget11.txt
-rw-r--r--  1 24       micro     592247 Jun 19 18:13 roget11.zip

  You only need one of these, as roget11.zip is just roget11.txt
  compressed into a .ZIP file.  Note the sizes, however: the text file
  is about 1.3MB, and the .ZIP file is less than half that.

* A copy of Perl 4.0, compiled with dbm/ndbm support, as the thesaurus
  indexing and low-level access routines are written as Perl scripts
  (this was done to avoid having to load the entire 1.3MB thesaurus into
  Emacs, bloating its process size).  Part of the index is stored as a
  dbm database, and so dbm/ndbm support must be compiled into Perl.

* While building the index (an index must be built from the raw
  thesaurus data), it is recommended that your system have plenty of
  free RAM and swap space, as a single 10-12 megabyte process is created
  during the indexing process.  Once the index is created, far fewer
  resources are needed to access the thesaurus.

* You need about two megabytes of free disk space.  The thesaurus
  occupies about 1.3MB, and the index files occupy another half
  megabyte or so.


     Installation instructions are mentioned below.


*******************************************************************************
***** USAGE:

     The GNU Emacs interface provides three functions:

thesaurus-lookup-word
     This function will prompt for a word to look up, and all entries
     that begin with this word will be displayed.  To display only the
     entry that contains exactly this word, give a prefix argument
     (e.g. C-u).

thesaurus-lookup-word-in-text
     This function will extract the word under the cursor and run
     `thesaurus-lookup-word' on it.  A prefix argument can be given to
     force the display of only the entry that contains this word.

thesaurus-show-words
     This function will prompt for a word and will display all words
     in the thesaurus that begin with this word.

These functions should be bound to some key sequences; however, this
package does not do this.  You'll have to do it yourself.
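
     As a rough sketch of what such bindings might look like, the
following could go in your ~/.emacs.  The "C-c t" key sequences are only
an assumption chosen for illustration; the package itself defines no
keys, so pick whatever suits you.  The three function names are the
ones listed above.

```elisp
;; Example bindings only -- the "C-c t ..." keys are an assumption,
;; not something the thesaurus package defines.
;; The thesaurus functions must be loaded before the keys are used.
(global-set-key "\C-ctl" 'thesaurus-lookup-word)          ; C-c t l
(global-set-key "\C-ctt" 'thesaurus-lookup-word-in-text)  ; C-c t t
(global-set-key "\C-cts" 'thesaurus-show-words)           ; C-c t s
```

With bindings like these, "C-u C-c t l" would pass a prefix argument to
thesaurus-lookup-word.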

     There is also a shell-command-line interface to the thesaurus
(which is what the GNU Emacs interface uses).  Using the "th" Perl
script, you can query the thesaurus for a number of things:

th <word> [<word> ...]
     Search the thesaurus for all entries that begin with
     "<word>".  Multiple words can be specified here.

th -V <word> [<word> ...]
     Search the thesaurus for all entries that begin with
     "<word>".  All displayed entries are separated by a line
     of dashes.

th -W <word> [<word> ...]
     Search the thesaurus for the entry that contains
     "<word>" exactly.

th -w <word> [<word> ...]
     Display all words in the thesaurus that begin with
     "<word>".

th -w -v <word> [<word> ...]
     Display all words in the thesaurus that begin with
     "<word>".  Alongside each word, the numbers of the
     entries that contain the word are displayed.

th -n <number>
     Display thesaurus entry number "<number>".  Unlike a
     word, only one number can be specified.

Generally, you will want to pipe the output to more(1) or less(1).
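
     Entries refer to one another by number (for example "&c. 599" in
the sample above), and none of the three Emacs functions listed earlier
takes an entry number.  A minimal sketch of a helper that fills this gap
by calling "th -n" from Emacs might look like the following.  The
function name my-thesaurus-show-entry is hypothetical and is not part of
the package, and the sketch assumes the "th" script is on the PATH that
Emacs uses.

```elisp
;; Hypothetical helper, NOT part of the thesaurus package.
;; Runs "th -n NUMBER" through the shell and lets Emacs display
;; the output.  Assumes "th" can be found on Emacs's PATH.
(defun my-thesaurus-show-entry (number)
  "Display thesaurus entry NUMBER by running \"th -n NUMBER\"."
  (interactive "nEntry number: ")
  (shell-command (format "th -n %d" number)))
```

Typing "M-x my-thesaurus-show-entry" and answering 599 at the prompt
would then run "th -n 599".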


#---------------------------------- cut here ----------------------------------
(The rest is omitted.)
