This package consists of Perl modules along with supporting Perl programs that
implement the semantic relatedness measures described by Leacock and Chodorow
(1998), Jiang and Conrath (1997), Resnik (1995), Lin (1998), and Hirst and
St-Onge (1998), and the adapted gloss overlap measure by Banerjee and Pedersen
(2002). The Perl modules are designed as object classes whose methods take two
word senses as input and return their semantic relatedness. A quantitative
measure of the degree to which two word senses are related has wide-ranging
applications, such as word sense disambiguation and information retrieval. For
example, to determine which sense of a given word is being used in a particular
context, the sense with the highest relatedness to the senses of its context
words is the most likely candidate. Similarly, in information retrieval,
retrieving documents that contain highly related concepts is likely to yield
higher precision and recall.
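
The disambiguation idea described above can be sketched in a few lines. This is
only an illustration in Python, not the package's Perl API: the function
pick_sense, the toy score table, and the sense labels are all hypothetical, and
a real application would plug in one of the relatedness measures instead of the
invented scores.

```python
def pick_sense(candidates, context_senses, relatedness):
    """Return the candidate sense with the highest total relatedness
    to the given context senses."""
    def total(sense):
        return sum(relatedness(sense, c) for c in context_senses)
    return max(candidates, key=total)


# Toy symmetric scores, invented purely for illustration.
TOY_SCORES = {
    frozenset(("bank#n#1", "money#n#1")): 0.8,
    frozenset(("bank#n#1", "river#n#1")): 0.1,
    frozenset(("bank#n#2", "money#n#1")): 0.1,
    frozenset(("bank#n#2", "river#n#1")): 0.9,
}


def toy_relatedness(a, b):
    """Look up a score for an unordered pair of senses (0.0 if unknown)."""
    return TOY_SCORES.get(frozenset((a, b)), 0.0)


best = pick_sense(["bank#n#1", "bank#n#2"], ["river#n#1"], toy_relatedness)
print(best)  # prints "bank#n#2"
```

Summing over the context senses is one simple way to combine scores; taking the
maximum or an average would be equally plausible choices.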
A command-line interface to these modules is also included in the package. This
simple, user-friendly interface returns the relatedness of two given words. A
number of switches and options are provided to modify the output and to add
trace information and other useful details. Usage is described in other
sections of this README. Supporting utilities for generating information
content files from various corpora are also available in the package. The
information content files are required by three of the measures for computing
the relatedness of concepts.
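
To show why corpus-derived files are needed, here is a minimal Python sketch of
how information content is commonly defined and used. It assumes the standard
definition IC(c) = -log p(c) and the usual formulations of the Resnik and Lin
measures (presumably two of the three measures referred to above). The concept
names and counts are invented, and the sketch does not reproduce the package's
file format or its Perl implementation.

```python
import math


def information_content(counts, root):
    """Map each concept to -log(p), where p is its count divided by the
    count of the root concept. Counts are assumed to be cumulative, i.e.
    each count already includes the counts of all descendants."""
    total = counts[root]
    return {c: -math.log(n / total) for c, n in counts.items() if n > 0}


def resnik(ic, lcs):
    """Resnik similarity: information content of the lowest common subsumer."""
    return ic[lcs]


def lin(ic, a, b, lcs):
    """Lin similarity: 2 * IC(lcs) / (IC(a) + IC(b)), or 0.0 if undefined."""
    denom = ic[a] + ic[b]
    return 2 * ic[lcs] / denom if denom > 0 else 0.0


# Toy cumulative counts, invented for illustration.
counts = {"entity": 100, "animal": 20, "dog": 5, "cat": 5}
ic = information_content(counts, "entity")

print(resnik(ic, "animal"))
print(lin(ic, "dog", "cat", "animal"))
```

Rare concepts get high information content and the root gets zero, which is
why the scores depend on the corpus the counts were drawn from.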
WordNet::QueryData provides a direct interface to the WordNet database files.
It requires the WordNet package. It gives the user direct access to the full
WordNet semantic lexicon. All parts of speech are supported, and access is
generally very efficient because the index and morphological exclusion tables
are loaded at initialization. This initialization step is slow (approx. 10-15
seconds), but queries are very fast thereafter: thousands of queries can be
completed every second.
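
The trade-off described above (a slow one-time load followed by fast lookups)
can be illustrated with a small Python sketch. The class ToyLexicon and its
whitespace-separated "word sense1 sense2" line format are hypothetical; they
are not the WordNet file layout or the WordNet::QueryData API.

```python
class ToyLexicon:
    """Parse all index lines once, then answer lookups from memory."""

    def __init__(self, index_lines):
        # One-time cost: every line is parsed into a dictionary entry.
        self._senses = {}
        for line in index_lines:
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            self._senses[fields[0]] = fields[1:]

    def senses(self, word):
        # Each query is a single dictionary lookup.
        return self._senses.get(word, [])


lex = ToyLexicon(["dog canine hound", "", "cat feline"])
print(lex.senses("dog"))   # prints ['canine', 'hound']
print(lex.senses("bird"))  # prints []
```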
o Fix hanging problems.
o Update Italian patch.
o Remove DocBook 3.0 patch and dependency.
Submitted by: Alex Dupre <sysadmin@alexdupre.com>
PR: ports/50030
* kill devel/libtool and move to devel/libtool13, upgrading to 1.3.5
* upgrade repo-copied devel/libtool14 to 1.4.3
* break out libltdl into its own separate port
* move to version-numbered binaries/scripts (i.e., there is *no* 'libtool'
any more -- USE_LIBTOOL and USE_LIBTOOL_VER are your friends)
Approved by: portmgr (kris) - for the bsd.port.mk hooks
Tested by: bento 4-exp builds (repeatedly)
-added support for different Ghostscript output devices (-dev)
-updated Xpdf to 2.02 (security fix)
-a couple of fixes and tweaks for better output
-ported duplicate lines elimination from pdftotext (does not affect complex mode)
-fixed bug which caused bold to spread from one sentence to the entire document
-support for document outlines (patch by Nicolas Pitre)