mirror of
https://git.FreeBSD.org/ports.git
synced 2024-12-28 05:29:48 +00:00
abb5037267
PR: ports/132695
Submitted by: Wen Heping <wenheping at gmail.com>
17 lines
894 B
Plaintext
PyStemmer provides access to efficient algorithms for calculating a
"stemmed" form of a word. This is a form with most of the common
morphological endings removed, ideally leaving a common linguistic
base form. This is most useful in building search engines and
information retrieval software; for example, a search with stemming
enabled should be able to find a document containing "cycling" given the
query "cycles".

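The search example above can be sketched in a few lines of Python. This is
only a toy illustration under stated assumptions: toy_stem, SUFFIXES,
build_index and search are hypothetical names invented here, and the naive
suffix stripping stands in for a real stemmer. It is not the Snowball or
Porter algorithm and not PyStemmer's API.

```python
# Toy illustration only: a naive suffix stripper, NOT the Snowball or
# Porter algorithm. All names below are hypothetical.
SUFFIXES = ("ing", "es", "s")  # checked in this order


def toy_stem(word):
    """Lowercase the word and strip the first matching suffix."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Only strip when at least three characters of stem remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def build_index(documents):
    """Map each stemmed token to the set of document ids containing it."""
    index = {}
    for doc_id, text in documents.items():
        for token in text.split():
            stem = toy_stem(token.strip('.,;:!?"'))
            index.setdefault(stem, set()).add(doc_id)
    return index


def search(index, query):
    """Stem the query the same way as the documents, then look it up."""
    return sorted(index.get(toy_stem(query), set()))


documents = {
    1: "Cycling is popular in the city.",
    2: "The train was late.",
}
index = build_index(documents)
print(search(index, "cycles"))
```

Because the query and the indexed text pass through the same stemming
function, "cycles" and "cycling" reduce to the same key and the first
document is found. A real stemmer handles far more endings than this sketch.
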
PyStemmer provides algorithms for several (mainly European) languages,
by wrapping the libstemmer library from the Snowball project in a Python
module. It also provides access to the classic Porter stemming algorithm
for English: although this has been superseded by an improved algorithm,
the original algorithm may be of interest to information retrieval
researchers wishing to reproduce results of earlier experiments.

WWW: http://pypi.python.org/pypi/PyStemmer/