mirror of
https://git.FreeBSD.org/ports.git
synced 2025-01-08 06:48:28 +00:00
fb16dfecae
Commit b7f05445c0
has added WWW entries to port Makefiles based on
WWW: lines in pkg-descr files.
This commit removes the WWW: lines of moved-over URLs from these
pkg-descr files.
Approved by: portmgr (tcberner)
9 lines
501 B
Plaintext
Ucto tokenizes text files: it separates words from punctuation and splits
sentences. It also offers other basic preprocessing steps, such as changing
case, all of which you can use to prepare your text for further processing such
as indexing, part-of-speech tagging, or machine translation.

Ucto comes with tokenization rules for several languages and can easily be
extended to suit other languages. It has been incorporated for tokenizing Dutch
text in Frog, a Dutch morpho-syntactic processor by the same authors.