mirror of
https://git.FreeBSD.org/ports.git
synced 2024-12-25 04:43:33 +00:00
58a9f2a8df
- Remove unnecessary MASTER_SITE_SUBDIR - Reformat pkg-descr - Use single space after WWW:
14 lines
591 B
Plaintext
This is a Perl implementation of Simplified Chinese word segmentation.

At each point the segmenter searches for the longest word from both the left
and the right, then chooses the candidate with the higher frequency product.

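The matching strategy described above can be sketched in a few lines. This is
only an illustration in Python, not the module's Perl code: the dictionary,
its frequencies, the UNKNOWN fallback and all function names are hypothetical,
and the sketch compares the two whole-sentence segmentations, which is one
possible reading of "at each point".

```python
from math import prod

# Hypothetical toy dictionary: word -> relative frequency (not Sogou data).
FREQ = {
    "研究": 0.02, "研究生": 0.01, "生命": 0.02,
    "命": 0.005, "的": 0.05, "起源": 0.01,
}
UNKNOWN = 1e-6                       # assumed score for out-of-dictionary words
MAX_LEN = max(len(w) for w in FREQ)  # longest dictionary entry, in characters


def forward_longest(text):
    """Scan left to right, taking the longest dictionary word at each point."""
    words, i = [], 0
    while i < len(text):
        for n in range(min(MAX_LEN, len(text) - i), 0, -1):
            # A single character is always accepted as a fallback.
            if n == 1 or text[i:i + n] in FREQ:
                words.append(text[i:i + n])
                i += n
                break
    return words


def backward_longest(text):
    """Scan right to left, taking the longest dictionary word at each point."""
    words, j = [], len(text)
    while j > 0:
        for n in range(min(MAX_LEN, j), 0, -1):
            if n == 1 or text[j - n:j] in FREQ:
                words.append(text[j - n:j])
                j -= n
                break
    words.reverse()
    return words


def score(words):
    """Product of the word frequencies of one segmentation."""
    return prod(FREQ.get(w, UNKNOWN) for w in words)


def segment(text):
    """Return whichever scan yields the higher frequency product."""
    fwd, bwd = forward_longest(text), backward_longest(text)
    return fwd if score(fwd) >= score(bwd) else bwd
```

With this toy dictionary the two scans disagree on "研究生命的起源": the
left-to-right scan starts with "研究生", the right-to-left scan keeps "生命"
intact, and the latter wins on frequency product.
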
The original program comes from the CPAN module Lingua::ZH::WordSegment
(http://search.cpan.org/~chenyr/). I made the following changes: 1) made the
interface object-oriented; 2) converted the internal strings to UTF-8;
3) used Sogou's dictionary (http://www.sogou.com/labs/dl/w.html) as the
default dictionary.

WWW: http://search.cpan.org/dist/Lingua-ZH-WordSegmenter/