mirror of https://git.FreeBSD.org/ports.git synced 2025-01-03 06:04:53 +00:00
freebsd-ports/textproc/py-gensim/pkg-descr

Gensim is a Python library for topic modelling, document indexing and similarity
retrieval with large corpora. The target audience is the natural language
processing (NLP) and information retrieval (IR) community.

Features:

* All algorithms are memory-independent w.r.t. the corpus size (can process
  input larger than RAM, streamed, out-of-core).
* Intuitive interfaces:
  * easy to plug in your own input corpus/datastream (trivial streaming API)
  * easy to extend with other Vector Space algorithms (trivial transformation
    API)
* Efficient multicore implementations of popular algorithms, such as online
  Latent Semantic Analysis (LSA/LSI/SVD), Latent Dirichlet Allocation (LDA),
  Random Projections (RP), Hierarchical Dirichlet Process (HDP) or word2vec deep
  learning.
* Distributed computing: can run Latent Semantic Analysis and Latent Dirichlet
  Allocation on a cluster of computers.
* Extensive documentation and Jupyter Notebook tutorials.

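
The streamed, out-of-core processing mentioned in the feature list can be
sketched in plain Python. The sketch below does not use Gensim itself: the
names StreamedCorpus and build_vocab are hypothetical and exist only for this
illustration. It assumes whitespace tokenization and shows the general idea of
a corpus as an iterable that yields one sparse (token id, count) vector per
document, so that only one document needs to be held in memory at a time.

```python
from collections import Counter
from typing import Dict, Iterable, Iterator, List, Tuple


def build_vocab(lines: Iterable[str]) -> Dict[str, int]:
    """Assign an integer id to each distinct token, in order of first use."""
    vocab: Dict[str, int] = {}
    for line in lines:
        for tok in line.lower().split():
            vocab.setdefault(tok, len(vocab))
    return vocab


class StreamedCorpus:
    """Hypothetical streamed corpus: one bag-of-words vector per document.

    `lines` may be any re-iterable source of documents (a list here; in
    practice it could be an object that re-opens a file on each pass).
    """

    def __init__(self, lines: Iterable[str], vocab: Dict[str, int]) -> None:
        self.lines = lines
        self.vocab = vocab

    def __iter__(self) -> Iterator[List[Tuple[int, int]]]:
        for line in self.lines:
            # Count only tokens known to the vocabulary; unknown ones are skipped.
            counts = Counter(
                self.vocab[tok] for tok in line.lower().split() if tok in self.vocab
            )
            # Emit a sparse vector sorted by token id.
            yield sorted(counts.items())


docs = ["human computer interaction", "computer system human system"]
vocab = build_vocab(docs)
for vector in StreamedCorpus(docs, vocab):
    print(vector)
```

Because the class only implements iteration, a consumer never needs the whole
corpus in memory; it can make as many passes as it needs over a source that is
larger than RAM, provided the underlying source can be read again.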
WWW: https://radimrehurek.com/gensim/