biology/seqan1: create port from current SeqAn 1.3.1 for legacy usage
UPDATING: document SeqAn updates and seqan1 port for legacy usage
PR: 204127
Submitted by: Hannes Hauswedell <h2+fbsdports@fsfe.org>
SeqAn is an open source C++ library of efficient algorithms
and data structures for the analysis of sequences with the
focus on biological data.
This port contains applications built on SeqAn and developed
within the SeqAn project. Among them are famous read mappers
like RazerS and Yara, as well as many other tools. Some
applications are packaged separately and the library
can be found at biology/seqan.
WWW: http://www.seqan.de/
PR: 204127
Submitted by: Hannes Hauswedell <h2+fbsdports@fsfe.org>
general-use format for representing biological sample by observation contingency
tables. BIOM is a recognized standard for the Earth Microbiome Project and is a
Genomics Standards Consortium supported project.
The BIOM format is designed for general use in broad areas of comparative
-omics. For example, in marker-gene surveys, the primary use of this format is
to represent OTU tables: the observations in this case are OTUs and the matrix
contains counts corresponding to the number of times each OTU is observed in
each sample. With respect to metagenome data, this format would be used to
represent metagenome tables: the observations in this case might correspond to
SEED subsystems, and the matrix would contain counts corresponding to the number
of times each subsystem is observed in each metagenome. Similarly, with respect
to genome data, this format may be used to represent a set of genomes: the
observations in this case again might correspond to SEED subsystems, and the
counts would correspond to the number of times each subsystem is observed in
each genome.
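
To make the table layout described above concrete, here is a minimal Python sketch of an observation-by-sample count table. It is only an illustration of the idea, not the BIOM file format or the API of any BIOM library; the function name `build_table` and the sample identifiers are hypothetical.

```python
from collections import defaultdict

def build_table(hits):
    """Build an observation-by-sample count table.

    hits is an iterable of (observation_id, sample_id) pairs, one pair
    per time an observation (e.g. an OTU) was seen in a sample.
    """
    table = defaultdict(lambda: defaultdict(int))
    for observation, sample in hits:
        table[observation][sample] += 1
    # Convert to plain dicts so the result compares and prints cleanly.
    return {obs: dict(row) for obs, row in table.items()}

# Hypothetical marker-gene survey: OTU1 seen twice in s1 and once in s2.
hits = [("OTU1", "s1"), ("OTU1", "s1"), ("OTU2", "s1"), ("OTU1", "s2")]
table = build_table(hits)
```

Each row is an observation and each column a sample; a cell that is absent means the observation was never seen in that sample.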
WWW: http://biom-format.org/
PR: 209193
Submitted by: Joseph Mingrone
errors with libc++ 3.8.0:
In file included from src/QScoreAdapter.cpp:1:
In file included from src/QScoreAdapter.h:4:
In file included from ../../include/U2Core/MAlignment.h:1:
In file included from ../../include/U2Core/../../corelibs/U2Core/src/datatype/MAlignment.h:25:
In file included from ../../include/U2Core/../../corelibs/U2Core/src/datatype/MAlignmentInfo.h:25:
In file included from /usr/local/include/qt5/QtCore/QString:1:
In file included from /usr/local/include/qt5/QtCore/qstring.h:41:
In file included from /usr/local/include/qt5/QtCore/qchar.h:37:
In file included from /usr/local/include/qt5/QtCore/qglobal.h:39:
/usr/include/c++/v1/cstddef:43:15: fatal error: 'stddef.h' file not found
#include_next <stddef.h>
^
This is because the port tries to add /usr/include as a system include
directory, using -isystem, and this screws up the order of include
directories. Fix it by patching up a number of .pri files to avoid
using the -isystem flag.
Approved by: h2+fbsdports@fsfe.org (maintainer)
PR: 209366
MFH: 2016Q2
alignment mode). The speedup over BLAST is up to 20,000-fold on short reads at a
typical sensitivity of 90-99% relative to BLAST depending on the data and
settings.
WWW: http://ab.inf.uni-tuebingen.de/software/diamond/
PR: 208998
Submitted by: jrm@ftfl.ca
- fixes build on 9.x
- Added a feedback box to show the read depth at the selected coord.
Hover over this box to see the specific base support (i.e. the
number of A/C/G/Ts).
Since around v1.18.0, UGENE is using Google's Breakpad library for crash
reporting, which is very system-specific and does not support FreeBSD at
the moment. Due to lack of resources and interest in porting it, simply
disable crash reporting code for the time being.
A k-mer is a substring of length k, and counting the occurrences of all such
substrings is a central step in many analyses of DNA sequence. JELLYFISH can
count k-mers quickly by using an efficient encoding of a hash table and by
exploiting the "compare-and-swap" CPU instruction to increase parallelism.
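
As a rough illustration of the k-mer counting described above, the following Python sketch counts every substring of length k with a plain dictionary. It does not reflect how JELLYFISH works internally (there is no compact hash encoding and no compare-and-swap parallelism here); the function name `count_kmers` and the example sequence are made up for illustration.

```python
from collections import Counter

def count_kmers(sequence, k):
    """Count occurrences of every length-k substring of sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Slide a window of width k over the sequence. There are
    # len(sequence) - k + 1 windows, or none if the sequence is shorter than k.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

# Example with a made-up sequence and k = 3.
counts = count_kmers("GATTACAGATTACA", 3)
```

A single-threaded dictionary like this becomes the bottleneck on real sequencing data, which is the problem the description says JELLYFISH addresses.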
WWW: http://www.genome.umd.edu/jellyfish.html
PR: 207929
Submitted by: bacon4000@gmail.com
Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short
DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp
reads per hour.
This is Bowtie version 2, which will need to coexist with Bowtie 1 for the
foreseeable future. Both are required by certain genomics pipelines, in some
cases (e.g. Trinity) by the same pipeline.
WWW: https://github.com/BenLangmead/bowtie2
PR: 207908
Submitted by: Jason Bacon <bacon4000@gmail.com>
Slclust is a utility that performs single-linkage clustering with the option of
applying a Jaccard similarity coefficient to break weakly bound clusters into
distinct clusters.
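
The idea can be sketched in Python as follows. The description does not say how slclust applies the Jaccard coefficient, so this sketch assumes one plausible reading: a link between two nodes is kept only if the Jaccard similarity of their neighbor sets reaches a threshold, and clusters are the connected components of the remaining links. The names `jaccard`, `single_linkage`, and `min_jaccard` are hypothetical and are not slclust options.

```python
def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def single_linkage(edges, min_jaccard=0.0):
    """Cluster nodes by single linkage, dropping weakly supported links."""
    # Neighbor set of each node, including the node itself.
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, {u}).add(v)
        neighbors.setdefault(v, {v}).add(u)

    # Union-find over the nodes.
    parent = {node: node for node in neighbors}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Merge the endpoints of every link that passes the Jaccard filter.
    for u, v in edges:
        if jaccard(neighbors[u], neighbors[v]) >= min_jaccard:
            parent[find(u)] = find(v)

    clusters = {}
    for node in neighbors:
        clusters.setdefault(find(node), set()).add(node)
    return sorted(sorted(members) for members in clusters.values())

# Two triangles joined by a single bridge link c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"), ("d", "e"), ("e", "f"), ("d", "f")]
loose = single_linkage(edges)                   # plain single linkage
tight = single_linkage(edges, min_jaccard=0.5)  # bridge c-d is dropped
```

With no threshold the bridge pulls all six nodes into one cluster; with the threshold the weakly supported bridge is ignored and the two triangles stay separate.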
WWW: http://sourceforge.net/projects/slclust/
PR: 207997
Submitted by: Jason Bacon <bacon4000@gmail.com>
TransDecoder identifies candidate coding regions within transcript
sequences, such as those generated by de novo RNA-Seq transcript
assembly using Trinity, or constructed based on RNA-Seq alignments
to the genome using Tophat and Cufflinks.
WWW: http://transdecoder.github.io/
PR: 207993
Submitted by: Jason Bacon <bacon4000@gmail.com>
Fix distinfo for the offending ports.
lang/yorick's tag was moved, and the added patch was no longer needed.
PR: 207644
Submitted by: mat
Exp-run by: antoine
Sponsored by: Absolight
Differential Revision: https://reviews.freebsd.org/D4268
Bowtie is an ultrafast, memory-efficient short read aligner. It aligns short
DNA sequences (reads) to the human genome at a rate of over 25 million 35-bp
reads per hour.
WWW: http://bowtie-bio.sourceforge.net/index.shtml
PR: 206939
Submitted by: Jason Bacon <bacon4000@gmail.com>
A set of tools written in Perl and C++ for working with VCF files, such as
those generated by the 1000 Genomes Project.
WWW: https://github.com/vcftools/vcftools
PR: 206926
Submitted by: Jason Bacon <bacon4000@gmail.com>
- Blixem can now accept a 'command' tag in a gff line for a region
feature. The given command should return gff and will be run by
blixem on the given region.
- Blixem now supports fetching sequence data over https.
genomics analysis tasks. The most widely-used of these tools enable genome
arithmetic, i.e., set theory on the genome. For example, with bedtools one
can intersect, merge, count, complement, and shuffle genomic intervals from
multiple files in common genomic formats such as BAM, BED, GFF/GTF, and VCF.
Although each individual utility is designed to do a relatively simple task,
e.g., intersect two interval files, more sophisticated analyses can be
conducted by stringing together multiple bedtools operations on the command
line or in shell scripts.
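
To illustrate what intersecting two interval files means, here is a minimal Python sketch for a single chromosome. It is not the bedtools implementation or its command-line interface; it assumes BED-style half-open (start, end) intervals, with each input list sorted by start and free of internal overlaps. The function name `intersect` is hypothetical.

```python
def intersect(a, b):
    """Return the overlapping regions of two sorted interval lists.

    Intervals are half-open (start, end) tuples on the same chromosome.
    Each list must be sorted by start and must not overlap itself.
    """
    overlaps = []
    i = j = 0
    while i < len(a) and j < len(b):
        start = max(a[i][0], b[j][0])
        end = min(a[i][1], b[j][1])
        if start < end:  # an empty or negative span means no overlap
            overlaps.append((start, end))
        # Advance whichever interval ends first; it cannot overlap
        # anything further along in the other list.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return overlaps

genes = [(0, 10), (20, 30)]
peaks = [(5, 25)]
shared = intersect(genes, peaks)
```

Because the result is itself a sorted interval list, it can be fed into another operation, which mirrors how the description talks about stringing operations together.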
WWW: http://bedtools.readthedocs.org/
PR: 204536
Submitted by: scottcheloha@gmail.com
Changes:
- Missing hydrogen atoms of HETATM molecules (pdb files) are stored locally.
- Pdb files can now be clipped.
- Fixes missing residues. Introduces an alternative way of generating rotamers,
without using z-matrices. This is potentially a faster algorithm.
- Supports movie making through avconv/ffmpeg.
- Calculates the Electron Localization Function (ELF).
- Updated code of ambfor/ambmd to speed up optimisations/MD of proteins
solvated in water (Jan, 2014).
PR: 204341
LICENSE= PD
Note that although Public Domain is not technically a license, it's
handled in the same way as licenses here, which is a common practice
(Arch, Gentoo, Fedora, Debian, even FOSSology do the same).
Convert all ports which redefine Public Domain LICENSE to LICENSE=PD.
Approved by: portmgr (bapt)
Differential Revision: D4149
This port will not build if ENOTRECOVERABLE is not defined, period.
It's not necessary to check OPSYS and version, just apply the fallback
definition if it's not defined. This unbreaks DragonFly after last
commit.
- Update ports to 1.3 and set BUILD_DEPENDS of dependent ports to require
version 1.3 of htslib.
- Add CURL option to htslib
- Add TEST_TARGET with perl and bash dependencies for testing
- Tidy up spacing and pkg-message files
PR: 205524
PR: 205525
PR: 205526
Submitted by: cartwright@asu.edu (maintainer)