All Classes and Interfaces

Class
Description
Factory class to get Providers for substitution matrices that are provided by the AAINDEX database.
 
 
Title: ABITrace
 
The details of a Compound
 
 
A feature is currently any descriptive item that can be associated with a sequence position(s). A feature has a type and a source, which is currently a string to allow flexibility for the user. Ideally, well-defined features should have a class to describe the attributes of that feature.
Base abstraction of a location which encodes the majority of the important features of a location, such as the start, end and strand
 
 
The base class for DNA, RNA and Protein sequences.
 
A location which is bound to an AccessionID.
Indicates an entity is accessioned
Used in Sequences as the unique identifier.
AlignedSequence<S extends Sequence<C>,C extends Compound>
Defines a data structure for a Sequence within an alignment.
Defines an alignment step in order to pass alignment information from an Aligner to a constructor.
 
Ambiguity set for hybrid DNA/RNA sequences.
 
Used to describe an Amino Acid.
Set of proteinogenic amino acids.
 
Stores a Sequence as a collection of compounds in an ArrayList
Bare bones version of the Sequence object to be used sparingly.
An implementation of the popular bit encodings.
The logic of working with a bit has been separated out into this class to help developers create the bit data structures without having to put the code into an intermediate format and to also use the format without the need to copy this code.
Designed by Paolo Pavan.
Designed by Paolo Pavan.
This class models a Blast/Blast plus result.
Designed by Paolo Pavan.
Designed by Paolo Pavan.
Re-designed by Paolo Pavan on the footprint of org.biojava.nbio.genome.query.BlastXMLQuery by Scooter Willis. You may want to find my contacts on GitHub and LinkedIn for code info or to discuss major changes.
Need to keep track of actual bytes read and take advantage of buffered reader performance.
Attempts to wrap compounds so it is possible to view them in a case insensitive manner
A sequence creator which preserves the case of its input string in the user collection of the returned ProteinSequence.
 
Represents an exon or coding sequence in a gene.
A ChromosomeSequence is a DNASequence but keeps track of geneSequences
This object represents a classpath resource on the local system.
Define a codon
 
For a given sequence this class will create a view over the top of it and for every request the code will return the complement of the underlying base e.g.
 
 
 
 
Static utility to easily share a thread pool for concurrent/parallel/lazy execution.
Utility class that calculates a CRC64 checksum on a stream of bytes.
If a SequenceProxyReader implements this interface then that external source has a list of cross reference id(s)
GenBank: gi|gi-number|gb|accession|locus; ENA Data Library: gi|gi-number|emb|accession|locus; DDBJ, DNA Database of Japan: gi|gi-number|dbj|accession|locus; NBRF PIR: pir||entry; Protein Research Foundation: prf||name; SWISS-PROT UNIPROT: sp|accession|name; Brookhaven Protein Data Bank (1): pdb|entry|chain; Brookhaven Protein Data Bank (2): entry:chain|PDBID|CHAIN|SEQUENCE; Patents: pat|country|number; GenInfo Backbone Id: bbs|number; General database identifier: gnl|database|identifier; NCBI Reference Sequence: ref|accession|locus; Local Sequence identifier: lcl|identifier
If you have a uniprot ID then it is possible to get a collection of other id(s) that the protein is known by.
The default provider for AAINDEX loads substitution matrices from the AAINDEX file in the resources directory
Created by andreas on 8/10/15.
 
This class should model the attributes associated with a DNA sequence
The type of DNA sequence
A helper class that allows different ways to read a string and create a DNA sequence.
Performs the first stage of transcription by going from DNA to RNA.
Edit<C extends Compound>
Interface for carrying out edit operations on a Sequence.
Abstract class which defines all edit operations as a call to discover what 5' and 3' ends of an editing Sequence should be joined together with a target Sequence.
Implementation which allows for the deletion of bases from a Sequence
Edit implementation which allows us to insert a base at any position in a Sequence.
Allows for the substitution of bases into an existing Sequence.
This class contains the processed data of an EMBL file: primary accession number, sequence version number, topology ('circular' or 'linear'), molecule type, data class, taxonomic division and sequence length
This class should process the data of an EMBL file
This class contains the parsed data of an EMBL file
This class contains the processed data of an EMBL file, comprising the referenceNumber, referenceComment, referencePosition, referenceCrossReference, referenceGroup, referenceAuthor, referenceTitle and referenceLocation
A set of helper methods which return true if the two parameters are equal to each other.
Sorts exons. It is a little confusing whether exons should always be ordered left to right or whether a negative-stranded gene should go in the other direction.
A gene contains a collection of Exon sequences
A Gene sequence has a Positive or Negative Strand where we want to write out to a stream the 5' to 3' version.
 
FastaReader<S extends Sequence<?>,C extends Compound>
Use FastaReaderHelper as an example of how to use this class; FastaReaderHelper should be the primary class used to read FASTA files
 
Used to parse a stream of a fasta file to get the sequence
FastaWriter<S extends Sequence<?>,C extends Compound>
The FastaWriter writes a collection of sequences to an outputStream.
The class that should be used to write out fasta file of a sequence collection
It is DBReferenceInfo which implements FeatureInterface.
Interface class to handle describing arbitrary features.
If a SequenceProxyReader implements this interface then that external source has a list of features
Models the keywords that are annotated for a protein sequence at Uniprot.
 
 
This class is a good example of using the SequenceCreatorInterface where during parsing of the stream the sequence and the offset index are passed to create a Protein sequence that will be loaded in lazily.
This class is a good example of using the SequenceCreatorInterface where during parsing of the stream the sequence and the offset index are passed to create a Protein sequence that will be loaded in lazily.
This class is a good example of using the SequenceCreatorInterface where during parsing of the stream the sequence and the offset index are passed to create a Protein sequence that will be loaded in lazily.
Provides a cache for storing multiple small files in memory.
Four bit encoding of the bit formats.
A four bit per compound implementation of the bit array worker code.
Indicates a way of translating a sequence.
Implementation for resolving fuzzy locations.
 
 
Use GenbankReaderHelper as an example of how to use this class; GenbankReaderHelper should be the primary class used to read Genbank files
 
For Genbank format file only.
 
GenbankWriter<S extends Sequence<?>,C extends Compound>
 
The class that should be used to write out genbank file of a sequence collection
We store the original header if the sequence is parsed from a fasta file and will use that exact sequence if we write out the sequences to a fasta file.
The default fasta header parser, where some headers are well defined based on the source database. This allows us to set the source of the protein sequence and the identifier, which can be used in future implementations to load features from external sources. If the user has a custom header with local data then they can create their own implementation of a FastaHeaderParserInterface
 
 
 
 
Contains helper methods for generating a HashCode without having to resort to the commons lang hashcode builders.
This class models a search Hit.
Hsp<S extends Sequence<C>,C extends Compound>
This class models a search Hsp.
A class that provides an InputStream from a File.
A collection of locations which are used whenever we work with INSDC; some of which could be deprecated (from INSDC's point of view) yet appear in records.
Used to represent bond locations equivalent to bond(7,8) or bond(7).
Deprecated in INSDC yet still appears; equivalent to the order() directive except no 5' to 3' ordering is defined.
Deprecated in INSDC; refers to a set of locations of which one location could be valid e.g.
Used to describe a 5' to 3' ordering but no firm assurance it is correct
Parser for working with INSDC style locations.
 
 
Closure interface used when working with IOUtils#processReader(String).
Available translations: 1 - UNIVERSAL; 2 - VERTEBRATE_MITOCHONDRIAL; 3 - YEAST_MITOCHONDRIAL; 4 - MOLD_MITOCHONDRIAL; 5 - INVERTEBRATE_MITOCHONDRIAL; 6 - CILIATE_NUCLEAR; 9 - ECHINODERM_MITOCHONDRIAL; 10 - EUPLOTID_NUCLEAR; 11 - BACTERIAL; 12 - ALTERNATIVE_YEAST_NUCLEAR; 13 - ASCIDIAN_MITOCHONDRIAL; 14 - FLATWORM_MITOCHONDRIAL; 15 - BLEPHARISMA_MACRONUCLEAR; 16 - 2CHLOROPHYCEAN_MITOCHONDRIAL; 21 - TREMATODE_MITOCHONDRIAL; 23 - SCENEDESMUS_MITOCHONDRIAL. Taken from NCBI with slight modification and put into the classpath resource.
Holds the concept of a codon table from the IUPAC format
This reader actually proxies onto multiple types of sequence in order to allow a number of sequence objects to act as if they are one sequence.
Defines a minimal data structure for reading and writing a sequence alignment.
List of output formats.
Sets of integers used to represent the location of features on sequence.
Helper methods for use with the Location classes.
Helper methods for use with the Location classes.
 
Implements a minimal data structure for reading and writing a sequence alignment.
Defines a mutable (editable) data structure for an AlignedSequence.
MutableProfile<S extends Sequence<C>,C extends Compound>
Defines a mutable (editable) data structure for a Profile.
Defines a mutable (editable) data structure for a ProfilePair.
Defines a mutable (editable) data structure for the results of pairwise sequence alignment.
 
Created by andreas on 6/17/15.
General abstraction of different parsing errors
The plain fasta header takes everything in the header as a single entity.
Holds a single point part of a location
Used to resolve a position about a point
Implementation of XMLWriter which emits nicely formatted documents to a PrintWriter.
Profile<S extends Sequence<C>,C extends Compound>
Defines a data structure for the results of sequence alignment.
List of output formats.
ProfilePair<S extends Sequence<C>,C extends Compound>
Defines a data structure for the results of the alignment of a pair of Profiles.
ProfileView<S extends Sequence<C>,C extends Compound>
Defines a data structure for a view of sequence alignment.
The representation of a ProteinSequence
Used to create a ProteinSequence from a String to allow for details about the location of the sequence etc.
 
 
DNA Sequences produced by modern sequencers usually have quality information attached to them.
It is common to have a numerical value or values associated with a feature.
 
This class models a search result.
Designed by Paolo Pavan.
For a given sequence this class will return the base at the reversed position i.e.
 
RNASequence where RNACompoundSet are the allowed values
Used to create an RNA sequence
Attempts to do on the fly translation of RNA by not requesting the compounds until asked.
Takes a Sequence of NucleotideCompound which should represent an RNA sequence (RNASequence is good for this) and returns a list of Sequence which hold AminoAcidCompound.
The biojava-alignment module represents substitution matrices with short values.
Designed by Paolo Pavan.
Main interface for defining a collection of Compounds and accessing them using biological indexes
This is a common method that can be used across multiple storage/proxy implementations to handle Negative strand and other interesting elements of sequence data.
Used to sort sequences in ascending order of bioBegin property.
 
This class represents the storage container of a sequence stored in a fasta file where, during the initial parsing of the file, we store the offset and length of the sequence.
 
A location in a sequence that keeps a reference to its parent sequence
Provides a set of static methods to be used as static imports when needed across multiple Sequence implementations but inheritance gets in the way.
A basic sequence iterator which iterates over the given Sequence by biological index.
A static class that provides optimization hints for memory or performance handling of sequence data.
 
 
SequencePair<S extends Sequence<C>,C extends Compound>
Defines a data structure for the results of pairwise sequence alignment.
 
 
 
 
 
Implements a data structure for a Sequence within an alignment.
Very basic implementation of the Location interface which defines a series of simple constructors.
Basic implementation of the Point interface.
SimpleProfile<S extends Sequence<C>,C extends Compound>
Implements a data structure for the results of sequence alignment.
Implements a data structure for the results of the alignment of a pair of Profiles.
Implements a data structure for the results of pairwise sequence alignment.
Implements a data structure which holds the score (penalty or bonus) given during alignment for the exchange of one Compound in a sequence for another.
An implementation of the SequenceReader interface which for every call will return only 1 compound (given to it during construction; a String is also valid but will require a CompoundSet).
An implementation of a single linkage clusterer. See http://en.wikipedia.org/wiki/Single-linkage_clustering
An in-memory cache using soft references.
Used to map the start codon feature on a gene
Used to map the stop codon sequence on a gene
Provides a way of representing the strand of a sequence, location hit or feature.
A utility class for common String manipulation tasks.
An example of a ProxySequenceReader that is created from a String.
Defines a data structure which holds the score (penalty or bonus) given during alignment for the exchange of one Compound in a sequence for another.
Static utility to access substitution matrices that come bundled with BioJava.
Provides a way of separating us from the specific IUPACParser.IUPACTable even though this is the only implementing class for the interface.
Class used to hold three nucleotides together and allow for equality to be assessed in a case insensitive manner.
Instance of a Codon which is 3 NucleotideCompounds, its corresponding AminoAcidCompound and if it is a start or stop codon.
A sequence can be associated with a species or Taxonomy ID
An implementation of AbstractFeature
Used as a way of encapsulating the data structures required to parse DNA to a Protein sequence.
This class is the way to create a TranslationEngine.
This is the sequence if you want to go from a gene sequence to a protein sequence.
Thrown from AbstractCompoundTranslator
Implementation of the 2bit encoding.
Extension of the BitArrayWorker which provides the 2bit implementation code.
Uncompresses a single tarred or zipped file, writing output to standard out
This class decompresses an input stream containing data compressed with the unix "compress" utility (LZC, an LZW variant).
Pass in a Uniprot ID and this ProxySequenceReader when passed to a ProteinSequence will get the sequence data and other data elements associated with the ProteinSequence by Uniprot.
A sliding window view of a sequence which does not implement any interfaces like Sequence because they do not fit how this works.
Helper methods to simplify boilerplate XML parsing code for org.w3c.dom XML objects
Simple interface for building XML documents.
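
Several entries in this index refer to bit encodings of sequences (the 2bit and four bit implementations). As a rough illustration of the general idea only, the sketch below packs unambiguous DNA bases into 2 bits each, 16 bases per int. The class name TwoBitSketch, its methods and the A/C/G/T to 0/1/2/3 mapping are hypothetical choices made for this example; they are not the library's classes, and the library's own bit layout and handling of ambiguity codes may differ.

```java
// Hypothetical sketch of 2-bit nucleotide packing; not part of the library.
final class TwoBitSketch {
    // Assumed mapping for this example: A=0, C=1, G=2, T=3.
    private static final char[] BASES = {'A', 'C', 'G', 'T'};
    private static final int BASES_PER_WORD = 16; // 32 bits / 2 bits per base

    private TwoBitSketch() {
    }

    // Returns the 2-bit code for a base; input is treated case insensitively.
    static int code(char base) {
        switch (Character.toUpperCase(base)) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default:
                throw new IllegalArgumentException("Unsupported base: " + base);
        }
    }

    // Packs the sequence into ints, lowest bits of each word first.
    static int[] encode(String seq) {
        int[] words = new int[(seq.length() + BASES_PER_WORD - 1) / BASES_PER_WORD];
        for (int i = 0; i < seq.length(); i++) {
            int shift = 2 * (i % BASES_PER_WORD);
            words[i / BASES_PER_WORD] |= code(seq.charAt(i)) << shift;
        }
        return words;
    }

    // Unpacks 'length' bases; the length is passed in because the last
    // word may be only partially filled.
    static String decode(int[] words, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            int shift = 2 * (i % BASES_PER_WORD);
            int c = (words[i / BASES_PER_WORD] >>> shift) & 0b11;
            sb.append(BASES[c]);
        }
        return sb.toString();
    }
}
```

Calling TwoBitSketch.decode(TwoBitSketch.encode(s), s.length()) returns the upper-cased input for any string of A, C, G and T. The sequence length has to be stored separately, which is one reason a real implementation wraps the packed array in a worker or reader object.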