Class FastaReferenceWriter
- All Implemented Interfaces:
AutoCloseable
Example:
String[] seqNames = ...;
byte[][] seqBases = ...;
...
try (final FastaReferenceWriter writer = new FastaReferenceFileWriter(outputFile)) {
    for (int i = 0; i < seqNames.length; i++) {
        writer.startSequence(seqNames[i]).appendBases(seqBases[i]);
    }
}
The two main operations that one can invoke on an open writer are startSequence(java.lang.String) and appendBases(java.lang.String).
The former indicates that we are going to append a new sequence to the output and is invoked once per sequence.
The latter adds bases to the current sequence and can be called as many times as is needed.
The writer makes sure that the output adheres to the FASTA reference sequence file format restrictions:
- Sequence names are valid (non-empty, with no whitespace or control characters),
- Sequence descriptions are valid (no control characters),
- Bases are valid nucleotides or IUPAC redundancy codes and X [ACGTNX...] (lower and upper case are both accepted),
- Sequences cannot have zero length,
- Each sequence can appear only once in the output.
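The base-validity rule in the list above can be illustrated with a small JDK-only sketch. The class name `BaseAlphabetSketch` and the exact set of accepted codes are assumptions made for illustration; the writer itself delegates to SequenceUtil.isIUPAC(byte), whose accepted set may differ.

```java
/** Hypothetical helper, not part of the library: checks bases against an assumed alphabet. */
final class BaseAlphabetSketch {

    // Assumed alphabet: IUPAC nucleotide codes plus X, listed in upper case only.
    // Lower-case input is folded to upper case before the lookup.
    private static final String ACCEPTED = "ACGTURYSWKMBDHVNX";

    /** Returns true if the byte is an accepted base code in either case. */
    static boolean isAcceptedBase(final byte base) {
        final char upper = Character.toUpperCase((char) (base & 0xFF));
        return ACCEPTED.indexOf(upper) >= 0;
    }

    /** Returns the index of the first rejected base, or -1 if every base is accepted. */
    static int firstInvalidIndex(final byte[] bases) {
        for (int i = 0; i < bases.length; i++) {
            if (!isAcceptedBase(bases[i])) {
                return i;
            }
        }
        return -1;
    }
}
```

A caller could use `firstInvalidIndex` to report where a rejected base sits before handing the array to the writer.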
-
Field Summary
- static final int DEFAULT_BASES_PER_LINE: Default number of bases per line.
- static final char HEADER_NAME_AND_DESCRIPTION_SEPARATOR: Character used to separate the sequence name and the description, if any.
- static final char HEADER_START_CHAR: Sequence header start character.
-
Method Summary
- FastaReferenceWriter addSequence(ReferenceSequence sequence): Appends a new sequence to the output.
- FastaReferenceWriter appendBases(byte[] bases): Adds bases to the current sequence from a byte array.
- FastaReferenceWriter appendBases(byte[] bases, int offset, int length): Adds bases to the current sequence from a range in a byte array.
- FastaReferenceWriter appendBases(String basesBases): Adds bases to the current sequence from a String.
- FastaReferenceWriter appendSequence(String name, String description, byte[] bases): Appends a new sequence to the output with or without a description.
- FastaReferenceWriter appendSequence(String name, String description, int basesPerLine, byte[] bases): Appends a new sequence to the output with or without a description and an alternative number of bases-per-line.
- void close(): Closes this writer, flushing all remaining write operations into the output resources.
- FastaReferenceWriter startSequence(String sequenceName): Starts the input of the bases of a new sequence.
- FastaReferenceWriter startSequence(String sequenceName, int basesPerLine): Starts the input of the bases of a new sequence.
- FastaReferenceWriter startSequence(String sequenceName, String description): Starts the input of the bases of a new sequence.
- FastaReferenceWriter startSequence(String sequenceName, String description, int basesPerLine): Starts the input of the bases of a new sequence.
- static void writeSingleSequenceReference(Path whereTo, boolean makeIndex, boolean makeDict, String name, String description, byte[] bases): Convenience method to write a FASTA file with a single sequence.
- static void writeSingleSequenceReference(Path whereTo, int basesPerLine, boolean makeIndex, boolean makeDict, String name, String description, byte[] bases): Convenience method to write a FASTA file with a single sequence.
-
Field Details
-
DEFAULT_BASES_PER_LINE
public static final int DEFAULT_BASES_PER_LINE
Default number of bases per line.
-
HEADER_START_CHAR
public static final char HEADER_START_CHAR
Sequence header start character.
-
HEADER_NAME_AND_DESCRIPTION_SEPARATOR
public static final char HEADER_NAME_AND_DESCRIPTION_SEPARATOR
Character used to separate the sequence name and the description, if any.
-
-
Method Details
-
startSequence
public FastaReferenceWriter startSequence(String sequenceName) throws IOException
Starts the input of the bases of a new sequence.
This operation automatically closes the previous sequence base input, if any.
The sequence name cannot contain any blank characters (as determined by Character.isWhitespace(char)), control characters (as determined by Character.isISOControl(char)) or the FASTA header start character '>'. It cannot be the empty string ("") either.
No description is included in the output.
The input bases-per-line is set to the default provided at construction, or to DEFAULT_BASES_PER_LINE if none was provided.
This method cannot be called after the writer has been closed. It also fails if no base was added to the previous sequence, if any.
- Parameters:
sequenceName - the name of the new sequence.
- Returns:
this instance.
- Throws:
IllegalArgumentException - if any argument does not comply with the requirements listed above or if a sequence with the same name has already been added to the writer.
IllegalStateException - if no base was added to the previous sequence or the writer is already closed.
IOException - if such an exception is thrown when writing into the output resources.
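The sequence-name requirements can be expressed as a short check built on the two `Character` methods named above. This is a sketch under stated assumptions, not the writer's own code: the class `SequenceNameSketch` and the method `rejectionReason` are hypothetical, and treating a `null` name as invalid is an assumption.

```java
/** Hypothetical helper, not the library's code: mirrors the documented name rules. */
final class SequenceNameSketch {

    // The text above gives '>' as the FASTA header start character.
    private static final char HEADER_START = '>';

    /** Returns null if the name is acceptable, otherwise a short reason for rejecting it. */
    static String rejectionReason(final String name) {
        if (name == null) {
            return "name is null"; // assumption: null is treated like any other invalid name
        }
        if (name.isEmpty()) {
            return "name is empty";
        }
        for (int i = 0; i < name.length(); i++) {
            final char c = name.charAt(i);
            if (Character.isWhitespace(c)) {
                return "whitespace at index " + i;
            }
            if (Character.isISOControl(c)) {
                return "control character at index " + i;
            }
            if (c == HEADER_START) {
                return "header start character at index " + i;
            }
        }
        return null;
    }
}
```

Returning a reason rather than a boolean makes it easy to build the message of an IllegalArgumentException at the call site.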
-
startSequence
public FastaReferenceWriter startSequence(String sequenceName, int basesPerLine) throws IOException
Starts the input of the bases of a new sequence.
This operation automatically closes the previous sequence base input, if any.
The sequence name cannot contain any blank characters (as determined by Character.isWhitespace(char)), control characters (as determined by Character.isISOControl(char)) or the FASTA header start character '>'. It cannot be the empty string ("") either.
The input bases-per-line must be 1 or greater.
This method cannot be called after the writer has been closed. It also fails if no base was added to the previous sequence, if any.
- Parameters:
sequenceName - the name of the new sequence.
basesPerLine - number of bases per line for this sequence.
- Returns:
this instance.
- Throws:
IllegalArgumentException - if any argument does not comply with the requirements listed above or if a sequence with the same name has already been added to the writer.
IllegalStateException - if no base was added to the previous sequence or the writer is already closed.
IOException - if such an exception is thrown when writing into the output resources.
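The bases-per-line setting controls how the base stream is folded into lines, and the fold has to carry over between successive append calls. A minimal sketch of that folding follows; the class `LineWrapSketch` is hypothetical and writes to an in-memory buffer instead of the writer's real output resources.

```java
/** Hypothetical illustration of folding bases into fixed-width lines across several appends. */
final class LineWrapSketch {

    private final int basesPerLine;
    private final StringBuilder out = new StringBuilder();
    private int basesOnCurrentLine = 0;

    LineWrapSketch(final int basesPerLine) {
        if (basesPerLine < 1) {
            throw new IllegalArgumentException("bases-per-line must be 1 or greater");
        }
        this.basesPerLine = basesPerLine;
    }

    /** Appends bases, starting a new line whenever the current one is full. */
    LineWrapSketch append(final String bases) {
        for (int i = 0; i < bases.length(); i++) {
            if (basesOnCurrentLine == basesPerLine) {
                out.append('\n');
                basesOnCurrentLine = 0;
            }
            out.append(bases.charAt(i));
            basesOnCurrentLine++;
        }
        return this;
    }

    /** Returns everything written so far, without a trailing line break. */
    String text() {
        return out.toString();
    }
}
```

With a width of 4, `new LineWrapSketch(4).append("ACGTAC").append("GTA").text()` yields three lines: `ACGT`, `ACGT` and `A`.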
-
startSequence
public FastaReferenceWriter startSequence(String sequenceName, String description) throws IOException
Starts the input of the bases of a new sequence.
This operation automatically closes the previous sequence base input, if any.
The sequence name cannot contain any blank characters (as determined by Character.isWhitespace(char)), control characters (as determined by Character.isISOControl(char)) or the FASTA header start character '>'. It cannot be the empty string ("") either.
The description cannot contain control characters (as determined by Character.isISOControl(char)). If it is set to null or the empty string (""), no description is output.
The input bases-per-line is set to the default provided at construction, or to DEFAULT_BASES_PER_LINE if none was provided.
This method cannot be called after the writer has been closed. It also fails if no base was added to the previous sequence, if any.
- Parameters:
sequenceName - the name of the new sequence.
description - optional description for that sequence.
- Returns:
this instance.
- Throws:
IllegalArgumentException - if any argument does not comply with the requirements listed above or if a sequence with the same name has already been added to the writer.
IllegalStateException - if no base was added to the previous sequence or the writer is already closed.
IOException - if such an exception is thrown when writing into the output resources.
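To make the description handling concrete, here is a hedged sketch of how a header line could be assembled from a name and an optional description. `HeaderLineSketch` is a hypothetical class; the separator value (a single space) is an assumption, since this page does not state the value of HEADER_NAME_AND_DESCRIPTION_SEPARATOR.

```java
/** Hypothetical illustration of building a FASTA header line. */
final class HeaderLineSketch {

    private static final char HEADER_START = '>'; // given in the text above
    private static final char SEPARATOR = ' ';    // assumption: the actual constant value is not shown here

    /** Builds the header; a null or empty description is left out entirely. */
    static String headerLine(final String name, final String description) {
        final StringBuilder line = new StringBuilder().append(HEADER_START).append(name);
        if (description != null && !description.isEmpty()) {
            for (int i = 0; i < description.length(); i++) {
                if (Character.isISOControl(description.charAt(i))) {
                    throw new IllegalArgumentException(
                            "control character in description at index " + i);
                }
            }
            line.append(SEPARATOR).append(description);
        }
        return line.toString();
    }
}
```

Name validation is deliberately omitted so the sketch stays focused on the description rule.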
-
startSequence
public FastaReferenceWriter startSequence(String sequenceName, String description, int basesPerLine) throws IOException
Starts the input of the bases of a new sequence.
This operation automatically closes the previous sequence base input, if any.
The sequence name cannot contain any blank characters (as determined by Character.isWhitespace(char)), control characters (as determined by Character.isISOControl(char)) or the FASTA header start character '>'. It cannot be the empty string ("") either.
The description cannot contain control characters (as determined by Character.isISOControl(char)). If it is set to null or the empty string (""), no description is output.
The input bases-per-line must be 1 or greater.
This method cannot be called after the writer has been closed. It also fails if no base was added to the previous sequence, if any.
- Parameters:
sequenceName - the name of the new sequence.
description - optional description for that sequence.
basesPerLine - number of bases per line for this sequence.
- Returns:
this instance.
- Throws:
IllegalArgumentException - if any argument does not comply with the requirements listed above.
IllegalStateException - if no base was added to the previous sequence, the writer is already closed, or the sequence has already been added.
IOException - if such an exception is thrown when writing into the output resources.
-
appendBases
public FastaReferenceWriter appendBases(String basesBases) throws IOException
Adds bases to the current sequence from a String.
- Parameters:
basesBases - String containing the bases to be added. The string is interpreted as ASCII, and the method throws if any character is >= 127.
- Returns:
this instance.
- Throws:
IllegalArgumentException - if basesBases is null or the input contains invalid bases (as assessed by SequenceUtil.isIUPAC(byte)).
IllegalStateException - if no sequence was started or the writer is already closed.
IOException - if such an exception is thrown when writing into any of the outputs.
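The String overload has to turn characters into bytes before any base check can run. A minimal sketch of that conversion, using the `>= 127` threshold from the parameter description, might look like this; `AsciiBasesSketch` and `toAsciiBytes` are hypothetical names and the exception messages are invented for illustration.

```java
/** Hypothetical illustration of the String-to-byte conversion described above. */
final class AsciiBasesSketch {

    /** Converts a String of bases to bytes, rejecting null and any character >= 127. */
    static byte[] toAsciiBytes(final String bases) {
        if (bases == null) {
            throw new IllegalArgumentException("bases must not be null");
        }
        final byte[] result = new byte[bases.length()];
        for (int i = 0; i < bases.length(); i++) {
            final char c = bases.charAt(i);
            // Threshold taken from the parameter description above.
            if (c >= 127) {
                throw new IllegalArgumentException(
                        "character at index " + i + " is outside the accepted range");
            }
            result[i] = (byte) c;
        }
        return result;
    }
}
```

The resulting array can then go through the same validation path as the byte[] overloads.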
-
appendBases
public FastaReferenceWriter appendBases(byte[] bases) throws IOException
Adds bases to the current sequence from a byte array. Throws if any character is >= 127.
- Parameters:
bases - array containing the bases to be added.
- Returns:
this instance.
- Throws:
IllegalArgumentException - if bases is null or the input array contains invalid bases (as assessed by SequenceUtil.isIUPAC(byte)).
IllegalStateException - if no sequence was started or the writer is already closed.
IOException - if such an exception is thrown when writing into any of the outputs.
-
appendBases
public FastaReferenceWriter appendBases(byte[] bases, int offset, int length) throws IOException
Adds bases to the current sequence from a range in a byte array. Throws if any character is >= 127.
- Parameters:
bases - array containing the bases to be added.
offset - the position of the first base to add.
length - how many bases to add, starting from position offset.
- Returns:
this instance.
- Throws:
IllegalArgumentException - if bases is null, if offset and length do not define a valid range in bases, or if that range in bases contains invalid bases (as assessed by SequenceUtil.isIUPAC(byte)).
IllegalStateException - if no sequence was started or the writer is already closed.
IOException - if such an exception is thrown when writing into any of the outputs.
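What counts as a valid range for offset and length can be sketched with a small overflow-safe check. `RangeCheckSketch` is a hypothetical helper, and accepting a zero-length range is an assumption this page neither confirms nor rules out.

```java
/** Hypothetical illustration of validating an (offset, length) range against an array. */
final class RangeCheckSketch {

    /** Throws IllegalArgumentException unless [offset, offset + length) lies inside bases. */
    static void checkRange(final byte[] bases, final int offset, final int length) {
        if (bases == null) {
            throw new IllegalArgumentException("bases must not be null");
        }
        // Written as a subtraction so that offset + length cannot overflow.
        // Assumption: a zero-length range is accepted by this sketch.
        if (offset < 0 || length < 0 || offset > bases.length - length) {
            throw new IllegalArgumentException(
                    "invalid range: offset=" + offset + ", length=" + length
                            + ", array length=" + bases.length);
        }
    }
}
```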
-
addSequence
public FastaReferenceWriter addSequence(ReferenceSequence sequence) throws IOException
Appends a new sequence to the output.
This is a convenient shorthand for startSequence(name).appendBases(bases).
The new sequence remains open, meaning that additional bases for that sequence can be added with further calls to appendBases(java.lang.String).
- Parameters:
sequence - a ReferenceSequence to add.
- Returns:
a reference to this very same writer.
- Throws:
IOException - if such an exception is thrown when actually writing into the output streams/channels.
IllegalArgumentException - if either name or bases is null or contains an invalid value (e.g. unsupported bases or sequence names).
IllegalStateException - if the writer is already closed, a previous sequence (if any was opened) has no base appended to it, or a sequence with such a name was already appended to this writer.
-
appendSequence
public FastaReferenceWriter appendSequence(String name, String description, byte[] bases) throws IOException
Appends a new sequence to the output with or without a description.
This is a convenient shorthand for startSequence(name, description).appendBases(bases).
A null or empty ("") description is ignored (no description is output).
The new sequence remains open, meaning that additional bases for that sequence can be added with further calls to appendBases(java.lang.String).
- Parameters:
name - the name of the new sequence.
bases - the (first) bases of the sequence.
description - the description for the new sequence.
- Returns:
a reference to this very same writer.
- Throws:
IOException - if such an exception is thrown when actually writing into the output streams/channels.
IllegalArgumentException - if either name or bases is null or contains an invalid value (e.g. unsupported bases or sequence names). Also thrown when the description contains unsupported characters.
IllegalStateException - if the writer is already closed, a previous sequence (if any was opened) has no base appended to it, or a sequence with such a name was already appended to this writer.
-
appendSequence
public FastaReferenceWriter appendSequence(String name, String description, int basesPerLine, byte[] bases) throws IOException
Appends a new sequence to the output with or without a description and an alternative number of bases-per-line.
This is a convenient shorthand for startSequence(name, description, basesPerLine).appendBases(bases).
A null or empty ("") description is ignored (no description is output).
The new sequence remains open, meaning that additional bases for that sequence can be added with further calls to appendBases(java.lang.String).
- Parameters:
name - the name of the new sequence.
bases - the (first) bases of the sequence.
description - the description for the sequence.
basesPerLine - alternative number of bases per line to be used for the sequence.
- Returns:
a reference to this very same writer.
- Throws:
IOException - if such an exception is thrown when actually writing into the output streams/channels.
IllegalArgumentException - if either name or bases is null or contains an invalid value (e.g. unsupported bases or sequence names). Also thrown when the description contains unsupported characters or basesPerLine is 0 or negative.
IllegalStateException - if the writer is already closed, a previous sequence (if any was opened) has no base appended to it, or a sequence with such a name was already appended to this writer.
-
close
public void close() throws IOException
Closes this writer, flushing all remaining write operations into the output resources.
Further calls to appendBases(java.lang.String) or startSequence(java.lang.String) will result in an exception.
- Specified by:
close in interface AutoCloseable
- Throws:
IOException - if such an exception is thrown when closing output writers and output streams.
IllegalStateException - if closing without having written any sequences or while writing a sequence is in progress.
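The state rules spread across startSequence, appendBases and close can be summarized in a small in-memory model. This is a sketch under stated assumptions, not the writer's implementation: `WriterLifecycleSketch` is hypothetical, it writes nothing, it reads "writing a sequence is in progress" as "the current sequence has no bases yet", and it treats a second call to close as a no-op.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical state model of the documented call-order rules; performs no I/O. */
final class WriterLifecycleSketch implements AutoCloseable {

    private final Set<String> names = new HashSet<>();
    private boolean closed = false;
    private boolean sequenceOpen = false;
    private long basesInCurrent = 0;

    WriterLifecycleSketch startSequence(final String name) {
        checkNotClosed();
        if (sequenceOpen && basesInCurrent == 0) {
            throw new IllegalStateException("no base was added to the previous sequence");
        }
        if (!names.add(name)) {
            throw new IllegalArgumentException("sequence already added: " + name);
        }
        sequenceOpen = true;
        basesInCurrent = 0;
        return this;
    }

    WriterLifecycleSketch appendBases(final byte[] bases) {
        checkNotClosed();
        if (!sequenceOpen) {
            throw new IllegalStateException("no sequence was started");
        }
        basesInCurrent += bases.length;
        return this;
    }

    @Override
    public void close() {
        if (closed) {
            return; // assumption: closing twice is harmless
        }
        if (!sequenceOpen) {
            throw new IllegalStateException("no sequences were written");
        }
        if (basesInCurrent == 0) {
            throw new IllegalStateException("the current sequence has no bases"); // assumed reading
        }
        closed = true;
    }

    private void checkNotClosed() {
        if (closed) {
            throw new IllegalStateException("the writer is already closed");
        }
    }
}
```

Keeping the rules in one place like this makes it easier to see which call sequences are expected to fail with IllegalStateException.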
-
writeSingleSequenceReference
public static void writeSingleSequenceReference(Path whereTo, boolean makeIndex, boolean makeDict, String name, String description, byte[] bases) throws IOException
Convenience method to write a FASTA file with a single sequence.
- Parameters:
whereTo - the path to write the FASTA file to; must not be null.
makeIndex - whether the index file should be written at its standard location.
makeDict - whether the dictionary file should be written at its standard location.
name - the sequence name; cannot contain whitespace, control characters or the header start character.
description - the sequence description; can be null or "" if there is no description.
bases - the sequence bases; cannot be null.
- Throws:
IOException - if such an exception is thrown when writing into the output resources.
-
writeSingleSequenceReference
public static void writeSingleSequenceReference(Path whereTo, int basesPerLine, boolean makeIndex, boolean makeDict, String name, String description, byte[] bases) throws IOException
Convenience method to write a FASTA file with a single sequence.
- Parameters:
whereTo - the path to write the FASTA file to; must not be null.
basesPerLine - number of bases per line; must be 1 or greater.
makeIndex - whether the index file should be written at its standard location.
makeDict - whether the dictionary file should be written at its standard location.
name - the sequence name; cannot contain whitespace, control characters or the header start character.
description - the sequence description; can be null or "" if there is no description.
bases - the sequence bases; cannot be null.
- Throws:
IOException - if such an exception is thrown when writing into the output resources.
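For readers who want to see what a single-sequence FASTA file looks like on disk, here is a JDK-only sketch that writes one. It is not this method's implementation: `SingleSequenceFastaSketch` is hypothetical, it produces no index or dictionary, it skips name and base validation, and the space separator and `\n` line ending are assumptions.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical JDK-only illustration of writing one sequence as FASTA text. */
final class SingleSequenceFastaSketch {

    static void write(final Path whereTo, final int basesPerLine, final String name,
                      final String description, final byte[] bases) throws IOException {
        if (whereTo == null || bases == null) {
            throw new IllegalArgumentException("path and bases must not be null");
        }
        if (basesPerLine < 1) {
            throw new IllegalArgumentException("bases-per-line must be 1 or greater");
        }
        try (BufferedWriter writer = Files.newBufferedWriter(whereTo, StandardCharsets.US_ASCII)) {
            // Header line: '>' + name, then the description if one was given.
            writer.write('>');
            writer.write(name);
            if (description != null && !description.isEmpty()) {
                writer.write(' '); // assumed separator
                writer.write(description);
            }
            writer.write('\n');
            // Body: the bases folded into lines of at most basesPerLine characters.
            for (int start = 0; start < bases.length; start += basesPerLine) {
                final int len = Math.min(basesPerLine, bases.length - start);
                writer.write(new String(bases, start, len, StandardCharsets.US_ASCII));
                writer.write('\n');
            }
        }
    }
}
```

Writing nine bases with a width of 4 produces a header line followed by lines of 4, 4 and 1 bases.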
-