Package htsjdk.variant.vcf
Class AbstractVCFCodec
- All Implemented Interfaces:
FeatureCodec<VariantContext,LineIterator>, NameAwareCodec
public abstract class AbstractVCFCodec
extends AsciiFeatureCodec<VariantContext>
implements NameAwareCodec
-
Field Summary
Modifier and Type, Field, Description
protected boolean doOnTheFlyModifications
- If true, VCF headers are fixed up on the fly as they are read in.
protected String[] genotypeParts
protected VCFHeader header
protected int lineNo
protected final String[] locParts
static final int MAX_ALLELE_SIZE_BEFORE_WARNING
protected String name
protected static final int NUM_STANDARD_FIELDS
protected String[] parts
protected String remappedSampleName
- If non-null, we will replace the sample name read from the VCF header with this sample name.
static boolean validate
protected VCFHeaderVersion version
protected boolean warnedAboutNoEqualsForNonFlag
-
Constructor Summary
-
Method Summary
Modifier and Type, Method, Description
static boolean canDecodeFile(String potentialInput, String MAGIC_HEADER_LINE)
LazyGenotypesContext.LazyData createGenotypeMap(String str, List<Allele> alleles, String chr, int pos)
- create a genotype map
decode
- decode the line into a feature (VariantContext)
decodeLoc
- the fast decode function
final void disableOnTheFlyModifications()
- Forces all VCFCodecs to not perform any on the fly modifications to the VCF header of VCF records.
protected void generateException(String message)
protected static void generateException(String message, int lineNo)
getAltHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion)
- Create and return a VCFAltHeaderLine object from a header line string that conforms to the sourceVersion
protected String getCachedString(String str)
- Return a cached copy of the supplied string.
getHeader
getMetaHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion)
- Create and return a VCFMetaHeaderLine object from a header line string that conforms to the sourceVersion
getName()
- get the name of this codec
VCFPedigreeHeaderLine getPedigreeHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion)
- Create and return a VCFPedigreeHeaderLine object from a header line string that conforms to the sourceVersion
VCFSampleHeaderLine getSampleHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion)
- Create and return a VCFSampleHeaderLine object from a header line string that conforms to the sourceVersion
getTabixFormat
- Define the tabix format for the feature, used for indexing.
getVersion
protected static Allele oneAllele
- create an allele from an index and an array of alleles
parseAlleles(String ref, String alts, int lineNo)
- parse out the alleles
parseFilters(String filterString)
- parse the filter string, first checking to see if we already have parsed it in a previous attempt
protected static List<Allele> parseGenotypeAlleles(String GT, List<Allele> alleles, Map<String, List<Allele>> cache)
- parse genotype alleles from the genotype string
protected VCFHeader parseHeaderFromLines(List<String> headerStrings, VCFHeaderVersion version)
- create a VCF header from a set of header record lines
protected static Double parseQual
- parse out the qual value
void setName
- set the name of this codec
void setRemappedSampleName(String remappedSampleName)
- Replaces the sample name read from the VCF header with the remappedSampleName.
setVCFHeader(VCFHeader newHeader, VCFHeaderVersion newVersion)
- Explicitly set the VCFHeader on this codec.
Methods inherited from class htsjdk.tribble.AsciiFeatureCodec
close, decode, isDone, makeIndexableSourceFromStream, makeSourceFromStream, readActualHeader, readHeader
Methods inherited from class htsjdk.tribble.AbstractFeatureCodec
decodeLoc, getFeatureType
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface htsjdk.tribble.FeatureCodec
canDecode, getPathToDataFile
-
Field Details
-
MAX_ALLELE_SIZE_BEFORE_WARNING
public static final int MAX_ALLELE_SIZE_BEFORE_WARNING
-
NUM_STANDARD_FIELDS
protected static final int NUM_STANDARD_FIELDS
-
header
-
version
-
alleleMap
-
validate
public static boolean validate
-
parts
-
genotypeParts
-
locParts
-
filterHash
-
name
-
lineNo
protected int lineNo
-
stringCache
-
warnedAboutNoEqualsForNonFlag
protected boolean warnedAboutNoEqualsForNonFlag
-
doOnTheFlyModifications
protected boolean doOnTheFlyModifications
If true, VCF headers are fixed up on the fly as they are read in.
-
remappedSampleName
If non-null, we will replace the sample name read from the VCF header with this sample name. This feature works only for single-sample VCFs.
-
-
Constructor Details
-
AbstractVCFCodec
protected AbstractVCFCodec()
-
-
Method Details
-
parseFilters
parse the filter string, first checking to see if we already have parsed it in a previous attempt
- Parameters:
filterString
- the string to parse
- Returns:
- a set of the filters applied
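The check-then-parse behaviour described for parseFilters can be sketched as below. This is a minimal illustration, not the htsjdk implementation: the class FilterCacheSketch, its parseCount counter, and the use of String keys are assumptions made for the example. The only format fact relied on is that a VCF FILTER field separates filter names with semicolons.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a codec's filter cache; not htsjdk code.
final class FilterCacheSketch {
    // Maps a raw FILTER string to the set parsed from it.
    private final Map<String, Set<String>> cache = new HashMap<>();
    // Counts how many strings were actually split, for illustration only.
    int parseCount = 0;

    Set<String> parseFilters(String filterString) {
        // First check whether this exact string was parsed before.
        Set<String> cached = cache.get(filterString);
        if (cached != null) {
            return cached;
        }
        // Otherwise split on ';' and remember the result.
        parseCount++;
        Set<String> parsed = Collections.unmodifiableSet(
                new LinkedHashSet<>(Arrays.asList(filterString.split(";"))));
        cache.put(filterString, parsed);
        return parsed;
    }
}
```

Because most records in a file share a handful of FILTER values, a lookup keyed on the raw string avoids re-splitting the same text for every record.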
-
parseHeaderFromLines
create a VCF header from a set of header record lines
- Parameters:
headerStrings
- a list of strings that represent all the ## and # entries
- Returns:
- a VCFHeader object
-
getHeader
- Returns:
- the header that was either explicitly set on this codec, or read from the file. May be null. The returned value should not be modified.
-
getVersion
- Returns:
- the version number that was either explicitly set on this codec, or read from the file. May be null.
-
setVCFHeader
Explicitly set the VCFHeader on this codec. This will overwrite the header read from the file and the version state stored in this instance; conversely, reading the header from a file will overwrite whatever is set here.
- Parameters:
newHeader
newVersion
- Returns:
- the actual header for this codec. The returned header may not be identical to the header argument since the header lines may be "repaired" (i.e., rewritten) if doOnTheFlyModifications is set.
- Throws:
TribbleException
- if the requested header version is not compatible with the existing version
-
getAltHeaderLine
Create and return a VCFAltHeaderLine object from a header line string that conforms to the sourceVersion
- Parameters:
headerLineString
- VCF header line being parsed without the leading "##ALT="
sourceVersion
- the VCF header version from which the source was retrieved. The resulting header line object should be valid for this header version.
- Returns:
- a VCFAltHeaderLine object
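Once the leading "##ALT=" is removed, what remains of a structured VCF header line is an angle-bracketed list of key=value pairs. The sketch below shows one way such a string could be split into pairs. It is an illustration only: HeaderLineSketch and parseKeyValues are hypothetical names, the sketch does no version-specific validation, and it does not handle escaped quotes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not part of htsjdk.
final class HeaderLineSketch {
    // Splits "<K1=V1,K2=\"V2, with comma\">" into an ordered key/value map.
    static Map<String, String> parseKeyValues(String headerLineString) {
        String body = headerLineString.trim();
        if (!body.startsWith("<") || !body.endsWith(">")) {
            throw new IllegalArgumentException("Expected <...>: " + headerLineString);
        }
        body = body.substring(1, body.length() - 1);

        Map<String, String> pairs = new LinkedHashMap<>();
        StringBuilder token = new StringBuilder();
        String key = null;
        boolean inQuotes = false;
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;          // quotes delimit values, drop them
            } else if (c == '=' && !inQuotes && key == null) {
                key = token.toString();        // end of a key
                token.setLength(0);
            } else if (c == ',' && !inQuotes) {
                if (key == null) {
                    throw new IllegalArgumentException("Value without key: " + headerLineString);
                }
                pairs.put(key, token.toString()); // end of a value
                key = null;
                token.setLength(0);
            } else {
                token.append(c);
            }
        }
        if (key != null) {
            pairs.put(key, token.toString());
        }
        return pairs;
    }
}
```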
-
getPedigreeHeaderLine
public VCFPedigreeHeaderLine getPedigreeHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion)
Create and return a VCFPedigreeHeaderLine object from a header line string that conforms to the sourceVersion
- Parameters:
headerLineString
- VCF header line being parsed without the leading "##PEDIGREE="
sourceVersion
- the VCF header version from which the source was retrieved. The resulting header line object should be valid for this header version.
- Returns:
- a VCFPedigreeHeaderLine object
-
getMetaHeaderLine
Create and return a VCFMetaHeaderLine object from a header line string that conforms to the sourceVersion
- Parameters:
headerLineString
- VCF header line being parsed without the leading "##META="
sourceVersion
- the VCF header version from which the source was retrieved. The resulting header line object should be valid for this header version.
- Returns:
- a VCFMetaHeaderLine object
-
getSampleHeaderLine
public VCFSampleHeaderLine getSampleHeaderLine(String headerLineString, VCFHeaderVersion sourceVersion)
Create and return a VCFSampleHeaderLine object from a header line string that conforms to the sourceVersion
- Parameters:
headerLineString
- VCF header line being parsed without the leading "##SAMPLE="
sourceVersion
- the VCF header version from which the source was retrieved. The resulting header line object should be valid for this header version.
- Returns:
- a VCFSampleHeaderLine object
-
decodeLoc
the fast decode function
- Parameters:
line
- the line of text for the record
- Returns:
- a feature (not guaranteed complete) that has the correct start and stop
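The idea of a fast, location-only decode can be sketched as follows: read just the leading tab-separated columns and leave the rest of the record unparsed. This is an assumption-laden illustration, not the htsjdk logic. LocSketch is a hypothetical type, and the end position is derived only from the REF length, ignoring any END value in the INFO column.

```java
// Hypothetical location-only record; not an htsjdk class.
final class LocSketch {
    final String contig;
    final int start;
    final int end;

    LocSketch(String contig, int start, int end) {
        this.contig = contig;
        this.start = start;
        this.end = end;
    }

    // Parses CHROM, POS and REF only; everything after REF stays untouched.
    static LocSketch decodeLoc(String line) {
        // Limit of 5 keeps the remainder of the line as one unsplit string.
        String[] fields = line.split("\t", 5);
        if (fields.length < 4) {
            throw new IllegalArgumentException("Too few columns: " + line);
        }
        int start = Integer.parseInt(fields[1]);
        // Assumption: the feature spans the reference allele.
        int end = start + fields[3].length() - 1;
        return new LocSketch(fields[0], start, end);
    }
}
```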
-
decode
decode the line into a feature (VariantContext)
- Specified by:
decode in class AsciiFeatureCodec<VariantContext>
- Parameters:
line
- the line
- Returns:
- a VariantContext
-
getName
get the name of this codec
- Specified by:
getName in interface NameAwareCodec
- Returns:
- our set name
-
setName
set the name of this codec
- Specified by:
setName in interface NameAwareCodec
- Parameters:
name
- new name
-
getCachedString
Return a cached copy of the supplied string.
- Parameters:
str
- string
- Returns:
- interned string
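Returning a cached copy means that equal strings seen on different lines end up as one shared instance. A minimal sketch of that pattern, assuming a plain HashMap as the cache (StringCacheSketch is a hypothetical class, not the codec itself):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical per-codec string cache; not htsjdk code.
final class StringCacheSketch {
    private final Map<String, String> cache = new HashMap<>();

    // Returns the first instance seen that equals str.
    String getCachedString(String str) {
        String existing = cache.putIfAbsent(str, str);
        return existing != null ? existing : str;
    }
}
```

Values such as contig names and genotype keys repeat on nearly every line, so sharing one instance can reduce memory use compared with keeping a fresh String per record.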
-
oneAllele
create an allele from an index and an array of alleles
- Parameters:
index
- the index
alleles
- the alleles
- Returns:
- an Allele
-
parseGenotypeAlleles
protected static List<Allele> parseGenotypeAlleles(String GT, List<Allele> alleles, Map<String, List<Allele>> cache)
parse genotype alleles from the genotype string
- Parameters:
GT
- GT string
alleles
- list of possible alleles
cache
- cache of alleles for GT
- Returns:
- the allele list for the GT string
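The relationship between oneAllele and parseGenotypeAlleles can be sketched with plain strings standing in for Allele objects. This is an illustration under stated assumptions, not the htsjdk code: GenotypeAllelesSketch is hypothetical, alleles are represented as String, and "." is used as the no-call value. The GT separators "/" and "|" come from the VCF format.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; String stands in for the library's Allele type.
final class GenotypeAllelesSketch {
    static final String NO_CALL = ".";

    // Resolves one GT index ("0", "1", ".") against the record's alleles.
    static String oneAllele(String index, List<String> alleles) {
        if (NO_CALL.equals(index)) {
            return NO_CALL;
        }
        int i = Integer.parseInt(index);
        if (i < 0 || i >= alleles.size()) {
            throw new IllegalArgumentException("Allele index " + i + " is not defined");
        }
        return alleles.get(i);
    }

    // Resolves a whole GT string, reusing the cached list when available.
    static List<String> parseGenotypeAlleles(String gt, List<String> alleles,
                                             Map<String, List<String>> cache) {
        List<String> result = cache.get(gt);
        if (result == null) {
            List<String> parsed = new ArrayList<>();
            for (String index : gt.split("[/|]")) {
                parsed.add(oneAllele(index, alleles));
            }
            result = Collections.unmodifiableList(parsed);
            cache.put(gt, result);
        }
        return result;
    }
}
```

In this sketch the cache is only valid for one record, because the same GT string maps to different alleles when the allele list changes.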
-
parseQual
parse out the qual value
- Parameters:
qualString
- the quality string
- Returns:
- the parsed quality as a Double
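A minimal sketch of parsing a QUAL column follows. It covers only the missing-value check and the numeric parse. QualSketch is hypothetical, and returning null for the VCF missing value "." is an assumption of this example; any sentinel or scaling the real method applies is not shown.

```java
// Hypothetical sketch; not the htsjdk implementation.
final class QualSketch {
    static Double parseQual(String qualString) {
        // Assumption: null stands for a missing quality (".").
        if (".".equals(qualString)) {
            return null;
        }
        return Double.valueOf(qualString);
    }
}
```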
-
parseAlleles
parse out the alleles
- Parameters:
ref
- the reference base
alts
- a string of alternates to break into alleles
lineNo
- the line number for this record
- Returns:
- a list of alleles, and a pair of the shortest and longest sequence
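Breaking the ALT column into alleles can be sketched as below, again with String standing in for Allele. AllelesSketch is a hypothetical class; the sketch assumes the VCF conventions that alternates are comma-separated and that "." means there are none, and it performs no base validation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; String stands in for the library's Allele type.
final class AllelesSketch {
    // Returns the reference allele first, followed by each alternate.
    static List<String> parseAlleles(String ref, String alts, int lineNo) {
        if (ref.isEmpty()) {
            throw new IllegalArgumentException("Empty REF allele at line " + lineNo);
        }
        List<String> alleles = new ArrayList<>();
        alleles.add(ref);
        if (!".".equals(alts)) {
            for (String alt : alts.split(",")) {
                if (alt.isEmpty()) {
                    throw new IllegalArgumentException("Empty ALT allele at line " + lineNo);
                }
                alleles.add(alt);
            }
        }
        return alleles;
    }
}
```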
-
canDecodeFile
-
createGenotypeMap
public LazyGenotypesContext.LazyData createGenotypeMap(String str, List<Allele> alleles, String chr, int pos)
create a genotype map
- Parameters:
str
- the string
alleles
- the list of alleles
- Returns:
- a mapping of sample name to genotype object
-
disableOnTheFlyModifications
public final void disableOnTheFlyModifications()
Forces all VCFCodecs to not perform any on the fly modifications to the VCF header of VCF records. Useful primarily for raw comparisons such as when comparing raw VCF records.
-
setRemappedSampleName
Replaces the sample name read from the VCF header with the remappedSampleName. Works only for single-sample VCFs; attempting to perform sample name remapping for multi-sample VCFs will produce an Exception.
- Parameters:
remappedSampleName
- replacement sample name for the sample specified in the VCF header
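The single-sample restriction can be sketched as a guard on the number of samples in the header. SampleRemapSketch and its use of IllegalStateException are assumptions for illustration; the text above only states that an Exception is produced for multi-sample VCFs.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the remapping rule; not htsjdk code.
final class SampleRemapSketch {
    // Returns the sample names to use after optional remapping.
    static List<String> remap(List<String> headerSamples, String remappedSampleName) {
        if (remappedSampleName == null) {
            return headerSamples; // nothing to remap
        }
        if (headerSamples.size() != 1) {
            throw new IllegalStateException(
                    "Cannot remap sample names for a VCF with "
                            + headerSamples.size() + " samples");
        }
        return Collections.singletonList(remappedSampleName);
    }
}
```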
-
generateException
-
generateException
-
getTabixFormat
Description copied from interface: FeatureCodec
Define the tabix format for the feature, used for indexing. Default implementation throws an exception. Note that only AsciiFeatureCodec can read tabix files, as defined in AbstractFeatureReader.getFeatureReader(String, String, FeatureCodec, boolean, java.util.function.Function, java.util.function.Function).
- Specified by:
getTabixFormat in interface FeatureCodec<VariantContext,LineIterator>
- Returns:
- the format to use with tabix
-