public class ConsensusProcessingUtils
This class holds methods for processing consensus using the variants and alleles tables or data derived from them.
-
-
Method Summary
Modifier and Type, Method, Description
static Map<Byte, Integer> createGtvalueMap()
static long getLongFromRefAltData(RefAltData rad)
    RefAltData comes from processing the Longs stored for each haplotype.
static RefAltData findTaxonRefCoverage(Position startPos, int maxRefLen, TaxaList tList, Map<Taxon, RangeMap<Position, RefAltData>> taxonToPosDataMap, double maxError, double minTaxaCoverage)
static TaxaList splitConsensusTaxa(TaxaList tList)
static Map<Position, Integer> chooseVarIdForSNPPositionFromGenotypeTable(GenotypeTable genotypeTable, Map<Integer, BiMap<Integer, Integer>> posCallVarIdMap)
static Map<Integer, RefAltData> createVarIDtoRefAltData(Map<Taxon, RangeMap<Position, RefAltData>> taxonToPosDataMap)
    Returns a map of variantID to RefAltData for each variantId/data lookup.
static String convertVariantsToSequence(List<HaplotypeNode.VariantInfo> variants, ReferenceRange refRange, GenomeSequence refSequence)
static boolean variantsInOrder(List<HaplotypeNode.VariantInfo> variants)
static Long encodeRefBlockToLong(int refLength, int refDepth, int pos)
    Encode a reference block to a Long.
static Long encodeVariantToLong(int variantId, int refDepth, int altDepth, boolean isIndel)
    Encode variant information as a Long.
static boolean areGraphTaxaInRankingMap(HaplotypeGraph graph, Map<String, Double> rankingMap)
    Method to verify that the taxa in the graph are in the ranking file.
static boolean areRankingsUnique(Map<String, Double> rankingMap)
    Method to verify that rankings are unique.
static DistanceMatrix createDistanceMatrix(int ntaxa, Chromosome chr, ReferenceRange currentRefRange, TaxaList taxaWithInfo, Map<Taxon, RangeMap<Integer, HaplotypeNode.VariantInfo>> taxonToVariantInfoMap)
    Function to create a distance matrix given a set of variants for a single reference range.
static DistanceMatrix setNsToMax(DistanceMatrix originalDM)
    Function to create a new DistanceMatrix setting NaNs to the maximum value in both its row and column.
static float maxDistance(DistanceMatrix matrix, int row, int col)
    Function to figure out what the maximum row and column distances are for a given position in the original distance matrix.
-
-
Method Detail
-
createGtvalueMap
static Map<Byte, Integer> createGtvalueMap()
-
getLongFromRefAltData
static long getLongFromRefAltData(RefAltData rad)
RefAltData comes from processing the Longs stored for each haplotype. It is used for consensus processing and carries different data than the VariantMappingData used for processing single haplotypes.
-
findTaxonRefCoverage
static RefAltData findTaxonRefCoverage(Position startPos, int maxRefLen, TaxaList tList, Map<Taxon, RangeMap<Position, RefAltData>> taxonToPosDataMap, double maxError, double minTaxaCoverage)
-
splitConsensusTaxa
static TaxaList splitConsensusTaxa(TaxaList tList)
-
chooseVarIdForSNPPositionFromGenotypeTable
static Map<Position, Integer> chooseVarIdForSNPPositionFromGenotypeTable(GenotypeTable genotypeTable, Map<Integer, BiMap<Integer, Integer>> posCallVarIdMap)
-
createVarIDtoRefAltData
static Map<Integer, RefAltData> createVarIDtoRefAltData(Map<Taxon, RangeMap<Position, RefAltData>> taxonToPosDataMap)
Returns a map of variantID to RefAltData for each variantId/data lookup. A variantID of -1 (ref) is handled in the calling method.
-
convertVariantsToSequence
static String convertVariantsToSequence(List<HaplotypeNode.VariantInfo> variants, ReferenceRange refRange, GenomeSequence refSequence)
-
variantsInOrder
static boolean variantsInOrder(List<HaplotypeNode.VariantInfo> variants)
-
encodeRefBlockToLong
static Long encodeRefBlockToLong(int refLength, int refDepth, int pos)
Encode a reference block to a Long
- Parameters:
refLength - length of the reference block.
refDepth - read depth in the reference block.
pos - the start position of the block on a chromosome.
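As an illustration of how three ints can be packed into a single long, here is a minimal sketch. The bit layout, the class name RefBlockPackingSketch, and the decoding helpers are assumptions made for this example only; this page does not specify the layout that encodeRefBlockToLong actually uses.

```java
final class RefBlockPackingSketch {
    // HYPOTHETICAL layout, not the documented format of encodeRefBlockToLong:
    // bit 63      : flag marking the value as a reference block
    // bits 40..62 : refLength (23 bits)
    // bits 32..39 : refDepth, capped at 255 (8 bits)
    // bits 0..31  : pos (32 bits)
    static long encode(int refLength, int refDepth, int pos) {
        long len = refLength & 0x7FFFFFL;
        long depth = Math.min(refDepth, 255) & 0xFFL;
        long start = pos & 0xFFFFFFFFL;
        return (1L << 63) | (len << 40) | (depth << 32) | start;
    }

    static boolean isRefBlock(long packed) { return packed < 0; }

    static int refLength(long packed) { return (int) ((packed >>> 40) & 0x7FFFFFL); }

    static int refDepth(long packed) { return (int) ((packed >>> 32) & 0xFFL); }

    static int pos(long packed) { return (int) (packed & 0xFFFFFFFFL); }

    public static void main(String[] args) {
        long packed = encode(100, 7, 123456);
        System.out.println(refLength(packed) + " " + refDepth(packed) + " " + pos(packed));
    }
}
```

Capping the depth rather than masking it keeps a very deep block from wrapping around to a small value under this assumed layout.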
-
encodeVariantToLong
static Long encodeVariantToLong(int variantId, int refDepth, int altDepth, boolean isIndel)
Encode variant information as a Long
- Parameters:
refDepth - read depth of the reference allele.
altDepth - read depth of the alt allele.
isIndel - true if the variant is an indel, false otherwise.
-
areGraphTaxaInRankingMap
static boolean areGraphTaxaInRankingMap(HaplotypeGraph graph, Map<String, Double> rankingMap)
Method to verify that the taxa in the graph are in the ranking file. The ranking file may contain more taxa than the graph, but not the other way around.
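The containment rule described here can be sketched as a subset check. The sketch below assumes the graph's taxa are available as a set of names; the class RankingCoverageSketch, the method allTaxaRanked, and the taxon names are hypothetical stand-ins and not part of this API.

```java
import java.util.Map;
import java.util.Set;

final class RankingCoverageSketch {
    // True when every taxon name from the graph has an entry in the ranking map.
    // Extra entries in the ranking map are allowed.
    static boolean allTaxaRanked(Set<String> graphTaxa, Map<String, Double> rankingMap) {
        return rankingMap.keySet().containsAll(graphTaxa);
    }

    public static void main(String[] args) {
        Map<String, Double> ranking = Map.of("B73", 1.0, "Mo17", 2.0, "W22", 3.0);
        System.out.println(allTaxaRanked(Set.of("B73", "Mo17"), ranking));   // true
        System.out.println(allTaxaRanked(Set.of("B73", "CML247"), ranking)); // false
    }
}
```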
-
areRankingsUnique
static boolean areRankingsUnique(Map<String, Double> rankingMap)
Method to verify that rankings are unique. This is used to emit a warning message in the assembly consensus.
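One simple way to test uniqueness is to compare the number of distinct ranking values with the number of entries. This is a sketch under that assumption; the class RankingUniquenessSketch and the method rankingsUnique are hypothetical names, and the actual implementation may differ.

```java
import java.util.HashSet;
import java.util.Map;

final class RankingUniquenessSketch {
    // Rankings are unique when no two taxa share the same ranking value.
    static boolean rankingsUnique(Map<String, Double> rankingMap) {
        return new HashSet<>(rankingMap.values()).size() == rankingMap.size();
    }

    public static void main(String[] args) {
        System.out.println(rankingsUnique(Map.of("B73", 1.0, "Mo17", 2.0))); // true
        System.out.println(rankingsUnique(Map.of("B73", 1.0, "Mo17", 1.0))); // false
    }
}
```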
-
createDistanceMatrix
static DistanceMatrix createDistanceMatrix(int ntaxa, Chromosome chr, ReferenceRange currentRefRange, TaxaList taxaWithInfo, Map<Taxon, RangeMap<Integer, HaplotypeNode.VariantInfo>> taxonToVariantInfoMap)
Function to create a distance matrix given a set of variants for a single reference range. This will ignore Ns and indels. Distance is #SNPs/(#SNPs + #RefPos).
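The formula can be illustrated for a single pair of taxa. The sketch below makes several assumptions that are not taken from this page: calls are given as equal-length strings, 'N' marks a missing call, '-' marks an indel, matching positions are counted as #RefPos, and a pair with no comparable positions yields NaN. The class PairwiseDistanceSketch is hypothetical.

```java
final class PairwiseDistanceSketch {
    // Distance = #SNPs / (#SNPs + #matching positions), skipping Ns and indels.
    static double distance(String a, String b) {
        if (a.length() != b.length()) {
            throw new IllegalArgumentException("call strings must have equal length");
        }
        int snps = 0;
        int same = 0;
        for (int i = 0; i < a.length(); i++) {
            char x = a.charAt(i);
            char y = b.charAt(i);
            if (x == 'N' || y == 'N' || x == '-' || y == '-') {
                continue; // ignored: missing call or indel (assumed encoding)
            }
            if (x != y) {
                snps++;
            } else {
                same++;
            }
        }
        int total = snps + same;
        return total == 0 ? Double.NaN : (double) snps / total;
    }

    public static void main(String[] args) {
        // One SNP out of four comparable positions; the last position is skipped.
        System.out.println(distance("ACGTN", "ACCT-")); // 0.25
    }
}
```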
-
setNsToMax
static DistanceMatrix setNsToMax(DistanceMatrix originalDM)
Function to create a new DistanceMatrix setting NaNs to the maximum value in both its row and column.
-
maxDistance
static float maxDistance(DistanceMatrix matrix, int row, int col)
Function to figure out what the maximum row and column distances are for a given position in the original distance matrix.
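The relationship between setNsToMax and maxDistance can be sketched with a plain double[][] standing in for DistanceMatrix. The class NaNFillSketch, the use of double instead of float, and the choice to skip NaN cells when searching for the maximum are assumptions of this example.

```java
final class NaNFillSketch {
    // Largest non-NaN value found in the given row or the given column.
    static double maxDistance(double[][] m, int row, int col) {
        double max = 0.0;
        for (int j = 0; j < m[row].length; j++) {
            if (!Double.isNaN(m[row][j])) max = Math.max(max, m[row][j]);
        }
        for (int i = 0; i < m.length; i++) {
            if (!Double.isNaN(m[i][col])) max = Math.max(max, m[i][col]);
        }
        return max;
    }

    // Returns a copy in which each NaN cell is replaced by the maximum of its
    // row and column in the original matrix; the original is left unchanged.
    static double[][] setNaNsToMax(double[][] original) {
        double[][] result = new double[original.length][];
        for (int i = 0; i < original.length; i++) {
            result[i] = original[i].clone();
            for (int j = 0; j < original[i].length; j++) {
                if (Double.isNaN(original[i][j])) {
                    result[i][j] = maxDistance(original, i, j);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] dm = {{0.0, 0.2, Double.NaN}, {0.2, 0.0, 0.5}, {Double.NaN, 0.5, 0.0}};
        System.out.println(setNaNsToMax(dm)[0][2]); // 0.5
    }
}
```

Reading the maxima from the original matrix rather than the partially filled copy keeps the result independent of the order in which cells are visited.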
-