public class DiploidCountsToPath
Used to find the most likely pair of paths through a graph given a set of read mappings. The two paths represent the phased haplotypes imputed from the reads.
public DiploidCountsToPath(@NotNull HaplotypeGraph myGraph, @NotNull com.google.common.collect.Multimap<net.maizegenetics.pangenome.api.ReferenceRange,net.maizegenetics.pangenome.hapCalling.HapIdSetCount> readHapids, double probabilityCorrect, double minTransitionProbability, int maxNodesPerRange, int minReadsPerRange, boolean removeRangesWithEqualCounts, int maxReadsPerKB, boolean splitNodes, double splitTransitionProb)
Used to find the most likely pair of paths through a graph given a set of read mappings. The two paths represent the phased haplotypes imputed from the reads.
myGraph - the HaplotypeGraph to be used for path finding
readHapids - a Multimap of ReferenceRange -> HapIdSetCount retrieved from a PHG DB
probabilityCorrect - the probability that a read has aligned correctly
minTransitionProbability - the minimum transition probability
maxNodesPerRange - ranges with more nodes will not be used
minReadsPerRange - ranges with fewer reads will not be used
removeRangesWithEqualCounts - ranges for which all hapid sets have the same count will not be used
maxReadsPerKB - ranges with more reads per kb will not be used
splitNodes - whether nodes with more than one taxon should be split into nodes of one taxon each
splitTransitionProb - the probability of a node given that the previous node was the same taxon, or 1 - probability of a recombination
@NotNull public NonExistentClass getMyLogger()
@NotNull public java.util.List<java.util.List> getDiploidPath()
Finds the most likely diploid path through myGraph given readHapids. Before the path is imputed, the graph is filtered based on maxNodesPerRange, minReadsPerRange, removeRangesWithEqualCounts, and maxReadsPerKB. Missing sequence nodes are then added and, finally, the nodes are split into individual taxa if splitNodes is true.
@NotNull public HaplotypeGraph filteredGraph()
Filters myGraph using readHapids, based on maxNodesPerRange, minReadsPerRange, removeRangesWithEqualCounts, and maxReadsPerKB.
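The four filter parameters can be read as per-range keep/drop rules. The following is a minimal, self-contained sketch of that reading, not the PHG implementation: the class RangeFilterSketch, the method keepRange, and its inputs (a node count, a list of per-hapid-set read counts, and a range length in base pairs) are hypothetical stand-ins for what the real code derives from HaplotypeGraph and HapIdSetCount. It also assumes that the equal-count rule only applies when a range has more than one hapid set.

```java
import java.util.List;

/** Hypothetical sketch of the range filters; not the PHG implementation. */
final class RangeFilterSketch {

    /**
     * Returns true if a reference range passes all four filters.
     *
     * @param nodeCount     number of haplotype nodes in the range
     * @param setCounts     read count of each hapid set mapped to the range
     * @param rangeLengthBp length of the range in base pairs (must be > 0)
     */
    static boolean keepRange(int nodeCount, List<Integer> setCounts, int rangeLengthBp,
                             int maxNodesPerRange, int minReadsPerRange,
                             boolean removeRangesWithEqualCounts, int maxReadsPerKB) {
        int totalReads = setCounts.stream().mapToInt(Integer::intValue).sum();

        // Ranges with more nodes than maxNodesPerRange are not used.
        if (nodeCount > maxNodesPerRange) return false;

        // Ranges with fewer reads than minReadsPerRange are not used.
        if (totalReads < minReadsPerRange) return false;

        // Ranges with more reads per kb than maxReadsPerKB are not used.
        double readsPerKb = totalReads * 1000.0 / rangeLengthBp;
        if (readsPerKb > maxReadsPerKB) return false;

        // Assumption: with two or more hapid sets, identical counts carry no
        // information for choosing between them, so the range is dropped.
        boolean allEqual = setCounts.size() > 1
                && setCounts.stream().distinct().count() == 1;
        if (removeRangesWithEqualCounts && allEqual) return false;

        return true;
    }

    public static void main(String[] args) {
        // A 2 kb range with 3 nodes and two hapid sets seen 5 and 2 times.
        System.out.println(keepRange(3, List.of(5, 2), 2000, 10, 1, true, 100));
    }
}
```

In this sketch each rule is checked independently, so a range is dropped as soon as any one of them fails.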
@NotNull public HaplotypeGraph getMyGraph()
the HaplotypeGraph to be used for path finding
@NotNull public com.google.common.collect.Multimap<net.maizegenetics.pangenome.api.ReferenceRange,net.maizegenetics.pangenome.hapCalling.HapIdSetCount> getReadHapids()
a Multimap of ReferenceRange -> HapIdSetCount retrieved from a PHG DB
public double getProbabilityCorrect()
the probability that a read has aligned correctly
public double getMinTransitionProbability()
the minimum transition probability
public int getMaxNodesPerRange()
ranges with more nodes will not be used
public int getMinReadsPerRange()
ranges with fewer reads will not be used
public boolean getRemoveRangesWithEqualCounts()
ranges for which all hapid sets have the same count will not be used
public int getMaxReadsPerKB()
ranges with more reads per kb will not be used
public boolean getSplitNodes()
should nodes with more than one taxon be split into nodes of one taxon each
public double getSplitTransitionProb()
the probability of a node given that the previous node was the same taxon, or 1 - probability of a recombination
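To make the meaning of splitTransitionProb concrete, here is a small sketch of one way a transition probability between single-taxon nodes could be derived from it. This is an illustration under stated assumptions, not the PHG implementation: the class SplitTransitionSketch and its methods are hypothetical, the recombination probability (1 - splitTransitionProb) is assumed to be shared equally among the other taxa in the next range, and the diploid transition is assumed to be the product of the two haplotype transitions.

```java
/** Hypothetical sketch of split-node transitions; not the PHG implementation. */
final class SplitTransitionSketch {

    /**
     * Probability of moving from a node of fromTaxon to a node of toTaxon.
     *
     * @param taxaInNextRange     number of single-taxon nodes in the next range,
     *                            assumed to include fromTaxon
     * @param splitTransitionProb probability of staying on the same taxon,
     *                            i.e. 1 - probability of a recombination
     */
    static double transition(String fromTaxon, String toTaxon,
                             int taxaInNextRange, double splitTransitionProb) {
        if (fromTaxon.equals(toTaxon)) {
            return splitTransitionProb;
        }
        // Assumption: a recombination is equally likely to land on any other taxon.
        int others = taxaInNextRange - 1;
        return others > 0 ? (1.0 - splitTransitionProb) / others : 0.0;
    }

    /** Assumption: the two haplotypes of a diploid path transition independently. */
    static double diploidTransition(String from1, String to1, String from2, String to2,
                                    int taxaInNextRange, double splitTransitionProb) {
        return transition(from1, to1, taxaInNextRange, splitTransitionProb)
                * transition(from2, to2, taxaInNextRange, splitTransitionProb);
    }

    public static void main(String[] args) {
        // Taxon names are placeholders chosen for the example.
        System.out.println(transition("taxonA", "taxonA", 5, 0.99));
        System.out.println(transition("taxonA", "taxonB", 5, 0.99));
        System.out.println(diploidTransition("taxonA", "taxonA", "taxonB", "taxonC", 5, 0.99));
    }
}
```

Under these assumptions the transitions out of any node sum to 1, and a splitTransitionProb close to 1 favors paths that stay on the same taxon across adjacent ranges.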