Class AlleleLikelihoods<EVIDENCE extends htsjdk.samtools.util.Locatable,A extends htsjdk.variant.variantcontext.Allele>
- Type Parameters:
A
- the type of the allele the likelihood makes reference to. Note: this class uses FastUtil collections for speed.
- All Implemented Interfaces:
AlleleList<A>
,SampleList
-
Nested Class Summary
Modifier and TypeClassDescriptionfinal class
Contains information about the best allele for a unit of evidence.Nested classes/interfaces inherited from interface org.broadinstitute.hellbender.utils.genotyper.AlleleList
AlleleList.ActualPermutation<A extends htsjdk.variant.variantcontext.Allele>, AlleleList.NonPermutation<A extends htsjdk.variant.variantcontext.Allele>
-
Field Summary
Modifier and TypeFieldDescriptionprotected AlleleList<A>
Allele list.Evidence disqualified by .static final double
static final double
protected final SampleList
Sample list.Fields inherited from interface org.broadinstitute.hellbender.utils.genotyper.AlleleList
EMPTY_LIST
Fields inherited from interface org.broadinstitute.hellbender.utils.genotyper.SampleList
EMPTY_LIST
-
Constructor Summary
ConstructorDescriptionAlleleLikelihoods
(SampleList samples, AlleleList<A> alleles, Map<String, List<EVIDENCE>> evidenceBySample) Constructs a new evidence-likelihood collection. -
Method Summary
Modifier and TypeMethodDescriptionvoid
addEvidence
(Map<String, List<EVIDENCE>> evidenceBySample, double initialLikelihood) Add more evidence to the collection.boolean
addMissingAlleles
(Collection<A> candidateAlleles, double defaultLikelihood) Add alleles that are missing in the evidence-likelihoods collection giving all evidence a default likelihood value.void
addNonReferenceAllele
(A nonRefAllele) Adds the non-reference allele to the evidence-likelihood collection setting each evidence likelihood to the second best found (or best one if only one allele has likelihood).alleles()
Returns the alleles in this evidence-likelihood collection.Default version where ties are broken in favor of the reference allelebestAllelesBreakingTies
(String sample) Default version where ties are broken in favor of the reference allelebestAllelesBreakingTies
(String sample, Function<A, Double> tieBreakingPriority) Returns the collection of best allele estimates for one sample's evidence based on the evidence-likelihoods.bestAllelesBreakingTies
(Function<A, Double> tieBreakingPriority) Returns the collection of best allele estimates for the evidence based on the evidence-likelihoods.void
changeAlleles
(List<A> newAlleles) Replaces the alleles in the readLikelihood matrix.void
changeEvidence
(Map<EVIDENCE, EVIDENCE> evidenceReplacements) void
contaminationDownsampling
(Map<String, Double> perSampleDownsamplingFraction) Downsamples reads based on contamination fractions making sure that all alleles are affected proportionally.static <EVIDENCE extends htsjdk.samtools.util.Locatable,
A extends htsjdk.variant.variantcontext.Allele>
AlleleLikelihoodscreateAlleleLikelihoods
(AlleleList alleles, SampleList samples, List<List<EVIDENCE>> evidenceBySampleIndex, List<List<EVIDENCE>> filteredEvidenceBySampleIndex, double[][][] values) int
Returns the total count of evidence in the evidence-likelihood collection.filteredSampleEvidence
(int sampleIndex) Returns the units of evidence that have been removed by PairHMM error score filtering (and intentionally not evidence filtered by any other mechanism).void
filterPoorlyModeledEvidence
(ToDoubleFunction<EVIDENCE> log10MinTrueLikelihood) Removes those reads whose best possible likelihood given any allele is too low.getAllele
(int alleleIndex) Returns the allele given its index.int
getSample
(int sampleIndex) Returns sample name given its index.Returns the location used for subsetting.<U,
NEW_EVIDENCE_TYPE extends htsjdk.samtools.util.Locatable>
AlleleLikelihoods<NEW_EVIDENCE_TYPE,A> groupEvidence
(Function<EVIDENCE, U> groupingFunction, Function<List<EVIDENCE>, NEW_EVIDENCE_TYPE> gather) Group evidence into lists of evidence -- for example group by read name to force read pairs to support a single haplotype.int
indexOfAllele
(htsjdk.variant.variantcontext.Allele allele) Returns the index of an allele within the likelihood collection.int
indexOfSample
(String sample) Returns the index of a sample within the likelihood collection.boolean
<B extends htsjdk.variant.variantcontext.Allele>
AlleleLikelihoods<EVIDENCE,B> marginalize
(Map<B, List<A>> newToOldAlleleMap) Perform marginalization from an allele set to another (smaller one) taking the maximum value for each evidence in the original allele subset.protected double
maximumLikelihoodOverAllAlleles
(int sampleIndex, int evidenceIndex) void
normalizeLikelihoods
(double maximumLikelihoodDifferenceCap, boolean symmetricallyNormalizeAllelesToReference) Adjusts likelihoods so that for each unit of evidence, the best allele likelihood is 0 and caps the minimum likelihood of any allele for each unit of evidence based on the maximum alternative allele likelihood.int
Returns number of alleles in the collection.int
Number of samples included in the likelihood collection.removeAllelesToSubset
(Collection<A> subsetOfAlleles) Removes subset of allelesvoid
retainEvidence
(Predicate<? super EVIDENCE> predicate) Remove those reads that do not comply with a requirement.sampleEvidence
(int sampleIndex) Returns the units of evidence that belong to a sample sorted by their index (within that sample).int
sampleEvidenceCount
(int sampleIndex) Returns the quantity of evidence that belongs to a sample in the evidence-likelihood collection.sampleMatrix
(int sampleIndex) Returns an evidence vs allele likelihood matrix corresponding to a sample.samples()
Returns the samples in this evidence-likelihood collection.void
setFilteredHaplotypeCount
(int filteredHaplotypeCount) void
void
void
Updates the likelihoods of the non-ref allele, if present, considering all non-symbolic alleles available.void
updateNonRefAlleleLikelihoods
(AlleleList<A> allelesToConsider) Updates the likelihood of the NonRef allele (if present) based on the likelihoods of a set of non-symbolicMethods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.broadinstitute.hellbender.utils.genotyper.AlleleList
asListOfAlleles, containsAllele, indexOfReference, permutation
Methods inherited from interface org.broadinstitute.hellbender.utils.genotyper.SampleList
asListOfSamples, asSetOfSamples
-
Field Details
-
LOG_10_INFORMATIVE_THRESHOLD
public static final double LOG_10_INFORMATIVE_THRESHOLD- See Also:
-
NATURAL_LOG_INFORMATIVE_THRESHOLD
public static final double NATURAL_LOG_INFORMATIVE_THRESHOLD -
filteredEvidenceBySampleIndex
protected final List<List<EVIDENCE extends htsjdk.samtools.util.Locatable>> filteredEvidenceBySampleIndexEvidence disqualified by . -
samples
Sample list. -
alleles
Allele list.
-
-
Constructor Details
-
AlleleLikelihoods
public AlleleLikelihoods(SampleList samples, AlleleList<A> alleles, Map<String, List<EVIDENCE>> evidenceBySample) Constructs a new evidence-likelihood collection.The initial likelihoods for all allele-evidence combinations are 0.
- Parameters:
samples
- all supported samples in the collection.alleles
- all supported alleles in the collection.evidenceBySample
- evidence stratified per sample.- Throws:
IllegalArgumentException
- if any of alleles, samples or evidenceBySample is null, or if they contain null values.
-
-
Method Details
-
isNaturalLog
public boolean isNaturalLog() -
createAlleleLikelihoods
public static <EVIDENCE extends htsjdk.samtools.util.Locatable,A extends htsjdk.variant.variantcontext.Allele> AlleleLikelihoods createAlleleLikelihoods(AlleleList alleles, SampleList samples, List<List<EVIDENCE>> evidenceBySampleIndex, List<List<EVIDENCE>> filteredEvidenceBySampleIndex, double[][][] values) -
removeAllelesToSubset
Removes subset of alleles -
indexOfSample
Returns the index of a sample within the likelihood collection.- Specified by:
indexOfSample
in interfaceSampleList
- Parameters:
sample
- the query sample.- Returns:
- -1 if the sample is not included, 0 or greater otherwise.
- Throws:
IllegalArgumentException
- ifsample
isnull
.
-
numberOfSamples
public int numberOfSamples()Number of samples included in the likelihood collection.- Specified by:
numberOfSamples
in interfaceSampleList
- Returns:
- 0 or greater.
-
getSample
Returns sample name given its index.- Specified by:
getSample
in interfaceSampleList
- Parameters:
sampleIndex
- query index.- Returns:
- never
null
. - Throws:
IllegalArgumentException
- ifsampleIndex
is negative.
-
indexOfAllele
public int indexOfAllele(htsjdk.variant.variantcontext.Allele allele) Returns the index of an allele within the likelihood collection.- Specified by:
indexOfAllele
in interfaceAlleleList<EVIDENCE extends htsjdk.samtools.util.Locatable>
- Parameters:
allele
- the query allele.- Returns:
- -1 if the allele is not included, 0 or greater otherwise.
- Throws:
IllegalArgumentException
- ifallele
isnull
.
-
numberOfAlleles
public int numberOfAlleles()Returns number of alleles in the collection.- Specified by:
numberOfAlleles
in interfaceAlleleList<EVIDENCE extends htsjdk.samtools.util.Locatable>
- Returns:
- 0 or greater.
-
getAllele
Returns the allele given its index.- Specified by:
getAllele
in interfaceAlleleList<EVIDENCE extends htsjdk.samtools.util.Locatable>
- Parameters:
alleleIndex
- the allele index.- Returns:
- never
null
. - Throws:
IllegalArgumentException
- if the allele index is invalid.
-
sampleEvidence
Returns the units of evidence that belong to a sample sorted by their index (within that sample).- Parameters:
sampleIndex
- the requested sample.- Returns:
- never
null
but perhaps a zero-length array if there is no evidence in sample. No element in the array will be null.
-
filteredSampleEvidence
Returns the units of evidence that have been removed by PairHMM error score filtering (and intentionally not evidence filtered by any other mechanism).- Parameters:
sampleIndex
- the requested sample.- Returns:
- never
null
but perhaps a zero-length array if there is no filtered evidence for a sample. No element in the array will be null.
-
sampleMatrix
Returns an evidence vs allele likelihood matrix corresponding to a sample.- Parameters:
sampleIndex
- target sample.- Returns:
- never
null
- Throws:
IllegalArgumentException
- if sampleIndex is not a valid sample index.
-
switchToNaturalLog
public void switchToNaturalLog() -
contaminationDownsampling
Downsamples reads based on contamination fractions making sure that all alleles are affected proportionally.- Parameters:
perSampleDownsamplingFraction
- contamination sample map where the sample names are the keys and the fractions are the values.- Throws:
IllegalArgumentException
- ifperSampleDownsamplingFraction
isnull
.
-
normalizeLikelihoods
public void normalizeLikelihoods(double maximumLikelihoodDifferenceCap, boolean symmetricallyNormalizeAllelesToReference) Adjusts likelihoods so that for each unit of evidence, the best allele likelihood is 0 and caps the minimum likelihood of any allele for each unit of evidence based on the maximum alternative allele likelihood.- Parameters:
maximumLikelihoodDifferenceCap
- maximum difference between the best alternative allele likelihood and any other likelihood.- Throws:
IllegalArgumentException
- if maximumLikelihoodDifferenceCap is not 0 or less.
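As a rough illustration of the adjustment described above, the following self-contained sketch normalizes the per-allele log-likelihoods of a single unit of evidence. It is not the GATK implementation: the class NormalizeSketch, its normalize method, the plain double[] layout and the refIndex argument are hypothetical stand-ins, and the symmetricallyNormalizeAllelesToReference option is not modeled.

```java
class NormalizeSketch {
    /**
     * Normalizes the log-likelihoods of one unit of evidence in place.
     * lk[a] is the likelihood of allele a; refIndex marks the reference allele.
     * cap plays the role of maximumLikelihoodDifferenceCap and must be 0 or less.
     */
    static void normalize(double[] lk, int refIndex, double cap) {
        if (cap > 0.0) {
            throw new IllegalArgumentException("the cap must be 0 or less");
        }
        double best = Double.NEGATIVE_INFINITY;    // best over all alleles
        double bestAlt = Double.NEGATIVE_INFINITY; // best over non-reference alleles
        for (int a = 0; a < lk.length; a++) {
            best = Math.max(best, lk[a]);
            if (a != refIndex) {
                bestAlt = Math.max(bestAlt, lk[a]);
            }
        }
        // With no alternative allele, bestAlt stays -infinity and nothing is capped.
        double floor = bestAlt + cap;
        for (int a = 0; a < lk.length; a++) {
            // Cap from below first, then shift so that the best allele ends at 0.
            lk[a] = Math.max(lk[a], floor) - best;
        }
    }

    public static void main(String[] args) {
        double[] lk = {-1.0, -3.0, -20.0};
        normalize(lk, 0, -5.0);
        System.out.println(java.util.Arrays.toString(lk));
    }
}
```

In this example the best alternative is -3, so nothing may fall below -8 before the shift; after the shift the three values are 0, -2 and -7.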
-
samples
Returns the samples in this evidence-likelihood collection.Samples are sorted by their index in the collection.
The returned list is an unmodifiable view on the evidence-likelihoods sample list.
- Returns:
- never
null
.
-
alleles
Returns the alleles in this evidence-likelihood collection.Alleles are sorted by their index in the collection.
The returned list is unmodifiable. It will not be updated if the collection allele list changes.
- Returns:
- never
null
.
-
changeEvidence
-
addMissingAlleles
Add alleles that are missing in the evidence-likelihoods collection giving all evidence a default likelihood value.- Parameters:
candidateAlleles
- the potentially missing alleles.defaultLikelihood
- the default evidence likelihood value for that allele.- Returns:
true
iff the evidence-likelihood collection was modified by the addition of the input alleles. So if all the alleles in the input collection were already present in the evidence-likelihood collection this method will return false
.- Throws:
IllegalArgumentException
- ifcandidateAlleles
isnull
or there is more than one missing allele that is a reference or there is one but the collection already has a reference allele.
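The return-value contract above can be sketched with plain collections. This is only an illustration under stated assumptions: AddMissingAllelesSketch, addMissing, the use of String allele names and the map-of-rows layout are hypothetical, and the reference-allele checks described under Throws are left out.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.Map;

class AddMissingAllelesSketch {
    /**
     * lkByAllele maps an allele name to its row of per-evidence likelihoods.
     * Adds a row filled with defaultLikelihood for every candidate not yet present.
     * Returns true only if at least one row was added.
     */
    static boolean addMissing(Map<String, double[]> lkByAllele, Collection<String> candidates,
                              double defaultLikelihood, int evidenceCount) {
        if (candidates == null) {
            throw new IllegalArgumentException("the candidate alleles cannot be null");
        }
        boolean modified = false;
        for (String candidate : candidates) {
            if (!lkByAllele.containsKey(candidate)) {
                double[] row = new double[evidenceCount];
                Arrays.fill(row, defaultLikelihood);
                lkByAllele.put(candidate, row);
                modified = true;
            }
        }
        return modified;
    }

    public static void main(String[] args) {
        Map<String, double[]> lk = new LinkedHashMap<>();
        lk.put("A", new double[] {-1.0, -2.0});
        System.out.println(addMissing(lk, Arrays.asList("A", "C"), -10.0, 2)); // adds C
        System.out.println(addMissing(lk, Arrays.asList("A", "C"), -10.0, 2)); // nothing to add
    }
}
```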
-
groupEvidence
public <U,NEW_EVIDENCE_TYPE extends htsjdk.samtools.util.Locatable> AlleleLikelihoods<NEW_EVIDENCE_TYPE,A> groupEvidence(Function<EVIDENCE, U> groupingFunction, Function<List<EVIDENCE>, NEW_EVIDENCE_TYPE> gather) Group evidence into lists of evidence -- for example group by read name to force read pairs to support a single haplotype. Log Likelihoods are summed over all evidence in a group, corresponding to an independent evidence assumption. Since this container's likelihoods generally pertain to sequencing only (and not sample prep etc) this is usually a good assumption.- Parameters:
groupingFunction
- Attribute function for grouping evidence, for example GATKRead::getNamegather
- Transformation applied to collections of evidence with same value of groupingFunction. For example, Fragment::new to construct a fragment out of a pair of reads with the same name- Returns:
- a new AlleleLikelihoods based on the grouped, transformed evidence.
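The summation of log-likelihoods within a group can be sketched as follows. This is a simplified stand-in rather than the GATK code: GroupEvidenceSketch and sumByGroup are hypothetical names, the grouping key is a plain String (for example a read name), and the gather step that builds a new evidence object is omitted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class GroupEvidenceSketch {
    /**
     * keys[e] is the grouping key of evidence e (for example a read name);
     * logLk[e][a] is the log-likelihood of evidence e given allele a.
     * Returns, for each key, the per-allele sum of log-likelihoods,
     * which corresponds to treating the grouped evidence as independent.
     */
    static Map<String, double[]> sumByGroup(String[] keys, double[][] logLk, int alleleCount) {
        Map<String, double[]> grouped = new LinkedHashMap<>();
        for (int e = 0; e < keys.length; e++) {
            double[] sums = grouped.computeIfAbsent(keys[e], k -> new double[alleleCount]);
            for (int a = 0; a < alleleCount; a++) {
                sums[a] += logLk[e][a];
            }
        }
        return grouped;
    }

    public static void main(String[] args) {
        String[] names = {"r1", "r1", "r2"};
        double[][] logLk = {{-1.0, -2.0}, {-0.5, -3.0}, {-4.0, -0.25}};
        sumByGroup(names, logLk, 2).forEach(
            (name, sums) -> System.out.println(name + " " + java.util.Arrays.toString(sums)));
    }
}
```

Here the two entries named r1 collapse into one row with sums -1.5 and -5, while r2 keeps its own values.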
-
marginalize
public <B extends htsjdk.variant.variantcontext.Allele> AlleleLikelihoods<EVIDENCE,B> marginalize(Map<B, List<A>> newToOldAlleleMap) Perform marginalization from an allele set to another (smaller one) taking the maximum value for each evidence in the original allele subset.- Parameters:
newToOldAlleleMap
- map where the keys are the new alleles and the value list the original alleles that correspond to the new one.- Returns:
- never
null
. The result will have the requested set of new alleles (keys in newToOldAlleleMap), and the same set of samples and evidence as the original. - Throws:
IllegalArgumentException
- if newToOldAlleleMap is null or contains null values, or its values refer to alleles that do not exist in this evidence-likelihood collection. In addition, every new allele must map to at least one old allele, and no two new alleles may refer to the same old allele.
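A minimal sketch of the maximum-based marginalization, including the argument checks listed above, might look like this. The names MarginalizeSketch and marginalize, the use of String keys for new alleles and integer indices for old alleles, and the oldLk[allele][evidence] layout are all assumptions made for illustration only.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class MarginalizeSketch {
    /**
     * oldLk[a][e] is the likelihood of evidence e given old allele a.
     * newToOld maps each new allele name to the indices of the old alleles it replaces.
     * Each new allele takes, per evidence, the maximum over its old alleles.
     */
    static Map<String, double[]> marginalize(double[][] oldLk, Map<String, List<Integer>> newToOld) {
        if (newToOld == null) {
            throw new IllegalArgumentException("the allele map cannot be null");
        }
        int evidenceCount = oldLk.length == 0 ? 0 : oldLk[0].length;
        Set<Integer> used = new HashSet<>();
        Map<String, double[]> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> entry : newToOld.entrySet()) {
            List<Integer> olds = entry.getValue();
            if (olds == null || olds.isEmpty()) {
                throw new IllegalArgumentException("every new allele needs at least one old allele");
            }
            double[] row = new double[evidenceCount];
            Arrays.fill(row, Double.NEGATIVE_INFINITY);
            for (Integer old : olds) {
                // Reject null, out-of-range, or already used old allele indices.
                if (old == null || old < 0 || old >= oldLk.length || !used.add(old)) {
                    throw new IllegalArgumentException("invalid or repeated old allele: " + old);
                }
                int index = old;
                for (int e = 0; e < evidenceCount; e++) {
                    row[e] = Math.max(row[e], oldLk[index][e]);
                }
            }
            result.put(entry.getKey(), row);
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] oldLk = {{-1.0, -5.0}, {-2.0, -0.5}, {-3.0, -4.0}};
        Map<String, List<Integer>> map = new LinkedHashMap<>();
        map.put("A", Arrays.asList(0, 1));
        map.put("B", Arrays.asList(2));
        marginalize(oldLk, map).forEach(
            (allele, row) -> System.out.println(allele + " " + Arrays.toString(row)));
    }
}
```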
-
addEvidence
Add more evidence to the collection.- Parameters:
evidenceBySample
- evidence to add.initialLikelihood
- the likelihood for the new entries.- Throws:
IllegalArgumentException
- ifevidenceBySample
isnull
orevidenceBySample
containsnull
evidence, orevidenceBySample
contains evidence already present in the evidence-likelihood collection.
-
addNonReferenceAllele
Adds the non-reference allele to the evidence-likelihood collection setting each evidence likelihood to the second best found (or best one if only one allele has likelihood).Nothing will happen if the evidence-likelihoods collection already includes the non-ref allele
Implementation note: strictly speaking, the calling code does not need to pass a reference to the non-ref allele; we still require it so that the calling code uses the right generic type for this likelihoods collection
Allele
.- Parameters:
nonRefAllele
- the non-ref allele.- Throws:
IllegalArgumentException
- ifnonRefAllele
is anything but the designated <NON_REF> symbolic alleleAllele.NON_REF_ALLELE
.
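The "second best, or best if only one allele" rule stated above can be illustrated for a single unit of evidence. The sketch follows the wording of this description only and may differ from what the library actually computes; NonRefLikelihoodSketch and nonRefLikelihood are hypothetical names.

```java
class NonRefLikelihoodSketch {
    /**
     * lk holds the likelihoods of one unit of evidence for the existing alleles.
     * Returns the second-best value, or the only value if there is a single allele.
     */
    static double nonRefLikelihood(double[] lk) {
        if (lk.length == 0) {
            throw new IllegalArgumentException("at least one allele likelihood is required");
        }
        double best = Double.NEGATIVE_INFINITY;
        double second = Double.NEGATIVE_INFINITY;
        for (double value : lk) {
            if (value > best) {
                second = best;
                best = value;
            } else if (value > second) {
                second = value;
            }
        }
        return lk.length == 1 ? best : second;
    }

    public static void main(String[] args) {
        System.out.println(nonRefLikelihood(new double[] {-1.0, -3.0, -2.0}));
        System.out.println(nonRefLikelihood(new double[] {-4.0}));
    }
}
```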
-
updateNonRefAlleleLikelihoods
public void updateNonRefAlleleLikelihoods()Updates the likelihoods of the non-ref allele, if present, considering all non-symbolic alleles available. -
updateNonRefAlleleLikelihoods
Updates the likelihood of the NonRef allele (if present) based on the likelihoods of a set of non-symbolic -
bestAllelesBreakingTies
public Collection<AlleleLikelihoods<EVIDENCE,A>.BestAllele> bestAllelesBreakingTies(Function<A, Double> tieBreakingPriority) Returns the collection of best allele estimates for the evidence based on the evidence-likelihoods. "Ties" where the ref likelihood is withinAlleleLikelihoods.INFORMATIVE_THRESHOLD
of the greatest likelihood are broken by thetieBreakingPriority
function.- Returns:
- never
null
, one element per unit of evidence in the evidence-likelihoods collection. - Throws:
IllegalStateException
- if there are no alleles.
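One possible reading of the tie-breaking rule is sketched below for a single unit of evidence: any allele whose likelihood lies within the threshold of the greatest likelihood counts as tied, and the tied allele with the highest priority wins. This is an interpretation, not the library's code. BestAlleleSketch and bestAllele are hypothetical names, alleles are identified by index, and the threshold is passed in as an argument because its value is not stated here.

```java
import java.util.function.IntToDoubleFunction;

class BestAlleleSketch {
    /**
     * lk[a] is the likelihood of allele a for one unit of evidence.
     * Returns the index of the chosen allele after tie-breaking by priority.
     */
    static int bestAllele(double[] lk, double threshold, IntToDoubleFunction priority) {
        if (lk.length == 0) {
            throw new IllegalStateException("there are no alleles");
        }
        int best = 0;
        for (int a = 1; a < lk.length; a++) {
            if (lk[a] > lk[best]) {
                best = a;
            }
        }
        // Alleles at or above this cutoff are considered tied with the best one.
        double cutoff = lk[best] - threshold;
        int chosen = best;
        for (int a = 0; a < lk.length; a++) {
            if (lk[a] >= cutoff && priority.applyAsDouble(a) > priority.applyAsDouble(chosen)) {
                chosen = a;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        double[] lk = {-1.1, -1.0, -5.0};
        // Index 0 stands for the reference allele and gets the highest priority.
        IntToDoubleFunction favorReference = a -> a == 0 ? 1.0 : 0.0;
        System.out.println(bestAllele(lk, 0.2, favorReference));
        System.out.println(bestAllele(lk, 0.05, favorReference));
    }
}
```

With a threshold of 0.2 the reference (index 0) is close enough to the best allele to win the tie; with 0.05 it is not, and index 1 is kept.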
-
bestAllelesBreakingTies
Default version where ties are broken in favor of the reference allele -
bestAllelesBreakingTies
public Collection<AlleleLikelihoods<EVIDENCE,A>.BestAllele> bestAllelesBreakingTies(String sample, Function<A, Double> tieBreakingPriority) Returns the collection of best allele estimates for one sample's evidence based on the evidence-likelihoods. "Ties" where the ref likelihood is withinAlleleLikelihoods.INFORMATIVE_THRESHOLD
of the greatest likelihood are broken by thetieBreakingPriority
function.- Returns:
- never
null
, one element per unit of evidence in the evidence-likelihoods collection. - Throws:
IllegalStateException
- if there are no alleles.
-
bestAllelesBreakingTies
Default version where ties are broken in favor of the reference allele -
evidenceCount
public int evidenceCount()Returns the total count of evidence in the evidence-likelihood collection. -
sampleEvidenceCount
public int sampleEvidenceCount(int sampleIndex) Returns the quantity of evidence that belongs to a sample in the evidence-likelihood collection.- Parameters:
sampleIndex
- the query sample index.- Returns:
- 0 or greater.
- Throws:
IllegalArgumentException
- ifsampleIndex
is not a valid sample index.
-
retainEvidence
Remove those reads that do not comply with a requirement.- Parameters:
predicate
- the predicate representing the requirement.This method modifies the current read-likelihoods collection.
Any exception thrown by the predicate will be propagated to the calling code.
- Throws:
IllegalArgumentException
- ifpredicate
isnull
.
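The in-place filtering described above has to keep the evidence list and the likelihood matrix aligned. A minimal sketch of that bookkeeping, using hypothetical names (RetainEvidenceSketch, retain) and an assumed lk[allele][evidence] layout rather than the library's internal storage:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class RetainEvidenceSketch {
    /**
     * Removes from the evidence list every element that fails the predicate and
     * returns a matrix containing only the columns of the retained evidence.
     * Exceptions thrown by the predicate propagate to the caller.
     */
    static <E> double[][] retain(List<E> evidence, double[][] lk, Predicate<? super E> predicate) {
        if (predicate == null) {
            throw new IllegalArgumentException("the predicate cannot be null");
        }
        List<Integer> keptIndices = new ArrayList<>();
        List<E> kept = new ArrayList<>();
        for (int e = 0; e < evidence.size(); e++) {
            if (predicate.test(evidence.get(e))) {
                keptIndices.add(e);
                kept.add(evidence.get(e));
            }
        }
        double[][] result = new double[lk.length][keptIndices.size()];
        for (int a = 0; a < lk.length; a++) {
            for (int k = 0; k < keptIndices.size(); k++) {
                result[a][k] = lk[a][keptIndices.get(k)];
            }
        }
        // Update the caller's list only after the new matrix has been built.
        evidence.clear();
        evidence.addAll(kept);
        return result;
    }

    public static void main(String[] args) {
        List<String> evidence = new ArrayList<>(Arrays.asList("a", "bb", "c"));
        double[][] lk = {{1.0, 2.0, 3.0}, {4.0, 5.0, 6.0}};
        double[][] filtered = retain(evidence, lk, s -> s.length() == 1);
        System.out.println(evidence + " " + Arrays.deepToString(filtered));
    }
}
```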
-
maximumLikelihoodOverAllAlleles
protected double maximumLikelihoodOverAllAlleles(int sampleIndex, int evidenceIndex) -
setVariantCallingSubsetUsed
-
getVariantCallingSubsetApplied
Returns the location used for subsetting. May be null. -
changeAlleles
Replaces the alleles in the readLikelihood matrix. Relevant for the uncollapsing code- Parameters:
newAlleles
-
-
getFilteredHaplotypeCount
public int getFilteredHaplotypeCount() -
setFilteredHaplotypeCount
public void setFilteredHaplotypeCount(int filteredHaplotypeCount) -
filterPoorlyModeledEvidence
Removes those reads whose best possible likelihood given any allele is too low.This is determined by a maximum error per read-base against the best likelihood possible.
- Parameters:
log10MinTrueLikelihood
- Function that returns the minimum likelihood that the best allele for a unit of evidence must have- Throws:
IllegalStateException
- if the read-likelihoods collection does not contain alleles.IllegalArgumentException
- ifmaximumErrorPerBase
is negative.
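The filtering criterion can be sketched as a comparison between each unit's best likelihood over all alleles and a per-evidence minimum. The code below is an illustration under assumptions, not the library's implementation: FilterPoorlyModeledSketch and retainWellModeled are hypothetical names, the matrix layout is lk[allele][evidence], and the sketch returns the retained evidence instead of modifying a collection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToDoubleFunction;

class FilterPoorlyModeledSketch {
    /**
     * Keeps the evidence whose best likelihood over all alleles is at least
     * the minimum returned by log10MinTrueLikelihood for that evidence.
     */
    static <E> List<E> retainWellModeled(List<E> evidence, double[][] lk,
                                         ToDoubleFunction<E> log10MinTrueLikelihood) {
        if (lk.length == 0) {
            throw new IllegalStateException("the likelihoods contain no alleles");
        }
        List<E> kept = new ArrayList<>();
        for (int e = 0; e < evidence.size(); e++) {
            double best = Double.NEGATIVE_INFINITY;
            for (double[] alleleRow : lk) {
                best = Math.max(best, alleleRow[e]);
            }
            if (best >= log10MinTrueLikelihood.applyAsDouble(evidence.get(e))) {
                kept.add(evidence.get(e));
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> reads = Arrays.asList("r1", "r2");
        double[][] lk = {{-1.0, -9.0}, {-2.0, -8.0}};
        // A constant minimum of -3 for every read; a real function could depend on read length.
        System.out.println(retainWellModeled(reads, lk, read -> -3.0));
    }
}
```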
-