Class AlleleLikelihoods<EVIDENCE extends htsjdk.samtools.util.Locatable,A extends htsjdk.variant.variantcontext.Allele>

java.lang.Object
org.broadinstitute.hellbender.utils.genotyper.AlleleLikelihoods<EVIDENCE,A>
Type Parameters:
A - the type of the allele the likelihood makes reference to. Note: this class uses FastUtil collections for speed.
All Implemented Interfaces:
AlleleList<A>, SampleList

public class AlleleLikelihoods<EVIDENCE extends htsjdk.samtools.util.Locatable,A extends htsjdk.variant.variantcontext.Allele> extends Object implements SampleList, AlleleList<A>
Evidence-likelihoods container implementation based on integer indexed arrays.
  • Field Details

    • LOG_10_INFORMATIVE_THRESHOLD

      public static final double LOG_10_INFORMATIVE_THRESHOLD
    • NATURAL_LOG_INFORMATIVE_THRESHOLD

      public static final double NATURAL_LOG_INFORMATIVE_THRESHOLD
    • filteredEvidenceBySampleIndex

      protected final List<List<EVIDENCE extends htsjdk.samtools.util.Locatable>> filteredEvidenceBySampleIndex
      Evidence that has been filtered out (disqualified), indexed by sample; see filteredSampleEvidence(int).
    • samples

      protected final SampleList samples
      Sample list.
    • alleles

      protected AlleleList<A extends htsjdk.variant.variantcontext.Allele> alleles
      Allele list.
  • Constructor Details

    • AlleleLikelihoods

      public AlleleLikelihoods(SampleList samples, AlleleList<A> alleles, Map<String,List<EVIDENCE>> evidenceBySample)
      Constructs a new evidence-likelihood collection.

      The initial likelihoods for all allele-evidence combinations are 0.

      Parameters:
      samples - all supported samples in the collection.
      alleles - all supported alleles in the collection.
      evidenceBySample - evidence stratified per sample.
      Throws:
      IllegalArgumentException - if any of alleles, samples or evidenceBySample is null, or if they contain null values.
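      As an illustration of the constructor contract above, the following minimal sketch mimics the documented behaviour with plain Java arrays. It does not use the GATK classes: the class name LikelihoodArraySketch, the [sample][allele][evidence] index order, and the use of strings for samples, alleles and evidence are assumptions made only for this example.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the container; not the GATK implementation.
class LikelihoodArraySketch {
    final List<String> samples;
    final List<String> alleles;
    // Assumed index order: values[sample][allele][evidence].
    final double[][][] values;

    LikelihoodArraySketch(List<String> samples, List<String> alleles,
                          Map<String, List<String>> evidenceBySample) {
        if (samples == null || alleles == null || evidenceBySample == null) {
            throw new IllegalArgumentException("arguments must not be null");
        }
        // List.copyOf throws NullPointerException on null elements.
        this.samples = List.copyOf(samples);
        this.alleles = List.copyOf(alleles);
        values = new double[samples.size()][alleles.size()][];
        for (int s = 0; s < samples.size(); s++) {
            // A sample without an entry in the map simply has no evidence here.
            int n = evidenceBySample.getOrDefault(samples.get(s), List.of()).size();
            for (int a = 0; a < alleles.size(); a++) {
                // Java zero-initialises arrays, so every likelihood starts at 0.
                values[s][a] = new double[n];
            }
        }
    }
}
```

      In this sketch a likelihood is then addressed as values[sampleIndex][alleleIndex][evidenceIndex], which is what "integer indexed arrays" is taken to mean here.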
  • Method Details

    • isNaturalLog

      public boolean isNaturalLog()
    • createAlleleLikelihoods

      public static <EVIDENCE extends htsjdk.samtools.util.Locatable, A extends htsjdk.variant.variantcontext.Allele> AlleleLikelihoods createAlleleLikelihoods(AlleleList alleles, SampleList samples, List<List<EVIDENCE>> evidenceBySampleIndex, List<List<EVIDENCE>> filteredEvidenceBySampleIndex, double[][][] values)
    • removeAllelesToSubset

      public AlleleLikelihoods<EVIDENCE,A> removeAllelesToSubset(Collection<A> subsetOfAlleles)
      Removes subset of alleles
    • indexOfSample

      public int indexOfSample(String sample)
      Returns the index of a sample within the likelihood collection.
      Specified by:
      indexOfSample in interface SampleList
      Parameters:
      sample - the query sample.
      Returns:
      -1 if the sample is not included, 0 or greater otherwise.
      Throws:
      IllegalArgumentException - if sample is null.
    • numberOfSamples

      public int numberOfSamples()
      Number of samples included in the likelihood collection.
      Specified by:
      numberOfSamples in interface SampleList
      Returns:
      0 or greater.
    • getSample

      public String getSample(int sampleIndex)
      Returns sample name given its index.
      Specified by:
      getSample in interface SampleList
      Parameters:
      sampleIndex - query index.
      Returns:
      never null.
      Throws:
      IllegalArgumentException - if sampleIndex is negative.
    • indexOfAllele

      public int indexOfAllele(htsjdk.variant.variantcontext.Allele allele)
      Returns the index of an allele within the likelihood collection.
      Specified by:
      indexOfAllele in interface AlleleList<A extends htsjdk.variant.variantcontext.Allele>
      Parameters:
      allele - the query allele.
      Returns:
      -1 if the allele is not included, 0 or greater otherwise.
      Throws:
      IllegalArgumentException - if allele is null.
    • numberOfAlleles

      public int numberOfAlleles()
      Returns number of alleles in the collection.
      Specified by:
      numberOfAlleles in interface AlleleList<A extends htsjdk.variant.variantcontext.Allele>
      Returns:
      0 or greater.
    • getAllele

      public A getAllele(int alleleIndex)
      Returns the allele given its index.
      Specified by:
      getAllele in interface AlleleList<A extends htsjdk.variant.variantcontext.Allele>
      Parameters:
      alleleIndex - the allele index.
      Returns:
      never null.
      Throws:
      IllegalArgumentException - if the allele index is not valid.
    • sampleEvidence

      public List<EVIDENCE> sampleEvidence(int sampleIndex)
      Returns the units of evidence that belong to a sample sorted by their index (within that sample).
      Parameters:
      sampleIndex - the requested sample.
      Returns:
      never null, but possibly an empty list if there is no evidence in the sample. No element in the list will be null.
    • filteredSampleEvidence

      public List<EVIDENCE> filteredSampleEvidence(int sampleIndex)
      Returns the units of evidence that have been removed by PairHMM error score filtering (and intentionally not evidence filtered by any other mechanism).
      Parameters:
      sampleIndex - the requested sample.
      Returns:
      never null, but possibly an empty list if there is no filtered evidence for the sample. No element in the list will be null.
    • sampleMatrix

      public LikelihoodMatrix<EVIDENCE,A> sampleMatrix(int sampleIndex)
      Returns an evidence vs allele likelihood matrix corresponding to a sample.
      Parameters:
      sampleIndex - target sample.
      Returns:
      never null
      Throws:
      IllegalArgumentException - if sampleIndex is not a valid sample index.
    • switchToNaturalLog

      public void switchToNaturalLog()
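      The method carries no description here. As a reminder of the arithmetic involved in moving from log10 to natural-log likelihoods, the sketch below rescales a matrix in place using ln(x) = log10(x) * ln(10). That switchToNaturalLog() does exactly this is an assumption; the class and method names are hypothetical.

```java
// Hypothetical helper; illustrates the change of logarithm base only.
class LogBaseSketch {
    // ln(x) = log10(x) * ln(10)
    static final double LN_10 = Math.log(10.0);

    // Converts every log10 value in the matrix to its natural-log equivalent.
    static void toNaturalLogInPlace(double[][] log10Values) {
        for (double[] row : log10Values) {
            for (int i = 0; i < row.length; i++) {
                row[i] *= LN_10;
            }
        }
    }
}
```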
    • contaminationDownsampling

      public void contaminationDownsampling(Map<String,Double> perSampleDownsamplingFraction)
      Downsamples reads based on contamination fractions making sure that all alleles are affected proportionally.
      Parameters:
      perSampleDownsamplingFraction - contamination sample map where the sample names are the keys and the fractions are the values.
      Throws:
      IllegalArgumentException - if perSampleDownsamplingFraction is null.
    • normalizeLikelihoods

      public void normalizeLikelihoods(double maximumLikelihoodDifferenceCap, boolean symmetricallyNormalizeAllelesToReference)
      Adjusts likelihoods so that for each unit of evidence, the best allele likelihood is 0 and caps the minimum likelihood of any allele for each unit of evidence based on the maximum alternative allele likelihood.
      Parameters:
      maximumLikelihoodDifferenceCap - maximum difference between the best alternative allele likelihood and any other likelihood.
      Throws:
      IllegalArgumentException - if maximumLikelihoodDifferenceCap is not 0 or less.
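      The normalization described above can be sketched for a single unit of evidence as follows. This is one possible reading of the description, not the GATK code: the order of the two steps (cap first, then shift), the handling of the case with no alternative allele, and the names NormalizeSketch and normalize are assumptions, and the symmetricallyNormalizeAllelesToReference flag is not modelled.

```java
// Hypothetical sketch of per-evidence normalization; not the GATK implementation.
class NormalizeSketch {
    // lk[a] is the log-likelihood of allele a for one unit of evidence.
    static double[] normalize(double[] lk, int refIndex, double cap) {
        if (cap > 0) {
            throw new IllegalArgumentException("cap must be 0 or less");
        }
        double best = Double.NEGATIVE_INFINITY;
        double bestAlt = Double.NEGATIVE_INFINITY;
        for (int a = 0; a < lk.length; a++) {
            best = Math.max(best, lk[a]);
            if (a != refIndex) {
                bestAlt = Math.max(bestAlt, lk[a]);
            }
        }
        if (bestAlt == Double.NEGATIVE_INFINITY) {
            bestAlt = best; // assumption: with no alternative allele, cap against the best
        }
        double floor = bestAlt + cap; // lowest value any allele may keep
        double[] out = new double[lk.length];
        for (int a = 0; a < lk.length; a++) {
            // Cap from below, then shift so that the best allele ends up at 0.
            out[a] = Math.max(lk[a], floor) - best;
        }
        return out;
    }
}
```

      For example, with likelihoods {-1, -10, -50}, reference index 0 and a cap of -20, the floor is -30, so the sketch returns {0, -9, -29}.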
    • samples

      public List<String> samples()
      Returns the samples in this evidence-likelihood collection.

      Samples are sorted by their index in the collection.

      The returned list is an unmodifiable view on the evidence-likelihoods sample list.

      Returns:
      never null.
    • alleles

      public List<A> alleles()
      Returns the alleles in this evidence-likelihood collection.

      Alleles are sorted by their index in the collection.

      The returned list is unmodifiable. It will not be updated if the collection allele list changes.

      Returns:
      never null.
    • changeEvidence

      public void changeEvidence(Map<EVIDENCE,EVIDENCE> evidenceReplacements)
    • addMissingAlleles

      public boolean addMissingAlleles(Collection<A> candidateAlleles, double defaultLikelihood)
      Add alleles that are missing in the evidence-likelihoods collection giving all evidence a default likelihood value.
      Parameters:
      candidateAlleles - the potentially missing alleles.
      defaultLikelihood - the default evidence likelihood value for that allele.
      Returns:
      true iff the evidence-likelihood collection was modified by the addition of the input alleles. If all the alleles in the input collection were already present in the evidence-likelihood collection, this method returns false.
      Throws:
      IllegalArgumentException - if candidateAlleles is null or there is more than one missing allele that is a reference or there is one but the collection already has a reference allele.
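      The return-value contract above can be illustrated with a small stand-alone sketch. Alleles are plain strings here and the reference-allele checks are omitted; the class AddMissingAllelesSketch and its fields are hypothetical and only mirror the documented behaviour for a single sample.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical single-sample stand-in; not the GATK implementation.
class AddMissingAllelesSketch {
    final List<String> alleles = new ArrayList<>();
    final List<double[]> rows = new ArrayList<>(); // rows.get(allele)[evidence]
    final int evidenceCount;

    AddMissingAllelesSketch(int evidenceCount) {
        this.evidenceCount = evidenceCount;
    }

    // Returns true only if at least one candidate was actually added.
    boolean addMissingAlleles(Collection<String> candidates, double defaultLikelihood) {
        if (candidates == null) {
            throw new IllegalArgumentException("candidates is null");
        }
        boolean modified = false;
        for (String allele : candidates) {
            if (alleles.contains(allele)) {
                continue; // already present: nothing to do
            }
            double[] row = new double[evidenceCount];
            Arrays.fill(row, defaultLikelihood); // every unit of evidence gets the default
            alleles.add(allele);
            rows.add(row);
            modified = true;
        }
        return modified;
    }
}
```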
    • groupEvidence

      public <U, NEW_EVIDENCE_TYPE extends htsjdk.samtools.util.Locatable> AlleleLikelihoods<NEW_EVIDENCE_TYPE,A> groupEvidence(Function<EVIDENCE,U> groupingFunction, Function<List<EVIDENCE>,NEW_EVIDENCE_TYPE> gather)
      Group evidence into lists of evidence -- for example group by read name to force read pairs to support a single haplotype. Log Likelihoods are summed over all evidence in a group, corresponding to an independent evidence assumption. Since this container's likelihoods generally pertain to sequencing only (and not sample prep etc) this is usually a good assumption.
      Parameters:
      groupingFunction - Attribute function for grouping evidence, for example GATKRead::getName
      gather - Transformation applied to collections of evidence with same value of groupingFunction. For example, Fragment::new to construct a fragment out of a pair of reads with the same name
      Returns:
      a new AlleleLikelihoods based on the grouped, transformed evidence.
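      The summation of log-likelihoods within a group can be sketched without the GATK types. In the example below, evidence is grouped by a key function and each group's per-allele log-likelihoods are added, which corresponds to the independence assumption mentioned above. The class GroupEvidenceSketch, the [allele][evidence] layout and the returned map are assumptions for illustration; the gather step that builds the new evidence objects is left out.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch of grouping for one sample; not the GATK implementation.
class GroupEvidenceSketch {
    // likelihoods[allele][evidence]; returns group key -> summed log-likelihood per allele.
    static <E, U> Map<U, double[]> sumByGroup(List<E> evidence, double[][] likelihoods,
                                              Function<E, U> groupingFunction) {
        Map<U, double[]> grouped = new LinkedHashMap<>(); // keeps first-seen group order
        for (int e = 0; e < evidence.size(); e++) {
            double[] sums = grouped.computeIfAbsent(
                    groupingFunction.apply(evidence.get(e)),
                    key -> new double[likelihoods.length]);
            for (int a = 0; a < likelihoods.length; a++) {
                sums[a] += likelihoods[a][e]; // adding logs multiplies the probabilities
            }
        }
        return grouped;
    }
}
```

      Grouping three reads named pair1/1, pair1/2 and pair2/1 by the part of the name before the slash yields two groups; the first carries the sum of the two mates' log-likelihoods for each allele.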
    • marginalize

      public <B extends htsjdk.variant.variantcontext.Allele> AlleleLikelihoods<EVIDENCE,B> marginalize(Map<B,List<A>> newToOldAlleleMap)
      Perform marginalization from an allele set to another (smaller one) taking the maximum value for each evidence in the original allele subset.
      Parameters:
      newToOldAlleleMap - map where the keys are the new alleles and the value list the original alleles that correspond to the new one.
      Returns:
      never null. The result will have the requested set of new alleles (the keys in newToOldAlleleMap) and the same set of samples and evidence as the original.
      Throws:
      IllegalArgumentException - if newToOldAlleleMap is null or contains null values, or if its values refer to alleles that do not exist in this evidence-likelihood collection. In addition, no new allele may map to zero old alleles, and no two new alleles may refer to the same old allele.
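      Marginalization by maximum, as described above, can be sketched for one sample with plain arrays. The names below (MarginalizeSketch, string allele names, integer indices for the old alleles) are assumptions for illustration; the check that no two new alleles share an old allele is omitted for brevity.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical single-sample sketch; not the GATK implementation.
class MarginalizeSketch {
    // oldLikelihoods[oldAllele][evidence]; newToOld maps a new allele to old allele indices.
    static Map<String, double[]> marginalize(double[][] oldLikelihoods, int evidenceCount,
                                             Map<String, List<Integer>> newToOld) {
        if (newToOld == null) {
            throw new IllegalArgumentException("map is null");
        }
        Map<String, double[]> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Integer>> entry : newToOld.entrySet()) {
            if (entry.getValue().isEmpty()) {
                throw new IllegalArgumentException("new allele maps to no old allele");
            }
            double[] best = new double[evidenceCount];
            Arrays.fill(best, Double.NEGATIVE_INFINITY);
            for (int oldIndex : entry.getValue()) {
                for (int e = 0; e < evidenceCount; e++) {
                    // Keep the maximum over the old alleles merged into this new allele.
                    best[e] = Math.max(best[e], oldLikelihoods[oldIndex][e]);
                }
            }
            result.put(entry.getKey(), best);
        }
        return result;
    }
}
```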
    • addEvidence

      public void addEvidence(Map<String,List<EVIDENCE>> evidenceBySample, double initialLikelihood)
      Add more evidence to the collection.
      Parameters:
      evidenceBySample - evidence to add.
      initialLikelihood - the likelihood for the new entries.
      Throws:
      IllegalArgumentException - if evidenceBySample is null or evidenceBySample contains null evidence, or evidenceBySample contains evidence already present in the evidence-likelihood collection.
    • addNonReferenceAllele

      public void addNonReferenceAllele(A nonRefAllele)
      Adds the non-reference allele to the evidence-likelihood collection setting each evidence likelihood to the second best found (or best one if only one allele has likelihood).

      Nothing will happen if the evidence-likelihoods collection already includes the non-ref allele

      Implementation note: strictly speaking, the calling code would not need to pass a reference to the non-ref allele, but we still require it in order to lead the calling code to use the right generic type for this likelihoods collection Allele.

      Parameters:
      nonRefAllele - the non-ref allele.
      Throws:
      IllegalArgumentException - if nonRefAllele is anything but the designated <NON_REF> symbolic allele Allele.NON_REF_ALLELE.
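      The "second best, or best if there is only one" rule described above can be sketched for a single unit of evidence as follows. The class and method names are hypothetical, and treating two alleles tied for best as best and second best is an assumption of this example.

```java
// Hypothetical sketch of the value assigned to the non-ref allele for one unit of evidence.
class NonRefLikelihoodSketch {
    static double secondBest(double[] alleleLikelihoods) {
        if (alleleLikelihoods.length == 0) {
            throw new IllegalArgumentException("no alleles");
        }
        double best = Double.NEGATIVE_INFINITY;
        double second = Double.NEGATIVE_INFINITY;
        for (double lk : alleleLikelihoods) {
            if (lk > best) {
                second = best; // previous best becomes the runner-up
                best = lk;
            } else if (lk > second) {
                second = lk;
            }
        }
        // With a single allele there is no runner-up, so fall back to the best.
        return alleleLikelihoods.length == 1 ? best : second;
    }
}
```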
    • updateNonRefAlleleLikelihoods

      public void updateNonRefAlleleLikelihoods()
      Updates the likelihoods of the non-ref allele, if present, considering all non-symbolic alleles available.
    • updateNonRefAlleleLikelihoods

      public void updateNonRefAlleleLikelihoods(AlleleList<A> allelesToConsider)
      Updates the likelihood of the NonRef allele (if present) based on the likelihoods of a set of non-symbolic alleles.
    • bestAllelesBreakingTies

      public Collection<AlleleLikelihoods<EVIDENCE,A>.BestAllele> bestAllelesBreakingTies(Function<A,Double> tieBreakingPriority)
      Returns the collection of best allele estimates for the evidence based on the evidence-likelihoods. "Ties" where the ref likelihood is within AlleleLikelihoods.INFORMATIVE_THRESHOLD of the greatest likelihood are broken by the tieBreakingPriority function.
      Returns:
      never null, one element per unit of evidence in the evidence-likelihoods collection.
      Throws:
      IllegalStateException - if there are no alleles.
    • bestAllelesBreakingTies

      public Collection<AlleleLikelihoods<EVIDENCE,A>.BestAllele> bestAllelesBreakingTies()
      Default version where ties are broken in favor of the reference allele
    • bestAllelesBreakingTies

      public Collection<AlleleLikelihoods<EVIDENCE,A>.BestAllele> bestAllelesBreakingTies(String sample, Function<A,Double> tieBreakingPriority)
      Returns the collection of best allele estimates for one sample's evidence based on the evidence-likelihoods. "Ties" where the ref likelihood is within AlleleLikelihoods.INFORMATIVE_THRESHOLD of the greatest likelihood are broken by the tieBreakingPriority function.
      Returns:
      never null, one element per unit of evidence in the evidence-likelihoods collection.
      Throws:
      IllegalStateException - if there are no alleles.
    • bestAllelesBreakingTies

      public Collection<AlleleLikelihoods<EVIDENCE,A>.BestAllele> bestAllelesBreakingTies(String sample)
      Default version where ties are broken in favor of the reference allele
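      The tie-breaking rule shared by these overloads can be sketched for one unit of evidence. The sketch below is an interpretation of the description, not the GATK code: the threshold value 0.2, the use of allele indices instead of Allele objects, and the names BestAlleleSketch and bestAllele are all assumptions.

```java
import java.util.function.IntToDoubleFunction;

// Hypothetical sketch of best-allele selection with tie breaking.
class BestAlleleSketch {
    // Assumed value, for this example only.
    static final double INFORMATIVE_THRESHOLD = 0.2;

    // lk[a] is the log-likelihood of allele a; refIndex is -1 if there is no reference.
    static int bestAllele(double[] lk, int refIndex, IntToDoubleFunction priority) {
        if (lk.length == 0) {
            throw new IllegalStateException("no alleles");
        }
        int best = 0;
        for (int a = 1; a < lk.length; a++) {
            if (lk[a] > lk[best]) {
                best = a;
            }
        }
        // A "tie": the reference is not the best, but lies within the threshold of it.
        boolean refIsClose = refIndex >= 0 && refIndex != best
                && lk[best] - lk[refIndex] < INFORMATIVE_THRESHOLD;
        if (refIsClose && priority.applyAsDouble(refIndex) > priority.applyAsDouble(best)) {
            return refIndex;
        }
        return best;
    }
}
```

      Passing a priority function such as a -> a == refIndex ? 1.0 : 0.0 reproduces the default behaviour of favouring the reference allele in this sketch.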
    • evidenceCount

      public int evidenceCount()
      Returns the total count of evidence in the evidence-likelihood collection.
    • sampleEvidenceCount

      public int sampleEvidenceCount(int sampleIndex)
      Returns the quantity of evidence that belongs to a sample in the evidence-likelihood collection.
      Parameters:
      sampleIndex - the query sample index.
      Returns:
      0 or greater.
      Throws:
      IllegalArgumentException - if sampleIndex is not a valid sample index.
    • retainEvidence

      public void retainEvidence(Predicate<? super EVIDENCE> predicate)
      Remove those reads that do not comply with a requirement.

      This method modifies the current read-likelihoods collection.

      Any exception thrown by the predicate will be propagated to the calling code.

      Parameters:
      predicate - the predicate representing the requirement.

      Throws:
      IllegalArgumentException - if predicate is null.
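      Removing evidence means that the evidence list and the likelihood columns must stay aligned. A minimal single-sample sketch of that bookkeeping is shown below; the class RetainEvidenceSketch, the string evidence and the [allele][evidence] layout are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical single-sample stand-in; not the GATK implementation.
class RetainEvidenceSketch {
    final List<String> evidence;
    double[][] likelihoods; // [allele][evidence]

    RetainEvidenceSketch(List<String> evidence, double[][] likelihoods) {
        this.evidence = new ArrayList<>(evidence);
        this.likelihoods = likelihoods;
    }

    // Keeps only the evidence accepted by the predicate, in its original order.
    void retainEvidence(Predicate<? super String> predicate) {
        if (predicate == null) {
            throw new IllegalArgumentException("predicate is null");
        }
        List<Integer> kept = new ArrayList<>();
        for (int e = 0; e < evidence.size(); e++) {
            if (predicate.test(evidence.get(e))) { // predicate exceptions propagate
                kept.add(e);
            }
        }
        double[][] filtered = new double[likelihoods.length][kept.size()];
        List<String> keptEvidence = new ArrayList<>();
        for (int k = 0; k < kept.size(); k++) {
            int e = kept.get(k);
            keptEvidence.add(evidence.get(e));
            for (int a = 0; a < likelihoods.length; a++) {
                filtered[a][k] = likelihoods[a][e]; // copy the surviving column
            }
        }
        evidence.clear();
        evidence.addAll(keptEvidence);
        likelihoods = filtered;
    }
}
```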
    • maximumLikelihoodOverAllAlleles

      protected double maximumLikelihoodOverAllAlleles(int sampleIndex, int evidenceIndex)
    • setVariantCallingSubsetUsed

      public void setVariantCallingSubsetUsed(SimpleInterval loc)
    • getVariantCallingSubsetApplied

      public SimpleInterval getVariantCallingSubsetApplied()
      Returns the location used for subsetting. May be null.
    • changeAlleles

      public void changeAlleles(List<A> newAlleles)
      Replaces the alleles in the readLikelihood matrix. Relevant for the uncollapsing code
      Parameters:
      newAlleles - the replacement alleles.
    • getFilteredHaplotypeCount

      public int getFilteredHaplotypeCount()
    • setFilteredHaplotypeCount

      public void setFilteredHaplotypeCount(int filteredHaplotypeCount)
    • filterPoorlyModeledEvidence

      public void filterPoorlyModeledEvidence(ToDoubleFunction<EVIDENCE> log10MinTrueLikelihood)
      Removes those reads whose best possible likelihood, given any allele, is too low.

      This is determined by a maximum error per read-base against the best likelihood possible.

      Parameters:
      log10MinTrueLikelihood - Function that returns the minimum likelihood that the best allele for a unit of evidence must have
      Throws:
      IllegalStateException - if the read-likelihoods collection does not contain alleles.
      IllegalArgumentException - if maximumErrorPerBase is negative.
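      The filtering criterion can be sketched as follows: a unit of evidence is kept only if its best likelihood over all alleles reaches the minimum returned by the supplied function. The names below are hypothetical, the comparison being inclusive is an assumption, and the example threshold of -0.5 per base is made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical single-sample sketch; not the GATK implementation.
class PoorlyModeledFilterSketch {
    // log10Likelihoods[allele][evidence]; returns the evidence that passes the filter.
    static <E> List<E> wellModeled(List<E> evidence, double[][] log10Likelihoods,
                                   ToDoubleFunction<E> log10MinTrueLikelihood) {
        if (log10Likelihoods.length == 0) {
            throw new IllegalStateException("no alleles");
        }
        List<E> kept = new ArrayList<>();
        for (int e = 0; e < evidence.size(); e++) {
            double best = Double.NEGATIVE_INFINITY;
            for (double[] alleleRow : log10Likelihoods) {
                best = Math.max(best, alleleRow[e]); // best likelihood over all alleles
            }
            if (best >= log10MinTrueLikelihood.applyAsDouble(evidence.get(e))) {
                kept.add(evidence.get(e));
            }
        }
        return kept;
    }
}
```

      With a threshold function such as read -> -0.5 * read.length(), longer reads are allowed a proportionally lower best likelihood before being dropped.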