Class AlleleLikelihoods<EVIDENCE extends htsjdk.samtools.util.Locatable,A extends htsjdk.variant.variantcontext.Allele>

Type Parameters:
A - the type of the allele the likelihood makes reference to. Note: this class uses FastUtil collections for speed.
All Implemented Interfaces:
AlleleList<A>, SampleList

public class AlleleLikelihoods<EVIDENCE extends htsjdk.samtools.util.Locatable,A extends htsjdk.variant.variantcontext.Allele> extends Object implements SampleList, AlleleList<A>
Evidence-likelihoods container implementation based on integer indexed arrays.
  • Field Details


    • LOG_10_INFORMATIVE_THRESHOLD

      public static final double LOG_10_INFORMATIVE_THRESHOLD
    • NATURAL_LOG_INFORMATIVE_THRESHOLD

      public static final double NATURAL_LOG_INFORMATIVE_THRESHOLD
    • filteredEvidenceBySampleIndex

      protected final List<List<EVIDENCE extends htsjdk.samtools.util.Locatable>> filteredEvidenceBySampleIndex
      Evidence disqualified by filterPoorlyModeledEvidence(ToDoubleFunction).
    • samples

      protected final SampleList samples
      Sample list.
    • alleles

      protected AlleleList<A extends htsjdk.variant.variantcontext.Allele> alleles
      Allele list.
  • Constructor Details

    • AlleleLikelihoods

      public AlleleLikelihoods(SampleList samples, AlleleList<A> alleles, Map<String,List<EVIDENCE>> evidenceBySample)
      Constructs a new evidence-likelihood collection.

      The initial likelihoods for all allele-evidence combinations are 0.

      samples - all supported samples in the collection.
      alleles - all supported alleles in the collection.
      evidenceBySample - evidence stratified per sample.
      IllegalArgumentException - if any of alleles, samples or evidenceBySample is null, or if they contain null values.
  • Method Details

    • isNaturalLog

      public boolean isNaturalLog()
    • createAlleleLikelihoods

      public static <EVIDENCE extends htsjdk.samtools.util.Locatable, A extends htsjdk.variant.variantcontext.Allele> AlleleLikelihoods createAlleleLikelihoods(AlleleList alleles, SampleList samples, List<List<EVIDENCE>> evidenceBySampleIndex, List<List<EVIDENCE>> filteredEvidenceBySampleIndex, double[][][] values)
    • removeAllelesToSubset

      public AlleleLikelihoods<EVIDENCE,A> removeAllelesToSubset(Collection<A> subsetOfAlleles)
      Removes subset of alleles
    • indexOfSample

      public int indexOfSample(String sample)
      Returns the index of a sample within the likelihood collection.
      Specified by:
      indexOfSample in interface SampleList
      sample - the query sample.
      -1 if the sample is not included, 0 or greater otherwise.
      IllegalArgumentException - if sample is null.
    • numberOfSamples

      public int numberOfSamples()
      Number of samples included in the likelihood collection.
      Specified by:
      numberOfSamples in interface SampleList
      0 or greater.
    • getSample

      public String getSample(int sampleIndex)
      Returns sample name given its index.
      Specified by:
      getSample in interface SampleList
      sampleIndex - query index.
      never null.
      IllegalArgumentException - if sampleIndex is negative.
    • indexOfAllele

      public int indexOfAllele(htsjdk.variant.variantcontext.Allele allele)
      Returns the index of an allele within the likelihood collection.
      Specified by:
      indexOfAllele in interface AlleleList<A extends htsjdk.variant.variantcontext.Allele>
      allele - the query allele.
      -1 if the allele is not included, 0 or greater otherwise.
      IllegalArgumentException - if allele is null.
    • numberOfAlleles

      public int numberOfAlleles()
      Returns number of alleles in the collection.
      Specified by:
      numberOfAlleles in interface AlleleList<A extends htsjdk.variant.variantcontext.Allele>
      0 or greater.
    • getAllele

      public A getAllele(int alleleIndex)
      Returns the allele given its index.
      Specified by:
      getAllele in interface AlleleList<A extends htsjdk.variant.variantcontext.Allele>
      alleleIndex - the allele index.
      never null.
      IllegalArgumentException - if alleleIndex is not a valid allele index.
    • sampleEvidence

      public List<EVIDENCE> sampleEvidence(int sampleIndex)
      Returns the units of evidence that belong to a sample sorted by their index (within that sample).
      sampleIndex - the requested sample.
      never null, but possibly an empty list if there is no evidence in the sample. No element in the list will be null.
    • filteredSampleEvidence

      public List<EVIDENCE> filteredSampleEvidence(int sampleIndex)
      Returns the units of evidence that have been removed by PairHMM error score filtering (and intentionally not evidence filtered by any other mechanism).
      sampleIndex - the requested sample.
      never null, but possibly an empty list if there is no filtered evidence for the sample. No element in the list will be null.
    • sampleMatrix

      public LikelihoodMatrix<EVIDENCE,A> sampleMatrix(int sampleIndex)
      Returns an evidence vs allele likelihood matrix corresponding to a sample.
      sampleIndex - target sample.
      never null
      IllegalArgumentException - if sampleIndex is not a valid sample index.
    • switchToNaturalLog

      public void switchToNaturalLog()
    • contaminationDownsampling

      public void contaminationDownsampling(Map<String,Double> perSampleDownsamplingFraction)
      Downsamples reads based on contamination fractions making sure that all alleles are affected proportionally.
      perSampleDownsamplingFraction - contamination sample map where the sample names are the keys and the fractions are the values.
      IllegalArgumentException - if perSampleDownsamplingFraction is null.
    • normalizeLikelihoods

      public void normalizeLikelihoods(double maximumLikelihoodDifferenceCap, boolean symmetricallyNormalizeAllelesToReference)
      Adjusts likelihoods so that for each unit of evidence, the best allele likelihood is 0 and caps the minimum likelihood of any allele for each unit of evidence based on the maximum alternative allele likelihood.
      maximumLikelihoodDifferenceCap - maximum difference between the best alternative allele likelihood and any other likelihood.
      IllegalArgumentException - if maximumLikelihoodDifferenceCap is not 0 or less.
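      The adjustment described above can be illustrated on a plain double[allele][evidence] array. This is only a minimal sketch under stated assumptions, not this class's implementation: NormalizeSketch and its normalize method are hypothetical names, the best likelihood is taken over all alleles, and the effect of symmetricallyNormalizeAllelesToReference is not modeled.

```java
import java.util.Arrays;

public class NormalizeSketch {
    /**
     * Hypothetical helper, not part of AlleleLikelihoods: shifts each evidence
     * column so that its best value is 0, then raises any value below the cap
     * up to the cap. values is indexed as [allele][evidence]; cap must be 0 or less.
     */
    public static void normalize(double[][] values, double cap) {
        if (cap > 0) {
            throw new IllegalArgumentException("cap must be 0 or less");
        }
        if (values.length == 0) {
            return;
        }
        int evidenceCount = values[0].length;
        for (int e = 0; e < evidenceCount; e++) {
            // Best likelihood for this unit of evidence across all alleles.
            double best = Double.NEGATIVE_INFINITY;
            for (double[] alleleRow : values) {
                best = Math.max(best, alleleRow[e]);
            }
            // Shift so the best becomes 0 and floor the rest at the cap.
            for (double[] alleleRow : values) {
                alleleRow[e] = Math.max(alleleRow[e] - best, cap);
            }
        }
    }

    public static void main(String[] args) {
        double[][] values = { {-1.0, -9.0}, {-4.0, -2.0} };
        normalize(values, -3.0);
        System.out.println(Arrays.deepToString(values));
    }
}
```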
    • samples

      public List<String> samples()
      Returns the samples in this evidence-likelihood collection.

      Samples are sorted by their index in the collection.

      The returned list is an unmodifiable view on the evidence-likelihoods sample list.

      never null.
    • alleles

      public List<A> alleles()
      Returns the alleles in this evidence-likelihood collection.

      Alleles are sorted by their index in the collection.

      The returned list is unmodifiable. It will not be updated if the collection allele list changes.

      never null.
    • changeEvidence

      public void changeEvidence(Map<EVIDENCE,EVIDENCE> evidenceReplacements)
    • addMissingAlleles

      public boolean addMissingAlleles(Collection<A> candidateAlleles, double defaultLikelihood)
      Add alleles that are missing in the evidence-likelihoods collection giving all evidence a default likelihood value.
      candidateAlleles - the potentially missing alleles.
      defaultLikelihood - the default evidence likelihood value for that allele.
      true iff the evidence-likelihood collection was modified by the addition of the input alleles. So if all the alleles in the input collection were already present in the evidence-likelihood collection this method will return false.
      IllegalArgumentException - if candidateAlleles is null or there is more than one missing allele that is a reference or there is one but the collection already has a reference allele.
    • groupEvidence

      public <U, NEW_EVIDENCE_TYPE extends htsjdk.samtools.util.Locatable> AlleleLikelihoods<NEW_EVIDENCE_TYPE,A> groupEvidence(Function<EVIDENCE,U> groupingFunction, Function<List<EVIDENCE>,NEW_EVIDENCE_TYPE> gather)
      Group evidence into lists of evidence -- for example group by read name to force read pairs to support a single haplotype. Log Likelihoods are summed over all evidence in a group, corresponding to an independent evidence assumption. Since this container's likelihoods generally pertain to sequencing only (and not sample prep etc) this is usually a good assumption.
      groupingFunction - Attribute function for grouping evidence, for example GATKRead::getName
      gather - Transformation applied to collections of evidence with same value of groupingFunction. For example, Fragment::new to construct a fragment out of a pair of reads with the same name
      a new AlleleLikelihoods based on the grouped, transformed evidence.
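      The summing of log likelihoods within a group can be illustrated without this class. The sketch below is an assumption-laden illustration: GroupSketch and sumByGroup are hypothetical names, and it handles a single allele's likelihoods with string read names standing in for evidence.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class GroupSketch {
    /**
     * Hypothetical helper, not part of AlleleLikelihoods: sums one allele's log
     * likelihoods over all evidence sharing a grouping key. Summing logs equals
     * multiplying probabilities, that is, treating grouped evidence as independent.
     */
    public static <E, U> Map<U, Double> sumByGroup(E[] evidence, double[] logLikelihoods,
                                                   Function<E, U> groupingFunction) {
        // LinkedHashMap keeps groups in order of first appearance.
        Map<U, Double> grouped = new LinkedHashMap<>();
        for (int i = 0; i < evidence.length; i++) {
            grouped.merge(groupingFunction.apply(evidence[i]), logLikelihoods[i], Double::sum);
        }
        return grouped;
    }

    public static void main(String[] args) {
        // Two reads named "pairA" play the role of a read pair.
        String[] readNames = {"pairA", "pairA", "pairB"};
        double[] logLikelihoods = {-1.0, -2.5, -0.5};
        System.out.println(sumByGroup(readNames, logLikelihoods, name -> name));
    }
}
```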
    • marginalize

      public <B extends htsjdk.variant.variantcontext.Allele> AlleleLikelihoods<EVIDENCE,B> marginalize(Map<B,List<A>> newToOldAlleleMap)
      Perform marginalization from an allele set to another (smaller one) taking the maximum value for each evidence in the original allele subset.
      newToOldAlleleMap - map where the keys are the new alleles and the value list the original alleles that correspond to the new one.
      never null. The result will have the requested set of new alleles (the keys in newToOldAlleleMap), and the same set of samples and evidence as the original.
      IllegalArgumentException - if newToOldAlleleMap is null or contains null values, or its values contain references to alleles that do not exist in this evidence-likelihood collection. Also, no new allele can have zero old alleles mapped to it, nor can two new alleles make reference to the same old allele.
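      Taking the maximum over the original alleles can be sketched on plain arrays. MarginalizeSketch and its marginalize method are hypothetical names, and alleles are represented by integer indices rather than Allele objects; this is an illustration of the rule described above, not this class's code.

```java
import java.util.Arrays;
import java.util.List;

public class MarginalizeSketch {
    /**
     * Hypothetical helper, not part of AlleleLikelihoods. oldValues is indexed
     * as [oldAllele][evidence]; newToOld.get(n) lists the old allele indices
     * that map to new allele n. Each new value is the maximum over those.
     */
    public static double[][] marginalize(double[][] oldValues, List<List<Integer>> newToOld,
                                         int evidenceCount) {
        double[][] result = new double[newToOld.size()][evidenceCount];
        for (int n = 0; n < newToOld.size(); n++) {
            if (newToOld.get(n).isEmpty()) {
                throw new IllegalArgumentException("new allele " + n + " maps to no old allele");
            }
            Arrays.fill(result[n], Double.NEGATIVE_INFINITY);
            for (int old : newToOld.get(n)) {
                for (int e = 0; e < evidenceCount; e++) {
                    result[n][e] = Math.max(result[n][e], oldValues[old][e]);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[][] oldValues = { {-1.0, -5.0}, {-3.0, -2.0}, {-4.0, -4.0} };
        // New allele 0 merges old alleles 0 and 1; new allele 1 keeps old allele 2.
        List<List<Integer>> newToOld = Arrays.asList(Arrays.asList(0, 1), Arrays.asList(2));
        System.out.println(Arrays.deepToString(marginalize(oldValues, newToOld, 2)));
    }
}
```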
    • addEvidence

      public void addEvidence(Map<String,List<EVIDENCE>> evidenceBySample, double initialLikelihood)
      Add more evidence to the collection.
      evidenceBySample - evidence to add.
      initialLikelihood - the likelihood for the new entries.
      IllegalArgumentException - if evidenceBySample is null or evidenceBySample contains null evidence, or evidenceBySample contains evidence already present in the evidence-likelihood collection.
    • addNonReferenceAllele

      public void addNonReferenceAllele(A nonRefAllele)
      Adds the non-reference allele to the evidence-likelihood collection setting each evidence likelihood to the second best found (or best one if only one allele has likelihood).

      Nothing will happen if the evidence-likelihoods collection already includes the non-ref allele

      Implementation note: strictly speaking, the calling code does not need to pass the non-ref allele, but we still require it so that the calling code uses the right generic Allele type for this likelihoods collection.

      nonRefAllele - the non-ref allele.
      IllegalArgumentException - if nonRefAllele is anything but the designated <NON_REF> symbolic allele Allele.NON_REF_ALLELE.
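      The "second best, or best if there is only one" rule stated above can be sketched on a plain array. NonRefSketch and nonRefLikelihoods are hypothetical names, "only one allele has likelihood" is read here as "only one allele is present", and the class's actual computation may differ.

```java
import java.util.Arrays;

public class NonRefSketch {
    /**
     * Hypothetical helper, not part of AlleleLikelihoods: for each unit of
     * evidence returns the second-best likelihood across alleles, or the best
     * one when there is a single allele. values is indexed as [allele][evidence]
     * and must hold at least one allele.
     */
    public static double[] nonRefLikelihoods(double[][] values) {
        int evidenceCount = values[0].length;
        double[] nonRef = new double[evidenceCount];
        for (int e = 0; e < evidenceCount; e++) {
            double best = Double.NEGATIVE_INFINITY;
            double secondBest = Double.NEGATIVE_INFINITY;
            for (double[] alleleRow : values) {
                double v = alleleRow[e];
                if (v > best) {
                    secondBest = best;
                    best = v;
                } else if (v > secondBest) {
                    secondBest = v;
                }
            }
            nonRef[e] = values.length > 1 ? secondBest : best;
        }
        return nonRef;
    }

    public static void main(String[] args) {
        double[][] values = { {-1.0, -6.0}, {-3.0, -2.0}, {-5.0, -4.0} };
        System.out.println(Arrays.toString(nonRefLikelihoods(values)));
    }
}
```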
    • updateNonRefAlleleLikelihoods

      public void updateNonRefAlleleLikelihoods()
      Updates the likelihoods of the non-ref allele, if present, considering all non-symbolic alleles available.
    • updateNonRefAlleleLikelihoods

      public void updateNonRefAlleleLikelihoods(AlleleList<A> allelesToConsider)
      Updates the likelihood of the NonRef allele (if present) based on the likelihoods of a set of non-symbolic alleles.
    • bestAllelesBreakingTies

      public Collection<AlleleLikelihoods<EVIDENCE,A>.BestAllele> bestAllelesBreakingTies(Function<A,Double> tieBreakingPriority)
      Returns the collection of best allele estimates for the evidence based on the evidence-likelihoods. "Ties" where the ref likelihood is within AlleleLikelihoods.INFORMATIVE_THRESHOLD of the greatest likelihood are broken by the tieBreakingPriority function.
      never null, one element per unit of evidence in the evidence-likelihoods collection.
      IllegalStateException - if there are no alleles.
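      One way to picture threshold-based tie breaking for a single unit of evidence is sketched below. This is a loose illustration, not this class's algorithm: BestAlleleSketch, bestAllele and the threshold parameter are hypothetical, alleles are integer indices, any allele within the threshold of the greatest likelihood is treated as tied (not only the reference), and the tied allele with the highest priority is assumed to win.

```java
import java.util.function.IntToDoubleFunction;

public class BestAlleleSketch {
    /**
     * Hypothetical helper, not part of AlleleLikelihoods: returns the index of
     * the chosen allele for one unit of evidence. Alleles within threshold of
     * the greatest likelihood count as tied; the highest priority among them wins.
     */
    public static int bestAllele(double[] likelihoods, double threshold, IntToDoubleFunction priority) {
        if (likelihoods.length == 0) {
            throw new IllegalStateException("no alleles");
        }
        double greatest = Double.NEGATIVE_INFINITY;
        for (double v : likelihoods) {
            greatest = Math.max(greatest, v);
        }
        int chosen = -1;
        for (int a = 0; a < likelihoods.length; a++) {
            // The equality check keeps the best allele tied even for infinite values.
            boolean tied = likelihoods[a] == greatest || greatest - likelihoods[a] <= threshold;
            if (tied && (chosen < 0 || priority.applyAsDouble(a) > priority.applyAsDouble(chosen))) {
                chosen = a;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        double[] likelihoods = {-0.1, 0.0, -3.0}; // index 0 plays the reference allele
        IntToDoubleFunction preferReference = a -> a == 0 ? 1.0 : 0.0;
        System.out.println(bestAllele(likelihoods, 0.2, preferReference));
        System.out.println(bestAllele(likelihoods, 0.05, preferReference));
    }
}
```

      In this sketch, favoring the reference simply means giving it the highest priority value.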
    • bestAllelesBreakingTies

      public Collection<AlleleLikelihoods<EVIDENCE,A>.BestAllele> bestAllelesBreakingTies()
      Default version where ties are broken in favor of the reference allele
    • bestAllelesBreakingTies

      public Collection<AlleleLikelihoods<EVIDENCE,A>.BestAllele> bestAllelesBreakingTies(String sample, Function<A,Double> tieBreakingPriority)
      Returns the collection of best allele estimates for one sample's evidence based on the evidence-likelihoods. "Ties" where the ref likelihood is within AlleleLikelihoods.INFORMATIVE_THRESHOLD of the greatest likelihood are broken by the tieBreakingPriority function.
      never null, one element per unit of evidence in the evidence-likelihoods collection.
      IllegalStateException - if there are no alleles.
    • bestAllelesBreakingTies

      public Collection<AlleleLikelihoods<EVIDENCE,A>.BestAllele> bestAllelesBreakingTies(String sample)
      Default version where ties are broken in favor of the reference allele
    • evidenceCount

      public int evidenceCount()
      Returns the total count of evidence in the evidence-likelihood collection.
    • sampleEvidenceCount

      public int sampleEvidenceCount(int sampleIndex)
      Returns the quantity of evidence that belongs to a sample in the evidence-likelihood collection.
      sampleIndex - the query sample index.
      0 or greater.
      IllegalArgumentException - if sampleIndex is not a valid sample index.
    • retainEvidence

      public void retainEvidence(Predicate<? super EVIDENCE> predicate)
      Remove those reads that do not comply with a requirement.
      predicate - the predicate representing the requirement.

      This method modifies the current read-likelihoods collection.

      Any exception thrown by the predicate will be propagated to the calling code.

      IllegalArgumentException - if predicate is null.
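      Filtering evidence while keeping the likelihood values aligned with it can be sketched as follows. RetainSketch and retain are hypothetical names and the storage layout (a list of evidence plus a double[allele][evidence] array) is an assumption made for illustration, not a description of this class's internals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class RetainSketch {
    /**
     * Hypothetical helper, not part of AlleleLikelihoods: removes from evidence
     * (in place) every element rejected by the predicate and returns the
     * likelihood columns of the kept elements, in their original order.
     */
    public static <E> double[][] retain(List<E> evidence, double[][] values, Predicate<? super E> predicate) {
        if (predicate == null) {
            throw new IllegalArgumentException("predicate is null");
        }
        // Indices of evidence accepted by the predicate (tested once each).
        List<Integer> kept = new ArrayList<>();
        for (int e = 0; e < evidence.size(); e++) {
            if (predicate.test(evidence.get(e))) {
                kept.add(e);
            }
        }
        double[][] result = new double[values.length][kept.size()];
        for (int a = 0; a < values.length; a++) {
            for (int k = 0; k < kept.size(); k++) {
                result[a][k] = values[a][kept.get(k)];
            }
        }
        // Drop rejected evidence, iterating backwards so indices stay valid.
        for (int e = evidence.size() - 1; e >= 0; e--) {
            if (!kept.contains(e)) {
                evidence.remove(e);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> evidence = new ArrayList<>(Arrays.asList("r1", "r2", "r3"));
        double[][] values = { {-1.0, -2.0, -3.0}, {-4.0, -5.0, -6.0} };
        double[][] keptValues = retain(evidence, values, name -> !name.equals("r2"));
        System.out.println(evidence + " " + Arrays.deepToString(keptValues));
    }
}
```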
    • maximumLikelihoodOverAllAlleles

      protected double maximumLikelihoodOverAllAlleles(int sampleIndex, int evidenceIndex)
    • setVariantCallingSubsetUsed

      public void setVariantCallingSubsetUsed(SimpleInterval loc)
    • getVariantCallingSubsetApplied

      public SimpleInterval getVariantCallingSubsetApplied()
      Returns the location used for subsetting. May be null.
    • changeAlleles

      public void changeAlleles(List<A> newAlleles)
      Replaces the alleles in the readLikelihood matrix. Relevant for the uncollapsing code
      newAlleles -
    • getFilteredHaplotypeCount

      public int getFilteredHaplotypeCount()
    • setFilteredHaplotypeCount

      public void setFilteredHaplotypeCount(int filteredHaplotypeCount)
    • filterPoorlyModeledEvidence

      public void filterPoorlyModeledEvidence(ToDoubleFunction<EVIDENCE> log10MinTrueLikelihood)
      Removes those reads whose best possible likelihood given any allele is too low.

      This is determined by a maximum error per read-base against the best likelihood possible.

      log10MinTrueLikelihood - Function that returns the minimum likelihood that the best allele for a unit of evidence must have
      IllegalStateException - if the read-likelihoods collection does not contain alleles; the operation is not supported in that case.
      IllegalArgumentException - if maximumErrorPerBase is negative.