Class GenotypesContext

java.lang.Object
htsjdk.variant.variantcontext.GenotypesContext
All Implemented Interfaces:
Serializable, Iterable<Genotype>, Collection<Genotype>, List<Genotype>
Direct Known Subclasses:
LazyGenotypesContext

public class GenotypesContext extends Object implements List<Genotype>, Serializable
Represents an ordered collection of Genotype objects
  • Field Details

    • serialVersionUID

      public static final long serialVersionUID
    • NO_GENOTYPES

      public static final GenotypesContext NO_GENOTYPES
      Static constant value for an empty GenotypesContext. Useful because many VariantContexts have no genotypes.
    • sampleNamesInOrder

      protected List<String> sampleNamesInOrder
      A list of sample names, one for each genotype in genotypes, sorted in alphabetical order
    • sampleNameToOffset

      protected Map<String,Integer> sampleNameToOffset
      a map optimized for efficient lookup. Each genotype in genotypes must have its sample name in sampleNameToOffset, with a corresponding integer value that indicates the offset of that genotype in the vector of genotypes
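      The invariant described above can be illustrated with a minimal sketch. It is not htsjdk code: the class SampleOffsetIndexSketch and its method buildOffsets are hypothetical, and plain sample-name strings stand in for Genotype objects.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, not part of htsjdk: sample names replace Genotype objects.
class SampleOffsetIndexSketch {
    /** Maps each sample name to the position of its entry in the list. */
    static Map<String, Integer> buildOffsets(List<String> sampleNames) {
        Map<String, Integer> offsets = new HashMap<>(sampleNames.size());
        for (int i = 0; i < sampleNames.size(); i++) {
            offsets.put(sampleNames.get(i), i);
        }
        return offsets;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("NA12892", "NA12878", "NA12891");
        Map<String, Integer> offsets = buildOffsets(names);
        System.out.println(offsets.get("NA12878")); // prints 1
    }
}
```

      With such a map, a lookup by sample name is a hash lookup followed by an indexed read, instead of a scan over all genotypes.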
    • notToBeDirectlyAccessedGenotypes

      protected ArrayList<Genotype> notToBeDirectlyAccessedGenotypes
      An ArrayList of genotypes contained in this context. WARNING: TO ENABLE THE LAZY VERSION OF THIS CLASS, NO METHODS SHOULD DIRECTLY ACCESS THIS VARIABLE. USE getGenotypes() INSTEAD.
  • Constructor Details

    • GenotypesContext

      protected GenotypesContext()
      Create an empty GenotypesContext
    • GenotypesContext

      protected GenotypesContext(int n)
      Create an empty GenotypesContext, with initial capacity for n elements
    • GenotypesContext

      protected GenotypesContext(ArrayList<Genotype> genotypes)
      Create a GenotypesContext containing genotypes
    • GenotypesContext

      protected GenotypesContext(ArrayList<Genotype> genotypes, Map<String,Integer> sampleNameToOffset, List<String> sampleNamesInOrder)
      Create a fully resolved GenotypesContext containing genotypes, a sample lookup table, and sorted sample names
      Parameters:
      genotypes - our genotypes in arbitrary order
      sampleNameToOffset - map optimized for efficient lookup. Each genotype in genotypes must have its sample name in sampleNameToOffset, with a corresponding integer value that indicates the offset of that genotype in the vector of genotypes
      sampleNamesInOrder - a list of sample names, one for each genotype in genotypes, sorted in alphabetical order.
  • Method Details

    • create

      public static final GenotypesContext create()
      Basic creation routine
      Returns:
      an empty, mutable GenotypesContext
    • create

      public static final GenotypesContext create(int nGenotypes)
      Basic creation routine
      Returns:
      an empty, mutable GenotypesContext with initial capacity for nGenotypes
    • create

      public static final GenotypesContext create(ArrayList<Genotype> genotypes, Map<String,Integer> sampleNameToOffset, List<String> sampleNamesInOrder)
      Create a fully resolved GenotypesContext containing genotypes, a sample lookup table, and sorted sample names
      Parameters:
      genotypes - our genotypes in arbitrary order
      sampleNameToOffset - map optimized for efficient lookup. Each genotype in genotypes must have its sample name in sampleNameToOffset, with a corresponding integer value that indicates the offset of that genotype in the vector of genotypes
      sampleNamesInOrder - a list of sample names, one for each genotype in genotypes, sorted in alphabetical order.
      Returns:
      a mutable GenotypesContext containing genotypes with already present lookup data
    • create

      public static final GenotypesContext create(ArrayList<Genotype> genotypes)
      Create a fully resolved GenotypesContext containing genotypes
      Parameters:
      genotypes - our genotypes in arbitrary order
      Returns:
      a mutable GenotypesContext containing genotypes
    • create

      public static final GenotypesContext create(Genotype... genotypes)
      Create a fully resolved GenotypesContext containing genotypes
      Parameters:
      genotypes - our genotypes in arbitrary order
      Returns:
      a mutable GenotypesContext containing genotypes
    • copy

      public static final GenotypesContext copy(GenotypesContext toCopy)
      Create a freshly allocated GenotypesContext containing the genotypes in toCopy
      Parameters:
      toCopy - the GenotypesContext to copy
      Returns:
      a mutable GenotypesContext containing genotypes
    • copy

      public static final GenotypesContext copy(Collection<Genotype> toCopy)
      Create a GenotypesContext containing the genotypes in iteration order contained in toCopy
      Parameters:
      toCopy - the collection of genotypes
      Returns:
      a mutable GenotypesContext containing genotypes
    • immutable

      public final GenotypesContext immutable()
    • isMutable

      public boolean isMutable()
    • checkImmutability

      public final void checkImmutability() throws UnsupportedOperationException
      Throws:
      UnsupportedOperationException
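      The immutable(), isMutable(), and checkImmutability() signatures suggest a simple guard pattern, sketched below with java.util only. The class ImmutableGuardSketch is hypothetical. It assumes that immutable() flips a flag on the same instance and that every mutator calls checkImmutability() first; this page does not state either detail explicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not htsjdk code: strings stand in for Genotype objects.
class ImmutableGuardSketch {
    private final List<String> items = new ArrayList<>();
    private boolean immutable = false;

    /** Assumption: freezing marks this instance and returns it for chaining. */
    ImmutableGuardSketch immutable() {
        immutable = true;
        return this;
    }

    boolean isMutable() {
        return !immutable;
    }

    /** Throws if this instance has been frozen. */
    void checkImmutability() {
        if (immutable) {
            throw new UnsupportedOperationException("context is immutable");
        }
    }

    boolean add(String item) {
        checkImmutability(); // guard runs before any state changes
        return items.add(item);
    }

    int size() {
        return items.size();
    }

    public static void main(String[] args) {
        ImmutableGuardSketch ctx = new ImmutableGuardSketch();
        ctx.add("NA12878");
        ctx.immutable();
        try {
            ctx.add("NA12891");
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```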
    • invalidateSampleNameMap

      protected void invalidateSampleNameMap()
    • invalidateSampleOrdering

      protected void invalidateSampleOrdering()
    • ensureSampleOrdering

      protected void ensureSampleOrdering()
    • ensureSampleNameMap

      protected void ensureSampleNameMap()
    • isLazyWithData

      public boolean isLazyWithData()
    • getGenotypes

      protected ArrayList<Genotype> getGenotypes()
    • clear

      public void clear()
      Specified by:
      clear in interface Collection<Genotype>
      Specified by:
      clear in interface List<Genotype>
    • size

      public int size()
      Specified by:
      size in interface Collection<Genotype>
      Specified by:
      size in interface List<Genotype>
    • isEmpty

      public boolean isEmpty()
      Specified by:
      isEmpty in interface Collection<Genotype>
      Specified by:
      isEmpty in interface List<Genotype>
    • add

      public boolean add(Genotype genotype) throws UnsupportedOperationException
      Adds a single genotype to this context. There are several constraints on this input, and important consequences for the performance of other functions provided by this context. First, the sample name of genotype must be unique within this context. This is not enforced in the code itself, though you will invalidate the contract on this context if you add duplicate samples and are running with CoFoJa enabled. Second, adding a genotype also updates the sample name -> index map, so add() followed by containsSample and related functions is an efficient series of operations. Third, adding a genotype invalidates the sorted list of sample names, so add() followed by any of the SampleNamesInOrder operations is inefficient, as each SampleNamesInOrder call must rebuild the sorted list of sample names at an O(n log n) cost.
      Specified by:
      add in interface Collection<Genotype>
      Specified by:
      add in interface List<Genotype>
      Parameters:
      genotype -
      Returns:
      Throws:
      UnsupportedOperationException - if the context has been made immutable
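      The second and third points above can be made concrete with a small model. This is a sketch under stated assumptions, not the htsjdk implementation: AddCachingSketch is a hypothetical class, sample names stand in for Genotype objects, and the sortCount field exists only to make the rebuild cost visible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not htsjdk code.
class AddCachingSketch {
    private final List<String> samples = new ArrayList<>();
    private final Map<String, Integer> nameToOffset = new HashMap<>();
    private List<String> sortedNames = null; // null means "must be rebuilt"
    int sortCount = 0;                        // number of O(n log n) rebuilds so far

    boolean add(String sample) {
        nameToOffset.put(sample, samples.size()); // map is updated in place
        sortedNames = null;                        // sorted view is invalidated
        return samples.add(sample);
    }

    boolean containsSample(String sample) {
        return nameToOffset.containsKey(sample);   // no rebuild needed after add
    }

    List<String> getSampleNamesOrderedByName() {
        if (sortedNames == null) {
            sortedNames = new ArrayList<>(samples);
            Collections.sort(sortedNames);
            sortCount++;
        }
        return sortedNames;
    }

    public static void main(String[] args) {
        AddCachingSketch ctx = new AddCachingSketch();
        for (String s : new String[] {"C", "A", "B"}) {
            ctx.add(s);
            ctx.getSampleNamesOrderedByName(); // forces a re-sort after every add
        }
        System.out.println(ctx.sortCount); // prints 3
    }
}
```

      In this model, adding all samples first and asking for the sorted names once afterwards costs a single sort, while interleaving the two calls costs one sort per add.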
    • add

      public void add(int i, Genotype genotype)
      Specified by:
      add in interface List<Genotype>
    • addAll

      public boolean addAll(Collection<? extends Genotype> genotypes)
      Adds all of the genotypes to this context. See add(Genotype) for important information about this function's constraints and performance costs.
      Specified by:
      addAll in interface Collection<Genotype>
      Specified by:
      addAll in interface List<Genotype>
      Parameters:
      genotypes -
      Returns:
    • addAll

      public boolean addAll(int i, Collection<? extends Genotype> genotypes)
      Specified by:
      addAll in interface List<Genotype>
    • contains

      public boolean contains(Object o)
      Specified by:
      contains in interface Collection<Genotype>
      Specified by:
      contains in interface List<Genotype>
    • containsAll

      public boolean containsAll(Collection<?> objects)
      Specified by:
      containsAll in interface Collection<Genotype>
      Specified by:
      containsAll in interface List<Genotype>
    • get

      public Genotype get(int i)
      Specified by:
      get in interface List<Genotype>
    • getMaxPloidy

      public int getMaxPloidy(int defaultPloidy)
      What is the max ploidy among all samples? Returns defaultPloidy if no genotypes are present
      Parameters:
      defaultPloidy - the default ploidy, if all samples are no-called
      Returns:
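      As a rough illustration of the rule described for getMaxPloidy, the sketch below computes a maximum with a fallback. MaxPloidySketch is a hypothetical helper, each Integer stands in for the ploidy of one genotype, and treating a maximum of 0 as "no information" is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not htsjdk code.
class MaxPloidySketch {
    static int maxPloidy(List<Integer> ploidies, int defaultPloidy) {
        int max = 0;
        for (int p : ploidies) {
            max = Math.max(max, p);
        }
        // Assumption: a maximum of 0 means no genotype carried ploidy information.
        return max == 0 ? defaultPloidy : max;
    }

    public static void main(String[] args) {
        System.out.println(maxPloidy(Arrays.asList(2, 3, 1), 2));           // prints 3
        System.out.println(maxPloidy(Collections.<Integer>emptyList(), 2)); // prints 2
    }
}
```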
    • get

      public Genotype get(String sampleName)
      Gets the genotype associated with sampleName, or null if none is found
      Parameters:
      sampleName -
      Returns:
    • indexOf

      public int indexOf(Object o)
      Specified by:
      indexOf in interface List<Genotype>
    • iterator

      public Iterator<Genotype> iterator()
      Specified by:
      iterator in interface Collection<Genotype>
      Specified by:
      iterator in interface Iterable<Genotype>
      Specified by:
      iterator in interface List<Genotype>
    • lastIndexOf

      public int lastIndexOf(Object o)
      Specified by:
      lastIndexOf in interface List<Genotype>
    • listIterator

      public ListIterator<Genotype> listIterator()
      Specified by:
      listIterator in interface List<Genotype>
    • listIterator

      public ListIterator<Genotype> listIterator(int i)
      Specified by:
      listIterator in interface List<Genotype>
    • remove

      public Genotype remove(int i)
      Note that remove requires us to invalidate our sample -> index cache. The loop:

          GenotypesContext gc = ...
          for ( sample in samples )
            if ( gc.containsSample(sample) )
              gc.remove(sample)

      is extremely inefficient, as each call to remove invalidates the cache and containsSample requires us to rebuild it, an O(n) operation. If you must remove many samples from the GC, use either removeAll or retainAll to avoid this O(n * m) operation.
      Specified by:
      remove in interface List<Genotype>
      Parameters:
      i -
      Returns:
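      The cost difference described for remove can be reproduced in a small model. This is a sketch, not htsjdk code: RemoveCachingSketch is hypothetical, sample names stand in for Genotype objects, and the rebuilds counter is added purely to count how often the sample -> index map is reconstructed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not htsjdk code.
class RemoveCachingSketch {
    private final List<String> samples;
    private Map<String, Integer> nameToOffset = null; // null means "must be rebuilt"
    int rebuilds = 0;                                  // number of O(n) map rebuilds

    RemoveCachingSketch(List<String> initial) {
        samples = new ArrayList<>(initial);
    }

    private void ensureMap() {
        if (nameToOffset == null) {
            nameToOffset = new HashMap<>();
            for (int i = 0; i < samples.size(); i++) {
                nameToOffset.put(samples.get(i), i);
            }
            rebuilds++;
        }
    }

    boolean containsSample(String sample) {
        ensureMap();
        return nameToOffset.containsKey(sample);
    }

    boolean remove(String sample) {
        nameToOffset = null; // offsets after the removed entry shift, so drop the map
        return samples.remove(sample);
    }

    boolean removeAll(Collection<String> toRemove) {
        nameToOffset = null; // invalidated once for the whole batch
        return samples.removeAll(toRemove);
    }

    int size() {
        return samples.size();
    }

    public static void main(String[] args) {
        List<String> all = Arrays.asList("A", "B", "C", "D");
        List<String> drop = Arrays.asList("A", "C", "D");

        RemoveCachingSketch slow = new RemoveCachingSketch(all);
        for (String s : drop) {
            if (slow.containsSample(s)) {
                slow.remove(s);
            }
        }
        System.out.println(slow.rebuilds); // prints 3

        RemoveCachingSketch fast = new RemoveCachingSketch(all);
        fast.removeAll(drop);
        fast.containsSample("B");
        System.out.println(fast.rebuilds); // prints 1
    }
}
```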
    • remove

      public boolean remove(Object o)
      See remove(int) for an important warning
      Specified by:
      remove in interface Collection<Genotype>
      Specified by:
      remove in interface List<Genotype>
      Parameters:
      o -
      Returns:
    • removeAll

      public boolean removeAll(Collection<?> objects)
      Specified by:
      removeAll in interface Collection<Genotype>
      Specified by:
      removeAll in interface List<Genotype>
    • retainAll

      public boolean retainAll(Collection<?> objects)
      Specified by:
      retainAll in interface Collection<Genotype>
      Specified by:
      retainAll in interface List<Genotype>
    • set

      public Genotype set(int i, Genotype genotype)
      Specified by:
      set in interface List<Genotype>
    • replace

      public Genotype replace(Genotype genotype)
      Replaces the genotype in this context that has the same sample name as genotype. Note that, for efficiency reasons, we do not add the genotype if it's not present; a null return value indicates this happened. This operation preserves the Sample -> Offset map cache but invalidates the sorted list of samples. Using replace within a loop containing any of the SampleNamesInOrder operations requires an O(n log n) re-sort after each replace operation.
      Parameters:
      genotype - a non null genotype to bind in this context
      Returns:
      null if genotype was not added, otherwise returns the previous genotype
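      A minimal sketch of the replace-only-if-present behavior follows. It is a hypothetical model rather than htsjdk code: the nested Call class stands in for Genotype, and the sorted-name list that the real class also invalidates is left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not htsjdk code.
class ReplaceSketch {
    /** Stand-in for Genotype: a sample name plus a textual call. */
    static final class Call {
        final String sample;
        final String alleles;

        Call(String sample, String alleles) {
            this.sample = sample;
            this.alleles = alleles;
        }
    }

    private final List<Call> calls = new ArrayList<>();
    private final Map<String, Integer> nameToOffset = new HashMap<>();

    void add(Call c) {
        nameToOffset.put(c.sample, calls.size());
        calls.add(c);
    }

    /** Returns the previous entry, or null if the sample was not present. */
    Call replace(Call c) {
        Integer offset = nameToOffset.get(c.sample);
        if (offset == null) {
            return null;             // unknown sample: nothing is added
        }
        return calls.set(offset, c); // offsets do not move, so the map stays valid
    }

    Call get(String sample) {
        Integer offset = nameToOffset.get(sample);
        return offset == null ? null : calls.get(offset);
    }

    int size() {
        return calls.size();
    }

    public static void main(String[] args) {
        ReplaceSketch ctx = new ReplaceSketch();
        ctx.add(new Call("NA12878", "0/0"));
        Call previous = ctx.replace(new Call("NA12878", "0/1"));
        System.out.println(previous.alleles);                       // prints 0/0
        System.out.println(ctx.replace(new Call("NA12891", "1/1"))); // prints null
    }
}
```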
    • subList

      public List<Genotype> subList(int i, int i1)
      Specified by:
      subList in interface List<Genotype>
    • toArray

      public Object[] toArray()
      Specified by:
      toArray in interface Collection<Genotype>
      Specified by:
      toArray in interface List<Genotype>
    • toArray

      public <T> T[] toArray(T[] ts)
      Specified by:
      toArray in interface Collection<Genotype>
      Specified by:
      toArray in interface List<Genotype>
    • iterateInSampleNameOrder

      public Iterable<Genotype> iterateInSampleNameOrder(Iterable<String> sampleNamesInOrder)
      Iterate over the Genotypes in this context in the order specified by sampleNamesInOrder
      Parameters:
      sampleNamesInOrder - an Iterable of String, containing exactly one entry for each Genotype sample name in this context
      Returns:
      an Iterable over the genotypes in this context.
    • iterateInSampleNameOrder

      public Iterable<Genotype> iterateInSampleNameOrder()
      Iterate over the Genotypes in this context in their sample name order (A, B, C) regardless of the underlying order in the vector of genotypes
      Returns:
      an Iterable over the genotypes in this context.
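      Iteration in sample name order, independent of storage order, can be sketched as follows. The class SampleOrderIterationSketch and its method are hypothetical, and parallel lists of names and calls stand in for the Genotype objects held by the context.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model, not htsjdk code.
class SampleOrderIterationSketch {
    /** samples.get(i) names the sample whose call is calls.get(i). */
    static List<String> inSampleNameOrder(List<String> samples, List<String> calls) {
        Map<String, Integer> offsets = new HashMap<>();
        for (int i = 0; i < samples.size(); i++) {
            offsets.put(samples.get(i), i);
        }
        List<String> sorted = new ArrayList<>(samples);
        Collections.sort(sorted); // natural String ordering
        List<String> ordered = new ArrayList<>();
        for (String name : sorted) {
            ordered.add(calls.get(offsets.get(name))); // follow name -> offset -> call
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<String> samples = Arrays.asList("C", "A", "B");
        List<String> calls = Arrays.asList("0/1", "1/1", "0/0");
        System.out.println(inSampleNameOrder(samples, calls)); // prints [1/1, 0/0, 0/1]
    }
}
```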
    • getSampleNames

      public Set<String> getSampleNames()
      Returns:
      The set of sample names for all genotypes in this context, in arbitrary order
    • getSampleNamesOrderedByName

      public List<String> getSampleNamesOrderedByName()
      Returns:
      The list of sample names for all genotypes in this context, in their natural ordering (A, B, C)
    • containsSample

      public boolean containsSample(String sample)
    • containsSamples

      public boolean containsSamples(Collection<String> samples)
    • subsetToSamples

      public GenotypesContext subsetToSamples(Set<String> samples)
      Return a freshly allocated subcontext of this context containing only the samples listed in samples. Note that samples can contain names not in this context; they will simply be ignored.
      Parameters:
      samples -
      Returns:
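      The subsetting rule can be sketched with plain collections. SubsetSketch is a hypothetical stand-in, sample names replace Genotype objects, and keeping the surviving entries in their original order is an assumption of this sketch rather than something this page states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model, not htsjdk code.
class SubsetSketch {
    static List<String> subsetToSamples(List<String> samples, Set<String> wanted) {
        List<String> subset = new ArrayList<>(); // fresh list; the input is left untouched
        for (String s : samples) {
            if (wanted.contains(s)) {
                subset.add(s);
            }
        }
        return subset; // names in wanted that never matched are simply ignored
    }

    public static void main(String[] args) {
        List<String> samples = Arrays.asList("A", "B", "C");
        Set<String> wanted = new HashSet<>(Arrays.asList("C", "A", "NOT_PRESENT"));
        System.out.println(subsetToSamples(samples, wanted)); // prints [A, C]
    }
}
```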
    • toString

      public String toString()
      Overrides:
      toString in class Object