All Classes and Interfaces (gatk 4.4.0.0 API)

Class

Description

AbsoluteCoordinates

Class to translate back and forth from absolute long-typed base positions to relative ones (the usual contig, position pairs).
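The translation between absolute and relative positions can be sketched as follows. This is a minimal illustration of the idea, not the GATK implementation: the class name `AbsoluteCoordinatesSketch`, the nested `Relative` holder, the use of contig indices instead of names, and the 1-based convention are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch, not the GATK class: contigs are laid end to end on one long axis. */
final class AbsoluteCoordinatesSketch {

    /** A relative location: contig index plus 1-based position on that contig. */
    static final class Relative {
        final int contigIndex;
        final int position;

        Relative(int contigIndex, int position) {
            this.contigIndex = contigIndex;
            this.position = position;
        }
    }

    // offsets[i] is the summed length of all contigs before contig i;
    // the final entry is the total genome length.
    private final long[] offsets;

    /** Assumes every contig length is positive. */
    AbsoluteCoordinatesSketch(List<Integer> contigLengths) {
        offsets = new long[contigLengths.size() + 1];
        for (int i = 0; i < contigLengths.size(); i++) {
            offsets[i + 1] = offsets[i] + contigLengths.get(i);
        }
    }

    /** Converts a (contig index, 1-based position) pair to a 1-based absolute position. */
    long toAbsolute(int contigIndex, int position) {
        return offsets[contigIndex] + position;
    }

    /** Converts a 1-based absolute position back to its contig index and position. */
    Relative toRelative(long absolute) {
        if (absolute < 1 || absolute > offsets[offsets.length - 1]) {
            throw new IllegalArgumentException("absolute position out of range: " + absolute);
        }
        // Locate the last contig whose offset is strictly below the absolute position.
        int idx = Arrays.binarySearch(offsets, absolute);
        int contig = idx >= 0 ? idx - 1 : -idx - 2;
        return new Relative(contig, (int) (absolute - offsets[contig]));
    }
}
```

Keeping cumulative offsets in a sorted array makes the forward direction a single addition and the reverse direction a binary search over the contigs.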

AbsoluteCoordinates.Relative

AbstractAlignmentMerger

Abstract class that coordinates the general task of taking in a set of alignment information, possibly in SAM format, possibly in other formats, and merging that with the set of all reads for which alignment was attempted, stored in an unmapped SAM file.

AbstractAlignmentMerger

AbstractAlignmentMerger.UnmappingReadStrategy

AbstractBCICodec<F extends htsjdk.tribble.Feature>

AbstractConcordanceWalker

Base class for concordance walkers, which process one variant at a time from one or more sources of variants, with optional contextual information from a reference, sets of reads, and/or supplementary sources of Features.

AbstractConcordanceWalker.TruthVersusEval

Stores a truth vc in case of a false negative, an eval vc in case of a false positive, or a concordance pair of truth and eval in case of a true positive.

AbstractEvidenceSortMerger<F extends SVFeature>

A FeatureSink that buffers and resolves (by merging, or by checking for redundancy) features that occur on the same interval.

AbstractGtfCodec

AbstractIlluminaPositionFileReader

Illumina position files all have nearly the same form: pos files consist of text-based, tab-delimited x-y coordinate float pairs; locs files are binary x-y float pairs; clocs files are compressed binary x-y float pairs.

AbstractInputParser

Class for parsing text files where each line consists of fields separated by whitespace.

AbstractLocatableCollection<METADATA extends LocatableMetadata,RECORD extends htsjdk.samtools.util.Locatable>

Represents a sequence dictionary, an immutable, coordinate-sorted (with no overlaps allowed) collection of records that extend Locatable (although contigs are assumed to be non-null when writing to file), a set of mandatory column headers given by a TableColumnCollection, and lambdas for reading and writing records.

AbstractMarkDuplicatesCommandLineProgram

Abstract class that holds parameters and methods common to classes that perform duplicate detection and/or marking within SAM/BAM/CRAM files.

AbstractMarkDuplicatesCommandLineProgram.SamHeaderAndIterator

Little class used to package up a header and an iterable/iterator.

AbstractOpticalDuplicateFinderCommandLineProgram

Abstract class that holds parameters and methods common to classes that perform optical duplicate detection.

AbstractReadThreadingGraph

Read threading graph class intended to contain duplicated code between ReadThreadingGraph and JunctionTreeLinkedDeBruijnGraph.

AbstractReadThreadingGraph.MyEdgeFactory

Edge factory that encapsulates the numPruningSamples assembly parameter

AbstractReadThreadingGraph.TraversalDirection

AbstractRecordCollection<METADATA extends Metadata,RECORD>

Represents metadata (which can be represented as a SAMFileHeader), an immutable collection of records, a set of mandatory column headers given by a TableColumnCollection, and lambdas for reading and writing records.

AbstractSampleLocatableCollection<RECORD extends htsjdk.samtools.util.Locatable>

Represents a sample name, a sequence dictionary, an immutable, coordinate-sorted (with no overlaps allowed) collection of records that extend Locatable (although contigs are assumed to be non-null when writing to file), a set of mandatory column headers given by a TableColumnCollection, and lambdas for reading and writing records.

AbstractSampleRecordCollection<RECORD>

Represents a sample name, an immutable collection of records, a set of mandatory column headers given by a TableColumnCollection, and lambdas for reading and writing records.

AbstractWgsMetricsCollector<T extends htsjdk.samtools.util.AbstractRecordAndOffset>

Class for collecting data on reference coverage, base qualities and excluded bases from one AbstractLocusInfo object for CollectWgsMetrics.

AccumulateQualityYieldMetrics

Combines multiple Picard QualityYieldMetrics files into a single file.

AccumulateVariantCallingMetrics

Combines multiple Variant Calling Metrics files into a single file.

ActivityProfile

Class holding information about per-base activity scores for assembly region traversal

ActivityProfileState

Captures the probability that a specific locus in the genome represents an "active" site containing real variation.

ActivityProfileState.Type

The type of the value returned by ActivityProfileState.getResultValue()

ActivityProfileStateIterator

Given a MultiIntervalShard of GATKRead, iterates over each locus within that shard, and calculates the ActivityProfileState there, using the provided AssemblyRegionEvaluator to determine if each site is active.

ActivityProfileStateRange

An efficient way of representing a set of ActivityProfileStates in an interval; used by Spark.

AdapterMarker

Store one or more AdapterPairs to use to mark adapter sequence of SAMRecords.

AdapterPair

AdapterTrimTransformer

Trims (hard clips) adapter sequences from read ends.

AdapterUtility

A utility class for matching reads to adapters.

AdaptiveChainPruner<V extends BaseVertex,E extends BaseEdge>

AdaptiveMetropolisSampler

Metropolis MCMC sampler using an adaptive step size that increases / decreases in order to decrease / increase acceptance rate to some desired value.
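The step-size adaptation described above can be sketched as follows. This is an illustration under stated assumptions, not the GATK class: the name `AdaptiveMetropolisSketch`, the Gaussian proposal, and the multiplicative update rule with a constant `adjustmentRate` are all choices made for this example.

```java
import java.util.Random;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch of a one-dimensional Metropolis sampler with an adaptive step size. */
final class AdaptiveMetropolisSketch {
    private final double targetAcceptanceRate;
    private final double adjustmentRate;
    private double stepSize;
    private double current;
    private long proposals = 0;
    private long accepted = 0;

    AdaptiveMetropolisSketch(double initial, double initialStepSize,
                             double targetAcceptanceRate, double adjustmentRate) {
        this.current = initial;
        this.stepSize = initialStepSize;
        this.targetAcceptanceRate = targetAcceptanceRate;
        this.adjustmentRate = adjustmentRate;
    }

    /** Draws one sample from the distribution with the given (unnormalized) log density. */
    double sample(Random rng, DoubleUnaryOperator logDensity) {
        double proposal = current + stepSize * rng.nextGaussian();
        double logRatio = logDensity.applyAsDouble(proposal) - logDensity.applyAsDouble(current);
        boolean accept = Math.log(rng.nextDouble()) < logRatio;
        if (accept) {
            current = proposal;
            accepted++;
        }
        proposals++;
        // Grow the step after an acceptance (which lowers future acceptance) and
        // shrink it after a rejection (which raises future acceptance).
        stepSize *= Math.exp(adjustmentRate * ((accept ? 1.0 : 0.0) - targetAcceptanceRate));
        return current;
    }

    double stepSize() {
        return stepSize;
    }

    double acceptanceRate() {
        return proposals == 0 ? 0.0 : (double) accepted / proposals;
    }
}
```

With this update the step size is stationary only when the long-run acceptance fraction equals the target. A constant adjustment rate keeps the sketch short; a production sampler would typically let the adjustment decay over time.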

AddCommentsToBam

A tool to add comments to a BAM file header.

AddOATag

AddOriginalAlignmentTags

AddOrReplaceReadGroups

Assigns all the reads in a file to a single new read-group.

AFCalculationResult

Describes the results of the AFCalc. Only the bare essentials are represented here, as all AFCalc models must return meaningful results for all of these fields.

Affection

Categorical sample trait for association and analysis. Samples can have unknown status, be affected or unaffected by the categorical trait, or they can be marked as actually having another trait value (stored in an associated value in the Sample class).

AliasProvider

Class for managing aliases and querying Funcotation to determine fields.

AlignedAssembly

Holding necessary information about a local assembly for use in SV discovery.

AlignedAssembly.Serializer

AlignedAssemblyOrExcuse

An assembly with its contigs aligned to reference, or a reason that there isn't an assembly.

AlignedAssemblyOrExcuse.Serializer

AlignedContig

Locally assembled contig: its name, its sequence as produced by the assembler (no reverse complement like in the SAM record if it maps to '-' strand), and its stripped-down alignment information.

AlignedContig.GoodAndBadMappings

After configuration scoring and picking, the original alignments can be classified as good and bad mappings. Good: the ones used in the picked configuration. Bad: unused alignments in the chosen configuration; these likely contain more noise than information. They can be turned into a string representation following the format in AlignmentInterval.toPackedString().

AlignedContig.Serializer

AlignedContigGenerator

Loads various upstream assembly and alignment formats and turns them into the custom AlignedContig format in the discovery stage.

AlignmentAgreesWithHeaderReadFilter

Filter out reads where the alignment does not match the contents of the header.

AlignmentAndReferenceContext

Bundles together an AlignmentContext and a ReferenceContext.

AlignmentContext

Bundles together a pileup and a location.

AlignmentContext.ReadOrientation

AlignmentContextIteratorBuilder

Create an iterator for traversing alignment contexts in a specified manner.

AlignmentInterval

Each assembled contig should have at least one such accompanying structure, or 0 when it is unmapped.

AlignmentInterval.Serializer

AlignmentStateMachine

Steps a single read along its alignment to the genome. The logical model for generating extended events is as follows: the "record state" implements the traversal along the reference; thus stepForwardOnGenome() returns on every, and only on, actual reference bases.

AlignmentSummaryMetrics

High level metrics about the alignment of reads within a SAM file, produced by the CollectAlignmentSummaryMetrics program and usually stored in a file with the extension ".alignment_summary_metrics".

AlignmentSummaryMetrics.Category

AlignmentSummaryMetricsCollector

AlignmentUtils

AlignmentUtils.MismatchCount

AlleleAndContext

This class is similar to LocationAndAlleles but allows keeping only an allele/ref pair rather than a list of alleles.

AlleleBalanceFilter

Filters out a record if the allele balance for heterozygotes is out of a defined range across all samples.

AlleleBiasedDownsamplingUtils

The purpose of this set of utilities is to downsample a set of reads to remove contamination.

AlleleCount

Stratifies the eval RODs by the allele count of the alternate allele. Looks first at the MLEAC value in the INFO field, and uses that value if present.

AlleleFiltering

Filtering haplotypes that contribute weak alleles to the genotyping.

AlleleFilteringHC

Filtering haplotypes that contribute weak alleles to the genotyping.

AlleleFilteringMutect

Filtering haplotypes that contribute weak alleles to the genotyping.

AlleleFilterUtils

Helps read and set allele specific filters in the INFO field.

AlleleFraction

Variant allele fraction for each sample.

AlleleFractionCluster

AlleleFractionKernelSegmenter

Segments alternate-allele-fraction data using kernel segmentation.

AlleleFractionModeller

Given segments and counts of alt and ref reads over a list of het sites, infers the minor-allele fraction of each segment.

AlleleFractionParameter

Enumerates the parameters for AlleleFractionState.

AlleleFractionPrior

Represents priors for the allele-fraction model.

AlleleFrequency

Stratifies the eval RODs by the allele frequency of the alternate allele. Either uses a constant 0.005 frequency grid and projects the AF INFO field value onto it, or uses a logit scale from -30 to 30.

AlleleFrequencyCalculator

AlleleFrequencyExacUtils

Allele frequency calculations for the ExAC dataset

AlleleFrequencyQC

This tool uses VariantEval to bin variants in Thousand Genomes by allele frequency.

AlleleFrequencyQCMetric

AlleleFrequencyUtils

Allele frequency utilities that are dataset-agnostic

AlleleLikelihoods<EVIDENCE extends htsjdk.samtools.util.Locatable,A extends htsjdk.variant.variantcontext.Allele>

Evidence-likelihoods container implementation based on integer indexed arrays.

AlleleLikelihoodWriter

AlleleList<A extends htsjdk.variant.variantcontext.Allele>

Minimal interface for random access to a collection of Alleles.

AlleleList.ActualPermutation<A extends htsjdk.variant.variantcontext.Allele>

AlleleList.NonPermutation<A extends htsjdk.variant.variantcontext.Allele>

This is the identity permutation.

AlleleListPermutation<A extends htsjdk.variant.variantcontext.Allele>

Marks allele list permutation implementation classes.

AllelePileupCounter

Useful when you know the interval and the alleles of interest ahead of the counting.

AllelePseudoDepth

AlleleSpecificAnnotation

This is a marker interface used to indicate which annotations are allele-specific.

AlleleSpecificAnnotationData<T>

A class to encapsulate the raw data for allele-specific classes compatible with the ReducibleAnnotation interface

AlleleSubsettingUtils

Utilities class containing methods for restricting VariantContext and GenotypesContext objects to a reduced set of alleles, as well as for choosing the best set of alleles to keep and for cleaning up annotations and genotypes after subsetting.

AlleleSubsettingUtils

AllelicCount

Reference and alternate allele counts at a site specified by an interval.

AllelicCountCollection

Simple data structure to pass and read/write a List of AllelicCount objects.

AllelicCountCollector

Collects reference/alternate allele counts at specified sites.

AllLocusIterator

A super-simplified/stripped-down/faster version of IntervalAlignmentContextIterator that takes a locus iterator and a *single* interval, and returns an AlignmentContext for every locus in the interval.

AltSiteRecord

Created by tsato on 10/11/17.

AltSiteRecord.AltSiteRecordTableWriter

Writer

AmbiguousBaseReadFilter

Filters out reads that have more than the threshold number of unknown (N) bases.

AminoAcid

Enum to hold the amino acids and their standard codons.

Analysis

AnalysisModuleScanner

AnalyzeCovariates

Evaluate and compare base quality score recalibration tables

AnalyzeSaturationMutagenesis

Process reads from a saturation mutagenesis experiment.

AnalyzeSaturationMutagenesis.ReportTypeCounts

AnnotatedInterval

Represents an interval with a set of annotations.

AnnotatedInterval

Simple class that just has an interval and sorted name-value pairs.

AnnotatedIntervalCodec

Read AnnotatedIntervals from an xsv file (see XsvLocatableTableCodec).

AnnotatedIntervalCollection

Represents a collection of intervals annotated with CopyNumberAnnotations.

AnnotatedIntervalCollection

Represents a collection of annotated intervals.

AnnotatedIntervalHeader

AnnotatedIntervalToSegmentVariantContextConverter

Converts an annotated interval representing a segment to a variant context.

AnnotatedIntervalUtils

AnnotatedIntervalWriter

AnnotatedVariantProducer

Given identified pair of breakpoints for a simple SV and its supportive evidence, i.e.

AnnotateIntervals

Annotates intervals with GC content, and optionally, mappability and segmental-duplication content.

AnnotateVcfWithBamDepth

Annotate every variant in a VCF with the depth at that locus in a bam.

AnnotateVcfWithExpectedAlleleFraction

Given mixing weights of different samples in a pooled bam, annotate a corresponding vcf containing individual sample genotypes.

Annotation

An annotation group is a set of annotations that have something in common and should be added at the same time.

AnnotationException

Exception thrown when loading gene annotations.

AnnotationKey<T>

Represents a key for a named, typed annotation.

AnnotationMap

Represents an immutable ordered collection of named, typed annotations for an interval.

AnnotationUtils

ApacheSingularValueDecomposer

Perform singular value decomposition (and pseudoinverse calculation) in pure Java, using Commons Math.

ApplyBQSR

Apply base quality score recalibration

ApplyBQSRArgumentCollection

The collection of all arguments needed for ApplyBQSR.

ApplyBQSRSpark

Apply base quality score recalibration with Spark.

ApplyBQSRSparkFn

ApplyBQSRUniqueArgumentCollection

The collection of those arguments for ApplyBQSR that are not already defined in RecalibrationArgumentCollection.

ApplyVQSR

Apply a score cutoff to filter variants based on a recalibration table

ArHetvarFilter

ArHomvarFilter

ArraysControlInfo

A simple class to store names and counts for the Control Information fields that are stored in an Illumina GTC file.

ArtifactPrior

Container for the artifact prior probabilities for the read orientation model

ArtifactPrior.ArtifactPriorTableReader

ArtifactPrior.ArtifactPriorTableWriter

ArtifactPriorCollection

Container class for ArtifactPrior objects.

ArtifactState

This enum encapsulates the domain of the discrete latent random variable z

ArtificialBAMBuilder

Easy-to-use creator of artificial BAM files for testing. Allows us to make a stream of reads or an indexed BAM file with reads having the following properties: coming from n samples; of fixed read length and aligned to the genome with M operator; having N reads per alignment start; skipping N bases between each alignment start; starting at a given alignment start.

ArtificialReadIterator

This fake iterator allows us to look at how specific piles of reads are handled

ArtificialReadQueryIterator

ArtificialReadUtils

AS_BaseQualityRankSumTest

Allele-specific rank Sum Test of REF versus ALT base quality scores

AS_FisherStrand

Allele-specific strand bias estimated using Fisher's Exact Test

AS_InbreedingCoeff

Allele-specific likelihood-based test for the inbreeding among samples

AS_MappingQualityRankSumTest

Allele specific Rank Sum Test for mapping qualities of REF versus ALT reads

AS_QualByDepth

Allele-specific call confidence normalized by depth of sample reads supporting the allele

AS_RankSumTest

Allele-specific implementation of rank sum test annotations

AS_ReadPosRankSumTest

Allele-specific Rank Sum Test for relative positioning of REF versus ALT allele within reads

AS_RMSMappingQuality

Allele-specific Root Mean Square of the mapping quality of reads across all samples.

AS_StandardAnnotation

This is a marker interface used to indicate which annotations are "Standard" and allele-specific.

AS_StrandBiasMutectAnnotation

Adds the strand bias table annotation for use in mutect filters

AS_StrandBiasTest

Allele-specific implementation of strand bias annotations

AS_StrandOddsRatio

Allele-specific strand bias estimated by the Symmetric Odds Ratio test

ASEReadCounter

Calculate read counts per allele for allele-specific expression analysis of RNAseq data

ASEReadCounter.CountPileupType

ASEReadCounter.OUTPUT_FORMAT

AssemblerOffRamp

AssemblyBasedCallerArgumentCollection

Set of arguments for Assembly Based Callers

AssemblyBasedCallerUtils

Created by davidben on 9/8/16.

AssemblyComplexity

AssemblyContigAlignmentsRDDProcessor

A simple heuristic optimizer based on extensive manual review of alignments produced by the aligner (currently "bwa mem -x intractg") with the aim for picking a configuration that provides "optimal coverage" for the input assembly contig.

AssemblyContigAlignmentsRDDProcessor.SAMFormattedContigAlignmentParser

AssemblyContigWithFineTunedAlignments

A wrapper around AlignedContig to represent mapped assembly contig whose alignments went through AssemblyContigAlignmentsRDDProcessor and may represent SV breakpoints.

AssemblyContigWithFineTunedAlignments.AlignmentSignatureBasicType

AssemblyContigWithFineTunedAlignments.ReasonForAlignmentClassificationFailure

AssemblyContigWithFineTunedAlignments.Serializer

AssemblyRegion

Region of the genome that gets assembled by the local assembly engine.

AssemblyRegionArgumentCollection

AssemblyRegionEvaluator

Classes that implement this interface have the ability to evaluate how likely it is that a site is "active" (contains potential real variation).

AssemblyRegionFromActivityProfileStateIterator

Given an iterator of ActivityProfileState, finds AssemblyRegions.

AssemblyRegionIterator

Given a MultiIntervalShard of GATKRead, iterates over each AssemblyRegion within that shard, using the provided AssemblyRegionEvaluator to determine the boundaries between assembly regions.

AssemblyRegionReadShardArgumentCollection

AssemblyRegionTrimmer

Helper component to manage active region trimming

AssemblyRegionWalker

An AssemblyRegionWalker is a tool that processes an entire region of reads at a time, each marked as either "active" (containing possible variation) or "inactive" (not likely to contain actual variation).

AssemblyRegionWalkerContext

Encapsulates an AssemblyRegion with its ReferenceContext and FeatureContext.

AssemblyRegionWalkerSpark

A Spark version of AssemblyRegionWalker.

AssemblyResult

Result of assembling, with the resulting graph and status

AssemblyResult.Status

Status of the assembly result

AssemblyResultSet

Collection of read assemblies using several kmerSizes.

AsynchronousStreamWriter<T>

A service that can be used to write to a stream using a background thread and an executor service.

AsyncIterator<T>

Wrapper around a CloseableIterator that reads in a separate thread, for cases in which that might be efficient.

AtomicIterator<T>

Describes

AutoCloseableCollection<C extends Collection<? extends AutoCloseable>>

An AutoCloseable collection that will automatically close all of its elements.
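The close-everything behavior described above can be sketched as follows. This is a minimal illustration, not the GATK class: the name `ClosingCollectionSketch`, the `elements()` accessor, and the choice to keep closing after a failure and report the failures at the end are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: a holder whose close() closes every element it contains. */
final class ClosingCollectionSketch<T extends AutoCloseable> implements AutoCloseable {
    private final List<T> elements;

    ClosingCollectionSketch(Collection<? extends T> elements) {
        this.elements = new ArrayList<>(elements);
    }

    /** Read-only view of the wrapped elements. */
    List<T> elements() {
        return Collections.unmodifiableList(elements);
    }

    @Override
    public void close() {
        RuntimeException failure = null;
        for (T element : elements) {
            try {
                element.close();
            } catch (Exception ex) {
                // Keep closing the remaining elements; remember every failure.
                if (failure == null) {
                    failure = new RuntimeException("failed to close one or more elements", ex);
                } else {
                    failure.addSuppressed(ex);
                }
            }
        }
        if (failure != null) {
            throw failure;
        }
    }
}
```

Because the holder is itself `AutoCloseable`, it can be used in a try-with-resources statement so that all wrapped resources are released together.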

AutoCloseableReference<T>

Reference to another object that performs some action when closed.

AutosomalRecessiveConstants

BafEvidence

Biallelic-frequency of a sample at some locus.

BafEvidenceBCICodec

Codec to handle BafEvidence in BlockCompressedInterval files

BafEvidenceCodec

Codec to handle BafEvidence in tab-delimited text files

BafEvidenceSortMerger

Imposes additional ordering of same-locus BafEvidence by sample.

BafRegressMetrics

BaitDesigner

Designs baits for hybrid selection!

BaitDesigner.DesignStrategy

Set of possible design strategies for bait design.

BamIndexStats

Command line program to print statistics from a BAM index (.bai) file. Statistics include count of aligned and unaligned reads for each reference sequence and a count of all records with no start coordinate.

BamToBfq

Converts a BAM file into a BFQ (binary fastq formatted) file.

BamToBfqWriter

Deprecated.

BandPassActivityProfile

A band pass filtering version of the activity profile. Applies a band pass filter with a Gaussian kernel to the input state probabilities to smooth them out of an interval
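Smoothing per-base probabilities with a Gaussian kernel can be sketched as follows. This is only an illustration of the general technique, not the GATK implementation: the class name `GaussianSmoothingSketch`, the `radius` and `sigma` parameters, and the handling of interval edges (mass that would fall outside the array is dropped) are assumptions made for the example.

```java
/** Hypothetical sketch: spreads each per-base probability over its neighbors with a Gaussian kernel. */
final class GaussianSmoothingSketch {

    private GaussianSmoothingSketch() {}

    /** Builds a normalized Gaussian kernel with 2 * radius + 1 entries. */
    static double[] gaussianKernel(int radius, double sigma) {
        double[] kernel = new double[2 * radius + 1];
        double sum = 0.0;
        for (int i = -radius; i <= radius; i++) {
            double z = i / sigma;
            kernel[i + radius] = Math.exp(-0.5 * z * z);
            sum += kernel[i + radius];
        }
        for (int j = 0; j < kernel.length; j++) {
            kernel[j] /= sum; // weights now sum to 1
        }
        return kernel;
    }

    /** Distributes each input probability across neighboring positions using the kernel weights. */
    static double[] smooth(double[] probabilities, double[] kernel) {
        int radius = kernel.length / 2;
        double[] smoothed = new double[probabilities.length];
        for (int i = 0; i < probabilities.length; i++) {
            for (int j = -radius; j <= radius; j++) {
                int target = i + j;
                if (target >= 0 && target < smoothed.length) {
                    smoothed[target] += probabilities[i] * kernel[j + radius];
                }
            }
        }
        return smoothed;
    }
}
```

A single isolated high-probability site becomes a bell-shaped bump centered on that site, so nearby positions share part of its signal.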

BAQ

BAQ.BAQCalculationResult

BAQ.CalculationMode

BAQ.QualityMode

These are features that only the walker can override

BarcodeEditDistanceQuery

A class for finding the distance between multiple (matched) barcodes and multiple barcode reads.

BarcodeExtractor

BarcodeExtractor is used to match barcodes and collect barcode match metrics.

BarcodeExtractor.BarcodeMatch

Utility class to hang onto data about the best match for a given barcode

BarcodeFileFaker

Created by jcarey on 3/13/14.

BarcodeFileReader

Reads a single barcode file line by line and returns the barcode if there was a match or NULL otherwise.

BarcodeMetric

Metrics produced by the ExtractIlluminaBarcodes program that is used to parse data in the basecalls directory and determine to which barcode each read should be assigned.

BaseBclReader

BaseCalculator

An interface that can take a collection of bases (provided as SamLocusIterator.RecordAndOffset and SamLocusAndReferenceIterator.SAMLocusAndReference) and generate an ErrorMetric from them.

BaseCallingProgramGroup

Tools that process sequencing machine data, e.g.

BasecallsConverter<CLUSTER_OUTPUT_RECORD>

BasecallsConverter utilizes an underlying IlluminaDataProvider to convert parsed and decoded sequencing data from standard Illumina formats to specific output records (FASTA records/SAM records).

BasecallsConverter.ClusterDataConverter<OUTPUT_RECORD>

Interface that defines a converter that takes ClusterData and returns OUTPUT_RECORD type objects.

BasecallsConverter.ConvertedClusterDataWriter<OUTPUT_RECORD>

Interface that defines a writer that will write out OUTPUT_RECORD type objects.

BasecallsConverterBuilder<CLUSTER_OUTPUT_RECORD>

BasecallsConverterBuilder creates and configures BasecallsConverter objects.

BaseDistributionByCycleMetrics

BaseEdge

Simple edge class for connecting nodes in the graph.

BaseErrorAggregation<CALCULATOR extends BaseCalculator>

An interface and implementations for classes that apply a RecordAndOffsetStratifier to put bases into various "bins" and then compute an ErrorMetric on these bases using a BaseErrorCalculator.

BaseErrorCalculator

BaseErrorMetric

An error metric for the errors in bases.

BaseFuncotatorArgumentCollection

BaseGraph<V extends BaseVertex,E extends BaseEdge>

Common code for graphs used for local assembly.

BaseIlluminaDataProvider

Parse various formats and versions of Illumina Basecall files, and use them to populate ClusterData objects.

BaselineCopyNumberCollection

Collection of baseline copy-number states.

BaseQuality

Median base quality of bases supporting each allele.

BaseQualityClipReadTransformer

Clips reads on both ends using base quality scores

BaseQualityFilter

BaseQualityHistogram

BaseQualityRankSumTest

Rank Sum Test of REF versus ALT base quality scores

BaseQualityReadTransformer

BaseRecalibrationEngine

BaseRecalibrationEngine.BQSRReferenceWindowFunction

Reference window function for BQSR.

BaseRecalibrator

First pass of the base quality score recalibration.

BaseRecalibratorSpark

Spark version of the first pass of the base quality score recalibration.

BaseRecalibratorSparkFn

BaseUtils

BaseUtils contains some basic utilities for manipulating nucleotides.

BaseUtils.Base

BaseUtils.BaseSubstitutionType

BaseUtils.HmerIterator

BaseVertex

A graph vertex that holds some sequence information

BasicInputParser

TextFileParser which reads a single text file.

BasicReference

A source of reference base calls.

BasicSomaticShortMutationValidator

BasicValidationResult

BayesianGaussianMixtureModeller

BayesianGaussianMixtureModeller.InitMethod

BciFileFaker

Created by jcarey on 3/14/14.

BclData

A class that implements the IlluminaData interfaces provided by this parser. One BclData object is returned to IlluminaDataProvider per cluster, and each first-level array in bases and qualities represents a single read in that cluster.

BclFileFaker

BclIndexFaker

BclIndexReader

Annoyingly, there are two different files with extension .bci in NextSeq output.

BclQualityEvaluationStrategy

Describes a mechanism for revising and evaluating qualities read from a BCL file.

BclReader

BCL Files are base call and quality score binary files containing a (base,quality) pair for successive clusters.

BDGAlignmentRecordToGATKReadAdapter

Implementation of the GATKRead interface for the AlignmentRecord class.

BedToIntervalList

BestEndMapqPrimaryAlignmentStrategy

For an aligner that aligns each end independently, select the alignment for each end with the best MAPQ, and make that the primary.

BestEndMapqPrimaryAlignmentStrategy

For an aligner that aligns each end independently, select the alignment for each end with the best MAPQ, and make that the primary.

BestMapqPrimaryAlignmentSelectionStrategy

This strategy was designed for TopHat output, but could be of general utility.

BestMapqPrimaryAlignmentSelectionStrategy

This strategy was designed for TopHat output, but could be of general utility.

BetaBinomialCluster

BetaBinomialDistribution

Beta-binomial using the Apache Math3 Framework.

BetaDistributionShape

BGMMVariantAnnotationsModel

BGMMVariantAnnotationsScorer

BigQueryUtils

Utility class for dealing with BigQuery connections / tables / queries /etc.

BinaryTableReader<R>

Abstract base class for readers of table with records stored in binary.

BinaryTableWriter<R>

Abstract file writing class for record tables stored in binary format.

BinnedCNVLinkage

CNV defragmenter for when the intervals used for coverage collection are available.

BinomialCluster

BlockCompressedIntervalStream

BlockCompressedIntervalStream.IndexEntry

BlockCompressedIntervalStream.Reader<T extends htsjdk.tribble.Feature>

BlockCompressedIntervalStream.WriteFunc<F extends htsjdk.tribble.Feature>

BlockCompressedIntervalStream.Writer<F extends htsjdk.tribble.Feature>

BpmToNormalizationManifestCsv

A simple program to convert an Illumina bpm (bead pool manifest file) into a normalization manifest (bpm.csv) file. The normalization manifest (bpm.csv) is a simple text file generated by Illumina tools; it has a specific format and is used by ZCall.

BQSRPipelineSpark

The full BQSR pipeline in one tool to run on Spark.

BQSRReadTransformer

BreakEndVariantType

BreakEndVariantType.InterChromosomeBreakend

BreakEndVariantType.IntraChromosomalStrandSwitch33BreakEnd

BreakEndVariantType.IntraChromosomalStrandSwitch55BreakEnd

BreakEndVariantType.IntraChromosomeRefOrderSwap

BreakEndVariantType.SupportedType

BreakpointComplications

A helper struct for annotating complications that make the locations represented by its associated NovelAdjacencyAndAltHaplotype a little ambiguous.

BreakpointComplications.IntraChrStrandSwitchBreakpointComplications

For novel adjacency between reference locations that are on the same chromosome, and with a strand switch.

BreakpointComplications.IntraChrStrandSwitchBreakpointComplications.Serializer

BreakpointComplications.InvertedDuplicationBreakpointComplications

For this specific complication, we support what could be defined as an incomplete picture that involves inverted duplication: two overlapping alignments to reference
first alignment:  -------------------->
second alignment: <---------------------
                  |--------||----------|
                    Seg.1      Seg.2
At least Seg.1 is invert duplicated, and Seg.2 is inverted trans-inserted between the two copies (one of which is inverted).

BreakpointComplications.InvertedDuplicationBreakpointComplications.Serializer

BreakpointComplications.SimpleInsDelOrReplacementBreakpointComplications

For simple deletion, insertion, and replacement (deletion and insertion at the same time).

BreakpointComplications.SimpleInsDelOrReplacementBreakpointComplications.Serializer

BreakpointComplications.SmallDuplicationBreakpointComplications

For duplications small enough that we seemingly have assembled across the whole event.

BreakpointComplications.SmallDuplicationWithImpreciseDupRangeBreakpointComplications

This is for dealing with the case when the duplicated range could NOT be inferred exactly, but only from a simple optimization scheme.

BreakpointComplications.SmallDuplicationWithImpreciseDupRangeBreakpointComplications.Serializer

BreakpointComplications.SmallDuplicationWithPreciseDupRangeBreakpointComplications

BreakpointComplications.SmallDuplicationWithPreciseDupRangeBreakpointComplications.Serializer

BreakpointDensityFilter

A class that acts as a filter for breakpoint evidence.

BreakpointEvidence

Various types of read anomalies that provide evidence of genomic breakpoints.

BreakpointEvidence.DiscordantReadPairEvidence

BreakpointEvidence.ExternalEvidence

BreakpointEvidence.ExternalEvidence.Serializer

BreakpointEvidence.InterContigPair

BreakpointEvidence.InterContigPair.Serializer

BreakpointEvidence.LargeIndel

BreakpointEvidence.LargeIndel.Serializer

BreakpointEvidence.MateUnmapped

BreakpointEvidence.MateUnmapped.Serializer

BreakpointEvidence.OutiesPair

BreakpointEvidence.OutiesPair.Serializer

BreakpointEvidence.ReadEvidence

BreakpointEvidence.ReadEvidence.Serializer

BreakpointEvidence.SameStrandPair

BreakpointEvidence.SameStrandPair.Serializer

BreakpointEvidence.Serializer

BreakpointEvidence.SplitRead

BreakpointEvidence.SplitRead.Serializer

BreakpointEvidence.TemplateSizeAnomaly

BreakpointEvidence.TemplateSizeAnomaly.Serializer

BreakpointEvidence.WeirdTemplateSize

BreakpointEvidence.WeirdTemplateSize.Serializer

BreakpointEvidenceClusterer

A class to examine a stream of BreakpointEvidence, and group it into Intervals.

BreakpointsInference

Based on the alignment signature of the input simple chimera, and the evidence contig having the chimera, infers the exact position of breakpoints following the left-aligning convention; the alt haplotype sequence based on the given contig sequence; and complications such as homology, inserted sequence and duplicated ref region, if any.

BucketUtils

Utilities for dealing with Google buckets.

Build37ExtendedIlluminaManifest

A class to represent an 'Extended' Illumina Manifest file.

Build37ExtendedIlluminaManifestRecord

A class to represent a record (line) from an Extended Illumina Manifest [Assay] entry

Build37ExtendedIlluminaManifestRecord.Flag

Build37ExtendedIlluminaManifestRecordCreator

BuildBamIndex

Command line program to generate a BAM index (.bai) file from a BAM (.bam) file

BunnyLog

The "bunny" log format: =[**]= START, =[**]= STEPEND, =[**]= END. The functions here create an id for you and keep track of it, and format the various strings, sending them to a logger if you provided one.

BwaAndMarkDuplicatesPipelineSpark

Runs BWA and MarkDuplicates on Spark.

BwaArgumentCollection

A collection of the arguments that are used for BWA.

BwaMemAlignmentUtils

Utils to move data from a BwaMemAlignment into a GATKRead, or into a SAM tag.

BwaMemIndexCache

Manage a global collection of BwaMemIndex instances.

BwaMemIndexImageCreator

Create a BWA-MEM index image file for use with GATK BWA tools

BwaSpark

BwaSparkEngine

The BwaSparkEngine provides a simple interface for transforming a JavaRDD in which the reads are paired and unaligned, into a JavaRDD of aligned reads, and does so lazily.

ByIntervalListVariantContextIterator

Takes a VCFFileReader and an IntervalList and provides a single iterator over all variants in all the intervals.

ByteArrayIterator

Trivial adapter class allowing a primitive byte[] array to be accessed using the java.util.Iterator interface

CachingIndexedFastaSequenceFile

A caching version of the IndexedFastaSequenceFile that avoids going to disk as often as the raw indexer.

CalcMetadataSpark

(Internal) Collects read metrics relevant to structural variant discovery

CalculateAverageCombinedAnnotations

CalculateContamination

Calculates the fraction of reads coming from cross-sample contamination, given results from GetPileupSummaries.

CalculateFingerprintMetrics

Calculates various metrics on a sample fingerprint, indicating whether the fingerprint satisfies the assumptions we have.

CalculateGenotypePosteriors

Calculate genotype posterior probabilities given family and/or known population genotypes

CalculateMixingFractions

Given a VCF of known variants from multiple samples, calculate how much each sample contributes to a pooled BAM.

CalculateReadGroupChecksum

CalibrateDragstrModel

Estimates the parameters for the DRAGstr model for an input sample.

CallCopyRatioSegments

Calls copy-ratio segments as amplified, deleted, or copy-number neutral.

CalledCopyRatioSegment

CalledCopyRatioSegment.Call

CalledCopyRatioSegmentCollection

CalledHaplotypes

Carries the result of a call to #assignGenotypeLikelihoods

CalledLegacySegment

CalledLegacySegmentCollection

Represents a CBS-style segmentation to enable IGV-compatible plotting.

CallingMetricAccumulator

Collects variants and generates metrics about them.

CallingMetricAccumulator.Result

CanonicalSVCollapser

Class for collapsing a collection of similar SVCallRecord objects, such as clusters produced by CanonicalSVLinkage, into a single representative call.

CanonicalSVCollapser.AltAlleleSummaryStrategy

Define strategies for collapsing alt alleles with different subtypes.

CanonicalSVCollapser.BreakpointSummaryStrategy

Define strategies for collapsing variant intervals.

CanonicalSVLinkage<T extends SVCallRecord>

Main class for SV clustering.

CappedHaplotypeProbabilities

CapturedStreamOutput

Stream output captured from a stream.

CapturedStreamOutputSnapshot

Stream output captured from a streaming stream.

Casava18ReadNameEncoder

A read name encoder conforming to the standard described by Illumina Casava 1.8.

CbclData

This class provides the data structure for CBCLs.

CbclReader

CBCL Header:
Bytes 0 - 1: Version number, current version is 1 (unsigned 16 bits little endian integer)
Bytes 2 - 5: Header size (unsigned 32 bits little endian integer)
Byte 6: Number of bits per basecall (unsigned)
Byte 7: Number of bits per q-score (unsigned)
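
The four header fields listed above can be decoded with a little-endian ByteBuffer. The following is a minimal sketch under stated assumptions, not the CbclReader implementation: the class CbclHeaderSketch and its field names are hypothetical, and it reads only the first eight bytes described in this entry.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical holder for the first four CBCL header fields; not part of Picard or GATK. */
final class CbclHeaderSketch {
    final int version;          // bytes 0-1, unsigned 16-bit little endian
    final long headerSize;      // bytes 2-5, unsigned 32-bit little endian
    final int bitsPerBasecall;  // byte 6, unsigned
    final int bitsPerQscore;    // byte 7, unsigned

    private CbclHeaderSketch(int version, long headerSize, int bitsPerBasecall, int bitsPerQscore) {
        this.version = version;
        this.headerSize = headerSize;
        this.bitsPerBasecall = bitsPerBasecall;
        this.bitsPerQscore = bitsPerQscore;
    }

    /** Decodes the first eight header bytes; later header fields are ignored. */
    static CbclHeaderSketch parse(byte[] bytes) {
        if (bytes.length < 8) {
            throw new IllegalArgumentException("need at least 8 header bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        // Java integer types are signed, so widen each field to keep the unsigned value.
        int version = Short.toUnsignedInt(buf.getShort());
        long headerSize = Integer.toUnsignedLong(buf.getInt());
        int bitsPerBasecall = Byte.toUnsignedInt(buf.get());
        int bitsPerQscore = Byte.toUnsignedInt(buf.get());
        return new CbclHeaderSketch(version, headerSize, bitsPerBasecall, bitsPerQscore);
    }

    public static void main(String[] args) {
        CbclHeaderSketch h = parse(new byte[] {1, 0, 49, 0, 0, 0, 2, 6});
        System.out.println(h.version + " " + h.headerSize + " " + h.bitsPerBasecall + " " + h.bitsPerQscore);
    }
}
```

Widening to int and long matters because a header size above 2^31 - 1 would otherwise read as negative.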

ChainPruner<V extends BaseVertex,E extends BaseEdge>

CheckDuplicateMarking

CheckDuplicateMarking.Mode

CheckFingerprint

Checks the sample identity of the sequence/genotype data in the provided file (SAM/BAM or VCF) against a set of known genotypes in the supplied genotype file (in VCF format).

CheckIlluminaDirectory

Program to check a lane of an Illumina output directory.

CheckPileup

Compare GATK's internal pileup to a reference Samtools pileup

CheckReferenceCompatibility

Check a BAM/VCF for compatibility against specified references.

CheckReferenceCompatibility.CheckReferenceCompatibilityTableWriter

TableWriter to format and write the table output.

CheckTerminatorBlock

Simple class to check the terminator block of a SAM file.

ChimeraUtil

ChromosomeCounts

Counts and frequency of alleles in called genotypes

CigarBuilder

This class allows code that manipulates cigars to do so naively by handling complications such as merging consecutive identical operators within the builder.
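
To illustrate the one complication named above, merging consecutive identical operators, here is a minimal sketch. It is an assumption-laden illustration, not the CigarBuilder API: the class CigarMergeSketch and its methods are hypothetical, it works on plain characters rather than htsjdk types, and it handles none of the other complications the real builder may address.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical builder that merges adjacent elements sharing an operator; not GATK's CigarBuilder. */
final class CigarMergeSketch {
    // Each entry is {length, operator character}.
    private final List<int[]> elements = new ArrayList<>();

    /** Appends an element, folding it into the previous one when the operators match. */
    CigarMergeSketch add(int length, char op) {
        if (length <= 0) {
            return this; // zero-length elements contribute nothing
        }
        if (!elements.isEmpty()) {
            int[] last = elements.get(elements.size() - 1);
            if (last[1] == op) {
                last[0] += length;
                return this;
            }
        }
        elements.add(new int[] {length, op});
        return this;
    }

    /** Renders the accumulated elements as a CIGAR-style string. */
    String build() {
        StringBuilder sb = new StringBuilder();
        for (int[] e : elements) {
            sb.append(e[0]).append((char) e[1]);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(new CigarMergeSketch().add(10, 'M').add(5, 'M').add(2, 'I').build());
    }
}
```

Calling code can append elements one at a time without checking what came before, which is the "naive" usage the description refers to.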

CigarBuilder.Result

CigarUtils

CircularByteBuffer

Implementation of a circular byte buffer that uses a large byte[] internally and supports basic read/write operations from/to other byte[]s passed as arguments.

ClassUtils

Utilities for dealing with reflection.

CleanSam

ClinVarFilter

FuncotationFilter matching variants which: occur on a gene in the American College of Medical Genomics (ACMG)'s list of clinically-significant variants; have been labeled by ClinVar as pathogenic or likely pathogenic; and have a max MAF of 5% across sub-populations of ExAC or gnomAD.

ClippingOp

Represents a clip on a read.

ClippingRankSumTest

Rank Sum Test for hard-clipped bases on REF versus ALT reads

ClippingRepresentation

How should we represent clipped bases in a read?

ClippingUtility

Utilities to clip the adapter sequence from a SAMRecord read

ClipReads

Read clipping based on quality, position or sequence matching.

ClocsFileFaker

ClocsFileReader

The clocs file format is one of 3 Illumina formats (pos, locs, and clocs) that store position data exclusively.

CloseAtEndIterator<E>

An Iterator that automatically closes a resource when the end of the iteration is reached.

ClosestSVFinder

Efficiently clusters a set of evaluation ("eval") SVs with their closest truth SVs.

ClosestSVFinder.ClosestPair

Output container for an evaluation record and its closest truth record.

ClusterCrosscheckMetrics

Summary

ClusterData

Store the information from Illumina files for a single cluster with one or more reads.

ClusterDataToSamConverter

Converts ClusterData provided by an IlluminaDataProvider into one or two SAMRecords, as appropriate, optionally marking adapter sequence.

ClusteredCrosscheckMetric

A metric class to hold the result of ClusterCrosscheckMetrics fingerprints.

ClusteredEventsFilter

ClusteringParameters

Stores clustering parameters for different combinations of supporting algorithm types (depth-only/depth-only, depth-only/PESR, and PESR/PESR)

CNNScoreVariants

Annotate a VCF with scores from a Convolutional Neural Network (CNN).

CNNVariantTrain

Train a Convolutional Neural Network (CNN) for filtering variants.

CNNVariantWriteTensors

Write variant tensors for training a Convolutional Neural Network (CNN) for filtering variants.

CNVInputReader

CNVLinkage

Clustering engine class for defragmenting depth-based DEL/DUP calls, such as those produced by GermlineCNVCaller.

CollectAlignmentSummaryMetrics

A command line tool to read a BAM file and produce standard alignment metrics that would be applicable to any alignment.

CollectAlignmentSummaryMetrics.CollectAlignmentRefArgCollection

CollectAllelicCounts

Collects reference and alternate allele counts at specified sites.

CollectAllelicCountsSpark

See CollectAllelicCounts.

CollectArraysVariantCallingMetrics

Collects summary and per-sample metrics about variant calls in a VCF file.

CollectArraysVariantCallingMetrics.ArraysControlCodesSummaryMetrics

CollectArraysVariantCallingMetrics.ArraysVariantCallingDetailMetrics

CollectArraysVariantCallingMetrics.ArraysVariantCallingSummaryMetrics

CollectBaseDistributionByCycle

CollectBaseDistributionByCycleSpark

Collects base distribution per cycle in SAM/BAM/CRAM file(s).

CollectDuplicateMetrics

Collect DuplicateMark'ing metrics from an input file that was already Duplicate-Marked.

CollectF1R2Counts

At each genomic locus, count the number of F1R2/F2R1 alt reads.

CollectF1R2CountsArgumentCollection

CollectGcBiasMetrics

Tool to collect information about GC bias in the reads in a given BAM file.

CollectHiSeqXPfFailMetrics

Collect metrics regarding the reason for reads (sequenced by HiSeqX) not passing the Illumina PF Filter.

CollectHiSeqXPfFailMetrics.PFFailDetailedMetric

A metric class for describing PF-failing reads from an Illumina HiSeqX lane

CollectHiSeqXPfFailMetrics.PFFailSummaryMetric

Metrics produced by the GetHiSeqXPFFailMetrics program.

CollectHiSeqXPfFailMetrics.ReadClassifier

CollectHiSeqXPfFailMetrics.ReadClassifier.PfFailReason

CollectHsMetrics

This tool takes a SAM/BAM file input and collects metrics that are specific for sequence datasets generated through hybrid-selection.

CollectIlluminaBasecallingMetrics

A command line tool to collect Illumina Basecalling metrics for a sequencing run. Requires a Lane and an input file of Barcodes to expect.

CollectIlluminaLaneMetrics

Command-line wrapper around CollectIlluminaLaneMetrics.IlluminaLaneMetricsCollector.

CollectIlluminaLaneMetrics.IlluminaLaneMetricsCollector

Utility for collating Tile records from the Illumina TileMetrics file into lane-level and phasing-level metrics.

CollectIndependentReplicateMetrics

A CLP that, given a BAM and a VCF with genotypes of the same sample, estimates the rate of independent replication of reads within the bam.

CollectInsertSizeMetrics

Command line program to read non-duplicate insert sizes, create a Histogram and report distribution statistics.

CollectInsertSizeMetricsSpark

Collects insert size distribution information in alignment data.

CollectJumpingLibraryMetrics

Command-line program to compute metrics about outward-facing pairs, inward-facing pairs, and chimeras in a jumping library.

CollectMultipleMetrics

Class that is designed to instantiate and execute multiple metrics programs that extend SinglePassSamProgram while making only a single pass through the SAM file and supplying each program with the records as it goes.

CollectMultipleMetrics.Program

CollectMultipleMetrics.ProgramInterface

CollectMultipleMetricsSpark

Runs multiple metrics collection modules for a given alignment file.

CollectMultipleMetricsSpark.SparkCollectorProvider

CollectMultipleMetricsSpark.SparkCollectors

CollectOxoGMetrics

Class for trying to quantify the CpCG->CpCA error rate.

CollectOxoGMetrics.CpcgMetrics

Metrics class for outputs.

CollectQualityYieldMetrics

Command line program to calculate quality yield metrics

CollectQualityYieldMetrics.QualityYieldMetrics

A set of metrics used to describe the general quality of a BAM file

CollectQualityYieldMetrics.QualityYieldMetricsCollector

CollectQualityYieldMetrics.QualityYieldMetricsFlow

CollectQualityYieldMetricsSpark

Collects quality yield metrics in SAM/BAM/CRAM file(s).

CollectRawWgsMetrics

Computes a number of metrics that are useful for evaluating coverage and performance of whole genome sequencing experiments. Same implementation as CollectWgsMetrics, with different defaults: it lacks baseQ and mappingQ filters and has a much higher coverage cap.

CollectRawWgsMetrics.RawWgsMetrics

CollectReadCounts

Collects read counts at specified intervals.

CollectReadCounts.Format

CollectRnaSeqMetrics

CollectRrbsMetrics

Calculates and reports QC metrics for RRBS data based on the methylation status at individual C/G bases as well as CpG sites across all reads in the input BAM/SAM file.

CollectRrbsMetrics.CollectRrbsMetricsReferenceArgumentCollection

CollectSamErrorMetrics

Program to collect error metrics on bases stratified in various ways.

CollectSequencingArtifactMetrics

Quantify substitution errors caused by mismatched base pairings during various stages of sample / library prep.

CollectSVEvidence

Creates discordant read pair, split read evidence, site depth, and read depth files for use in the GATK-SV pipeline.

CollectSVEvidence.BAFSiteIterator

CollectTargetedMetrics<METRIC extends MultilevelMetrics,COLLECTOR extends TargetMetricsCollector<METRIC>>

Both CollectTargetedPCRMetrics and CollectHsSelection share virtually identical program structures except for the name of their targeting mechanisms (e.g.

CollectTargetedPcrMetrics

This tool calculates a set of PCR-related metrics from an aligned SAM or BAM file containing targeted sequencing data.

CollectVariantCallingMetrics

Collects summary and per-sample metrics about variant calls in a VCF file.

CollectVariantCallingMetrics.VariantCallingDetailMetrics

A collection of metrics relating to snps and indels within a variant-calling file (VCF) for a given sample.

CollectVariantCallingMetrics.VariantCallingSummaryMetrics

A collection of metrics relating to snps and indels within a variant-calling file (VCF).

CollectWgsMetrics

Computes a number of metrics that are useful for evaluating coverage and performance of whole genome sequencing experiments.

CollectWgsMetrics.CollectWgsMetricsIntervalArgumentCollection

CollectWgsMetrics.WgsMetricsCollector

CollectWgsMetricsWithNonZeroCoverage

CollectWgsMetricsWithNonZeroCoverage.WgsMetricsWithNonZeroCoverage

Metrics for evaluating the performance of whole genome sequencing experiments.

CollectWgsMetricsWithNonZeroCoverage.WgsMetricsWithNonZeroCoverage.Category

CombineGenotypingArrayVcfs

A simple program to combine multiple genotyping array VCFs into one VCF. The input VCFs must have the same sequence dictionary and same list of variant loci.

CombineGVCFs

Combine per-sample gVCF files produced by HaplotypeCaller into a multi-sample gVCF file

CombineSegmentBreakpoints

CommandLineArgumentValidator

Adapter shim/alternate GATK entry point for use by GATK tests to run tools in command line argument validation mode.

CommandLineArgumentValidatorMain

Main class to be used as an alternative entry point to org.broadinstitute.hellbender.Main for performing command line validation only rather than executing the tool.

CommandLineDefaults

Embodies defaults for global values that affect how the Picard Command Line operates.

CommandLineProgram

Abstract class to facilitate writing command-line programs.

CommandLineProgram

Abstract class to facilitate writing command-line programs.

CommandLineProgram.AutoCloseableNoCheckedExceptions

A shim to make use of try-with-resources for tool shutdown

CommandLineSyntaxTranslater

Class for handling translation of Picard-style command line argument syntax to POSIX-style argument syntax; used for running tests written with Picard style syntax against the Barclay command line parser.

CommonSuffixSplitter

Split a collection of middle nodes in a graph into their shared prefix and suffix values. This code performs the following transformation.

CompareBaseQualities

Compares the base qualities of two SAM/BAM/CRAM files.

CompareDuplicatesSpark

Determine if two potentially identical BAMs have the same duplicate reads.

CompareGtcFiles

A simple tool to compare two Illumina GTC files.

CompareIntervalLists

CompareMatrix

CompareMatrix contains a square matrix of linear dimension QualityUtils.MAX_SAM_QUAL_SCORE.

CompareMetrics

Compare two metrics files.

CompareMetrics.MetricComparisonDifferences

CompareReferences

Display reference comparison as a tab-delimited table and summarize reference differences.

CompareReferences.BaseComparisonMode

CompareReferences.CompareReferencesOutputTableWriter

TableWriter to format and write the table output.

CompareReferences.FindSNPsOnlyTableWriter

TableWriter to format and write SNP table output.

CompareReferences.MD5CalculationMode

CompareSAMs

Rudimentary SAM comparer.

CompFeatureInput

Required stratification grouping output by each comp

ComplexityPartitioner

A Spark Partitioner that puts tasks with greater complexities into earlier partitions.

ComposeSTRTableFile

This tool looks for low-complexity STR sequences along the reference that are later used to estimate the Dragstr model during single sample auto calibration (CalibrateDragstrModel).

CompositeOutputRenderer

Class to produce multiple Funcotator outputs at the same time.

CompOverlap

CompressedDataList<T>

A class to represent data as a list of <value,count> pairs.
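
The idea of storing data as value/count pairs can be sketched as follows. This is an illustration only, not the CompressedDataList API: the class ValueCountSketch and its method names are hypothetical, and it uses a LinkedHashMap where the real class may use a different structure.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical value/count store; not GATK's CompressedDataList. */
final class ValueCountSketch<T> {
    // One entry per distinct value, mapped to how many times it was observed.
    private final Map<T, Integer> counts = new LinkedHashMap<>();

    void add(T value) {
        add(value, 1);
    }

    void add(T value, int count) {
        if (count <= 0) {
            throw new IllegalArgumentException("count must be positive");
        }
        counts.merge(value, count, Integer::sum);
    }

    int countOf(T value) {
        return counts.getOrDefault(value, 0);
    }

    /** Number of stored pairs, i.e. distinct values. */
    int distinctValues() {
        return counts.size();
    }

    /** Size the data would have if expanded back into a flat list. */
    long totalObservations() {
        long total = 0;
        for (int c : counts.values()) {
            total += c;
        }
        return total;
    }

    public static void main(String[] args) {
        ValueCountSketch<Integer> depths = new ValueCountSketch<>();
        depths.add(30);
        depths.add(30);
        depths.add(31);
        System.out.println(depths.distinctValues() + " pairs for " + depths.totalObservations() + " observations");
    }
}
```

The representation pays off when the data contains many repeated values, since memory grows with the number of distinct values rather than with the number of observations.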

Concordance

Evaluate site-level concordance of an input VCF against a truth VCF.

ConcordanceState

Created by davidben on 3/2/17.

ConcordanceSummaryRecord

Created by tsato on 2/8/17.

ConcordanceSummaryRecord.Reader

ConcordanceSummaryRecord.Writer

CondenseDepthEvidence

Combines adjacent intervals in DepthEvidence files.

ConfigFactory

A singleton class to act as a user interface for loading configuration files from org.aeonbits.owner.

ContainsKmerReadFilter

Keep reads that do NOT contain one or more kmers from a set of SVKmerShorts

ContainsKmerReadFilterSpark

Wrapper for ContainsKmerReadFilter to avoid serializing the kmer filter in Spark

ContaminationFilter

ContaminationModel

This is the probabilistic contamination model that we use to distinguish homs from hets. The model is similar to that of ContEst, in that it assumes that each contaminant read is independently drawn from the population.

ContaminationRecord

Created by David Benjamin on 2/13/17.

ContaminationSegmenter

ContextCovariate

Contig

Stratifies the evaluation by each contig in the reference sequence.

ContigAlignmentsModifier

ContigAlignmentsModifier.AlnModType

ContigAlignmentsModifier.AlnModType.ModTypeString

ContigChimericAlignmentIterativeInterpreter

This class scans the chimeric alignments of input AlignedContig, filters out the alignments that offer weak evidence for a breakpoint, and makes an interpretation based on the SimpleChimera extracted.

ConvertHaplotypeDatabaseToVcf

ConvertHeaderlessHadoopBamShardToBam

This is a troubleshooting utility that converts a headerless BAM shard (e.g., a part-r-00000.bam, part-r-00001.bam, etc.), produced by a Spark tool with --sharded-output set to true, into a readable BAM file by adding a header and a BGZF terminator.

ConvertSequencingArtifactToOxoG

CopyNumberAnnotations

CopyNumberArgumentValidationUtils

CopyNumberFormatsUtils

CopyNumberPosteriorDistribution

A record containing the integer copy-number posterior distribution for a single interval.

CopyNumberPosteriorDistributionCollection

Collection of integer copy-number posteriors.

CopyNumberProgramGroup

Tools that analyze read coverage to detect copy number variants

CopyNumberStandardArgument

CopyRatio

CopyRatioCollection

CopyRatioKernelSegmenter

Segments copy-ratio data using kernel segmentation.

CopyRatioModeller

Represents a segmented model for copy ratio fit to denoised log2 copy-ratio data.

CopyRatioParameter

Enumerates the parameters for CopyRatioState.

CopyRatioSegment

CopyRatioSegmentCollection

CosmicFuncotationFactory

Factory for creating Funcotations by handling a SQLite database containing information from COSMIC.

CountBases

Count and print to standard output (and optionally to a file) the total number of bases in a SAM/BAM/CRAM file

CountBasesInReference

Counts the number of times each base occurs in a reference, and prints the counts to standard output (and optionally to a file).

CountBasesSpark

Calculate the overall number of bases in a SAM/BAM/CRAM file

CounterManager

Class for managing a list of integer Counters; provides methods to access data from Counters with respect to an offset.

CountFalsePositives

Count variants which were not filtered in a VCF.

CountingAdapterFilter

Counting filter that discards reads that are unaligned or aligned with MQ==0 and whose 5' ends look like adapter sequence

CountingDuplicateFilter

Counting filter that discards reads that have been marked as duplicates.

CountingFilter

A SamRecordFilter that counts the number of bases in the reads which it filters out.

CountingMapQFilter

Counting filter that discards reads below a configurable mapping quality threshold.

CountingPairedFilter

Counting filter that discards reads that are unpaired in sequencing and paired reads whose mates are not mapped.

CountingReadFilter

Wrapper/adapter for ReadFilter that counts the number of reads filtered, and provides a filter count summary.

CountingReadFilter.CountingAndReadFilter

Private class for Counting AND filters

CountingVariantFilter

Wrapper/adapter for VariantFilter that counts the number of variants filtered, and provides a filter count summary.

CountingVariantFilter.CountingAndVariantFilter

Private class for Counting AND filters

CountNs

Apply a read-based annotation that reports the number of Ns seen at a given site.

CountReads

Count and print to standard output (and optionally to a file) the total number of reads in a SAM/BAM/CRAM file.

CountReadsSpark

Calculate the overall number of reads in a SAM/BAM file

CountVariants

Count variant records in a VCF file, regardless of filter status.

CountVariants

CountVariantsSpark

Covariate

The Covariate interface.

CovariateKeyCache

Coverage

Total depth of coverage per sample and over all samples.

CoverageAnalysisProgramGroup

Tools that count coverage, e.g.

CoverageOutputWriter

This is a class for managing the output formatting/files for DepthOfCoverage.

CoverageOutputWriter.DEPTH_OF_COVERAGE_OUTPUT_FORMAT

CoveragePerContig

Represents total coverage over each contig in an ordered set associated with a named sample.

CoveragePerContigCollection

Represents a sequence dictionary and total coverage over each contig in an ordered set associated with a cohort of named samples.

CpG

CpG is a stratification module for VariantEval that divides the input data by within/not within a CpG site

CpxVariantCanonicalRepresentation

This struct contains two key pieces of information that provide interpretation of the event:

CpxVariantCanonicalRepresentation.Serializer

CpxVariantInducingAssemblyContig

One of the two fundamental classes (the other is CpxVariantCanonicalRepresentation) for complex variant interpretation and alt haplotype extraction.

CpxVariantInducingAssemblyContig.Serializer

CpxVariantInterpreter

This deals with the special case where a contig has multiple (> 2) alignments and seemingly has the complete alt haplotype assembled.

CpxVariantReInterpreterSpark

(Internal) Tries to extract simple variants from a provided GATK-SV CPX.vcf

CreateBafRegressMetricsFile

A simple program to create a standard picard metrics file from the output of bafRegress

CreateExtendedIlluminaManifest

Create an Extended Illumina Manifest by performing a liftover to Build 37.

CreateHadoopBamSplittingIndex

Create a Hadoop BAM splitting index and optionally a BAM index from a BAM file.

CreateReadCountPanelOfNormals

Creates a panel of normals (PoN) for read-count denoising given the read counts for samples in the panel.

CreateSequenceDictionary

Create a SAM/BAM file from a fasta containing reference sequence.

CreateSequenceDictionary.CreateSeqDictReferenceArgumentCollection

CreateSomaticPanelOfNormals

Create a panel of normals (PoN) containing germline and artifactual sites for use with Mutect2.

CreateVerifyIDIntensityContaminationMetricsFile

A simple program to create a standard picard metrics file from the output of VerifyIDIntensity

CrosscheckFingerprints

Checks that all data in the set of input files appear to come from the same individual.

CrosscheckMetric

A class to hold the result of crosschecking fingerprints.

CrosscheckMetric.DataType

The data type.

CrosscheckMetric.FingerprintResult

CrosscheckReadGroupFingerprints

Deprecated.

6/6/2017 Use CrosscheckFingerprints instead.

CsvInputParser

CustomBooleanConverter

Converts a given string into a Boolean after trimming whitespace from that string.

CustomMafFuncotationCreator

Produces custom MAF fields (e.g.

CycleCovariate

The Cycle covariate.

CycleSkipStatus

Flow Annotation: cycle skip status: cycle-skip, possible-cycle-skip, non-skip

DataCollection

Interface for tagging any class that represents a collection of datasets required to update posterior samples for Markov-Chain Monte Carlo sampling using samplers implementing the ParameterSampler interface.

DataLine

Table data-line string array wrapper.

DataPoint

DataSourceFuncotationFactory

An abstract class to allow for the creation of a Funcotation for a given data source.

DataSourceUtils

Utilities for reading / working with / manipulating Data Sources.

Datum

DbsnpArgumentCollection

DbSnpBitSetUtil

Utility class to use with DbSnp files to determine if a locus is a dbSnp site.

DbSnpBitSetUtil

Utility class to use with DbSnp files to determine if a locus is a dbSnp site.

DbSnpBitSetUtil.DbSnpBitSets

Little tuple class to contain one bitset for SNPs and another for Indels.

DbSnpBitSetUtil.DbSnpBitSets

Little tuple class to contain one bitset for SNPs and another for Indels.

DbSnpVariantType

Enum to hold the possible types of dbSnps.

Decile

Enumerates individual deciles.

DecileCollection

Represents a set of deciles.

DefaultGATKReadFilterArgumentCollection

Default GATKReadFilterArgumentCollection applied in GATK for optional read filters in the command line.

DefaultGATKVariantAnnotationArgumentCollection

Arguments for requesting VariantContext annotations to be processed by VariantAnnotatorEngine for tools that process variants objects.

Degeneracy

Experimental stratification by the degeneracy of an amino acid, according to VCF annotation.

DelimitedTextFileWithHeaderIterator

Iterate through a delimited text file in which columns are found by looking at a header line rather than by position.

DenoiseReadCounts

Denoises read counts to produce denoised copy ratios.

DeprecatedToolsRegistry

When a tool is removed from GATK (after having been tagged with @DeprecatedFeature for a suitable period), an entry should be added to this list to issue a message when the user tries to run that tool.

DepthEvidence

Read counts for an indefinite number of samples on some interval.

DepthEvidenceBCICodec

Codec to handle DepthEvidence in BlockCompressedInterval files

DepthEvidenceCodec

Codec to handle DepthEvidence in tab-delimited text files

DepthEvidenceSortMerger

Merges records for the same interval into a single record when possible; throws if not possible.

DepthFilter

Filters out a record if all variant samples have depth lower than the given value.

DepthOfCoverage

Assess sequence coverage by a wide array of metrics, partitioned by sample, read group, or library

DepthOfCoveragePartitionedDataStore

A class helper for storing running intervalPartition data.

DepthOfCoverageStats

A class for storing summarized coverage statistics for DepthOfCoverage.

DepthOneHistograms

Holds histograms of alt depth=1 sites for reference contexts.

DepthPerAlleleBySample

Depth of coverage of each allele per sample

DepthPerSampleHC

Depth of informative coverage for each sample.

DetermineGermlineContigPloidy

Determines the integer ploidy state of all contigs for germline samples given counts data.

DetermineGermlineContigPloidy.RunMode

DiagnosticsAndQCProgramGroup

Tools that collect sequencing quality-related and comparative metrics

DigammaCache

DiploidGenotype

A genotype produced by one of the concrete implementations of AbstractAlleleCaller.

DiploidHaplotype

Simple enum to represent the three possible combinations of major/major, major/minor and minor/minor haplotypes for a diploid individual.

Dirichlet

The Dirichlet distribution is a distribution on multinomial distributions: if pi is a vector of positive multinomial weights such that sum_i pi[i] = 1, the Dirichlet pdf is P(pi) = [Gamma(sum_i alpha[i]) / prod_i Gamma(alpha[i])] * prod_i pi[i]^(alpha[i] - 1). The vector alpha comprises the sufficient statistics for the Dirichlet distribution.
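
As a worked check of the density, the sketch below evaluates the log of the standard Dirichlet pdf, log Gamma(sum_i alpha[i]) - sum_i log Gamma(alpha[i]) + sum_i (alpha[i] - 1) * log(pi[i]). It is not the Dirichlet class listed here: DirichletSketch and its methods are hypothetical, and because java.lang.Math has no log-gamma function, the sketch assumes a Lanczos approximation that is valid only for arguments of at least 0.5.

```java
/** Hypothetical log-density calculator; not GATK's Dirichlet class. */
final class DirichletSketch {
    // Lanczos coefficients for g = 7 with 9 terms (a widely published set).
    private static final double[] LANCZOS = {
        0.99999999999980993, 676.5203681218851, -1259.1392167224028,
        771.32342877765313, -176.61502916214059, 12.507343278686905,
        -0.13857109526572012, 9.9843695780195716e-6, 1.5056327351493116e-7
    };

    /** Approximates log Gamma(x); this sketch supports x >= 0.5 only. */
    static double logGamma(double x) {
        if (x < 0.5) {
            throw new IllegalArgumentException("sketch supports x >= 0.5 only");
        }
        double z = x - 1.0;
        double sum = LANCZOS[0];
        for (int i = 1; i < LANCZOS.length; i++) {
            sum += LANCZOS[i] / (z + i);
        }
        double t = z + 7.5; // z + g + 0.5 with g = 7
        return 0.5 * Math.log(2.0 * Math.PI) + (z + 0.5) * Math.log(t) - t + Math.log(sum);
    }

    /** Log of the Dirichlet pdf at pi, where pi is positive and sums to 1. */
    static double logDensity(double[] alpha, double[] pi) {
        if (alpha.length != pi.length) {
            throw new IllegalArgumentException("alpha and pi must have the same length");
        }
        double alphaSum = 0.0;
        double result = 0.0;
        for (int i = 0; i < alpha.length; i++) {
            alphaSum += alpha[i];
            result += (alpha[i] - 1.0) * Math.log(pi[i]) - logGamma(alpha[i]);
        }
        return result + logGamma(alphaSum);
    }

    public static void main(String[] args) {
        System.out.println(logDensity(new double[] {1, 1, 1}, new double[] {0.2, 0.3, 0.5}));
    }
}
```

With alpha = (1, 1, 1) the density is flat over the simplex and equals Gamma(3) = 2, so the log density is log 2 whatever pi is chosen.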

DiscordantPairEvidence

Documents evidence of a too-close or too-far-apart read pair.

DiscordantPairEvidenceBCICodec

Codec to handle DiscordantPairEvidence in BlockCompressedInterval files

DiscordantPairEvidenceCodec

Codec to handle DiscordantPairEvidence in tab-delimited text files

DiscoverVariantsFromContigAlignmentsSAMSpark

(Internal) Examines aligned contigs from local assemblies and calls structural variants

DiskBasedReadEndsForMarkDuplicatesMap

Disk-based implementation of ReadEndsForMarkDuplicatesMap.

DistanceMetric

DoCOutputType

Models a single output file in the DoC walker.

DoCOutputType.Aggregation

DoCOutputType.FileType

DoCOutputType.Partition

DoNotSubclass

Classes annotated with this annotation are NOT intended or designed to be extended and should be treated as final.

DoubleSequence

User argument to specify a sequence of doubles with 3 values in the format "start:step:limit".
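
A "start:step:limit" specification can be expanded into its values along the following lines. This is a sketch under assumptions rather than the DoubleSequence implementation: DoubleSequenceSketch is hypothetical, and treating the limit as inclusive (within a small tolerance) is an assumption made here, not something this entry states.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical parser for "start:step:limit"; not GATK's DoubleSequence. */
final class DoubleSequenceSketch {
    static List<Double> parse(String spec) {
        String[] parts = spec.split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected start:step:limit but got: " + spec);
        }
        double start = Double.parseDouble(parts[0]);
        double step = Double.parseDouble(parts[1]);
        double limit = Double.parseDouble(parts[2]);
        if (step <= 0 || limit < start) {
            throw new IllegalArgumentException("need step > 0 and limit >= start");
        }
        List<Double> values = new ArrayList<>();
        // Assumption: the limit is inclusive, with a tolerance to absorb rounding error.
        double tolerance = step * 1e-9;
        // Multiply instead of repeatedly adding so rounding error does not accumulate.
        for (int i = 0; start + i * step <= limit + tolerance; i++) {
            values.add(start + i * step);
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parse("0:0.5:2"));
    }
}
```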

DownsampleableSparkReadShard

A simple shard implementation intended to be used for splitting reads by partition in Spark tools

DownsampleByDuplicateSet

Given a bam grouped by the same unique molecular identifier (UMI), this tool drops a specified fraction of duplicate sets and returns a new bam.

Downsampler<T>

The basic downsampler API, with no reads-specific operations.

DownsampleSam

Summary

DownsampleType

Type of downsampling method to invoke.

DownsamplingMethod

Describes the method for downsampling reads at a given locus.

DRAGENGenotypesModel

This is the DRAGEN-GATK genotyper model.

DRAGENMappingQualityReadTransformer

Read transformer intended to replicate DRAGEN behavior for handling mapping qualities.

DragstrHyperParameters

DragstrLocus

Holds information about a locus on the reference that might be used to estimate the DRAGstr model parameters.

DragstrLocus.WriteAction

DragstrLocusCase

Represents the DRAGstr model fitting relevant stats at a given locus on the genome for the target sample.

DragstrLocusCases

Collection of Dragstr Locus cases constrained to a particular period and (minimum) repeat-length

DragstrLocusUtils

DragstrLocusUtils.BinaryTableIndex

DragstrPairHMMInputScoreImputator

Pair-HMM score imputator based on the DRAGstr model parameters.

DragstrParams

Holds the values of the DRAGstr model parameters for different combinations of repeat unit length (period) and number of repeat units.

DragstrParamsBuilder

Partial mutable collection of Dragstr Parameters used to compose the final immutable DragstrParams.

DragstrParamUtils

Utils to read and write DragstrParams instances from and to files and other resources.

DragstrReadSTRAnalyzer

Utility to find short-tandem-repeats on read sequences.

DragstrReferenceAnalyzer

Tool to figure out the period and repeat-length (in units) of STRs in a reference sequence.

DumpTabixIndex

DuplicatedAltReadFilter

DuplicateSetWalker

A walker that processes duplicate reads that share the same unique molecular identifier (UMI) as a single unit.

DuplicationMetrics

Metrics that are calculated during the process of marking duplicates within a stream of SAMRecords.

DuplicationMetricsFactory

Factory class that creates either regular or flow-based duplication metrics.

DUSTReadTransformer

Masks read bases and base qualities using the symmetric DUST algorithm

EarliestFragmentPrimaryAlignmentSelectionStrategy

When it is necessary to pick a primary alignment from a group of alignments for a read, pick the one that maps the earliest base in the read.

EarliestFragmentPrimaryAlignmentSelectionStrategy

When it is necessary to pick a primary alignment from a group of alignments for a read, pick the one that maps the earliest base in the read.

EmpiricalPhasingMetricsOutReader

EmpiricalPhasingMetricsOutReader.IlluminaPhasingMetrics

EmptyFragment

Dummy class representing a mated read fragment at a particular start position to be used for accounting when deciding whether to duplicate unmatched fragments.

EmptyOutputArgumentCollection

EnsemblGtfCodec

Codec to decode data in GTF format from ENSEMBL.

ErrorMetric

Created by farjoun on 6/26/18.

ErrorProbabilities

ErrorSummaryMetrics

Summary metrics produced by CollectSequencingArtifactMetrics as a roll up of the context-specific error rates, to provide global error rates per type of base substitution.

ErrorType

Errors in Mutect2 fall into three major categories -- technical artifacts that depend on (usually hidden) features and do not follow the independent reads assumption of the somatic likelihoods model, non-somatic variants such as germline mutations and contamination, and sequencing errors that are captured by the base qualities and the somatic likelihoods model.

EstimateLibraryComplexity

Attempts to estimate library complexity from sequence alone.

EvalFeatureInput

Required stratification grouping output by each eval

EvaluateInfoFieldConcordance

Compare INFO field values between two VCFs or compare two different INFO fields from one VCF.

EvaluationContext

EventMap

Extract simple VariantContext events from a single haplotype

EventType

EvidenceTargetLink

This class holds information about pairs of intervals on the reference that are connected by one or more BreakpointEvidence objects that have distal targets.

EvidenceTargetLink.Serializer

EvidenceTargetLinkClusterer

This class is responsible for iterating over a collection of BreakpointEvidence to find clusters of evidence with distal targets (discordant read pairs or split reads) that agree in their location and target intervals and strands.

ExampleAssemblyRegionWalker

Example/toy program that shows how to implement the AssemblyRegionWalker interface.

ExampleAssemblyRegionWalkerSpark

Example/toy program that shows how to implement the AssemblyRegionWalker interface.

ExampleCollectMultiMetricsSpark

Example Spark tool for collecting multi-level metrics.

ExampleCollectSingleMetricsSpark

Example Spark tool for collecting example single-level metrics.

ExampleFeatureWalker

Example/toy program that shows how to implement the FeatureWalker interface.

ExampleIntervalWalker

Example/toy program that shows how to implement the IntervalWalker interface.

ExampleIntervalWalkerSpark

Example/toy program that shows how to implement the IntervalWalker interface.

ExampleLocusWalker

Example/toy program that shows how to implement the LocusWalker interface.

ExampleLocusWalkerSpark

Example/toy program that shows how to implement the LocusWalker interface.

ExampleMultiFeatureWalker

Example subclass that shows how to use the MultiFeatureWalker class.

ExampleMultiMetrics

An example multi-level metrics collector that just counts the number of reads (per unit/level)

ExampleMultiMetricsArgumentCollection

Example argument collection for multi-level metrics.

ExampleMultiMetricsCollector

Example multi-level metrics collector for illustrating how to collect metrics on specified accumulation levels.

ExampleMultiMetricsCollectorSpark

Example implementation of a multi-level Spark metrics collector.

ExamplePartialReadWalker

Example/toy program that prints reads from the provided file or files with corresponding reference bases (if a reference is provided).

ExamplePostTraversalPythonExecutor

Example/toy ReadWalker program that uses a Python script.

ExampleProgramGroup

Program group for Example programs

ExampleReadWalkerWithReference

Example/toy program that prints reads from the provided file or files with corresponding reference bases (if a reference is provided).

ExampleReadWalkerWithReferenceSpark

Example/toy program that prints reads from the provided file or files with corresponding reference bases (if a reference is provided).

ExampleReadWalkerWithVariants

Example/toy program that prints reads from the provided file or files along with overlapping variants (if a source of variants is provided).

ExampleReadWalkerWithVariantsSpark

Example/toy program that prints reads from the provided file or files along with overlapping variants (if a source of variants is provided).

ExampleReferenceWalker

Counts the number of times each reference context is seen as well as how many times it's overlapped by reads and variants.

ExampleSingleMetrics

ExampleSingleMetricsArgumentCollection

Argument collection for example single-level metrics.

ExampleSingleMetricsCollectorSpark

ExampleSingleMetricsCollector for Spark.

ExampleStreamingPythonExecutor

Example ReadWalker program that uses a Python streaming executor to stream summary data from a BAM input file to a Python process through an asynchronous stream writer.

ExampleTwoPassVariantWalker

This walker makes two traversals through variants in a vcf.

ExampleVariantWalker

Example/toy program that shows how to implement the VariantWalker interface.

ExampleVariantWalkerSpark

Example/toy program that shows how to implement the VariantWalker interface.

ExcessHet

Phred-scaled p-value for exact test of excess heterozygosity.
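
As a rough illustration of what "Phred-scaled" means in this summary, the sketch below converts a p-value p to -10 * log10(p) and back. The class and method names are hypothetical and are not part of the GATK API; the exact test that produces the p-value is not shown.

```java
// Hypothetical helper, not part of GATK: Phred scaling of a p-value.
final class PhredScaleSketch {
    private PhredScaleSketch() {}

    /** Returns -10 * log10(p) for a p-value in (0, 1]. */
    static double toPhred(final double pValue) {
        if (!(pValue > 0.0 && pValue <= 1.0)) {
            throw new IllegalArgumentException("p-value must be in (0, 1]: " + pValue);
        }
        return -10.0 * Math.log10(pValue);
    }

    /** Inverse transform: recovers the p-value from a Phred-scaled score. */
    static double fromPhred(final double phred) {
        return Math.pow(10.0, -phred / 10.0);
    }
}
```

Under this scaling, smaller p-values give larger scores, so a p-value of 0.001 corresponds to a score of about 30.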

ExcessiveEndClippedReadFilter

Filter out reads where the number of soft-/hard-clipped bases on either end is above a certain threshold.

ExomeStandardArgumentDefinitions

Created by davidben on 11/30/15.

ExtractBarcodesProgram

ExtractFingerprint

Program to create a fingerprint for the contaminating sample when the level of contamination is both known and uniform in the genome.

ExtractIlluminaBarcodes

Determine the barcode for each read in an Illumina lane.

ExtractIlluminaBarcodes.PerTileBarcodeExtractor

Extracts barcodes and accumulates metrics for an entire tile.

ExtractOriginalAlignmentRecordsByNameSpark

Subsets reads by name (basically a parallel version of "grep -f", or "grep -vf")

ExtractSequences

Simple command line program that allows sub-sequences represented by an interval list to be extracted from a reference sequence file.

ExtractSVEvidenceSpark

(Internal) Extracts evidence of structural variations from reads

ExtractVariantAnnotations

Extracts site-level variant annotations, labels, and other metadata from a VCF file to HDF5 files.

F1R2CountsCollector

F1R2FilterConstants

Created by tsato on 3/14/18.

F1R2FilterUtils

FalsePositiveRecord

Created by tsato on 2/14/17.

Family

Stratifies the eval RODs by each family in the eval ROD, as described by the pedigree.

FamilyLikelihoods

Utility to compute genotype posteriors given family priors.

FastaAlternateReferenceMaker

Generate an alternative reference sequence over the specified interval

FastaReferenceMaker

Create a subset of a FASTA reference sequence

FastqToSam

Converts a FASTQ file to an unaligned BAM or SAM file.

FastWgsMetricsCollector

Class representing a fast algorithm for collecting data from AbstractLocusInfo with a list of aligned EdgingRecordAndOffset objects.

FeatureContext

Wrapper around FeatureManager that presents Feature data from a particular interval to a client tool without improperly exposing engine internals.

FeatureDataSource<T extends htsjdk.tribble.Feature>

Enables traversals and queries over sources of Features, which are metadata associated with a location on the genome in a format supported by our file parsing framework, Tribble.

FeatureInput<T extends htsjdk.tribble.Feature>

Class to represent a Feature-containing input file.

FeatureManager

Handles discovery of available codecs and Feature arguments, file format detection and codec selection, and creation/management/querying of FeatureDataSources for each source of Features.

FeatureMapper

FeatureOutputCodec<F extends htsjdk.tribble.Feature,S extends FeatureSink<F>>

A FeatureOutputCodec can encode Features into some type of FeatureSink.

FeatureOutputCodecFinder

This class knows about all FeatureOutputCodec implementations, and allows you to find an appropriate codec to create a given file type.

FeatureOutputStream<F extends htsjdk.tribble.Feature>

Class for output streams that encode Tribble Features.

FeatureSink<F extends htsjdk.tribble.Feature>

FeatureWalker<F extends htsjdk.tribble.Feature>

A FeatureWalker is a tool that processes a Feature at a time from a source of Features, with optional contextual information from a reference, sets of reads, and/or supplementary sources of Features.

FeaturizedReadSets

For each sample and for each allele, a list of feature vectors of supporting reads. In order to reduce the number of delimiter characters, we flatten featurized reads.

FermiLiteAssemblyHandler

LocalAssemblyHandler that uses FermiLite.

FifoBuffer

Summary

FileFaker

Filter

Stratifies by the FILTER status (PASS, FAIL) of the eval records

FilterAlignmentArtifacts

Filter false positive alignment artifacts from a VCF callset.

FilterAnalysisRecord

FilterApplyingVariantIterator

Iterator that dynamically applies filter strings to VariantContext records supplied by an underlying iterator.

FilteredHaplotypeFilter

FilterFileFaker

Created by jcarey on 3/13/14.

FilterFileReader

Illumina uses an algorithm described in "Theory of RTA" that determines whether or not a cluster passes filter ("PF").

FilterFuncotations

Filter variants based on clinically-significant Funcotations.

FilterFuncotations.AlleleFrequencyDataSource

The allele frequency data source that was used when Funcotating the input VCF.

FilterFuncotations.Reference

The version of the Human Genome reference which was used when Funcotating the input VCF.

FilterFuncotationsConstants

FilterFuncotationsUtils

FilteringOutputStats

Helper class used on the final pass of FilterMutectCalls to record total expected true positives, false positives, and false negatives, as well as false positives and false negatives attributable to each filter

FilterIntervals

Given specified intervals, annotated intervals output by AnnotateIntervals, and/or counts output by CollectReadCounts, outputs a filtered Picard interval list.

FilterMutectCalls

Filter variants in a Mutect2 VCF callset.

FilterSamReads

Summary

FilterSamReads.Filter

FilterStats

FilterType

Stratifies by the FILTER type(s) for each line, with PASS used for passing

FilterVariantTranches

Apply tranche filtering to VCF based on scores from an annotation in the INFO field.

FilterVcf

Applies a set of hard filters to Variants and to Genotypes within a VCF.

FindAssemblyRegionsSpark

Find assembly regions from reads in a distributed Spark setting.

FindBadGenomicKmersSpark

Identifies sequences that occur at high frequency in a reference

FindBreakpointEvidenceSpark

(Internal) Produces local assemblies of genomic regions that may harbor structural variants

FindBreakpointEvidenceSpark.AssembledEvidenceResults

FindBreakpointEvidenceSpark.IntPair

FindMendelianViolations

Summary

Fingerprint

class to represent a genetic fingerprint as a set of HaplotypeProbabilities objects that give the relative probabilities of each of the possible haplotypes at a locus.

FingerprintChecker

Major class that coordinates the activities involved in comparing genetic fingerprint data whether the source is from a genotyping platform or derived from sequence data.

FingerprintIdDetails

class to hold the details of an element of the fingerprinting PU tag

FingerprintingDetailMetrics

Detailed metrics about an individual SNP/Haplotype comparison within a fingerprint comparison.

FingerprintingSummaryMetrics

Summary fingerprinting metrics and statistics about the comparison of the sequence data from a single read group (lane or index within a lane) vs.

FingerprintMetrics

Class for holding metrics on a single fingerprint.

FingerprintResults

Class that is used to represent the results of comparing a read group within a SAM file, or a sample within a VCF against one or more set of fingerprint genotypes.

FingerprintUtils

A set of utilities used in the fingerprinting environment

FingerprintUtils.VariantContextSet

A class that holds VariantContexts sorted by genomic position

FisherExactTest

Implements Fisher's exact test for 2x2 tables, assuming the null hypothesis of an odds ratio of 1.
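
To make the summary concrete, here is a minimal, self-contained sketch of a two-sided Fisher's exact test on a 2x2 table [[a, b], [c, d]]. It is not GATK's implementation: the class name and methods are hypothetical, and the two-sided p-value uses one common convention (sum the hypergeometric probabilities of all tables with the same margins that are no more likely than the observed table).

```java
// Hypothetical sketch, not the GATK FisherExactTest class.
final class FisherExactSketch {
    private FisherExactSketch() {}

    /** log(n!) computed by direct summation; adequate for small counts. */
    static double logFactorial(final int n) {
        double sum = 0.0;
        for (int i = 2; i <= n; i++) {
            sum += Math.log(i);
        }
        return sum;
    }

    /** Log hypergeometric probability of the table [[a, b], [c, d]] given its margins. */
    static double logTableProbability(final int a, final int b, final int c, final int d) {
        return logFactorial(a + b) + logFactorial(c + d)
                + logFactorial(a + c) + logFactorial(b + d)
                - logFactorial(a) - logFactorial(b)
                - logFactorial(c) - logFactorial(d)
                - logFactorial(a + b + c + d);
    }

    /** Two-sided p-value under the null hypothesis of an odds ratio of 1. */
    static double twoSidedPValue(final int a, final int b, final int c, final int d) {
        if (a < 0 || b < 0 || c < 0 || d < 0) {
            throw new IllegalArgumentException("counts must be non-negative");
        }
        final int row1 = a + b;
        final int col1 = a + c;
        final int n = a + b + c + d;
        final int lo = Math.max(0, row1 + col1 - n); // smallest feasible top-left count
        final int hi = Math.min(row1, col1);         // largest feasible top-left count
        final double observed = logTableProbability(a, b, c, d);
        double p = 0.0;
        for (int x = lo; x <= hi; x++) {
            final double lp = logTableProbability(x, row1 - x, col1 - x, n - row1 - col1 + x);
            // Small tolerance so tables exactly as likely as the observed one are included.
            if (lp <= observed + 1e-7) {
                p += Math.exp(lp);
            }
        }
        return Math.min(1.0, p);
    }
}
```

For the table [[3, 1], [1, 3]] the feasible tables have probabilities 1/70, 16/70, 36/70, 16/70 and 1/70, so the two-sided p-value under this convention is 34/70.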

FisherStrand

Strand bias estimated using Fisher's Exact Test

FisherStrandFilter

Filters records based on the phred scaled p-value from the Fisher Strand test stored in the FS attribute.

FixMateInformation

Summary

FixMisencodedBaseQualityReads

FixVcfHeader

Tool for replacing or fixing up a VCF header.

FlagStat

Accumulate flag statistics given a BAM file, e.g.

FlagStat.FlagStatus

FlagStatSpark

Spark tool to accumulate flag statistics given a BAM file, e.g.

FlankSettings

Simple struct container class for the 5'/3' flank settings.

FlatMapGluer<I,O>

A little shim that lets you implement a mapPartitions operation (which takes an iterator over all items in the partition, and returns an iterator over all items to which they are mapped) in terms of a flatMap function (which takes a single input item, and returns an iterator over any number of output items).
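
The idea can be sketched with plain Java iterators, independent of Spark. The class below is a hypothetical illustration, not the FlatMapGluer source: it wraps a partition's input iterator and lazily concatenates the output iterators that a per-item flatMap function returns.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Hypothetical sketch: presents a flatMap-style function as one iterator over a whole partition.
final class FlatMapIteratorSketch<I, O> implements Iterator<O> {
    private final Function<I, Iterator<O>> flatMapFunc;
    private final Iterator<? extends I> inputs;
    private Iterator<O> current = Collections.emptyIterator();

    FlatMapIteratorSketch(final Function<I, Iterator<O>> flatMapFunc,
                          final Iterator<? extends I> inputs) {
        this.flatMapFunc = flatMapFunc;
        this.inputs = inputs;
    }

    @Override
    public boolean hasNext() {
        // Advance through inputs until one yields at least one output item.
        while (!current.hasNext()) {
            if (!inputs.hasNext()) {
                return false;
            }
            current = flatMapFunc.apply(inputs.next());
        }
        return true;
    }

    @Override
    public O next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return current.next();
    }
}
```

Because outputs are produced on demand, the whole partition never needs to be held in memory; inputs that map to no outputs are simply skipped.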

FlowAnnotatorBase

Base class for flow based annotations. Some flow based annotations depend on the results from other annotations, regardless if they were called for by user arguments.

FlowBasedAlignmentArgumentCollection

FlowBasedAlignmentLikelihoodEngine

Flow based replacement for PairHMM likelihood calculation.

FlowBasedArgumentCollection

FlowBasedDuplicationMetrics

FlowBasedHaplotype

Haplotype that also keeps information on the flow space (see FlowBasedRead). Haplotype can't be extended, so this extends Allele.

FlowBasedHmerBasedReadFilterHelper

A common base class for flow based filters which test for conditions on an hmer basis

FlowBasedHMMEngine

Flow Based HMM, intended to incorporate the scoring model of the FlowBasedAlignmentLikelihoodEngine while allowing for frame-shift insertions and deletions for better genotyping.

FlowBasedKeyCodec

FlowBasedPairHMM

Class for performing the pair HMM for global alignment in FlowSpace.

FlowBasedProgramGroup

Tools that perform variant calling and genotyping for short variants (SNPs, SNVs and Indels) on flow-based sequencing platforms

FlowBasedRead

Adds flow information to the usual GATKRead.

FlowBasedRead.Direction

FlowBasedReadUtils

utility class for flow based read

FlowBasedReadUtils.ReadGroupInfo

FlowBasedTPAttributeSymetricReadFilter

A read filter to test if the TP values for each hmer in a flow based read form a palindrome (as they should)

FlowBasedTPAttributeValidReadFilter

A read filter to test if the TP values for each hmer in a flow based read are within the allowed range (being the possible lengths of hmers - maxHmer)

FlowFeatureMapper

Finds specific features in reads, scores the confidence of each feature relative to the reference in each read and writes them into a VCF file.

FlowFeatureMapper.MappedFeature

FlowFeatureMapper.ReadContext

FlowFeatureMapperArgumentCollection

Set of arguments for the FlowFeatureMapper

FlowModeFragment

Class representing a single read fragment at a particular start location without a mapped mate.

FractionalDownsampler

Fractional Downsampler: selects a specified fraction of the reads for inclusion.
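
One simple way to select a fraction of items is an independent coin flip per item, sketched below. This is an assumption-laden illustration rather than the GATK class: the class name, the generic list input, and the explicit seed parameter are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: keeps each item independently with probability `fraction`.
final class FractionalDownsamplerSketch {
    private FractionalDownsamplerSketch() {}

    static <T> List<T> downsample(final List<T> items, final double fraction, final long seed) {
        if (!(fraction >= 0.0 && fraction <= 1.0)) {
            throw new IllegalArgumentException("fraction must be in [0, 1]: " + fraction);
        }
        final Random rng = new Random(seed); // fixed seed makes the selection reproducible
        final List<T> kept = new ArrayList<>();
        for (final T item : items) {
            // nextDouble() is in [0, 1), so fraction 1.0 keeps everything and 0.0 keeps nothing.
            if (rng.nextDouble() < fraction) {
                kept.add(item);
            }
        }
        return kept;
    }
}
```

With this approach the kept fraction is only approximately the requested one for small inputs, since each item is decided independently.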

Fragment

All available evidence coming from a single biological fragment.

Fragment

Class representing a single read fragment at a particular start location without a mapped mate.

FragmentCollection<T>

Represents the results of the reads -> fragment calculation.

FragmentDepthPerAlleleBySample

Fragment depth of coverage of each allele per sample

FragmentLength

Median fragment length of reads supporting each allele.

FragmentLengthFilter

FragmentLengthReadFilter

Keep only read pairs (0x1) with absolute insert length less than or equal to the specified maximum, and/or greater than or equal to the specified minimum.

FragmentUtils

FuncotateSegments

Perform functional annotation on a segment file (tsv).

Funcotation

Abstract class representing a Funcotator annotation.

FuncotationFilter

A filter to apply to Funcotations in FilterFuncotations.

FuncotationMap

A linked map of transcript IDs to funcotations.

FuncotationMetadata

Represents metadata information for fields in a Funcotation.

FuncotationMetadataUtils

Funcotator

Funcotator (FUNCtional annOTATOR) analyzes given variants for their function (as retrieved from a set of data sources) and produces the analysis in a specified output file.

FuncotatorArgumentDefinitions

Class to store argument definitions specific to Funcotator.

FuncotatorArgumentDefinitions.DataSourceType

An enum to handle the different types of input files for data sources.

FuncotatorArgumentDefinitions.OutputFormatType

The file format of the output file.

FuncotatorConstants

FuncotatorDataSourceDownloader

FuncotatorDataSourceDownloader is a tool to download the latest data sources for Funcotator.

FuncotatorEngine

Class that performs functional annotation of variants.

FuncotatorSegmentArgumentCollection

FuncotatorUtils

FuncotatorUtils.Genus

A type to keep track of different specific genuses.

FuncotatorUtils.TranscriptCodingSequenceException

Class representing exceptions that arise when trying to create a coding sequence for a variant:

FuncotatorVariantArgumentCollection

Arguments to be used by the Funcotator GATKTool, which are specific to Funcotator.

FunctionalClass

Stratifies by nonsense, missense, silent, and all annotations in the input ROD, from the INFO field annotation.

FunctionalClass.FunctionalType

GA4GHScheme

The scheme is defined in the constructor.

GA4GHSchemeWithMissingAsHomRef

The default scheme is derived from the GA4GH Benchmarking Work Group's proposed evaluation scheme.

GatherBamFiles

Efficiently concatenate BAM files that resulted from a scattered parallel analysis.

GatherBQSRReports

GatherNormalArtifactData

GatherPileupSummaries

GatherTranches

GatherVcfs

Simple little class that combines multiple VCFs that have exactly the same set of samples and nonoverlapping sets of loci.

GatherVcfsCloud

This tool combines together rows of variant calls from multiple VCFs, e.g.

GatherVcfsCloud.GatherType

GATKAnnotationArgumentCollection

An abstract ArgumentCollection for defining the set of annotation descriptor plugin arguments that are exposed to the user on the command line.

GATKAnnotationPluginDescriptor

A plugin descriptor for managing the dynamic discovery of both InfoFieldAnnotation and GenotypeAnnotation objects within the packages defined by the method getPackageNames() (default org.broadinstitute.hellbender.tools.walkers.annotator).

GATKAvroReader

GATKConfig

Configuration file for GATK options.

GATKDataSource<T>

A GATKDataSource is something that can be iterated over from start to finish and/or queried by genomic interval.

GATKDocWorkUnit

Custom DocWorkUnit used for generating GATK help/documentation.

GATKDuplicationMetrics

Metrics that are calculated during the process of marking duplicates within a stream of SAMRecords.

GATKException

Class GATKException.

GATKException.MissingReadField

GATKException.ReadAttributeTypeMismatch

GATKException.ShouldNeverReachHereException

For wrapping errors that are believed to never be reachable

GATKGenomicsDBUtils

Utility class containing various methods for working with GenomicsDB. Contains code to modify the GenomicsDB import input using the Protobuf API. References: GenomicsDB Protobuf structs: https://github.com/GenomicsDB/GenomicsDB/blob/master/src/resources/genomicsdb_vid_mapping.proto; Protobuf generated Java code guide: https://developers.google.com/protocol-buffers/docs/javatutorial#the-protocol-buffer-api and https://developers.google.com/protocol-buffers/docs/reference/java-generated

GATKGSONWorkUnit

Class representing a GSONWorkUnit for GATK work units.

GATKHelpDoclet

Custom Barclay-based Javadoc Doclet used for generating GATK help/documentation.

GATKHelpDocWorkUnitHandler

The GATK Documentation work unit handler class that is the companion to GATKHelpDoclet.

GATKPath

GATK tool command line arguments that are input or output resources.

GATKRead

Unified read interface for use throughout the GATK.

GATKReadFilterArgumentCollection

An abstract ArgumentCollection for defining the set of read filter descriptor plugin arguments that are exposed to the user on the command line.

GATKReadFilterPluginDescriptor

A CommandLinePluginDescriptor for ReadFilter plugins

GATKReadToBDGAlignmentRecordConverter

Converts a GATKRead to a BDG AlignmentRecord

GATKReadWriter

Interface for classes that are able to write GATKReads to some output destination.

GATKRegistrator

GATKRegistrator registers Serializers for our project.

GATKReport

Container class for GATK report tables

GATKReportColumn

column information within a GATK report table

GATKReportColumnFormat

Column width and left/right alignment.

GATKReportColumnFormat.Alignment

GATKReportDataType

The gatherable data types acceptable in a GATK report column.

GATKReportTable

GATKReportTable.Sorting

GATKReportTable.TableDataHeaderFields

GATKReportTable.TableNameHeaderFields

GATKReportVersion

GATKSparkTool

Base class for GATK spark tools that accept standard kinds of inputs (reads, reference, and/or intervals).

GATKSparkTool.ReadInputMergingPolicy

GATKSVVariantContextUtils

GATKSVVCFConstants

GATKSVVCFConstants.ComplexVariantSubtype

GATKSVVCFConstants.StructuralVariantAnnotationType

GATKSVVCFHeaderLines

GATKTool

Base class for all GATK tools.

GATKVariant

Variant is (currently) a minimal variant interface needed by the Hellbender pipeline.

GATKVariantContextUtils

GATKVariantContextUtils.AlleleMapper

GATKVariantContextUtils.FilteredRecordMergeType

GATKVariantContextUtils.GenotypeMergeType

GATKVCFConstants

This class contains any constants (primarily FORMAT/INFO keys) in VCF files used by the GATK.

GATKVCFHeaderLines

This class contains the VCFHeaderLine definitions for the annotation keys in GATKVCFConstants.

GATKVCFIndexType

Choose the Tribble indexing strategy

GATKWDLDoclet

Custom Barclay-based Javadoc Doclet used for generating tool WDL.

GATKWDLWorkUnitHandler

The GATK WDL work unit handler.

GCBiasCorrector

Learn multiplicative correction factors as a function of GC content using a simple regression.

GcBiasDetailMetrics

Class that holds detailed metrics about reads that fall within windows of a certain GC bin on the reference genome.

GcBiasMetrics

GcBiasMetricsCollector

Calculates GC Bias Metrics on multiple levels. Created by kbergin on 3/23/15.

GcBiasSummaryMetrics

High level metrics that capture how biased the coverage in a certain lane is.

GcBiasUtils

Utilities to calculate GC Bias. Created by kbergin on 9/23/15.

GcContent

Flow Annotation: percentage of G or C in the window around hmer
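
A windowed GC computation of this kind can be sketched as follows. The class name, the String input, and the `center`/`halfWindow` parameters are assumptions made for illustration; the sketch returns a fraction in [0, 1], which can be multiplied by 100 for a percentage.

```java
// Hypothetical sketch: GC fraction in a window of bases around a position.
final class GcContentSketch {
    private GcContentSketch() {}

    /**
     * Fraction of G or C bases in [center - halfWindow, center + halfWindow],
     * clipped to the bounds of the sequence. Returns 0.0 for an empty window.
     */
    static double gcFraction(final String bases, final int center, final int halfWindow) {
        final int start = Math.max(0, center - halfWindow);
        final int end = Math.min(bases.length(), center + halfWindow + 1); // exclusive
        if (start >= end) {
            return 0.0;
        }
        int gc = 0;
        for (int i = start; i < end; i++) {
            final char base = Character.toUpperCase(bases.charAt(i));
            if (base == 'G' || base == 'C') {
                gc++;
            }
        }
        return (double) gc / (end - start);
    }
}
```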

GencodeFuncotation

A class to represent a Functional Annotation.

GencodeFuncotation.VariantClassification

Represents the type and severity of a variant.

GencodeFuncotation.VariantType

GencodeFuncotationBuilder

A builder object to create GencodeFuncotations.

GencodeFuncotationFactory

A factory to create GencodeFuncotations.

GencodeGtfCDSFeature

A Gencode GTF Feature representing a CDS.

GencodeGtfCodec

Tribble Codec to read data from a GENCODE GTF file.

GencodeGtfExonFeature

A Gencode GTF Feature representing an exon.

GencodeGtfFeature

A GencodeGtfFeature represents data in a GENCODE GTF file.

GencodeGtfFeature.AnnotationSource

Keyword identifying the source of the feature, like a program (e.g.

GencodeGtfFeature.FeatureTag

Additional relevant information appended to a feature.

GencodeGtfFeature.FeatureType

Type of the feature represented in a single line of a GENCODE GTF File.

GencodeGtfFeature.GeneTranscriptStatus

Indication of whether a feature is new, tentative, or already known.

GencodeGtfFeature.GenomicPhase

Whether the first base of the CDS segment is the first (frame 0), second (frame 1) or third (frame 2) in the codon of the ORF.

GencodeGtfFeature.KnownGeneBiotype

Biotype / transcript type for the transcript or gene represented in a feature.

GencodeGtfFeature.LocusLevel

Status of how a position was annotated / verified: 1 - verified locus; 2 - manually annotated locus; 3 - automatically annotated locus. For more information, see: https://www.gencodegenes.org/data_format.html and https://en.wikipedia.org/wiki/General_feature_format

GencodeGtfFeature.OptionalField<T>

GencodeGtfFeature.RemapStatus

Attribute that indicates the status of the mapping.

GencodeGtfFeature.RemapTargetStatus

Attribute that compares the mapping to the existing target annotations.

GencodeGtfFeature.TranscriptSupportLevel

Transcript score according to how well mRNA and EST alignments match over its full length.

GencodeGtfFeatureBaseData

Struct-like container class for the fields in a GencodeGtfFeature. This is designed to be a basic dummy class to make feature instantiation easier.

GencodeGtfGeneFeature

A Gencode GTF Feature representing a gene.

GencodeGtfSelenocysteineFeature

A Gencode GTF Feature representing a selenocysteine.

GencodeGtfStartCodonFeature

A Gencode GTF Feature representing a start codon.

GencodeGtfStopCodonFeature

A Gencode GTF Feature representing a stop codon.

GencodeGtfTranscriptFeature

A Gencode GTF Feature representing a transcript.

GencodeGtfUTRFeature

A Gencode GTF Feature representing an untranslated region.

Gene

Holds annotation of a gene for storage in an OverlapDetector.

GeneAnnotationReader

Load gene annotations into an OverlapDetector of Gene objects.

GeneExpressionEvaluation

Evaluate gene expression from RNA-seq reads aligned to genome.

GeneExpressionEvaluation.Coverage

GeneExpressionEvaluation.FragmentCountReader

GeneExpressionEvaluation.FragmentCountWriter

GeneExpressionEvaluation.SingleStrandFeatureCoverage

GeneListOutputRenderer

This class can only work on segments.

GenomeLoc

Genome location representation.

GenomeLocParser

Factory class for creating GenomeLocs

GenomeLocParser.ValidationLevel

How much validation should we do at runtime with this parser?

GenomeLocSortedSet

Class GenomeLocCollection

GenomicsDBArgumentCollection

GenomicsDBConstants

Constants related to GenomicsDB

GenomicsDBImport

Import single-sample GVCFs into GenomicsDB before joint genotyping.

GenomicsDBOptions

Encapsulates the GenomicsDB-specific options relevant to the FeatureDataSource

GenotypeAlleleCounts

Collection of allele counts for a genotype.

GenotypeAnnotation

Represents an annotation that is computed for a single genotype.

GenotypeAssignmentMethod

Created by davidben on 6/10/16.

GenotypeCalculationArgumentCollection

GenotypeConcordance

Summary

GenotypeConcordance.Alleles

A simple structure to return the results of getAlleles.

GenotypeConcordanceContingencyMetrics

Class that holds metrics about the Genotype Concordance contingency tables.

GenotypeConcordanceCounts

A class to store the counts for various truth and call state classifications relative to a reference.

GenotypeConcordanceDetailMetrics

Class that holds detail metrics about Genotype Concordance

GenotypeConcordanceScheme

This defines, for each valid TruthState and CallState tuple, the set of contingency table entries to which the tuple should contribute.

GenotypeConcordanceSchemeFactory

Created by kbergin on 6/19/15.

GenotypeConcordanceStateCodes

Created by kbergin on 7/30/15.

GenotypeConcordanceStates

A class to store the various classifications for: 1.

GenotypeConcordanceStates.CallState

These states represent the relationship between the call genotype and the truth genotype relative to a reference sequence.

GenotypeConcordanceStates.ContingencyState

A specific state for a 2x2 contingency table.

GenotypeConcordanceStates.TruthAndCallStates

A minute class to store the truth and call state respectively.

GenotypeConcordanceStates.TruthState

These states represent the relationship between a truth genotype and the reference sequence.

GenotypeConcordanceSummaryMetrics

Class that holds summary metrics about Genotype Concordance

GenotypeCounts

GenotypeFilter

An interface for classes that perform Genotype filtration.

GenotypeFilterSummary

Created by bimber on 5/17/2017.

GenotypeGVCFs

Perform joint genotyping on one or more samples pre-called with HaplotypeCaller

GenotypeGVCFsAnnotationArgumentCollection

GenotypeGVCFsEngine

Engine class to allow for other classes to replicate the behavior of GenotypeGVCFs.

GenotypeLikelihoodCalculator

GenotypeLikelihoodCalculatorDRAGEN

Helper to calculate genotype likelihoods for DRAGEN advanced genotyping models (BQD - Base Quality Dropout, and FRD - Foreign Reads Detection).

GenotypeLikelihoodCalculators

Genotype likelihood calculator utility.

GenotypePriorCalculator

Class to compose genotype prior probability calculators.

GenotypeQualityFilter

Genotype filter that filters out genotypes below a given quality threshold.

GenotypeSummaries

Summarize genotype statistics from all samples at the site level

GenotypeUtils

GenotypingArraysProgramGroup

Miscellaneous tools, e.g.

GenotypingData<A extends htsjdk.variant.variantcontext.Allele>

Encapsulates the data used to make the genotype calls.

GenotypingEngine<Config extends StandardCallerArgumentCollection>

Base class for genotyper engines.

GenotypingLikelihoods<A extends htsjdk.variant.variantcontext.Allele>

Genotyping Likelihoods collection.

GenotypingModel

A wrapping interface between the various versions of genotypers so as to keep them interchangeable.

GermlineCallingArgumentCollection

GermlineCNVCaller

Calls copy-number variants in germline samples given their counts and the corresponding output of DetermineGermlineContigPloidy.

GermlineCNVCaller.RunMode

GermlineCNVHybridADVIArgumentCollection

GermlineCNVIntervalVariantComposer

Helper class for PostprocessGermlineCNVCalls for single-sample postprocessing of GermlineCNVCaller calls into genotyped intervals.

GermlineCNVNamingConstants

This class stores naming standards in the GermlineCNVCaller.

GermlineCNVSegmentVariantComposer

Helper class for PostprocessGermlineCNVCalls for single-sample postprocessing of segmented GermlineCNVCaller calls.

GermlineCNVVariantComposer<DATA extends htsjdk.samtools.util.Locatable>

Base class for GermlineCNVIntervalVariantComposer and GermlineCNVSegmentVariantComposer.

GermlineContigPloidyHybridADVIArgumentCollection

GermlineContigPloidyModelArgumentCollection

GermlineDenoisingModelArgumentCollection

GermlineDenoisingModelArgumentCollection.CopyNumberPosteriorExpectationMode

GermlineFilter

GetNormalArtifactData

Usage example

GetPileupSummaries

Summarizes counts of reads that support reference, alternate and other alleles for given sites.

GetSampleName

Emit a single sample name from the bam header into an output file.

GibbsSampler<V extends Enum<V> & ParameterEnum,S extends ParameterizedState<V>,T extends DataCollection>

Implements Gibbs sampling of a multivariate probability density function.

GnarlyGenotyper

Perform "quick and dirty" joint genotyping on one or more samples pre-called with HaplotypeCaller

GnarlyGenotyperEngine

Guts of the GnarlyGenotyper

GraphBasedKBestHaplotypeFinder<V extends BaseVertex,E extends BaseEdge>

Efficient algorithm to obtain the list of best haplotypes given the instance.

GraphUtils

Utility functions used in the graphs package

GraphUtils

Created by farjoun on 11/2/16.

GraphUtils.Graph<Node extends Comparable<Node>>

GroundTruthReadsBuilder

An internal tool to produce a flexible and robust ground truth set for base calling training.

GtcToVcf

Class to convert an Illumina GTC file into a VCF file.

GVCFBlock

GVCFBlockCombiner

Combines variants into GVCF blocks.

GVCFBlockCombiningIterator

Turns an iterator of VariantContext into one which combines GVCF blocks.

GvcfMetricAccumulator

An accumulator for collecting metrics about a single-sample GVCF.

GVCFWriter

Genome-wide VCF writer. Merges blocks based on GQ.
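
The general idea of merging by GQ can be sketched as grouping consecutive sites whose genotype quality falls in the same band. The code below is a simplified, hypothetical illustration and not the GVCFWriter logic: it assumes the band lower bounds are sorted ascending and start at 0, and it ignores contig boundaries, variant sites, and everything else a real writer must handle.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: merge consecutive sites whose GQ falls in the same band.
final class GqBlockSketch {
    private GqBlockSketch() {}

    /** Index of the band containing gq: the last lower bound that is <= gq (0 if none). */
    static int bandOf(final int gq, final int[] lowerBounds) {
        int band = 0;
        for (int i = 0; i < lowerBounds.length; i++) {
            if (gq >= lowerBounds[i]) {
                band = i;
            }
        }
        return band;
    }

    /** Returns blocks as {firstSite, lastSite, band}, with site indices inclusive. */
    static List<int[]> mergeIntoBlocks(final int[] gqPerSite, final int[] lowerBounds) {
        final List<int[]> blocks = new ArrayList<>();
        for (int site = 0; site < gqPerSite.length; site++) {
            final int band = bandOf(gqPerSite[site], lowerBounds);
            final int[] last = blocks.isEmpty() ? null : blocks.get(blocks.size() - 1);
            if (last != null && last[2] == band) {
                last[1] = site; // same band as the open block: extend it
            } else {
                blocks.add(new int[] {site, site, band}); // band changed: start a new block
            }
        }
        return blocks;
    }
}
```

For example, with lower bounds {0, 20, 60}, the per-site GQ values {5, 7, 25, 30, 70} collapse into three blocks covering sites 0-1, 2-3 and 4.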

Haplotype

HaplotypeBAMDestination

Utility class that allows easy creation of destinations for the HaplotypeBAMWriters

HaplotypeBAMWriter

A BAMWriter that aligns reads to haplotypes and emits their best alignments to a destination.

HaplotypeBAMWriter.WriterType

Possible modes for writing haplotypes to BAMs

HaplotypeBasedVariantRecaller

Calculate likelihood matrix for each Allele in VCF against a set of Reads limited by a set of Haplotypes

HaplotypeBasedVariantRecallerArgumentCollection

Set of arguments for the HaplotypeBasedVariantRecaller

HaplotypeBlock

Represents information about a group of SNPs that form a haplotype in perfect LD with one another.

HaplotypeCaller

Call germline SNPs and indels via local re-assembly of haplotypes

HaplotypeCallerArgumentCollection

Set of arguments for the HaplotypeCallerEngine

HaplotypeCallerArgumentCollection.FlowMode

The different flow modes, in terms of their parameters and their values. NOTE: a parameter value ending with /o is optional, meaning the process will not fail if the parameter does not exist in the target parameters collection.

HaplotypeCallerEngine

The core engine for the HaplotypeCaller that does all of the actual work of the tool.

HaplotypeCallerGenotypingDebugger

A short helper class that manages a singleton debug stream for HaplotypeCaller genotyping information that is useful for debugging.

HaplotypeCallerGenotypingEngine

HaplotypeCaller's genotyping strategy implementation.

HaplotypeCallerReadThreadingAssemblerArgumentCollection

HaplotypeCallerSpark

This tool DOES NOT match the output of HaplotypeCaller.

HaplotypeFilteringAnnotation

Set of annotations meant to be reflective of HaplotypeFiltering operations that were applied in FlowBased HaplotypeCaller.

HaplotypeMap

A collection of metadata about Haplotype Blocks including multiple in memory "indices" of the data to make it easy to query the correct HaplotypeBlock or Snp by snp names, positions etc.

HaplotypeProbabilities

Abstract class for storing and calculating various likelihoods and probabilities for haplotype alleles given evidence.

HaplotypeProbabilities.Genotype

Log10(P(evidence| haplotype)) for the 3 different possible haplotypes {aa, ab, bb}

HaplotypeProbabilitiesFromContaminatorSequence

Represents the probability of the underlying haplotype of the contaminating sample given the data.

HaplotypeProbabilitiesFromGenotype

Represents a set of HaplotypeProbabilities that were derived from a single SNP genotype at a point in time.

HaplotypeProbabilitiesFromGenotypeLikelihoods

Represents the likelihood of the HaplotypeBlock given the GenotypeLikelihoods (GL field from a VCF, which is actually a log10-likelihood) for each of the SNPs in that block.

HaplotypeProbabilitiesFromSequence

Represents the probability of the underlying haplotype given the data.

HaplotypeProbabilityOfNormalGivenTumor

A wrapper class for any HaplotypeProbabilities instance that will assume that the given evidence is that of a tumor sample and provide an hp for the normal sample that tumor came from.

HaplotypeRegionWalker

A service class for HaplotypeBasedVariantRecaller that reads a SAM/BAM file, interprets the reads as haplotypes, and calls a provided consumer with the 'best' haplotypes found for a given query location.

HardAlleleFilter

Base class for Hard filters that are applied at the allele level

HardFilter

HardThresholdingOutputStream

An output stream which stops at the threshold instead of potentially triggering early.

HasGenomeLocation

Indicates that this object has a genomic location and provides a systematic interface to get it.

HDF5SimpleCountCollection

Helper class for SimpleCountCollection used to read/write HDF5.

HDF5SVDReadCountPanelOfNormals

Represents the SVD panel of normals to be created by CreateReadCountPanelOfNormals.

HDF5Utils

TODO move into hdf5-java-bindings

HeaderlessSAMRecordCoordinateComparator

A comparator for headerless SAMRecords that exactly matches the ordering of the SAMRecordCoordinateComparator

HelpConstants

HeterogeneousPloidyModel

General heterogeneous ploidy model.

HeterozygosityCalculator

A class containing utility methods used in the calculation of annotations related to cohort heterozygosity, e.g.

Histogram

Class used for storing a list of doubles as a run length encoded histogram that compresses the data into bins spaced at defined intervals.

HitsForInsert

Holds all the hits (alignments) for a read or read pair.

HitsForInsert.NumPrimaryAlignmentState

HmerIndelLength

Flow Annotation: length of the hmer indel, if so

HmerIndelNuc

Flow Annotation: nucleotide of the hmer indel, if so

HmerMotifs

Flow Annotation: motifs to the left and right of the indel

HmerQualitySymetricReadFilter

A read filter to test if the quality values for each hmer in a flow based read form a palindrome (as they should)

HomogeneousPloidyModel

PloidyModel implementation tailored to work with a homogeneous constant ploidy across samples and positions.

HomoSapiensConstants

Homo sapiens genome constants.

HopscotchCollection<T>

Multiset implementation that provides low memory overhead with a high load factor by using the hopscotch algorithm.

HopscotchCollectionSpark<T>

HopscotchCollectionSpark.Serializer<T>

HopscotchMap<K,V,T extends Map.Entry<K,V>>

A uniquely keyed map with O(1) operations.

HopscotchMapSpark<K,V,T extends Map.Entry<K,V>>

A uniquely keyed map with O(1) operations.

HopscotchMapSpark.Serializer

HopscotchMultiMap<K,V,T extends Map.Entry<K,V>>

A map that can contain multiple values for a given key.

HopscotchMultiMapSpark<K,V,T extends Map.Entry<K,V>>

A map that can contain multiple values for a given key.

HopscotchMultiMapSpark.Serializer

HopscotchSet<T>

Implements Set by imposing a unique-element constraint on HopscotchCollection.

HopscotchSetSpark<T>

Implements Set by imposing a unique-element constraint on HopscotchCollection.

HopscotchSetSpark.Serializer

HopscotchUniqueMultiMap<K,V,T extends Map.Entry<K,V>>

A map that can contain multiple values for a given key, but distinct entries.

HopscotchUniqueMultiMapSpark<K,V,T extends Map.Entry<K,V>>

A map that can contain multiple values for a given key, but distinct entries.

HopscotchUniqueMultiMapSpark.Serializer

HostAlignmentReadFilter

Filters out reads above a threshold identity (number of matches less deletions), given in bases.

HsMetricCollector

Calculates HS metrics for a given SAM or BAM file.

HsMetrics

Metrics generated by CollectHsMetrics for the analysis of target-capture sequencing experiments.

HtsgetClass

Classes of data that can be requested in an htsget request as defined by the spec

HtsgetErrorResponse

Class allowing deserialization from json htsget error response

HtsgetFormat

Formats currently supported by htsget as defined by spec

HtsgetReader

A tool that downloads a file hosted on an htsget server to a local file

HtsgetRequestBuilder

Builder for an htsget request that allows converting the request to a URI after validating that it is properly formed

HtsgetRequestField

Fields which can be used to filter a htsget request as defined by the spec

HtsgetResponse

Class allowing deserialization from json htsget response

HtsgetResponse.Block

HttpUtils

HybridADVIArgumentCollection

HybridADVIArgumentCollection.HybridADVIArgument

IdentifyContaminant

Program to create a fingerprint for the contaminating sample when the level of contamination is both known and uniform in the genome.

IGVUtils

Utilities for interacting with IGV-specific formats.

IlluminaAdapterPair

Describes adapters used on each pair of strands

IlluminaAdpcFileWriter

A class to encompass writing an Illumina adpc.bin file.

IlluminaAdpcFileWriter.Record

IlluminaBasecallingMetrics

Metric for Illumina Basecalling that stores means and standard deviations on a per-barcode per-lane basis.

IlluminaBasecallsToFastq

IlluminaBasecallsToFastq.ReadNameFormat

Simple switch to control the read name format to emit.

IlluminaBasecallsToSam

IlluminaBasecallsToSam transforms a lane of Illumina data file formats (bcl, locs, clocs, qseqs, etc.) into SAM, BAM or CRAM file format.

IlluminaBPMFile

A class to parse the contents of an Illumina Bead Pool Manifest (BPM) file. A BPM file contains metadata (including the alleles, mapping and normalization information) on an Illumina Genotyping Array. Each type of genotyping array has a specific BPM.

IlluminaBPMLocusEntry

A simple class to represent a locus entry in an Illumina Bead Pool Manifest (BPM) file

IlluminaDataProviderFactory

IlluminaDataProviderFactory accepts options for parsing Illumina data files for a lane and creates an IlluminaDataProvider, an iterator over the ClusterData for that lane, which utilizes these options.

IlluminaDataType

List of data types of interest when parsing Illumina data.

IlluminaFileUtil

General utils for dealing with IlluminaFiles as well as utils for specific, supported formats.

IlluminaFileUtil.SupportedIlluminaFormat

IlluminaGenotype

IlluminaLaneMetrics

Embodies characteristics that describe a lane.

IlluminaManifest

A class to represent an Illumina Manifest file.

IlluminaManifestRecord

A class to represent a record (line) from an Illumina Manifest [Assay] entry

IlluminaManifestRecord.IlluminaStrand

IlluminaMetricsCode

Illumina's TileMetricsOut.bin file codes various metrics, both concrete (all density id's are code 100) or as a base code (e.g.

IlluminaPhasingMetrics

Metrics for Illumina Basecalling that stores median phasing and prephasing percentages on a per-template-read, per-lane basis.

IlluminaReadNameEncoder

A read name encoder following the encoding initially produced by picard fastq writers.

IlluminaUtil

Misc utilities for working with Illumina specific files and data

IlluminaUtil.IlluminaAdapterPair

Describes adapters used on each pair of strands

ImpreciseVariantDetector

InbreedingCoeff

Likelihood-based test for consanguinity among samples

IndelClassify

Flow Annotation: indel class: ins, del, NA

IndelErrorCalculator

A calculator that estimates the error rate of the bases it observes for indels only.

IndelErrorMetric

Metric to be used for InDel errors

IndelLength

Flow Annotation: length of indel

IndelLengthHistogram

Simple utility for histogramming indel lengths. Based on code from chartl.

IndelSize

Stratifies the eval RODs by the indel size. Indel sizes are stratified from sizes -100 to +100.

IndelSummary

IndependentReplicateMetric

A class to store information relevant for biological rate estimation

IndependentSampleGenotypesModel

This class delegates genotyping to allele count- and ploidy-dependent GenotypeLikelihoodCalculators under the assumption that sample genotypes are independent conditional on their population frequencies.

IndexedAlleleList<A extends htsjdk.variant.variantcontext.Allele>

Allele list implementation using an indexed-set.

IndexedSampleList

Simple implementation of a sample-list using an indexed-set.

IndexedSet<E>

Set where each element can be referenced by a unique integer index that runs from 0 to the size of the set - 1.
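
As a rough illustration of the idea (not GATK's actual implementation), an indexed set can be sketched with a list for index-to-element lookup plus a map for element-to-index lookup. The class name `IndexedSetSketch` and all of its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; this is NOT the GATK IndexedSet class.
final class IndexedSetSketch<E> {
    private final List<E> elements = new ArrayList<>();             // index -> element
    private final Map<E, Integer> indexByElement = new HashMap<>(); // element -> index

    // Adds the element if it is absent; returns true if the set changed.
    boolean add(E element) {
        if (indexByElement.containsKey(element)) {
            return false;
        }
        indexByElement.put(element, elements.size());
        elements.add(element);
        return true;
    }

    // Returns the element's index in [0, size() - 1], or -1 if it is absent.
    int indexOf(E element) {
        Integer index = indexByElement.get(element);
        return index == null ? -1 : index;
    }

    // Returns the element stored at the given index.
    E get(int index) {
        return elements.get(index);
    }

    int size() {
        return elements.size();
    }
}
```

Because elements are only appended in this sketch, indices stay dense and stable; supporting removal would require re-indexing the elements that follow the removed one.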

IndexFeatureFile

This tool creates an index file for the various kinds of feature-containing files supported by GATK (such as VCF and BED files).

IndexRange

Represents 0-based integer index range.

IndexUtils

InfiniumDataFile

A class to provide methods for accessing Illumina Infinium Data Files.

InfiniumEGTFile

A class to parse the contents of an Illumina Infinium cluster (EGT) file. A cluster file contains the clustering information used in mapping red / green intensity information to genotype calls.

InfiniumFileTOC

A class to encapsulate the table of contents for an Illumina Infinium Data File.

InfiniumGTCFile

A class to parse the contents of an Illumina Infinium genotype (GTC) file. A GTC file is the output of Illumina's genotype calling software (either Autocall or Autoconvert) and contains genotype calls, confidence scores, basecalls and raw intensities for all calls made on the chip.

InfiniumGTCRecord

InfiniumNormalizationManifest

A class to parse the contents of an Illumina Infinium Normalization Manifest file. An Illumina Infinium Normalization Manifest file contains a subset of the information contained in the Illumina Manifest file, in addition to the normalization ID, which is needed for normalizing intensities in GtcToVcf.

InfiniumTransformation

InfiniumVcfFields

A class to store fields that are specific to a VCF generated from an Illumina GTC file.

InfiniumVcfFields.GENOTYPE_VALUES

InfoConcordanceRecord

Keeps track of concordance between two info fields.

InfoConcordanceRecord.InfoConcordanceReader

Table reading class for InfoConcordanceRecords

InfoConcordanceRecord.InfoConcordanceWriter

Table writing class for InfoConcordanceRecords

InfoFieldAnnotation

Annotations relevant to the INFO field of the variant file (i.e., annotations for sites).

InputStreamSettings

Settings that define text to write to the process stdin.

InsertSizeDistribution

Holds the information characterizing an insert size distribution.

InsertSizeDistributionShape

Supported insert size distribution shapes.

InsertSizeMetrics

Metrics about the insert size distribution of a paired-end library, created by the CollectInsertSizeMetrics program and usually written to a file with the extension ".insertSizeMetrics".

InsertSizeMetrics

Metrics about the insert size distribution of a paired-end library, created by the CollectInsertSizeMetrics program and usually written to a file with the extension ".insert_size_metrics".

InsertSizeMetricsArgumentCollection

ArgumentCollection for InsertSizeMetrics collectors.

InsertSizeMetricsCollector

Collects InsertSizeMetrics on the specified accumulationLevels

InsertSizeMetricsCollector

Collects InsertSizeMetrics on the specified accumulationLevels using

InsertSizeMetricsCollectorSpark

Worker class to collect insert size metrics, add metrics to file, and provide accessors to stats of groups of different levels.

IntBiConsumer

Created by davidben on 8/19/16.

IntegerCopyNumberSegment

A genotyped integer copy-number segment.

IntegerCopyNumberSegmentCollection

Represents a collection of IntegerCopyNumberSegment for a sample.

IntegerCopyNumberState

This class represents integer copy number states.

IntegrationUtils

Created by tsato on 5/1/17.

IntensityChannel

The channels in a FourChannelIntensityData object, and the channels produced by a ClusterIntensityFileReader, for cases in which it is desirable to handle these abstractly rather than having the specific names in the source code.

IntervalAlignmentContextIterator

For special cases where we want to emit AlignmentContexts regardless of whether we have an overlap with a given interval.

IntervalArgumentCollection

Intended to be used as an @ArgumentCollection for specifying intervals at the command line.

IntervalArgumentCollection

Base interface for an interval argument collection.

IntervalCopyNumberGenotypingData

The bundle of integer copy-number posterior distribution and baseline integer copy-number state for an interval.

IntervalCoverageFinder

Class to find the coverage of the intervals.

IntervalListScatter

IntervalListScatterer

An interface for a class that scatters IntervalLists.

IntervalListScattererByBaseCount

A base class for scatterers that scatter by uniqued base count.

IntervalListScattererByIntervalCount

Scatters IntervalList by interval count so that resulting IntervalList's have the same number of intervals in them.

IntervalListScattererByIntervalCountWithDistributedRemainder

Scatters an IntervalList into `interval count` shards so that the resulting IntervalLists have approximately the same number of intervals in them.
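
One simple way to obtain shards of approximately equal interval count is to give each shard the integer quotient and hand out the remainder one interval at a time. The sketch below assumes the extra intervals go to the first shards; the class and method names are hypothetical, and the real scatterer may distribute the remainder differently.

```java
// Hypothetical sketch; not the Picard/GATK scatterer implementation.
final class ShardSizeSketch {
    // Splits intervalCount items into shardCount shards whose sizes differ by at most one.
    // ASSUMPTION: the first (intervalCount % shardCount) shards receive the extra item.
    static int[] shardSizes(int intervalCount, int shardCount) {
        if (intervalCount < 0 || shardCount <= 0) {
            throw new IllegalArgumentException("need intervalCount >= 0 and shardCount > 0");
        }
        int base = intervalCount / shardCount;
        int remainder = intervalCount % shardCount;
        int[] sizes = new int[shardCount];
        for (int i = 0; i < shardCount; i++) {
            sizes[i] = base + (i < remainder ? 1 : 0);
        }
        return sizes;
    }
}
```

For example, `ShardSizeSketch.shardSizes(10, 3)` yields shard sizes of 4, 3 and 3.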

IntervalListScattererWithoutSubdivision

A BaseCount Scatterer that avoids breaking up intervals.

IntervalListScattererWithoutSubdivisionWithOverflow

Like IntervalListScattererWithoutSubdivision but will overflow current list if the projected size of the remaining lists is bigger than the "ideal".

IntervalListScattererWithSubdivision

An IntervalListScatterer that attempts to place the same number of (uniquified) bases in each output interval list.

IntervalListScatterMode

An enum to control the creation of the various IntervalListScatter objects

IntervalListToBed

Trivially simple command line program to convert an IntervalList file to a BED file.

IntervalListTools

Performs various IntervalList manipulations.

IntervalListTools.Action

IntervalLocusIterator

Returns a SimpleInterval for each locus in a set of intervals.

IntervalMergingRule

a class we use to determine the merging rules for intervals passed to the GATK

IntervalOverlappingIterator<T extends htsjdk.samtools.util.Locatable>

Wraps an iterator of Locatable with a list of sorted intervals to return only the objects which overlap with them

IntervalOverlapReadFilter

A simple read filter that allows for the user to specify intervals at the filtering stage.

IntervalPileup

IntervalPileup.Element

IntervalPileup.Insert

IntervalSetRule

set operators for combining lists of intervals

IntervalsManipulationProgramGroup

Tools that process genomic intervals in various formats.

IntervalStratification

Stratifies the variants by whether they overlap an interval in the set provided on the command line.

IntervalUtils

Parse text representations of interval strings that can appear in GATK-based applications.

IntervalUtils.IntervalBreakpointType

An enum to classify breakpoints whether the breakpoint is the start or end of a region.

IntervalWalker

An IntervalWalker is a tool that processes a single interval at a time, with the ability to query optional overlapping sources of reads, reference data, and/or variants/features.

IntervalWalkerContext

Encapsulates a SimpleInterval with the reads that overlap it (the ReadsContext), along with its ReferenceContext and FeatureContext.

IntervalWalkerSpark

A Spark version of IntervalWalker.

IntHistogram

Histogram of observations on a compact set of non-negative integer values.

IntHistogram.CDF

IntHistogram.CDF.Serializer

IntHistogram.Serializer

IntToDoubleBiFunction

Created by davidben on 8/19/16.

IntToDoubleFunctionCache

A helper class to maintain a cache of an int to double function defined on n = 0, 1, 2.

InverseAllele

Utility class for defining a "not" allele concept that is used to score haplotypes that are not supporting the allele.

IOUtils

Iterators

IUPACReadTransformer

A read transformer to convert IUPAC bases (i.e.

JexlExpression

Stratifies the eval RODs by user-supplied JEXL expressions. See https://gatk.broadinstitute.org/hc/en-us/articles/360035891011-JEXL-filtering-expressions for more details.

JexlExpressionReadTagValueFilter

Keep only reads whose attributes meet a given set of JEXL expressions

JoinReadsWithVariants

Joins an RDD of GATKReads to variant data by copying the variants files to every node, using Spark's file copying mechanism.

JointGermlineCNVSegmentation

Merge GCNV segment VCFs. This tool takes in segmented VCFs produced by PostprocessGermlineCNVCalls.

JTBestHaplotype<V extends BaseVertex,E extends BaseEdge>

A best haplotype object for being used with junction trees.

JumboGenotypeAnnotation

FORMAT annotations that look at more inputs than regular annotations

JumboInfoAnnotation

INFO annotations that look at more inputs than regular annotations

JumpingLibraryMetrics

High level metrics about the presence of outward- and inward-facing pairs within a SAM file generated with a jumping library, produced by the CollectJumpingLibraryMetrics program and usually stored in a file with the extension ".jump_metrics".

JunctionTreeKBestHaplotypeFinder<V extends BaseVertex,E extends BaseEdge>

JunctionTreeLinkedDeBruijnGraph

Experimental version of the ReadThreadingGraph with support for threading reads to generate JunctionTrees for resolving connectivity information at longer ranges.

KBestHaplotype<V extends BaseVertex,E extends BaseEdge>

Represents a result from a K-best haplotype search.

KBestHaplotypeFinder<V extends BaseVertex,E extends BaseEdge>

A common interface for the different KBestHaplotypeFinder implementations to conform to

KernelSegmenter<DATA>

Segments data (i.e., finds multiple changepoints) using a method based on the kernel-segmentation algorithm described in https://hal.inria.fr/hal-01413230/document, which gives a framework to quickly calculate the cost of a segment given a low-rank approximation to a specified kernel.

KernelSegmenter.ChangepointSortOrder

Kmer

Fast wrapper for byte[] kmers. This object has several important features that make it better than using a raw byte[] for a kmer: it can create a kmer from a range of a larger byte[], allowing us to avoid Arrays.copyOfRange; it has fast equals and hashCode methods; and it can get the actual byte[] of the kmer, even if it's from a larger byte[], doing that work only once and updating its internal state.

KmerAndCount

A <Kmer,count> pair.

KmerAndCount.Serializer

KmerAndInterval

A <Kmer,IntervalId> pair.

KmerAndInterval.Serializer

KmerCleaner

Eliminates dups, and removes over-represented kmers.

KmerCounter

Iterates over reads, kmerizing them, and counting up just the kmers that appear in a passed-in set.

KMerCounter

Generic utility class that counts kmers. Basically you add kmers to the counter, and it tells you how many occurrences of each kmer it's seen.

KmerSearchableGraph<V extends BaseVertex,E extends BaseEdge>

Common interface for those graphs that implement vertex by kmer look-up.

KSWindowFinder

KV<K,V>

replacement for dataflow Key-Value class, don't use this anywhere new

LabeledVariantAnnotationsData

Represents a collection of LabeledVariantAnnotationsDatum as a list of lists of datums.

LabeledVariantAnnotationsWalker

Base walker for both ExtractVariantAnnotations and ScoreVariantAnnotations, which enforces identical variant-extraction behavior in both tools via LabeledVariantAnnotationsWalker.extractVariantMetadata(htsjdk.variant.variantcontext.VariantContext, org.broadinstitute.hellbender.engine.FeatureContext, boolean).

LanePhasingMetricsCollector

Helper class used to transform tile data for a lane into a collection of IlluminaPhasingMetrics

LargeLongHopscotchSet

Set of longs that is larger than the max Java array size ( ~ 2^31 ~ 2 billion) and therefore cannot fit into a single LongHopscotchSet.

LargeLongHopscotchSet.Serializer

LearnReadOrientationModel

Learn the prior probability of read orientation artifact from the output of CollectF1R2Counts of Mutect2. Details of the model may be found in docs/mutect/mutect.pdf.

LearnReadOrientationModelEngine

LeftAlignAndTrimVariants

Left-align indels in a variant callset

LeftAlignIndels

Left-aligns indels in read data

LegacySegment

LegacySegmentCollection

Represents a CBS-style segmentation to enable IGV-compatible plotting.

LevelingDownsampler<T extends List<E>,E>

Leveling Downsampler: Given a set of Lists of arbitrary items and a target size, removes items from the Lists in an even fashion until the total size of all Lists is <= the target size.
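
The leveling idea can be sketched by repeatedly trimming whichever list is currently largest until the combined size reaches the target. This is a deliberately simplified stand-in, not the algorithm GATK uses; `LevelingSketch` and `levelDown` are hypothetical names, and which item is removed (here, the last one) is an arbitrary assumption.

```java
import java.util.List;

// Hypothetical sketch; not the GATK LevelingDownsampler implementation.
final class LevelingSketch {
    // Removes items from the largest list, one at a time, until the total size of
    // all lists is <= targetSize. The lists must be mutable.
    static <E> void levelDown(List<List<E>> groups, int targetSize) {
        if (targetSize < 0) {
            throw new IllegalArgumentException("targetSize must be >= 0");
        }
        int total = 0;
        for (List<E> group : groups) {
            total += group.size();
        }
        while (total > targetSize) {
            // Find the currently largest list (the first one wins ties).
            List<E> largest = groups.get(0);
            for (List<E> group : groups) {
                if (group.size() > largest.size()) {
                    largest = group;
                }
            }
            largest.remove(largest.size() - 1); // ASSUMPTION: drop the last item
            total--;
        }
    }
}
```

Trimming the largest list first means small lists are left untouched until the large ones have been brought down to their level, which is what makes the removal "even".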

LibraryIdGenerator

A class to generate library Ids and keep duplication metrics by library IDs.

LibraryIdGenerator

A class to generate library Ids and keep duplication metrics by library IDs.

LibraryNameSplitter

Splits readers by library name.

LibraryReadFilter

Keep only reads from the specified library.

LibraryStatistics

statistics of fragment length distributions

LibraryStatistics.Serializer

LIBSDownsamplingInfo

Simple wrapper about the information LIBS needs about downsampling

LiftOverHaplotypeMap

Liftover SNPs in HaplotypeMaps from one reference to another

LiftOverIntervalList

This tool adjusts the coordinates in an interval list on one reference to its homologous interval list on another reference, based on a chain file that describes the correspondence between the two references.

LiftoverUtils

LiftoverVcf

Summary

LikelihoodEngineArgumentCollection

Set of arguments related to ReadLikelihoodCalculationEngine implementations

LikelihoodMatrix<EVIDENCE,A extends htsjdk.variant.variantcontext.Allele>

Likelihood matrix between a set of alleles and evidence.

LikelihoodRankSumTest

Rank Sum Test of per-read likelihoods of REF versus ALT reads

LinearCopyRatio

Represents a value of copy ratio in linear space generated by GermlineCNVCaller with the corresponding interval.

LinearCopyRatioCollection

Collection of copy ratios in linear space generated by GermlineCNVCaller with their corresponding intervals

LineIteratorReader

A reader wrapper around a LineIterator.

LmmFilter

FuncotationFilter matching variants which: Have been flagged by LMM as important for loss of function.

LocalAssembler

LocalAssembler.AssemblyTooComplexException

Something to throw when we have too many Contigs or Traversals to proceed with assembly.

LocalAssembler.Contig

An unbranched sequence of Kmers.

LocalAssembler.ContigEndKmer

Initial or final Kmer in a Contig.

LocalAssembler.ContigImpl

Simple implementation of Contig interface.

LocalAssembler.ContigListRC

A list of Contigs that presents a reverse-complemented view of a List of Contigs.

LocalAssembler.ContigOrientation

LocalAssembler.ContigRCImpl

Implementation of Contig for the reverse-complement of some other Contig.

LocalAssembler.Kmer

fixed-size, immutable kmer.

LocalAssembler.KmerAdjacency

A Kmer that remembers its predecessors and successors, and the number of times it's been observed in the assembly's input set of reads.

LocalAssembler.KmerAdjacencyImpl

Class to implement KmerAdjacency for canonical Kmers.

LocalAssembler.KmerAdjacencyRC

Class to implement KmerAdjacency for Kmers that are the reverse-complement of a canonical Kmer.

LocalAssembler.KmerSet<KMER extends LocalAssembler.Kmer>

Set of Kmers.

LocalAssembler.Path

A path through the assembly graph for something (probably a read).

LocalAssembler.PathBuilder

A helper class for Path building.

LocalAssembler.PathPart

A single-Contig portion of a path across the assembly graph.

LocalAssembler.PathPartContig

A part of a path that is present as a sub-sequence of some Contig.

LocalAssembler.PathPartGap

A part of a path that isn't present in the graph.

LocalAssembler.SequenceRC

A CharSequence that is a view of the reverse-complement of another sequence.
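
A reverse-complement view can be sketched without copying by translating each index on access. The class below is a hypothetical illustration (not the LocalAssembler.SequenceRC source) and assumes upper-case A/C/G/T, leaving any other character unchanged.

```java
// Hypothetical sketch; not the GATK LocalAssembler.SequenceRC class.
final class ReverseComplementView implements CharSequence {
    private final CharSequence forward; // underlying sequence, never copied
    private final int start;            // offset into the reverse-complemented sequence
    private final int length;

    ReverseComplementView(CharSequence forward) {
        this(forward, 0, forward.length());
    }

    private ReverseComplementView(CharSequence forward, int start, int length) {
        this.forward = forward;
        this.start = start;
        this.length = length;
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        // Position i of the reverse complement maps to position (n - 1 - i) of the original.
        return complement(forward.charAt(forward.length() - 1 - (start + index)));
    }

    @Override
    public CharSequence subSequence(int begin, int end) {
        if (begin < 0 || end > length || begin > end) {
            throw new IndexOutOfBoundsException("range " + begin + ".." + end);
        }
        return new ReverseComplementView(forward, start + begin, end - begin);
    }

    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(charAt(i));
        }
        return sb.toString();
    }

    // ASSUMPTION: upper-case DNA bases only; anything else (e.g. N) is returned as-is.
    private static char complement(char base) {
        switch (base) {
            case 'A': return 'T';
            case 'C': return 'G';
            case 'G': return 'C';
            case 'T': return 'A';
            default:  return base;
        }
    }
}
```

For instance, `new ReverseComplementView("AACGT").toString()` gives `ACGTT`, and the view reflects the underlying sequence without allocating a reversed copy.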

LocalAssembler.TransitPairCount

A count of the number of read Paths that cross through some Contig from some previous Contig to some subsequent Contig.

LocalAssembler.Traversal

A list of Contigs through the assembly graph.

LocalAssembler.TraversalEndpointComparator

LocalAssembler.TraversalSet

Set of traversals.

LocalAssembler.WalkData

Per-Contig storage for depth-first graph walking.

LocalAssembler.WalkDataFactory

LocatableFuncotationCreator

Implements fields for use in known locatables.

LocatableMetadata

Interface for marking objects that contain metadata associated with a collection of locatables.

LocatableXsvFuncotationFactory

Factory for creating TableFuncotations by handling `Separated Value` files with arbitrary delimiters (e.g.

LocationAndAlleles

This class exists to allow VariantContext objects to be compared based only on their location and set of alleles, providing a more liberal equals method so that VariantContext objects can be placed into a Set which retains only VCs that have non-redundant location and Allele lists.

LocationTranslationException

LocsFileFaker

Created by jcarey on 3/13/14.

LocsFileReader

The locs file format is one of 3 Illumina formats (pos, locs, and clocs) that store position data exclusively.

LocusFunction

Describes the behavior of a locus relative to a gene.

LocusIteratorByState

Iterator that traverses a SAM File, accumulating information on a per-locus basis. Produces AlignmentContext objects that contain ReadPileups of PileupElements.

LocusWalker

A LocusWalker is a tool that processes reads that overlap a single position in a reference at a time from one or multiple sources of reads, with optional contextual information from a reference and/or sets of variants/Features.

LocusWalkerByInterval

An implementation of LocusWalker that supports arbitrary interval side inputs.

LocusWalkerContext

Encapsulates an AlignmentContext with its ReferenceContext and FeatureContext.

LocusWalkerSpark

A Spark version of LocusWalker.

LofFilter

FuncotationFilter matching variants which: are classified as FRAME_SHIFT_*, NONSENSE, START_CODON_DEL, or SPLICE_SITE; occur on a gene where loss of function is a disease mechanism; and have a max MAF of 1% across sub-populations of ExAC or gnomAD.

Log10Cache

Log10FactorialCache

Wrapper class so that the log10Factorial array is only calculated if it's used

Log10PairHMM

Util class for performing the pair HMM for global alignment.

LoggingUtils

Logging utilities.

LoglessPairHMM

LongBloomFilter

Bloom filter for primitive longs.

LongBloomFilter.Serializer

LongHomopolymerHaplotypeCollapsingEngine

Utility class, useful for flow based applications, implementing a workaround for long homopolymers handling.

LongHopscotchSet

This class is based on the HopscotchCollection and HopscotchSet classes for storing Objects.

LongHopscotchSet.Serializer

LongIterator

Iterator-like interface for collections of primitive longs

LowWeightChainPruner<V extends BaseVertex,E extends BaseEdge>

Prunes all chains from this graph where all edges in the path have multiplicity < pruneFactor. For example, A -[1]> B -[1]> C -[1]> D would be removed with pruneFactor 2, but A -[1]> B -[2]> C -[1]> D would not be, because the linear chain includes an edge with weight >= 2.
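
The pruning rule described here can be sketched as a predicate over the edge multiplicities of one linear chain. This illustrates only the stated rule, with a hypothetical class and method name; the real pruner works on graph vertices and edges and may apply further conditions.

```java
import java.util.Arrays;

// Hypothetical sketch of the stated rule; not the GATK LowWeightChainPruner class.
final class ChainPruneRule {
    // A chain is pruned only if every edge in it has multiplicity < pruneFactor.
    static boolean shouldPrune(int[] edgeMultiplicities, int pruneFactor) {
        return Arrays.stream(edgeMultiplicities).allMatch(m -> m < pruneFactor);
    }
}
```

With pruneFactor 2, the chain with multiplicities {1, 1, 1} is pruned, while {1, 2, 1} is kept because one edge reaches the threshold.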

LRUCache<K,V>

An LRU cache implemented as an extension to LinkedHashMap
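
The standard way to build an LRU cache on top of LinkedHashMap is to enable access ordering and override `removeEldestEntry`. The sketch below shows that pattern; the class name `LruCacheSketch` and its capacity parameter are assumptions and need not match GATK's LRUCache.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the general pattern; not the GATK LRUCache source.
final class LruCacheSketch<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int capacity;

    LruCacheSketch(int capacity) {
        // accessOrder = true: iteration order runs from least to most recently accessed.
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity;
    }
}
```

With a capacity of 2, putting `a` and `b`, reading `a`, and then putting `c` evicts `b`, since `a` was used more recently.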

M2ArgumentCollection

M2ArgumentCollection.FlowMode

M2FiltersArgumentCollection

MafOutputRenderer

A Funcotator output renderer for writing to MAF files.

MafOutputRendererConstants

Class to hold all the constants required for the MafOutputRenderer.

Main

This is the main class of Hellbender and is the way of executing individual command line programs.

MakeSitesOnlyVcf

Creates a VCF that contains all the site-level information for all records in the input VCF but no genotype information.

MakeVcfSampleNameMap

Creates a TSV from sample name to VCF/GVCF path, with one line per input.

MannWhitneyU

Imported with changes from Picard private.

MannWhitneyU.RankedData

The ranked data in one list and a list of the number of ties.

MannWhitneyU.Result

The results of performing a rank sum test.

MannWhitneyU.TestStatistic

The values of U1, U2 and the transformed number of ties needed for the calculation of sigma in the normal approximation.

MannWhitneyU.TestType

A variable that indicates if the test is one sided or two sided and if it's one sided which group is the dominator in the null hypothesis.

MappingQuality

Median mapping quality of reads supporting each alt allele.

MappingQualityFilter

MappingQualityRankSumTest

Rank Sum Test for mapping qualities of REF versus ALT reads

MappingQualityReadFilter

Keep only reads with mapping qualities within a specified range.

MappingQualityReadTransformer

A read transformer to modify the mapping quality of reads with MQ=255 to reads with MQ=60

MappingQualityZero

Count of all reads with MAPQ = 0 across all samples

MarkDuplicates

A better duplication marking algorithm that handles all cases including clipped and gapped alignments.

MarkDuplicates.DuplicateTaggingPolicy

Enum used to control how duplicates are flagged in the DT optional tag on each read.

MarkDuplicates.DuplicateType

Enum for the possible values that a duplicate read can be tagged with in the DT attribute.

MarkDuplicatesForFlowArgumentCollection

MarkDuplicatesForFlowHelper

MarkDuplicates calculation helper class for flow based mode. The class extends the behavior of MarkDuplicates, which contains the complete code for the non-flow based mode.

MarkDuplicatesHelper

MarkDuplicatesScoringStrategy

This class helps us compute and compare duplicate scores, which are used for selecting the non-duplicate during duplicate marking (see MarkDuplicatesGATK).

MarkDuplicatesSpark

MarkDuplicates on Spark

MarkDuplicatesSparkArgumentCollection

An argument collection for use with tools that mark optical duplicates.

MarkDuplicatesSparkRecord

A common interface for the data types that represent reads for mark duplicates spark.

MarkDuplicatesSparkRecord.Type

MarkDuplicatesSparkUtils

Utility classes and functions for Mark Duplicates.

MarkDuplicatesSparkUtils.IndexPair<T>

Wrapper object used for storing an object and some type of index information.

MarkDuplicatesSparkUtils.TransientFieldPhysicalLocationComparator

Comparator for TransientFieldPhysicalLocation objects by their attributes and strandedness.

MarkDuplicatesWithMateCigar

An even better duplication marking algorithm that handles all cases including clipped and gapped alignments.

MarkDuplicatesWithMateCigarIterator

This will iterate through a coordinate sorted SAM file (iterator) and either mark or remove duplicates as appropriate.

MarkedOpticalDuplicateReadFilter

MarkIlluminaAdapters

Command line program to mark the location of adapter sequences.

MarkQueue

This is the mark queue.

MatchResults

Represents the results of a fingerprint comparison between one dataset and a specific fingerprint file.

MateDistantReadFilter

Keep only paired reads that are not near each other in a coordinate-sorted source of reads.

MathUtil

General math utilities

MathUtil.LogMath

A collection of common math operations that work with log values.

MathUtils

MathUtils is a static class (no instantiation allowed!) with some useful math methods.

MathUtils.IntToDoubleArrayFunction

MathUtils.RunningAverage

A utility class that computes on the fly average and standard deviation for a stream of numbers.
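
A running mean and standard deviation can be kept in constant memory with Welford's update. The sketch below illustrates that idea only; the class name RunningStatsSketch and the choice of the sample (n - 1) denominator are assumptions for illustration, not the GATK source.

```java
/** Hypothetical sketch of a streaming mean/stddev accumulator (Welford's method). */
final class RunningStatsSketch {
    private long count;
    private double mean;
    private double m2; // running sum of squared deviations from the current mean

    /** Folds one more observation into the running statistics. */
    void add(double x) {
        count++;
        double delta = x - mean;
        mean += delta / count;
        m2 += delta * (x - mean); // uses the updated mean on purpose
    }

    long count() { return count; }

    double mean() { return mean; }

    /** Sample standard deviation; returns 0 when fewer than two values were seen. */
    double stddev() {
        return count < 2 ? 0.0 : Math.sqrt(m2 / (count - 1));
    }

    public static void main(String[] args) {
        RunningStatsSketch stats = new RunningStatsSketch();
        for (double x : new double[] {2, 4, 4, 4, 5, 5, 7, 9}) {
            stats.add(x);
        }
        System.out.println("mean=" + stats.mean() + " stddev=" + stats.stddev());
    }
}
```

The single-pass update avoids storing the stream and is less prone to cancellation error than accumulating the sum of squares directly.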

MatrixSummaryUtils

Static class for implementing some matrix summary stats that are not in Apache, Spark, etc

MeanQualityByCycle

Program to generate a data table and chart of mean quality by cycle from a BAM file.

MeanQualityByCycleSpark

Program to generate a data table and chart of mean quality by cycle from a BAM file.

MemoryBasedReadEndsForMarkDuplicatesMap

Map from String to ReadEnds object.

MendelianViolation

Class for the identification and tracking of mendelian violation.

MendelianViolationEvaluator

Mendelian violation detection and counting

MendelianViolationMetrics

Describes the type and number of mendelian violations found within a Trio.

MendelianViolationsByFamily

Created by farjoun on 6/25/16.

MergeableMetricBase

An extension of MetricBase that knows how to merge-by-adding fields that are appropriately annotated (MergeByAdding).

MergeableMetricBase.MergeByAdding

Metrics whose values can be merged by adding.

MergeableMetricBase.MergeByAssertEquals

Metrics whose values should be equal when merging.

MergeableMetricBase.MergingIsManual

Metrics that are merged manually in MergeableMetricBase.merge(MergeableMetricBase).

MergeableMetricBase.NoMergingIsDerived

Metrics that are not merged, but are subsequently derived from other metrics, for example by MergeableMetricBase.calculateDerivedFields().

MergeableMetricBase.NoMergingKeepsValue

Metrics that are not merged.

MergeAnnotatedRegions

MergeAnnotatedRegionsByAnnotation

MergeBamAlignment

Summary

MergeMutect2CallsWithMC3

MergeMutectStats

Merge the stats output by scatters of a single Mutect2 job.

MergePedIntoVcf

Class to take genotype calls from a ped file output from zCall and merge them into a vcf from autocall.

MergeSamFiles

This tool is used for combining SAM and/or BAM files from different runs or read groups into a single file, similar to the "merge" function of Samtools (http://www.htslib.org/doc/samtools.html).

MergeVcfs

Combines multiple variant files into a single variant file.

Metadata

Interface for marking objects that contain metadata that can be represented as a SAMFileHeader.

Metadata.Type

MetadataUtils

MetagenomicsProgramGroup

Tools that perform metagenomic analysis, e.g.

MethylationProgramGroup

Tools that perform methylation calling and methylation-based coverage for bisulfite BAMs

MethylationTypeCaller

Identifies methylated bases from bisulfite sequencing data.

MetricAccumulationLevel

For use with Picard metrics programs that may output metrics for multiple levels of aggregation with an analysis.

MetricAccumulationLevel

For use with Picard metrics programs that may output metrics for multiple levels of aggregation with an analysis.

MetricAccumulationLevelArgumentCollection

MetricsArgumentCollection

Base class for defining a set of metrics collector arguments.

MetricsCollection

Created by knoblett on 9/15/15.

MetricsCollectorSpark<T extends MetricsArgumentCollection>

Each metrics collector has to be able to run from 4 different contexts:

- a standalone walker tool
- the org.broadinstitute.hellbender.metrics.analysis.CollectMultipleMetrics walker tool
- a standalone Spark tool
- the CollectMultipleMetricsSpark tool

In order to allow a single collector implementation to be shared across all of these contexts (standalone and CollectMultiple, Spark and non-Spark), collectors should be factored into the following classes, where X in the class names represents the specific type of metrics being collected:

- XMetrics extends MetricBase: defines the aggregate metrics that we're trying to collect
- XMetricsArgumentCollection: defines parameters for XMetrics, extends MetricsArgumentCollection
- XMetricsCollector: processes a single read, and has a reduce/combiner. For multi level collectors, XMetricsCollector is composed of several classes:
    - XMetricsCollector extends MultiLevelReducibleCollector<XMetrics, HISTOGRAM_KEY, XMetricsCollectorArgs, XMetricsPerUnitCollector>
    - XMetricsPerUnitCollector: per level collector, implements PerUnitMetricCollector<XMetrics, HISTOGRAM_KEY, XMetricsCollectorArgs> (requires a combiner)
    - XMetricsCollectorArgs: per-record argument (type argument for MultiLevelReducibleCollector)
- XMetricsCollectorSpark: adapter/bridge between RDD and the (read-based) XMetricsCollector, implements MetricsCollectorSpark
- CollectXMetrics extends org.broadinstitute.hellbender.metrics.analysis.SinglePassSamProgram
- CollectXMetricsSpark extends MetricsCollectorSparkTool

The following schematic shows the general relationships of these collector component classes in the context of various tools, with the arrows indicating a "delegates to" relationship via composition or inheritance:

    CollectXMetrics      CollectMultipleMetrics
           \                    /
            \                  /
             v                v
    _______________________________________
    | XMetricsCollector ==================|=========> MultiLevelReducibleCollector
    |         |                           |                        |
    |         V                           |                        |
    | XMetrics                            |                        V
    | XMetricsCollectorArgumentCollection |             PerUnitXMetricCollector
    ---------------------------------------
                       ^
                       |
                       |
            XMetricsCollectorSpark
               ^              ^
              /                \
             /                  \
    CollectXMetricsSpark   CollectMultipleMetricsSpark

The general lifecycle of a Spark collector (XMetricsCollectorSpark in the diagram above) looks like this:

    CollectorType collector = new CollectorType();
    CollectorArgType args = // get metric-specific input arguments

    // NOTE: getDefaultReadFilters is called before the collector's initialize
    // method is called, so the read filters cannot access argument values
    ReadFilter filter = collector.getDefaultReadFilters();

    // pass the input arguments to the collector for initialization
    collector.initialize(args, defaultMetricsHeaders);

    collector.collectMetrics(
        getReads().filter(filter),
        samFileHeader
    );
    collector.saveMetrics(getReadSourceName());

MetricsCollectorSparkTool<T extends MetricsArgumentCollection>

Base class for standalone Spark metrics collector tools.

MetricsReadFilter

Filter out reads that: fail platform/vendor quality checks (0x200); are unmapped (0x4); represent secondary/supplementary alignments (0x100 or 0x800).
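
The flag values in this description are bits of the SAM FLAG field, so the three conditions can be tested with a single mask. The following is a minimal sketch of that check on a raw integer flag; the class and method names are hypothetical, and the real filter operates on read objects rather than bare integers.

```java
/** Hypothetical sketch: apply the MetricsReadFilter conditions to a raw SAM FLAG value. */
final class MetricsFlagFilterSketch {
    static final int UNMAPPED = 0x4;        // read is unmapped
    static final int SECONDARY = 0x100;     // secondary alignment
    static final int FAILS_QC = 0x200;      // fails platform/vendor quality checks
    static final int SUPPLEMENTARY = 0x800; // supplementary alignment

    static final int REJECT_MASK = UNMAPPED | SECONDARY | FAILS_QC | SUPPLEMENTARY;

    /** Returns true when none of the rejected bits is set, i.e. the read is kept. */
    static boolean keep(int samFlags) {
        return (samFlags & REJECT_MASK) == 0;
    }

    public static void main(String[] args) {
        int[] flags = {99, 4, 272, 512, 2048};
        for (int f : flags) {
            System.out.println(f + " kept: " + keep(f));
        }
    }
}
```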

MetricsUtils

Utility methods for dealing with MetricsFile and related classes.

MinAlleleFractionFilter

MinibatchSliceSampler<DATA>

Implements slice sampling of a continuous, univariate, unnormalized probability density function (PDF), which is assumed to be unimodal.

MinimalGenotypingEngine

A stripped-down version of the former UnifiedGenotyper's genotyping strategy implementation, used only by the HaplotypeCaller for its isActive() determination.

MinimalVariant

MinimalVariant is a minimal implementation of the GATKVariant interface.

MinorAlleleFractionRecord

Created by David Benjamin on 2/13/17.

MinorAlleleFractionRecord.MinorAlleleFractionTableReader

MisencodedBaseQualityReadTransformer

Checks for and errors out (or fixes if requested) when it detects reads with base qualities that are not encoded with phred-scaled quality scores.

MixingFraction

Simple class for storing a sample and its mixing fraction within a pooled bam.

MMapBackedIteratorFactory

MMapBackedIteratorFactory is a file reader that takes a header size and a binary file, maps the file to a read-only byte buffer, and provides methods to retrieve the header as its own ByteBuffer and create iterators of different data types over the values of the file (starting after the end of the header).

ModeArgumentUtils

This class is a static helper for implementing 'mode arguments' by tools.

ModeledSegment

ModeledSegment.SimplePosteriorSummary

ModeledSegmentCollection

ModelSegments

Models segmented copy ratios from denoised copy ratios and segmented minor-allele fractions from allelic counts.

ModelSegments.RunMode

MoleculeID

A container class for the molecule ID, which consists of an integer ID and a binary strand.

Molten

Molten for @Analysis modules.

MostDistantPrimaryAlignmentSelectionStrategy

For a paired-end aligner that aligns each end independently, select the pair of alignments that result in the largest insert size.

MostDistantPrimaryAlignmentSelectionStrategy

For a paired-end aligner that aligns each end independently, select the pair of alignments that result in the largest insert size.

MTLowHeteroplasmyFilterTool

MultiallelicFilter

MultiallelicSummary

MultiallelicSummary.Type

MultiDeBruijnVertex

A DeBruijnVertex that supports multiple copies of the same kmer

MultidimensionalModeller

Represents a segmented model for copy ratio and allele fraction.

MultiFeatureWalker<F extends htsjdk.tribble.Feature>

A MultiFeatureWalker is a tool that presents one Feature at a time in sorted order from multiple sources of Features.

MultiFeatureWalker.DictSource

MultiFeatureWalker.MergingIterator<F extends htsjdk.tribble.Feature>

MultiFeatureWalker.PQContext<F extends htsjdk.tribble.Feature>

MultiFeatureWalker.PQEntry<F extends htsjdk.tribble.Feature>

MultiHitAlignedReadIterator

Iterate over queryname-sorted SAM, and return each group of reads with the same queryname.

MultiIntervalLocalReadShard

A class to represent shards of read data spanning multiple intervals.

MultiIntervalShard<T>

An interface to represent shards of arbitrary data spanning multiple intervals.

MultiLevelCollector<METRIC_TYPE extends htsjdk.samtools.metrics.MetricBase,HISTOGRAM_KEY extends Comparable<HISTOGRAM_KEY>,ARGTYPE>

MultiLevelCollector handles accumulating Metrics at different MetricAccumulationLevels(ALL_READS, SAMPLE, LIBRARY, READ_GROUP).

MultiLevelCollector<METRIC_TYPE extends htsjdk.samtools.metrics.MetricBase,Histogram_KEY extends Comparable,ARGTYPE>

MultiLevelCollector handles accumulating Metrics at different MetricAccumulationLevels(ALL_READS, SAMPLE, LIBRARY, READ_GROUP).

MultilevelMetrics

MultiLevelMetrics

MultiLevelReducibleCollector<METRIC_TYPE extends htsjdk.samtools.metrics.MetricBase,HISTOGRAM_KEY extends Comparable<HISTOGRAM_KEY>,ARGTYPE,UNIT_COLLECTOR extends PerUnitMetricCollector<METRIC_TYPE,HISTOGRAM_KEY,ARGTYPE>>

Abstract base class for reducible multi-level metrics collectors.

MultiplePassReadWalker

A MultiplePassReadWalker traverses input reads multiple times.

MultiplePassReadWalker.GATKReadConsumer

Implemented by MultiplePassReadWalker-derived tools.

MultiplePassVariantWalker

A VariantWalker that makes multiple passes through the variants.

MultiSampleEdge

Edge class for connecting nodes in the graph that tracks some per-sample information.

MultisampleMultidimensionalKernelSegmenter

Segments copy-ratio data and/or alternate-allele-fraction data from one or more samples using kernel segmentation.

MultiTileBclFileFaker

Created by jcarey on 3/13/14.

MultiTileBclFileUtil

NextSeq-style bcl's have all tiles for a cycle in a single file.

MultiTileBclParser

Parse .bcl.bgzf files that contain multiple tiles in a single file.

MultiTileFileUtil<OUTPUT_RECORD extends picard.illumina.parser.IlluminaData>

For file types for which there is one file per lane, with fixed record size, and all the tiles in it, so the s_.bci file can be used to figure out where each tile starts and ends.

MultiTileFilterParser

Read filter file that contains multiple tiles in a single file.

MultiTileLocsFileFaker

Created by jcarey on 3/13/14.

MultiTileLocsParser

Read locs file that contains multiple tiles in a single file.

MultiTileParser<OUTPUT_RECORD extends picard.illumina.parser.IlluminaData>

Abstract class for files with fixed-length records for multiple tiles, e.g.

MultiVariantDataSource

MultiVariantDataSource aggregates multiple FeatureDataSources of variants, and enables traversals and queries over those sources through a single interface.

MultiVariantInputArgumentCollection

Class that defines the variant arguments used for a MultiVariantWalker.

MultiVariantInputArgumentCollection.DefaultMultiVariantInputArgumentCollection

MultiVariantWalker

A MultiVariantWalker is a tool that processes one variant at a time, in position order, from multiple sources of variants, with optional contextual information from a reference, sets of reads, and/or supplementary sources of Features.

MultiVariantWalkerGroupedOnStart

A MultiVariantWalker that walks over multiple variant context sources in reference order and emits to client tools groups of all input variant contexts by their start position.

MummerExecutor

Class for executing MUMmer alignment pipeline to detect SNPs and INDELs in mismatching sequences.

Mutect2

Call somatic short mutations via local assembly of haplotypes.

Mutect2AlleleFilter

Base class for filters that apply at the allele level.

Mutect2Engine

Created by davidben on 9/15/16.

Mutect2Filter

Base class for all Mutect2Filters

Mutect2FilteringEngine

Mutect2VariantFilter

Mutect3DatasetEngine

MutectDownsampler

MutectReadThreadingAssemblerArgumentCollection

MutectStats

NaiveHeterozygousPileupGenotypingUtils

Naive methods for binomial genotyping of heterozygous sites from pileup allele counts.

NaiveHeterozygousPileupGenotypingUtils.NaiveHeterozygousPileupGenotypingResult

NativeUtils

Utilities to provide architecture-dependent native functions

NaturalLogUtils

NDNCigarReadTransformer

A read transformer that refactors NDN cigar elements to one N element.
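
As an illustration of the rewrite this describes, the sketch below collapses each N-D-N run in a CIGAR string into a single N element. It assumes the merged element's length is the sum of the three original lengths; the class name, the method name, and the string-based approach are all hypothetical and are not taken from the GATK implementation.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch: collapse "xN yD zN" runs in a CIGAR string into "(x+y+z)N". */
final class NdnCigarSketch {
    private static final Pattern NDN = Pattern.compile("(\\d+)N(\\d+)D(\\d+)N");

    static String collapseNdn(String cigar) {
        // Assumption: the merged N element spans the total length of the three elements.
        return NDN.matcher(cigar).replaceAll(m -> {
            int total = Integer.parseInt(m.group(1))
                    + Integer.parseInt(m.group(2))
                    + Integer.parseInt(m.group(3));
            return total + "N";
        });
    }

    public static void main(String[] args) {
        System.out.println(collapseNdn("10M100N5D200N20M"));
    }
}
```

CIGAR strings without an N-D-N run pass through unchanged.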

NearbyKmerErrorCorrector

Utility class that error-corrects reads.

NearbyKmerErrorCorrector.CorrectionSet

Wrapper utility class that holds, for each position in read, a list of bytes representing candidate corrections.

NestedIntegerArray<T extends Serializable>

NestedIntegerArray.Leaf<T>

NGSPlatform

A canonical, master list of the standard NGS platforms.

NioFileCopierWithProgressMeter

Class to copy a file using java.nio.

NioFileCopierWithProgressMeter.ChecksumCalculator

An interface that defines a method to use to calculate a checksum on an InputStream.

NioFileCopierWithProgressMeter.Verbosity

An enum to allow for verbosity of logging progress of an NioFileCopierWithProgressMeter.

NioFileCopierWithProgressMeterResults

An object to hold the results of a copy operation performed by NioFileCopierWithProgressMeter.

NonChecksumLocalFileSystem

An extension of Hadoop's LocalFileSystem that doesn't write (or verify) .crc files.

NonLocatableDoubleCollection

A collection representing a real valued vector generated by GermlineCNVCaller

NonNFastaSize

A tool to count the number of non-N bases in a fasta file

NonSymmetricalPairHMMInputScoreImputator

A version of the classic StandardPairHMMInputScoreImputator that allows for decoupled insertion and deletion penalties for the model.

NormalArtifactFilter

NormalArtifactRecord

NormalArtifactRecord.NormalArtifactWriter

NormalizeFasta

Little program to "normalize" a fasta file to ensure that all lines of sequence are the same length, and are a reasonable length!

NotOpticalDuplicateReadFilter

Filters out reads marked as duplicates.

NovelAdjacencyAndAltHaplotype

This class represents a pair of inferred genomic locations on the reference whose novel adjacency is generated due to an SV event (in other words, a simple rearrangement between two genomic locations) that is suggested by the input SimpleChimera, and complications as enclosed in BreakpointComplications in pinning down the locations to exact base pair resolution.

NovelAdjacencyAndAltHaplotype.Serializer

Novelty

Stratifies by whether a site is in the list of known RODs (e.g., dbsnp by default)

NRatioFilter

Nucleotide

Represents the nucleotide alphabet with support for IUPAC ambiguity codes.

Nucleotide.Counter

Helper class to count the number of occurrences of each nucleotide code in a sequence.

NuMTFilterTool

OccurrenceMatrix<R,C>

Class to work with exclusive pairs of elements, for example, pairs of alleles that do not occur together in the haplotypes

OffRampBase

OjAlgoSingularValueDecomposer

SVD using the ojAlgo library.

OneBPIndel

Stratifies the eval RODs into sites where the indel is 1 bp in length and those where the event is 2+.

OneShotLogger

A logger wrapper class which only outputs the first warning provided to it
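
The behaviour described here amounts to a boolean guard around the underlying logger. A minimal sketch, assuming the warning is delivered to an arbitrary Consumer<String> sink (the class name and the sink are hypothetical; the real class wraps a logger):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch: forwards only the first warning it receives. */
final class OneShotWarnSketch {
    private final Consumer<String> sink;
    private boolean hasWarned;

    OneShotWarnSketch(Consumer<String> sink) {
        this.sink = sink;
    }

    /** Emits the message on the first call and silently drops every later one. */
    void warn(String message) {
        if (!hasWarned) {
            hasWarned = true;
            sink.accept(message);
        }
    }

    public static void main(String[] args) {
        List<String> emitted = new ArrayList<>();
        OneShotWarnSketch logger = new OneShotWarnSketch(emitted::add);
        logger.warn("first warning");
        logger.warn("second warning");
        System.out.println(emitted);
    }
}
```

This sketch is not thread-safe; a concurrent version could use an AtomicBoolean for the guard.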

OnRampBase

OpticalDuplicateFinder

Contains methods for finding optical/co-localized/sequencing duplicates.

OpticalDuplicatesArgumentCollection

An argument collection for use with tools that mark optical duplicates.

OptimizationUtils

Created by davidben on 4/27/16.

OptionalFeatureInputArgumentCollection

An argument collection for use with tools that accept zero or more input files containing Feature records (e.g., BED files, hapmap files, etc.).

OptionalIntervalArgumentCollection

An interval argument class that allows -L to be specified but does not require it.

OptionalReadInputArgumentCollection

An argument collection for use with tools that accept zero or more input files containing reads (e.g., BAM/SAM/CRAM files).

OptionalReferenceArgumentCollection

Picard default argument collection for an optional reference.

OptionalReferenceInputArgumentCollection

An argument collection for use with tools that optionally accept a reference file as input.

OptionalTextOutputArgumentCollection

An ArgumentCollection with an optional output argument, and utility methods for printing String output to it. To use this class, add an @ArgumentCollection variable to your tool.


OptionalVariantInputArgumentCollection

An argument collection for use with tools that accept zero or more input files containing VariantContext records (e.g., VCF files).

OrientationBiasReadCounts

Count of read pairs in the F1R2 and F2R1 configurations supporting the reference and alternate alleles

OriginalAlignment

Original Alignment annotation counts the number of alt reads where the original alignment contig doesn't match the current alignment contig

OtherProgramGroup

Miscellaneous tools, e.g.

OutputArgumentCollection

Base interface for an output argument collection.

OutputMapping

In multiple locations we need to know what cycles are output. As of now we output all non-skip cycles, but rather than sprinkle this knowledge throughout the parser code, OutputMapping provides all the data a client might want about the cycles to be output, including what ReadType they are.

OutputMode

Describes the mode of output for the caller.

OutputRenderer

An abstract class to allow for writing output for the Funcotator.

OutputStreamSettings

Settings that define text to capture from a process stream.

OverclippedReadFilter

Filter out reads where the number of bases without soft-clips (M, I, X, and = CIGAR operators) is lower than a threshold.
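
To make the counting rule concrete, the sketch below sums the lengths of the M, I, X, and = elements of a CIGAR string and compares the total with a threshold. The class name, method names, and the threshold value in the example are hypothetical, and the sketch ignores any additional options the real filter may offer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: count bases under M, I, X and = operators in a CIGAR string. */
final class OverclipSketch {
    private static final Pattern ELEMENT = Pattern.compile("(\\d+)([MIDNSHP=X])");

    /** Sums the lengths of the CIGAR elements that are neither clips nor reference-only. */
    static int unclippedBases(String cigar) {
        int total = 0;
        Matcher m = ELEMENT.matcher(cigar);
        while (m.find()) {
            char op = m.group(2).charAt(0);
            if (op == 'M' || op == 'I' || op == 'X' || op == '=') {
                total += Integer.parseInt(m.group(1));
            }
        }
        return total;
    }

    /** True when the read would be filtered out under the given minimum. */
    static boolean isOverclipped(String cigar, int minBases) {
        return unclippedBases(cigar) < minBases;
    }

    public static void main(String[] args) {
        String cigar = "70S25M2I3S";
        System.out.println(unclippedBases(cigar) + " " + isOverclipped(cigar, 30));
    }
}
```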

OverhangFixingManager

The class manages reads and splices and tries to apply overhang clipping when appropriate.

OverlappingErrorMetric

An error metric for the errors involving bases in the overlapping region of a read-pair.

OverlappingIntegerCopyNumberSegmentCollection
 
OverlappingReadsErrorCalculator

A calculator that estimates the error rate of the bases it observes, assuming that the reference is truth.

Pair

Class representing a pair of reads together with accompanying optical duplicate marking information.

Pair<X extends Comparable<X>,Y extends Comparable<Y>>

Simple Pair class.

Pair.Serializer

Serializers for each subclass of PairedEnds which rely on implementations of serializations within each class itself

PairedEnds

Struct-like class to store information about the paired reads for mark duplicates.

PairedStrandedIntervals
 
PairedStrandedIntervalTree<V>
 
PairedStrandedIntervalTree.Serializer<T>
 
PairedVariantSubContextIterator

An iterator that takes a pair of iterators over VariantContexts and iterates over them in tandem.

PairedVariantSubContextIterator.VcfTuple

Little class to hold a pair of VariantContexts that are in sync with one another.

PairHMM

Class for performing the pair HMM for global alignment.

PairHMM.Implementation
 
PairHMMInputScoreImputation
 
PairHMMInputScoreImputator

Common interface for pair-hmm score calculators.

PairHMMLikelihoodCalculationEngine
 
PairHMMLikelihoodCalculationEngine.PCRErrorModel
 
PairHMMModel

Helper class that implements calculations required to implement the PairHMM Finite State Automaton (FSA) model.

PairHMMNativeArgumentCollection

Arguments for native PairHMM implementations

PairWalker
 
PalindromeArtifactClipReadTransformer

Trims (hard clips) soft-clipped bases due to the following artifact: when a sequence and its reverse complement occur near opposite ends of a fragment, DNA damage (especially in the case of FFPE samples and ancient DNA) can disrupt base-pairing, causing a single-strand loop of the sequence and its reverse complement, after which end repair copies the true 5' end of the fragment onto the 3' end of the fragment.

PanelMetricsBase

A base class for Metrics for targeted panels.

PanelOfNormalsFilter
 
ParallelCopyGCSDirectoryIntoHDFSSpark

Parallel copy a file or directory from Google Cloud Storage into the HDFS file system used by Spark

Parameter<T extends Enum<T> & ParameterEnum,U>

Represents a parameter value with a named ParameterEnum key.

ParameterDecileCollection<T extends Enum<T> & ParameterEnum>
 
ParameterEnum

Interface for tagging an enum that represents the name of every Parameter comprising a ParameterizedState.

ParameterizedFileUtil
 
ParameterizedModel<V1 extends Enum<V1> & ParameterEnum,S1 extends ParameterizedState<V1>,T1 extends DataCollection>

Represents a parameterized model.

ParameterizedModel.GibbsBuilder<V2 extends Enum<V2> & ParameterEnum,S2 extends ParameterizedState<V2>,T2 extends DataCollection>

Builder for constructing a ParameterizedModel to be Gibbs sampled using GibbsSampler.

ParameterizedModel.UpdateMethod
 
ParameterizedState<T extends Enum<T> & ParameterEnum>

Represents a mapped collection of Parameter objects, i.e., named, ordered, enumerated keys associated with values of mixed type via a key -> key, value map.

ParameterSampler<U,V extends Enum<V> & ParameterEnum,S extends ParameterizedState<V>,T extends DataCollection>

Interface for generating random samples of a Parameter value, given a ParameterizedState and a DataCollection.

ParameterTableColumn
 
ParamUtils

This class should eventually be merged into Utils, which is in hellbender, and then this class should be deleted.

PartialReadWalker

A specialized read walker that may be gracefully stopped before the input stream ends. A tool derived from this class should implement PartialReadWalker.shouldExitEarly(GATKRead) to indicate when to stop.

PartitionCrossingChecker

It allows you to ask whether a given interval is near the beginning or end of the partition.

Passthrough

Dummy class used for preserving reads that need to be marked as non-duplicate despite not wanting to perform any processing on the reads.

PassThroughDownsampler

Pass-Through Downsampler: Implementation of the ReadsDownsampler interface that does no downsampling whatsoever, and instead simply "passes-through" all the reads it's given.

Path<V extends BaseVertex,E extends BaseEdge>

A path through a BaseGraph. Class to keep track of paths.

PathLineIterator

Iterate through the lines of a Path.

PathProvider

A class whose purpose is to initialize the various plugins that provide Path support.

PathSeqBuildKmers

Produce a set of k-mers from the given host reference.

PathSeqBuildReferenceTaxonomy

Build an annotated taxonomy datafile for a given microbe reference.

PathSeqBwaSpark

Align reads to a microbe reference using BWA-MEM and Spark.

PathSeqFilterSpark

Filters low complexity, low quality, duplicate, and host reads.

PathSeqPipelineSpark

Combined tool that performs all PathSeq steps: read filtering, microbe reference alignment and abundance scoring

PathSeqScoreSpark

Classify reads and estimate abundances of each taxon in the reference.

PedFile

Represents a .ped file of family information as documented here: http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml

Stores the information in memory as a map of individualId -> Pedigree information for that individual

PedigreeAnnotation

A common interface for handling annotations that require pedigree file information either in the form of explicitly selected founderIDs or in the form of an imported pedigreeFile.

PedigreeValidationType
 
PedReader

Reads PED file-formatted tabular text files

See http://www.broadinstitute.org/mpg/tagger/faq.html
See http://pngu.mgh.harvard.edu/~purcell/plink/data.shtml#ped

The "ped" file format refers to the widely-used format for linkage pedigree data.

PedReader.Field
 
PedReader.MissingPedField

An enum that specifies which, if any, of the standard PED fields are missing from the input records.

PerAlleleAnnotation

Apply an annotation based on aggregation data from all reads supporting each allele.

PerAlleleCollection<X extends Number>

A container for allele to value mapping.

PerAlleleCollection.Type
 
Permutation<E>

Represent a permutation of an ordered set or list of elements.

PersistenceOptimizer

Given 1-dimensional data, finds all local minima sorted by decreasing topological persistence.

PerTileFileUtil
 
PerTileOrPerRunFileUtil
 
PerTileParser<ILLUMINA_DATA extends picard.illumina.parser.IlluminaData>

Abstract base class for Parsers that open a single tile file at a time and iterate through them.

PerTilePerCycleFileUtil
 
PerUnitExampleMultiMetricsCollector

A Collector for individual ExampleMultiMetrics for a given SAMPLE or SAMPLE/LIBRARY or SAMPLE/LIBRARY/READ_GROUP (depending on aggregation levels)

PerUnitInsertSizeMetricsCollector

A Collector for individual InsertSizeMetrics for a given SAMPLE or SAMPLE/LIBRARY or SAMPLE/LIBRARY/READ_GROUP (depending on aggregation levels)

PerUnitMetricCollector<BEAN extends htsjdk.samtools.metrics.MetricBase,HKEY extends Comparable<HKEY>,ARGTYPE>

PerRecordCollector - An interface for classes that collect data in order to generate one or more metrics.

PerUnitMetricCollector<BEAN extends htsjdk.samtools.metrics.MetricBase,HKEY extends Comparable,ARGTYPE>

PerRecordCollector - An interface for classes that collect data in order to generate one or more metrics.

PGTagArgumentCollection

Argument Collection which holds parameters common to classes that want to add PG tags to reads in SAM/BAM files

PhysicalLocation

Small interface that provides access to the physical location information about a cluster.

PhysicalLocationForMateCigar

Stores the minimal information needed for optical duplicate detection.

PhysicalLocationForMateCigarSet

This stores records that are comparable for detecting optical duplicates.

PhysicalLocationInt

Small class that provides access to the physical location information about a cluster.

PhysicalLocationShort

Small class that provides access to the physical location information about a cluster.

PicardCommandLine

This is the main class of Picard and is the way of executing individual command line programs.

PicardCommandLineProgram

Base class for all Picard tools.

PicardCommandLineProgramExecutor

Adapter shim for use within GATK to run Picard tools.

PicardException

Basic Picard runtime exception that, for now, does nothing much

PicardHelpDoclet

Custom Barclay-based Javadoc Doclet used for generating Picard help/documentation.

PicardHelpDocWorkUnitHandler

The Picard Documentation work unit handler class that is the companion to PicardHelpDoclet.

PicardHtsPath

A Subclass of HtsPath with conversion to Path making use of IOUtil

PicardNonZeroExitException

Exception used to propagate non-zero return values from Picard tools.

Pileup

Prints read alignments in samtools pileup format.

PileupBasedAlleles

Helper class for handling pileup allele detection supplement for assembly.

PileupDetectionArgumentCollection

Set of arguments for configuring the pileup detection code

PileupElement

Represents an individual base in a reads pileup.

PileupReadErrorCorrector
 
PileupSpark

Prints read alignments in samtools pileup format.

PileupSummary

Created by David Benjamin on 2/14/17.

PileupSummary.PileupSummaryComparator
 
PileupSummary.PileupSummaryTableWriter
 
PlatformReadFilter

Keep only reads where the Read Group platform attribute (RG:PL tag) contains the given string.

PlatformUnitReadFilter

Filter out reads where the platform unit attribute (PU tag) contains the given string.

PloidyModel

Information about the number of chromosomes per sample at a given location.

PloidyTable
 
PlotDenoisedCopyRatios

Creates plots of standardized and denoised copy ratios.

PlotModeledSegments

Creates plots of denoised and segmented copy-ratio and minor-allele-fraction estimates.

PolymeraseSlippageFilter
 
PosFileFaker

Created by jcarey on 3/13/14.

PosFileReader

The pos file format is one of 3 Illumina formats (pos, locs, and clocs) that store position data exclusively.

PositionalDownsampler

PositionalDownsampler: Downsample each stack of reads at each alignment start to a size <= a target coverage using a ReservoirDownsampler.
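
Reservoir sampling keeps a uniform random subset of bounded size from a stream of unknown length, which is the general technique behind capping each stack of reads at a target coverage. The sketch below shows the classic "Algorithm R" over a generic list; the class name and signature are hypothetical and do not reproduce the ReservoirDownsampler API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of reservoir sampling (Algorithm R). */
final class ReservoirSketch {
    /** Returns at most {@code capacity} items, each input item being equally likely to survive. */
    static <T> List<T> sample(Iterable<T> items, int capacity, Random rng) {
        List<T> reservoir = new ArrayList<>(capacity);
        int seen = 0;
        for (T item : items) {
            seen++;
            if (reservoir.size() < capacity) {
                reservoir.add(item); // fill phase: keep everything until the reservoir is full
            } else {
                int slot = rng.nextInt(seen); // uniform in [0, seen)
                if (slot < capacity) {
                    reservoir.set(slot, item); // replace with probability capacity / seen
                }
            }
        }
        return reservoir;
    }

    public static void main(String[] args) {
        List<Integer> stack = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            stack.add(i);
        }
        System.out.println(sample(stack, 10, new Random()));
    }
}
```

When the input holds no more items than the capacity, every item is kept and the original order is preserved.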

PositionBasedDownsampleSam

Summary

PosParser

PosParser parses multiple files formatted as one of the three file formats that contain position information only (pos, locs, and clocs).

PossibleDeNovo

Existence of a de novo mutation in at least one of the given families

PostAssemblerOnRamp
 
PosteriorProbabilitiesUtils
 
PosteriorProbabilitiesUtils.PosteriorProbabilitiesOptions

A class to wrangle all the various and sundry genotype posterior options, mostly from CalculateGenotypePosteriors

PostFilterOnRamp
 
PostprocessGermlineCNVCalls

Postprocesses the output of GermlineCNVCaller and generates VCF files as well as a concatenated denoised copy ratio file.

PostProcessReadsForRSEM

Performs post-processing steps to get a bam aligned to a transcriptome ready for RSEM (https://github.com/deweylab/RSEM)

Suppose the read name "Q1" aligns to multiple loci in the transcriptome.

PowerCalculationUtils
 
PredicateFilterDecoratingClosableIterator<T>

Performs on-the-fly filtering of the provided VariantContext Iterator such that only variants that satisfy all predicates are emitted.

PreFilterOffRamp
 
PreprocessIntervals

Prepares bins for coverage collection.

PrimaryAlignmentKey

It is useful to define a key such that the key will occur at most once among the primary alignments in a given file (assuming the file is valid).

PrimaryAlignmentKey

It is useful to define a key such that the key will occur at most once among the primary alignments in a given file (assuming the file is valid).

PrimaryAlignmentSelectionStrategy

Given a set of alignments for a read or read pair, mark one alignment as primary, according to whatever strategy is appropriate.

PrimaryAlignmentSelectionStrategy

Given a set of alignments for a read or read pair, mark one alignment as primary, according to whatever strategy is appropriate.

PrintBGZFBlockInformation

A diagnostic tool that prints information about the compressed blocks in a BGZF format file, such as a .vcf.gz file or a .bam file.

PrintDistantMates
 
PrintMissingComp
 
PrintReadCounts

Prints (and optionally subsets) an rd (DepthEvidence) file or a counts file as one or more (for multi-sample DepthEvidence files) counts files for CNV determination.

PrintReads

Write reads from SAM format file (SAM/BAM/CRAM) that pass criteria to a new file.

PrintReadsHeader
 
PrintReadsSpark
 
PrintSVEvidence

Merges locus-sorted files of evidence for structural variation into a single output file.

PrintVariantsSpark

Print out variants from a VCF file.

ProcessController

Facade to Runtime.exec() and java.lang.Process.

ProcessControllerAckResult

Command acknowledgements that are returned from a process managed by StreamingProcessController.

ProcessControllerBase<CAPTURE_POLICY extends CapturedStreamOutput>
 
ProcessControllerBase.ProcessStream
 
ProcessOutput
 
ProcessSettings
 
ProgressLogger

Facilitate consistent logging output when progressing through a stream of SAM records.

ProgressMeter

A basic progress meter to print out the number of records processed (and other metrics) during a traversal
 at a configurable time interval.

ProgressReportingDelegatingCodec<A extends htsjdk.tribble.Feature,B>

This class is useful when we want to report progress when indexing.

PropertyUtils

Utility for loading properties files from resources.

ProteinChangeInfo

Class representing the change in the protein sequence for a specific reference/alternate allele pair in a variant.

PSBuildReferenceTaxonomyUtils
 
PSBwaAligner

Loads Bwa index and aligns reads.

PSBwaAlignerSpark

Wrapper class for using the PathSeq Bwa aligner class in Spark.

PSBwaArgumentCollection
 
PSBwaFilter

Aligns using BWA and filters out reads above the minimum coverage and identity.

PSBwaUtils

Utility functions for PathSeq Bwa tool

PSFilter

Performs PathSeq filtering steps and manages associated resources.

PSFilterArgumentCollection
 
PSFilterEmptyLogger

Dummy filter metrics class that does nothing

PSFilterFileLogger

Logs filtering read counts to metrics file

PSFilterLogger

Interface for filter metrics logging

PSFilterMetrics

Metrics that are calculated during the PathSeq filter

PSKmerBloomFilter

Kmer Bloom Filter class that encapsulates the filter, kmer size, and kmer mask

PSKmerBloomFilter.Serializer
 
PSKmerCollection

Classes that provide a way to test kmers for set membership and keep track of the kmer size and mask

PSKmerSet

Kmer Hopscotch set class that encapsulates the filter, kmer size, and kmer mask

PSKmerSet.Serializer
 
PSKmerUtils

PathSeq utilities for kmer libraries

PSPairedUnpairedSplitterSpark

Class for separating paired and unpaired reads in an RDD

PSPathogenAlignmentHit

Stores taxonomic IDs that were hits of a read pair.

PSPathogenReferenceTaxonProperties

Helper class for ClassifyReads that stores the name, taxonomic class and parent, reference length,
 and reference contig names of a given taxon in the pathogen reference.

PSPathogenTaxonScore

Pathogen abundance scores assigned to a taxonomic node and reported by the PathSeqScoreSpark tool.

PSScoreArgumentCollection
 
PSScoreFileLogger

Logs number of mapped and unmapped reads to metrics file

PSScoreLogger

Interface for score metrics logging

PSScoreMetrics

Metrics that are calculated during the PathSeq scoring

PSScorer
 
PSTaxonomyConstants

Important NCBI taxonomy database constants

PSTaxonomyDatabase

Helper class for holding taxonomy data used by ClassifyReads

PSTaxonomyDatabase.Serializer
 
PSTree

Represents a taxonomic tree with nodes assigned a name and taxonomic rank (e.g.

PSTree.Serializer
 
PSTreeNode

Node class for PSTree

PSTreeNode.Serializer
 
PSUtils

Common functions for PathSeq

PushPullTransformer<T>

A class that receives a stream of elements and transforms or filters them in some way, such as by downsampling with
 a Downsampler.

PushToPullIterator<T>

Iterator wrapper around our generic PushPullTransformer interface.

PythonExecutorBase

Base class for services for executing Python Scripts.

PythonExecutorBase.PythonExecutableName

Enum of possible executables that can be launched by this executor.

PythonScriptExecutor

Generic service for executing Python Scripts.

PythonScriptExecutorException

Python script execution exception.

PythonSklearnVariantAnnotationsModel

Given an HDF5 file containing annotations for a training set (in the format specified by
 VariantAnnotationsModel.trainAndSerialize(java.io.File, java.lang.String)), a Python script containing modeling code,
 and a JSON file containing hyperparameters, the PythonSklearnVariantAnnotationsModel.trainAndSerialize(java.io.File, java.lang.String) method can be used to train a model.

PythonSklearnVariantAnnotationsScorer

Given an HDF5 file containing annotations for a test set (in the format specified by
 VariantAnnotationsScorer.score(java.io.File, java.io.File)), a Python script containing scoring code,
 and a file containing a pickled Python lambda function for scoring,
 the PythonSklearnVariantAnnotationsScorer.score(java.io.File, java.io.File) method can be used to generate scores.

QdFilter

Filters out sites that have a QD annotation applied to them and where the QD value is lower than a
 lower limit.

QNameAndInterval

A template name and an intervalId.

QNameAndInterval.Serializer
 
QNameFinder

Class to find the template names associated with reads in specified intervals.

QNameIntervalFinder

Iterates over reads, kmerizing them, and checking the kmers against a set of KmerAndIntervals
 to figure out which intervals (if any) a read belongs in.

QNameKmerizer

Class that acts as a mapper from a stream of reads to a stream of KmerAndIntervals.

QNamesForKmersFinder

Class that acts as a mapper from a stream of reads to a stream of <kmer,qname> pairs for a set of interesting kmers.

QualByDepth

Variant confidence normalized by unfiltered depth of variant samples

QualityScoreCovariate

The Reported Quality Score covariate.

QualityScoreDistribution

Charts quality score distribution within a BAM file.

QualityScoreDistributionSpark

Charts quality score distribution within a BAM file.

QualityUtils

QualityUtils is a static class with some utility methods for manipulating
 quality scores.

QualityYieldMetrics

A set of metrics used to describe the general quality of a BAM file

QualityYieldMetricsArgumentCollection

MetricsArgumentCollection argument collection for QualityYield metrics.

QualityYieldMetricsCollectorSpark

QualityYieldMetricsCollector for Spark.

QualQuantizer

A general algorithm for quantizing quality score distributions to use a specific number of levels

 Takes a histogram of quality scores and a desired number of levels and produces a
 map from original quality scores -> quantized quality scores.

QuantizationInfo

Class that encapsulates the information necessary for quality score quantization for BQSR

QueryRecord

A BigQuery record for a row of results from a query.

QuerySortedReadPairIteratorUtil

A collection of helper utilities for iterating through reads that are in query-name sorted
 read order as pairs

QuerySortedReadPairIteratorUtil.ReadPair
 
RampBase
 
RampBase.Type
 
RampedHaplotypeCaller

This is a specialized HaplotypeCaller tool, designed to allow for breaking the monolithic haplotype
 caller process into smaller discrete steps.

RampedHaplotypeCallerArgumentCollection
 
RampedHaplotypeCallerArgumentCollection.OffRampTypeEnum
 
RampedHaplotypeCallerArgumentCollection.OnRampTypeEnum
 
RampedHaplotypeCallerEngine

This is a specialized haplotype caller engine, designed to allow for breaking the monolithic haplotype
 caller process into smaller discrete steps.

RampUtils
 
RampUtils.GATKReadComparator
 
RampUtils.HaplotypeComparator
 
Range

While structurally identical to CompositeIndex, this class is maintained as it makes code more readable when the two are used together (see QSeqParser)

RankSumTest

Abstract root for all RankSum based annotations

RawGtCount

INFO level annotation of the counts of genotypes with respect to the reference allele.

ReadAnonymizer

Replace bases in reads with reference bases.

ReadBaseStratification

Classes, methods, and enums that deal with the stratification of read bases and reference information.

ReadBaseStratification.BinnedReadCycleStratifier

Stratifies into quintiles of read cycle.

ReadBaseStratification.CigarOperatorsInReadStratifier

Stratifies according to the number of matching cigar operators (from CIGAR string) that the read has.

ReadBaseStratification.CollectionStratifier

A CollectionStratifier is a stratifier that uses a collection of stratifiers to inform the stratification.

ReadBaseStratification.Consensus

Types of consensus reads as determined by the number of duplicates used from
 first and second strands.

ReadBaseStratification.ConsensusStratifier

Stratify by tags used during duplex and single index consensus calling.

ReadBaseStratification.CycleBin

An enum designed to hold a binned version of any probability-like number (between 0 and 1)
 in quintiles
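
As a rough illustration of quintile binning, the sketch below maps a value in [0, 1] to one of five equal-width bins. The class name, the method name, and the treatment of the upper boundary (1.0 falls in the last bin) are assumptions made for this example, not details taken from CycleBin itself.

```java
// Hypothetical sketch: equal-width quintile binning of a probability-like value.
final class QuintileSketch {
    /** Returns a bin index 0..4 for a value in [0, 1]. */
    static int quintileOf(double value) {
        if (Double.isNaN(value) || value < 0.0 || value > 1.0) {
            throw new IllegalArgumentException("value must be in [0, 1]: " + value);
        }
        // Assumption: 1.0 is folded into the top bin rather than opening a sixth one.
        return Math.min(4, (int) (value * 5.0));
    }
}
```

For example, a read cycle at 25% of the read length would land in bin 1 under these assumptions.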

ReadBaseStratification.FlowCellTileStratifier

Stratifies bases by their read's tile, which is parsed from the read name.

ReadBaseStratification.FlowCellXStratifier

Stratifies bases by their read's X coordinate, which is parsed from the read name.

ReadBaseStratification.FlowCellYStratifier

Stratifies bases by their read's Y coordinate, which is parsed from the read name.

ReadBaseStratification.GCContentStratifier

A stratifier that uses GC (of the read) to stratify.

ReadBaseStratification.IndelLengthStratifier

Stratifies according to the length of an insertion or deletion.

ReadBaseStratification.IndelsInReadStratifier

Stratifies according to the number of indel bases (from CIGAR string) that the read has.

ReadBaseStratification.LongShortHomopolymer
 
ReadBaseStratification.LongShortHomopolymerStratifier

Stratify bases according to the type of Homopolymer that they belong to (repeating element, final reference base and
 whether the length is "long" or not).

ReadBaseStratification.MismatchesInReadStratifier

Stratifies according to the overall mismatches (from SAMTag.NM) that the read has against the reference, NOT
 including the current base.

ReadBaseStratification.NsInReadStratifier

Stratify by the number of Ns found in the read.

ReadBaseStratification.PairOrientation

An enum for holding the orientation of a read's read pair (i.e.

ReadBaseStratification.PairStratifier<T extends Comparable<T>,R extends Comparable<R>>

A PairStratifier is a stratifier that uses two other stratifiers to inform the stratification.

ReadBaseStratification.ProperPaired

An enum to hold information about the "properness" of a read pair

ReadBaseStratification.ReadDirection

An enum for holding the direction for a read (positive strand or negative strand).

ReadBaseStratification.ReadOrdinality

An enum to hold the ordinality of a read

ReadBaseStratification.RecordAndOffsetStratifier<T extends Comparable<T>>

The main interface for a stratifier.

ReadCachingIterator

Trivial wrapper around a GATKRead iterator that saves all reads returned in a cache,
 which can be periodically returned and emptied by the client.

ReadClassifier

Figures out what kind of BreakpointEvidence, if any, a read represents.

ReadClipper

A comprehensive clipping tool.

ReadConstants

Constants for use with the GATKRead interface

ReadContextData

ReadContextData is additional data that's useful when processing reads.

ReadCoordinateComparator

Comparator for sorting Reads by coordinate.

ReadCovariates

The object temporarily held by a read that describes all of its covariates.

ReadData

Data for a single end of a paired-end read, a barcode read, or for the entire read if not paired end.

ReadDataManipulationProgramGroup

Tools that manipulate read data in SAM, BAM or CRAM format

ReadDescriptor

Represents one set of cycles in a ReadStructure (e.g.

ReadEnds

Little struct-like class to hold read pair (and fragment) end data for duplicate marking.

ReadEndsForMarkDuplicates

Little struct-like class to hold read pair (and fragment) end data for MarkDuplicatesWithMateCigar

ReadEndsForMarkDuplicatesCodec

Codec for ReadEnds that just outputs the primitive fields and reads them back.

ReadEndsForMarkDuplicatesMap

Interface for storing and retrieving ReadEnds objects.

ReadEndsForMarkDuplicatesWithBarcodes
 
ReadEndsForMarkDuplicatesWithBarcodesCodec

Created by nhomer on 9/13/15.

ReadEndsForMateCigar

A class to store individual records for MarkDuplicatesWithMateCigar.

ReadErrorCorrector
 
ReaderSplitter<T>

Splits a reader by some value.

ReadFilter

Filters which operate on GATKRead should subclass this by overriding ReadFilter.test(GATKRead)

 ReadFilter implements Predicate and Serializable.
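
Because the filter contract is a serializable predicate with a single test method, filters can be written as small subclasses and combined with the standard Predicate combinators. The sketch below mirrors that shape using only the JDK. ReadStub, FilterSketch, MinLengthFilterSketch and NonZeroMapqFilterSketch are hypothetical stand-ins invented for this example; they are not GATK classes, and GATKRead is deliberately not used so that the code compiles on its own.

```java
import java.io.Serializable;
import java.util.function.Predicate;

// Hypothetical stand-in for a read; the real interface is GATKRead.
final class ReadStub {
    final String name;
    final int length;
    final int mappingQuality;

    ReadStub(String name, int length, int mappingQuality) {
        this.name = name;
        this.length = length;
        this.mappingQuality = mappingQuality;
    }
}

// Hypothetical base class shaped like the description: a serializable predicate.
abstract class FilterSketch implements Predicate<ReadStub>, Serializable {
    private static final long serialVersionUID = 1L;

    /** Returns true to keep the read, false to filter it out. */
    @Override
    public abstract boolean test(ReadStub read);
}

// Keeps reads that are at least minLength bases long.
final class MinLengthFilterSketch extends FilterSketch {
    private static final long serialVersionUID = 1L;
    private final int minLength;

    MinLengthFilterSketch(int minLength) {
        this.minLength = minLength;
    }

    @Override
    public boolean test(ReadStub read) {
        return read.length >= minLength;
    }
}

// Keeps reads whose mapping quality is not zero.
final class NonZeroMapqFilterSketch extends FilterSketch {
    private static final long serialVersionUID = 1L;

    @Override
    public boolean test(ReadStub read) {
        return read.mappingQuality != 0;
    }
}
```

With these definitions, `new MinLengthFilterSketch(30).and(new NonZeroMapqFilterSketch())` yields a predicate that keeps a read only when both conditions hold.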

ReadFilter.ReadFilterAnd
 
ReadFilter.ReadFilterBinOp
 
ReadFilterArgumentDefinitions
 
ReadFilteringIterator

An iterator that filters reads from an existing iterator of reads.

ReadFilterLibrary

Standard ReadFilters

ReadFilterLibrary.AllowAllReadsReadFilter

Do not filter out any read.

ReadFilterLibrary.CigarContainsNoNOperator

Filter out reads containing skipped region from the reference (CIGAR strings with 'N' operator).

ReadFilterLibrary.FirstOfPairReadFilter

Keep only reads that are first of pair (0x1 and 0x40).

ReadFilterLibrary.GoodCigarReadFilter

Keep only reads containing good CIGAR strings.

ReadFilterLibrary.HasReadGroupReadFilter

Filter out reads without the SAM record RG (Read Group) tag.

ReadFilterLibrary.MappedReadFilter

Filter out unmapped reads.

ReadFilterLibrary.MappingQualityAvailableReadFilter

Filter out reads without available mapping quality (MAPQ=255).

ReadFilterLibrary.MappingQualityNotZeroReadFilter

Filter out reads with mapping quality equal to zero.

ReadFilterLibrary.MatchingBasesAndQualsReadFilter

Filter out reads where the bases and qualities do not match in length.

ReadFilterLibrary.MateDifferentStrandReadFilter

For paired reads (0x1), keep only reads that are mapped, have a mate that is mapped (read is not 0x8), and both
 the read and its mate are on different strands (when read is 0x20, it is not 0x10), as is the typical case.

ReadFilterLibrary.MateOnSameContigOrNoMappedMateReadFilter

Keep only reads whose mate maps to the same contig (RNEXT is "="), that are single-ended (not 0x1), or that have an unmapped mate (0x8).

ReadFilterLibrary.MateUnmappedAndUnmappedReadFilter

Filter reads whose mate is unmapped as well as unmapped reads.

ReadFilterLibrary.NonChimericOriginalAlignmentReadFilter

If original alignment and mate original alignment tags exist, filter reads that were originally chimeric (mates were on different contigs).

ReadFilterLibrary.NonZeroFragmentLengthReadFilter

Keep only reads with a fragment length (insert size) different from zero.

ReadFilterLibrary.NonZeroReferenceLengthAlignmentReadFilter

Filter out reads that do not align to the reference.

ReadFilterLibrary.NotDuplicateReadFilter

Filter out reads marked as duplicate (0x400).

ReadFilterLibrary.NotProperlyPairedReadFilter

Keep only paired reads that are marked as not properly paired (0x1 and !0x2).

ReadFilterLibrary.NotSecondaryAlignmentReadFilter

Filter out reads representing secondary alignments (0x100).

ReadFilterLibrary.NotSupplementaryAlignmentReadFilter

Filter out reads representing supplementary alignments (0x800).

ReadFilterLibrary.PairedReadFilter

Filter out unpaired reads (not 0x1).

ReadFilterLibrary.PassesVendorQualityCheckReadFilter

Filter out reads failing platform/vendor quality checks (0x200).

ReadFilterLibrary.PrimaryLineReadFilter

Keep only reads representing primary alignments (those that satisfy both the NotSecondaryAlignment and
 NotSupplementaryAlignment filters, or in terms of SAM flag values, must have neither of the 0x100 or
 0x800 flags set).

ReadFilterLibrary.ProperlyPairedReadFilter

Keep only paired reads that are properly paired (0x1 and 0x2).

ReadFilterLibrary.ReadLengthEqualsCigarLengthReadFilter

Filter out reads where the read and CIGAR do not match in length.

ReadFilterLibrary.SecondOfPairReadFilter

Keep only paired reads (0x1) that are second of pair (0x80).

ReadFilterLibrary.SeqIsStoredReadFilter

Keep only reads with sequenced bases.

ReadFilterLibrary.ValidAlignmentEndReadFilter

Keep only reads where the read end corresponds to a proper alignment -- that is, the read ends after the start
 (non-negative number of bases in the reference).

ReadFilterLibrary.ValidAlignmentStartReadFilter

Keep only reads that have a valid alignment start (POS larger than 0) or are unmapped.
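
Several of the ReadFilterLibrary descriptions above are phrased in terms of SAM FLAG bits (0x1, 0x2, 0x40, 0x100, 0x400, 0x800). The sketch below shows how such conditions reduce to bit tests on an integer flag value. It is an illustration only: the class and method names are made up for this example, and it does not reproduce how GATK evaluates these filters on a GATKRead.

```java
// Hypothetical sketch: SAM FLAG bit tests matching the flag values quoted in the listing.
final class SamFlagSketch {
    static final int PAIRED = 0x1;
    static final int PROPER_PAIR = 0x2;
    static final int FIRST_OF_PAIR = 0x40;
    static final int SECOND_OF_PAIR = 0x80;
    static final int SECONDARY = 0x100;
    static final int DUPLICATE = 0x400;
    static final int SUPPLEMENTARY = 0x800;

    // "first of pair (0x1 and 0x40)"
    static boolean isFirstOfPair(int flag) {
        return (flag & PAIRED) != 0 && (flag & FIRST_OF_PAIR) != 0;
    }

    // "properly paired (0x1 and 0x2)"
    static boolean isProperlyPaired(int flag) {
        return (flag & PAIRED) != 0 && (flag & PROPER_PAIR) != 0;
    }

    // "not properly paired (0x1 and !0x2)"
    static boolean isNotProperlyPaired(int flag) {
        return (flag & PAIRED) != 0 && (flag & PROPER_PAIR) == 0;
    }

    // "neither of the 0x100 or 0x800 flags set"
    static boolean isPrimaryLine(int flag) {
        return (flag & (SECONDARY | SUPPLEMENTARY)) == 0;
    }

    // Not marked as duplicate (0x400 unset).
    static boolean isNotDuplicate(int flag) {
        return (flag & DUPLICATE) == 0;
    }
}
```

For instance, a flag of 99 (0x63) has the paired, proper-pair and first-of-pair bits set, so it passes the first-of-pair, properly-paired and primary-line tests in this sketch.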

ReadFilterSparkifier
 
ReadGroupBlackListReadFilter

Keep records that don't match the specified filter string(s).

ReadGroupCovariate

The Read Group covariate.

ReadGroupHasFlowOrderReadFilter

A read filter to test if the read's readGroup has a flow order associated with it

ReadGroupIdSplitter

Splits a reader by read group ID.

ReadGroupReadFilter

Keep only reads from the specified read group.

ReadGroupSplitter<T>

Splits a reader based on a value from a read group.

ReadInputArgumentCollection

An abstract argument collection for use with tools that accept input files containing reads
 (e.g., BAM/SAM/CRAM files).

ReadLengthReadFilter

Keep only reads whose length is ≥ min value and ≤ max value.

ReadlessAssemblyRegion

A cut-down version of AssemblyRegion that doesn't store reads, used in the strict implementation of
 FindAssemblyRegionsSpark.

ReadLikelihoodCalculationEngine

Common interface for assembly-haplotype vs reads likelihood engines.

ReadLikelihoodCalculationEngine.Implementation
 
ReadMetadata

A bag of data about reads:  contig name to id mapping, fragment length statistics by read group, mean length.

ReadMetadata.LibraryRawStatistics
 
ReadMetadata.LibraryRawStatistics.Serializer
 
ReadMetadata.PartitionBounds

A class to track the genomic location of the start of the first and last mapped reads in a partition.

ReadMetadata.PartitionBounds.Serializer
 
ReadMetadata.PartitionStatistics
 
ReadMetadata.PartitionStatistics.Serializer
 
ReadMetadata.Serializer
 
ReadNameEncoder
 
ReadNameParser

Provides access to the physical location information about a cluster.

ReadNameReadFilter

Keep only reads with this read name.

ReadOrientation

Created by tsato on 3/28/18.

ReadOrientationFilter
 
ReadPair

Data structure that contains the set of reads sharing the same queryname, including
 the primary, secondary (i.e.

ReadPileup

Represents a pileup of reads at a given position.

ReadPosition

Median distance of variant starts from ends of reads supporting each alt allele.

ReadPositionFilter
 
ReadPosRankSumTest

Rank Sum Test for relative positioning of REF versus ALT alleles within reads

ReadQueryNameComparator

Compares GATKReads by query name;
 duplicates the exact ordering of SAMRecordQueryNameComparator.

ReadRecalibrationInfo
 
ReadsContext

Wrapper around ReadsDataSource that presents reads overlapping a specific interval to a client,
 without improperly exposing the entire ReadsDataSource interface.

ReadsDataSource

An interface for managing traversals over sources of reads.

ReadsDownsampler

An extension of the basic downsampler API with reads-specific operations

ReadsDownsamplingIterator

Iterator wrapper around our generic ReadsDownsampler interface.

ReadsForQNamesFinder

Find <intervalId,list> pairs for interesting template names.

ReadsKey

Encodes a unique key for read, read pairs and fragments.

ReadsKey.KeyForFragment

Key class for representing relevant duplicate marking identifiers into a single long key for fragment data.

ReadsKey.KeyForPair

Key class for representing relevant duplicate marking identifiers as two long key values for pair data.

ReadsPathDataSource

Manages traversals and queries over sources of reads which are accessible via Paths
 (for now, SAM/BAM/CRAM files only).

ReadsPipelineSpark

ReadsPipelineSpark is our standard pipeline that takes unaligned or aligned reads and runs BWA (if specified), MarkDuplicates,
 BQSR, and HaplotypeCaller.

ReadsSparkSink

ReadsSparkSink writes GATKReads to a file.

ReadsSparkSource

Loads the reads from disk either serially (using samReaderFactory) or in parallel using Hadoop-BAM.

ReadStrandFilter

Keep only reads whose strand is either forward (not 0x10) or reverse (0x10), as specified.

ReadStructure

Describes the intended logical output structure of clusters of an Illumina run.

ReadsWithSameUMI

A container class for a set of reads that share the same unique molecular identifier (UMI) as judged by
 FGBio GroupReadsByUmi (http://fulcrumgenomics.github.io/fgbio/tools/latest/GroupReadsByUmi.html)

 Examples of molecule IDs (MI tag):

 "0/A" (The first molecule in the bam, A strand)
 "0/B" (The first molecule in the bam, B strand)
 "99/A" (100th molecule in the bam, A strand)

 For a given set of reads with the same molecule number, the strand with a larger number of reads is defined as the A strand.
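
The molecule ID examples above follow a "number/strand" pattern. A minimal parsing sketch, assuming that exact shape (a non-negative integer, a slash, then a single A or B), could look like the following. MoleculeIdSketch is a hypothetical name used for illustration and is not part of GATK or fgbio.

```java
// Hypothetical sketch: splitting an MI tag such as "99/A" into molecule number and strand.
final class MoleculeIdSketch {
    final int moleculeNumber;
    final char strand;

    private MoleculeIdSketch(int moleculeNumber, char strand) {
        this.moleculeNumber = moleculeNumber;
        this.strand = strand;
    }

    static MoleculeIdSketch parse(String mi) {
        int slash = mi.indexOf('/');
        // Assumption: exactly one character (the strand) follows the slash.
        if (slash <= 0 || slash != mi.length() - 2) {
            throw new IllegalArgumentException("Expected <number>/<A|B>, got: " + mi);
        }
        char strand = mi.charAt(slash + 1);
        if (strand != 'A' && strand != 'B') {
            throw new IllegalArgumentException("Strand must be A or B, got: " + strand);
        }
        // Integer.parseInt rejects non-numeric molecule numbers with NumberFormatException.
        return new MoleculeIdSketch(Integer.parseInt(mi.substring(0, slash)), strand);
    }
}
```

Under this sketch, "99/A" parses to molecule number 99 on strand A, which the listing describes as the 100th molecule because numbering starts at 0.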

ReadsWriteFormat

Possible output formats when writing reads.

ReadTagValueFilter

Keep only reads that contain a tag with a value that agrees with parameters as specified.

ReadTagValueFilter.Operator
 
ReadThreadingAssembler
 
ReadThreadingAssemblerArgumentCollection

Set of arguments related to the ReadThreadingAssembler

ReadThreadingGraph

Note: not final but only intended to be subclassed for testing.

ReadTransformer

Classes which perform transformations from GATKRead -> GATKRead should implement this interface by overriding SerializableFunction<GATKRead,GATKRead>#apply(GATKRead)

ReadTransformerSparkifier
 
ReadTransformingIterator

An iterator that transforms read (i.e.

ReadType

A read type describes a stretch of cycles in a ReadStructure
 (e.g.

ReadUtils

A miscellaneous collection of utilities for working with reads, headers, etc.

ReadWalker

A ReadWalker is a tool that processes a single read at a time from one or multiple sources of reads, with
 optional contextual information from a reference and/or sets of variants/Features.

ReadWalkerContext

Encapsulates a GATKRead with its ReferenceContext and FeatureContext.

ReadWalkerSpark

A Spark version of ReadWalker.

RealignmentArgumentCollection
 
RealignmentEngine
 
ReblockGVCF

Condense homRef blocks in a single-sample GVCF

ReblockGVCF.AlleleLengthComparator
 
ReblockingGVCFBlockCombiner

Combines variants into GVCF blocks.

ReblockingGVCFWriter
 
ReblockingOptions
 
RecalDatum

An individual piece of recalibration data.

RecalibrationArgumentCollection

A collection of the arguments that are used for BQSR.

RecalibrationReport

This class has all the static functionality for reading a recalibration report file into memory.

RecalibrationTables

Utility class to facilitate base quality score recalibration.

RecalUtils

This helper class holds the data HashMap as well as submaps that represent the marginal distributions collapsed over all needed dimensions.

ReducibleAnnotation

An interface for annotations that are calculated using raw data across samples, rather than the median (or median of median) of sample values.
 The Raw annotation keeps some summary (one example might be a histogram of the raw values for each sample) of the individual sample (or allele)
 level annotation.

ReducibleAnnotationData<T>

A class to encapsulate the raw data for classes compatible with the ReducibleAnnotation interface

ReferenceArgumentCollection

Base interface for a reference argument collection.

ReferenceBases

Local reference context at a variant position.

ReferenceBases

ReferenceBases stores the bases of the reference genome for a particular interval.

ReferenceBlockConcordance

Evaluate GVCF reference block concordance of an input GVCF against a truth GVCF.

ReferenceConfidenceMode

Reference confidence emission modes.

ReferenceConfidenceModel

Code for estimating the reference confidence

 This code can estimate the probability that the data for a single sample is consistent with a
 well-determined REF/REF diploid genotype.

ReferenceConfidenceResult

Holds information about a genotype call of a single sample reference vs.

ReferenceConfidenceUtils
 
ReferenceConfidenceVariantContextMerger

Variant context utilities related to merging variant-context instances.

ReferenceConfidenceVariantContextMerger.VCWithNewAlleles
 
ReferenceContext

Wrapper around ReferenceDataSource that presents data from a specific interval/window to a client,
 without improperly exposing the entire ReferenceDataSource interface.

ReferenceDataSource

Manages traversals and queries over reference data.

ReferenceFileSource

Manages traversals and queries over reference data (for now, fasta files only)

 Supports targeted queries over the reference by interval, but does not
 yet support complete iteration over the entire reference.

ReferenceFileSparkSource

Class to load a reference sequence from a fasta file on Spark.

ReferenceHadoopSparkSource

Class to load a reference sequence from a fasta file on HDFS.

ReferenceInputArgumentCollection

An abstract ArgumentCollection for specifying a reference sequence file

ReferenceMemorySource

Manages traversals and queries over in-memory reference data.

ReferenceMultiSparkSource

Wrapper to load a reference sequence from a file stored on HDFS, GCS, or locally.

ReferencePair

Class representing a pair of references and their differences.

ReferencePair.Status
 
ReferenceProgramGroup

Tools that analyze and manipulate FASTA format references

ReferenceSequenceTable

Table utilized by CompareReferences tool to compare and analyze sequences found in specified references.

ReferenceShard

ReferenceShard is a section of the reference genome that's used for sharding work for pairing things with
 the reference.

ReferenceSparkSource

Internal interface to load a reference sequence.

ReferenceTwoBitSparkSource

A ReferenceSource impl that is backed by a .2bit representation of a reference genome.

ReferenceUtils

A collection of static methods for dealing with references.

ReferenceWalker

A reference walker is a tool which processes each base in a given reference.

ReferenceWindowFunctions

A library of reference window functions suitable for passing in to transforms such as AddContextDataToRead.

ReferenceWindowFunctions.FixedWindowFunction

A function for requesting a fixed number of extra bases of reference context on either side
 of each read.

RefFlatReader

Loads gene annotations from a refFlat file into an OverlapDetector.

RefFlatReader.RefFlatColumns
 
ReflectionUtil

Class which contains utility functions that use reflection.

RefSeqCodec

Allows for reading in RefSeq information
 TODO this header needs to be rewritten

RefSeqFeature

The ref seq feature.

RefSeqTranscript

Created by IntelliJ IDEA.

RefVsAnyResult

Holds information about a genotype call of a single sample reference vs.

RemoveNearbyIndels

Remove indels that are close to another indel from a vcf file.

RenameSampleInVcf

Renames a sample within a VCF or BCF.

ReorderSam

Reorders a SAM/BAM input file according to the order of contigs in a second reference file.

ReplaceSamHeader
 
RepresentativeReadIndexer

Little struct-like class to hold a record index, the index of the corresponding representative read, and duplicate set size information.

RepresentativeReadIndexerCodec

Codec for read names and integers that outputs the primitive fields and reads them back.

RequiredFeatureInputArgumentCollection

An argument collection for use with tools that accept one or more input files containing Feature records
 (e.g., BED files, hapmap files, etc.), and require at least one such input.

RequiredIntervalArgumentCollection

An ArgumentCollection that requires one or more intervals be specified with -L at the command line

RequiredOutputArgumentCollection
 
RequiredReadInputArgumentCollection

An argument collection for use with tools that accept one or more input files containing reads
 (e.g., BAM/SAM/CRAM files), and require at least one such input.

RequiredReferenceArgumentCollection

Argument collection for references that are required (and not common).

RequiredReferenceInputArgumentCollection

An argument collection for use with tools that require a reference file as input.

RequiredStratification
 
RequiredVariantInputArgumentCollection

An argument collection for use with tools that accept one or more input files containing VariantContext records
 (e.g., VCF files), and require at least one such input.

ReservoirDownsampler

Reservoir Downsampler: Selects n reads out of a stream whose size is not known in advance, with
 every read in the stream having an equal chance of being selected for inclusion.
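
Selecting n items uniformly from a stream of unknown length is classically done with reservoir sampling. The sketch below shows the textbook form (often called Algorithm R) with a generic item type. It is a generic illustration, not the GATK class: ReservoirSketch, submit and selected are hypothetical names, and the sketch assumes the stream holds fewer than 2^31 items.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch of textbook reservoir sampling (Algorithm R).
final class ReservoirSketch<T> {
    private final int capacity;
    private final Random rng;
    private final List<T> reservoir;
    private int seen = 0; // assumption: fewer than 2^31 items are submitted

    ReservoirSketch(int capacity, long seed) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
        this.rng = new Random(seed);
        this.reservoir = new ArrayList<>(capacity);
    }

    /** Offers one item from the stream to the reservoir. */
    void submit(T item) {
        seen++;
        if (reservoir.size() < capacity) {
            // Fill phase: the first `capacity` items are always kept.
            reservoir.add(item);
        } else {
            // Replacement phase: item number `seen` is kept with probability capacity / seen.
            int slot = rng.nextInt(seen); // uniform in [0, seen)
            if (slot < capacity) {
                reservoir.set(slot, item);
            }
        }
    }

    /** Returns a copy of the currently selected items. */
    List<T> selected() {
        return new ArrayList<>(reservoir);
    }
}
```

Memory use stays bounded by the reservoir capacity regardless of how many items pass through, which is the property that makes this approach suitable for streams.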

Resource

Stores a resource by path and a relative class.

RevertBaseQualityScores
 
RevertOriginalBaseQualitiesAndAddMateCigar

This tool reverts the original base qualities (if specified) and adds the mate cigar tag to mapped SAM, BAM or CRAM files.

RevertOriginalBaseQualitiesAndAddMateCigar.CanSkipSamFile

Used as a return for the canSkipSAMFile function.

RevertSam

Reverts a SAM file by optionally restoring original quality scores and by removing
 all alignment information.

RevertSam.FileType
 
RevertSamSpark

Reverts a SAM file by optionally restoring original quality scores and by removing
 all alignment information.

RevertSamSpark.FileType
 
RExecutor

Util class for executing R scripts.

RMSMappingQuality

Root Mean Square of the mapping quality of reads across all samples.
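
As a reminder of the underlying arithmetic, the root mean square of n mapping qualities is the square root of the mean of their squares. The sketch below computes that quantity for a plain array. It is a hypothetical helper for illustration only; which reads the annotation counts, and how it stores raw values, are not described in this listing and are not modeled here.

```java
// Hypothetical sketch: root mean square of a set of mapping qualities.
final class RmsSketch {
    static double rootMeanSquare(int[] mappingQualities) {
        if (mappingQualities.length == 0) {
            throw new IllegalArgumentException("at least one mapping quality is required");
        }
        double sumOfSquares = 0.0;
        for (int mq : mappingQualities) {
            sumOfSquares += (double) mq * mq;
        }
        return Math.sqrt(sumOfSquares / mappingQualities.length);
    }
}
```

Because the values are squared before averaging, a few high mapping qualities pull the result above the arithmetic mean.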

RnaSeqMetrics

Metrics about the alignment of RNA-seq reads within a SAM file to genes, produced by the CollectRnaSeqMetrics
 program and usually stored in a file with the extension ".rna_metrics".

RnaSeqMetricsCollector
 
RnaSeqMetricsCollector.StrandSpecificity
 
RrbsCpgDetailMetrics

Holds information about CpG sites encountered for RRBS processing QC

RrbsMetricsCollector
 
RrbsSummaryMetrics

Holds summary statistics from RRBS processing QC

RScriptExecutor

Generic service for executing RScripts

RScriptExecutorException
 
RScriptLibrary

Libraries embedded in the StingUtils package.

RuntimeUtils
 
SamAlignmentMerger

Class that takes in a set of alignment information in SAM format and merges it with the set
 of all reads for which alignment was attempted, stored in an unmapped SAM file.

SamAlignmentMerger

Class that takes in a set of alignment information in SAM format and merges it with the set
 of all reads for which alignment was attempted, stored in an unmapped SAM file.

SamComparison

Compare two SAM/BAM files.

SamComparison.AlignmentComparison
 
SAMComparisonArgumentCollection

Argument collection for SAM comparison

SamComparisonMetric

Metric for results of SamComparison.

SAMFileDestination

Class used to direct output from a HaplotypeBAMWriter to a bam/sam file.

SAMFileGATKReadWriter

A GATKRead writer that writes to a SAM/BAM file.

SamFormatConverter

Converts a BAM file to human-readable SAM output or vice versa

SAMPileupCodec

Decoder for single sample SAM pileup data.

SAMPileupElement

Simple representation of a single base with associated quality from a SAM pileup

SAMPileupFeature

A tribble feature representing a SAM pileup.

Sample

Stratifies the eval RODs by each sample in the eval ROD.

Sample

Represents an individual under study.

SampleDB

Simple database for managing samples

SampleDBBuilder

Class for creating a temporary in memory database of samples.

SampleList

List samples that are non-reference at a given variant site

SampleList

An immutable, indexed set of samples.

SampleLocatableMetadata

Interface for marking objects that contain metadata associated with a collection of locatables
 associated with a single sample.

SampleMetadata

Interface for marking objects that contain metadata associated with a single sample.

SampleNameMap

A class to hold the mappings of sample names to VCF / VCF index paths.

SampleNameSplitter

Splits a reader by sample name.

SamplePairExtractor

Utility class to determine the tumor-normal pairs that make up a VCF Header

SampleReadFilter

Keep only reads for a given sample.

SamReaderQueryingIterator

An iterator that allows for traversals over a SamReader restricted to a set of intervals, unmapped reads,
 or both.

SAMRecordAndReference
 
SAMRecordAndReference
 
SAMRecordAndReferenceMultiLevelCollector<BEAN extends htsjdk.samtools.metrics.MetricBase,HKEY extends Comparable<HKEY>>
 
SAMRecordAndReferenceMultiLevelCollector<BEAN extends htsjdk.samtools.metrics.MetricBase,HKEY extends Comparable>
 
SAMRecordMultiLevelCollector<BEAN extends htsjdk.samtools.metrics.MetricBase,HKEY extends Comparable<HKEY>>

Defines a MultilevelPerRecordCollector using the argument type of SAMRecord so that this doesn't have to be redefined for each subclass of MultilevelPerRecordCollector

SAMRecordMultiLevelCollector<BEAN extends htsjdk.samtools.metrics.MetricBase,HKEY extends Comparable>

Defines a MultilevelPerRecordCollector using the argument type of SAMRecord so that this doesn't have to be redefined for each subclass of MultilevelPerRecordCollector

SAMRecordSerializer

Efficient serializer for SAMRecords that uses SAMRecordSparkCodec for encoding/decoding.

SAMRecordSparkCodec

A class that uses a slightly adapted version of BAMRecordCodec for serialization/deserialization of SAMRecords.

SAMRecordToGATKReadAdapter

Implementation of the GATKRead interface for the SAMRecord class.

SAMRecordToGATKReadAdapterSerializer

Efficient serializer for SAMRecordToGATKReadAdapters that uses SAMRecordSparkCodec for encoding/decoding.

SAMRecordToReadIterator

Wraps a SAMRecord iterator within an iterator of GATKReads.

SamRecordWithOrdinalAndSetDuplicateReadFlag

This class sets the duplicate read flag as the result state when examining sets of records.

SamToBfqWriter

Class to take unmapped reads in SAM/BAM/CRAM file format and create Maq binary fastq format file(s) --
 one or two of them, depending on whether it's a paired-end read.

SamToFastq

 Extracts read sequences and qualities from the input SAM/BAM file and writes them into
 the output file in Sanger FASTQ format.

SamToFastqWithTags

 Extracts read sequences and qualities from the input SAM/BAM file and SAM/BAM tags and writes them into
 output files in Sanger FASTQ format.

SATagBuilder

A builder class that expands functionality for SA tags.

ScatterIntervalsByNs

A Tool for breaking up a reference into intervals of alternating regions of N and ACGT bases.

ScatterIntervalsByNs.ScatterIntervalsByNReferenceArgumentCollection
 
ScoreVariantAnnotations

Scores variant calls in a VCF file based on site-level annotations using a previously trained model.

ScriptExecutor

Base class for executors that find and run scripts in an external script engine process (R, Python, etc).

ScriptExecutorException

Base type for exceptions thrown by the ScriptExecutor.

SegmentedCpxVariantSimpleVariantExtractor

For extracting simple variants from input GATK-SV complex variants.

SegmentedCpxVariantSimpleVariantExtractor.ExtractedSimpleVariants
 
SegmentedCpxVariantSimpleVariantExtractor.MultiSegmentsCpxVariantExtractor
 
SegmentedCpxVariantSimpleVariantExtractor.RelevantAttributes
 
SegmentedCpxVariantSimpleVariantExtractor.ZeroAndOneSegmentCpxVariantExtractor
 
SegmentExonOverlaps

Class that represents the exon numbers overlapped by a genomic region.

SegmentExonUtils
 
SelectVariants

Select a subset of variants from a VCF file

SeqGraph

A graph that contains base sequence at each node

SequenceComparison

A simple data object to hold a comparison between a reference sequence and an alternate allele.

SequenceDictionaryUtils

A series of utility functions that enable the GATK to compare two sequence dictionaries -- from the reference,
 from BAMs, or from feature sources -- for consistency.

SequenceDictionaryUtils

Class with helper methods for generating and writing SequenceDictionary objects.

SequenceDictionaryUtils.SamSequenceRecordsIterator
 
SequenceDictionaryUtils.SequenceDictionaryCompatibility
 
SequenceDictionaryValidationArgumentCollection

interface for argument collections that control how sequence dictionary validation should be handled

SequenceDictionaryValidationArgumentCollection.NoValidationCollection

doesn't provide a configuration argument, and always returns false, useful for tools that do not want to perform
 sequence dictionary validation, like aligners

SequenceDictionaryValidationArgumentCollection.StandardValidationCollection

most tools will want to use this, it defaults to performing sequence dictionary validation but provides the option
 to disable it

SequencerFlowClass

In broad terms, each sequencing platform can be classified by whether it flows nucleotides in some order
 such that homopolymers get sequenced in a single event (i.e., 454 or Ion) or it reads each position in the
 sequence one at a time, regardless of base composition (Illumina or Solid).

SequencingArtifactMetrics
 
SequencingArtifactMetrics.BaitBiasDetailMetrics

Bait bias artifacts broken down by context.

SequencingArtifactMetrics.BaitBiasSummaryMetrics

Summary analysis of a single bait bias artifact, also known as a reference bias artifact.

SequencingArtifactMetrics.PreAdapterDetailMetrics

Pre-adapter artifacts broken down by context.

SequencingArtifactMetrics.PreAdapterSummaryMetrics

Summary analysis of a single pre-adapter artifact.

SeqVertex

A graph vertex containing a sequence of bases and a unique ID that
 allows multiple distinct nodes in the graph to have the same sequence.

SerializableConsumer<T>
 
SerializableFunction<T,R>

Represents a Function that is Serializable.

SerializablePredicate<T>
 
SerializableSupplier<T>
 
SetNmAndUqTags
Deprecated.
SetNmMdAndUqTags

Fixes the NM, MD, and UQ tags in a SAM or BAM file.

SetSizeUtils

Set size utility

Sex

ENUM of possible human sexes: male, female, or unknown

Sex

Represents the sex of an individual.

Shard<T>

A Shard of records of type T covering a specific genomic interval, optionally expanded by a configurable
 amount of padded data, that provides the ability to iterate over its records.

ShardBoundary

Holds the bounds of a Shard, both with and without padding

ShardBoundaryShard<T>

A Shard backed by a ShardBoundary and a collection of records.

ShardedIntervalIterator

Iterator that will break up each input interval into shards.

ShardingVCFWriter

Variant writer that splits output to multiple VCFs given the maximum records per file.

ShardToMultiIntervalShardAdapter<T>

adapts a normal Shard into a MultiIntervalShard that contains only the single wrapped shard

 this is a temporary shim until we can fully adopt MultiIntervalShard in HaplotypeCallerSpark

SharedSequenceMerger

Merges the incoming vertices of a vertex V of a graph

 Looks at the vertices that are incoming to V (i.e., have an outgoing edge connecting to V).

SharedVertexSequenceSplitter

Split a collection of middle nodes in a graph into their shared prefix and suffix values

 This code performs the following transformation.

ShiftFasta

Create a fasta with the bases shifted by offset

 delta1 = offset - 1
 delta2 = total - delta1

 To shift forward:
 if you are given a position in the regular fasta (pos_r) and want the position in the shifted fasta (pos_s):
 if pos_r > delta1 => pos_s = pos_r - delta1  ==   pos_r - offset +1
   otherwise          pos_s = pos_r + delta2  ==   pos_r + total - offset + 1

 To shift back:
 if you are given a position in the shifted fasta (pos_s) and want the position in the regular fasta (pos_r):
 if pos_s > delta2 => pos_r = pos_s - delta2  ==   pos_s - total + offset - 1
   otherwise          pos_r = pos_s + delta1  ==   pos_s + offset - 1

 Example command line:
 ShiftFasta
 -R "<CIRCULAR_REFERENCE.fasta>"          // the reference to shift
 -O "<SHIFTED_REFERENCE.fasta>"           // output; the shifted fasta
 --shift-back-output "<SHIFT_BACK.chain>" // output; the shiftback chain file to use when lifting over
 --shift-offset-list ""    // optional; Specifies the offset to shift for each contig in the reference.
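
The shift-forward and shift-back rules above can be sketched in a few lines of Java. This is only an illustration of the arithmetic, not the tool's implementation: the class and method names are hypothetical, positions are assumed to be 1-based, and `total` is assumed to be the length of the contig being shifted.

```java
public final class ShiftedCoordinates {
    private ShiftedCoordinates() {}

    /** Regular fasta position (pos_r) to shifted fasta position (pos_s). */
    public static long toShifted(long posR, long offset, long total) {
        check(posR, offset, total);
        long delta1 = offset - 1;       // bases moved from the front to the back
        long delta2 = total - delta1;   // bases that stay in front after the shift
        return posR > delta1 ? posR - delta1 : posR + delta2;
    }

    /** Shifted fasta position (pos_s) back to regular fasta position (pos_r). */
    public static long toRegular(long posS, long offset, long total) {
        check(posS, offset, total);
        long delta1 = offset - 1;
        long delta2 = total - delta1;
        return posS > delta2 ? posS - delta2 : posS + delta1;
    }

    // Assumption for this sketch: positions and offsets are 1-based and within the contig.
    private static void check(long pos, long offset, long total) {
        if (total < 1 || offset < 1 || offset > total || pos < 1 || pos > total) {
            throw new IllegalArgumentException("position and offset must lie in [1, total]");
        }
    }

    public static void main(String[] args) {
        long total = 10;
        long offset = 4;
        for (long posR = 1; posR <= total; posR++) {
            long posS = toShifted(posR, offset, total);
            System.out.printf("regular %d -> shifted %d -> regular %d%n",
                    posR, posS, toRegular(posS, offset, total));
        }
    }
}
```

With `total = 10` and `offset = 4`, regular position 4 becomes the first base of the shifted contig, and applying `toRegular` to any shifted position returns the original position.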

ShortVariantDiscoveryProgramGroup

Tools that perform variant calling and genotyping for short variants (SNPs, SNVs and Indels)

SimpleAnnotatedIntervalWriter

Callers must call SimpleAnnotatedIntervalWriter.writeHeader(org.broadinstitute.hellbender.tools.copynumber.utils.annotatedinterval.AnnotatedIntervalHeader) before SimpleAnnotatedIntervalWriter.add(org.broadinstitute.hellbender.tools.copynumber.utils.annotatedinterval.AnnotatedInterval).

SimpleChimera

Conceptually, a simple chimera represents the junction on
 AssemblyContigWithFineTunedAlignments that have
 exactly two good alignments.

SimpleChimera.DistancesBetweenAlignmentsOnRefAndOnRead

Struct to represent the (distance - 1) between boundaries of the two alignments represented by this CA,
 on reference, and on read.

SimpleChimera.Serializer
 
SimpleCopyRatioCaller

This caller is loosely based on the legacy ReCapSeg caller that was originally implemented in ReCapSeg v1.4.5.0,
 but introduces major changes.

SimpleCount

Represents a count at an interval.

SimpleCountCodec
 
SimpleCountCollection

Simple data structure to pass and read/write a List of SimpleCount objects.

SimpleCountCollection.SimpleCountTableColumn

CONTIG, START, END, COUNT

 Note: Unlike the package-private enums in other collection classes, this enum and its
 TableColumnCollection are public so that they can be accessed by SimpleCountCodec,
 which must be in org.broadinstitute.hellbender.utils.codecs to be discovered as a codec.

SimpleErrorCalculator

A calculator that estimates the error rate of the bases it observes, assuming that the reference is truth.

SimpleGermlineTagger

This utility class performs a simple tagging of germline segments in a tumor segments file.

SimpleInterval

Minimal immutable class representing a 1-based closed ended genomic interval
 SimpleInterval does not allow null contig names.
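
As a rough illustration of what "1-based closed ended" means in practice, the sketch below models such an interval with a hypothetical stand-in class (`ClosedIntervalSketch` is not part of GATK). The validation rules shown (start at least 1, end not before start) are assumptions made for the example.

```java
import java.util.Objects;

public final class ClosedIntervalSketch {
    private final String contig;
    private final int start; // 1-based, inclusive
    private final int end;   // 1-based, inclusive

    public ClosedIntervalSketch(String contig, int start, int end) {
        // Null contig names are rejected, mirroring the description above.
        this.contig = Objects.requireNonNull(contig, "contig must not be null");
        if (start < 1 || end < start) {
            throw new IllegalArgumentException("need 1 <= start <= end");
        }
        this.start = start;
        this.end = end;
    }

    /** Both ends are included, so the length is end - start + 1. */
    public int size() {
        return end - start + 1;
    }

    /** Closed intervals that share a single base still overlap. */
    public boolean overlaps(ClosedIntervalSketch other) {
        return contig.equals(other.contig) && start <= other.end && other.start <= end;
    }

    public static void main(String[] args) {
        ClosedIntervalSketch a = new ClosedIntervalSketch("chr1", 100, 200);
        ClosedIntervalSketch b = new ClosedIntervalSketch("chr1", 200, 300);
        System.out.println("size of a: " + a.size());
        System.out.println("a overlaps b: " + a.overlaps(b));
    }
}
```

The point of the example is the off-by-one behavior: an interval from 100 to 200 covers 101 bases, and two intervals that meet at base 200 overlap.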

SimpleIntervalCollection

Represents a collection of SimpleInterval.

SimpleKeyXsvFuncotationFactory

Factory for creating TableFuncotations by handling `Separated Value` files with arbitrary delimiters
 (e.g.

SimpleKeyXsvFuncotationFactory.XsvDataKeyType
 
SimpleLocatableMetadata

Metadata associated with a collection of locatables.

SimpleMarkDuplicatesWithMateCigar

This is a simple tool to mark duplicates using the DuplicateSetIterator, DuplicateSet, and SAMRecordDuplicateComparator.

SimpleNovelAdjacencyAndChimericAlignmentEvidence

Simply a wrapper to link together
 NovelAdjacencyAndAltHaplotype and
 evidence SimpleChimera's.

SimpleNovelAdjacencyAndChimericAlignmentEvidence.ChimericContigAlignmentEvidenceAnnotations

Utility structs for extracting information from the consensus NovelAdjacencyAndAltHaplotype out of multiple ChimericAlignments,
 to be later added to annotations of the VariantContext extracted.

SimpleNovelAdjacencyAndChimericAlignmentEvidence.Serializer
 
SimpleNovelAdjacencyInterpreter

This deals with the special case where a contig has exactly two alignments
 and seemingly has the complete alt haplotype assembled.

SimpleRepeatMaskTransformer

Masks read bases with a supra-threshold number of A/T's or G/C's within a given window size.

SimpleSampleLocatableMetadata

Metadata associated with a collection of locatables associated with a single sample.

SimpleSampleMetadata

Metadata associated with a single sample.

SimpleSVD

Simple implementation of the SVD interface for storing the matrices (and vector) of a SVD result.

SimpleSVType
 
SimpleSVType.Deletion
 
SimpleSVType.DuplicationInverted
 
SimpleSVType.DuplicationTandem
 
SimpleSVType.ImpreciseDeletion
 
SimpleSVType.Insertion
 
SimpleSVType.Inversion
 
SimpleSVType.SupportedType
 
SimpleTsvOutputRenderer

This class is very versatile, but as a result, it must do some lazy loading after it receives the first write command.

SimpleXSVWriter

A simple TSV/CSV/XSV writer with support for writing in the cloud with configurable delimiter.

SingleBarcodeDistanceMetric

A class for finding the distance between a single barcode and a barcode-read (with base qualities)

SingleFileLocationTranslator
 
SinglePassSamProgram

Super class that is designed to provide some consistent structure between subclasses that
 simply iterate once over a coordinate sorted BAM and collect information from the records
 as they go in order to produce some kind of output.

SingleSequenceReferenceAligner<T,U>

Encompasses an aligner to a single-sequence reference.

SingleSequenceReferenceAligner.TriFunction<T,U,V,W>
 
SingularValueDecomposer

Perform singular value decomposition (and pseudoinverse calculation).

SiteDepth

The read depth of each base call for a sample at some locus.

SiteDepthBCICodec

Codec to handle SiteDepths in BlockCompressedInterval files

SiteDepthCodec

Codec to handle SiteDepths in tab-delimited text files

SiteDepthSortMerger

Imposes additional ordering of same-locus SiteDepth records by sample.

SiteDepthtoBAF

Merges locus-sorted SiteDepth evidence files, and calculates the bi-allelic frequency (baf)
 for each sample and site, and writes these values as a BafEvidence output file.

SliceSampler

Implements slice sampling of a continuous, univariate, unnormalized probability density function (PDF),
 which is assumed to be unimodal.

SmithWatermanAligner

Interface and factory for Smith-Waterman aligners

SmithWatermanAligner.Implementation
 
SmithWatermanAlignment
 
SmithWatermanAlignmentConstants

This class collects the various SWParameters that are used for various alignment procedures.

SmithWatermanIntelAligner

Converts an instance of SWAlignerNativeBinding into a SmithWatermanIntelAligner.
 This is optimized for Intel architectures; if the machine does not support AVX, it fails and throws a UserException.

SmithWatermanJavaAligner

Pairwise discrete Smith-Waterman alignment implemented in pure Java

 ************************************************************************
 ****                    IMPORTANT NOTE:                             ****
 ****  This class assumes that all bytes come from UPPERCASED chars! ****
 ************************************************************************

SmithWatermanJavaAligner.State

The state of a trace step through the matrix

Snp

Class to represent a SNP in context of a haplotype block that is used in fingerprinting.

SnpEffPositionModifier

Stratifies variants as genes or coding regions, according to the effect modifier, as indicated by snpEff.

SnpEffPositionModifier.PositionModifier
 
SnpEffUtil

Created with IntelliJ IDEA.

SnpEffUtil.EffectFunctionalClass
 
SnpEffUtil.EffectType
 
SNVMapper

An implementation of a feature mapper that finds SNPs (SNVs)

 This class only finds SNPs that are surrounded by a specific number of bases identical to the reference.

SoftClippedReadFilter

Filter out reads where the ratio of soft-clipped bases to total bases exceeds some given value.

SomaticClusteringModel

A model for the allele fraction spectrum of somatic variation.

SomaticGenotypingArgumentCollection
 
SomaticGenotypingEngine
 
SomaticGVCFBlockCombiner
 
SomaticGVCFWriter

Genome-wide VCF writer for somatic (Mutect2) output
 Merges reference blocks based on TLOD

SomaticLikelihoodsEngine

Created by David Benjamin on 3/9/17.

SomaticModelingArgumentCollection
 
SomaticReferenceConfidenceModel
 
SomaticSegmentationArgumentCollection
 
SortableJexlVCMatchExp
 
SortedBasecallsConverter<CLUSTER_OUTPUT_RECORD>

SortedBasecallsConverter utilizes an underlying IlluminaDataProvider to convert parsed and decoded sequencing data
 from standard Illumina formats to specific output records (FASTA records/SAM records).

SortGff

 Summary

SortSam

Sorts a SAM or BAM file.

SortSamSpark

SortSam on Spark (works on SAM/BAM/CRAM)

SortVcf

Sorts one or more VCF files according to the order of the contigs in the header/sequence dictionary and then
 by coordinate.

SparkCommandLineArgumentCollection

Command line arguments needed for configuring a spark context

SparkCommandLineProgram
 
SparkContextFactory

Manages creation of the Spark context.

SparkConverter

Class with helper methods to convert objects (mostly matrices) to/from Spark (particularly, in MLLib)

SparkSharder

Utility methods for sharding Locatable objects (such as reads) for given intervals, without using a shuffle.

SparkSingularValueDecomposer

SVD using MLLib

SparkUtils

Miscellaneous Spark-related utilities

SplitCRAM

SplitCRAM - split a cram file into smaller cram files (shards) containing a minimal number of records
 while still respecting container boundaries.

SplitIntervals

This tool takes in intervals via the standard arguments of
 IntervalArgumentCollection and splits them into interval files for scattering.

SplitNCigarReads

Splits reads that contain Ns in their cigar string (e.g.

SplitReadEvidence

Documents evidence of reads (of some sample at some locus) that align well to reference for
 some portion of the read, and fail to align for another portion of the read.

SplitReadEvidenceBCICodec

Codec to handle SplitReadEvidence in BlockCompressedInterval files

SplitReadEvidenceCodec

Codec to handle SplitReadEvidence in tab-delimited text files

SplitReadEvidenceSortMerger

Imposes additional ordering of same-locus SplitReadEvidence by sample and strand.

SplitReads

Outputs reads from a SAM/BAM/CRAM by read group, sample and library name

SplitSamByLibrary

Command-line program to split a SAM/BAM/CRAM file into separate files based on
 library name.

SplitSamByNumberOfReads


 Splits the input queryname sorted or query-grouped SAM/BAM/CRAM file and writes it into
 multiple BAM files, each with an approximately equal number of reads.

SplitVcfs

Splits the input VCF file into two, one for indels and one for SNPs.

StandardAnnotation

This is a marker interface used to indicate which annotations are "Standard".

StandardArgumentDefinitions

A set of String constants in which the name of the constant (minus the _SHORT_NAME suffix)
 is the standard long Option name, and the value of the constant is the standard shortName.

StandardCallerArgumentCollection

This is pulled out so that every caller isn't exposed to the arguments from every other caller.

StandardCovariateList

Represents the list of standard BQSR covariates.

StandardEval
 
StandardFlowBasedAnnotation

This is a marker interface used to indicate which annotations are part of the standard flow based group

StandardHCAnnotation

This is a marker interface used to indicate which annotations are "Standard" for the HaplotypeCaller only.

StandardMutectAnnotation

This is a marker interface used to indicate which annotations are "Standard" for Mutect2 only.

StandardOptionDefinitions

A set of String constants in which the name of the constant (minus the _SHORT_NAME suffix)
 is the standard long Option name, and the value of the constant is the standard shortName.

StandardPairHMMInputScoreImputator

Standard or classic pair-hmm score imputator.

StandardStratification
 
StorageAPIAvroReader
 
Strand
 
Strand.Serializer
 
StrandArtifactFilter
 
StrandArtifactFilter.EStep
 
StrandBiasBySample

Number of forward and reverse reads that support REF and ALT alleles

StrandBiasTest

Class of tests to detect strand bias.

StrandBiasUtils

Common strand bias utilities used by allele specific strand bias annotators

StrandCorrectedAllele

Class to represent a strand-corrected Allele.

StrandCorrectedReferenceBases

Simple container class to represent bases that have been corrected for strandedness already.

StrandedInterval

Represents an interval and strand from the reference genome.

StrandedInterval.Serializer
 
StrandOddsRatio

Strand bias estimated by the Symmetric Odds Ratio test

StrandSwitch

For symbolizing the change of strand from one alignment to the next
 of an assembly contig.

StratificationManager<K extends Stratifier<Object>,V>

Represents the full state space of all stratification combinations

StratificationManager.Combiner<V>
 
Stratifier<T>

A basic interface for a class to be used with the StratificationManager system

STRDecimationTable

Represents a decimation table.

StreamingProcessController

Facade to Runtime.exec() and java.lang.Process.

StreamingPythonScriptExecutor<T>

Python executor used to interact with a cooperative, keep-alive Python process.

StreamingToolConstants

Various constants used by StreamingProcessController that require synchronized equivalents in
 the companion process, i.e., if the streaming process is written in Python, there must be
 equivalent Python constants for use by the Python code.

StreamLocation

Where to read/write a stream

StreamOutput

The content of stdout or stderr.

StrictStrandBiasFilter
 
StripMateNumberTransformer

Removes /1 or /2 and any whitespace from the end of the read name if present

STRTableFile

Class to create and access STR table file contents.

STRTableFileBuilder

Utility class to compose the contents of the STR Table file.

StructuralVariantDiscoverer
 
StructuralVariantDiscoveryProgramGroup

Tools that detect structural variants

StructuralVariantFilter
 
StructuralVariationDiscoveryArgumentCollection
 
StructuralVariationDiscoveryArgumentCollection.DiscoverVariantsFromContigAlignmentsArgumentCollection
 
StructuralVariationDiscoveryArgumentCollection.FindBreakpointEvidenceSparkArgumentCollection
 
StructuralVariationDiscoveryPipelineSpark

Runs the structural variation discovery workflow on a single sample

StructuralVariationDiscoveryPipelineSpark.InMemoryAlignmentParser
 
SubsettedLikelihoodMatrix<EVIDENCE extends htsjdk.samtools.util.Locatable,A extends htsjdk.variant.variantcontext.Allele>

Fast wrapper for a LikelihoodMatrix that uses only a subset of alleles.

SVAlignmentLengthFilter
 
SVAlleleCounter

Simple allele counter for SVs.

SVAnnotate

Adds gene overlap, predicted functional consequence, and noncoding element overlap annotations to
 a structural variant (SV) VCF from the GATK-SV pipeline.

SVAnnotateEngine
 
SVAnnotateEngine.GTFIntervalTreesContainer
 
SVAnnotateEngine.SVSegment
 
SVCallRecord
 
SVCallRecordUtils
 
SVCluster

Clusters structural variants based on coordinates, event type, and supporting algorithms.

SVClusterEngine

Base class for clustering items that possess start/end genomic coordinates.

SVClusterEngine.CLUSTERING_TYPE

Available clustering algorithms

SVClusterEngine.OutputCluster
 
SVClusterEngineArgumentsCollection

Arguments for use with SVClusterEngine.

SVClusterEngineFactory

Some useful functions for creating different kinds of SVClusterEngine.

SVClusterLinkage<T extends SVLocatable>
 
SVConcordance

This tool calculates SV genotype concordance between an "evaluation" VCF and a "truth" VCF.

SVConcordanceAnnotator

Generates SV records annotated with concordance metrics given a pair of "evaluation" and "truth" SVs.

SVConcordanceLinkage
 
SVContext

Variant context with additional method to mine the structural variant specific information from structural
 variant records.

SVD

Interface for SVD implementation.

SVDDenoisedCopyRatioResult

Represents copy ratios for a sample that has been standardized and denoised by an SVDReadCountPanelOfNormals.

SVDDenoisingUtils

Utility class for package-private methods for performing SVD-based denoising and related operations.

SVDFactory

Entry point for creating an instance of SVD.

SvDiscoverFromLocalAssemblyContigAlignmentsSpark

(Internal) Examines aligned contigs from local assemblies and calls structural variants or their breakpoints

SvDiscoverFromLocalAssemblyContigAlignmentsSpark.AssemblyContigsClassifiedByAlignmentSignatures
 
SvDiscoveryInputMetaData
 
SvDiscoveryInputMetaData.ReferenceData
 
SvDiscoveryInputMetaData.SampleSpecificData
 
SvDiscoveryUtils
 
SVDReadCountPanelOfNormals

Interface for the panel of normals (PoN) for SVD-based coverage denoising.

SVDUSTFilteredKmerizer

An iterator over kmers with a specified maximum DUST-style, low-complexity score.

SVFastqUtils

Memory-economical utilities for producing a FASTQ file.

SVFastqUtils.FastqRead
 
SVFastqUtils.FastqRead.Serializer
 
SVFastqUtils.Mapping
 
SVFeature
 
SVFeaturesHeader
 
SVFileUtils
 
SVInterval

Naturally collating, simple interval

 WARNING: THIS IS NOT THE SAME AS THE BED COORDINATE SYSTEM OR SimpleInterval !!!!!

SVInterval.Serializer
 
SVInterval.SVIntervalConstructorArgsValidator
 
SVIntervalTree<V>

A Red-Black tree with intervals for keys.

SVIntervalTree.Entry<V1>
 
SVIntervalTree.Serializer<T>
 
SVIntervalTree.ValuesIterator<V1>
 
SVKmer
 
SVKmer.Base
 
SVKmerizer

Iterator over successive Kmers from a sequence of characters.

SVKmerizer.ASCIICharSequence
 
SVKmerLong

An immutable SVKmerLong.

SVKmerLong.Serializer
 
SVKmerShort

An immutable SVKmerShort.

SVKmerShort.Serializer
 
SVLocatable

Any class with loci that are potentially on different chromosomes should implement this interface.

SVLocation
 
SVMappingQualityFilter
 
SVReadFilter
 
SVReferenceUtils
 
SvType

Various types of structural variations.

SVUtils

Useful scraps of this and that.

SVUtils.IteratorFilter<T>
 
SVVCFReader
 
SVVCFWriter

A utility class that writes out variants to a VCF file.

SWNativeAlignerWrapper

A wrapper that converts instances of SWAlignerNativeBinding into a SmithWatermanAligner

SystemProperty

An annotation to denote Configuration options that should be injected into the Java System Properties.

TabbedInputParser

Parser for tab-delimited files

TabbedTextFileWithHeaderParser

Parse a tabbed text file in which columns are found by looking at a header line rather than by position.

TableCodec

Reads tab-delimited tabular text files

TableColumnCollection

Represents a list of table columns.

TableFeature

Feature representing a row in a text table.

TableFuncotation

A Funcotation to hold data from simple tabular data.

TableReader<R>

Reads the contents of a tab separated value formatted text input into
 records of an arbitrary type TableReader.

TableReference

A reference to a BigQuery table by project, dataset, and table name, along with the contained fields.

TableUtils

Common constants for table readers and writers.

TableWriter<R>

Class to write tab separated value files.

TagGermlineEvents
 
Tail

Enum for two-sided things, for example which end of a read has been clipped, which end of a chain within an assembly graph etc.

TandemRepeat

Tandem repeat unit composition and counts per allele

TandemRepeat

Stratifies the evals into sites that are tandem repeats

TargetedPcrMetrics

Metrics class for the analysis of reads obtained from targeted pcr experiments e.g.

TargetedPcrMetricsCollector

Calculates HS metrics for a given SAM or BAM file.

TargetMetrics

TargetMetrics are metrics that measure how well we hit specific targets (or baits) when using a targeted sequencing process like hybrid selection
 or Targeted PCR Techniques (TSCA).

TargetMetricsBase

TargetMetrics are metrics that measure how well we hit specific targets (or baits) when using a targeted sequencing process like hybrid selection
 or Targeted PCR Techniques (TSCA).

TargetMetricsCollector<METRIC_TYPE extends MultilevelMetrics>

TargetMetrics are metrics that measure how well we hit specific targets (or baits) when using a targeted sequencing process like hybrid selection
 or Targeted PCR Techniques (TSCA).

TargetMetricsCollector.Coverage

A simple class that is used to store the coverage information about an interval.

TemplateFragmentOrdinal

Indicates the ordinal of a fragment in a paired sequenced template.

TensorType

TensorType documents the tensors available and what information they encode.

Test
 
Testing

For internal test purposes only.

TestingReadThreadingGraph
 
TestProgramGroup

Program group for use with internal test CommandLinePrograms only.

TextFormattingUtils

Common utilities for dealing with text formatting.

TextMDCodec
 
TextMDCodec.DeletionMDElement
 
TextMDCodec.MatchMDElement
 
TextMDCodec.MDElement
 
TextMDCodec.MismatchMDElement
 
TheoreticalSensitivity

Created by David Benjamin on 5/13/15.

TheoreticalSensitivity.RouletteWheel
 
TheoreticalSensitivityMetrics

TheoreticalSensitivityMetrics are metrics calculated from TheoreticalSensitivity and the parameters used in
 the calculation.

ThetaVariantEvaluator
 
ThreadPoolExecutorUtil
 
ThreadPoolExecutorWithExceptions

This version of the thread pool executor will throw an exception if any of the internal jobs have thrown exceptions
 while executing

ThresholdCalculator
 
ThresholdCalculator.Strategy
 
Tile

Represents a tile from TileMetricsOut.bin.

TileIndex

Load a file containing 8-byte records like this:
 tile number: 4-byte int
 number of clusters in tile: 4-byte int
 Number of records to read is determined by reaching EOF.
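
The record layout described above can be read with a short loop. The sketch below is a minimal, hypothetical reader (`TileIndexSketch` is not the GATK class), and it assumes little-endian byte order, which the description does not state.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.ArrayList;
import java.util.List;

public final class TileIndexSketch {
    /** One 8-byte record: tile number followed by the number of clusters in that tile. */
    public static final class Record {
        public final int tile;
        public final int numClustersInTile;

        Record(int tile, int numClustersInTile) {
            this.tile = tile;
            this.numClustersInTile = numClustersInTile;
        }
    }

    /** Reads 8-byte records until EOF. Byte order is an assumption of this sketch. */
    public static List<Record> readAll(InputStream in) {
        List<Record> records = new ArrayList<>();
        byte[] buf = new byte[8];
        try {
            while (true) {
                int n = in.readNBytes(buf, 0, buf.length); // Java 9+
                if (n == 0) {
                    break; // clean EOF: no more records
                }
                if (n < buf.length) {
                    throw new IllegalStateException("truncated record: " + n + " bytes");
                }
                ByteBuffer bb = ByteBuffer.wrap(buf).order(ByteOrder.LITTLE_ENDIAN);
                int tile = bb.getInt();
                int clusters = bb.getInt();
                records.add(new Record(tile, clusters));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        // Two made-up records, encoded in memory for demonstration only.
        byte[] data = ByteBuffer.allocate(16).order(ByteOrder.LITTLE_ENDIAN)
                .putInt(1101).putInt(2000)
                .putInt(1102).putInt(1500)
                .array();
        for (Record r : readAll(new ByteArrayInputStream(data))) {
            System.out.println("tile " + r.tile + ": " + r.numClustersInTile + " clusters");
        }
    }
}
```

Because there is no record count in the file, the loop distinguishes a clean end of file (zero bytes read) from a truncated trailing record (between one and seven bytes read).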

TileIndex.TileIndexRecord
 
TileMetricsOutReader

Reads a TileMetricsOut file commonly found in the InterOp directory of an Illumina Run Folder.

TileMetricsOutReader.IlluminaLaneTileCode

Helper class which captures the combination of a lane, tile & metric code

TileMetricsOutReader.IlluminaTileMetrics

IlluminaTileMetrics corresponds to a single record in a TileMetricsOut file

TileMetricsOutReader.TileMetricsVersion
 
TileMetricsUtil

Utility for reading the tile data from an Illumina run directory's TileMetricsOut.bin file

TilePhasingValue

Captures information about a phasing value - Which read it corresponds to, which phasing type and a median value

TileTemplateRead

Defines the first or second template read for a tile

TiTvVariantEvaluator
 
TrainingSet

A set of training variants for use with VQSR.

TrainVariantAnnotationsModel

Trains a model for scoring variant calls based on site-level annotations.

Tranche

Created by gauthier on 7/13/17.

Tranche.TrancheComparator<T extends Tranche>
 
Tranche.TrancheTruthSensitivityComparator
 
TrancheManager
 
TrancheManager.SelectionMetric
 
TrancheManager.TruthSensitivityMetric
 
TranscriptSelectionMode

The manner to select a single transcript from a set of transcripts to report as the "best" or main transcript.

TransferReadTags

This tool takes a pair of SAM files sharing the same read names (e.g.

TransientFieldPhysicalLocation

A common class for holding the fields in PhysicalLocation that we don't want to be serialized by kryo.

Transition

Enum representation of a transition from one base to any other.

Transition.Base
 
TraversalParameters

A simple container class for parameters controlling which records get returned during traversals.

Trilean

An enumeration to represent true, false, or unknown.

TrimmedReadsReader

A service class for HaplotypeBasedVariantRecaller that reads SAM/BAM files.

Trio

A class for imposing a trio structure on three samples; a common paradigm

TumorEvidenceFilter
 
TumorNormalPair

Convenience class for tumor normal pair.

TwoPassFuncotationFilter
 
TwoPassVariantWalker
 
TypeInferredFromSimpleChimera
 
UmiAwareMarkDuplicatesWithMateCigar

This tool locates and tags duplicate reads in a BAM or SAM file, where duplicate reads are
 defined as originating from a single fragment of DNA.

UmiGraph

UmiGraph is used to identify UMIs that come from the same original source molecule.

UmiMetrics

Metrics that are calculated during the process of marking duplicates
 within a stream of SAMRecords using the UmiAwareDuplicateSetIterator.

UniqueAltReadCount

Finds a lower bound on the number of unique reads at a locus that support a non-reference allele.

UniqueIDWrapper<A>

Create a unique ID for an arbitrary object and wrap it.

UnmarkDuplicates

Clears the 0x400 duplicate SAM flag from reads.
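
Clearing a single bit in the SAM flags field is a plain bitwise operation. The sketch below shows the idea on a raw integer; the class and method names are hypothetical and it does not use the tool's own read API.

```java
public final class DuplicateFlagSketch {
    /** The SAM flag bit that marks a read as a PCR or optical duplicate. */
    public static final int DUPLICATE_FLAG = 0x400;

    private DuplicateFlagSketch() {}

    /** Returns the flags with the duplicate bit cleared; all other bits are untouched. */
    public static int clearDuplicate(int flags) {
        return flags & ~DUPLICATE_FLAG;
    }

    public static boolean isDuplicate(int flags) {
        return (flags & DUPLICATE_FLAG) != 0;
    }

    public static void main(String[] args) {
        int flags = 0x400 | 0x40 | 0x1; // duplicate, first in pair, paired
        int cleared = clearDuplicate(flags);
        System.out.println("before: " + flags + " duplicate=" + isDuplicate(flags));
        System.out.println("after:  " + cleared + " duplicate=" + isDuplicate(cleared));
    }
}
```

Masking with `~0x400` leaves a read that was never marked as a duplicate unchanged, so the operation is safe to apply to every read.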

UnsignedTypeUtil

A utility class for dealing with unsigned types.

UnsignedTypeUtil

A utility class for dealing with unsigned types.

UnsortedBasecallsConverter<CLUSTER_OUTPUT_RECORD>

UnsortedBasecallsConverter utilizes an underlying IlluminaDataProvider to convert parsed and decoded sequencing data
 from standard Illumina formats to specific output records (FASTA records/SAM records).

UpdateVcfSequenceDictionary

Takes a VCF file and a Sequence Dictionary (from a variety of file types) and updates the Sequence Dictionary in VCF.

UpdateVCFSequenceDictionary

Updates the reference contigs in the header of the VCF format file, i.e.

UserException


 Class UserException.

UserException.BadInput
 
UserException.BadTempDir
 
UserException.CannotExecuteScript
 
UserException.CannotHandleGzippedRef
 
UserException.CouldNotCreateOutputFile


 Class UserException.CouldNotCreateOutputFile

UserException.CouldNotIndexFile
 
UserException.CouldNotReadInputFile


 Class UserException.CouldNotReadInputFile

UserException.EmptyIntersection
 
UserException.FailsStrictValidation
 
UserException.HardwareFeatureException
 
UserException.HeaderMissingReadGroup
 
UserException.IncompatibleRecalibrationTableParameters
 
UserException.IncompatibleSequenceDictionaries
 
UserException.LexicographicallySortedSequenceDictionary
 
UserException.MalformedBAM
 
UserException.MalformedFile


 Class UserException.MalformedFile

UserException.MalformedGenomeLoc
 
UserException.MalformedRead
 
UserException.MisencodedQualityScoresRead
 
UserException.MissingContigInSequenceDictionary
 
UserException.MissingIndex
 
UserException.MissingReference
 
UserException.MissingReferenceDictFile
 
UserException.MissingReferenceFaiFile
 
UserException.MissingReferenceGziFile
 
UserException.NoSuitableCodecs
 
UserException.ReadMissingReadGroup
 
UserException.SequenceDictionaryIsMissingContigLengths
 
UserException.UnimplementedFeature
 
UserException.ValidationFailure
 
UserException.VQSRNegativeModelFailure
 
UserException.VQSRPositiveModelFailure
 
UserException.WarnableAnnotationFailure
 
UserException.WrongFeatureType
 
Utils
 
ValidateBasicSomaticShortMutations
 
ValidateBasicSomaticShortMutations.Judgment
 
ValidateSamFile

This tool reports on the validity of a SAM or BAM file relative to the SAM format
 specification.

ValidateSamFile.Mode
 
ValidateVariants

Validate a VCF file with a strict set of criteria

ValidateVariants.ValidationType
 
ValidationReport
 
VariantAccumulatorExecutor<ACCUMULATOR extends VariantProcessor.Accumulator<RESULT>,RESULT>

Describes the functionality for an executor that manages the delegation of work to VariantProcessor.Accumulators.

VariantAccumulatorExecutor.MultiThreadedChunkBased<A extends VariantProcessor.Accumulator<R>,R>

A VariantAccumulatorExecutor that breaks down work into chunks described by the provided VariantIteratorProducer and
 spreads them over the indicated number of threads.

VariantAFEvaluator
 
VariantAnnotation

Interface of all variant annotations.

VariantAnnotationsModel

File interface for passing annotations to a modeling backend and indicating a path prefix for resulting output.

VariantAnnotationsModelBackend
 
VariantAnnotationsScorer

File interface for passing annotations to a scoring backend and returning scores.

VariantAnnotator

Annotate variant calls with context information

VariantAnnotatorEngine

The class responsible for computing annotations for variants.

VariantAnnotatorEngine.VAExpression

A container object for storing the objects necessary for carrying over expression annotations.

VariantContextGetters
 
VariantContextVariantAdapter

VariantContextVariantAdapter wraps the existing htsjdk VariantContext class so it can be
 used with the GATKVariant API.

VariantDataManager
 
VariantEval


 Given a variant callset, it is common to calculate various quality control metrics.

VariantEvalArgumentCollection

The collection of arguments for VariantEval

VariantEvalArgumentCollection.StratifyingScale
 
VariantEvalContext

A wrapper used internally by VariantEval and related classes to pass information related to the evaluation/stratification context, without exposing the entire walker to the consumer.

VariantEvalEngine

This class allows other classes to replicate the behavior of VariantEval

 Usage:
 -Pass the genotype args into the constructor, which will then initialize the engine completely

VariantEvalReportWriter

Class for writing the GATKReport for VariantEval

 Accepts a fully evaluated (i.e., there's no more data coming) set of stratifications and evaluators
 and supports writing out the data in these evaluators to a GATKReport.

VariantEvaluationProgramGroup

Tools that evaluate and refine variant calls, e.g.

VariantEvaluator
 
VariantFilter
 
VariantFilter

Interface for classes that can generate filters for VariantContexts.

VariantFilteringProgramGroup

Tools that filter variants

VariantFilterLibrary

Collects common variant filters.

VariantFilterLibrary.AllowAllVariantsVariantFilter

Do not filter out any variants.

VariantFilterLibrary.NotSymbolicOrSVVariantFilter

Filter out any variants that are symbolic or SV.

VariantFilterLibrary.PassesFiltersVariantFilter

Filter out any variants that fail (variant-level) filters.

VariantFiltration

Filter variant calls based on INFO and/or FORMAT annotations

VariantIDsVariantFilter

Keep only variants with any of these IDs.

VariantIteratorProducer

A mechanism for iterating over CloseableIterators of VariantContexts in some fashion, given VCF files and optionally
 an interval list.

VariantLocusWalker

VariantLocusWalker processes variants from a single source, grouped by locus overlap, or optionally one
 at a time in order, with optional contextual information from a reference, sets of reads, and/or supplementary sources
 of Features.

VariantManipulationProgramGroup

Tools that manipulate variant call format (VCF) data

VariantOverlapAnnotator

Annotate the ID field and attribute overlap FLAGs for a VariantContext against a FeatureContext or a list
 of VariantContexts.

VariantProcessor<RESULT,ACCUMULATOR extends VariantProcessor.Accumulator<RESULT>>

Describes an object that processes variants and produces a result.

VariantProcessor.Accumulator<RESULT>

Handles VariantContexts, and accumulates their data in some fashion internally.

VariantProcessor.AccumulatorGenerator<ACCUMULATOR extends VariantProcessor.Accumulator<RESULT>,RESULT>

Generates instances of VariantProcessor.Accumulators.

VariantProcessor.Builder<A extends VariantProcessor.Accumulator<R>,R>

Simple builder of VariantProcessors.

VariantProcessor.ResultMerger<RESULT>

Takes a collection of results produced by VariantProcessor.Accumulator.result() and merges them into a single RESULT.

VariantRecalibrator

Build a recalibration model to score variant quality for filtering purposes

VariantRecalibratorEngine
 
VariantRecallerResultWriter
 
VariantShard

VariantShard is a section of the genome that's used for sharding work for pairing things with
 variants.

VariantsSparkSink

VariantsSparkSink writes variants to a VCF file in parallel using Hadoop-BAM.

VariantsSparkSource

VariantsSparkSource loads Variants from files serially (using FeatureDataSource) or in parallel
 using Hadoop-BAM.

VariantsToTable

Extract fields from a VCF file to a tab-delimited table

VariantStratifier
 
VariantSummary
 
VariantSummary.Type
 
VariantTransformer

Classes which perform transformations from VariantContext -> VariantContext
 should implement this interface by overriding <VariantContext,VariantContext>#apply(VariantContext).
 Created by jonn on 6/26/18.

VariantType

Flow Annotation: type of variant: SNP/NON-H-INDEL/H-INDEL

VariantType

Stratifies the eval variants by their type (SNP, INDEL, etc.)

VariantType

This code and logic for determining variant types was mostly retained from VQSR.

VariantType

Enum to hold the possible types of dbSnps.

VariantTypesVariantFilter

Keep only variants with any of these variant types.

VariantWalker

A VariantWalker is a tool that processes a variant at a time from a source of variants, with
 optional contextual information from a reference, sets of reads, and/or supplementary sources
 of Features.

VariantWalkerBase

Base class for variant walkers, which process variants from one or more sources of variants,
 with optional contextual information from a reference, sets of reads, and/or supplementary sources of
 Features.

VariantWalkerContext

Encapsulates a VariantContext with the reads that overlap it (the ReadsContext) and
 its ReferenceContext and FeatureContext.

VariantWalkerSpark

A Spark version of VariantWalker.

VcfFileSegment
Deprecated.
From 2022-03-17; use VcfPathSegment instead.

VcfFileSegmentGenerator
Deprecated.
From 2022-03-17; use VcfPathSegmentGenerator instead.

VcfFormatConverter

Converts an ASCII VCF file to a binary BCF or vice versa.

VcfFuncotationFactory

A class to create annotations from VCF feature sources.

VcfFuncotationMetadata

A concrete class for FuncotationMetadata that can be easily built from a VCF Header.

VcfOutputRenderer

A Funcotator output renderer for writing to VCF files.

VcfPathSegment

Describes a segment of a particular VCF file.

VcfPathSegmentGenerator

Describes a mechanism for producing VcfPathSegments from a VCF file path.

VcfToAdpc

A simple program to convert a Genotyping Arrays VCF to an ADPC file (Illumina intensity data file).

VcfToIntervalList

Converts a VCF or BCF file to a Picard Interval List.

VcfToIntervalList.VARIANT_ID_TYPES
 
VcfUtils

Utils for dealing with VCF files.

VcfUtils

Created by farjoun on 4/1/17.

VectorLoglessPairHMM

Class for performing the pair HMM for global alignment using AVX instructions contained in a native shared library.

VectorLoglessPairHMM.Implementation

Type for implementation of VectorLoglessPairHMM

VerifyIDIntensityContaminationMetrics
 
ViewSam

Prints a SAM or BAM file to the screen.

ViewSam.AlignmentStatus
 
ViewSam.PfStatus
 
VQSLODTranche
 
WalkerBase

Base class for pre-packaged walker traversals in the GATK engine.

WellformedFlowBasedReadFilter

Tests whether a flow-based read is "well-formed" -- that is, is free of major internal inconsistencies and issues that could lead
 to errors downstream.

WellformedReadFilter

Tests whether a read is "well-formed" -- that is, is free of major internal inconsistencies and issues that could lead
 to errors downstream.

WgsMetrics

Metrics for evaluating the performance of whole genome sequencing experiments.

WgsMetricsProcessor

Interface for processing data and generating results for CollectWgsMetrics

WgsMetricsProcessorImpl<T extends htsjdk.samtools.util.AbstractRecordAndOffset>

Implementation of WgsMetricsProcessor that gets input data from a given iterator
 and processes it with the help of a collector

XReadLines

Support for Python-like xreadlines() function as a class.

XsvLocatableTableCodec

Codec class to read from XSV (e.g.

XsvTableFeature

A feature to represent a line in an arbitrarily delimited (XSV) file (i.e.

ZipUtils

Utility class to zip and unzip files.
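
The UnmarkDuplicates entry above describes clearing the 0x400 duplicate SAM flag from reads. As a rough illustration of that bit operation only, and not of how the GATK tool is implemented, the following standalone sketch clears the bit on a plain integer flag field. The class name DuplicateFlagSketch and its helper methods are hypothetical and exist only for this example; no GATK or htsjdk API is used.

```java
public class DuplicateFlagSketch {
    // 0x400 is the duplicate bit named in the UnmarkDuplicates description.
    public static final int DUPLICATE_FLAG = 0x400;

    // Hypothetical helper: returns the flag field with the duplicate bit cleared.
    public static int clearDuplicate(int flags) {
        return flags & ~DUPLICATE_FLAG;
    }

    // Hypothetical helper: reports whether the duplicate bit is set.
    public static boolean isDuplicate(int flags) {
        return (flags & DUPLICATE_FLAG) != 0;
    }

    public static void main(String[] args) {
        // 0x1 (paired) and 0x2 (properly paired) are other SAM flag bits,
        // included to show that clearing 0x400 leaves them untouched.
        int flags = 0x1 | 0x2 | DUPLICATE_FLAG;
        int cleared = clearDuplicate(flags);

        System.out.println(isDuplicate(flags));   // true
        System.out.println(isDuplicate(cleared)); // false
        System.out.println(cleared);              // 3
    }
}
```

Masking with the complement of the bit leaves every other flag unchanged, so a read that was never marked as a duplicate passes through unmodified.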