Class CreateReadCountPanelOfNormals

All Implemented Interfaces:
Serializable, org.broadinstitute.barclay.argparser.CommandLinePluginProvider

@DocumentedFeature
public final class CreateReadCountPanelOfNormals extends SparkCommandLineProgram

Creates a panel of normals (PoN) for read-count denoising given the read counts for samples in the panel. The resulting PoN can be used with DenoiseReadCounts to denoise other samples.

The input read counts are first transformed to log2 fractional coverages and preprocessed according to the specified filtering and imputation parameters. Singular value decomposition (SVD) is then performed to find the leading principal components, up to the number given by the number-of-eigensamples argument; these are stored in the PoN. Some or all of these principal components can then be used for denoising case samples with DenoiseReadCounts; it is assumed that the principal components used represent systematic sequencing biases (rather than statistical noise). Examining the singular values, which are also stored in the PoN, may be useful in determining the appropriate number of principal components to use for denoising.
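The transform and the singular-value inspection described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's implementation: the function names, the `epsilon` floor for zero counts, and the 0.9 variance threshold are all hypothetical, and the tool's own filtering and imputation steps are omitted.

```python
import math


def log2_fractional_coverage(counts, epsilon=1e-12):
    """Convert one sample's per-interval read counts to log2 fractional coverage.

    `epsilon` is a hypothetical floor that keeps log2 defined for zero counts;
    the tool itself deals with zeros through its filtering and imputation
    parameters, which this sketch does not reproduce.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("sample has no reads")
    return [math.log2(max(c / total, epsilon)) for c in counts]


def variance_explained(singular_values):
    """Cumulative fraction of total variance captured by the leading components.

    The variance attributed to component i is proportional to s_i squared.
    """
    squares = [s * s for s in singular_values]
    total = sum(squares)
    if total == 0:
        raise ValueError("all singular values are zero")
    cumulative, running = [], 0.0
    for sq in squares:
        running += sq
        cumulative.append(running / total)
    return cumulative


def components_needed(singular_values, threshold=0.9):
    """Smallest number of leading components reaching `threshold` (assumed cutoff)."""
    for k, fraction in enumerate(variance_explained(singular_values), start=1):
        if fraction >= threshold:
            return k
    return len(singular_values)
```

A cumulative-variance cutoff is just one heuristic for reading the singular values; a visual scree plot or downstream denoising quality may be a better guide in practice.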

If annotated intervals are provided, explicit GC-bias correction will be performed by GCBiasCorrector before filtering and SVD. GC-content information for the intervals will be stored in the PoN and used to perform explicit GC-bias correction identically in DenoiseReadCounts. Note that if annotated intervals are not provided, it is still likely that GC-bias correction is implicitly performed by the SVD denoising process (i.e., some of the principal components arise from GC bias).
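To make the idea of explicit GC-bias correction concrete, here is a deliberately simplified sketch: intervals are grouped into GC-content bins and each interval's coverage is divided by the median coverage of its bin. The function name, the `num_bins` parameter, and the binning scheme are assumptions chosen for illustration; GCBiasCorrector's actual algorithm may differ.

```python
from statistics import median


def correct_gc_bias(coverage, gc_content, num_bins=10):
    """Divide each interval's coverage by the median coverage of its GC bin.

    coverage:   linear-scale coverage per interval (assumed, not log2)
    gc_content: GC fraction per interval, each in [0, 1]
    num_bins:   hypothetical number of equal-width GC bins
    """
    if len(coverage) != len(gc_content):
        raise ValueError("coverage and gc_content must have the same length")

    def bin_index(gc):
        # Clamp so that gc == 1.0 falls in the last bin.
        return min(int(gc * num_bins), num_bins - 1)

    # Collect the coverages that fall in each GC bin.
    bins = {}
    for cov, gc in zip(coverage, gc_content):
        bins.setdefault(bin_index(gc), []).append(cov)
    bin_medians = {b: median(values) for b, values in bins.items()}

    corrected = []
    for cov, gc in zip(coverage, gc_content):
        m = bin_medians[bin_index(gc)]
        # Leave the value unchanged if the bin median is zero.
        corrected.append(cov / m if m > 0 else cov)
    return corrected
```

After this kind of correction, intervals with similar GC content have a median coverage of 1 within their bin, which removes the bin-level trend while preserving interval-level deviations.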

Note that such SVD denoising cannot distinguish between variance due to systematic sequencing biases and that due to true common germline CNVs present in the panel; signal from the latter may thus be inadvertently denoised away. Furthermore, variance arising from coverage on the sex chromosomes may also significantly contribute to the principal components if the panel contains samples of mixed sex. Therefore, if sex chromosomes are not excluded from coverage collection, it is strongly recommended that users avoid creating panels of mixed sex and take care to denoise case samples only with panels containing only individuals of the same sex as the case samples. (See GermlineCNVCaller, which avoids these issues by simultaneously learning a probabilistic model for systematic bias and calling rare and common germline CNVs for samples in the panel.)

Inputs

  • Counts files (TSV or HDF5 output of CollectReadCounts).
  • (Optional) GC-content annotated-intervals file from AnnotateIntervals. Explicit GC-bias correction will be performed on the panel samples and identically for subsequent case samples.

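Outputs

  • Panel-of-normals file in HDF5 format. It stores the principal components and singular values from the SVD and, if annotated intervals were provided, the GC-content information for the intervals.

Usage examples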

     gatk CreateReadCountPanelOfNormals \
          -I sample_1.counts.hdf5 \
          -I sample_2.counts.hdf5 \
          ... \
          -O cnv.pon.hdf5
 
     gatk CreateReadCountPanelOfNormals \
          -I sample_1.counts.hdf5 \
          -I sample_2.counts.tsv \
          ... \
          --annotated-intervals annotated_intervals.tsv \
          -O cnv.pon.hdf5
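When a panel contains many samples, repeating -I by hand becomes error-prone. The following sketch assembles the same command line programmatically; it uses only the arguments shown in the examples above (-I, -O, --annotated-intervals), and the helper name `build_pon_command` is hypothetical.

```python
def build_pon_command(counts_files, output, annotated_intervals=None):
    """Build the argument list for a CreateReadCountPanelOfNormals invocation.

    counts_files:        paths to per-sample counts files (TSV or HDF5)
    output:              path for the resulting PoN file
    annotated_intervals: optional path to a GC-content annotated-intervals file
    """
    if not counts_files:
        raise ValueError("at least one counts file is required")
    args = ["gatk", "CreateReadCountPanelOfNormals"]
    for path in counts_files:
        args += ["-I", path]  # one -I per panel sample
    if annotated_intervals is not None:
        args += ["--annotated-intervals", annotated_intervals]
    args += ["-O", output]
    return args
```

The returned list can be passed to `subprocess.run(args, check=True)`, which avoids shell-quoting problems with unusual file names.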
 
  • Field Details

    • MINIMUM_INTERVAL_MEDIAN_PERCENTILE_LONG_NAME

      public static final String MINIMUM_INTERVAL_MEDIAN_PERCENTILE_LONG_NAME
    • MAXIMUM_ZEROS_IN_SAMPLE_PERCENTAGE_LONG_NAME

      public static final String MAXIMUM_ZEROS_IN_SAMPLE_PERCENTAGE_LONG_NAME
    • MAXIMUM_ZEROS_IN_INTERVAL_PERCENTAGE_LONG_NAME

      public static final String MAXIMUM_ZEROS_IN_INTERVAL_PERCENTAGE_LONG_NAME
    • EXTREME_SAMPLE_MEDIAN_PERCENTILE_LONG_NAME

      public static final String EXTREME_SAMPLE_MEDIAN_PERCENTILE_LONG_NAME
    • IMPUTE_ZEROS_LONG_NAME

      public static final String IMPUTE_ZEROS_LONG_NAME
    • EXTREME_OUTLIER_TRUNCATION_PERCENTILE_LONG_NAME

      public static final String EXTREME_OUTLIER_TRUNCATION_PERCENTILE_LONG_NAME
    • MAXIMUM_CHUNK_SIZE

      public static final String MAXIMUM_CHUNK_SIZE
  • Constructor Details

    • CreateReadCountPanelOfNormals

      public CreateReadCountPanelOfNormals()
  • Method Details