@InterfaceAudience.Private public class TableSnapshotInputFormatImpl extends Object
| Modifier and Type | Class and Description |
|---|---|
| static class | TableSnapshotInputFormatImpl.InputSplit: Implementation class for InputSplit logic common between mapred and mapreduce. |
| static class | TableSnapshotInputFormatImpl.RecordReader: Implementation class for RecordReader logic common between mapred and mapreduce. |
| Modifier and Type | Field and Description |
|---|---|
| static org.slf4j.Logger | LOG |
| static String | NUM_SPLITS_PER_REGION: For MapReduce jobs running multiple mappers per region, determines the number of splits to generate per region. |
| protected static String | RESTORE_DIR_KEY |
| static boolean | SNAPSHOT_INPUTFORMAT_LOCALITY_ENABLED_DEFAULT |
| static String | SNAPSHOT_INPUTFORMAT_LOCALITY_ENABLED_KEY: Whether to calculate the block location for splits. |
| static String | SNAPSHOT_INPUTFORMAT_ROW_LIMIT_PER_INPUTSPLIT: In some scenarios, scans only a limited number of rows on each InputSplit, for sampling data extraction. |
| static String | SPLIT_ALGO: For MapReduce jobs running multiple mappers per region, determines which split algorithm to use to find split points for scanners. |
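The interplay of NUM_SPLITS_PER_REGION and SPLIT_ALGO can be pictured with a small, self-contained sketch. It is not HBase code and does not reproduce any `RegionSplitter.SplitAlgorithm` implementation: the class `RegionSplitSketch` and its `boundaries` method are hypothetical names, and the sketch assumes that both region boundary keys are known and non-empty and that keys may be treated as fixed-width unsigned integers. Under those assumptions, it divides one region's key range into `n` evenly spaced sub-ranges, one per input split.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; not HBase's RegionSplitter. */
final class RegionSplitSketch {

    /**
     * Returns n + 1 boundary keys, from start to end inclusive, spaced
     * evenly. Consecutive pairs delimit the n sub-ranges of the region.
     */
    static List<byte[]> boundaries(byte[] start, byte[] end, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        // Right-pad the shorter key with zero bytes so both have equal width.
        int width = Math.max(start.length, end.length);
        BigInteger lo = new BigInteger(1, Arrays.copyOf(start, width));
        BigInteger hi = new BigInteger(1, Arrays.copyOf(end, width));
        if (lo.compareTo(hi) >= 0) {
            throw new IllegalArgumentException("start must sort before end");
        }
        BigInteger span = hi.subtract(lo);
        List<byte[]> out = new ArrayList<>();
        for (int i = 0; i <= n; i++) {
            // point = lo + span * i / n, using integer division
            BigInteger point = lo.add(
                span.multiply(BigInteger.valueOf(i)).divide(BigInteger.valueOf(n)));
            out.add(toFixedWidth(point, width));
        }
        return out;
    }

    /** Converts a non-negative value back into exactly width bytes. */
    private static byte[] toFixedWidth(BigInteger value, int width) {
        byte[] raw = value.toByteArray(); // may carry a sign byte or be short
        byte[] out = new byte[width];
        int copy = Math.min(raw.length, width);
        System.arraycopy(raw, raw.length - copy, out, width - copy, copy);
        return out;
    }

    public static void main(String[] args) {
        // One region covering keys [0x00, 0x10), read with 4 splits.
        for (byte[] key : boundaries(new byte[] {0x00}, new byte[] {0x10}, 4)) {
            StringBuilder hex = new StringBuilder();
            for (byte b : key) {
                hex.append(String.format("%02x", b & 0xff));
            }
            System.out.println(hex);
        }
    }
}
```

In the real class, the split points come from whichever `RegionSplitter.SplitAlgorithm` is configured, so the actual boundaries depend on that algorithm rather than on the arithmetic shown here.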
| Constructor and Description |
|---|
| TableSnapshotInputFormatImpl() |
| Modifier and Type | Method and Description |
|---|---|
| static Scan | extractScanFromConf(org.apache.hadoop.conf.Configuration conf) |
| static List<String> | getBestLocations(org.apache.hadoop.conf.Configuration conf, HDFSBlocksDistribution blockDistribution) |
| static List<HRegionInfo> | getRegionInfosFromManifest(SnapshotManifest manifest) |
| static SnapshotManifest | getSnapshotManifest(org.apache.hadoop.conf.Configuration conf, String snapshotName, org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.fs.FileSystem fs) |
| static RegionSplitter.SplitAlgorithm | getSplitAlgo(org.apache.hadoop.conf.Configuration conf) |
| static List<TableSnapshotInputFormatImpl.InputSplit> | getSplits(org.apache.hadoop.conf.Configuration conf) |
| static List<TableSnapshotInputFormatImpl.InputSplit> | getSplits(Scan scan, SnapshotManifest manifest, List<HRegionInfo> regionManifests, org.apache.hadoop.fs.Path restoreDir, org.apache.hadoop.conf.Configuration conf) |
| static List<TableSnapshotInputFormatImpl.InputSplit> | getSplits(Scan scan, SnapshotManifest manifest, List<HRegionInfo> regionManifests, org.apache.hadoop.fs.Path restoreDir, org.apache.hadoop.conf.Configuration conf, RegionSplitter.SplitAlgorithm sa, int numSplits) |
| static void | setInput(org.apache.hadoop.conf.Configuration conf, String snapshotName, org.apache.hadoop.fs.Path restoreDir): Configures the job to use TableSnapshotInputFormat to read from a snapshot. |
| static void | setInput(org.apache.hadoop.conf.Configuration conf, String snapshotName, org.apache.hadoop.fs.Path restoreDir, RegionSplitter.SplitAlgorithm splitAlgo, int numSplitsPerRegion): Configures the job to use TableSnapshotInputFormat to read from a snapshot. |
public static final org.slf4j.Logger LOG
protected static final String RESTORE_DIR_KEY
public static final String SPLIT_ALGO
public static final String NUM_SPLITS_PER_REGION
public static final String SNAPSHOT_INPUTFORMAT_LOCALITY_ENABLED_KEY
public static final boolean SNAPSHOT_INPUTFORMAT_LOCALITY_ENABLED_DEFAULT
public static final String SNAPSHOT_INPUTFORMAT_ROW_LIMIT_PER_INPUTSPLIT
public static List<TableSnapshotInputFormatImpl.InputSplit> getSplits(org.apache.hadoop.conf.Configuration conf) throws IOException
Throws: IOException
public static RegionSplitter.SplitAlgorithm getSplitAlgo(org.apache.hadoop.conf.Configuration conf) throws IOException
Throws: IOException
public static List<HRegionInfo> getRegionInfosFromManifest(SnapshotManifest manifest)
public static SnapshotManifest getSnapshotManifest(org.apache.hadoop.conf.Configuration conf, String snapshotName, org.apache.hadoop.fs.Path rootDir, org.apache.hadoop.fs.FileSystem fs) throws IOException
Throws: IOException
public static Scan extractScanFromConf(org.apache.hadoop.conf.Configuration conf) throws IOException
Throws: IOException
public static List<TableSnapshotInputFormatImpl.InputSplit> getSplits(Scan scan, SnapshotManifest manifest, List<HRegionInfo> regionManifests, org.apache.hadoop.fs.Path restoreDir, org.apache.hadoop.conf.Configuration conf) throws IOException
Throws: IOException
public static List<TableSnapshotInputFormatImpl.InputSplit> getSplits(Scan scan, SnapshotManifest manifest, List<HRegionInfo> regionManifests, org.apache.hadoop.fs.Path restoreDir, org.apache.hadoop.conf.Configuration conf, RegionSplitter.SplitAlgorithm sa, int numSplits) throws IOException
Throws: IOException
public static List<String> getBestLocations(org.apache.hadoop.conf.Configuration conf, HDFSBlocksDistribution blockDistribution)
public static void setInput(org.apache.hadoop.conf.Configuration conf,
                            String snapshotName,
                            org.apache.hadoop.fs.Path restoreDir)
                     throws IOException
Parameters:
conf - the job to configure
snapshotName - the name of the snapshot to read from
restoreDir - a temporary directory to restore the snapshot into. The current user should have write permissions to this directory, and it should not be a subdirectory of rootdir. After the job is finished, restoreDir can be deleted.
Throws:
IOException - if an error occurs
public static void setInput(org.apache.hadoop.conf.Configuration conf,
                            String snapshotName,
                            org.apache.hadoop.fs.Path restoreDir,
                            RegionSplitter.SplitAlgorithm splitAlgo,
                            int numSplitsPerRegion)
                     throws IOException
Parameters:
conf - the job to configure
snapshotName - the name of the snapshot to read from
restoreDir - a temporary directory to restore the snapshot into. The current user should have write permissions to this directory, and it should not be a subdirectory of rootdir. After the job is finished, restoreDir can be deleted.
numSplitsPerRegion - how many input splits to generate per region
splitAlgo - SplitAlgorithm to be used when generating InputSplits
Throws:
IOException - if an error occurs
Copyright © 2007–2020 The Apache Software Foundation. All rights reserved.