org.apache.accumulo.core.client.mapreduce
Class InputFormatBase<K,V>

java.lang.Object
  extended by org.apache.hadoop.mapreduce.InputFormat<K,V>
      extended by org.apache.accumulo.core.client.mapreduce.InputFormatBase<K,V>
Direct Known Subclasses:
AccumuloInputFormat, AccumuloRowInputFormat

public abstract class InputFormatBase<K,V>
extends org.apache.hadoop.mapreduce.InputFormat<K,V>

This class allows MapReduce jobs to use Accumulo as the source of data. This input format provides keys and values of type K and V to the Map() and Reduce() functions.

Subclasses must implement the following method:

public RecordReader<K,V> createRecordReader(InputSplit split, TaskAttemptContext context) throws IOException, InterruptedException

This class includes a static class that can be used to create a RecordReader:

protected abstract static class RecordReaderBase<K,V> extends RecordReader<K,V>

Subclasses of RecordReaderBase must implement the following method:

public boolean nextKeyValue() throws IOException, InterruptedException

This method should set the following variables:

K currentK
V currentV
Key currentKey (used for progress reporting)
int numKeysRead (used for progress reporting)

See AccumuloInputFormat for an example implementation.

Other static methods are optional.
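
The nextKeyValue() contract described above can be illustrated with a dependency-free model. This is only a sketch: ReaderModel, its source iterator, and passThrough are hypothetical stand-ins invented for this example, whereas the real RecordReaderBase extends Hadoop's RecordReader and reads from an Accumulo scanner. The point is the bookkeeping each call is expected to perform.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class RecordReaderContractSketch {
    /** Hypothetical stand-in for RecordReaderBase<K,V>; holds the documented variables. */
    abstract static class ReaderModel<K, V> {
        protected K currentK;        // key handed to the mapper
        protected V currentV;        // value handed to the mapper
        protected String currentKey; // stand-in for the Accumulo Key used for progress reporting
        protected int numKeysRead;   // used for progress reporting
        protected final Iterator<String[]> source; // assumed source of {row, value} entries

        ReaderModel(Iterator<String[]> source) {
            this.source = source;
        }

        /** Advance to the next entry; return false once the input is exhausted. */
        abstract boolean nextKeyValue();
    }

    /** Builds a reader that passes rows and values through unchanged. */
    static ReaderModel<String, String> passThrough(List<String[]> entries) {
        return new ReaderModel<String, String>(entries.iterator()) {
            @Override
            boolean nextKeyValue() {
                if (!source.hasNext()) {
                    return false;
                }
                String[] entry = source.next();
                currentK = currentKey = entry[0]; // set both the mapper key and the progress key
                currentV = entry[1];
                numKeysRead++;
                return true;
            }
        };
    }

    public static void main(String[] args) {
        ReaderModel<String, String> reader = passThrough(Arrays.asList(
                new String[] {"row1", "v1"}, new String[] {"row2", "v2"}));
        while (reader.nextKeyValue()) {
            System.out.println(reader.currentK + " -> " + reader.currentV);
        }
        System.out.println("keys read: " + reader.numKeysRead); // keys read: 2
    }
}
```

A real subclass would return such a reader from createRecordReader and convert each Accumulo entry into the job's K and V types inside nextKeyValue().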


Nested Class Summary
static class InputFormatBase.RangeInputSplit
          The Class RangeInputSplit.
protected static class InputFormatBase.RecordReaderBase<K,V>
           
static class InputFormatBase.RegexType
          Deprecated. since 1.4 use RegExFilter and addIterator(Configuration, IteratorSetting)
 
Field Summary
protected static org.apache.log4j.Logger log
           
 
Constructor Summary
InputFormatBase()
           
 
Method Summary
static void addIterator(org.apache.hadoop.conf.Configuration conf, IteratorSetting cfg)
          Encode an iterator on the input for this configuration object.
static void addIterator(org.apache.hadoop.mapreduce.JobContext job, IteratorSetting cfg)
          Deprecated. Use addIterator(Configuration,IteratorSetting) instead
static void disableAutoAdjustRanges(org.apache.hadoop.conf.Configuration conf)
          Disables the adjustment of ranges for this configuration object.
static void disableAutoAdjustRanges(org.apache.hadoop.mapreduce.JobContext job)
          Deprecated. Use disableAutoAdjustRanges(Configuration) instead
static void fetchColumns(org.apache.hadoop.conf.Configuration conf, Collection<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> columnFamilyColumnQualifierPairs)
          Restricts the columns that will be mapped over for this configuration object.
static void fetchColumns(org.apache.hadoop.mapreduce.JobContext job, Collection<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> columnFamilyColumnQualifierPairs)
          Deprecated. Use fetchColumns(Configuration,Collection) instead
protected static Authorizations getAuthorizations(org.apache.hadoop.conf.Configuration conf)
          Gets the authorizations to set for the scans from the configuration.
protected static Authorizations getAuthorizations(org.apache.hadoop.mapreduce.JobContext job)
          Deprecated. Use getAuthorizations(Configuration) instead
protected static boolean getAutoAdjustRanges(org.apache.hadoop.conf.Configuration conf)
          Determines whether a configuration has auto-adjust ranges enabled.
protected static boolean getAutoAdjustRanges(org.apache.hadoop.mapreduce.JobContext job)
          Deprecated. Use getAutoAdjustRanges(Configuration) instead
protected static Set<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> getFetchedColumns(org.apache.hadoop.conf.Configuration conf)
          Gets the columns to be mapped over from this configuration object.
protected static Set<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> getFetchedColumns(org.apache.hadoop.mapreduce.JobContext job)
          Deprecated. Use getFetchedColumns(Configuration) instead
protected static Instance getInstance(org.apache.hadoop.conf.Configuration conf)
          Initializes an Accumulo Instance based on the configuration.
protected static Instance getInstance(org.apache.hadoop.mapreduce.JobContext job)
          Deprecated. Use getInstance(Configuration) instead
protected static List<org.apache.accumulo.core.client.mapreduce.InputFormatBase.AccumuloIteratorOption> getIteratorOptions(org.apache.hadoop.conf.Configuration conf)
          Gets a list of the iterator options specified on this configuration.
protected static List<org.apache.accumulo.core.client.mapreduce.InputFormatBase.AccumuloIteratorOption> getIteratorOptions(org.apache.hadoop.mapreduce.JobContext job)
          Deprecated. Use getIteratorOptions(Configuration) instead
protected static List<org.apache.accumulo.core.client.mapreduce.InputFormatBase.AccumuloIterator> getIterators(org.apache.hadoop.conf.Configuration conf)
          Gets a list of the iterator settings (for iterators to apply to a scanner) from this configuration.
protected static List<org.apache.accumulo.core.client.mapreduce.InputFormatBase.AccumuloIterator> getIterators(org.apache.hadoop.mapreduce.JobContext job)
          Deprecated. Use getIterators(Configuration) instead
protected static org.apache.log4j.Level getLogLevel(org.apache.hadoop.conf.Configuration conf)
          Gets the log level from this configuration.
protected static org.apache.log4j.Level getLogLevel(org.apache.hadoop.mapreduce.JobContext job)
          Deprecated. Use getLogLevel(Configuration) instead
protected static int getMaxVersions(org.apache.hadoop.conf.Configuration conf)
          Gets the maxVersions to use for the VersioningIterator from this configuration.
protected static int getMaxVersions(org.apache.hadoop.mapreduce.JobContext job)
          Deprecated. Use getMaxVersions(Configuration) instead
protected static byte[] getPassword(org.apache.hadoop.conf.Configuration conf)
          Gets the password from the configuration.
protected static byte[] getPassword(org.apache.hadoop.mapreduce.JobContext job)
          Deprecated. Use getPassword(Configuration) instead
protected static List<Range> getRanges(org.apache.hadoop.conf.Configuration conf)
          Gets the ranges to scan over from a configuration object.
protected static List<Range> getRanges(org.apache.hadoop.mapreduce.JobContext job)
          Deprecated. Use getRanges(Configuration) instead
protected static String getRegex(org.apache.hadoop.mapreduce.JobContext job, InputFormatBase.RegexType type)
          Deprecated. since 1.4 use RegExFilter and addIterator(Configuration, IteratorSetting)
 List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext job)
          Read the metadata table to get tablets and match up ranges to them.
protected static String getTablename(org.apache.hadoop.conf.Configuration conf)
          Gets the table name from the configuration.
protected static String getTablename(org.apache.hadoop.mapreduce.JobContext job)
          Deprecated. Use getTablename(Configuration) instead
protected static TabletLocator getTabletLocator(org.apache.hadoop.conf.Configuration conf)
          Initializes an Accumulo TabletLocator based on the configuration.
protected static TabletLocator getTabletLocator(org.apache.hadoop.mapreduce.JobContext job)
          Deprecated. Use getTabletLocator(Configuration) instead
protected static String getUsername(org.apache.hadoop.conf.Configuration conf)
          Gets the user name from the configuration.
protected static String getUsername(org.apache.hadoop.mapreduce.JobContext job)
          Deprecated. Use getUsername(Configuration) instead
protected static boolean isIsolated(org.apache.hadoop.conf.Configuration conf)
          Determines whether a configuration has isolation enabled.
protected static boolean isIsolated(org.apache.hadoop.mapreduce.JobContext job)
          Deprecated. Use isIsolated(Configuration) instead
protected static boolean isOfflineScan(org.apache.hadoop.conf.Configuration conf)
           
static void setInputInfo(org.apache.hadoop.conf.Configuration conf, String user, byte[] passwd, String table, Authorizations auths)
          Initialize the user, table, and authorization information for the configuration object that will be used with an Accumulo InputFormat.
static void setInputInfo(org.apache.hadoop.mapreduce.JobContext job, String user, byte[] passwd, String table, Authorizations auths)
          Deprecated. Use setInputInfo(Configuration,String,byte[],String,Authorizations) instead
static void setIsolated(org.apache.hadoop.conf.Configuration conf, boolean enable)
          Enable or disable use of the IsolatedScanner in this configuration object.
static void setIsolated(org.apache.hadoop.mapreduce.JobContext job, boolean enable)
          Deprecated. Use setIsolated(Configuration,boolean) instead
static void setIterator(org.apache.hadoop.mapreduce.JobContext job, int priority, String iteratorClass, String iteratorName)
          Deprecated. since 1.4, see addIterator(Configuration, IteratorSetting)
static void setIteratorOption(org.apache.hadoop.mapreduce.JobContext job, String iteratorName, String key, String value)
          Deprecated. since 1.4, see addIterator(Configuration, IteratorSetting)
static void setLocalIterators(org.apache.hadoop.conf.Configuration conf, boolean enable)
          Enable or disable use of the ClientSideIteratorScanner in this Configuration object.
static void setLocalIterators(org.apache.hadoop.mapreduce.JobContext job, boolean enable)
          Deprecated. Use setLocalIterators(Configuration,boolean) instead
static void setLogLevel(org.apache.hadoop.conf.Configuration conf, org.apache.log4j.Level level)
          Sets the log level for this configuration object.
static void setLogLevel(org.apache.hadoop.mapreduce.JobContext job, org.apache.log4j.Level level)
          Deprecated. Use setLogLevel(Configuration,Level) instead
static void setMaxVersions(org.apache.hadoop.conf.Configuration conf, int maxVersions)
          Sets the maximum number of values that may be returned for an individual Accumulo cell.
static void setMaxVersions(org.apache.hadoop.mapreduce.JobContext job, int maxVersions)
          Deprecated. Use setMaxVersions(Configuration,int) instead
static void setMockInstance(org.apache.hadoop.conf.Configuration conf, String instanceName)
          Configure a MockInstance for this configuration object.
static void setMockInstance(org.apache.hadoop.mapreduce.JobContext job, String instanceName)
          Deprecated. Use setMockInstance(Configuration,String) instead
static void setRanges(org.apache.hadoop.conf.Configuration conf, Collection<Range> ranges)
          Set the ranges to map over for this configuration object.
static void setRanges(org.apache.hadoop.mapreduce.JobContext job, Collection<Range> ranges)
          Deprecated. Use setRanges(Configuration,Collection) instead
static void setRegex(org.apache.hadoop.mapreduce.JobContext job, InputFormatBase.RegexType type, String regex)
          Deprecated. since 1.4 use addIterator(Configuration, IteratorSetting)
static void setScanOffline(org.apache.hadoop.conf.Configuration conf, boolean scanOff)
           Enable reading offline tables.
static void setZooKeeperInstance(org.apache.hadoop.conf.Configuration conf, String instanceName, String zooKeepers)
          Configure a ZooKeeperInstance for this configuration object.
static void setZooKeeperInstance(org.apache.hadoop.mapreduce.JobContext job, String instanceName, String zooKeepers)
          Deprecated. Use setZooKeeperInstance(Configuration,String,String) instead
protected static boolean usesLocalIterators(org.apache.hadoop.conf.Configuration conf)
          Determines whether a configuration uses local iterators.
protected static boolean usesLocalIterators(org.apache.hadoop.mapreduce.JobContext job)
          Deprecated. Use usesLocalIterators(Configuration) instead
protected static void validateOptions(org.apache.hadoop.conf.Configuration conf)
          Check whether a configuration is fully configured to be used with an Accumulo InputFormat.
protected static void validateOptions(org.apache.hadoop.mapreduce.JobContext job)
          Deprecated. Use validateOptions(Configuration) instead
 
Methods inherited from class org.apache.hadoop.mapreduce.InputFormat
createRecordReader
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

log

protected static final org.apache.log4j.Logger log
Constructor Detail

InputFormatBase

public InputFormatBase()
Method Detail

setIsolated

public static void setIsolated(org.apache.hadoop.mapreduce.JobContext job,
                               boolean enable)
Deprecated. Use setIsolated(Configuration,boolean) instead


setIsolated

public static void setIsolated(org.apache.hadoop.conf.Configuration conf,
                               boolean enable)
Enable or disable use of the IsolatedScanner in this configuration object. By default it is not enabled.

Parameters:
conf - The Hadoop configuration object
enable - if true, enable usage of the IsolatedScanner. Otherwise, disable.

setLocalIterators

public static void setLocalIterators(org.apache.hadoop.mapreduce.JobContext job,
                                     boolean enable)
Deprecated. Use setLocalIterators(Configuration,boolean) instead


setLocalIterators

public static void setLocalIterators(org.apache.hadoop.conf.Configuration conf,
                                     boolean enable)
Enable or disable use of the ClientSideIteratorScanner in this Configuration object. By default it is not enabled.

Parameters:
conf - The Hadoop configuration object
enable - if true, enable usage of the ClientSideIteratorScanner. Otherwise, disable.

setInputInfo

public static void setInputInfo(org.apache.hadoop.mapreduce.JobContext job,
                                String user,
                                byte[] passwd,
                                String table,
                                Authorizations auths)
Deprecated. Use setInputInfo(Configuration,String,byte[],String,Authorizations) instead


setInputInfo

public static void setInputInfo(org.apache.hadoop.conf.Configuration conf,
                                String user,
                                byte[] passwd,
                                String table,
                                Authorizations auths)
Initialize the user, table, and authorization information for the configuration object that will be used with an Accumulo InputFormat.

Parameters:
conf - the Hadoop configuration object
user - a valid accumulo user
passwd - the user's password
table - the table to read
auths - the authorizations used to restrict data read

setZooKeeperInstance

public static void setZooKeeperInstance(org.apache.hadoop.mapreduce.JobContext job,
                                        String instanceName,
                                        String zooKeepers)
Deprecated. Use setZooKeeperInstance(Configuration,String,String) instead


setZooKeeperInstance

public static void setZooKeeperInstance(org.apache.hadoop.conf.Configuration conf,
                                        String instanceName,
                                        String zooKeepers)
Configure a ZooKeeperInstance for this configuration object.

Parameters:
conf - the Hadoop configuration object
instanceName - the accumulo instance name
zooKeepers - a comma-separated list of zookeeper servers

setMockInstance

public static void setMockInstance(org.apache.hadoop.mapreduce.JobContext job,
                                   String instanceName)
Deprecated. Use setMockInstance(Configuration,String) instead


setMockInstance

public static void setMockInstance(org.apache.hadoop.conf.Configuration conf,
                                   String instanceName)
Configure a MockInstance for this configuration object.

Parameters:
conf - the Hadoop configuration object
instanceName - the accumulo instance name

setRanges

public static void setRanges(org.apache.hadoop.mapreduce.JobContext job,
                             Collection<Range> ranges)
Deprecated. Use setRanges(Configuration,Collection) instead


setRanges

public static void setRanges(org.apache.hadoop.conf.Configuration conf,
                             Collection<Range> ranges)
Set the ranges to map over for this configuration object.

Parameters:
conf - the Hadoop configuration object
ranges - the ranges that will be mapped over

disableAutoAdjustRanges

public static void disableAutoAdjustRanges(org.apache.hadoop.mapreduce.JobContext job)
Deprecated. Use disableAutoAdjustRanges(Configuration) instead


disableAutoAdjustRanges

public static void disableAutoAdjustRanges(org.apache.hadoop.conf.Configuration conf)
Disables the adjustment of ranges for this configuration object. By default, overlapping ranges will be merged and ranges will be fit to existing tablet boundaries. Disabling this adjustment will cause there to be exactly one mapper per range set using setRanges(Configuration, Collection).

Parameters:
conf - the Hadoop configuration object
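
To make "overlapping ranges will be merged" concrete, here is a minimal stand-alone sketch of interval merging. It is not Accumulo's implementation: RowRange is a hypothetical stand-in for Range that models only inclusive row intervals, and mergeOverlapping is written for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

final class RangeMergeSketch {
    /** Hypothetical stand-in for an Accumulo Range: an inclusive interval of row IDs. */
    static final class RowRange {
        final String start;
        final String end;

        RowRange(String start, String end) {
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return "[" + start + "," + end + "]";
        }
    }

    /** Sorts by start row, then folds each range into the previous one when they overlap. */
    static List<RowRange> mergeOverlapping(List<RowRange> ranges) {
        List<RowRange> sorted = new ArrayList<>(ranges);
        sorted.sort(Comparator.comparing((RowRange r) -> r.start));
        List<RowRange> merged = new ArrayList<>();
        for (RowRange r : sorted) {
            if (!merged.isEmpty()) {
                RowRange last = merged.get(merged.size() - 1);
                if (r.start.compareTo(last.end) <= 0) {
                    // Overlap: extend the previous range to the later of the two end rows.
                    String end = last.end.compareTo(r.end) >= 0 ? last.end : r.end;
                    merged.set(merged.size() - 1, new RowRange(last.start, end));
                    continue;
                }
            }
            merged.add(r);
        }
        return merged;
    }

    public static void main(String[] args) {
        List<RowRange> input = Arrays.asList(
                new RowRange("a", "f"), new RowRange("d", "k"), new RowRange("p", "t"));
        System.out.println(mergeOverlapping(input)); // [[a,k], [p,t]]
    }
}
```

With auto-adjustment disabled, the three input ranges in this model would stay as they are, giving one mapper per range.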

setRegex

public static void setRegex(org.apache.hadoop.mapreduce.JobContext job,
                            InputFormatBase.RegexType type,
                            String regex)
Deprecated. since 1.4 use addIterator(Configuration, IteratorSetting)

Parameters:
job -
type -
regex -
See Also:
RegExFilter.setRegexs(IteratorSetting, String, String, String, String, boolean)

setMaxVersions

public static void setMaxVersions(org.apache.hadoop.mapreduce.JobContext job,
                                  int maxVersions)
                           throws IOException
Deprecated. Use setMaxVersions(Configuration,int) instead

Throws:
IOException

setMaxVersions

public static void setMaxVersions(org.apache.hadoop.conf.Configuration conf,
                                  int maxVersions)
                           throws IOException
Sets the maximum number of values that may be returned for an individual Accumulo cell. By default, this limit is applied before all other Accumulo iterators (highest priority) used in the scan by the record reader. To adjust the priority, use setIterator() and setIteratorOption() with the VersioningIterator type explicitly.

Parameters:
conf - the Hadoop configuration object
maxVersions - the max number of versions per accumulo cell
Throws:
IOException - if maxVersions is < 1
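
The effect of the limit can be pictured with a small stand-alone model. This is a sketch under assumptions: limitVersions is a hypothetical helper, the versions of one cell are assumed to arrive newest first, and the real work is done by the VersioningIterator on the scan rather than by code like this.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class MaxVersionsSketch {
    /**
     * Keeps at most maxVersions values of a single cell.
     * Rejects limits below 1, mirroring the documented IOException.
     */
    static <V> List<V> limitVersions(List<V> newestFirst, int maxVersions) throws IOException {
        if (maxVersions < 1) {
            throw new IOException("maxVersions must be at least 1, got " + maxVersions);
        }
        int keep = Math.min(maxVersions, newestFirst.size());
        return new ArrayList<>(newestFirst.subList(0, keep));
    }

    public static void main(String[] args) throws IOException {
        List<String> versions = Arrays.asList("v3", "v2", "v1");
        System.out.println(limitVersions(versions, 2)); // [v3, v2]
    }
}
```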

setScanOffline

public static void setScanOffline(org.apache.hadoop.conf.Configuration conf,
                                  boolean scanOff)

Enable reading offline tables. This will make the map reduce job directly read the table's files. If the table is not offline, then the job will fail. If the table comes online during the map reduce job, it is likely that the job will fail.

To use this option, the map reduce user will need access to read the accumulo directory in HDFS.

Reading the offline table will create the scan time iterator stack in the map process. So any iterators that are configured for the table will need to be on the mapper's classpath. The accumulo-site.xml may need to be on the mapper's classpath if HDFS or the accumulo directory in HDFS are non-standard.

One way to use this feature is to clone a table, take the clone offline, and use the clone as the input table for a map reduce job. If you plan to map reduce over the data many times, it may be better to compact the table, clone it, take it offline, and use the clone for all map reduce jobs. The reason to do this is that compaction will reduce each tablet in the table to one file, and it is faster to read from one file.

There are two possible advantages to reading a table's files directly out of HDFS. First, you may see better read performance. Second, it will support speculative execution better. When reading an online table, speculative execution can put more load on an already slow tablet server.

Parameters:
conf - the Hadoop configuration object
scanOff - pass true to read offline tables

fetchColumns

public static void fetchColumns(org.apache.hadoop.mapreduce.JobContext job,
                                Collection<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> columnFamilyColumnQualifierPairs)
Deprecated. Use fetchColumns(Configuration,Collection) instead


fetchColumns

public static void fetchColumns(org.apache.hadoop.conf.Configuration conf,
                                Collection<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> columnFamilyColumnQualifierPairs)
Restricts the columns that will be mapped over for this configuration object.

Parameters:
conf - the Hadoop configuration object
columnFamilyColumnQualifierPairs - A collection of pairs of Text objects, each corresponding to a column family and column qualifier. If the column qualifier is null, the entire column family is selected. An empty set is the default and is equivalent to scanning all columns.
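
The selection rules for the pairs (null qualifier selects the whole family, empty set selects everything) can be sketched as a predicate. This is an illustrative model only: it uses Map.Entry<String,String> as a stand-in for Pair<Text,Text>, and column and isFetched are hypothetical helpers, not part of the Accumulo API.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Map;

final class FetchColumnsSketch {
    /** Builds a (family, qualifier) pair; the qualifier may be null. */
    static Map.Entry<String, String> column(String family, String qualifier) {
        return new SimpleImmutableEntry<>(family, qualifier);
    }

    /** Returns true if a cell with the given family and qualifier would be mapped over. */
    static boolean isFetched(Collection<Map.Entry<String, String>> fetched,
                             String family, String qualifier) {
        if (fetched.isEmpty()) {
            return true; // empty set is the default: all columns
        }
        for (Map.Entry<String, String> pair : fetched) {
            boolean familyMatches = pair.getKey().equals(family);
            // A null qualifier selects the entire column family.
            boolean qualifierMatches = pair.getValue() == null || pair.getValue().equals(qualifier);
            if (familyMatches && qualifierMatches) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Collection<Map.Entry<String, String>> fetched =
                Arrays.asList(column("meta", null), column("attr", "size"));
        System.out.println(isFetched(fetched, "meta", "owner")); // true
        System.out.println(isFetched(fetched, "attr", "size"));  // true
        System.out.println(isFetched(fetched, "attr", "color")); // false
        System.out.println(isFetched(Collections.emptyList(), "any", "thing")); // true
    }
}
```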

setLogLevel

public static void setLogLevel(org.apache.hadoop.mapreduce.JobContext job,
                               org.apache.log4j.Level level)
Deprecated. Use setLogLevel(Configuration,Level) instead


setLogLevel

public static void setLogLevel(org.apache.hadoop.conf.Configuration conf,
                               org.apache.log4j.Level level)
Sets the log level for this configuration object.

Parameters:
conf - the Hadoop configuration object
level - the logging level

addIterator

public static void addIterator(org.apache.hadoop.mapreduce.JobContext job,
                               IteratorSetting cfg)
Deprecated. Use addIterator(Configuration,IteratorSetting) instead


addIterator

public static void addIterator(org.apache.hadoop.conf.Configuration conf,
                               IteratorSetting cfg)
Encode an iterator on the input for this configuration object.

Parameters:
conf - The Hadoop configuration in which to save the iterator configuration
cfg - The configuration of the iterator

setIterator

public static void setIterator(org.apache.hadoop.mapreduce.JobContext job,
                               int priority,
                               String iteratorClass,
                               String iteratorName)
Deprecated. since 1.4, see addIterator(Configuration, IteratorSetting)

Specify an Accumulo iterator type to manage the behavior of the underlying table scan this InputFormat's RecordReader will conduct, with priority dictating the order in which specified iterators are applied. Repeat calls to specify multiple iterators are allowed.

Parameters:
job - the job
priority - the priority
iteratorClass - the iterator class
iteratorName - the iterator name

setIteratorOption

public static void setIteratorOption(org.apache.hadoop.mapreduce.JobContext job,
                                     String iteratorName,
                                     String key,
                                     String value)
Deprecated. since 1.4, see addIterator(Configuration, IteratorSetting)

Specify an option for a named Accumulo iterator, further specifying that iterator's behavior.

Parameters:
job - the job
iteratorName - the iterator name. Should correspond to an iterator set with a prior setIterator call.
key - the key
value - the value

isIsolated

protected static boolean isIsolated(org.apache.hadoop.mapreduce.JobContext job)
Deprecated. Use isIsolated(Configuration) instead


isIsolated

protected static boolean isIsolated(org.apache.hadoop.conf.Configuration conf)
Determines whether a configuration has isolation enabled.

Parameters:
conf - the Hadoop configuration object
Returns:
true if isolation is enabled, false otherwise
See Also:
setIsolated(Configuration, boolean)

usesLocalIterators

protected static boolean usesLocalIterators(org.apache.hadoop.mapreduce.JobContext job)
Deprecated. Use usesLocalIterators(Configuration) instead


usesLocalIterators

protected static boolean usesLocalIterators(org.apache.hadoop.conf.Configuration conf)
Determines whether a configuration uses local iterators.

Parameters:
conf - the Hadoop configuration object
Returns:
true if local iterators are used, false otherwise
See Also:
setLocalIterators(Configuration, boolean)

getUsername

protected static String getUsername(org.apache.hadoop.mapreduce.JobContext job)
Deprecated. Use getUsername(Configuration) instead


getUsername

protected static String getUsername(org.apache.hadoop.conf.Configuration conf)
Gets the user name from the configuration.

Parameters:
conf - the Hadoop configuration object
Returns:
the user name
See Also:
setInputInfo(Configuration, String, byte[], String, Authorizations)

getPassword

protected static byte[] getPassword(org.apache.hadoop.mapreduce.JobContext job)
Deprecated. Use getPassword(Configuration) instead

WARNING: The password is stored in the Configuration and shared with all MapReduce tasks. It is BASE64 encoded to provide a charset-safe conversion to a string, and is not intended to be secure.


getPassword

protected static byte[] getPassword(org.apache.hadoop.conf.Configuration conf)
Gets the password from the configuration. WARNING: The password is stored in the Configuration and shared with all MapReduce tasks. It is BASE64 encoded to provide a charset-safe conversion to a string, and is not intended to be secure.

Parameters:
conf - the Hadoop configuration object
Returns:
the BASE64-encoded password
See Also:
setInputInfo(Configuration, String, byte[], String, Authorizations)
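
The warning above follows from the nature of Base64: it is an encoding, not encryption, so anyone who can read the job configuration can recover the password. The sketch below shows the round trip with java.util.Base64 (Java 8 or later). The class and method names are hypothetical, and this page does not specify which Base64 implementation Accumulo itself uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

final class PasswordEncodingSketch {
    /** Encodes raw password bytes as a charset-safe string, as a configuration value might hold it. */
    static String encode(byte[] password) {
        return Base64.getEncoder().encodeToString(password);
    }

    /** Reverses the encoding; no key or secret is needed. */
    static byte[] decode(String stored) {
        return Base64.getDecoder().decode(stored);
    }

    public static void main(String[] args) {
        byte[] password = "secret".getBytes(StandardCharsets.UTF_8);
        String stored = encode(password);
        System.out.println(stored); // c2VjcmV0
        System.out.println(Arrays.equals(password, decode(stored))); // true
    }
}
```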

getTablename

protected static String getTablename(org.apache.hadoop.mapreduce.JobContext job)
Deprecated. Use getTablename(Configuration) instead


getTablename

protected static String getTablename(org.apache.hadoop.conf.Configuration conf)
Gets the table name from the configuration.

Parameters:
conf - the Hadoop configuration object
Returns:
the table name
See Also:
setInputInfo(Configuration, String, byte[], String, Authorizations)

getAuthorizations

protected static Authorizations getAuthorizations(org.apache.hadoop.mapreduce.JobContext job)
Deprecated. Use getAuthorizations(Configuration) instead


getAuthorizations

protected static Authorizations getAuthorizations(org.apache.hadoop.conf.Configuration conf)
Gets the authorizations to set for the scans from the configuration.

Parameters:
conf - the Hadoop configuration object
Returns:
the accumulo scan authorizations
See Also:
setInputInfo(Configuration, String, byte[], String, Authorizations)

getInstance

protected static Instance getInstance(org.apache.hadoop.mapreduce.JobContext job)
Deprecated. Use getInstance(Configuration) instead


getInstance

protected static Instance getInstance(org.apache.hadoop.conf.Configuration conf)
Initializes an Accumulo Instance based on the configuration.

Parameters:
conf - the Hadoop configuration object
Returns:
an accumulo instance
See Also:
setZooKeeperInstance(Configuration, String, String), setMockInstance(Configuration, String)

getTabletLocator

protected static TabletLocator getTabletLocator(org.apache.hadoop.mapreduce.JobContext job)
                                         throws TableNotFoundException
Deprecated. Use getTabletLocator(Configuration) instead

Throws:
TableNotFoundException

getTabletLocator

protected static TabletLocator getTabletLocator(org.apache.hadoop.conf.Configuration conf)
                                         throws TableNotFoundException
Initializes an Accumulo TabletLocator based on the configuration.

Parameters:
conf - the Hadoop configuration object
Returns:
an accumulo tablet locator
Throws:
TableNotFoundException - if the table name set on the configuration doesn't exist

getRanges

protected static List<Range> getRanges(org.apache.hadoop.mapreduce.JobContext job)
                                throws IOException
Deprecated. Use getRanges(Configuration) instead

Throws:
IOException

getRanges

protected static List<Range> getRanges(org.apache.hadoop.conf.Configuration conf)
                                throws IOException
Gets the ranges to scan over from a configuration object.

Parameters:
conf - the Hadoop configuration object
Returns:
the ranges
Throws:
IOException - if the ranges have been encoded improperly
See Also:
setRanges(Configuration, Collection)

getRegex

protected static String getRegex(org.apache.hadoop.mapreduce.JobContext job,
                                 InputFormatBase.RegexType type)
Deprecated. since 1.4 use RegExFilter and addIterator(Configuration, IteratorSetting)

See Also:
setRegex(JobContext, RegexType, String)

getFetchedColumns

protected static Set<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> getFetchedColumns(org.apache.hadoop.mapreduce.JobContext job)
Deprecated. Use getFetchedColumns(Configuration) instead


getFetchedColumns

protected static Set<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> getFetchedColumns(org.apache.hadoop.conf.Configuration conf)
Gets the columns to be mapped over from this configuration object.

Parameters:
conf - the Hadoop configuration object
Returns:
a set of columns
See Also:
fetchColumns(Configuration, Collection)

getAutoAdjustRanges

protected static boolean getAutoAdjustRanges(org.apache.hadoop.mapreduce.JobContext job)
Deprecated. Use getAutoAdjustRanges(Configuration) instead


getAutoAdjustRanges

protected static boolean getAutoAdjustRanges(org.apache.hadoop.conf.Configuration conf)
Determines whether a configuration has auto-adjust ranges enabled.

Parameters:
conf - the Hadoop configuration object
Returns:
true if auto-adjust is enabled, false otherwise
See Also:
disableAutoAdjustRanges(Configuration)

getLogLevel

protected static org.apache.log4j.Level getLogLevel(org.apache.hadoop.mapreduce.JobContext job)
Deprecated. Use getLogLevel(Configuration) instead


getLogLevel

protected static org.apache.log4j.Level getLogLevel(org.apache.hadoop.conf.Configuration conf)
Gets the log level from this configuration.

Parameters:
conf - the Hadoop configuration object
Returns:
the log level
See Also:
setLogLevel(Configuration, Level)

validateOptions

protected static void validateOptions(org.apache.hadoop.mapreduce.JobContext job)
                               throws IOException
Deprecated. Use validateOptions(Configuration) instead

Throws:
IOException

validateOptions

protected static void validateOptions(org.apache.hadoop.conf.Configuration conf)
                               throws IOException
Check whether a configuration is fully configured to be used with an Accumulo InputFormat.

Parameters:
conf - the Hadoop configuration object
Throws:
IOException - if the configuration is improperly configured

getMaxVersions

protected static int getMaxVersions(org.apache.hadoop.mapreduce.JobContext job)
Deprecated. Use getMaxVersions(Configuration) instead


getMaxVersions

protected static int getMaxVersions(org.apache.hadoop.conf.Configuration conf)
Gets the maxVersions to use for the VersioningIterator from this configuration.

Parameters:
conf - the Hadoop configuration object
Returns:
the max versions, -1 if not configured
See Also:
setMaxVersions(Configuration, int)

isOfflineScan

protected static boolean isOfflineScan(org.apache.hadoop.conf.Configuration conf)

getIterators

protected static List<org.apache.accumulo.core.client.mapreduce.InputFormatBase.AccumuloIterator> getIterators(org.apache.hadoop.mapreduce.JobContext job)
Deprecated. Use getIterators(Configuration) instead


getIterators

protected static List<org.apache.accumulo.core.client.mapreduce.InputFormatBase.AccumuloIterator> getIterators(org.apache.hadoop.conf.Configuration conf)
Gets a list of the iterator settings (for iterators to apply to a scanner) from this configuration.

Parameters:
conf - the Hadoop configuration object
Returns:
a list of iterators
See Also:
addIterator(Configuration, IteratorSetting)

getIteratorOptions

protected static List<org.apache.accumulo.core.client.mapreduce.InputFormatBase.AccumuloIteratorOption> getIteratorOptions(org.apache.hadoop.mapreduce.JobContext job)
Deprecated. Use getIteratorOptions(Configuration) instead


getIteratorOptions

protected static List<org.apache.accumulo.core.client.mapreduce.InputFormatBase.AccumuloIteratorOption> getIteratorOptions(org.apache.hadoop.conf.Configuration conf)
Gets a list of the iterator options specified on this configuration.

Parameters:
conf - the Hadoop configuration object
Returns:
a list of iterator options
See Also:
addIterator(Configuration, IteratorSetting)

getSplits

public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext job)
                                                       throws IOException
Read the metadata table to get tablets and match up ranges to them.

Specified by:
getSplits in class org.apache.hadoop.mapreduce.InputFormat<K,V>
Throws:
IOException
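
As a rough picture of "match up ranges to them", the sketch below finds which tablets a row range touches. It is a simplified model, not the getSplits implementation: tablets are identified only by their end rows, a tablet is assumed to cover the rows after the previous end row up to and including its own, and LAST_TABLET and tabletsFor are hypothetical names. The real method reads tablet locations from the metadata table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

final class TabletMatchSketch {
    /** Marker for the final tablet, which has no end row in this model. */
    static final String LAST_TABLET = "<last>";

    /**
     * Returns the end rows of the tablets touched by the inclusive range
     * [startRow, endRow]. Assumes startRow <= endRow.
     */
    static List<String> tabletsFor(NavigableSet<String> tabletEndRows,
                                   String startRow, String endRow) {
        List<String> touched = new ArrayList<>();
        // The first tablet containing startRow is the one with the smallest end row >= startRow.
        for (String tabletEnd : tabletEndRows.tailSet(startRow, true)) {
            touched.add(tabletEnd);
            if (tabletEnd.compareTo(endRow) >= 0) {
                return touched; // this tablet also contains endRow
            }
        }
        touched.add(LAST_TABLET); // the range runs past every split point
        return touched;
    }

    public static void main(String[] args) {
        NavigableSet<String> endRows = new TreeSet<>(Arrays.asList("g", "n", "t"));
        System.out.println(tabletsFor(endRows, "e", "p")); // [g, n, t]
        System.out.println(tabletsFor(endRows, "a", "c")); // [g]
        System.out.println(tabletsFor(endRows, "u", "z")); // [<last>]
    }
}
```

In this model, each (range, tablet) pairing would correspond to one input split when ranges are fit to tablet boundaries.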


Copyright © 2013 The Apache Software Foundation. All Rights Reserved.