InputFormatBase (accumulo-core 1.4.3 API)

org.apache.accumulo.core.client.mapreduce
Class InputFormatBase<K,V>

java.lang.Object
  org.apache.hadoop.mapreduce.InputFormat<K,V>
      org.apache.accumulo.core.client.mapreduce.InputFormatBase<K,V>

Direct Known Subclasses:: AccumuloInputFormat, AccumuloRowInputFormat

public abstract class InputFormatBase<K,V>
extends org.apache.hadoop.mapreduce.InputFormat<K,V>

This class allows MapReduce jobs to use Accumulo as the source of data. This input format provides keys and values of type K and V to the Map() and Reduce() functions.

Subclasses must implement the following method:

  public RecordReader createRecordReader(InputSplit split, TaskAttemptContext context) throws IOException, InterruptedException

This class includes a static class that can be used to create a RecordReader:

  protected abstract static class RecordReaderBase extends RecordReader

Subclasses of RecordReaderBase must implement the following method:

  public boolean nextKeyValue() throws IOException, InterruptedException

This method should set the following variables:

  K currentK
  V currentV
  Key currentKey (used for progress reporting)
  int numKeysRead (used for progress reporting)

See AccumuloInputFormat for an example implementation. Other static methods are optional.

Nested Class Summary
`static class`	`InputFormatBase.RangeInputSplit` The Class RangeInputSplit.
`protected static class`	`InputFormatBase.RecordReaderBase<K,V>`
`static class`	`InputFormatBase.RegexType` Deprecated. since 1.4 use `RegExFilter` and `addIterator(Configuration, IteratorSetting)`

Field Summary
`protected static org.apache.log4j.Logger`	`log`

Constructor Summary
`InputFormatBase()`

Method Summary
`static void`	`addIterator(org.apache.hadoop.conf.Configuration conf, IteratorSetting cfg)` Encode an iterator on the input for this configuration object.
`static void`	`addIterator(org.apache.hadoop.mapreduce.JobContext job, IteratorSetting cfg)` Deprecated. Use `addIterator(Configuration,IteratorSetting)` instead
`static void`	`disableAutoAdjustRanges(org.apache.hadoop.conf.Configuration conf)` Disables the adjustment of ranges for this configuration object.
`static void`	`disableAutoAdjustRanges(org.apache.hadoop.mapreduce.JobContext job)` Deprecated. Use `disableAutoAdjustRanges(Configuration)` instead
`static void`	`fetchColumns(org.apache.hadoop.conf.Configuration conf, Collection<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> columnFamilyColumnQualifierPairs)` Restricts the columns that will be mapped over for this configuration object.
`static void`	`fetchColumns(org.apache.hadoop.mapreduce.JobContext job, Collection<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> columnFamilyColumnQualifierPairs)` Deprecated. Use `fetchColumns(Configuration,Collection)` instead
`protected static Authorizations`	`getAuthorizations(org.apache.hadoop.conf.Configuration conf)` Gets the authorizations to set for the scans from the configuration.
`protected static Authorizations`	`getAuthorizations(org.apache.hadoop.mapreduce.JobContext job)` Deprecated. Use `getAuthorizations(Configuration)` instead
`protected static boolean`	`getAutoAdjustRanges(org.apache.hadoop.conf.Configuration conf)` Determines whether a configuration has auto-adjust ranges enabled.
`protected static boolean`	`getAutoAdjustRanges(org.apache.hadoop.mapreduce.JobContext job)` Deprecated. Use `getAutoAdjustRanges(Configuration)` instead
`protected static Set<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>>`	`getFetchedColumns(org.apache.hadoop.conf.Configuration conf)` Gets the columns to be mapped over from this configuration object.
`protected static Set<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>>`	`getFetchedColumns(org.apache.hadoop.mapreduce.JobContext job)` Deprecated. Use `getFetchedColumns(Configuration)` instead
`protected static Instance`	`getInstance(org.apache.hadoop.conf.Configuration conf)` Initializes an Accumulo `Instance` based on the configuration.
`protected static Instance`	`getInstance(org.apache.hadoop.mapreduce.JobContext job)` Deprecated. Use `getInstance(Configuration)` instead
`protected static List<org.apache.accumulo.core.client.mapreduce.InputFormatBase.AccumuloIteratorOption>`	`getIteratorOptions(org.apache.hadoop.conf.Configuration conf)` Gets a list of the iterator options specified on this configuration.
`protected static List<org.apache.accumulo.core.client.mapreduce.InputFormatBase.AccumuloIteratorOption>`	`getIteratorOptions(org.apache.hadoop.mapreduce.JobContext job)` Deprecated. Use `getIteratorOptions(Configuration)` instead
`protected static List<org.apache.accumulo.core.client.mapreduce.InputFormatBase.AccumuloIterator>`	`getIterators(org.apache.hadoop.conf.Configuration conf)` Gets a list of the iterator settings (for iterators to apply to a scanner) from this configuration.
`protected static List<org.apache.accumulo.core.client.mapreduce.InputFormatBase.AccumuloIterator>`	`getIterators(org.apache.hadoop.mapreduce.JobContext job)` Deprecated. Use `getIterators(Configuration)` instead
`protected static org.apache.log4j.Level`	`getLogLevel(org.apache.hadoop.conf.Configuration conf)` Gets the log level from this configuration.
`protected static org.apache.log4j.Level`	`getLogLevel(org.apache.hadoop.mapreduce.JobContext job)` Deprecated. Use `getLogLevel(Configuration)` instead
`protected static int`	`getMaxVersions(org.apache.hadoop.conf.Configuration conf)` Gets the maxVersions to use for the `VersioningIterator` from this configuration.
`protected static int`	`getMaxVersions(org.apache.hadoop.mapreduce.JobContext job)` Deprecated. Use `getMaxVersions(Configuration)` instead
`protected static byte[]`	`getPassword(org.apache.hadoop.conf.Configuration conf)` Gets the password from the configuration.
`protected static byte[]`	`getPassword(org.apache.hadoop.mapreduce.JobContext job)` Deprecated. Use `getPassword(Configuration)` instead
`protected static List<Range>`	`getRanges(org.apache.hadoop.conf.Configuration conf)` Gets the ranges to scan over from a configuration object.
`protected static List<Range>`	`getRanges(org.apache.hadoop.mapreduce.JobContext job)` Deprecated. Use `getRanges(Configuration)` instead
`protected static String`	`getRegex(org.apache.hadoop.mapreduce.JobContext job, InputFormatBase.RegexType type)` Deprecated. since 1.4 use `RegExFilter` and `addIterator(Configuration, IteratorSetting)`
`List<org.apache.hadoop.mapreduce.InputSplit>`	`getSplits(org.apache.hadoop.mapreduce.JobContext job)` Read the metadata table to get tablets and match up ranges to them.
`protected static String`	`getTablename(org.apache.hadoop.conf.Configuration conf)` Gets the table name from the configuration.
`protected static String`	`getTablename(org.apache.hadoop.mapreduce.JobContext job)` Deprecated. Use `getTablename(Configuration)` instead
`protected static TabletLocator`	`getTabletLocator(org.apache.hadoop.conf.Configuration conf)` Initializes an Accumulo `TabletLocator` based on the configuration.
`protected static TabletLocator`	`getTabletLocator(org.apache.hadoop.mapreduce.JobContext job)` Deprecated. Use `getTabletLocator(Configuration)` instead
`protected static String`	`getUsername(org.apache.hadoop.conf.Configuration conf)` Gets the user name from the configuration.
`protected static String`	`getUsername(org.apache.hadoop.mapreduce.JobContext job)` Deprecated. Use `getUsername(Configuration)` instead
`protected static boolean`	`isIsolated(org.apache.hadoop.conf.Configuration conf)` Determines whether a configuration has isolation enabled.
`protected static boolean`	`isIsolated(org.apache.hadoop.mapreduce.JobContext job)` Deprecated. Use `isIsolated(Configuration)` instead
`protected static boolean`	`isOfflineScan(org.apache.hadoop.conf.Configuration conf)`
`static void`	`setInputInfo(org.apache.hadoop.conf.Configuration conf, String user, byte[] passwd, String table, Authorizations auths)` Initialize the user, table, and authorization information for the configuration object that will be used with an Accumulo InputFormat.
`static void`	`setInputInfo(org.apache.hadoop.mapreduce.JobContext job, String user, byte[] passwd, String table, Authorizations auths)` Deprecated. Use `setInputInfo(Configuration,String,byte[],String,Authorizations)` instead
`static void`	`setIsolated(org.apache.hadoop.conf.Configuration conf, boolean enable)` Enable or disable use of the `IsolatedScanner` in this configuration object.
`static void`	`setIsolated(org.apache.hadoop.mapreduce.JobContext job, boolean enable)` Deprecated. Use `setIsolated(Configuration,boolean)` instead
`static void`	`setIterator(org.apache.hadoop.mapreduce.JobContext job, int priority, String iteratorClass, String iteratorName)` Deprecated. since 1.4, see `addIterator(Configuration, IteratorSetting)`
`static void`	`setIteratorOption(org.apache.hadoop.mapreduce.JobContext job, String iteratorName, String key, String value)` Deprecated. since 1.4, see `addIterator(Configuration, IteratorSetting)`
`static void`	`setLocalIterators(org.apache.hadoop.conf.Configuration conf, boolean enable)` Enable or disable use of the `ClientSideIteratorScanner` in this Configuration object.
`static void`	`setLocalIterators(org.apache.hadoop.mapreduce.JobContext job, boolean enable)` Deprecated. Use `setLocalIterators(Configuration,boolean)` instead
`static void`	`setLogLevel(org.apache.hadoop.conf.Configuration conf, org.apache.log4j.Level level)` Sets the log level for this configuration object.
`static void`	`setLogLevel(org.apache.hadoop.mapreduce.JobContext job, org.apache.log4j.Level level)` Deprecated. Use `setLogLevel(Configuration,Level)` instead
`static void`	`setMaxVersions(org.apache.hadoop.conf.Configuration conf, int maxVersions)` Sets the maximum number of values that may be returned for an individual Accumulo cell.
`static void`	`setMaxVersions(org.apache.hadoop.mapreduce.JobContext job, int maxVersions)` Deprecated. Use `setMaxVersions(Configuration,int)` instead
`static void`	`setMockInstance(org.apache.hadoop.conf.Configuration conf, String instanceName)` Configure a `MockInstance` for this configuration object.
`static void`	`setMockInstance(org.apache.hadoop.mapreduce.JobContext job, String instanceName)` Deprecated. Use `setMockInstance(Configuration,String)` instead
`static void`	`setRanges(org.apache.hadoop.conf.Configuration conf, Collection<Range> ranges)` Set the ranges to map over for this configuration object.
`static void`	`setRanges(org.apache.hadoop.mapreduce.JobContext job, Collection<Range> ranges)` Deprecated. Use `setRanges(Configuration,Collection)` instead
`static void`	`setRegex(org.apache.hadoop.mapreduce.JobContext job, InputFormatBase.RegexType type, String regex)` Deprecated. since 1.4 use `addIterator(Configuration, IteratorSetting)`
`static void`	`setScanOffline(org.apache.hadoop.conf.Configuration conf, boolean scanOff)` Enable reading offline tables.
`static void`	`setZooKeeperInstance(org.apache.hadoop.conf.Configuration conf, String instanceName, String zooKeepers)` Configure a `ZooKeeperInstance` for this configuration object.
`static void`	`setZooKeeperInstance(org.apache.hadoop.mapreduce.JobContext job, String instanceName, String zooKeepers)` Deprecated. Use `setZooKeeperInstance(Configuration,String,String)` instead
`protected static boolean`	`usesLocalIterators(org.apache.hadoop.conf.Configuration conf)` Determines whether a configuration uses local iterators.
`protected static boolean`	`usesLocalIterators(org.apache.hadoop.mapreduce.JobContext job)` Deprecated. Use `usesLocalIterators(Configuration)` instead
`protected static void`	`validateOptions(org.apache.hadoop.conf.Configuration conf)` Check whether a configuration is fully configured to be used with an Accumulo `InputFormat`.
`protected static void`	`validateOptions(org.apache.hadoop.mapreduce.JobContext job)` Deprecated. Use `validateOptions(Configuration)` instead

Methods inherited from class org.apache.hadoop.mapreduce.InputFormat
`createRecordReader`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

log

protected static final org.apache.log4j.Logger log

Constructor Detail

InputFormatBase

public InputFormatBase()

Method Detail

setIsolated

public static void setIsolated(org.apache.hadoop.mapreduce.JobContext job,
                               boolean enable)

Deprecated. Use setIsolated(Configuration,boolean) instead

setIsolated

public static void setIsolated(org.apache.hadoop.conf.Configuration conf,
                               boolean enable)

Enable or disable use of the IsolatedScanner in this configuration object. By default it is not enabled.

Parameters:: conf - The Hadoop configuration object; enable - if true, enable usage of the IsolatedScanner. Otherwise, disable.

setLocalIterators

public static void setLocalIterators(org.apache.hadoop.mapreduce.JobContext job,
                                     boolean enable)

Deprecated. Use setLocalIterators(Configuration,boolean) instead

setLocalIterators

public static void setLocalIterators(org.apache.hadoop.conf.Configuration conf,
                                     boolean enable)

Enable or disable use of the ClientSideIteratorScanner in this Configuration object. By default it is not enabled.

Parameters:: conf - The Hadoop configuration object; enable - if true, enable usage of the ClientSideIteratorScanner. Otherwise, disable.

setInputInfo

public static void setInputInfo(org.apache.hadoop.mapreduce.JobContext job,
                                String user,
                                byte[] passwd,
                                String table,
                                Authorizations auths)

Deprecated. Use setInputInfo(Configuration,String,byte[],String,Authorizations) instead

setInputInfo

public static void setInputInfo(org.apache.hadoop.conf.Configuration conf,
                                String user,
                                byte[] passwd,
                                String table,
                                Authorizations auths)

Initialize the user, table, and authorization information for the configuration object that will be used with an Accumulo InputFormat.

Parameters:: conf - the Hadoop configuration object; user - a valid accumulo user; passwd - the user's password; table - the table to read; auths - the authorizations used to restrict data read

setZooKeeperInstance

public static void setZooKeeperInstance(org.apache.hadoop.mapreduce.JobContext job,
                                        String instanceName,
                                        String zooKeepers)

Deprecated. Use setZooKeeperInstance(Configuration,String,String) instead

setZooKeeperInstance

public static void setZooKeeperInstance(org.apache.hadoop.conf.Configuration conf,
                                        String instanceName,
                                        String zooKeepers)

Configure a ZooKeeperInstance for this configuration object.

Parameters:: conf - the Hadoop configuration object; instanceName - the accumulo instance name; zooKeepers - a comma-separated list of zookeeper servers
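The zooKeepers argument is a single comma-separated string. A minimal sketch of building it from a list of servers follows; the class name ZooKeeperListSketch and the host:port entries used with it are hypothetical, and only the JDK is used.

```java
import java.util.List;

final class ZooKeeperListSketch {
    /**
     * Joins server entries into the comma-separated form described
     * for the zooKeepers parameter. The entries themselves are
     * supplied by the caller and are not validated here.
     */
    static String toZooKeepersString(List<String> servers) {
        return String.join(",", servers);
    }
}
```

The resulting string could then be passed as the zooKeepers argument of setZooKeeperInstance(Configuration, String, String).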

setMockInstance

public static void setMockInstance(org.apache.hadoop.mapreduce.JobContext job,
                                   String instanceName)

Deprecated. Use setMockInstance(Configuration,String) instead

setMockInstance

public static void setMockInstance(org.apache.hadoop.conf.Configuration conf,
                                   String instanceName)

Configure a MockInstance for this configuration object.

Parameters:: conf - the Hadoop configuration object; instanceName - the accumulo instance name

setRanges

public static void setRanges(org.apache.hadoop.mapreduce.JobContext job,
                             Collection<Range> ranges)

Deprecated. Use setRanges(Configuration,Collection) instead

setRanges

public static void setRanges(org.apache.hadoop.conf.Configuration conf,
                             Collection<Range> ranges)

Set the ranges to map over for this configuration object.

Parameters:: conf - the Hadoop configuration object; ranges - the ranges that will be mapped over

disableAutoAdjustRanges

public static void disableAutoAdjustRanges(org.apache.hadoop.mapreduce.JobContext job)

Deprecated. Use disableAutoAdjustRanges(Configuration) instead

disableAutoAdjustRanges

public static void disableAutoAdjustRanges(org.apache.hadoop.conf.Configuration conf)

Disables the adjustment of ranges for this configuration object. By default, overlapping ranges will be merged and ranges will be fit to existing tablet boundaries. Disabling this adjustment results in exactly one mapper per range set using setRanges(Configuration, Collection).

Parameters:: conf - the Hadoop configuration object
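The merging half of the default adjustment can be pictured with a simplified model. The following is a minimal sketch, not Accumulo's implementation: it assumes a range can be stood in for by a closed integer interval (the hypothetical Interval class), and it does not model the fitting of ranges to tablet boundaries.

```java
import java.util.ArrayList;
import java.util.List;

final class RangeMergeSketch {
    /** Hypothetical closed interval [start, end], standing in for an Accumulo Range. */
    static final class Interval {
        final int start;
        final int end;

        Interval(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Sorts by start and folds every interval that overlaps or touches the previous one into it. */
    static List<Interval> mergeOverlapping(List<Interval> input) {
        List<Interval> sorted = new ArrayList<>(input);
        sorted.sort((a, b) -> Integer.compare(a.start, b.start));

        List<Interval> merged = new ArrayList<>();
        for (Interval next : sorted) {
            int lastIndex = merged.size() - 1;
            if (lastIndex >= 0 && next.start <= merged.get(lastIndex).end) {
                // Overlap: extend the previously merged interval.
                Interval last = merged.get(lastIndex);
                merged.set(lastIndex, new Interval(last.start, Math.max(last.end, next.end)));
            } else {
                merged.add(next);
            }
        }
        return merged;
    }
}
```

In this model, three overlapping input intervals collapse into one merged interval, which corresponds to fewer splits than input ranges. With auto-adjustment disabled, each input range would instead map to its own mapper.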

setRegex

public static void setRegex(org.apache.hadoop.mapreduce.JobContext job,
                            InputFormatBase.RegexType type,
                            String regex)

Deprecated. since 1.4 use addIterator(Configuration, IteratorSetting)

Parameters:: job -; type -; regex -
See Also:: RegExFilter.setRegexs(IteratorSetting, String, String, String, String, boolean)

setMaxVersions

public static void setMaxVersions(org.apache.hadoop.mapreduce.JobContext job,
                                  int maxVersions)
                           throws IOException

Deprecated. Use setMaxVersions(Configuration,int) instead

Throws:: IOException

setMaxVersions

public static void setMaxVersions(org.apache.hadoop.conf.Configuration conf,
                                  int maxVersions)
                           throws IOException

Sets the maximum number of values that may be returned for an individual Accumulo cell. By default, the VersioningIterator is applied before all other Accumulo iterators (highest priority) used in the scan by the record reader. To adjust the priority, use setIterator() and setIteratorOption() with the VersioningIterator type explicitly.

Parameters:: conf - the Hadoop configuration object; maxVersions - the max number of versions per accumulo cell
Throws:: IOException - if maxVersions is < 1
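The effect of a version limit can be illustrated without Accumulo. The following is a minimal sketch under two assumptions: the versions of a single cell are represented only by their timestamps, and the newest versions are the ones retained. The class MaxVersionsSketch and its method are hypothetical; the argument check mirrors the documented IOException for maxVersions < 1.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class MaxVersionsSketch {
    /**
     * Returns at most maxVersions timestamps of one cell, newest first.
     * Rejects maxVersions < 1 with an IOException, as documented above.
     */
    static List<Long> newestVersions(List<Long> timestamps, int maxVersions) throws IOException {
        if (maxVersions < 1) {
            throw new IOException("maxVersions must be >= 1, got " + maxVersions);
        }
        List<Long> sorted = new ArrayList<>(timestamps);
        sorted.sort(Collections.reverseOrder()); // newest (largest) timestamp first
        int keep = Math.min(maxVersions, sorted.size());
        return new ArrayList<>(sorted.subList(0, keep));
    }
}
```

With a limit of 1, only the most recent value of each cell would reach the mapper in this model.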

setScanOffline

public static void setScanOffline(org.apache.hadoop.conf.Configuration conf,
                                  boolean scanOff)

Enable reading offline tables. This will make the map reduce job directly read the table's files. If the table is not offline, then the job will fail. If the table comes online during the map reduce job, it's likely that the job will fail.

To use this option, the map reduce user will need access to read the accumulo directory in HDFS.

Reading the offline table will create the scan time iterator stack in the map process. So any iterators that are configured for the table will need to be on the mappers' classpath. The accumulo-site.xml may need to be on the mappers' classpath if HDFS or the accumulo directory in HDFS are non-standard.

One way to use this feature is to clone a table, take the clone offline, and use the clone as the input table for a map reduce job. If you plan to map reduce over the data many times, it may be better to compact the table, clone it, take it offline, and use the clone for all map reduce jobs. The reason to do this is that compaction will reduce each tablet in the table to one file, and it's faster to read from one file.

There are two possible advantages to reading a table's files directly out of HDFS. First, you may see better read performance. Second, it will support speculative execution better. When reading an online table, speculative execution can put more load on an already slow tablet server.

Parameters:: conf - the Hadoop configuration object; scanOff - pass true to read offline tables

fetchColumns

public static void fetchColumns(org.apache.hadoop.mapreduce.JobContext job,
                                Collection<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> columnFamilyColumnQualifierPairs)

Deprecated. Use fetchColumns(Configuration,Collection) instead

fetchColumns

public static void fetchColumns(org.apache.hadoop.conf.Configuration conf,
                                Collection<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> columnFamilyColumnQualifierPairs)

Restricts the columns that will be mapped over for this configuration object.

Parameters:: conf - the Hadoop configuration object; columnFamilyColumnQualifierPairs - A collection of pairs of Text objects, each corresponding to a column family and column qualifier. If the column qualifier is null, the entire column family is selected. An empty set is the default and is equivalent to scanning all columns.
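The selection rules for column pairs can be restated as a small predicate. This is a minimal sketch of the documented semantics only, not Accumulo code: the hypothetical ColumnPair class uses String in place of Text and stands in for Pair<Text,Text>.

```java
import java.util.Collection;

final class FetchColumnsSketch {
    /** Hypothetical stand-in for Pair<Text,Text>; qualifier may be null. */
    static final class ColumnPair {
        final String family;
        final String qualifier;

        ColumnPair(String family, String qualifier) {
            this.family = family;
            this.qualifier = qualifier;
        }
    }

    /**
     * Returns true if the given column would be mapped over:
     * an empty collection selects every column, and a pair with a
     * null qualifier selects the entire column family.
     */
    static boolean isFetched(Collection<ColumnPair> pairs, String family, String qualifier) {
        if (pairs.isEmpty()) {
            return true;
        }
        for (ColumnPair pair : pairs) {
            boolean familyMatches = pair.family.equals(family);
            boolean qualifierMatches = pair.qualifier == null || pair.qualifier.equals(qualifier);
            if (familyMatches && qualifierMatches) {
                return true;
            }
        }
        return false;
    }
}
```

For example, a pair with family "meta" and a null qualifier selects every column in "meta", while a pair ("data", "size") selects only that one column.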

setLogLevel

public static void setLogLevel(org.apache.hadoop.mapreduce.JobContext job,
                               org.apache.log4j.Level level)

Deprecated. Use setLogLevel(Configuration,Level) instead

setLogLevel

public static void setLogLevel(org.apache.hadoop.conf.Configuration conf,
                               org.apache.log4j.Level level)

Sets the log level for this configuration object.

Parameters:: conf - the Hadoop configuration object; level - the logging level

addIterator

public static void addIterator(org.apache.hadoop.mapreduce.JobContext job,
                               IteratorSetting cfg)

Deprecated. Use addIterator(Configuration,IteratorSetting) instead

addIterator

public static void addIterator(org.apache.hadoop.conf.Configuration conf,
                               IteratorSetting cfg)

Encode an iterator on the input for this configuration object.

Parameters:: conf - The Hadoop configuration in which to save the iterator configuration; cfg - The configuration of the iterator

setIterator

public static void setIterator(org.apache.hadoop.mapreduce.JobContext job,
                               int priority,
                               String iteratorClass,
                               String iteratorName)

Deprecated. since 1.4, see addIterator(Configuration, IteratorSetting)

Specify an Accumulo iterator type to manage the behavior of the underlying table scan this InputFormat's RecordReader will conduct, with priority dictating the order in which specified iterators are applied. Repeat calls to specify multiple iterators are allowed.

Parameters:: job - the job; priority - the priority; iteratorClass - the iterator class; iteratorName - the iterator name
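How priority orders repeated calls can be shown with a small model. This is a minimal sketch under the assumption that iterators with lower priority numbers are applied first; the ConfiguredIterator class and the iterator names used with it are hypothetical and only record the priority and name arguments.

```java
import java.util.ArrayList;
import java.util.List;

final class IteratorPrioritySketch {
    /** Hypothetical record of one setIterator call: its priority and iterator name. */
    static final class ConfiguredIterator {
        final int priority;
        final String name;

        ConfiguredIterator(int priority, String name) {
            this.priority = priority;
            this.name = name;
        }
    }

    /** Returns iterator names in application order, assuming lower priority numbers go first. */
    static List<String> applicationOrder(List<ConfiguredIterator> configured) {
        List<ConfiguredIterator> sorted = new ArrayList<>(configured);
        sorted.sort((a, b) -> Integer.compare(a.priority, b.priority));

        List<String> names = new ArrayList<>();
        for (ConfiguredIterator iterator : sorted) {
            names.add(iterator.name);
        }
        return names;
    }
}
```

The order in which the calls were made does not matter in this model; only the priority values do.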

setIteratorOption

public static void setIteratorOption(org.apache.hadoop.mapreduce.JobContext job,
                                     String iteratorName,
                                     String key,
                                     String value)

Deprecated. since 1.4, see addIterator(Configuration, IteratorSetting)

Specify an option for a named Accumulo iterator, further specifying that iterator's behavior.

Parameters:: job - the job; iteratorName - the iterator name, which should correspond to an iterator set with a prior setIterator call; key - the key; value - the value

isIsolated

protected static boolean isIsolated(org.apache.hadoop.mapreduce.JobContext job)

Deprecated. Use isIsolated(Configuration) instead

isIsolated

protected static boolean isIsolated(org.apache.hadoop.conf.Configuration conf)

Determines whether a configuration has isolation enabled.

Parameters:: conf - the Hadoop configuration object
Returns:: true if isolation is enabled, false otherwise
See Also:: setIsolated(Configuration, boolean)

usesLocalIterators

protected static boolean usesLocalIterators(org.apache.hadoop.mapreduce.JobContext job)

Deprecated. Use usesLocalIterators(Configuration) instead

usesLocalIterators

protected static boolean usesLocalIterators(org.apache.hadoop.conf.Configuration conf)

Determines whether a configuration uses local iterators.

Parameters:: conf - the Hadoop configuration object
Returns:: true if local iterators are used, false otherwise
See Also:: setLocalIterators(Configuration, boolean)

getUsername

protected static String getUsername(org.apache.hadoop.mapreduce.JobContext job)

Deprecated. Use getUsername(Configuration) instead

getUsername

protected static String getUsername(org.apache.hadoop.conf.Configuration conf)

Gets the user name from the configuration.

Parameters:: conf - the Hadoop configuration object
Returns:: the user name
See Also:: setInputInfo(Configuration, String, byte[], String, Authorizations)

getPassword

protected static byte[] getPassword(org.apache.hadoop.mapreduce.JobContext job)

Deprecated. Use getPassword(Configuration) instead

WARNING: The password is stored in the Configuration and shared with all MapReduce tasks; it is BASE64 encoded to provide a charset-safe conversion to a string, and is not intended to be secure.

getPassword

protected static byte[] getPassword(org.apache.hadoop.conf.Configuration conf)

Gets the password from the configuration. WARNING: The password is stored in the Configuration and shared with all MapReduce tasks; it is BASE64 encoded to provide a charset-safe conversion to a string, and is not intended to be secure.

Parameters:: conf - the Hadoop configuration object
Returns:: the BASE64-encoded password
See Also:: setInputInfo(Configuration, String, byte[], String, Authorizations)
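The warning above can be demonstrated with the JDK alone. The following is a minimal sketch, assuming java.util.Base64 (Java 8 or later) as a stand-in for whichever Base64 codec the class uses internally; the class PasswordEncodingSketch is hypothetical. It shows that the encoding is a reversible text conversion, so anyone who can read the Configuration can recover the password.

```java
import java.util.Base64;

final class PasswordEncodingSketch {
    /** Converts password bytes to a charset-safe string. This is encoding, not encryption. */
    static String encode(byte[] password) {
        return Base64.getEncoder().encodeToString(password);
    }

    /** Recovers the original password bytes; no key or secret is required. */
    static byte[] decode(String stored) {
        return Base64.getDecoder().decode(stored);
    }
}
```

Because decode needs nothing beyond the stored string, access to the job configuration should be treated as access to the password.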

getTablename

protected static String getTablename(org.apache.hadoop.mapreduce.JobContext job)

Deprecated. Use getTablename(Configuration) instead

getTablename

protected static String getTablename(org.apache.hadoop.conf.Configuration conf)

Gets the table name from the configuration.

Parameters:: conf - the Hadoop configuration object
Returns:: the table name
See Also:: setInputInfo(Configuration, String, byte[], String, Authorizations)

getAuthorizations

protected static Authorizations getAuthorizations(org.apache.hadoop.mapreduce.JobContext job)

Deprecated. Use getAuthorizations(Configuration) instead

getAuthorizations

protected static Authorizations getAuthorizations(org.apache.hadoop.conf.Configuration conf)

Gets the authorizations to set for the scans from the configuration.

Parameters:: conf - the Hadoop configuration object
Returns:: the accumulo scan authorizations
See Also:: setInputInfo(Configuration, String, byte[], String, Authorizations)

getInstance

protected static Instance getInstance(org.apache.hadoop.mapreduce.JobContext job)

Deprecated. Use getInstance(Configuration) instead

getInstance

protected static Instance getInstance(org.apache.hadoop.conf.Configuration conf)

Initializes an Accumulo Instance based on the configuration.

Parameters:: conf - the Hadoop configuration object
Returns:: an accumulo instance
See Also:: setZooKeeperInstance(Configuration, String, String), setMockInstance(Configuration, String)

getTabletLocator

protected static TabletLocator getTabletLocator(org.apache.hadoop.mapreduce.JobContext job)
                                         throws TableNotFoundException

Deprecated. Use getTabletLocator(Configuration) instead

Throws:: TableNotFoundException

getTabletLocator

protected static TabletLocator getTabletLocator(org.apache.hadoop.conf.Configuration conf)
                                         throws TableNotFoundException

Initializes an Accumulo TabletLocator based on the configuration.

Parameters:: conf - the Hadoop configuration object
Returns:: an accumulo tablet locator
Throws:: TableNotFoundException - if the table name set on the configuration doesn't exist

getRanges

protected static List<Range> getRanges(org.apache.hadoop.mapreduce.JobContext job)
                                throws IOException

Deprecated. Use getRanges(Configuration) instead

Throws:: IOException

getRanges

protected static List<Range> getRanges(org.apache.hadoop.conf.Configuration conf)
                                throws IOException

Gets the ranges to scan over from a configuration object.

Parameters:: conf - the Hadoop configuration object
Returns:: the ranges
Throws:: IOException - if the ranges have been encoded improperly
See Also:: setRanges(Configuration, Collection)

getRegex

protected static String getRegex(org.apache.hadoop.mapreduce.JobContext job,
                                 InputFormatBase.RegexType type)

Deprecated. since 1.4 use RegExFilter and addIterator(Configuration, IteratorSetting)

See Also:: setRegex(JobContext, RegexType, String)

getFetchedColumns

protected static Set<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> getFetchedColumns(org.apache.hadoop.mapreduce.JobContext job)

Deprecated. Use getFetchedColumns(Configuration) instead

getFetchedColumns

protected static Set<Pair<org.apache.hadoop.io.Text,org.apache.hadoop.io.Text>> getFetchedColumns(org.apache.hadoop.conf.Configuration conf)

Gets the columns to be mapped over from this configuration object.

Parameters:: conf - the Hadoop configuration object
Returns:: a set of columns
See Also:: fetchColumns(Configuration, Collection)

getAutoAdjustRanges

protected static boolean getAutoAdjustRanges(org.apache.hadoop.mapreduce.JobContext job)

Deprecated. Use getAutoAdjustRanges(Configuration) instead

getAutoAdjustRanges

protected static boolean getAutoAdjustRanges(org.apache.hadoop.conf.Configuration conf)

Determines whether a configuration has auto-adjust ranges enabled.

Parameters:: conf - the Hadoop configuration object
Returns:: true if auto-adjust is enabled, false otherwise
See Also:: disableAutoAdjustRanges(Configuration)

getLogLevel

protected static org.apache.log4j.Level getLogLevel(org.apache.hadoop.mapreduce.JobContext job)

Deprecated. Use getLogLevel(Configuration) instead

getLogLevel

protected static org.apache.log4j.Level getLogLevel(org.apache.hadoop.conf.Configuration conf)

Gets the log level from this configuration.

Parameters:: conf - the Hadoop configuration object
Returns:: the log level
See Also:: setLogLevel(Configuration, Level)

validateOptions

protected static void validateOptions(org.apache.hadoop.mapreduce.JobContext job)
                               throws IOException

Deprecated. Use validateOptions(Configuration) instead

Throws:: IOException

validateOptions

protected static void validateOptions(org.apache.hadoop.conf.Configuration conf)
                               throws IOException

Check whether a configuration is fully configured to be used with an Accumulo InputFormat.

Parameters:: conf - the Hadoop configuration object
Throws:: IOException - if the configuration is improperly configured

getMaxVersions

protected static int getMaxVersions(org.apache.hadoop.mapreduce.JobContext job)

Deprecated. Use getMaxVersions(Configuration) instead

getMaxVersions

protected static int getMaxVersions(org.apache.hadoop.conf.Configuration conf)

Gets the maxVersions to use for the VersioningIterator from this configuration.

Parameters:: conf - the Hadoop configuration object
Returns:: the max versions, -1 if not configured
See Also:: setMaxVersions(Configuration, int)

isOfflineScan

protected static boolean isOfflineScan(org.apache.hadoop.conf.Configuration conf)

getIterators

protected static List<org.apache.accumulo.core.client.mapreduce.InputFormatBase.AccumuloIterator> getIterators(org.apache.hadoop.mapreduce.JobContext job)

Deprecated. Use getIterators(Configuration) instead

getIterators

protected static List<org.apache.accumulo.core.client.mapreduce.InputFormatBase.AccumuloIterator> getIterators(org.apache.hadoop.conf.Configuration conf)

Gets a list of the iterator settings (for iterators to apply to a scanner) from this configuration.

Parameters:: conf - the Hadoop configuration object
Returns:: a list of iterators
See Also:: addIterator(Configuration, IteratorSetting)

getIteratorOptions

protected static List<org.apache.accumulo.core.client.mapreduce.InputFormatBase.AccumuloIteratorOption> getIteratorOptions(org.apache.hadoop.mapreduce.JobContext job)

Deprecated. Use getIteratorOptions(Configuration) instead

getIteratorOptions

protected static List<org.apache.accumulo.core.client.mapreduce.InputFormatBase.AccumuloIteratorOption> getIteratorOptions(org.apache.hadoop.conf.Configuration conf)

Gets a list of the iterator options specified on this configuration.

Parameters:: conf - the Hadoop configuration object
Returns:: a list of iterator options
See Also:: addIterator(Configuration, IteratorSetting)

getSplits

public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext job)
                                                       throws IOException

Read the metadata table to get tablets and match up ranges to them.

Specified by:: getSplits in class org.apache.hadoop.mapreduce.InputFormat<K,V>

Throws:: IOException
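Matching a range to tablets can be pictured as cutting the range at every tablet boundary that falls inside it. This is a simplified sketch, not the getSplits implementation: it assumes integer row keys, half-open pieces [start, end), and a hypothetical sorted array of split points standing in for the tablet boundaries read from the metadata table.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitByTabletSketch {
    /**
     * Cuts the range [start, end) at each boundary strictly inside it.
     * Each returned int[] is {pieceStart, pieceEnd}, one piece per tablet touched.
     * sortedSplitPoints must be in ascending order.
     */
    static List<int[]> splitAtBoundaries(int start, int end, int[] sortedSplitPoints) {
        List<int[]> pieces = new ArrayList<>();
        int cursor = start;
        for (int boundary : sortedSplitPoints) {
            if (boundary > cursor && boundary < end) {
                pieces.add(new int[] {cursor, boundary});
                cursor = boundary;
            }
        }
        // The remainder after the last boundary inside the range.
        pieces.add(new int[] {cursor, end});
        return pieces;
    }
}
```

In this model each piece would become one input split, so a range spanning three tablets yields three splits.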
