java.lang.Object
  org.apache.hadoop.mapreduce.InputFormat<ImmutableBytesWritable,Result>
    org.apache.hadoop.hbase.mapreduce.TableInputFormatBase
@InterfaceAudience.Public
@InterfaceStability.Stable
public abstract class TableInputFormatBase
extends org.apache.hadoop.mapreduce.InputFormat<ImmutableBytesWritable,Result>
A base for TableInputFormats. Receives an HTable and a Scan instance that defines the input columns, etc. Subclasses may use
other TableRecordReader implementations.
An example of a subclass is sketched below.
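The following is a minimal, illustrative sketch of such a subclass; the table name "exampleTable", the column families "cf1"/"cf2", the qualifier "attr1", and the caching value are assumptions chosen for the example, not defaults of this API.

```java
import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.mapreduce.TableInputFormatBase;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.JobConfigurable;

public class ExampleTIF extends TableInputFormatBase implements JobConfigurable {

  @Override
  public void configure(JobConf job) {
    try {
      // Connect to the table that this InputFormat reads from.
      HTable exampleTable =
          new HTable(HBaseConfiguration.create(job), Bytes.toBytes("exampleTable"));
      // Mandatory: the base class needs an HTable to compute splits and read records.
      setHTable(exampleTable);

      // Define the input columns via a Scan instance.
      Scan scan = new Scan();
      scan.addFamily(Bytes.toBytes("cf1"));
      scan.addColumn(Bytes.toBytes("cf2"), Bytes.toBytes("attr1"));
      scan.setCaching(500);
      // Mandatory: the base class needs a Scan that defines what to read.
      setScan(scan);
    } catch (IOException e) {
      throw new RuntimeException("Failed to configure ExampleTIF", e);
    }
  }
}
```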
| Field Summary | |
|---|---|
| static String | INPUT_AUTOBALANCE_MAXSKEWRATIO: The maximum data-skew ratio for M/R jobs; used together with the hbase.mapreduce.input.autobalance property. |
| static String | MAPREDUCE_INPUT_AUTOBALANCE: Specify whether to enable auto-balance for input in M/R jobs. |
| static String | TABLE_ROW_TEXTKEY: Specify whether the row keys in the table are text (ASCII between 32 and 126); default is true. |
| Constructor Summary |
|---|
| TableInputFormatBase() |
| Method Summary | |
|---|---|
| List<org.apache.hadoop.mapreduce.InputSplit> | calculateRebalancedSplits(List<org.apache.hadoop.mapreduce.InputSplit> list, org.apache.hadoop.mapreduce.JobContext context, long average): Calculates the number of MapReduce input splits for the map tasks. |
| org.apache.hadoop.mapreduce.RecordReader<ImmutableBytesWritable,Result> | createRecordReader(org.apache.hadoop.mapreduce.InputSplit split, org.apache.hadoop.mapreduce.TaskAttemptContext context): Builds a TableRecordReader. |
| protected HTable | getHTable(): Allows subclasses to get the HTable. |
| Scan | getScan(): Gets the scan defining the actual details like columns etc. |
| static byte[] | getSplitKey(byte[] start, byte[] end, boolean isText): Selects a split point in the region. |
| List<org.apache.hadoop.mapreduce.InputSplit> | getSplits(org.apache.hadoop.mapreduce.JobContext context): Calculates the splits that will serve as input for the map tasks. |
| protected Pair<byte[][],byte[][]> | getStartEndKeys() |
| protected boolean | includeRegionInSplit(byte[] startKey, byte[] endKey): Tests whether the given region is to be included in the InputSplit while splitting the regions of a table. |
| String | reverseDNS(InetAddress ipAddress) |
| protected void | setHTable(HTable table): Allows subclasses to set the HTable. |
| void | setScan(Scan scan): Sets the scan defining the actual details like columns etc. |
| protected void | setTableRecordReader(TableRecordReader tableRecordReader): Allows subclasses to set the TableRecordReader. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final String MAPREDUCE_INPUT_AUTOBALANCE
public static final String INPUT_AUTOBALANCE_MAXSKEWRATIO
public static final String TABLE_ROW_TEXTKEY
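One way these constants might be wired into a job configuration is sketched below; the helper class name and the skew-ratio value "3" are illustrative assumptions, not defaults documented here.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableInputFormatBase;

public class AutoBalanceConfigSketch {
  public static Configuration newJobConfiguration() {
    Configuration conf = HBaseConfiguration.create();
    // Enable auto-balancing of input splits for M/R jobs.
    conf.setBoolean(TableInputFormatBase.MAPREDUCE_INPUT_AUTOBALANCE, true);
    // Skew-ratio threshold used together with auto-balance; "3" is an example value.
    conf.set(TableInputFormatBase.INPUT_AUTOBALANCE_MAXSKEWRATIO, "3");
    // Row keys in this (hypothetical) table are printable ASCII (32-126).
    conf.setBoolean(TableInputFormatBase.TABLE_ROW_TEXTKEY, true);
    return conf;
  }
}
```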
| Constructor Detail |
|---|
public TableInputFormatBase()
| Method Detail |
|---|
public org.apache.hadoop.mapreduce.RecordReader<ImmutableBytesWritable,Result> createRecordReader(org.apache.hadoop.mapreduce.InputSplit split,
                                                                                                   org.apache.hadoop.mapreduce.TaskAttemptContext context)
                                                                                            throws IOException

Builds a TableRecordReader.

Specified by:
    createRecordReader in class org.apache.hadoop.mapreduce.InputFormat<ImmutableBytesWritable,Result>
Parameters:
    split - The split to work with.
    context - The current context.
Throws:
    IOException - When creating the reader fails.
See Also:
    InputFormat.createRecordReader(org.apache.hadoop.mapreduce.InputSplit, org.apache.hadoop.mapreduce.TaskAttemptContext)
protected Pair<byte[][],byte[][]> getStartEndKeys()
                                           throws IOException

Throws:
    IOException
public List<org.apache.hadoop.mapreduce.InputSplit> getSplits(org.apache.hadoop.mapreduce.JobContext context)
                                                       throws IOException

Calculates the splits that will serve as input for the map tasks.

Specified by:
    getSplits in class org.apache.hadoop.mapreduce.InputFormat<ImmutableBytesWritable,Result>
Parameters:
    context - The current job context.
Throws:
    IOException - When creating the list of splits fails.
See Also:
    InputFormat.getSplits(org.apache.hadoop.mapreduce.JobContext)
public String reverseDNS(InetAddress ipAddress)
                  throws NamingException,
                         UnknownHostException

Throws:
    NamingException
    UnknownHostException
public List<org.apache.hadoop.mapreduce.InputSplit> calculateRebalancedSplits(List<org.apache.hadoop.mapreduce.InputSplit> list,
                                                                              org.apache.hadoop.mapreduce.JobContext context,
                                                                              long average)
                                                                       throws IOException

Calculates the number of MapReduce input splits for the map tasks.

Parameters:
    list - The list of input splits before balancing.
    context - The current job context.
    average - The average size of all regions.
Throws:
    IOException - When creating the list of splits fails.
See Also:
    InputFormat.getSplits(org.apache.hadoop.mapreduce.JobContext)
public static byte[] getSplitKey(byte[] start,
                                 byte[] end,
                                 boolean isText)

Selects a split point in the region.

Parameters:
    start - Start key of the region
    end - End key of the region
    isText - Whether to use text key mode or binary key mode
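Since getSplitKey is public and static, it can also be called directly; the keys below are made up for illustration.

```java
import org.apache.hadoop.hbase.mapreduce.TableInputFormatBase;
import org.apache.hadoop.hbase.util.Bytes;

public class GetSplitKeySketch {
  public static void main(String[] args) {
    byte[] start = Bytes.toBytes("row-aaa");
    byte[] end = Bytes.toBytes("row-zzz");
    // isText = true: keys are treated as printable ASCII when choosing the split point.
    byte[] splitPoint = TableInputFormatBase.getSplitKey(start, end, true);
    System.out.println("split point: " + Bytes.toStringBinary(splitPoint));
  }
}
```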
protected boolean includeRegionInSplit(byte[] startKey,
                                       byte[] endKey)

Tests whether the given region is to be included in the InputSplit while splitting the regions of a table.

This optimization is effective when there is a specific reason to exclude an entire region from the M-R job (and hence it contributes no InputSplit), given the start and end keys of that region. It is useful when we need to remember the last-processed top record and continuously revisit the [last, current) interval for M-R processing. In addition to reducing the number of InputSplits, this also reduces the load on the region server, due to the ordering of the keys.

Note: it is possible that endKey.length == 0 for the last (most recent) region.

Override this method if you want to exclude regions from the M-R job altogether; a sketch follows the parameter list below. By default, no region is excluded (i.e. all regions are included).

Parameters:
    startKey - Start key of the region
    endKey - End key of the region
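A minimal sketch of such an override; the subclass name and the remembered key are hypothetical, and the class is left abstract because a real subclass would also have to supply the HTable and Scan (as in the ExampleTIF sketch above).

```java
import org.apache.hadoop.hbase.mapreduce.TableInputFormatBase;
import org.apache.hadoop.hbase.util.Bytes;

// Hypothetical subclass that resumes processing after a remembered row key.
public abstract class ResumingTIF extends TableInputFormatBase {

  // Assumed to be provided by the job setup; not part of the TableInputFormatBase API.
  private final byte[] lastProcessedKey = Bytes.toBytes("row-12345");

  @Override
  protected boolean includeRegionInSplit(byte[] startKey, byte[] endKey) {
    // endKey.length == 0 marks the last region of the table; always keep it.
    if (endKey.length == 0) {
      return true;
    }
    // Exclude regions whose entire key range lies at or before the last-processed key.
    return Bytes.compareTo(endKey, lastProcessedKey) > 0;
  }
}
```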
protected HTable getHTable()

Allows subclasses to get the HTable.

protected void setHTable(HTable table)

Allows subclasses to set the HTable.

Parameters:
    table - The table to get the data from.

public Scan getScan()

Gets the scan defining the actual details like columns etc.

public void setScan(Scan scan)

Sets the scan defining the actual details like columns etc.

Parameters:
    scan - The scan to set.

protected void setTableRecordReader(TableRecordReader tableRecordReader)

Allows subclasses to set the TableRecordReader.

Parameters:
    tableRecordReader - A different TableRecordReader implementation.