GroupingTableMapper (Apache HBase - Server 0.98.18-hadoop2 API)

org.apache.hadoop.hbase.mapreduce
Class GroupingTableMapper

java.lang.Object
  org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,KEYOUT,VALUEOUT>
      org.apache.hadoop.hbase.mapreduce.TableMapper<ImmutableBytesWritable,Result>
          org.apache.hadoop.hbase.mapreduce.GroupingTableMapper

All Implemented Interfaces: org.apache.hadoop.conf.Configurable

@InterfaceAudience.Public
@InterfaceStability.Stable
public class GroupingTableMapper
extends TableMapper<ImmutableBytesWritable,Result>
implements org.apache.hadoop.conf.Configurable

Extract grouping columns from input record.

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.hadoop.mapreduce.Mapper
`org.apache.hadoop.mapreduce.Mapper.Context`

Field Summary
`protected byte[][]`	`columns` The grouping columns.
`static String`	`GROUP_COLUMNS` JobConf parameter to specify the columns used to produce the key passed to collect from the map phase.

Constructor Summary
`GroupingTableMapper()`

Method Summary
`protected ImmutableBytesWritable`	`createGroupKey(byte[][] vals)` Create a key by concatenating multiple column values.
`protected byte[][]`	`extractKeyValues(Result r)` Extract column values from the current record.
`org.apache.hadoop.conf.Configuration`	`getConf()` Returns the current configuration.
`static void`	`initJob(String table, Scan scan, String groupColumns, Class<? extends TableMapper> mapper, org.apache.hadoop.mapreduce.Job job)` Use this before submitting a TableMap job.
`void`	`map(ImmutableBytesWritable key, Result value, org.apache.hadoop.mapreduce.Mapper.Context context)` Extract the grouping columns from value to construct a new key.
`void`	`setConf(org.apache.hadoop.conf.Configuration configuration)` Sets the configuration.

Methods inherited from class org.apache.hadoop.mapreduce.Mapper
`cleanup, run, setup`

Methods inherited from class java.lang.Object
`clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait`

Field Detail

GROUP_COLUMNS

public static final String GROUP_COLUMNS

JobConf parameter to specify the columns used to produce the key passed to collect from the map phase.

See Also: Constant Field Values

columns

protected byte[][] columns

The grouping columns.

Constructor Detail

GroupingTableMapper

public GroupingTableMapper()

Method Detail

initJob

public static void initJob(String table,
                           Scan scan,
                           String groupColumns,
                           Class<? extends TableMapper> mapper,
                           org.apache.hadoop.mapreduce.Job job)
                    throws IOException

Call this before submitting a TableMap job. It configures the job for the given table, scan, and mapper class, and stores the grouping columns in the job configuration.

Parameters:
table - The table to be processed.
scan - The scan with the columns etc.
groupColumns - A space-separated list of columns used to form the key used in collect.
mapper - The mapper class.
job - The current job.
Throws:
IOException - When setting up the job fails.
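
The groupColumns argument is a space-separated list of columns. As a minimal sketch using only JDK types, the class below shows how such a string could be split into one byte array per column, similar in shape to the `columns` field. The class name `GroupColumnsSketch`, the method `parse`, and the UTF-8 encoding are assumptions made for illustration; this is not the HBase source.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in, not part of HBase: models how a space-separated
// groupColumns string could be turned into a byte[][] of column names.
class GroupColumnsSketch {

    // Splits on runs of whitespace and encodes each column name as UTF-8 bytes.
    static byte[][] parse(String groupColumns) {
        String[] names = groupColumns.trim().split("\\s+");
        byte[][] columns = new byte[names.length][];
        for (int i = 0; i < names.length; i++) {
            columns[i] = names[i].getBytes(StandardCharsets.UTF_8);
        }
        return columns;
    }
}
```

For example, `GroupColumnsSketch.parse("info:city info:zip")` yields two entries, one per column. The column names here are made up.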

map

public void map(ImmutableBytesWritable key,
                Result value,
                org.apache.hadoop.mapreduce.Mapper.Context context)
         throws IOException,
                InterruptedException

Extract the grouping columns from value to construct a new key. Pass the new key and value to reduce. If any of the grouping columns are not found in the value, the record is skipped.

Overrides: map in class org.apache.hadoop.mapreduce.Mapper<ImmutableBytesWritable,Result,ImmutableBytesWritable,Result>

Parameters:
key - The current key.
value - The current value.
context - The current context.
Throws:
IOException - When writing the record fails.
InterruptedException - When the job is aborted.

extractKeyValues

protected byte[][] extractKeyValues(Result r)

Extract column values from the current record. This method returns null if any of the columns are not found.

Override this method if you want to deal with nulls differently.

Parameters: r - The current values.
Returns: Array of byte values.
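
The all-or-nothing contract described above (one value per grouping column, or null if any column is missing) can be sketched with plain JDK types. Here a record is modeled as a `Map` from column name to value instead of an HBase `Result`; the class `ExtractSketch` and its `extract` method are hypothetical, and the ordering of the returned values is an assumption of this sketch rather than a statement about HBase.

```java
import java.util.Map;

// Hypothetical stand-in, not part of HBase: a record is modeled as a Map
// from column name to value in place of an HBase Result.
class ExtractSketch {

    // Returns one value per grouping column, or null as soon as any
    // grouping column is absent from the record.
    static byte[][] extract(Map<String, byte[]> record, String[] columns) {
        byte[][] vals = new byte[columns.length][];
        for (int i = 0; i < columns.length; i++) {
            byte[] v = record.get(columns[i]);
            if (v == null) {
                return null; // caller is expected to skip this record
            }
            vals[i] = v;
        }
        return vals;
    }
}
```

A subclass that wants to tolerate missing columns could, for instance, substitute an empty byte array instead of returning null.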

createGroupKey

protected ImmutableBytesWritable createGroupKey(byte[][] vals)

Create a key by concatenating multiple column values.

Override this function in order to produce different types of keys.

Parameters: vals - The current key/values.
Returns: A key generated by concatenating multiple column values.
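
A minimal sketch of key concatenation with JDK types only follows. The single-space separator, the UTF-8 encoding, and the null pass-through are assumptions chosen for illustration, since this page does not specify them; the class `GroupKeySketch` is hypothetical and returns a raw `byte[]` where the real method returns an `ImmutableBytesWritable`.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in, not part of HBase: builds a grouping key by
// joining column values. Separator and encoding are assumptions.
class GroupKeySketch {

    // Joins the values with a single space; returns null for null input.
    static byte[] createKey(byte[][] vals) {
        if (vals == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < vals.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(new String(vals[i], StandardCharsets.UTF_8));
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }
}
```

An override that needs an unambiguous key (for values that may themselves contain the separator) could instead length-prefix each value before concatenating.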

getConf

public org.apache.hadoop.conf.Configuration getConf()

Returns the current configuration.

Specified by: getConf in interface org.apache.hadoop.conf.Configurable

Returns: The current configuration.
See Also: Configurable.getConf()

setConf

public void setConf(org.apache.hadoop.conf.Configuration configuration)

Sets the configuration. This is used to set up the grouping details.

Specified by: setConf in interface org.apache.hadoop.conf.Configurable

Parameters: configuration - The configuration to set.
See Also: Configurable.setConf(org.apache.hadoop.conf.Configuration)