A simple abstraction over the HBaseContext.foreachPartition method.
It allows a user to take a JavaRDD, generate Deletes from its elements, and send them to HBase.
The complexity of managing the Connection is removed from the developer
Original JavaRDD with data to iterate over
The name of the table to delete from
Function to convert a value in the JavaRDD to an HBase Delete
The number of deletes to batch before sending to HBase
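A minimal sketch of a bulk delete, assuming the JavaHBaseContext API from the hbase-spark module; the table name "t1", the row keys, and the local Spark master are hypothetical, and a reachable HBase cluster is required:

```java
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.spark.JavaHBaseContext;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class BulkDeleteExample {
  public static void main(String[] args) {
    JavaSparkContext jsc = new JavaSparkContext("local", "bulkDeleteExample");
    Configuration conf = HBaseConfiguration.create();
    JavaHBaseContext hbaseContext = new JavaHBaseContext(jsc, conf);

    // Row keys whose rows should be removed (hypothetical values).
    JavaRDD<byte[]> rdd = jsc.parallelize(
        Arrays.asList(Bytes.toBytes("r1"), Bytes.toBytes("r2")));

    // Each element becomes a Delete; deletes are sent in batches of 4.
    hbaseContext.bulkDelete(rdd, TableName.valueOf("t1"),
        rowKey -> new Delete(rowKey), 4);

    jsc.stop();
  }
}
```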
A simple abstraction over the HBaseContext.mapPartition method.
It allows a user to take a JavaRDD, issue Gets built from its elements, and generate a new JavaRDD from the Results they bring back from HBase
The name of the table to get from
The number of Gets to retrieve in a single batch
Original JavaRDD with data to iterate over
Function to convert a value in the JavaRDD to an HBase Get
Function to convert the HBase Result object to whatever the user wants to put in the resulting JavaRDD
New JavaRDD created from the results of the Gets to HBase
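A sketch of this call under the same assumptions as above (hbase-spark's JavaHBaseContext, a hypothetical table "t1", a reachable cluster); each row key is turned into a Get, fetched in batches of two, and each Result is converted back into a String:

```java
import java.util.Arrays;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.spark.JavaHBaseContext;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class BulkGetExample {
  public static void main(String[] args) {
    JavaSparkContext jsc = new JavaSparkContext("local", "bulkGetExample");
    JavaHBaseContext hbaseContext =
        new JavaHBaseContext(jsc, HBaseConfiguration.create());

    // Row keys to look up (hypothetical values).
    JavaRDD<byte[]> rowKeys = jsc.parallelize(
        Arrays.asList(Bytes.toBytes("r1"), Bytes.toBytes("r2")));

    // Build a Get per element, fetch two per batch, and convert each
    // Result into a String for the new JavaRDD.
    JavaRDD<String> values = hbaseContext.bulkGet(
        TableName.valueOf("t1"), 2, rowKeys,
        rowKey -> new Get(rowKey),
        result -> Bytes.toString(result.getRow()));

    System.out.println(values.count());
    jsc.stop();
  }
}
```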
A simple abstraction over the HBaseContext.foreachPartition method.
It allows a user to take a JavaRDD, generate Puts from its elements, and send them to HBase. The complexity of managing the Connection is removed from the developer
Original JavaRDD with data to iterate over
The name of the table to put into
Function to convert a value in the JavaRDD to an HBase Put
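A hedged sketch of a bulk put, again assuming the hbase-spark JavaHBaseContext API; the table "t1", column family "cf", qualifier "q", and the "rowKey,value" record format are all hypothetical:

```java
import java.util.Arrays;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.spark.JavaHBaseContext;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class BulkPutExample {
  public static void main(String[] args) {
    JavaSparkContext jsc = new JavaSparkContext("local", "bulkPutExample");
    JavaHBaseContext hbaseContext =
        new JavaHBaseContext(jsc, HBaseConfiguration.create());

    // "rowKey,value" records to write (hypothetical data and schema).
    JavaRDD<String> rdd = jsc.parallelize(Arrays.asList("r1,v1", "r2,v2"));

    hbaseContext.bulkPut(rdd, TableName.valueOf("t1"), record -> {
      String[] parts = record.split(",");
      Put put = new Put(Bytes.toBytes(parts[0]));
      // Column family "cf" and qualifier "q" are assumed to exist.
      put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"),
          Bytes.toBytes(parts[1]));
      return put;
    });

    jsc.stop();
  }
}
```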
A simple enrichment of the traditional Spark JavaRDD foreachPartition.
This function differs from the original in that it offers the developer access to an already connected Connection object
Note: Do not close the Connection object. All Connection management is handled outside this method
Original JavaRDD with data to iterate over
Function given an iterator over the RDD values and a Connection object to interact with HBase
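A sketch of this pattern, assuming the hbase-spark JavaHBaseContext API; the function receives a Tuple2 of the partition's iterator and a managed Connection (the table "t1", family "cf", and data are hypothetical):

```java
import java.util.Arrays;
import java.util.Iterator;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.spark.JavaHBaseContext;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class ForeachPartitionExample {
  public static void main(String[] args) {
    JavaSparkContext jsc = new JavaSparkContext("local", "foreachPartitionExample");
    JavaHBaseContext hbaseContext =
        new JavaHBaseContext(jsc, HBaseConfiguration.create());

    JavaRDD<String> rdd = jsc.parallelize(Arrays.asList("r1", "r2"));

    hbaseContext.foreachPartition(rdd,
        (Tuple2<Iterator<String>, Connection> t) -> {
          // The Connection is already open; do not close it here.
          Table table = t._2().getTable(TableName.valueOf("t1"));
          Iterator<String> it = t._1();
          while (it.hasNext()) {
            Put put = new Put(Bytes.toBytes(it.next()));
            put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("q"),
                Bytes.toBytes("v"));
            table.put(put);
          }
          table.close(); // close the Table, but never the Connection
        });

    jsc.stop();
  }
}
```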
An overloaded version of HBaseContext hbaseRDD that defines the type of the resulting JavaRDD
The name of the table to scan
The HBase scan object to use to read data from HBase
New JavaRDD with results from scan
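A minimal sketch of a scan-backed RDD, assuming this hbase-spark overload returns Tuple2 pairs of row key and Result; the table "t1" and caching value are hypothetical:

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.spark.JavaHBaseContext;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class HBaseScanRDDExample {
  public static void main(String[] args) {
    JavaSparkContext jsc = new JavaSparkContext("local", "hbaseRDDExample");
    JavaHBaseContext hbaseContext =
        new JavaHBaseContext(jsc, HBaseConfiguration.create());

    // Scan the whole (hypothetical) table "t1"; caching controls how many
    // rows each RPC to a region server brings back.
    Scan scan = new Scan();
    scan.setCaching(100);

    JavaRDD<Tuple2<ImmutableBytesWritable, Result>> rdd =
        hbaseContext.hbaseRDD(TableName.valueOf("t1"), scan);

    System.out.println(rdd.count());
    jsc.stop();
  }
}
```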
This function will use the native HBase TableInputFormat with the given scan object to generate a new JavaRDD
The name of the table to scan
The HBase scan object to use to read data from HBase
Function to convert a Result object from HBase into what the user wants in the final generated JavaRDD
New JavaRDD with results from scan
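A sketch of the converting variant under the same assumptions (hbase-spark JavaHBaseContext, hypothetical table "t1"); here each scanned row is reduced to its row key as a String:

```java
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.spark.JavaHBaseContext;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class HBaseScanToStringsExample {
  public static void main(String[] args) {
    JavaSparkContext jsc = new JavaSparkContext("local", "hbaseRDDConvertExample");
    JavaHBaseContext hbaseContext =
        new JavaHBaseContext(jsc, HBaseConfiguration.create());

    // Convert each (rowKey, Result) pair from the scan into just the
    // row key as a String.
    JavaRDD<String> rowKeys = hbaseContext.hbaseRDD(
        TableName.valueOf("t1"), new Scan(),
        tuple -> Bytes.toString(tuple._2().getRow()));

    System.out.println(rowKeys.count());
    jsc.stop();
  }
}
```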
A simple enrichment of the traditional Spark JavaRDD mapPartition.
This function differs from the original in that it offers the developer access to an already connected Connection object
Note: Do not close the Connection object. All Connection management is handled outside this method
Note: Make sure to partition correctly to avoid memory issues when getting data from HBase
Original JavaRDD with data to iterate over
Function given an iterator over the RDD values and a Connection object to interact with HBase
Returns a new JavaRDD generated by the user-defined function, just like a normal mapPartition
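A sketch of this pattern, assuming the hbase-spark JavaHBaseContext API; each partition's row keys are looked up through the managed Connection and mapped to Strings (the table "t1" and data are hypothetical, and note that the FlatMapFunction return type differs between Spark versions):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.spark.JavaHBaseContext;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import scala.Tuple2;

public class MapPartitionsExample {
  public static void main(String[] args) {
    JavaSparkContext jsc = new JavaSparkContext("local", "mapPartitionsExample");
    JavaHBaseContext hbaseContext =
        new JavaHBaseContext(jsc, HBaseConfiguration.create());

    JavaRDD<byte[]> rowKeys = jsc.parallelize(
        Arrays.asList(Bytes.toBytes("r1"), Bytes.toBytes("r2")));

    JavaRDD<String> out = hbaseContext.mapPartitions(rowKeys,
        (Tuple2<Iterator<byte[]>, Connection> t) -> {
          // Use the managed Connection for per-partition lookups;
          // do not close it.
          Table table = t._2().getTable(TableName.valueOf("t1"));
          List<String> results = new ArrayList<>();
          while (t._1().hasNext()) {
            Result r = table.get(new Get(t._1().next()));
            results.add(Bytes.toString(r.getRow()));
          }
          table.close();
          // Spark 2.x FlatMapFunction returns an Iterator; on Spark 1.x
          // return the List itself (an Iterable) instead.
          return results.iterator();
        });

    System.out.println(out.count());
    jsc.stop();
  }
}
```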
This is the Java wrapper over HBaseContext, which is written in Scala. This class will be used by developers who want to work with Spark or Spark Streaming in Java