java.lang.Object
org.apache.hadoop.hbase.client.Operation
org.apache.hadoop.hbase.client.OperationWithAttributes
org.apache.hadoop.hbase.client.Query
org.apache.hadoop.hbase.client.Scan
@InterfaceAudience.Public @InterfaceStability.Stable public class Scan
Used to perform Scan operations.
All operations are identical to Get with the exception of instantiation. Rather than specifying a single row, an optional startRow and stopRow may be defined. If rows are not specified, the Scanner will iterate over all rows.
To scan everything for each row, instantiate a Scan object.
To modify scanner caching for just this scan, use setCaching. If caching is NOT set, the caching value of the hosting HTable is used; see HTable.setScannerCaching(int). In addition to row caching, it is possible to specify a maximum result size using setMaxResultSize(long). When both are used, a single server request is limited by either the number of rows or the maximum result size, whichever limit is reached first.
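The interaction of the two limits can be sketched in plain Java. This is a simplified model, not HBase code: the class ScanRequestLimits and its method are hypothetical, and the sketch assumes that the size limit is checked after each row is added and that a non-positive maximum result size means no size limit.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of the two per-request limits; this is not HBase code.
final class ScanRequestLimits {

    // Counts how many pending rows go into one server request.
    // Assumption of this sketch: the byte limit is checked after a row is added,
    // and a non-positive maxResultSize means "no byte limit".
    static int rowsInOneRequest(List<Long> rowSizesInBytes, int caching, long maxResultSize) {
        int rows = 0;
        long bytes = 0L;
        for (long size : rowSizesInBytes) {
            if (rows >= caching) {
                break; // row-count limit reached first
            }
            rows++;
            bytes += size;
            if (maxResultSize > 0 && bytes >= maxResultSize) {
                break; // size limit reached first
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Long> sizes = Arrays.asList(400L, 400L, 400L, 400L, 400L);
        System.out.println(rowsInOneRequest(sizes, 3, 10_000L)); // prints 3
        System.out.println(rowsInOneRequest(sizes, 100, 700L));  // prints 2
    }
}
```

In the first call the row count stops the request; in the second the accumulated size does.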
To further define the scope of what to get when scanning, call the additional methods outlined below.
To get all columns from specific families, execute addFamily for each family to retrieve.
To get specific columns, execute addColumn for each column to retrieve.
To only retrieve columns within a specific range of version timestamps, execute setTimeRange.
To only retrieve columns with a specific timestamp, execute setTimeStamp.
To limit the number of versions of each column to be returned, execute setMaxVersions.
To limit the maximum number of values returned for each call to next(), execute setBatch.
To add a filter, execute setFilter.
Expert: To explicitly disable server-side block caching for this scan, execute setCacheBlocks(boolean).
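As an illustration of the setBatch item above, the effect of a batch limit on one wide row can be modeled in plain Java. The class BatchModel is hypothetical and is not part of HBase; the sketch only shows how a row's values would be spread over successive next() calls when each call returns at most batch values, and it assumes a positive batch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of batching; not HBase code.
final class BatchModel {

    // Splits the values of one row into chunks of at most `batch` values,
    // one chunk per simulated call to next(). This sketch requires batch > 0.
    static <T> List<List<T>> split(List<T> rowValues, int batch) {
        if (batch <= 0) {
            throw new IllegalArgumentException("batch must be positive in this sketch");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < rowValues.size(); i += batch) {
            int end = Math.min(i + batch, rowValues.size());
            chunks.add(new ArrayList<>(rowValues.subList(i, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> values = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(split(values, 2)); // prints [[a, b], [c, d], [e]]
    }
}
```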
Note: Usage alters Scan instances. Internally, attributes are updated as the Scan runs and, if enabled, metrics accumulate in the Scan instance. Be aware of this when you clone a Scan instance or reuse an existing one; it is safer to create a Scan instance per usage.
Field Summary | |
---|---|
static String | HINT_LOOKAHEAD - Deprecated. Without replacement: this is now a no-op; SEEKs and SKIPs are optimized automatically. |
static String | SCAN_ATTRIBUTES_METRICS_DATA |
static String | SCAN_ATTRIBUTES_METRICS_ENABLE |
static String | SCAN_ATTRIBUTES_TABLE_NAME |
Fields inherited from class org.apache.hadoop.hbase.client.Query |
---|
filter |
Fields inherited from class org.apache.hadoop.hbase.client.OperationWithAttributes |
---|
ID_ATRIBUTE |
Constructor Summary | |
---|---|
Scan() | Create a Scan operation across all rows. |
Scan(byte[] startRow) | Create a Scan operation starting at the specified row. |
Scan(byte[] startRow, byte[] stopRow) | Create a Scan operation for the range of rows specified. |
Scan(byte[] startRow, Filter filter) | |
Scan(Get get) | Builds a scan object with the same specs as get. |
Scan(Scan scan) | Creates a new instance of this class while copying all values. |
Method Summary | |
---|---|
Scan | addColumn(byte[] family, byte[] qualifier) - Get the column from the specified family with the specified qualifier. |
Scan | addFamily(byte[] family) - Get all columns from the specified family. |
boolean | doLoadColumnFamiliesOnDemand() - Get the logical value indicating whether on-demand CF loading should be allowed. |
int | getBatch() |
boolean | getCacheBlocks() - Get whether blocks should be cached for this Scan. |
int | getCaching() |
byte[][] | getFamilies() |
Map<byte[],NavigableSet<byte[]>> | getFamilyMap() - Getting the familyMap |
Filter | getFilter() |
Map<String,Object> | getFingerprint() - Compile the table and column family (i.e. |
Boolean | getLoadColumnFamiliesOnDemandValue() - Get the raw loadColumnFamiliesOnDemand setting; if it's not set, can be null. |
long | getMaxResultSize() |
int | getMaxResultsPerColumnFamily() |
int | getMaxVersions() |
int | getRowOffsetPerColumnFamily() - Method for retrieving the scan's offset per row per column family (#kvs to be skipped) |
byte[] | getStartRow() |
byte[] | getStopRow() |
TimeRange | getTimeRange() |
boolean | hasFamilies() |
boolean | hasFilter() |
boolean | isGetScan() |
boolean | isRaw() |
boolean | isReversed() - Get whether this scan is a reversed one. |
boolean | isSmall() - Get whether this scan is a small scan |
int | numFamilies() |
void | setBatch(int batch) - Set the maximum number of values to return for each call to next() |
void | setCacheBlocks(boolean cacheBlocks) - Set whether blocks should be cached for this Scan. |
void | setCaching(int caching) - Set the number of rows for caching that will be passed to scanners. |
Scan | setFamilyMap(Map<byte[],NavigableSet<byte[]>> familyMap) - Setting the familyMap |
Scan | setFilter(Filter filter) - Apply the specified server-side filter when performing the Query. |
void | setLoadColumnFamiliesOnDemand(boolean value) - Set the value indicating whether loading CFs on demand should be allowed (cluster default is false). |
void | setMaxResultSize(long maxResultSize) - Set the maximum result size. |
void | setMaxResultsPerColumnFamily(int limit) - Set the maximum number of values to return per row per Column Family |
Scan | setMaxVersions() - Get all available versions. |
Scan | setMaxVersions(int maxVersions) - Get up to the specified number of versions of each column. |
void | setRaw(boolean raw) - Enable/disable "raw" mode for this scan. |
Scan | setReversed(boolean reversed) - Set whether this scan is a reversed one |
void | setRowOffsetPerColumnFamily(int offset) - Set offset for the row per Column Family. |
void | setSmall(boolean small) - Set whether this scan is a small scan |
Scan | setStartRow(byte[] startRow) - Set the start row of the scan. |
Scan | setStopRow(byte[] stopRow) - Set the stop row. |
Scan | setTimeRange(long minStamp, long maxStamp) - Get versions of columns only within the specified timestamp range, [minStamp, maxStamp). |
Scan | setTimeStamp(long timestamp) - Get versions of columns with the specified timestamp. |
Map<String,Object> | toMap(int maxCols) - Compile the details beyond the scope of getFingerprint (row, columns, timestamps, etc.) into a Map along with the fingerprinted information. |
Methods inherited from class org.apache.hadoop.hbase.client.Query |
---|
getACL, getACLStrategy, getAuthorizations, getIsolationLevel, setACL, setACL, setACLStrategy, setAuthorizations, setIsolationLevel |
Methods inherited from class org.apache.hadoop.hbase.client.OperationWithAttributes |
---|
getAttribute, getAttributeSize, getAttributesMap, getId, setAttribute, setId |
Methods inherited from class org.apache.hadoop.hbase.client.Operation |
---|
toJSON, toJSON, toMap, toString, toString |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
@Deprecated public static final String HINT_LOOKAHEAD
addColumn(byte[], byte[])
Scan s = new Scan(...);
s.addColumn(...);
s.setAttribute(Scan.HINT_LOOKAHEAD, Bytes.toBytes(2));
Default is 0 (always reseek).
public static final String SCAN_ATTRIBUTES_METRICS_ENABLE
public static final String SCAN_ATTRIBUTES_METRICS_DATA
public static final String SCAN_ATTRIBUTES_TABLE_NAME
Constructor Detail |
---|
public Scan()
public Scan(byte[] startRow, Filter filter)
public Scan(byte[] startRow)
If the specified row does not exist, the Scanner will start from the next closest row after the specified row.
startRow
- row to start scanner at or after
public Scan(byte[] startRow, byte[] stopRow)
startRow
- row to start scanner at or after (inclusive)
stopRow
- row to stop scanner before (exclusive)
public Scan(Scan scan) throws IOException
scan
- The scan instance to copy from.
IOException
- When copying the values fails.
public Scan(Get get)
get
- get to model scan after
Method Detail |
---|
public boolean isGetScan()
public Scan addFamily(byte[] family)
Overrides previous calls to addColumn for this family.
family
- family name
public Scan addColumn(byte[] family, byte[] qualifier)
Overrides previous calls to addFamily for this family.
family
- family name
qualifier
- column qualifier
public Scan setTimeRange(long minStamp, long maxStamp) throws IOException
minStamp
- minimum timestamp value, inclusive
maxStamp
- maximum timestamp value, exclusive
IOException
- if invalid time range
See Also: setMaxVersions(), setMaxVersions(int)
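The half-open interval [minStamp, maxStamp) used by setTimeRange can be illustrated with a small plain-Java check. The class TimeRangeCheck is a hypothetical helper written for this illustration, not an HBase class; it only shows which timestamps fall inside an inclusive-minimum, exclusive-maximum range.

```java
// Hypothetical helper illustrating half-open timestamp ranges; not HBase code.
final class TimeRangeCheck {

    // Returns true when ts lies in [minStamp, maxStamp):
    // minStamp itself is included, maxStamp itself is excluded.
    static boolean within(long ts, long minStamp, long maxStamp) {
        return ts >= minStamp && ts < maxStamp;
    }

    public static void main(String[] args) {
        System.out.println(within(100L, 100L, 200L)); // prints true
        System.out.println(within(199L, 100L, 200L)); // prints true
        System.out.println(within(200L, 100L, 200L)); // prints false
    }
}
```

Under this model, a single timestamp t can be expressed as the range [t, t + 1).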
public Scan setTimeStamp(long timestamp) throws IOException
timestamp
- version timestamp
IOException
See Also: setMaxVersions(), setMaxVersions(int)
public Scan setStartRow(byte[] startRow)
startRow
- row to start scan on (inclusive)
Note: In order to make startRow exclusive add a trailing 0 byte
public Scan setStopRow(byte[] stopRow)
stopRow
- row to end at (exclusive)
Note: In order to make stopRow inclusive add a trailing 0 byte
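The trailing 0 byte trick in the two notes above relies on row keys being ordered lexicographically as unsigned bytes: a row key followed by a single 0x00 byte is the smallest key that sorts strictly after the original. The sketch below shows this in plain Java; RowBounds and its methods are hypothetical helpers written for this illustration, not HBase APIs.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helpers illustrating the trailing 0 byte trick; not HBase code.
final class RowBounds {

    // Returns a copy of row with one 0x00 byte appended
    // (Arrays.copyOf pads the extra position with zero).
    static byte[] withTrailingZero(byte[] row) {
        return Arrays.copyOf(row, row.length + 1);
    }

    // Lexicographic comparison that treats bytes as unsigned values.
    static int compare(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] row = "row5".getBytes(StandardCharsets.UTF_8);
        byte[] bound = withTrailingZero(row);
        // row sorts strictly before bound, so an exclusive stop at bound still includes row,
        // and an inclusive start at bound skips row.
        System.out.println(compare(row, bound) < 0); // prints true
    }
}
```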
public Scan setMaxVersions()
public Scan setMaxVersions(int maxVersions)
maxVersions
- maximum versions for each column
public void setBatch(int batch)
batch
- the maximum number of values
public void setMaxResultsPerColumnFamily(int limit)
limit
- the maximum number of values returned per row per column family
public void setRowOffsetPerColumnFamily(int offset)
offset
- is the number of kvs that will be skipped.
public void setCaching(int caching)
Set the number of rows for caching that will be passed to scanners. If not set, the value from HTable.getScannerCaching() will apply. Higher caching values will enable faster scanners but will use more memory.
caching
- the number of rows for caching
public long getMaxResultSize()
See setMaxResultSize(long).
public void setMaxResultSize(long maxResultSize)
maxResultSize
- The maximum result size in bytes.
public Scan setFilter(Filter filter)
Apply the specified server-side filter when performing the Query. Filter.filterKeyValue(Cell) is called AFTER all tests for ttl, column match, deletes and max versions have been run.
setFilter
in class Query
filter
- filter to run on the server
public Scan setFamilyMap(Map<byte[],NavigableSet<byte[]>> familyMap)
familyMap
- map of family to qualifier
public Map<byte[],NavigableSet<byte[]>> getFamilyMap()
public int numFamilies()
public boolean hasFamilies()
public byte[][] getFamilies()
public byte[] getStartRow()
public byte[] getStopRow()
public int getMaxVersions()
public int getBatch()
public int getMaxResultsPerColumnFamily()
public int getRowOffsetPerColumnFamily()
public int getCaching()
public TimeRange getTimeRange()
public Filter getFilter()
getFilter
in class Query
public boolean hasFilter()
public void setCacheBlocks(boolean cacheBlocks)
This is true by default. When true, default settings of the table and family are used (this will never override caching blocks if the block cache is disabled for that family or entirely).
cacheBlocks
- if false, default settings are overridden and blocks will not be cached
public boolean getCacheBlocks()
public Scan setReversed(boolean reversed)
This is false by default, which means a forward (normal) scan.
reversed
- if true, the scan will run in backward order
public boolean isReversed()
public void setLoadColumnFamiliesOnDemand(boolean value)
public Boolean getLoadColumnFamiliesOnDemandValue()
public boolean doLoadColumnFamiliesOnDemand()
public Map<String,Object> getFingerprint()
getFingerprint
in class Operation
public Map<String,Object> toMap(int maxCols)
toMap
in class Operation
maxCols
- a limit on the number of columns output prior to truncation
public void setRaw(boolean raw)
raw
- True/False to enable/disable "raw" mode.
public boolean isRaw()
public void setSmall(boolean small)
Set whether this scan is a small scan. A small scan should use pread, whereas a big scan can use seek + read. Seek + read is fast but can cause two problems: (1) resource contention and (2) too much network I/O. See "[89-fb] Using pread for non-compaction read request", https://issues.apache.org/jira/browse/HBASE-7266. On the other hand, if this is set to true, openScanner, next and closeScanner are performed in one RPC call, which gives better performance for small scans [HBASE-9488]. Generally, if the scan range is within one data block (64KB), it can be considered a small scan.
small
public boolean isSmall()