public class TableTunnel extends Object
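Data upload and download are handled by two sessions, TableTunnel.UploadSession and TableTunnel.DownloadSession, respectively. Sample code (copying the data of one table into another table):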
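import java.io.IOException;

import com.aliyun.odps.Odps;
import com.aliyun.odps.account.Account;
import com.aliyun.odps.account.AliyunAccount;
import com.aliyun.odps.data.Record;
import com.aliyun.odps.data.RecordReader;
import com.aliyun.odps.data.RecordWriter;
import com.aliyun.odps.tunnel.TableTunnel;
import com.aliyun.odps.tunnel.TableTunnel.DownloadSession;
import com.aliyun.odps.tunnel.TableTunnel.UploadSession;
import com.aliyun.odps.tunnel.TunnelException;

public class Sample {
  // Fill in credentials, endpoints, project, and table names.
  private static String accessID = "";
  private static String accessKey = " ";
  private static String odpsURL = " ";
  private static String tunnelURL = " ";
  private static String project = " ";
  private static String table1 = " ";
  private static String table2 = " ";

  public static void main(String[] args) {
    Account account = new AliyunAccount(accessID, accessKey);
    Odps odps = new Odps(account);
    odps.setEndpoint(odpsURL);
    odps.setDefaultProject(project);

    TableTunnel tunnel = new TableTunnel(odps);
    tunnel.setEndpoint(tunnelURL);

    try {
      // Read every record of table1 ...
      DownloadSession downloadSession = tunnel.createDownloadSession(project, table1);
      long count = downloadSession.getRecordCount();
      RecordReader recordReader = downloadSession.openRecordReader(0, count);
      Record record;

      // ... and write it to table2 through block 0.
      UploadSession uploadSession = tunnel.createUploadSession(project, table2);
      RecordWriter recordWriter = uploadSession.openRecordWriter(0);
      while ((record = recordReader.read()) != null) {
        recordWriter.write(record);
      }
      recordReader.close();
      recordWriter.close();

      // Commit the uploaded block.
      uploadSession.commit(new Long[]{0L});
    } catch (TunnelException e) {
      e.printStackTrace();
    } catch (IOException e1) {
      e1.printStackTrace();
    }
  }
}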
Modifier and Type | Class and Description |
---|---|
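class | TableTunnel.DownloadSession: DownloadSession represents a session that downloads data from an ODPS table and is normally created through TableTunnel. The session ID uniquely identifies the session and is available via TableTunnel.DownloadSession.getId(). The total number of records in the table is available via TableTunnel.DownloadSession.getRecordCount(); callers can use this total to start concurrent downloads. DownloadSession reads data by creating a RecordReader, for which the start position and the number of records to read must be specified. The HTTP request behind a RecordReader times out after 300 s, after which the service side closes it. |
static class | TableTunnel.DownloadStatus: Status of a download session. UNKNOWN (unknown), NORMAL (normal), CLOSED (closed), EXPIRED (expired). |
class | TableTunnel.UploadSession: UploadSession represents a session that uploads data to an ODPS table and is normally created through TableTunnel. An upload session has INSERT INTO semantics: multiple or repeated upload sessions against the same table or partition do not affect one another. The session ID uniquely identifies the session and is available via TableTunnel.UploadSession.getId(). UploadSession writes data by creating RecordWriters. Each RecordWriter corresponds to one HTTP request, and a single UploadSession can create multiple RecordWriters. A block ID must be specified when creating a RecordWriter; the block ID uniquely identifies the RecordWriter and must lie in the range [0, 20000). A single block can hold at most 100 GB of uploaded data. Within one UploadSession, opening a RecordWriter several times with the same block ID overwrites earlier data: only the data uploaded by the last RecordWriter to call close() is kept. close() must not be called more than once on the same RecordWriter instance. The HTTP request behind a RecordWriter times out after 120 s: if no data is transferred within 120 s, the service side closes the connection. Note that the HTTP protocol itself has an 8 KB buffer. |
static class | TableTunnel.UploadStatus: UploadStatus represents the current status of an upload. UNKNOWN (unknown), NORMAL (normal), CLOSING (closing), CLOSED (closed), CANCELED (canceled), EXPIRED (expired), CRITICAL (critical error). |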
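The DownloadSession description says that the record total can drive concurrent downloads and that each RecordReader takes a start position and a record count. A minimal sketch of how such (start, count) ranges might be computed is shown here. `DownloadRanges`, `Range`, and `split` are hypothetical names for illustration and are not part of the ODPS SDK.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of the ODPS SDK): splits a record total into reader ranges. */
final class DownloadRanges {

    /** One (start, count) pair, matching the two arguments passed to openRecordReader. */
    static final class Range {
        final long start;
        final long count;

        Range(long start, long count) {
            this.start = start;
            this.count = count;
        }
    }

    /** Splits recordCount records into at most `workers` contiguous, non-empty ranges. */
    static List<Range> split(long recordCount, int workers) {
        if (recordCount < 0 || workers <= 0) {
            throw new IllegalArgumentException("recordCount must be >= 0 and workers > 0");
        }
        List<Range> ranges = new ArrayList<>();
        long base = recordCount / workers;   // records every worker gets
        long extra = recordCount % workers;  // the first `extra` workers get one more
        long start = 0;
        for (int i = 0; i < workers; i++) {
            long count = base + (i < extra ? 1 : 0);
            if (count > 0) {
                ranges.add(new Range(start, count));
            }
            start += count;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // In real use, recordCount would come from DownloadSession.getRecordCount().
        for (Range r : split(10, 3)) {
            System.out.println("start=" + r.start + " count=" + r.count);
        }
    }
}
```

Each worker thread would then open its own reader with `downloadSession.openRecordReader(range.start, range.count)`, in the same way the sample code opens a single reader over `(0, count)`.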
Constructor and Description |
---|
TableTunnel(Odps odps): Constructs a TableTunnel object. |
Modifier and Type | Method and Description |
---|---|
TableTunnel.DownloadSession | createDownloadSession(String projectName, String tableName): Creates a download session on a non-partitioned table. |
TableTunnel.DownloadSession | createDownloadSession(String projectName, String tableName, boolean async): Creates a download session on a non-partitioned table, optionally asynchronously. |
TableTunnel.DownloadSession | createDownloadSession(String projectName, String tableName, long shardId): Creates a download session on a shard table. |
TableTunnel.DownloadSession | createDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec): Creates a download session on a partitioned table. |
TableTunnel.DownloadSession | createDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, boolean async): Creates a download session on a partitioned table, optionally asynchronously. |
TableTunnel.DownloadSession | createDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, long shardId): Creates a download session on a shard table. |
TableTunnel.UploadSession | createUploadSession(String projectName, String tableName): Creates an upload session on a non-partitioned table. |
TableTunnel.UploadSession | createUploadSession(String projectName, String tableName, PartitionSpec partitionSpec): Creates an upload session on a partitioned table. |
TableTunnel.DownloadSession | getDownloadSession(String projectName, String tableName, long shardId, String id): Gets a download session created on a shard of a non-partitioned table. |
TableTunnel.DownloadSession | getDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, long shardId, String id): Gets a download session created on a shard table. |
TableTunnel.DownloadSession | getDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, String id): Gets a download session created on a partitioned table. |
TableTunnel.DownloadSession | getDownloadSession(String projectName, String tableName, String id): Gets a download session created on a non-partitioned table. |
TableTunnel.UploadSession | getUploadSession(String projectName, String tableName, PartitionSpec partitionSpec, String id): Gets an upload session created on a partitioned table. |
TableTunnel.UploadSession | getUploadSession(String projectName, String tableName, PartitionSpec partitionSpec, String id, long shares, long shareId): Gets an upload session on a partitioned table that will upload data through TunnelBufferedWriter. When several such session instances (in multiple processes or threads) share one session ID, each must declare both its own unique identifier (shareId) and the number of instances sharing the session (shares). |
TableTunnel.UploadSession | getUploadSession(String projectName, String tableName, String id): Gets an upload session created on a non-partitioned table. |
TableTunnel.UploadSession | getUploadSession(String projectName, String tableName, String id, long shares, long shareId): Gets an upload session on a non-partitioned table that will upload data through TunnelBufferedWriter. When several such session instances (in multiple processes or threads) share one session ID, each must declare both its own unique identifier (shareId) and the number of instances sharing the session (shares). |
void | setEndpoint(String endpoint): Sets the TunnelServer address. |
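public TableTunnel.UploadSession createUploadSession(String projectName, String tableName) throws TunnelException
Creates an upload session on a non-partitioned table.
Parameters:
projectName - Project name
tableName - Table name (must not be a view)
Returns:
TableTunnel.UploadSession
Throws:
TunnelException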
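An upload session writes through RecordWriters that are each opened with a block ID in the documented range [0, 20000), and the sample code commits a `Long[]` of block IDs. As a rough sketch, the block ID list for a given number of writers could be prepared as follows. `BlockIds` and `forWriters` are hypothetical names, and assigning IDs 0..n-1 is just one simple convention, not something the SDK requires.

```java
/** Hypothetical helper (not part of the ODPS SDK): builds the block ID list for a commit. */
final class BlockIds {

    /** Exclusive upper bound taken from the documented block ID range [0, 20000). */
    static final long MAX_BLOCKS = 20000;

    /** Returns block IDs 0..writers-1, one per RecordWriter. */
    static Long[] forWriters(int writers) {
        if (writers <= 0 || writers > MAX_BLOCKS) {
            throw new IllegalArgumentException(
                "writers must be between 1 and " + MAX_BLOCKS + ", got " + writers);
        }
        Long[] ids = new Long[writers];
        for (int i = 0; i < writers; i++) {
            ids[i] = (long) i;
        }
        return ids;
    }

    public static void main(String[] args) {
        Long[] ids = forWriters(4);
        // Writer i would call uploadSession.openRecordWriter(ids[i]);
        // after all writers are closed, the array would be passed to uploadSession.commit(ids).
        System.out.println("blocks to commit: " + ids.length);
    }
}
```

Giving every writer its own block ID avoids the overwrite behavior described for RecordWriters that reuse a block ID within one session.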
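public TableTunnel.UploadSession createUploadSession(String projectName, String tableName, PartitionSpec partitionSpec) throws TunnelException
Creates an upload session on a partitioned table.
Note: the partition must be a last-level partition. For example, if the table has two partition levels, pt and ds, values must be given for both; specifying only one of them is not supported.
Parameters:
projectName - Project name
tableName - Table name (must not be a view)
partitionSpec - The partition to use; see PartitionSpec
Returns:
TableTunnel.UploadSession
Throws:
TunnelException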
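The requirement that every partition level be given a value can be checked on the client before a session is created. The sketch below does this on a plain `name=value,name=value` string. `PartitionSpecCheck` and `isFullySpecified` are hypothetical names, the string format is an assumption for illustration (quoting of values is not handled), and the partition column list would have to come from the table's schema.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not part of the ODPS SDK): checks that all partition levels have values. */
final class PartitionSpecCheck {

    /**
     * Returns true only if `spec` (assumed form: "pt=20190101,ds=1") assigns a
     * non-empty value to every column in partitionColumns.
     */
    static boolean isFullySpecified(List<String> partitionColumns, String spec) {
        Map<String, String> values = new HashMap<>();
        for (String part : spec.split(",")) {
            String[] kv = part.trim().split("=", 2);
            if (kv.length == 2 && !kv[1].trim().isEmpty()) {
                values.put(kv[0].trim(), kv[1].trim());
            }
        }
        for (String column : partitionColumns) {
            if (!values.containsKey(column)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> columns = Arrays.asList("pt", "ds");
        System.out.println(isFullySpecified(columns, "pt=20190101,ds=1"));
        System.out.println(isFullySpecified(columns, "pt=20190101"));
    }
}
```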
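public TableTunnel.UploadSession getUploadSession(String projectName, String tableName, String id, long shares, long shareId) throws TunnelException
Gets an upload session on a non-partitioned table that will upload data through TunnelBufferedWriter.
When several such session instances (in multiple processes or threads) share one session ID, each must declare both its own unique identifier (shareId) and the number of instances sharing the session (shares).
final String sid = ""; Thread t1 = new Thread() {
Parameters:
projectName - Project name
tableName - Table name (must not be a view)
id - ID of the upload session; see TableTunnel.UploadSession.getId()
shares - Number of UploadSession instances sharing this session ID
shareId - Unique identifier of this UploadSession; non-negative integers starting from 0 are recommended
Returns:
TableTunnel.UploadSession
Throws:
TunnelException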
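This page does not say how instances that share a session ID keep their block IDs apart. One plausible scheme, shown purely as an assumption, is to interleave: instance `shareId` uses blocks `shareId`, `shareId + shares`, `shareId + 2 * shares`, and so on, staying inside the documented range [0, 20000). `SharedBlockIds` is a hypothetical class, and the check that `shareId` is smaller than `shares` is also an assumption of this sketch.

```java
/** Hypothetical sketch (not the SDK's implementation): disjoint block IDs per sharing instance. */
final class SharedBlockIds {

    /** Exclusive upper bound taken from the documented block ID range [0, 20000). */
    static final long MAX_BLOCKS = 20000;

    private final long shares;
    private long next;

    SharedBlockIds(long shares, long shareId) {
        // Assumption of this sketch: shareId values are 0, 1, ..., shares - 1.
        if (shares <= 0 || shareId < 0 || shareId >= shares) {
            throw new IllegalArgumentException("need shares > 0 and 0 <= shareId < shares");
        }
        this.shares = shares;
        this.next = shareId;
    }

    /** Returns the next block ID reserved for this instance. */
    long nextBlockId() {
        if (next >= MAX_BLOCKS) {
            throw new IllegalStateException("no block IDs left for this instance");
        }
        long id = next;
        next += shares;  // skip over the IDs that belong to the other instances
        return id;
    }

    public static void main(String[] args) {
        SharedBlockIds first = new SharedBlockIds(2, 0);
        SharedBlockIds second = new SharedBlockIds(2, 1);
        System.out.println(first.nextBlockId() + " " + first.nextBlockId());
        System.out.println(second.nextBlockId() + " " + second.nextBlockId());
    }
}
```

With this layout two instances never produce the same block ID, so neither can overwrite the other's data within the shared session.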
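public TableTunnel.UploadSession getUploadSession(String projectName, String tableName, PartitionSpec partitionSpec, String id, long shares, long shareId) throws TunnelException
Gets an upload session on a partitioned table that will upload data through TunnelBufferedWriter.
When several such session instances (in multiple processes or threads) share one session ID, each must declare both its own unique identifier (shareId) and the number of instances sharing the session (shares).
Parameters:
projectName - Project name
tableName - Table name (must not be a view)
partitionSpec - The partition to use; see PartitionSpec
id - ID of the upload session; see TableTunnel.UploadSession.getId()
shares - Number of UploadSession instances sharing this session ID
shareId - Unique identifier of this UploadSession; non-negative integers starting from 0 are recommended
Returns:
TableTunnel.UploadSession
Throws:
TunnelException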
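public TableTunnel.UploadSession getUploadSession(String projectName, String tableName, String id) throws TunnelException
Gets an upload session created on a non-partitioned table.
Parameters:
projectName - Project name
tableName - Table name (must not be a view)
id - ID of the upload session; see TableTunnel.UploadSession.getId()
Returns:
TableTunnel.UploadSession
Throws:
TunnelException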
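public TableTunnel.UploadSession getUploadSession(String projectName, String tableName, PartitionSpec partitionSpec, String id) throws TunnelException
Gets an upload session created on a partitioned table.
Parameters:
projectName - Project name
tableName - Table name (must not be a view)
partitionSpec - Partition description of the table being uploaded to; see PartitionSpec
id - ID of the upload session; see TableTunnel.UploadSession.getId()
Returns:
TableTunnel.UploadSession
Throws:
TunnelException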
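public TableTunnel.DownloadSession createDownloadSession(String projectName, String tableName) throws TunnelException
Creates a download session on a non-partitioned table.
Parameters:
projectName - Project name
tableName - Table name (must not be a view)
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException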
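public TableTunnel.DownloadSession createDownloadSession(String projectName, String tableName, boolean async) throws TunnelException
Creates a download session on a non-partitioned table.
Parameters:
projectName - Project name
tableName - Table name (must not be a view)
async - Whether to create the session asynchronously; this can avoid connection timeouts when the table has many small files
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException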
public TableTunnel.DownloadSession createDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec) throws TunnelException
Creates a download session on a partitioned table.
Parameters:
projectName - Project name
tableName - Table name (must not be a view)
partitionSpec - The partition to use; see PartitionSpec
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException
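public TableTunnel.DownloadSession createDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, boolean async) throws TunnelException
Creates a download session on a partitioned table.
Parameters:
projectName - Project name
tableName - Table name (must not be a view)
partitionSpec - The partition to use; see PartitionSpec
async - Whether to create the session asynchronously; this can avoid connection timeouts when the table has many small files
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException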
public TableTunnel.DownloadSession createDownloadSession(String projectName, String tableName, long shardId) throws TunnelException
Creates a download session on a shard table.
Parameters:
projectName - Project name
tableName - Table name (must not be a view)
shardId - The shard ID to use
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException
public TableTunnel.DownloadSession createDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, long shardId) throws TunnelException
Creates a download session on a shard table.
Parameters:
projectName - Project name
tableName - Table name (must not be a view)
partitionSpec - The partition to use; see PartitionSpec
shardId - The shard ID to use
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException
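public TableTunnel.DownloadSession getDownloadSession(String projectName, String tableName, String id) throws TunnelException
Gets a download session created on a non-partitioned table.
Parameters:
projectName - Project name
tableName - Table name (must not be a view)
id - ID of the download session; see TableTunnel.DownloadSession.getId()
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException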
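public TableTunnel.DownloadSession getDownloadSession(String projectName, String tableName, long shardId, String id) throws TunnelException
Gets a download session created on a shard of a non-partitioned table.
Parameters:
projectName - Project name
tableName - Table name (must not be a view)
shardId - The shard ID to use
id - ID of the download session; see TableTunnel.DownloadSession.getId()
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException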
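public TableTunnel.DownloadSession getDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, String id) throws TunnelException
Gets a download session created on a partitioned table.
Parameters:
projectName - Project name
tableName - Table name (must not be a view)
partitionSpec - The partition to use; see PartitionSpec
id - ID of the download session; see TableTunnel.DownloadSession.getId()
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException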
public TableTunnel.DownloadSession getDownloadSession(String projectName, String tableName, PartitionSpec partitionSpec, long shardId, String id) throws TunnelException
Gets a download session created on a shard table.
Parameters:
projectName - Project name
tableName - Table name (must not be a view)
partitionSpec - The partition to use; see PartitionSpec
shardId - The shard ID to use
id - ID of the download session; see TableTunnel.DownloadSession.getId()
Returns:
TableTunnel.DownloadSession
Throws:
TunnelException
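public void setEndpoint(String endpoint)
Sets the TunnelServer address. If no TunnelServer address is set, one is selected automatically.
Parameters:
endpoint - TunnelServer address
Copyright © 2019 Alibaba Cloud Computing. All rights reserved.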