Maxcompute
Maxcompute Sink connector
Description
Used to write data to MaxCompute.
Key features
Options
| Name | Type | Required | Default value |
|---|---|---|---|
| accessId | string | yes | - |
| accesskey | string | yes | - |
| endpoint | string | yes | - |
| project | string | yes | - |
| table_name | string | yes | - |
| partition_spec | string | no | - |
| overwrite | boolean | no | false |
| common-options | string | no | - |
accessId [string]
Your MaxCompute accessId, which you can obtain from Alibaba Cloud.
accesskey [string]
Your MaxCompute accessKey, which you can obtain from Alibaba Cloud.
endpoint [string]
Your MaxCompute endpoint, starting with http.
project [string]
The MaxCompute project you created in Alibaba Cloud.
table_name [string]
The name of the target MaxCompute table, for example: fake.
partition_spec [string]
The partition spec of a partitioned MaxCompute table, for example: ds='20220101'.
overwrite [boolean]
Whether to overwrite the table or partition. Default: false.
save_mode_create_template
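The connector uses a template to create MaxCompute tables automatically. It generates the CREATE TABLE statement from the upstream data types and schema, and you can modify the default template as needed. It currently works only in multi-table mode.
Default template: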
CREATE TABLE IF NOT EXISTS `${table}` (
${rowtype_fields}
) COMMENT '${comment}';
If you add a custom field to the template, for example an id field:
CREATE TABLE IF NOT EXISTS `${table}`
(
id,
${rowtype_fields}
) COMMENT '${comment}';
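The connector automatically obtains the corresponding type from the upstream to complete the field definition
and removes the id field from rowtype_fields. This approach can be used to customize field types and attributes.
You can use the following placeholders:
- database: the database in the upstream schema
- table_name: the table name in the upstream schema
- rowtype_fields: all fields in the upstream schema, automatically mapped to MaxCompute field definitions
- rowtype_primary_key: the primary key in the upstream schema (may be a list)
- rowtype_unique_key: the unique key in the upstream schema (may be a list)
- comment: the table comment in the upstream schema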
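As a rough illustration of how a customized template might be supplied in a job config, the sketch below passes it as a HOCON multi-line string. It assumes multi-table mode, assumes that the upstream schema contains an id field, and uses placeholder credentials; only option names documented on this page are used.

```hocon
sink {
  Maxcompute {
    accessId = "<your access id>"
    accesskey = "<your access Key>"
    endpoint = "<http://service.odps.aliyun.com/api>"
    project = "<your project>"
    table_name = "<your table name>"
    # Customized template: id is listed explicitly, the remaining
    # upstream fields are expanded from ${rowtype_fields}.
    save_mode_create_template = """
      CREATE TABLE IF NOT EXISTS `${table}` (
        id,
        ${rowtype_fields}
      ) COMMENT '${comment}';
    """
  }
}
```

The triple-quoted string keeps the `${...}` placeholders literal, so they are left for the connector to fill in rather than being treated as HOCON substitutions.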
schema_save_mode [Enum]
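Before the synchronization task starts, choose how to handle the existing table structure on the target side.
Options:
RECREATE_SCHEMA: creates the table if it does not exist; drops and recreates it if it already exists. If partition_spec is set, the partition is dropped and recreated.
CREATE_SCHEMA_WHEN_NOT_EXIST: creates the table if it does not exist; skips creation if it already exists. If partition_spec is set, the partition is created.
ERROR_WHEN_SCHEMA_NOT_EXIST: reports an error if the table does not exist.
IGNORE: skips table handling.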
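A minimal sketch of setting schema_save_mode together with partition_spec, assuming you want the table and the ds='20220101' partition created only when they are missing. The credentials are placeholders and the partition value is the illustrative one used earlier on this page.

```hocon
sink {
  Maxcompute {
    accessId = "<your access id>"
    accesskey = "<your access Key>"
    endpoint = "<http://service.odps.aliyun.com/api>"
    project = "<your project>"
    table_name = "<your table name>"
    partition_spec = "ds='20220101'"
    # Create the table (and the partition above) if it does not exist yet.
    schema_save_mode = "CREATE_SCHEMA_WHEN_NOT_EXIST"
  }
}
```

RECREATE_SCHEMA drops the existing table or partition, so it is probably best reserved for jobs where losing the current contents is acceptable.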
data_save_mode [Enum]
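Before the synchronization task starts, choose how to handle the existing data on the target side.
Options:
DROP_DATA: keeps the database structure and deletes the data.
APPEND_DATA: keeps the database structure and keeps the data.
CUSTOM_PROCESSING: user-defined processing.
ERROR_WHEN_DATA_EXISTS: reports an error if data already exists.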
custom_sql [String]
When data_save_mode is set to CUSTOM_PROCESSING, you should also fill in the custom_sql parameter. It usually holds an executable SQL statement, which is run before the synchronization task.
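The following sketch shows how data_save_mode and custom_sql might be combined. The SQL statement is a hypothetical example that assumes a non-partitioned target table; replace it with whatever preparation your job actually needs.

```hocon
sink {
  Maxcompute {
    accessId = "<your access id>"
    accesskey = "<your access Key>"
    endpoint = "<http://service.odps.aliyun.com/api>"
    project = "<your project>"
    table_name = "<your table name>"
    data_save_mode = "CUSTOM_PROCESSING"
    # Hypothetical statement, executed before the synchronization task.
    custom_sql = "TRUNCATE TABLE <your table name>;"
  }
}
```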
datetime_format [String]
A user-defined format string used to convert LocalDateTime fields to strings.
Use this option to specify a date-time format that matches one of the predefined values in DateTimeUtils.Formatter (for example yyyy-MM-dd HH:mm:ss or yyyyMMddHHmmss).
Example values:
- yyyy-MM-dd HH:mm:ss
- yyyy-MM-dd HH:mm:ss.SSSSSS
- yyyy.MM.dd HH:mm:ss
- yyyy/MM/dd HH:mm:ss
- yyyy/M/d HH:mm
- yyyy-M-d HH:mm
- yyyy/M/d HH:mm:ss
- yyyy-M-d HH:mm:ss
- yyyyMMddHHmmss
Default: yyyy-MM-dd HH:mm:ss
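For instance, a job that needs LocalDateTime values written as compact strings might set the option as sketched below. The chosen format is one of the example values listed above; the credentials are placeholders.

```hocon
sink {
  Maxcompute {
    accessId = "<your access id>"
    accesskey = "<your access Key>"
    endpoint = "<http://service.odps.aliyun.com/api>"
    project = "<your project>"
    table_name = "<your table name>"
    # Must match one of the predefined DateTimeUtils.Formatter patterns.
    datetime_format = "yyyyMMddHHmmss"
  }
}
```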
tunnel_endpoint [String]
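Specifies a custom endpoint URL for the MaxCompute Tunnel service.
By default, the endpoint is inferred automatically from the configured region.
Setting this option overrides that behavior and makes the connector use the given Tunnel endpoint instead.
You usually do not need to set tunnel_endpoint. It is only needed for custom networking, debugging, or local development.
Example values:
- https://dt.cn-hangzhou.maxcompute.aliyun.com
- https://dt.ap-southeast-1.maxcompute.aliyun.com
- http://maxcompute:8080
Default: not set (inferred automatically from the region)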
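A sketch of overriding the Tunnel endpoint, assuming a setup such as a custom network or local debugging where the inferred endpoint is not reachable. The URL is one of the example values above and the credentials are placeholders.

```hocon
sink {
  Maxcompute {
    accessId = "<your access id>"
    accesskey = "<your access Key>"
    endpoint = "<http://service.odps.aliyun.com/api>"
    project = "<your project>"
    table_name = "<your table name>"
    # Only needed when the region-based default does not fit your network.
    tunnel_endpoint = "https://dt.cn-hangzhou.maxcompute.aliyun.com"
  }
}
```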
Common options
For the common sink plugin parameters, see Sink Common Options.
Example
sink {
  Maxcompute {
    accessId="<your access id>"
    accesskey="<your access Key>"
    endpoint="<http://service.odps.aliyun.com/api>"
    project="<your project>"
    table_name="<your table name>"
    #partition_spec="<your partition spec>"
    #overwrite = false
  }
}
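A variant of the example with the two optional settings enabled might look like the sketch below. It assumes a table partitioned by ds and reuses the illustrative partition value from the option description; with overwrite set to true, the existing data in that partition is replaced.

```hocon
sink {
  Maxcompute {
    accessId = "<your access id>"
    accesskey = "<your access Key>"
    endpoint = "<http://service.odps.aliyun.com/api>"
    project = "<your project>"
    table_name = "<your table name>"
    # Write into a single partition and replace its current contents.
    partition_spec = "ds='20220101'"
    overwrite = true
  }
}
```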
Changelog
| Change | Commit | Version |
|---|---|---|
| [Improve][API] Optimize the enumerator API semantics and reduce lock calls at the connector level (#9671) | https://github.com/apache/seatunnel/commit/9212a77140 | 2.3.12 |
| [Bug][Connector-V2] NoSuchMethodError caused by Netty version conflict on Spark 3.3.0 (#9632) | https://github.com/apache/seatunnel/commit/4d2b55ce3c | 2.3.12 |
| [Improve][Connector-V2] Replace deprecated createDownloadSession by buildDownloadSession (#9555) | https://github.com/apache/seatunnel/commit/6862945eef | 2.3.12 |
| [Improve][Connector-V2] Add tunnel_endpoint option to MaxCompute source for emulator test (#9548) | https://github.com/apache/seatunnel/commit/b3f3c527ca | 2.3.12 |
| [Improve][Connector-V2] Support maxcompute sink writer upsert/delete action with upsert session mode (#9462) | https://github.com/apache/seatunnel/commit/eb9c8704b9 | 2.3.12 |
| [Improve][Connector-V2] Support maxcompute sink writer with timestamp field type (#9234) | https://github.com/apache/seatunnel/commit/a513c495e3 | 2.3.12 |
| [Feature][Transform] Support define sink column type (#9114) | https://github.com/apache/seatunnel/commit/ab7119e507 | 2.3.11 |
| [Feature][Checkpoint] Add check script for source/sink state class serialVersionUID missing (#9118) | https://github.com/apache/seatunnel/commit/4f5adeb1c7 | 2.3.11 |
| [Improve] maxcompute options (#9163) | https://github.com/apache/seatunnel/commit/fdacbae1af | 2.3.11 |
| [Fix][Connector-V2] Fix maxcompute write with multi parallelism (#9089) | https://github.com/apache/seatunnel/commit/9426b7ba2c | 2.3.11 |
| [Fix][Connector-V2] Fix maxcompute sink write date less than actual date (#8999) | https://github.com/apache/seatunnel/commit/fc942a599b | 2.3.11 |
| [Fix][Connector-V2] Fix maxcompute read with partition spec (#8896) | https://github.com/apache/seatunnel/commit/e62bf6c65c | 2.3.10 |
| [Fix][Connector-V2] Fix MaxCompute cannot get project and tableName when use schema (#8865) | https://github.com/apache/seatunnel/commit/a24fa8fef6 | 2.3.10 |
| [Improve] restruct connector common options (#8634) | https://github.com/apache/seatunnel/commit/f3499a6eeb | 2.3.10 |
| [Feature][Connector-V2] Support maxcompute source with multi-table (#8582) | https://github.com/apache/seatunnel/commit/0f78242923 | 2.3.10 |
| [Fix][Connector-V2] Fixed adding table comments (#8514) | https://github.com/apache/seatunnel/commit/edca75b0d6 | 2.3.10 |
| [Improve][Connector-V2] MaxComputeSink support create partition in savemode (#8474) | https://github.com/apache/seatunnel/commit/0b8f9de465 | 2.3.10 |
| [Improve][Transform] Rename sql transform table name from 'fake' to 'dual' (#8298) | https://github.com/apache/seatunnel/commit/e6169684fb | 2.3.9 |
| [Feature][Connector-V2] Support MaxCompute save mode (#8277) | https://github.com/apache/seatunnel/commit/44ea675f1e | 2.3.9 |
| [Improve][dist]add shade check rule (#8136) | https://github.com/apache/seatunnel/commit/51ef800016 | 2.3.9 |
| [Feature][Core] Rename result_table_name/source_table_name to plugin_input/plugin_output (#8072) | https://github.com/apache/seatunnel/commit/c7bbd322db | 2.3.9 |
| [Feature][Restapi] Allow metrics information to be associated to logical plan nodes (#7786) | https://github.com/apache/seatunnel/commit/6b7c53d03c | 2.3.9 |
| [Fix] Fix dead link on seatunnel connectors list url (#7453) | https://github.com/apache/seatunnel/commit/62b4f16f4e | 2.3.8 |
| [BugFix][Connector-V2][Maxcompute]fix:Maxcompute sink can't map field(#7164) (#7168) | https://github.com/apache/seatunnel/commit/d5abf8f506 | 2.3.6 |
| [Feature] Add unsupported datatype check for all catalog (#5890) | https://github.com/apache/seatunnel/commit/b9791285a0 | 2.3.4 |
| FakeSource support generate different CatalogTable for MultipleTable (#5766) | https://github.com/apache/seatunnel/commit/a8b93805ea | 2.3.4 |
| [Improve][Common] Introduce new error define rule (#5793) | https://github.com/apache/seatunnel/commit/9d1b2582b2 | 2.3.4 |
| [Improve] Remove use SeaTunnelSink::getConsumedType method and mark it as deprecated (#5755) | https://github.com/apache/seatunnel/commit/8de7408100 | 2.3.4 |
| [Improve][Connector] Add field name to DataTypeConvertor to improve error message (#5782) | https://github.com/apache/seatunnel/commit/ab60790f0d | 2.3.4 |
| [Improve][Test] Move MaxCompute test case file (#5786) | https://github.com/apache/seatunnel/commit/38132f5158 | 2.3.4 |
| [Fix] Fix MaxCompute use not exist SCHEMA option (#5708) | https://github.com/apache/seatunnel/commit/ba4782a67d | 2.3.4 |
| [Feature] Support catalog in MaxCompute Source (#5283) | https://github.com/apache/seatunnel/commit/946d89cb95 | 2.3.4 |
| [Bugfix][Connector-V2][maxcompute] sink commit with Block not exsits on server (#4725) | https://github.com/apache/seatunnel/commit/2760cae73c | 2.3.2 |
| [Bug][Maxcompute] Fix failed to parse some maxcompute type (#3894) | https://github.com/apache/seatunnel/commit/642901f0a2 | 2.3.1 |
| [Improve][build] Give the maven module a human readable name (#4114) | https://github.com/apache/seatunnel/commit/d7cd601051 | 2.3.1 |
| [Improve][Project] Code format with spotless plugin. (#4101) | https://github.com/apache/seatunnel/commit/a2ab166561 | 2.3.1 |
| [Feature][Connector] add get source method to all source connector (#3846) | https://github.com/apache/seatunnel/commit/417178fb84 | 2.3.1 |
| [Feature][API & Connector & Doc] add parallelism and column projection interface (#3829) | https://github.com/apache/seatunnel/commit/b9164b8ba1 | 2.3.1 |
| [Feature][Connector-V2][Maxcompute] Add Maxcompute source & sink connector (#3640) | https://github.com/apache/seatunnel/commit/80cf8f4e42 | 2.3.0 |