跳到主要内容
版本:Next

Kudu

Kudu 源连接器

支持 Kudu 版本

  • 1.11.1/1.12.0/1.13.0/1.14.0/1.15.0

支持这些引擎

Spark
Flink
SeaTunnel Zeta

关键特性

描述

用于从 Kudu 读取数据。

测试的 kudu 版本是 1.11.1。

数据类型映射

Kudu 数据类型SeaTunnel 数据类型
BOOLBOOLEAN
INT8
INT16
INT32
INT
INT64BIGINT
DECIMALDECIMAL
FLOATFLOAT
DOUBLEDOUBLE
STRINGSTRING
UNIXTIME_MICROSTIMESTAMP
BINARYBYTES

源选项

参数名类型必须默认值描述
kudu_mastersString-Kudu master 地址。用 ',' 分隔,例如 '192.168.88.110:7051'。
table_nameString-Kudu 表的名称。
client_worker_countInt2 * Runtime.getRuntime().availableProcessors()Kudu worker 数量。默认值是当前 CPU 核心数的两倍。
client_default_operation_timeout_msLong30000Kudu 普通操作超时时间。
client_default_admin_operation_timeout_msLong30000Kudu 管理操作超时时间。
enable_kerberosBoolfalseKerberos principal 启用。
kerberos_principalString-Kerberos principal。注意所有 zeta 节点都需要有此文件。
kerberos_keytabString-Kerberos keytab。注意所有 zeta 节点都需要有此文件。
kerberos_krb5confString-Kerberos krb5 conf。注意所有 zeta 节点都需要有此文件。
scan_token_query_timeoutLong30000连接扫描令牌的超时时间。如果未设置,将与 operationTimeout 相同。
scan_token_batch_size_bytesInt1024 * 1024Kudu 扫描字节数。一次读取的最大字节数,默认为 1MB。
filterString-Kudu 扫描过滤表达式,例如 id > 100 AND id < 200。
schemaMap1024 * 1024SeaTunnel Schema。
table_listArray-要读取的表列表。您可以使用此配置代替 table_path,例如:table_list = [{ table_name = "kudu_source_table_1"},{ table_name = "kudu_source_table_2"}]
common-options-源插件通用参数,请参考 源通用选项 详见。

任务示例

简单

以下示例针对名为 "kudu_source_table" 的 Kudu 表,目标是在控制台打印此表中的数据并写入 kudu 表 "kudu_sink_table"

# 定义运行时环境
env {
parallelism = 2
job.mode = "BATCH"
}

source {
# 这是一个示例源插件 **仅用于测试和演示源插件功能**
kudu {
kudu_masters = "kudu-master:7051"
table_name = "kudu_source_table"
plugin_output = "kudu"
enable_kerberos = true
kerberos_principal = "xx@xx.COM"
kerberos_keytab = "xx.keytab"
}
}

transform {
}

sink {
console {
plugin_input = "kudu"
}

kudu {
plugin_input = "kudu"
kudu_masters = "kudu-master:7051"
table_name = "kudu_sink_table"
enable_kerberos = true
kerberos_principal = "xx@xx.COM"
kerberos_keytab = "xx.keytab"
}
}

多表

env {
# 您可以在此处设置引擎配置
parallelism = 1
job.mode = "STREAMING"
checkpoint.interval = 5000
}

source {
# 这是一个示例源插件 **仅用于测试和演示源插件功能**
kudu{
kudu_masters = "kudu-master:7051"
table_list = [
{
table_name = "kudu_source_table_1"
},{
table_name = "kudu_source_table_2"
}
]
plugin_output = "kudu"
}
}

transform {
}

sink {
Assert {
rules {
table-names = ["kudu_source_table_1", "kudu_source_table_2"]
}
}
}

变更日志

Change Log
ChangeCommitVersion
[Chore] fix typos filed -> field (#9757)https://github.com/apache/seatunnel/commit/e3e1c67d292.3.12
[Improve][Core] Update apache common to apache common lang3 (#9694)https://github.com/apache/seatunnel/commit/6e5737c1ec2.3.12
[Improve][API] Optimize the enumerator API semantics and reduce lock calls at the connector level (#9671)https://github.com/apache/seatunnel/commit/9212a771402.3.12
[Feature][connector-kudu] implement the filter (#9405)https://github.com/apache/seatunnel/commit/2714dd11052.3.12
[Feature][Checkpoint] Add check script for source/sink state class serialVersionUID missing (#9118)https://github.com/apache/seatunnel/commit/4f5adeb1c72.3.11
[Improve] kudu options (#9162)https://github.com/apache/seatunnel/commit/e7edafdbac2.3.11
[Improve] restruct connector common options (#8634)https://github.com/apache/seatunnel/commit/f3499a6eeb2.3.10
[Improve][Transform] Rename sql transform table name from 'fake' to 'dual' (#8298)https://github.com/apache/seatunnel/commit/e6169684fb2.3.9
[Improve][dist]add shade check rule (#8136)https://github.com/apache/seatunnel/commit/51ef8000162.3.9
[Improve][API] Unified tables_configs and table_list (#8100)https://github.com/apache/seatunnel/commit/84c0b8d6602.3.9
[Feature][Core] Rename result_table_name/source_table_name to plugin_input/plugin_output (#8072)https://github.com/apache/seatunnel/commit/c7bbd322db2.3.9
[Feature][Restapi] Allow metrics information to be associated to logical plan nodes (#7786)https://github.com/apache/seatunnel/commit/6b7c53d03c2.3.9
[Improve][Connector] Add multi-table sink option check (#7360)https://github.com/apache/seatunnel/commit/2489f6446b2.3.7
[Feature][Core] Support using upstream table placeholders in sink options and auto replacement (#7131)https://github.com/apache/seatunnel/commit/c4ca74122c2.3.6
correct the typo of kudu kerberos config (#6905)https://github.com/apache/seatunnel/commit/fcb85549722.3.6
[Fix][KuduCatalogFactory]: Fix KuduCatalogFactory.optionRule() will throw an Exception (#6787)https://github.com/apache/seatunnel/commit/45a4e1532d2.3.6
[Feature][Engine] Unify job env parameters (#6003)https://github.com/apache/seatunnel/commit/2410ab38f02.3.4
[Feature][Connector-V2] Support multi-table sink feature for kudu (#5951)https://github.com/apache/seatunnel/commit/82460c0bf02.3.4
[Feature] Add unsupported datatype check for all catalog (#5890)https://github.com/apache/seatunnel/commit/b9791285a02.3.4
[Feature][Kudu] Support multi-table source read (#5878)https://github.com/apache/seatunnel/commit/8d9a0b7d112.3.4
[Improve][Common] Introduce new error define rule (#5793)https://github.com/apache/seatunnel/commit/9d1b2582b22.3.4
[Feature][Connector-V2] Support TableSourceFactory/TableSinkFactory on kudu (#5789)https://github.com/apache/seatunnel/commit/10e791d60a2.3.4
[Improve] Remove use SeaTunnelSink::getConsumedType method and mark it as deprecated (#5755)https://github.com/apache/seatunnel/commit/8de74081002.3.4
[Feature][Kudu] Refactor Kudu functionality and Sink support CDC data. (#5437)https://github.com/apache/seatunnel/commit/22110eb7b32.3.4
[Improve][build] Give the maven module a human readable name (#4114)https://github.com/apache/seatunnel/commit/d7cd6010512.3.1
[Improve][Project] Code format with spotless plugin. (#4101)https://github.com/apache/seatunnel/commit/a2ab1665612.3.1
[Hotfix][Connector-V2] Fix connector source snapshot state NPE (#4027)https://github.com/apache/seatunnel/commit/e39c4988cc2.3.1
[Feature][Connector] add get source method to all source connector (#3846)https://github.com/apache/seatunnel/commit/417178fb842.3.1
[Feature][API &amp; Connector &amp; Doc] add parallelism and column projection interface (#3829)https://github.com/apache/seatunnel/commit/b9164b8ba12.3.1
[Hotfix][OptionRule] Fix option rule about all connectors (#3592)https://github.com/apache/seatunnel/commit/226dc6a1192.3.0
[Improve][Connector-V2] Bad smell ToArrayCallWithZeroLengthArrayArgument: (#3577)https://github.com/apache/seatunnel/commit/cc448d98c42.3.0
[Improve][Connector-V2][Kudu] Unified exception for kudu source & sink connector (#3564)https://github.com/apache/seatunnel/commit/273418ddc92.3.0
[Connector][Dependency] Add Miss Dependency Cassandra And Change Kudu Plugin Name (#3432)https://github.com/apache/seatunnel/commit/6ac6a0a0cd2.3.0
[Feature][Connector V2] expose configurable options in Kudu (#3365)https://github.com/apache/seatunnel/commit/c422210e2c2.3.0
[Feature][Core][Connector-V2] Unified The way of setting JobName (#2908)https://github.com/apache/seatunnel/commit/bf2c97484b2.3.0-beta
remove duplicate ExceptionUtil class (#3037)https://github.com/apache/seatunnel/commit/c9dc7c50c22.3.0-beta
[Improve][all] change Log to @Slf4j (#3001)https://github.com/apache/seatunnel/commit/6016100f122.3.0-beta
[Improve][Connector-V2]Kudu Sink Connector Support to upsert rowhttps://github.com/apache/seatunnel/commit/1ece805ab12.3.0-beta
[DEV][Api] Replace SeaTunnelContext with JobContext and remove singleton pattern (#2706)https://github.com/apache/seatunnel/commit/cbf82f755c2.2.0-beta
[#2606]Dependency management split (#2630)https://github.com/apache/seatunnel/commit/fc047be69b2.2.0-beta
[Connector-V2] Add Kudu source and sink connector (#2254)https://github.com/apache/seatunnel/commit/0483cbc2df2.2.0-beta