s3_upload_file_mapper
Mapper to upload local files to S3 and update paths to S3 URLs.
This operator uploads files from local paths to S3 storage. It supports:

- Uploading multiple files concurrently
- Updating file paths in the dataset to S3 URLs
- Optional deletion of local files after successful upload
- Custom S3 endpoints (for S3-compatible services like MinIO)
- Skipping already uploaded files (based on S3 key)

The operator processes nested lists of paths and preserves the original structure in the output.
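The structure-preserving path update described above can be illustrated with a minimal sketch. It is not the operator's actual implementation: the helper names `to_s3_url` and `map_nested`, the `s3://bucket/key` URL form, and the "prefix + file name" key scheme are all assumptions made for illustration.

```python
import os


def to_s3_url(local_path, bucket, prefix=""):
    """Build an S3 URL for a local file (assumed key scheme: prefix + file name)."""
    key = prefix + os.path.basename(local_path)
    return f"s3://{bucket}/{key}"


def map_nested(paths, fn):
    """Apply fn to every path, keeping the nesting of the input lists."""
    if isinstance(paths, list):
        return [map_nested(p, fn) for p in paths]
    return fn(paths)


# Hypothetical sample with a nested list of local paths.
sample = {"videos": [["/data/a.mp4", "/data/b.mp4"], ["/data/c.mp4"]]}
sample["videos"] = map_nested(
    sample["videos"], lambda p: to_s3_url(p, "my-bucket", "videos/")
)
print(sample["videos"])
```

The output keeps the same two-level list shape as the input, with each local path replaced by its S3 URL.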
Type: mapper
Tags: cpu
🔧 Parameter Configuration
| name | type | default | desc |
|---|---|---|---|
|  | <class 'str'> |  | The field name containing file paths to upload. |
|  | <class 'str'> |  | S3 bucket name to upload files to. |
|  | <class 'str'> |  | Prefix (folder path) in S3 bucket. E.g., 'videos/' or 'data/videos/'. |
|  | <class 'str'> |  | AWS access key ID for S3. |
|  | <class 'str'> |  | AWS secret access key for S3. |
|  | <class 'str'> |  | AWS session token for S3 (optional). |
|  | <class 'str'> |  | AWS region for S3. |
|  | <class 'str'> |  | Custom S3 endpoint URL (for S3-compatible services). |
|  | <class 'bool'> |  | Whether to delete local files after successful upload. |
|  | <class 'bool'> |  | Whether to skip uploading if file already exists in S3. |
|  | <class 'int'> |  | Maximum concurrent uploads. |
|  |  |  | extra args |
|  |  |  | extra args |
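
How the skip-existing, delete-after-upload, and concurrency options might interact can be sketched as follows. This is an assumption-laden illustration, not the operator's code: the function `upload_all` and its argument names (`skip_existing`, `remove_local`, `max_concurrent`) are hypothetical, and `client` is assumed to be any object exposing boto3-style `head_object(Bucket=..., Key=...)` and `upload_file(filename, bucket, key)` methods.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def upload_all(client, local_paths, bucket, prefix="",
               skip_existing=True, remove_local=False, max_concurrent=4):
    """Upload files concurrently and return a {local_path: s3_url} mapping."""

    def exists(key):
        # boto3 raises botocore.exceptions.ClientError for a missing key.
        # For brevity this sketch treats any exception as "not found";
        # real code should inspect the error code instead.
        try:
            client.head_object(Bucket=bucket, Key=key)
            return True
        except Exception:
            return False

    def upload_one(path):
        key = prefix + os.path.basename(path)  # assumed key scheme
        url = f"s3://{bucket}/{key}"
        if skip_existing and exists(key):
            return path, url  # already in S3: skip the upload, keep the local file
        client.upload_file(path, bucket, key)
        if remove_local:
            os.remove(path)  # delete only after a successful upload
        return path, url

    # A thread pool bounds the number of simultaneous uploads.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        return dict(pool.map(upload_one, local_paths))
```

With boto3, a compatible client could be created with `boto3.client("s3", aws_access_key_id=..., aws_secret_access_key=..., aws_session_token=..., region_name=..., endpoint_url=...)`, which lines up with the credential, region, and endpoint parameters listed in the table.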