latex_merge_tex_mapper
Extracts and concatenates all .tex files from a compressed LaTeX project archive into a single text field.
Supported archive formats: .tar, .tar.gz / .tgz, and .zip. Plain .gz (single-file gzip) is not supported: a gzip stream is not guaranteed to record the original filename (the name field in its header is optional), so the operator cannot verify that the content is actually a .tex file. All .tex files found inside the archive are read in memory and joined with a configurable separator. No sorting or deduplication is applied.
This operator is typically placed before LaTeX-processing operators such as remove_comments_mapper, expand_macro_mapper, or latex_figure_context_extractor_mapper.
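The behavior described above can be sketched with the Python standard library alone. This is a minimal illustration, not the operator's actual implementation: the function name `merge_tex_from_archive`, the `separator` default, and the UTF-8 decoding with replacement are all assumptions made for the example.

```python
import tarfile
import zipfile


def merge_tex_from_archive(path: str, separator: str = "\n\n") -> str:
    """Concatenate every .tex member of a .tar, .tar.gz/.tgz, or .zip archive.

    Hypothetical helper: name, defaults, and decoding policy are assumptions.
    """
    lower = path.lower()
    parts = []
    if lower.endswith(".zip"):
        with zipfile.ZipFile(path) as zf:
            # Members are visited in archive order; nothing is sorted or deduplicated.
            for name in zf.namelist():
                if name.lower().endswith(".tex"):
                    parts.append(zf.read(name).decode("utf-8", errors="replace"))
    elif lower.endswith((".tar", ".tar.gz", ".tgz")):
        # Mode "r:*" lets tarfile detect the compression transparently.
        with tarfile.open(path, mode="r:*") as tf:
            for member in tf:
                if member.isfile() and member.name.lower().endswith(".tex"):
                    data = tf.extractfile(member).read()
                    parts.append(data.decode("utf-8", errors="replace"))
    else:
        # Covers plain .gz, where the member's filename cannot be verified.
        raise ValueError(f"unsupported archive format: {path}")
    return separator.join(parts)
```

Everything is read in memory and nothing is written to disk, which mirrors the description above. Rejecting unknown extensions with an exception is one possible policy; the source does not say how the operator reports unsupported files.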
Type: mapper
Tags: cpu, text
🔧 Parameter Configuration
| name | type | default | desc |
|---|---|---|---|
| | | | Field name that stores the archive file path. |
| | | | String used to join the contents of multiple .tex files. |
| | | | Maximum allowed uncompressed size in bytes for a single .tex file. |
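A per-file size limit of this kind could be enforced as in the sketch below, shown for .zip archives. The function name `read_tex_with_size_cap`, the `max_size` argument, and the choice to silently skip oversized files are all assumptions for illustration; the source does not state what the operator does when a file exceeds the limit.

```python
import zipfile


def read_tex_with_size_cap(zip_path: str, max_size: int) -> list[str]:
    """Return decoded .tex members whose uncompressed size is at most max_size bytes.

    Hypothetical helper: skipping oversized files is an assumed policy.
    """
    texts = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            if info.is_dir() or not info.filename.lower().endswith(".tex"):
                continue
            # file_size is the uncompressed size declared in the archive metadata.
            if info.file_size > max_size:
                continue  # assumed policy: skip rather than raise
            texts.append(zf.read(info).decode("utf-8", errors="replace"))
    return texts
```

Checking the declared size before reading avoids loading a large member into memory. Because that value comes from the archive itself, a stricter variant would also bound the number of bytes actually read.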