
Data Juicer Hub

DOCS

  • Data Recipe Gallery
    • 1. Data-Juicer Minimal Example Recipe
    • 2. Reproduce Open Source Text Datasets
    • 3. Improved Open Source Pre-training Text Datasets
    • 4. Improved Open Source Post-tuning Text Dataset
    • 5. Synthetic Contrastive Learning Image-text datasets
    • 6. Improved Open Source Image-text datasets
    • 7. Basic Example Recipes for Video Data
    • 8. Synthesize Human-centric Video Benchmarks
    • 9. Improve Existing Open Source Video Datasets
  • Refine Alpaca-CoT Config Files
    • Preprocess
    • Process
  • Notification System
    • Basic Configuration
    • Channel-Specific Settings
    • Email Configuration
    • Secure Password Handling
    • Certificate Authentication
    • Examples
  • BLOOM Config Files
    • Oscar
  • Redpajama Config Files
    • arXiv
    • Books
    • Code
    • StackExchange

© Copyright 2024, Data-Juicer Team.
