Skip to content

automated Multilingual Documentation Pipeline Planning (EN → EN)

This document outlines the planning and design of a pipeline that automatically translates documents written in Korean into English and synchronizes them to the en/ folder. It does not include code implementation, defining only the architecture, triggers, tools, implementation steps, and Best Practices.


  • Principle: The primary writing language is Korean (ko/). All source document are written under docs/src/content/docs/ko/.
  • Automation: Whenever a Korean document is pushed, English translation is automatically generated and saved in the same relative path at docs/src/content/docs/en/.
  • Synchronization: Only content is synchronized while maintaining Astro Starlight’s multilingual structure (ko/ , en/). This involves translation, not simple file copying.
IncludeExclude
EN auto-translation when docs/src/content/docs/ko/**/*.md changesko/ manual translation of original
Frontmatter title , description translationWebsite build·deployment (maintain existing process)
Preserve Markdown body, code blocks·Mermaid syntaxReal-time translation API integration (batch only)

2. Pipeline Architecture (GitHub Actions-based)

Section titled “2. Pipeline Architecture (GitHub Actions-based)”
  1. Trigger: Developer pushes (or PRs) Korean documentation (.md) to docs/src/content/docs/ko/.
  2. Detection: GitHub Actions detects changes to on.push.paths (or pull_request.paths) via docs/src/content/docs/ko/**.
  3. Translation: Based on the list of modified/added files, translate only the body and frontmatter text into English using the DeepL API or OpenAI (GPT-4o) API while preserving the Markdown format (code blocks, frontmatter, Mermaid).
  4. Sync: Save the translation results to the same relative path as docs/src/content/docs/en/. (e.g., ko/v1/guides/foo.mden/v1/guides/foo.md)
  5. Commit / PR: Commit the generated/modified English files to the repository, or commit to a dedicated branch and create a PR. Include [skip ci] in the commit message to prevent the translation commit from triggering the workflow again.
[개발자: ko/**/*.md Push]
[GitHub Actions: paths = docs/src/content/docs/ko/**]
[변경된 .md 파일 목록 추출]
[번역 스크립트: DeepL or OpenAI]
- Frontmatter title, description 번역
- 본문 번역 (코드/Mermaid 블록은 스킵 또는 보존)
[en/ 동일 경로에 저장]
[git commit & push] or [Create PR]

  • Option A – DeepL API
    • Pros: Cost-quality balance, Markdown-friendly.
    • Tool example: deepl-transformer Type action or direct call script (Python/Node).
  • Option B – OpenAI (GPT-4o) API
    • Advantages: Can explicitly specify directives like “preserve code blocks, Mermaid, frontmatter structure”.
    • Implementation: Use GitHub Actions secrets (OPENAI_API_KEY), request/parse via Python/Node script.
  • Checkout: actions/checkout@v4 (Full directory or modified files only).
  • Translation Script:
    • Input: --src docs/src/content/docs/ko (or list of changed files),
    • Output: --dest docs/src/content/docs/en
    • Script location example: scripts/translate_docs.py or scripts/translate_docs.mjs.
  • Auto-commit:
    • Commit only changes to en/ using stefanzweifel/git-auto-commit-action@v5,
    • commit_message: e.g., docs: automatic translation KO → EN [skip ci].

  • .cursor/rules/docs.md
    • Add/maintain the Multilingual·Translation Policy section.
    • Key points:
      • Original content is written only in ko/.
      • en/ is automatically generated via GitHub Actions, so avoid manual editing.
      • When translating, ensure the Frontmatter fields title and description are included in the translation target.
  • File Location: .github/workflows/translate.yml (or translate-docs.yml).
  • Trigger:
    -on.push.paths: docs/src/content/docs/ko/**
    • (Optional) on.pull_request.paths: Preview translation results when PRing to the same path.
  • Job:
    1. Checkout, 2) Runtime (Python/Node) setup, 3) Execute translation script (inject API key into environment variables), 4) Auto-commit or create PR.

4.3 Translation Script Requirements (Planning Level)

Section titled “4.3 Translation Script Requirements (Planning Level)”
  • Input:
    • Source directory: docs/src/content/docs/ko
    • (Optional) Support argument to process only files modified in this Push.
  • Processing:
    • Markdown parsing: Distinguish Frontmatter / Body / Code blocks / Mermaid blocks.
    • Translation targets: Text within Frontmatter (title, description) and body text excluding code/Mermaid.
    • Recommended to pass Glossary when calling API (see 5.2 below).
  • Output:
    • Generate files under en/ in the same relative path.
    • Overwrite existing en/ files (or explicitly state “overwrite only when changed”).
  • If ko and en are already configured in astro.config.mjs’s locales, English will appear in the Language Selection at the top of the site immediately upon file creation in en/. Minimize separate build configuration changes.

5. Architect Recommendations (Best Practices)

Section titled “5. Architect Recommendations (Best Practices)”
  • Machine translation alone does not guarantee quality.
  • Critical documents (whitepapers, user manuals, etc.) should undergo a review phase after automatic translation.
  • Utilize the Frontmatter field status: review to distinguish the “Translation Complete·Pending Review” status, and implement a policy to switch to published after review completion.
  • Ensure proper nouns and domain-specific terms like ‘personalization engine’, ‘weight logic’, ‘seed contract’, etc., ensure proper nouns and domain-specific terms are consistently translated by designing a script that passes the Glossary along with translation API calls.
  • Manage the Glossary as a project file (e.g., docs/glossary.csv, docs/glossary.json) and read it via CI to pass as API parameters.
  • Manually translating and copying to en/ constitutes Toil (waste) and causes version mismatches and duplicate edits.
  • Automation via GitHub Actions is the default; manual handling is recommended only for exceptions (e.g., urgent fixes).

6. Deliverables and Checklist (for implementation reference)

Section titled “6. Deliverables and Checklist (for implementation reference)”
StepDeliverableNotes
1Reflect multilingual/translation policy in .cursor/rules/docs.mdUpdate only if partially implemented
2.github/workflows/translate.yml (or equivalent workflow)Trigger, job, script invocation
3Translation script (e.g., scripts/translate_docs.py or .mjs)API selection (DeepL/OpenAI), Frontmatter/body/code separation
4Glossary file and script integrationOptional but recommended
5Secret configuration (e.g., OPENAI_API_KEY , DEEPL_API_KEY)Repo Secrets
6Automated commit actions or PR creation logicIncludes [skip ci]

  • Trigger: Execute GitHub Actions when docs/src/content/docs/ko/** changes.
  • Translation: Translate text only from KO → EN using DeepL or OpenAI (GPT-4o), preserving Markdown structure.
  • Sync: Save translation results to the same path: docs/src/content/docs/en/.
  • Commit: Auto-commit or create PR, prevent re-execution via [skip ci].
  • Policy: Original source: ko/. Avoid automatic generation or manual editing of en/. For critical documents, use status: review for human-in-the-loop review and maintain term consistency via Glossary.

Based on this planning document, the next steps can specify the detailed translation script design and workflow YAML·prompt configuration.