R2 Migration Speed Improvement Proposal

Document Version: 1.0
Purpose: Analyze why objects move so slowly when running pnpm migrate:r2:raw-prefix and plan how to resolve it. Code changes are not included.

1. Symptoms

Execution: pnpm migrate:r2:raw-prefix (Moves R2 objects under the raw/ prefix)
Target: Approximately 2,190 objects (excluding raw/ and prod/)
Symptoms: The user interrupted the run (^C) after 50 objects had been processed. With sequential processing at an estimated several seconds per object, the full run is expected to take 1 to 3 hours or more.

2. Current Method Bottleneck

2.1 Processing Flow (Per Object)

Step	Action	Notes
1	`wrangler r2 object get bucket/key -f TMP_FILE --remote`	Download (R2 → Local)
2	`wrangler r2 object put bucket/raw/key --file TMP_FILE --remote`	Upload (Local → R2)
3	Delete local temporary file
4	`wrangler r2 object delete bucket/key --remote`	Delete

Per object: three wrangler CLI calls (get, put, delete) plus local disk I/O.
Data Path: R2 → Local → R2 (the object body crosses the network twice: one download and one upload).
Execution Mode: Sequential (each object finishes before the next one starts). 2,190 × 3 (get + put + delete) ≈ 6,570 CLI calls, executed one after another.
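
The call count and the 1 to 3 hour estimate can be reproduced with simple arithmetic. The sketch below is illustrative only: the 2 s and 5 s per-object latencies are assumptions standing in for "several seconds per object", not measured values.

```typescript
// Rough estimate of the sequential run. The per-object latencies passed in
// below are assumptions, not measurements.
const OBJECT_COUNT = 2190;
const CLI_CALLS_PER_OBJECT = 3; // get, put, delete

// Total minutes if objects are processed strictly one after another.
function estimateSequentialMinutes(objectCount: number, secondsPerObject: number): number {
  return (objectCount * secondsPerObject) / 60;
}

const totalCliCalls = OBJECT_COUNT * CLI_CALLS_PER_OBJECT; // 6570
const lowMinutes = estimateSequentialMinutes(OBJECT_COUNT, 2); // 73 (about 1.2 hours)
const highMinutes = estimateSequentialMinutes(OBJECT_COUNT, 5); // 182.5 (about 3 hours)

console.log({ totalCliCalls, lowMinutes, highMinutes });
```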

2.2 Summary of Slow Performance Causes

Cause	Explanation
Sequential Processing	Only one object is processed at a time, leaving CPU and network mostly idle.
Double Transfer of Body Data	Each object is fully downloaded via get and then fully uploaded via put. The data passes through the client even though source and destination are in the same bucket.
CLI Overhead	Every call launches `npx wrangler` and repeats authentication and parsing, and the number of calls is high.
Single Temporary File	Only one object is handled at a time, so disk is a relatively minor bottleneck, but local I/O still occurs for every object.

3. Solution Directions

3.1 Option Comparison

Approach	Overview	Advantages	Disadvantages/Risks
A. S3 CopyObject + DeleteObject	Copies within the same bucket using S3’s CopyObject API. Body moves only within R2. Deletes original with DeleteObject (or DeleteObjects batch) after copying.	Data does not pass through the client. Only CopyObject and DeleteObject are called, reducing cost and latency per call. Parallel calls possible using existing `@aws-sdk/client-s3` (e.g., 20-50 concurrent). Up to 1,000 keys/request can be deleted in batches using DeleteObjects.	Requires adjusting concurrency and retries to match R2/S3 rate limits.
B. Current Approach + Parallelization	Keep wrangler get/put/delete, but process multiple objects simultaneously (e.g., 5-10).	Relatively small implementation change.	The double transfer of object bodies and the CLI overhead remain. Requires N temporary files at once. Increases rate-limit pressure and local disk load.
C. Queue + Worker Distribution	Send each key as a queue message; Workers perform copy+delete.	Theoretically horizontally scalable.	Limited benefit for one-time migration relative to complexity. Worker CPU/time constraints, queue configuration and monitoring overhead.

3.2 Recommended Approach: A. S3 CopyObject + DeleteObject + Parallel/Batch

CopyObject (same bucket):
- Perform a server-side copy only, with CopySource: bucket/key and Key: raw/key.
- The client exchanges only request metadata; object bodies move only within R2.
Parallel Copy:
- Run N CopyObject operations concurrently (e.g., 20-50). Process the keys in batches rather than launching all 2,190 at once.
- Choose N with R2 rate limits and timeouts in mind (start small and increase if no issues appear).
Deletion:
- Batch delete up to 1,000 keys/request using DeleteObjects (S3 Multi-Object Delete).
- Choose policy: “Perform all copies first → verify then batch delete” or “Batch copy first, then batch delete only that batch”.
Utilize existing assets:
- Use existing loadR2FileList (S3 ListObjectsV2) for list queries.
- Authentication and endpoints remain the same as the existing R2_ACCESS_KEY_ID, R2_SECRET_ACCESS_KEY, R2_ACCOUNT_ID, etc.
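
The copy-then-delete flow described above could be sketched roughly as follows. This is a minimal illustration, not the planned implementation: `MoveDeps`, `forEachWithLimit`, `chunk`, and `moveToRawPrefix` are hypothetical names, the S3 calls are injected as plain functions so the control flow can be read (and tested) without network access, and the sketch follows the "perform all copies first, then batch delete" policy with the verification step omitted.

```typescript
// Hypothetical dependency interface. In a real script these two functions
// would presumably wrap CopyObjectCommand and DeleteObjectsCommand.
interface MoveDeps {
  copy(sourceKey: string, destKey: string): Promise<void>;
  deleteBatch(keys: string[]): Promise<void>;
}

interface MoveResult {
  moved: string[]; // keys whose copy succeeded
  failed: string[]; // keys whose copy failed (left in place for retry/review)
}

// Split an array into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Run `fn` over all items with at most `limit` calls in flight at once.
async function forEachWithLimit<T>(
  items: T[],
  limit: number,
  fn: (item: T) => Promise<void>,
): Promise<void> {
  let next = 0;
  const worker = async (): Promise<void> => {
    while (next < items.length) {
      const item = items[next++];
      await fn(item);
    }
  };
  const workers = Array.from({ length: Math.min(limit, items.length) }, () => worker());
  await Promise.all(workers);
}

// Copy every key to raw/<key> in parallel, then batch-delete only the
// originals whose copy succeeded (skipped entirely when noDelete is true).
async function moveToRawPrefix(
  keys: string[],
  deps: MoveDeps,
  concurrency = 20,
  noDelete = false,
): Promise<MoveResult> {
  const moved: string[] = [];
  const failed: string[] = [];
  await forEachWithLimit(keys, concurrency, async (key) => {
    try {
      await deps.copy(key, `raw/${key}`);
      moved.push(key);
    } catch {
      failed.push(key);
    }
  });
  if (!noDelete) {
    // DeleteObjects accepts up to 1,000 keys per request.
    for (const batch of chunk(moved, 1000)) {
      await deps.deleteBatch(batch);
    }
  }
  return { moved, failed };
}
```

In an actual script, `copy` and `deleteBatch` would likely be thin wrappers that send `CopyObjectCommand` and `DeleteObjectsCommand` from `@aws-sdk/client-s3`, and the key list would come from the existing loadR2FileList. Keeping failed keys out of the delete batches means a failed copy never removes the original.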

4. Implementation Considerations (Planning Level)

Item	Details
CopyObject Source Format	For a same-bucket copy via the S3 API, `CopySource` takes the form `bucket/key` and must be URL-encoded. Confirm the exact format against the SDK documentation.
Concurrency Limit	Start with 20-50 concurrent CopyObject operations. If 429 (throttling) responses occur, retry with backoff and reduce concurrency.
Deletion Timing	Either (1) batch delete only after all copies have succeeded, or (2) collect the keys whose copy succeeded and call DeleteObjects on them periodically. Failed keys are retried or reviewed manually.
dry-run / --no-delete	Supported as before. dry-run prints the target keys without copying or deleting. --no-delete performs CopyObject only and skips DeleteObjects.
Progress / Logging	Report progress per batch. Log failed keys or save them to a file for re-execution or manual handling.
Idempotency	If `raw/key` already exists, decide the CopyObject policy (overwrite or not) and whether to “skip if already in raw/”.
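
Three of the considerations above (CopySource encoding, backoff on 429, and the "skip if already in raw/" option) can be illustrated with small helpers. This is a hedged sketch: `buildCopySource`, `selectKeysToCopy`, `isThrottled`, and `withBackoff` are hypothetical names, the per-segment encoding is one reasonable reading of "URL-encoded", the retry parameters are placeholders, and the error shape checked in `isThrottled` assumes the `$metadata.httpStatusCode` field that AWS SDK v3 service errors carry. Reducing concurrency after throttling is not shown.

```typescript
// Build the CopySource value "bucket/key", URL-encoding each path segment
// while keeping the "/" separators intact.
function buildCopySource(bucket: string, key: string): string {
  const encodedKey = key.split("/").map(encodeURIComponent).join("/");
  return `${encodeURIComponent(bucket)}/${encodedKey}`;
}

// One possible idempotency policy ("skip if already in raw/"): drop source
// keys whose destination raw/<key> already exists in the listing.
function selectKeysToCopy(sourceKeys: string[], existingKeys: Set<string>): string[] {
  return sourceKeys.filter((key) => !existingKeys.has(`raw/${key}`));
}

// Assumption: throttling surfaces as an error carrying HTTP status 429
// under $metadata.httpStatusCode.
function isThrottled(err: unknown): boolean {
  const status = (err as { $metadata?: { httpStatusCode?: number } } | null)?.$metadata
    ?.httpStatusCode;
  return status === 429;
}

// Retry `fn` with exponential backoff (base, 2x base, 4x base, ...) when it
// is throttled. Other errors, or exhausting maxAttempts, rethrow immediately.
async function withBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 200,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (!isThrottled(err) || attempt >= maxAttempts) throw err;
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
}

// Example: a key containing a space and a plus sign.
console.log(buildCopySource("my-bucket", "photos/a b+c.jpg"));
// → my-bucket/photos/a%20b%2Bc.jpg
```

The `sleep` parameter is injectable so the retry schedule can be checked without real delays. Whether to overwrite or skip existing `raw/` keys remains a policy decision; `selectKeysToCopy` shows only the skip variant.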

5. Expected Effects (Qualitative)

Metric	Current (Sequential get/put/delete)	Improvement (CopyObject + DeleteObjects, Parallel/Batch)
Network Transfer per Object	Body transferred twice (download + upload)	Body never passes through the client (server-side copy)
API Calls per Object	3 (get, put, delete)	1 CopyObject call, plus 1 DeleteObjects call per up to 1,000 keys
Execution Method	2,190 objects processed sequentially	Parallel copy (e.g., 20-50 concurrent), batch delete (1,000 keys/request)
Estimated Duration	1–3+ hours	Expected to range from a few minutes to about 20 minutes, depending on concurrency and rate limits (specific values to be adjusted after implementation and measurement).
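
For reference, the request counts behind this comparison work out as below. The sketch only counts calls and rounds of concurrent copies; it makes no latency assumption, so it does not predict wall-clock time. `copyRounds` is a hypothetical helper name.

```typescript
// Request counts for the current and proposed approaches (2,190 objects).
const objectCount = 2190;
const deleteBatchSize = 1000; // DeleteObjects limit per request

const currentCalls = objectCount * 3; // get + put + delete = 6570
const copyCalls = objectCount; // one CopyObject per object = 2190
const deleteCalls = Math.ceil(objectCount / deleteBatchSize); // 3
const proposedCalls = copyCalls + deleteCalls; // 2193

// Number of rounds needed if `concurrency` copies run at once.
function copyRounds(count: number, concurrency: number): number {
  return Math.ceil(count / concurrency);
}

console.log({
  currentCalls,
  proposedCalls,
  roundsAt20: copyRounds(objectCount, 20), // 110
  roundsAt50: copyRounds(objectCount, 50), // 44
});
```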

6. Summary

Item	Description
Issue	Current migration is very slow due to sequential execution of get → put → delete per object.
Cause	Sequential processing, double transfer of object bodies, repeated wrangler CLI calls.
Recommendation	Use S3 CopyObject (same bucket) + DeleteObjects (batch), with parallel execution for CopyObject.
Next Steps	Design and implement migration script (or dedicated module) using the above approach, then validate with dry-run/small-scale testing before full execution.

This document defines speed improvement directions only; code changes will be handled as a separate task.