KV Deduplication Logic
Overview
The KV-based deduplication system is enabled via the NEWSFORK_DEDUP environment variable.
Environment Variables
Local Development

    export NEWSFORK_DEDUP=true
    export CF_ACCOUNT_ID="your-account-id"
    export CF_KV_NAMESPACE_ID="your-namespace-id"
    export CF_API_TOKEN="your-api-token"

GitHub Secrets

    gh secret set NEWSFORK_DEDUP --body "true"
    gh secret set CF_ACCOUNT_ID --body "your-account-id"
    gh secret set CF_KV_NAMESPACE_ID --body "your-namespace-id"
    gh secret set CF_API_TOKEN --body "your-api-token"

Deduplication Functions

    // Check whether a URL is a duplicate
    isDuplicate(url: string, config: KVConfig): Promise<DeduplicationResult>

    // Check URLs for duplicates in batches
    batchDeduplication(urls: string[], config: KVConfig, batchSize: number): Promise<BatchDeduplicationResult>

    // Generate statistics from deduplication results
    generateDeduplicationStats(results: DeduplicationResult[]): Record<string, any>

Operation Flow
- Generate URL Hash: Create the KV key from a SHA-1 hash of the URL
- KV Lookup: Check whether the key exists
- Duplicate Check:
  - If the key exists → Duplicate (no processing)
  - If the key does not exist → New (register in KV, then process)
- TTL Setting: Registered keys are deleted automatically after 30 days
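
The steps above can be sketched roughly as follows. This is a minimal illustration and not the code in test/utils/kv-deduplication.ts: the `KeyValueStore` interface, the `MemoryStore` stand-in, the `checkAndRegister` helper, and the `dedup:` key prefix are all hypothetical names introduced here, and an in-memory map replaces Cloudflare KV so the example runs without credentials.

```typescript
import { createHash } from "node:crypto";

// Hypothetical minimal store interface; the real module talks to Cloudflare KV.
interface KeyValueStore {
  get(key: string): Promise<string | null>;
  put(key: string, value: string, ttlSeconds: number): Promise<void>;
}

// 30 days expressed in seconds (30 * 24 * 60 * 60 = 2,592,000).
const TTL_SECONDS = 30 * 24 * 60 * 60;

// Step 1: derive the KV key from a SHA-1 hash of the URL.
// The "dedup:" prefix is an assumption made for this sketch.
function urlToKey(url: string): string {
  return "dedup:" + createHash("sha1").update(url).digest("hex");
}

// In-memory stand-in for KV that honors the TTL on read.
class MemoryStore implements KeyValueStore {
  private entries = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string): Promise<string | null> {
    const entry = this.entries.get(key);
    if (!entry) return null;
    if (entry.expiresAt <= Date.now()) {
      // Expired entries behave as if they were never stored.
      this.entries.delete(key);
      return null;
    }
    return entry.value;
  }

  async put(key: string, value: string, ttlSeconds: number): Promise<void> {
    this.entries.set(key, { value, expiresAt: Date.now() + ttlSeconds * 1000 });
  }
}

// Steps 2-4: look the key up, report a duplicate if it exists,
// otherwise register it with the 30-day TTL and report it as new.
async function checkAndRegister(url: string, store: KeyValueStore): Promise<boolean> {
  const key = urlToKey(url);
  const existing = await store.get(key);
  if (existing !== null) {
    return true; // duplicate: the caller skips processing
  }
  await store.put(key, new Date().toISOString(), TTL_SECONDS);
  return false; // new: the caller goes on to process the URL
}
```

In this sketch the lookup and the registration are two separate calls, so two workers checking the same URL at the same moment could both see it as new. Whether that matters depends on how the pipeline is scheduled.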
Activation Conditions
- Set the environment variable NEWSFORK_DEDUP=true
- Set all Cloudflare KV environment variables (CF_ACCOUNT_ID, CF_KV_NAMESPACE_ID, CF_API_TOKEN)
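
One way to express these conditions in code is a small loader that returns a configuration only when every variable is present. This is a sketch under assumptions: the field names on `KVConfig` and the `loadKVConfig` helper are hypothetical, and the actual `KVConfig` type in the project may be shaped differently.

```typescript
// Hypothetical shape; the real KVConfig may use different field names.
interface KVConfig {
  accountId: string;
  namespaceId: string;
  apiToken: string;
}

// Environment as a plain string map, so the function can be tested without process.env.
type Env = Record<string, string | undefined>;

// Returns a config only when NEWSFORK_DEDUP is exactly "true"
// and all three Cloudflare KV variables are non-empty; otherwise null.
function loadKVConfig(env: Env): KVConfig | null {
  if (env["NEWSFORK_DEDUP"] !== "true") {
    return null; // deduplication is switched off
  }
  const accountId = env["CF_ACCOUNT_ID"];
  const namespaceId = env["CF_KV_NAMESPACE_ID"];
  const apiToken = env["CF_API_TOKEN"];
  if (!accountId || !namespaceId || !apiToken) {
    return null; // at least one Cloudflare KV variable is missing or empty
  }
  return { accountId, namespaceId, apiToken };
}
```

A caller would typically pass `process.env` and skip deduplication entirely when the result is `null`.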
Related Files
- test/utils/kv-deduplication.ts: KV connection and duplicate check logic
- test/kv-deduplication.test.ts: Test suite