
Personalization Engine Design

This document defines the design principles, data flow, and Redis-based caching strategy for the Newsfork news personalization engine.

The personalization engine collects user click and dwell-time signals to estimate an interest vector, then adjusts news feed rankings based on that vector.

| Signal | Description | Weight Reflection |
|---|---|---|
| Click | Whether the article was clicked | Increases the weight of the relevant topic/entity on click |
| Dwell time | Time spent on the article page (seconds) | Counted as a positive signal if it exceeds the threshold |

  • Click: Increases the weight for the article’s category/tag/entity by a fixed percentage.
  • Dwell time: If the dwell time is above the set threshold (e.g., 30 seconds), updates the weight in the same direction as a click. Short dwells are either ignored or have decay applied.
  • Decay: Periodically reduces interest weights over time, allowing recent actions to have greater influence.
  • Key Pattern: user:{userId}:interests, a per-user interest vector (hash or sorted set).
  • TTL: Set TTL for inactive user keys to limit memory usage.
  • Write: Perform incremental updates to Redis upon receiving events (clicks/dwells); periodically synchronize with persistent storage after batch aggregation.
  • Read: Queries the interest vector from Redis when a personalized feed is requested.
  • Fallback: Falls back to a default profile or an empty vector on a cache miss, and rebuilds the vector asynchronously from event history.
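
The weight rules in the list above can be sketched in a few lines of Python. This is a minimal illustration, not the engine's actual implementation: the constants (CLICK_BOOST, DWELL_THRESHOLD_S, DECAY_FACTOR, BASE_WEIGHT) and the function names are assumptions chosen for the example, and a plain dict stands in for the Redis hash stored under user:{userId}:interests.

```python
# All constants are illustrative assumptions, not values fixed by the design.
CLICK_BOOST = 0.10        # fixed percentage added per positive signal
DWELL_THRESHOLD_S = 30.0  # dwell above this counts as a positive signal
DECAY_FACTOR = 0.95       # multiplier applied on each periodic decay pass
BASE_WEIGHT = 1.0         # starting weight for a topic seen for the first time


def interest_key(user_id: str) -> str:
    """Build the per-user cache key following user:{userId}:interests."""
    return f"user:{user_id}:interests"


def boost(vector: dict[str, float], topics: list[str]) -> None:
    """Increase each topic's weight by a fixed percentage, in place."""
    for topic in topics:
        current = vector.get(topic, BASE_WEIGHT)
        vector[topic] = current * (1.0 + CLICK_BOOST)


def apply_event(
    vector: dict[str, float],
    topics: list[str],
    clicked: bool,
    dwell_seconds: float,
) -> None:
    """Apply one click/dwell event to the interest vector."""
    if clicked:
        boost(vector, topics)
    if dwell_seconds > DWELL_THRESHOLD_S:
        # A long dwell moves the weight in the same direction as a click.
        boost(vector, topics)
    # Short dwells are simply ignored in this sketch.


def decay(vector: dict[str, float]) -> None:
    """Shrink every weight so that recent actions dominate over time."""
    for topic in vector:
        vector[topic] *= DECAY_FACTOR
```

If the vector is stored as a Redis hash, the incremental write would presumably map to commands such as HINCRBYFLOAT for the update and EXPIRE for the inactivity TTL; a multiplicative boost like the one above would instead need a read-modify-write or a server-side script, which is a design choice this sketch leaves open.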

The Mermaid diagram below illustrates the data flow of the personalization engine.

  • Input: Click events, dwell time.
  • Processing: Weight-based interest extraction, decay application.
  • Cache: Redis per-user interest vector, TTL, and fallback policy.
  • Output: Personalized news feed ranking.
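
The cache and output stages summarized above could look roughly like the following sketch. It is written under stated assumptions: InMemoryCache is a hypothetical dict-backed stand-in for Redis so the example runs without a server, the fallback is an empty vector, and the scoring rule (sum of the weights of an article's topics) is one simple choice among many, not the ranking method prescribed by this design.

```python
from typing import Optional

# Assumption: an empty vector serves as the default profile on a cache miss.
DEFAULT_PROFILE: dict[str, float] = {}


class InMemoryCache:
    """Dict-backed stand-in for Redis (hypothetical helper for this sketch)."""

    def __init__(self) -> None:
        self._store: dict[str, dict[str, float]] = {}

    def get_vector(self, key: str) -> Optional[dict[str, float]]:
        return self._store.get(key)

    def set_vector(self, key: str, vector: dict[str, float]) -> None:
        self._store[key] = dict(vector)


def load_interests(
    cache: InMemoryCache, user_id: str
) -> tuple[dict[str, float], bool]:
    """Return (vector, hit); on a miss, return a copy of the default profile."""
    vector = cache.get_vector(f"user:{user_id}:interests")
    if vector is None:
        return dict(DEFAULT_PROFILE), False
    return vector, True


def score(article_topics: list[str], vector: dict[str, float]) -> float:
    """Score an article as the sum of the user's weights for its topics."""
    return sum(vector.get(topic, 0.0) for topic in article_topics)


def rank_feed(
    articles: list[tuple[str, list[str]]], vector: dict[str, float]
) -> list[str]:
    """Order article IDs by descending score; ties keep the incoming order."""
    ranked = sorted(articles, key=lambda a: score(a[1], vector), reverse=True)
    return [article_id for article_id, _ in ranked]
```

On a miss, the caller would also be expected to schedule the asynchronous rebuild from event history; that step is omitted here. Because Python's sort is stable, a user with an empty vector simply sees the feed in its default order.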

This design is a draft. During implementation, its change history will be maintained starting from version 1.0.0.