Feature Store Design in Python — Deep Dive

Build feature pipelines with Feast, implement point-in-time joins, configure online/offline stores, and design feature services for real-time ML serving.

Feast: The Open-Source Feature Store

Feast (Feature Store) is the most widely adopted open-source feature store. It provides feature definition as code, materialization to online/offline stores, and consistent retrieval APIs.

Project Setup

pip install feast
feast init my_feature_repo
cd my_feature_repo

The generated structure:

my_feature_repo/
├── feature_store.yaml    # Infrastructure configuration
├── features.py           # Feature definitions
└── data/                 # Sample data sources

Defining Features

# features.py
from feast import Entity, FeatureView, Field, FileSource
from feast.types import Float32, Int64, String
from datetime import timedelta

# Entity: the thing features describe
customer = Entity(
    name="customer_id",
    join_keys=["customer_id"],
    description="Unique customer identifier"
)

# Data source
customer_stats_source = FileSource(
    path="data/customer_stats.parquet",
    timestamp_field="event_timestamp",
    created_timestamp_column="created_timestamp"
)

# Feature view: a group of related features
customer_stats_fv = FeatureView(
    name="customer_stats",
    entities=[customer],
    ttl=timedelta(days=1),
    schema=[
        Field(name="purchase_count_30d", dtype=Int64),
        Field(name="avg_order_value", dtype=Float32),
        Field(name="days_since_last_purchase", dtype=Int64),
        Field(name="preferred_category", dtype=String),
    ],
    source=customer_stats_source,
    online=True,
    tags={"team": "growth", "owner": "ml-platform"}
)

Infrastructure Configuration

# feature_store.yaml
project: fraud_detection
provider: gcp
registry: gs://my-bucket/feast-registry.pb
online_store:
  type: redis
  connection_string: redis://10.0.0.1:6379
offline_store:
  type: bigquery
  dataset: feast_offline

Materialization

# Load historical data into the offline store and latest values into online
feast apply          # Register feature definitions
feast materialize-incremental $(date -u +"%Y-%m-%dT%H:%M:%S")

Materialization reads from the source, computes the latest value per entity, and writes to the online store. Running it incrementally processes only new data since the last materialization.

Training Data Retrieval with Point-in-Time Joins

from feast import FeatureStore
import pandas as pd

store = FeatureStore(repo_path=".")

# Entity dataframe: events we want features for
entity_df = pd.DataFrame({
    "customer_id": [1001, 1002, 1003, 1001],
    "event_timestamp": pd.to_datetime([
        "2025-06-15 10:00:00",
        "2025-06-15 11:00:00",
        "2025-06-16 09:00:00",
        "2025-06-17 14:00:00",  # Same customer, different time
    ])
})

# Point-in-time correct retrieval
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=[
        "customer_stats:purchase_count_30d",
        "customer_stats:avg_order_value",
        "customer_stats:days_since_last_purchase",
    ]
).to_df()

Each row gets feature values from the most recent feature row before its event timestamp. Customer 1001 appears twice with different timestamps and may receive different feature values — reflecting their state at each point in time.

Online Serving

# Low-latency retrieval for real-time inference
feature_vector = store.get_online_features(
    features=[
        "customer_stats:purchase_count_30d",
        "customer_stats:avg_order_value",
        "customer_stats:preferred_category",
    ],
    entity_rows=[{"customer_id": 1001}]
).to_dict()

# Returns: {"customer_id": [1001], "purchase_count_30d": [42], ...}

Online reads hit the Redis (or DynamoDB) store and typically return in 1-5ms — fast enough for serving paths that need sub-50ms total latency.

Feature Services: Bundling for Models

from feast import FeatureService

fraud_detection_service = FeatureService(
    name="fraud_detection_v2",
    features=[
        customer_stats_fv[["purchase_count_30d", "avg_order_value"]],
        transaction_stats_fv[["txn_amount_zscore", "merchant_risk_score"]],
    ],
    description="Features for fraud detection model v2",
    tags={"model": "fraud-detector-v2"}
)

Feature services pin the exact features a model needs. When the model gets retrained or updated, the feature service version documents which features it consumed — creating a reproducibility link.

Streaming Features

Batch materialization introduces latency — features are only as fresh as the last materialization run. For real-time use cases (fraud detection, dynamic pricing), streaming features update the online store continuously:

from feast import StreamFeatureView
from feast.data_sources import KafkaSource

txn_stream = KafkaSource(
    name="transaction_events",
    kafka_bootstrap_servers="broker:9092",
    topic="transactions",
    timestamp_field="event_timestamp",
    message_format=AvroFormat(schema_json=AVRO_SCHEMA)
)

realtime_txn_stats = StreamFeatureView(
    name="realtime_txn_stats",
    entities=[customer],
    ttl=timedelta(hours=1),
    schema=[
        Field(name="txn_count_last_hour", dtype=Int64),
        Field(name="txn_total_last_hour", dtype=Float32),
    ],
    source=txn_stream,
    aggregations=[
        Aggregation(column="amount", function="count", time_window=timedelta(hours=1)),
        Aggregation(column="amount", function="sum", time_window=timedelta(hours=1)),
    ],
    online=True
)

Design Decisions and Tradeoffs

Offline Store Choice

Store	Latency	Cost	Best For
BigQuery / Redshift	Seconds	Pay-per-query	Large-scale historical joins
Parquet on S3	Minutes	Storage only	Small teams, prototyping
Spark / Databricks	Seconds-minutes	Cluster cost	Existing Spark infrastructure

Online Store Choice

Store	P99 Latency	Cost	Scaling
Redis	<2ms	Memory-bound	Vertical + cluster
DynamoDB	<5ms	Pay-per-request	Automatic
Bigtable	<10ms	Provisioned	Node-based
SQLite	<1ms	Free	Single-machine only

Feature Computation: Push vs Pull

Pull model — the feature store reads from source tables on a schedule. Simple but introduces staleness.

Push model — producers write features directly to the store when events happen. Lower latency but requires producers to know the feature schema.

Most production systems use pull for batch features and push for streaming features.

Monitoring Feature Health

Feature stores need observability:

# Check feature freshness
from datetime import datetime, timedelta

def check_feature_freshness(store, feature_view_name, max_age: timedelta):
    """Alert if features are staler than max_age."""
    fv = store.get_feature_view(feature_view_name)
    last_materialized = fv.materialization_intervals[-1].end_date

    age = datetime.utcnow() - last_materialized
    if age > max_age:
        return {
            "status": "stale",
            "feature_view": feature_view_name,
            "age_hours": age.total_seconds() / 3600,
            "threshold_hours": max_age.total_seconds() / 3600
        }
    return {"status": "fresh", "feature_view": feature_view_name}

Key metrics to track:

Freshness — time since last materialization
Coverage — percentage of entities with non-null values
Distribution drift — statistical distance from training-time distributions
Serving latency — online store read times

Real-World Architecture: DoorDash

DoorDash processes millions of delivery orders daily and uses a feature store to power models for delivery time estimation, fraud detection, and merchant ranking. Their architecture:

Raw events stream through Kafka
Flink jobs compute real-time features (running delivery count, recent cancellation rate)
Features land in Redis for online serving and S3 for offline training
A metadata layer tracks feature ownership, SLAs, and schema versions

The key insight: they compute features once in Flink and serve them to 50+ models through the same online store — eliminating per-model computation and guaranteeing consistency.

One thing to remember: A well-designed feature store separates feature computation from feature consumption, ensuring every model — whether training offline or serving in real time — sees identical, point-in-time correct data.

pythonfeature-storemlopsmachine-learning