Typesense Search in Python — Core Concepts

Integrate Typesense into Python apps: define collections, index documents, build search queries with filters, facets, and geo-search using the typesense-python client.

Typesense is an open-source, typo-tolerant search engine optimized for low-latency search-as-you-type experiences. It keeps its index in memory, which is how it targets sub-10ms response times, and it requires an explicit schema for indexed data. Python applications talk to it through the typesense client library (installed with pip install typesense).

Setup and connection

import typesense

client = typesense.Client({
    'api_key': 'your-api-key',
    'nodes': [
        {'host': 'localhost', 'port': '8108', 'protocol': 'http'}
    ],
    'connection_timeout_seconds': 5
})

For high-availability clusters:

client = typesense.Client({
    'api_key': 'your-api-key',
    'nodes': [
        {'host': 'node1.example.com', 'port': '8108', 'protocol': 'https'},
        {'host': 'node2.example.com', 'port': '8108', 'protocol': 'https'},
        {'host': 'node3.example.com', 'port': '8108', 'protocol': 'https'},
    ],
    'connection_timeout_seconds': 5
})
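When node addresses come from configuration rather than being hard-coded, the nodes list can be derived from plain URLs. The following is a minimal sketch; build_client_config is a hypothetical helper (not part of typesense-python), and falling back to port 8108 when a URL omits the port is an assumption based on Typesense's default port.

```python
from urllib.parse import urlparse


def build_client_config(api_key, node_urls, timeout_seconds=5):
    """Build a config dict shaped like the ones passed to typesense.Client above.

    Hypothetical helper for illustration; not part of typesense-python.
    """
    nodes = []
    for url in node_urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.hostname:
            raise ValueError(f"Not a valid node URL: {url!r}")
        # Assumption: use Typesense's default port when the URL has none.
        port = parsed.port or 8108
        nodes.append({
            "host": parsed.hostname,
            "port": str(port),
            "protocol": parsed.scheme,
        })
    return {
        "api_key": api_key,
        "nodes": nodes,
        "connection_timeout_seconds": timeout_seconds,
    }


config = build_client_config(
    "your-api-key",
    ["https://node1.example.com:8108", "https://node2.example.com"],
)
print(config["nodes"])
```

The resulting dictionary can be passed to typesense.Client(config) in place of a literal.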

Creating collections

Typesense requires a collection with a defined schema before you can index documents. This differs from schemaless engines, which accept documents without any up-front field definitions.

schema = {
    'name': 'products',
    'fields': [
        {'name': 'name', 'type': 'string'},
        {'name': 'description', 'type': 'string'},
        {'name': 'price', 'type': 'float'},
        {'name': 'category', 'type': 'string', 'facet': True},
        {'name': 'brand', 'type': 'string', 'facet': True},
        {'name': 'rating', 'type': 'float'},
        {'name': 'in_stock', 'type': 'bool'},
        {'name': 'tags', 'type': 'string[]', 'facet': True},
    ],
    'default_sorting_field': 'rating'
}

client.collections.create(schema)

The default_sorting_field determines the tiebreaker when relevance scores are equal. It must be a numeric field, so choose one that represents item quality, such as rating, popularity, or a recency timestamp.
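A mistake in default_sorting_field only surfaces when the collection is created, so it can be worth checking the schema dictionary first. The sketch below uses a hypothetical helper, check_default_sorting_field, and assumes the field must be declared in fields with a numeric type; the exact set of accepted types (int32, int64, float) is an assumption to confirm against the Typesense documentation for your server version.

```python
# Assumption: these are the types accepted for default_sorting_field.
NUMERIC_TYPES = {"int32", "int64", "float"}


def check_default_sorting_field(schema):
    """Return a list of problems with default_sorting_field (empty if none)."""
    sort_field = schema.get("default_sorting_field")
    if sort_field is None:
        return []  # nothing to check
    by_name = {field["name"]: field for field in schema["fields"]}
    if sort_field not in by_name:
        return [f"default_sorting_field {sort_field!r} is not declared in fields"]
    field_type = by_name[sort_field]["type"]
    if field_type not in NUMERIC_TYPES:
        return [f"default_sorting_field {sort_field!r} has non-numeric type {field_type!r}"]
    return []


good_schema = {
    "name": "products",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "rating", "type": "float"},
    ],
    "default_sorting_field": "rating",
}
bad_schema = dict(good_schema, default_sorting_field="name")

print(check_default_sorting_field(good_schema))
print(check_default_sorting_field(bad_schema))
```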

Indexing documents

# Single document
client.collections['products'].documents.create({
    'id': '1',
    'name': 'Wireless Headphones Pro',
    'description': 'Noise-cancelling over-ear headphones with 30h battery',
    'price': 149.99,
    'category': 'Electronics',
    'brand': 'AudioMax',
    'rating': 4.7,
    'in_stock': True,
    'tags': ['wireless', 'noise-cancelling', 'bluetooth']
})

# Bulk import (much faster than one request per document)
import json

documents = [...]  # List of dicts
# The import endpoint expects JSON Lines: one JSON object per line
jsonl = '\n'.join(json.dumps(doc) for doc in documents)
client.collections['products'].documents.import_(jsonl, {'action': 'upsert'})
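A bulk import reports success or failure per document, so the response is worth inspecting rather than discarding. The sketch below assumes the response is newline-delimited JSON text with one result object per input line (the shape returned when a JSONL string is sent); summarize_import is a hypothetical helper, and the sample response, including its error message, is made up for illustration.

```python
import json


def summarize_import(response_text):
    """Split a JSONL import response into a success count and a failure list."""
    succeeded = 0
    failures = []
    for line_number, line in enumerate(response_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        result = json.loads(line)
        if result.get("success"):
            succeeded += 1
        else:
            failures.append((line_number, result.get("error", "unknown error")))
    return succeeded, failures


# Illustrative response text; the error message is invented for this example.
sample_response = "\n".join([
    '{"success": true}',
    '{"success": false, "error": "Field `price` must be a float."}',
    '{"success": true}',
])

succeeded, failures = summarize_import(sample_response)
print(f"{succeeded} imported, {len(failures)} failed")
for line_number, error in failures:
    print(f"  line {line_number}: {error}")
```

Because result lines follow the order of the input, the line number maps back to the position of the failing document in the original list.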

Searching

results = client.collections['products'].documents.search({
    'q': 'wireles headphnes',  # Typos handled automatically
    'query_by': 'name,description,tags',
    'query_by_weights': '3,1,2',  # name is 3x more important
    'filter_by': 'category:Electronics && price:<200 && in_stock:true',
    'sort_by': 'rating:desc',
    'facet_by': 'category,brand',
    'per_page': 10,
    'page': 1
})

print(f"Found {results['found']} results in {results['search_time_ms']}ms")

for hit in results['hits']:
    doc = hit['document']
    print(f"{doc['name']} — ${doc['price']} — ★{doc['rating']}")

# Facet counts
for facet in results['facet_counts']:
    print(f"\n{facet['field_name']}:")
    for value in facet['counts']:
        print(f"  {value['value']}: {value['count']}")
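The filter_by string in the search call above is usually assembled from user selections rather than typed by hand. Here is a minimal sketch of such a builder; build_filter is a hypothetical helper, and it does not escape special characters in values, which real input would need.

```python
def build_filter(clauses):
    """Join (field, operator, value) triples into a filter_by string using &&."""
    parts = []
    for field, operator, value in clauses:
        if isinstance(value, bool):
            # Booleans are written in lowercase in filter expressions
            rendered = "true" if value else "false"
        elif isinstance(value, (list, tuple)):
            # A bracketed list matches any of the listed values
            rendered = "[" + ",".join(str(v) for v in value) + "]"
        else:
            rendered = str(value)
        parts.append(f"{field}:{operator}{rendered}")
    return " && ".join(parts)


filter_by = build_filter([
    ("category", "", "Electronics"),
    ("price", "<", 200),
    ("in_stock", "", True),
])
print(filter_by)  # category:Electronics && price:<200 && in_stock:true
```

The result can be passed as the 'filter_by' value of the search parameters.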

Geo-search

Typesense supports location-based filtering and sorting:

# Add geo field to schema
# {'name': 'location', 'type': 'geopoint'}

# Search within 10km of a point
results = client.collections['stores'].documents.search({
    'q': 'coffee',
    'query_by': 'name',
    'filter_by': 'location:(48.8566, 2.3522, 10 km)',
    'sort_by': 'location(48.8566, 2.3522):asc'
})
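To sanity-check geo results, or to display a distance next to each hit, the great-circle distance between the query point and a stored geopoint can be computed on the client. This sketch uses the haversine formula with a mean Earth radius; it is not necessarily how Typesense computes distances internally, and the store coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


center = (48.8566, 2.3522)   # query point from the search above
store = (48.8606, 2.3376)    # hypothetical store location
distance = haversine_km(*center, *store)
print(f"{distance:.2f} km from center; within 10 km: {distance <= 10}")
```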

Synonyms

Define synonyms so searches for “phone” also match “mobile” and “cell”:

client.collections['products'].synonyms.upsert('phone-synonyms', {
    'synonyms': ['phone', 'mobile', 'cell', 'smartphone']
})
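The rule above is multi-way: every listed word matches every other. Typesense also supports one-way synonyms through a root key, where a search for the root matches the listed words but not the other way around. The sketch below only builds and checks payloads; the rule IDs, the word lists, and the validate_synonym_rule helper are hypothetical.

```python
def validate_synonym_rule(rule):
    """Return 'multi-way' or 'one-way' for a synonym payload, or raise ValueError."""
    words = rule.get("synonyms")
    if not words or not all(isinstance(w, str) and w for w in words):
        raise ValueError("'synonyms' must be a non-empty list of strings")
    if "root" in rule:
        if not isinstance(rule["root"], str) or not rule["root"]:
            raise ValueError("'root' must be a non-empty string")
        return "one-way"
    if len(words) < 2:
        raise ValueError("a multi-way rule needs at least two words")
    return "multi-way"


# Hypothetical rule IDs and word lists, keyed the way upsert() takes them.
synonym_rules = {
    "phone-synonyms": {"synonyms": ["phone", "mobile", "cell", "smartphone"]},
    "headphones-one-way": {"root": "headphones", "synonyms": ["earbuds", "earphones"]},
}

for rule_id, rule in synonym_rules.items():
    print(rule_id, "->", validate_synonym_rule(rule))
    # With a live client, each rule would then be sent with:
    # client.collections['products'].synonyms.upsert(rule_id, rule)
```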

Common misconception

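Developers often worry that Typesense’s in-memory requirement makes it expensive for large datasets. In practice the footprint is often modest: memory use is driven by the fields you index rather than by everything stored in each record, and a dataset of 10 million products with a few short searchable fields typically needs about 2-4GB of RAM, which costs under $20/month on many cloud providers. Treat these figures as rough estimates and measure with your own data. For search boxes, the speed benefit usually justifies the memory cost.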

One thing to remember: Typesense takes an opinionated, schema-first approach that trades some flexibility for speed and built-in typo tolerance, which suits search boxes where every millisecond of latency matters.
