Token Compression

The model has a fixed context window. Token compression is how OpenHuman keeps long conversations, large Memory Trees, and bulky tool results from hitting that ceiling.

What gets compressed

Source	Method
Web Search results	Snippet extraction - keeps the top 3 results, drops the rest
Web Scraper output	Strip, then truncate: 1 MB input cap, 50 K output cap
Memory recall results	Semantic deduplication before passing chunks to the model
Long tool outputs	Truncation at a line limit, with a "see file" hint
Conversation history	Rewritten as a summary when the turns exceed the context window
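
As an illustration of two rows in the table above, here is a minimal sketch of result capping and line truncation. It is not OpenHuman's actual code: the function names, the `title`/`snippet` keys, the 200-line default, and the wording of the hint are all assumptions made for the example.

```python
from typing import Dict, List


def keep_top_results(results: List[Dict[str, str]], limit: int = 3) -> List[Dict[str, str]]:
    """Keep the first `limit` results (assumed already ranked), reduced to title and snippet."""
    return [
        {"title": r.get("title", ""), "snippet": r.get("snippet", "")}
        for r in results[:limit]
    ]


def truncate_lines(text: str, max_lines: int = 200, file_hint: str = "output.log") -> str:
    """Cut `text` at `max_lines` lines and append a pointer to the full output.

    `max_lines` and `file_hint` are hypothetical parameters, not documented flags.
    """
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text  # short enough: pass through unchanged
    omitted = len(lines) - max_lines
    kept = lines[:max_lines]
    kept.append(f"[... {omitted} more lines omitted; see file {file_hint}]")
    return "\n".join(kept)
```

The hint line tells the model that more output exists and where it lives, so it can ask for the file instead of assuming the output ended there.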

How it works

Raw input → Filter (ads, nav, boilerplate) → Chunk → Dedupe → Summarize (if over limit) → Model
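
The pipeline above can be sketched roughly as follows. This is a simplified stand-in rather than the real implementation: every name here is hypothetical, tokens are approximated by whitespace-separated words, deduplication is exact-match on normalized text (not the semantic deduplication used for memory recall), and the default "summarizer" simply truncates.

```python
import re
from typing import Callable, List

# Hypothetical patterns for lines treated as ads, nav, or boilerplate.
BOILERPLATE = re.compile(r"^(advertisement|cookie|subscribe|home\s*\|)", re.IGNORECASE)


def filter_lines(raw: str) -> List[str]:
    """Drop blank lines and lines that look like boilerplate."""
    stripped = (ln.strip() for ln in raw.splitlines())
    return [ln for ln in stripped if ln and not BOILERPLATE.match(ln)]


def chunk(lines: List[str], max_words: int = 50) -> List[str]:
    """Group consecutive lines into chunks of at most roughly `max_words` words."""
    chunks: List[str] = []
    current: List[str] = []
    count = 0
    for ln in lines:
        n = len(ln.split())
        if current and count + n > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(ln)
        count += n
    if current:
        chunks.append(" ".join(current))
    return chunks


def dedupe(chunks: List[str]) -> List[str]:
    """Remove repeated chunks, comparing case- and whitespace-normalized text."""
    seen = set()
    unique = []
    for c in chunks:
        key = " ".join(c.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(c)
    return unique


def naive_summarize(text: str, budget: int) -> str:
    """Placeholder summarizer: keeps the first `budget` words."""
    return " ".join(text.split()[:budget])


def compress(raw: str, budget: int = 100, max_words: int = 50,
             summarize: Callable[[str, int], str] = naive_summarize) -> str:
    """Filter, chunk, dedupe, and summarize only if the result is over budget."""
    text = "\n".join(dedupe(chunk(filter_lines(raw), max_words)))
    if len(text.split()) > budget:
        text = summarize(text, budget)
    return text
```

Summarization runs last and only when needed, so input that already fits the budget reaches the model without a lossy rewrite.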

Configuration

Flag	Default	What it does
`MAX_SEARCH_RESULTS`	3	Results kept per search
`MAX_SCRAPE_BYTES`	1 MB	Input cap per page
`MAX_MEMORY_CHUNKS`	20	Chunks recalled per query
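
For illustration, the defaults above could be modeled like this. The sketch assumes the flags are read from environment variables and that 1 MB means 1,000,000 bytes; neither is stated on this page, and `CompressionConfig` and `load_config` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class CompressionConfig:
    max_search_results: int = 3        # MAX_SEARCH_RESULTS
    max_scrape_bytes: int = 1_000_000  # MAX_SCRAPE_BYTES (assumes 1 MB = 10^6 bytes)
    max_memory_chunks: int = 20        # MAX_MEMORY_CHUNKS


def load_config(env: Mapping[str, str] = os.environ) -> CompressionConfig:
    """Build a config from `env`, falling back to the documented defaults."""
    defaults = CompressionConfig()
    return CompressionConfig(
        max_search_results=int(env.get("MAX_SEARCH_RESULTS", defaults.max_search_results)),
        max_scrape_bytes=int(env.get("MAX_SCRAPE_BYTES", defaults.max_scrape_bytes)),
        max_memory_chunks=int(env.get("MAX_MEMORY_CHUNKS", defaults.max_memory_chunks)),
    )
```

Passing a plain dictionary, for example `load_config({"MAX_SEARCH_RESULTS": "5"})`, overrides one value and leaves the others at their defaults.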
