Core Concepts
How TokenSense intercepts calls
TokenSense wraps your LLM client using Python’s__getattr__ proxy pattern. When you call client.messages.create(...), TokenSense:
- Forwards the call to the original client — unchanged
- Waits for the response
- Extracts metadata from the response (tokens, model, cost)
- Emits a
CallEventto a background thread - Returns the original response to your code
observe()
The core function. Wraps any supported LLM client and returns a drop-in replacement.Signature
Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
client | Any | required | LLM client to wrap |
output | BaseOutput | auto | Where events are sent. Auto-detects by ENV if not set |
user_id | string | None | Identifier attached to every event from this client |
session_id | string | None | Groups multiple calls into a session |
tags | list[str] | None | Labels for filtering and segmentation |
log_prompts | bool | False | Include prompt content in events (opt-in) |
log_responses | bool | False | Include response content in events (opt-in) |
on_event | callable | None | Function called after each event is written |
