Guides
Task-oriented walkthroughs for the things people actually do with tsumugi: building a collection, training a ranker, serving search, and keeping the collection fresh.
Each guide is built around a job rather than a flag: turning a crawl into shards, fitting a model over them, standing up a search endpoint, and bringing later crawls in without rebuilding. They assume you have worked through the quick start.
Building a collection
Turn a Parquet or JSONL crawl export into a directory of .tsumugi shards, and choose a shard size that fits your corpus.
Training a model
Fit a LambdaMART ranking model over a collection, and understand the bootstrap label that stands in until real relevance judgments exist.
Serving search
Stand up a search endpoint over a collection, query it, and understand the routing, the latency budget, and why the merged top-k is exact.
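As a rough illustration of the exactness claim, the sketch below shows the general argument rather than tsumugi's own code. It assumes each shard returns its own exact top-k and that scores are comparable across shards. The names shard_top_k and merged_top_k and the sample hits are hypothetical. Any document in the global top-k is necessarily in the top-k of the shard that holds it, so merging the per-shard lists loses nothing.

```python
import heapq
from typing import List, Tuple

# A hit is (score, doc_id). Hypothetical shape, not tsumugi's wire format.
Hit = Tuple[float, str]


def shard_top_k(hits: List[Hit], k: int) -> List[Hit]:
    """Exact top-k within one shard, highest score first."""
    return heapq.nlargest(k, hits)


def merged_top_k(per_shard: List[List[Hit]], k: int) -> List[Hit]:
    """Merge per-shard top-k lists and keep the k best overall."""
    candidates = [hit for shard in per_shard for hit in shard]
    return heapq.nlargest(k, candidates)


# Three made-up shards with already-scored documents.
shards = [
    [(0.91, "a"), (0.40, "b"), (0.10, "c")],
    [(0.88, "d"), (0.87, "e"), (0.05, "f")],
    [(0.30, "g")],
]
k = 2

merged = merged_top_k([shard_top_k(s, k) for s in shards], k)

# Brute force over every document in every shard, for comparison.
brute = heapq.nlargest(k, [hit for s in shards for hit in s])

assert merged == brute
```

The merge only ever sees k candidates per shard, yet it agrees with the brute-force scan over all documents. The guarantee depends on each shard's list being exact; if shards returned approximate results, the merged list would inherit that approximation.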
Keeping a collection fresh
Bring later crawls into a collection with add, and merge accumulated shards back down with compact, without rewriting what is already there.