small, opinionated stacks. one tool per job. we don't debate, we ship. if a tool stops earning its place, we replace it.
ai-native, not ai-flavored. ai isn't a feature we bolt on. it's how we write code, communicate, and operate. we run our own models when it matters, hosted ones when it doesn't.
spec before code, evals before trust. we write specs first so humans and agents work from the same source of truth. and we don't trust llm output until we've measured it.
collaboration
github — code, actions, source of truth.
linear — long-running projects, roadmaps, anything that outlives a discord thread.
discord — everything else. async chat, ai bots, voice when we need it.
ai stack
claude code, opencode — primary coding agents. claude code for daily work, opencode when we need to swap models or run things locally.
openrouter — hosted model access without juggling 12 api keys.
own infra (r&d) — gpu servers for local oss models. privacy, cost, and the freedom to experiment without asking permission.
hermes, openclaw — discord bots wired into our workflow.
mcp servers — openspec, playwright, linear. agents get real tools, not just text.
langfuse — every llm call is observable. if you can't measure it, you can't improve it.
backend
python + fastapi — boring, fast, every library exists for it.
langchain + langgraph — agent orchestration when you need state and branching.
postgres — default database. one db until it isn't enough (usually never).
typesense — search that doesn't require a phd to operate.
neo4j — when relationships matter more than rows.
frontend
react + vite — fast dev loop, no framework lock-in.
assistant-ui — pre-built chat surfaces so we don't rebuild the same interface every time.
clerk — auth that just works.
infra & ops
aws — cloud-first for production.
kubernetes — local test environments and project sandboxes.
sentry — errors. we want to know before users do.
grafana — logs, metrics, dashboards.
stripe — payments. boring answer, right answer.
methodology
spec-driven development with openspec. every project starts with a written spec. agents and humans both work from it. specs evolve with the code, not separately.
local-first where it counts. more of our projects are moving toward local-first architectures — own your data, don't trust the providers, ship something that keeps working when the cloud doesn't.
experimentation is the job. r&d isn't a side activity. trying new models, new patterns, new tools is how we stay ahead of the hype cycle instead of getting flattened by it.
evaluate everything. llms are tools, not oracles. we run evals before we trust outputs in production. "it works on my prompt" is not shipping.
projects we're building in the open
knct/docs [wip] — an opinionated internal knowledge hub. every team reinvents this badly. we're building one we'd actually use.
tilodesk [wip] — the missing layer between dev teams and clients. closing the gap that tickets and slack threads don't fix.
what we've learned
write the spec before the code. agents and humans both work better that way.
evaluate agents. don't trust llms.
observability is non-negotiable. we don't ship without confidence.
boring tech is a competitive advantage. save your weirdness for the product.
one database until it hurts. it rarely hurts.
experimentation isn't optional. r&d is how you stay ahead of the hype.