In the last article, I explained why PDF works as a container for multimodal context. Restoring a thread now works reliably.
But restoring assumes you can find the right thread in the first place. That's the next problem: search.
Nobody Is an Expert in Everything
Let me step back before we get technical.
When you run a one-person company, you inevitably hit domains you know nothing about. Maybe you're a product manager who has never touched infrastructure. Maybe you're an engineer who has never written marketing copy. Maybe you're a designer who has never set up billing.
This is normal. This is expected.
The question is: what do you do when you hit a wall in an unfamiliar domain?
I have some engineering background. I've worked with databases before. I've heard of Pinecone (a cloud vector database). If you asked me a year ago how to build search into an extension, I would have said: "Use a cloud vector database. Set up an API endpoint. Pay for hosting."
That's the pattern I knew. That's the limit of my knowledge.
From a product perspective, I knew this: ChatShuttle needs search. Users will accumulate many threads. They need to find the right one quickly. As a product owner, I could articulate the need. But I didn't know the implementation options.
I didn't know what I didn't know.
The Strategy: Ask AI
Here's the part that feels like a cheat code.
When you're stuck in an unfamiliar domain, most people do one of two things: either they default to the pattern they already know (even if it doesn't fit), or they spend hours reading documentation, blog posts, and Stack Overflow threads trying to piece together a mental model from scratch.
AI offers a third option. You can ask for the landscape, not the answer.
I asked ChatGPT something like: "What are the options for building search into a Chrome extension without using a backend server? I want everything to run locally."
The response didn't give me code. It gave me vocabulary. It gave me a map of possibilities I didn't know existed:
- Chrome extensions can run JavaScript and WASM in the background.
- There are embedding models designed to run in-browser via transformers.js.
- There are lightweight vector index libraries that run entirely in WASM.
This was new to me. In my mental model, Chrome was just a browser. Extensions were just small scripts that modified web pages. I didn't realize Chrome itself is a capable runtime environment.
Before AI, one of the hardest parts of learning a new domain was: how do you even frame the question? If you don't know vector databases exist, you can't search for "best vector database for browser." You're stuck at "how do I make search work."
AI bridges this gap. You start with a vague question. AI gives you unfamiliar terms. You take those terms and ask follow-up questions. You iterate. You refine. Eventually, you arrive at the right question. And then you get the right answer.
In my case:
- I started with: "I need search in a Chrome extension."
- I learned about: vector embeddings, in-browser ML inference, WASM.
- I arrived at: "Can I run a quantized sentence transformer in a browser context and query a local vector index?"
The answer was yes.
Once I understood the possibilities, I realized: in-browser vector search fits my design philosophy perfectly. ChatShuttle's core principle is local-first. No server. No intermediate database. Your data stays on your machine and your Drive.
If I had defaulted to "use Pinecone," I would have violated that principle. I would have built a dependency on external infrastructure. I would have added cost. I would have complicated the privacy story.
AI didn't just teach me a technique. It opened a door that aligned with my values.
The Lesson for One-Person Companies
You don't need to be an expert in everything. You need to:
- Know what you're trying to build. From a product perspective, what does the user need? What problem are you solving?
- Be willing to ask. When you hit an unfamiliar domain, don't assume you know the options. Ask AI to map the landscape for you.
- Iterate. You won't get the right answer on the first prompt. Use each response to sharpen your next question.
- Check alignment. When you find a technical approach, ask: does this fit my values? My constraints? My architecture? Don't adopt a solution just because it's popular.
This is not just about search. It's about every unfamiliar domain you'll encounter as a one-person company.
Now, the Technical Part
Okay. Method talk is over. Let me show you what I actually built.
If you're building local vector search (in browser or on device), here are the engineering points you have to get right.
Model Choice. You need an embedding model that runs in JavaScript/WASM without killing the browser. ChatShuttle uses Xenova/all-MiniLM-L6-v2 via transformers.js. It's a sentence transformer, quantized for CPU. What can go wrong: larger models give better embeddings but freeze the browser; smaller models are fast but miss nuance. Pick quantized models designed for edge inference.
Quantization. Running a 400MB model in the browser is a non-starter. Quantized models trade precision for size. The model ChatShuttle uses is around 30MB quantized. It downloads to a local cache on first run. Use established quantized checkpoints (like those from Xenova). Don't roll your own quantization.
Index File Format. Your embeddings need to be stored somewhere. ChatShuttle generates index.voy, a single file that contains all vectors. This is the AI memory index. This file is stored in your Google Drive (ChatShuttle_Memories folder). The ChatShuttle Nexus skill downloads it to ~/.chatshuttle/cache for local querying. If the index file and Drive content get out of sync, search returns stale results. ChatShuttle shows a "Repair Needed" badge when sync breaks; clicking it rebuilds the index.
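The staleness check behind that badge can be sketched as a manifest comparison. This is a hypothetical helper, not ChatShuttle's actual code: assume each side exposes a map from thread ID to a content hash.

```javascript
// Hypothetical: 'driveManifest' and 'localManifest' map thread IDs to
// content hashes. Any missing, extra, or changed entry means the cached
// index no longer matches what's in Drive.
function indexIsStale(driveManifest, localManifest) {
  const driveIds = Object.keys(driveManifest);
  const localIds = Object.keys(localManifest);
  if (driveIds.length !== localIds.length) return true;
  return driveIds.some((id) => driveManifest[id] !== localManifest[id]);
}
```

A check like this is cheap enough to run on every sync, which is what lets the UI surface "Repair Needed" instead of silently serving stale results.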
Incremental Updates. Rebuilding the entire index every time you import a chat is wasteful. ChatShuttle only generates embeddings for new content and appends to the index. The tradeoff: append-only indexes can accumulate garbage over time (deleted chats still in index). ChatShuttle's repair function doubles as a compactor.
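The bookkeeping side of append-plus-compact is simple to sketch. These are illustrative functions under my own naming, not ChatShuttle's implementation: an index entry pairs a thread ID with its embedding.

```javascript
// Append-only update: cheap on import, but entries for deleted
// threads linger in the index until a compaction pass.
function appendNewEntries(index, newEntries) {
  return index.concat(newEntries);
}

// Compaction (the "repair" pass): keep only entries whose thread
// still exists, dropping accumulated garbage.
function compact(index, liveThreadIds) {
  return index.filter((entry) => liveThreadIds.has(entry.threadId));
}
```

The design choice is to pay nothing extra on the hot path (importing a chat) and defer cleanup to an explicit, user-triggered repair.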
The Boundaries
This is another thing I learned while exploring with AI.
Vector search is great for concepts. "That conversation about authentication" or "the thread where I debugged the API timeout." It turns your AI chat history into a searchable knowledge base.
It struggles with specifics: exact code snippets, literal function names, a line like import { Auth } from './utils'.

When I asked AI about this limitation, it suggested hybrid search: combine semantic search with keyword search. ChatShuttle implements this using Reciprocal Rank Fusion (RRF), merging results from voy-search (semantic) and fuse.js (keyword). The combination covers more ground than either alone. But it's not perfect. You'll still hit edge cases.
The point: AI didn't just help me build search. It helped me understand the boundaries of what I was building. That kind of awareness is hard to get from documentation alone.
Switching Hats
We just went deep on tech. Model choice, quantization, index format, hybrid search.
And now I'm thinking about pricing.
Because compute is local and storage is Drive, the architecture costs $0/month in infrastructure. No server costs to pass to users. No recurring infrastructure debt.
So what do I charge? How do I charge? One-time or subscription? Those are completely different questions from "which embedding model should I use."
This is what running a one-person company feels like. One moment you're debugging WASM performance. The next, you're sketching pricing tiers on a napkin. You wear every hat. You switch contexts constantly.
Some people find that exhausting. I find it clarifying. When you're the only person, there's no handoff, no "let someone else figure out pricing." You carry the full picture. The technical decisions and the business decisions are the same conversation.
Let's continue that conversation next.
Curious how the search actually works? See the documentation for setup and usage.