How On-Device AI Makes Note Search Actually Useful
Keyword search fails the moment you forget the exact word you used. Semantic search finds notes by meaning, not by matching strings. Here's how it works on-device, why it matters for privacy, and what it looks like in practice.
Keyword search is broken for notes
You wrote a note two weeks ago about a problem you were debugging. Something about a race condition in the payment flow. You remember the idea, but not the exact words.
So you open your note app and search for "race condition." Nothing. You try "concurrency." Still nothing. You try "payment bug." One result, but it's the wrong note.
Turns out you wrote "the checkout handler fires twice when the user double-clicks." All the right information, none of the keywords you searched for.
This is the fundamental problem with keyword search in note apps. It only works when you remember the exact words you used. And with notes, you almost never do. Notes are messy, informal, written in a hurry. You don't write notes the same way you'd write documentation.
Every note app that uses keyword search has this problem. Apple Notes, Obsidian (without plugins), Bear, Stickies, plain grep. They all match characters, not meaning.
What semantic search actually does
Semantic search matches meaning instead of text. When you search for "race condition," it also finds notes about "fires twice," "concurrent requests," "duplicate execution," because those concepts are related even though the words are different.
This isn't magic. It's math.
Every note gets converted into a list of numbers called an embedding vector. Think of it as coordinates in a space where similar ideas are close together. The note about "checkout handler fires twice" and the search query "race condition" end up near each other in that space, because they describe related problems.
When you search, the system converts your query into the same kind of vector and finds the notes whose vectors are closest. The closer the vectors, the more relevant the result.
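To make "closest vectors" concrete, here's a toy sketch in Swift. The three-number vectors are made up for illustration (real embeddings have hundreds of dimensions), but the distance measure, cosine similarity, is the standard one for comparing embeddings.

```swift
import Foundation

// Cosine similarity measures the angle between two vectors:
// close to 1.0 means "similar meaning", close to 0 means unrelated.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double {
    let dot = zip(a, b).map { $0 * $1 }.reduce(0, +)
    let length = { (v: [Double]) in sqrt(v.map { $0 * $0 }.reduce(0, +)) }
    return dot / (length(a) * length(b))
}

// Toy 3-number "embeddings" (real ones have hundreds of dimensions).
let query     = [0.9, 0.1, 0.2]  // "race condition"
let rightNote = [0.8, 0.2, 0.3]  // "checkout handler fires twice"
let wrongNote = [0.1, 0.9, 0.7]  // "buy milk and eggs"

print(cosineSimilarity(query, rightNote))  // high, roughly 0.98
print(cosineSimilarity(query, wrongNote))  // low, roughly 0.30
```

The note about the double-firing checkout handler scores far higher against the query than the grocery list does, even though neither note contains the words "race condition."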
This is the same technology Google uses to understand search queries, Spotify uses to recommend music, and ChatGPT uses to make sense of your questions. The difference is where the computation happens.
Cloud AI vs on-device AI
Most apps that offer semantic search send your data to a server. You type a query, it goes to an API (usually OpenAI or similar), the server computes the embedding, compares it against your note embeddings (also stored on their server), and sends the results back.
This works, but it comes with real costs:
Privacy. Your notes leave your machine. Every search query, every note, gets sent over the internet to someone else's infrastructure. For a shopping list, who cares. For your private thoughts, debugging notes, meeting observations, half-formed ideas? That's a different conversation.
Latency. Network round trips take time. Even on a fast connection, you're adding 200 to 500 milliseconds per search. On a plane or with bad WiFi, it doesn't work at all.
Cost. Embedding APIs charge per token. Most apps pass this cost to you as a subscription. Notion AI is $10/month per person. Mem charges for AI features. Even open-source tools like Obsidian need paid plugins or API keys for semantic search.
Dependency. If the API goes down, your search goes down. If the company changes pricing or shuts down, your search disappears.
On-device AI eliminates all four problems. The model runs on your Mac. Your notes never leave your machine. Search works offline, instantly, for free, forever.
How Stik does it on your Mac
Stik uses Apple's NaturalLanguage framework to compute embeddings locally. This framework ships with macOS and uses Apple's own language models trained for understanding text meaning.
Here's what happens when you save a note in Stik:
- You type a note and close the floating window
- Stik saves it as a .md file in ~/Documents/Stik/
- In the background, the NaturalLanguage framework reads the text and computes a 512-dimension embedding vector
- That vector is stored locally alongside the note metadata
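The embedding step above can be sketched in a few lines of Swift. This is an illustration of Apple's NaturalLanguage API, not Stik's actual code; it assumes macOS 11 or later, where sentence embeddings ship with the framework, and the embed helper name is made up.

```swift
import NaturalLanguage

// A sketch of the embedding step. The function name is illustrative,
// not Stik's actual code.
func embed(_ text: String) -> [Double]? {
    // Load Apple's built-in English sentence embedding model.
    guard let model = NLEmbedding.sentenceEmbedding(for: .english) else {
        return nil  // model not available on this system
    }
    // Returns a list of numbers encoding the meaning of the text
    // (512 of them for the English sentence model).
    return model.vector(for: text)
}

if let vector = embed("the checkout handler fires twice on double-click") {
    print(vector.count)
}
```

Everything here runs locally: the model is part of macOS, so no network request is made at any point.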
When you search:
- You type a query in the Stik search bar
- The framework converts your query into the same kind of 512-dimension vector
- Stik compares your query vector against every note vector using cosine similarity
- Notes are ranked by similarity score, most relevant first
The whole process takes milliseconds. No network call, no API key, no server. Just your Mac's CPU doing linear algebra on a few hundred vectors.
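The search side can be sketched the same way. The note titles and three-number vectors below are toy data; one common design choice, shown here, is to normalize each vector to length 1, so that ranking reduces to a plain dot product, which gives the same ordering as cosine similarity.

```swift
import Foundation

struct Note {
    let title: String
    let vector: [Double]
}

// Scale a vector to length 1. Between two normalized vectors, the dot
// product equals the cosine similarity, so it ranks identically.
func normalized(_ v: [Double]) -> [Double] {
    let length = sqrt(v.map { $0 * $0 }.reduce(0, +))
    return v.map { $0 / length }
}

func dot(_ a: [Double], _ b: [Double]) -> Double {
    zip(a, b).map { $0 * $1 }.reduce(0, +)
}

// Rank notes by similarity to the query, most relevant first.
func search(query: [Double], notes: [Note]) -> [Note] {
    let q = normalized(query)
    return notes.sorted {
        dot(q, normalized($0.vector)) > dot(q, normalized($1.vector))
    }
}

let notes = [
    Note(title: "checkout handler fires twice", vector: [0.8, 0.2, 0.3]),
    Note(title: "buy milk and eggs",            vector: [0.1, 0.9, 0.7]),
]
let results = search(query: [0.9, 0.1, 0.2], notes: notes)
print(results[0].title)  // "checkout handler fires twice"
```

With pre-computed note vectors, this is the entire runtime cost of a search: one loop of multiplications and a sort.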
What this looks like in practice
Some real examples of searches that work with semantic search but fail with keywords:
| You search for | It finds your note that says | Why it works |
|---|---|---|
| "meeting prep" | "agenda for tomorrow's standup" | Both are about preparing for meetings |
| "authentication" | "login security and session tokens" | Same concept, different vocabulary |
| "deployment issue" | "the build broke on staging again" | Related problem domain |
| "design feedback" | "the button colors feel off on mobile" | Design critique, different phrasing |
| "money stuff" | "invoice from AWS, need to expense it" | Financial topic, casual language |
None of these would match with keyword search. All of them match with semantic search because the meaning is close, even when the words are completely different.
This is especially useful for notes because you rarely use formal language when writing them. Notes are messy. You write "that thing Carlos mentioned about the API" not "API rate limiting discussion from engineering sync." Semantic search handles this naturally.
Why on-device matters more than you think
The privacy angle is obvious: your notes don't leave your computer. But there are less obvious benefits.
It works offline. On a plane, in a cafe with bad WiFi, in a building with no signal. Your search works the same everywhere because it doesn't need the internet.
It's instant. No waiting for a server response. The embeddings are pre-computed and stored locally. Search is just a vector comparison, which takes microseconds per note.
No account required. No sign-up, no API key, no OAuth flow. Install the app, start writing. That's it.
No surprise bills. Cloud AI costs money and pricing changes. OpenAI raised embedding prices, then lowered them, then introduced new tiers. With on-device AI, the cost is zero today and zero next year.
It ages well. Apple improves the NaturalLanguage framework with every macOS release. Your search gets better over time through OS updates, not through app subscriptions.
The trade-offs (being honest)
On-device AI isn't perfect. Here's where cloud-based search still wins:
Multilingual support. Apple's NaturalLanguage framework handles English well. Other languages work but with varying quality. Cloud models from OpenAI or Google tend to handle multilingual content more evenly.
Cutting-edge models. Cloud services can use the latest, largest models. On-device models are smaller by necessity because they need to run on your hardware without draining your battery. The quality gap is shrinking with every Apple Silicon generation, but it exists.
Cross-device search. If your notes are on your Mac and you search from your phone, on-device search only works on the device that has the notes. Cloud search works from anywhere. (You can work around this by syncing the notes folder via iCloud Drive, but the search itself only runs locally.)
Very large note collections. If you have tens of thousands of notes, computing and comparing vectors locally takes more resources. For most people with a few hundred to a few thousand notes, this is not an issue at all.
For quick capture notes (which are typically short, personal, and in the hundreds), on-device AI is more than good enough. The privacy and speed benefits outweigh the limitations.
The bigger picture
There's a shift happening in how software uses AI. The first wave was "send everything to the cloud and let a big model handle it." The second wave, which is happening now, is "run smaller models locally and keep data on the device."
Apple is pushing this hard with Core ML, the NaturalLanguage framework, and Apple Intelligence. Google is doing it with on-device AI in Android. Even Mozilla is exploring local AI in Firefox.
The reason is simple: people are realizing that sending all your personal data to someone else's server just to get a smarter search bar is a bad trade. The models are now small and fast enough to run locally. So why not keep your data where it belongs?
Stik was built around this idea from the start. Your notes are files on your Mac. Your search runs on your Mac. Nothing leaves your machine, ever.
Try it yourself
Search is one of those features that's hard to appreciate until you use it. Keyword search feels fine until you experience semantic search, and then you can't go back.
Download Stik (free, open source, no account needed). Write a few notes over a couple of days. Then try searching for something using different words than you wrote. You'll see the difference immediately.
If you're curious about how the NaturalLanguage framework works under the hood, Apple's documentation on text embedding is a good starting point.
Frequently asked questions
How does AI note search work without the internet?
The AI model ships with macOS as part of Apple's NaturalLanguage framework. When you save a note, your Mac computes an embedding vector locally. When you search, it compares your query vector against your note vectors. Everything runs on your CPU, no internet needed.
Is on-device AI search as good as cloud AI search?
For personal notes (short, informal, mostly English), on-device search is excellent. Cloud models have an edge with very long documents, multilingual content, and cutting-edge accuracy. For quick capture notes, on-device is more than enough and you get privacy and speed as a bonus.
Does Stik send my notes anywhere?
No. Every note is a plain .md file stored in ~/Documents/Stik/ on your Mac. The AI search runs locally using Apple's built-in frameworks. No data is sent to any server, ever. There is no account, no analytics, no telemetry.
What is an embedding vector?
An embedding vector is a list of numbers that represents the meaning of a piece of text. Similar texts get similar numbers. This lets a computer measure how related two pieces of text are, even if they use completely different words. Stik uses 512-dimension vectors computed by Apple's NaturalLanguage framework.
Can I use semantic search with other note apps?
Some note apps offer semantic search, but most require a cloud connection or a paid subscription. Notion AI runs in the cloud ($10/month). Obsidian has community plugins for semantic search, but they typically need an OpenAI API key. Apple Notes uses basic keyword search only. Stik is one of the few apps that runs semantic search entirely on-device at no cost.
Does on-device AI drain my battery?
Computing embeddings takes a small amount of CPU time when you save a note, but it's a one-time operation per note. Searching is just a vector comparison, which is extremely lightweight. In normal use, you won't notice any impact on battery life.
Will the search improve over time?
Yes. Apple updates the NaturalLanguage framework and its underlying models with macOS releases. As Apple Silicon chips get faster and the models improve, your search quality improves automatically through OS updates. No app subscription needed.
What languages does on-device AI search support?
Apple's NaturalLanguage framework supports multiple languages including English, Spanish, French, German, Italian, Portuguese, Chinese, Japanese, and more. English has the strongest support. Other languages work well but may have slightly lower accuracy for nuanced queries.