This is Part 1 in a two-part article on the impact of AI on consumer search.
Keyword Search Is Dead
For decades, we've had to figure out what machines want. It’s time for machines to figure out what we want.
As shoppers, we have had to shape our product searches around the needs of hulking, expensive, and indifferent search engines, constrained by their implementation details. It was our job to learn the language of search. Keywords, filters, and Boolean logic were foisted upon humanity. With no guidebook, we stumbled through trial and error, slowly learning how to convince these elaborate and cranky cloud vending machines to give us what we wanted. Our desires were never truly understood. The machines couldn’t know us because of the gulf between our expressive humanity and their calculating rigidity.
Search engines didn’t understand the nuanced relationship between the intent underlying our request and the connected opportunities hidden inside it.
The future of search is here. AI-powered semantic search now understands meaning. The script is flipping, finally allowing consumer needs to be conveyed in naturally expressive and multisensory forms. Translation from humans to machines is now automatic thanks to neural network-powered embedding models and vector similarity engines.[1][2] The baseline of consumer expectation is beginning to shift – way up. Consumer search experiences will never be the same.
Why This Matters
The ability to extract an astounding amount of value from neural networks and innovative adjacent technologies presents an opportunity for software developers to help move their customers further along the path to search Nirvana: gaining a more precise understanding of consumer need while also presenting valuable complementary opportunities in the discovery experience.
Questions to Consider for Consumer-Facing Software Developers
How are consumer expectations for search shifting with advancements in LLMs?
What are developers struggling with to meet these expectations?
What is semantic search and how does it work?
What challenges must be overcome to make it work exceptionally well?
Part 2: How did search evolve to make semantic search possible today?
Perspectives
The Consumer Viewpoint
Think about a typical experience in a consumer app with a mature and robust search feature, circa 2024. We fiddled with the order of keywords in text inputs, hopped around multiple separate filter buttons, dragged bracketing sliders within filters, and drilled into deeper options to adjust time, distance, and price – simply to find a decent cup of coffee, a good slice of pizza, or a great pair of jeans. Because these search systems mirrored the constraints of the underlying technology, consumers had to break their intent apart and work through various sub-filters in different ways. We are thinking “where can I get a great burger within a few minutes of my kid’s soccer field after practice?” But we have to break that down into a language of search the software systems can manage.
Rather than experiencing a natural channel of communication between ourselves and the world, we have had to learn how to speak robot.
That is all changing, and consumers already know it even if app developers don’t. With the explosive growth of generative AI,[3] people now expect natural language input capabilities. Their baseline expectations rise with each upward shift in the technology.
Apps that don't deliver will feel dated.
The Developer Viewpoint
Developers have faced an impossible task. Despite their efforts mastering SQL, inverted indexes, and synonym caches, converting product catalogs into complex relationship graphs, tweaking and tuning all manner of managed databases, and struggling with finicky fuzzy-search libraries, there has always been a critical semantic gap in understanding what consumers want. Software search features captured only a fragile proxy of consumer need. Like a 20th-generation photocopy, every translation added small errors that further distorted the picture, leaving the final copy a smudged, Gaussian blur of the original. Developers have had to build complex systems that convert clicks and taps into the search command structure required by the backend. Even with the support of scalable managed services, the tentacles of complexity were numerous, costly, and time-intensive. Scarce roadmap space was spent keeping these systems operational and competitive, or they decayed into crusty, creaky sources of technical debt and disappointment.
Most of this tech will be in the dustbin of history in the next couple of years as developers experiment with and deliver AI-powered search systems.
AI Eats Software: Semantic Search, In Detail
As consumer-app developers deepen their grasp of AI’s phenomenal capabilities, their products gain new features. Search features will offer greater value to consumers and the companies willing to invest in the talent and effort to make the switch.
Semantic search systems allow consumers to express their needs in the natural way they would speak them to a friend. They can accept text or speech in any language. They can understand our images. Let’s consider the search pipeline from catalog to consumer.
The Product Catalog
The trick for product catalog owners is to convert their content into semantically dense natural language that faithfully represents the essence of what they want to sell. This is not the same thing as the classic product name / brand / description approach, and search quality depends on the difference. Optimizing the natural language representation of the product is critical to search success. This semantically rich content is passed through a specially trained neural network called an embedding model, which converts the content into numeric arrays that represent “embedded” meaning. Interestingly, there’s no way to convert an embedding back into the original content. In that way, it’s a bit like a hash: it helps you find the content, but it won’t tell you what it was made from. For this reason, metadata (e.g., a product ID) is associated with each vector.
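Here’s a minimal sketch of that conversion, assuming the open-source sentence-transformers library; the model choice, product text, and SKU are illustrative, not prescriptive:

from sentence_transformers import SentenceTransformer

# A small, general-purpose embedding model (illustrative choice)
model = SentenceTransformer("all-MiniLM-L6-v2")

# A semantically dense natural-language representation of one product
product_text = (
    "Brooks Women's Adrenaline GTS 24 running shoe, supportive stability "
    "trainer for daily road miles, 2025 model, available in orange"
)

embedding = model.encode(product_text)  # a numpy array of floats (384 dimensions here)

# The embedding can't be reversed into text, so keep metadata alongside it
record = {"product_id": "SKU-12345", "vector": embedding}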
The Vector Database
Storing these embedding arrays (vectors) with their metadata is the job of a vector database. The critical requirement of a vector database is that it must not just store these embeddings but also relate them to one another: similar vectors are placed close together, and dissimilar vectors are placed further apart. The distance between any two vectors is expressed as a similarity score, and finding similar items is the name of the game in search.
How is this different from all previous generations of search, which also focused on similarity? What is unique here is that this similarity measures the difference in semantic meaning between the vectors, not just the difference between search text and product text. This opens up all the power of neural networks to search providers.
How can we know a database supports vector similarity search? Look for an implementation of a fabulous, if oddly named, algorithm: Hierarchical Navigable Small World Approximate Nearest Neighbor (HNSW ANN).[4] Examples of databases that support this approach include AWS OpenSearch, MemoryDB, Redis, and Pinecone. Database providers are rushing to support this method as AI features proliferate, driven largely by retrieval-augmented generation (RAG) use cases.
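To make the mechanics concrete, here’s a sketch using the standalone hnswlib library (the same HNSW ANN algorithm the managed databases implement); the sizes, parameters, and stand-in data are illustrative:

import hnswlib
import numpy as np

dim = 384  # must match the embedding model's output dimension
index = hnswlib.Index(space="cosine", dim=dim)
index.init_index(max_elements=100_000, ef_construction=200, M=16)

# Suppose catalog_vectors is an (N, dim) array of product embeddings
# and product_ids holds the matching integer IDs
catalog_vectors = np.random.rand(1_000, dim).astype("float32")  # stand-in data
product_ids = np.arange(1_000)
index.add_items(catalog_vectors, product_ids)

index.set_ef(50)  # recall/speed trade-off at query time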
Given a well-stocked vector database, we can now get to the consumer search.
The Consumer Search
Consumers can say what they want, whether by typing, speaking, or pointing their camera at something. Each of these inputs can be passed through the same embedding model to generate a vector. Getting search results is as simple as doing a similarity match between that vector and what’s in the database. The search backend grabs the resulting product IDs from the metadata and shows the product hits that most closely relate to the search.
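Continuing the sketch above (reusing the hypothetical embedding model and HNSW index), a text query might flow like this; products is an assumed product-ID-to-record lookup table:

query_vec = model.encode("comfortable blue running shoes")

labels, distances = index.knn_query(query_vec, k=5)  # approximate nearest neighbors

# Dereference product IDs from the metadata and rank by similarity
for product_id, dist in zip(labels[0], distances[0]):
    product = products[int(product_id)]  # assumed lookup table
    print(product["name"], 1 - dist)     # cosine distance -> similarity score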
To maximize search quality, there are some important challenges for developers to consider.
The Challenge of Overfitting Semantically Diffuse Search Phrases
Just as converting the product catalog to concise, semantically dense content is key at the beginning of the pipeline, we must clean up the consumer’s search phrase with the same objective. While doing this, we must remain aware that models can “overfit on language structure”: embedding models can focus too heavily on the broad strokes of a search phrase and “miss the trees for the forest.” We need to help them understand which parts of the search are important.
Let’s look at this challenge more closely with an example. Say you want a new pair of running shoes. Your friend is gushing about the shoes they bought last year. So you take a picture of their shoes, open your shopping app, tap search, upload the picture, say “these snkrs but blue & this yrs model”, then tap submit. Passing this directly into an embedding model isn’t going to produce great results due to the diffuse semantic meaning of the phrase. Developers must employ other models, like LLMs or specialized hybrid image-and-text models, that work together to improve the process in what is called a compositional semantic search strategy.
Building a Compositional Semantic Search Engine
The state of the art in semantic search requires that we decompose the search into its components (the images and their phrases), each a separately embeddable entity that can then be reintegrated to form a more robust search vector. In essence, we are going to start with the forest, identify the most important trees, and then line those up so that the resulting embedding is as semantically tight as possible. Let’s go back to the picture of your friend’s beloved running shoes and the search phrase:
“these snkrs but blue & this yrs model”.
The Components:
Text subject: “snkrs” – we’ll get to fixing typos and abbreviations in a moment
Image subject: an image of running shoes (presumably not this year’s model and not blue)
Attributes: blue color
Temporal: this year’s model
How do we manage these components? Let’s separate the image problem from the text problem and re-integrate in a moment.
Compositional Semantic Search Flow
1. Parse and Clean Up the Search Phrase with an LLM
We avoid the open-ended language of classic LLM responses by using structured outputs. Paired with smart prompt engineering, this feature can fix typos, expand abbreviations, translate from other languages, and peel any search phrase apart into its components. We avoid overfitting and can find the important trees in the forest.
Vastly simplified prompt (it would be 5-10x longer than this, with examples):
“You are a consumer search term parser. Clean up any typos, translate to US English, clarify the search intent, prioritize the shopper’s desires, and decompose this search term into parts ‘{search_term}’.”
You would include with this a data structure definition informing the LLM how to constrain its response.
The output from the LLM:
{"search_subject": "running shoes", "desired_attributes": ["color blue", "year 2025"]}
The attributes can be ordered by interpreted priority (e.g., getting blue may matter more than getting this year’s model).
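As one possible implementation, here’s a sketch using the OpenAI Python SDK’s structured-output parsing (API details vary by SDK version; the model name, schema, and abbreviated prompt are illustrative):

from pydantic import BaseModel
from openai import OpenAI

class ParsedSearch(BaseModel):
    search_subject: str
    desired_attributes: list[str]

client = OpenAI()
completion = client.beta.chat.completions.parse(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[
        {"role": "system", "content": "You are a consumer search term parser. "
         "Clean up typos, translate to US English, and decompose the search into parts."},
        {"role": "user", "content": "these snkrs but blue & this yrs model"},
    ],
    response_format=ParsedSearch,  # constrains the response to the schema
)
parsed = completion.choices[0].message.parsed  # a ParsedSearch instance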
2. Image Recognition
We know they want a running shoe, but which one? We must pass the image into a specialized embedding model like CLIP[5] or a general model like GPT-4o to identify the make, model, and product.
Output:
{"image_subject": "Brooks Women's Adrenaline GTS 24", "discovered_attributes": ["color orange"]}
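A sketch of zero-shot product identification with CLIP via the Hugging Face transformers library; the candidate labels would come from your catalog, and any confidence threshold is up to you:

from PIL import Image
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Candidate labels drawn from the product catalog (illustrative)
candidates = ["Brooks Women's Adrenaline GTS 24", "Nike Pegasus 41", "Hoka Clifton 9"]
image = Image.open("friends_shoes.jpg")

inputs = processor(text=candidates, images=image, return_tensors="pt", padding=True)
probs = clip(**inputs).logits_per_image.softmax(dim=1)  # one probability per candidate
image_subject = candidates[probs.argmax().item()]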
3. Generate Component Vectors
We can pass each of these strings into an embedding model and get back a vector for each of “running shoes”, “blue color”, “year 2025”, “Brooks Women’s Adrenaline GTS 24”, and “color orange”. We’ll sort out the two-color challenge in a moment, thanks to knowing the difference between desired attributes and discovered attributes.
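Using the same hypothetical embedding model as the catalog side, this step is just a loop:

components = ["running shoes", "blue color", "year 2025",
              "Brooks Women's Adrenaline GTS 24", "color orange"]
component_vectors = {text: model.encode(text) for text in components}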
4. Generate the Composite Vector - Composing Meaning Algebraically
Now we can do simple vector algebra and create the final composite vector.[6] But we have a new problem, caused by the simple word “these” in “these snkrs”. The semantic grounding problem is the challenge of determining what real-world concept or entity a consumer is referring to when they express something – especially when that expression is ambiguous, incomplete, or multimodal (e.g., combining image and text).[7] We need to pin down meaning, but we have a general “running shoes” vector in one hand and a specific “Brooks Women’s Adrenaline GTS 24” vector in the other. If we slam these two together, we’ll actually push the resulting vector’s semantic centroid (the central point in the embedding space) further away from what the consumer actually wants. In other words, the general notion of shoes muddies the waters.
The solution? If we get a highly confident product identification from the image recognition model, we must discard the “what” from the search phrase. It sounds odd to toss out what the consumer said, but it yields the best results!
final_vector = image_subject + (1.2 * desired_color) + (0.8 * year)
In this case, we prioritize items in the desired-attributes list by weighting them according to their interpreted importance, and we discard the text_subject and discovered_attributes entirely. In this way, the final search vector is a composite of a multimodal search spanning text and image, where some parts are kept, some are amplified, and some are ignored.
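A sketch of that composition with numpy, normalizing each component so the weights (which are illustrative) control its influence:

import numpy as np

def unit(v):
    return v / np.linalg.norm(v)  # normalize so weights control influence

final_vector = unit(
    unit(component_vectors["Brooks Women's Adrenaline GTS 24"])  # image_subject
    + 1.2 * unit(component_vectors["blue color"])                # desired_color, amplified
    + 0.8 * unit(component_vectors["year 2025"])                 # year, dampened
)
# text_subject ("running shoes") and discovered_attributes ("color orange") are discarded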
Bonus: Vector Subtraction for Exclusion
What if our friend’s shoes were high-top sneakers and the search phrase included “but not in high tops”? (Set aside for a moment that Brooks doesn’t sell high-top women's running shoes.) This is a simple update to our original prompt, where we’d extract exclusion terms.
LLM output, with exclusion term:
{"exclusion_terms": ["high tops"]}
This would instruct our composite vector step to simply subtract.
Final Composite Vector:
final_vector = (image_subject) + (1.2 * desired_color) + (0.8 * year) - (exclusion_term)
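Extending the earlier numpy sketch, exclusion is one extra term:

final_vector = unit(
    unit(component_vectors["Brooks Women's Adrenaline GTS 24"])
    + 1.2 * unit(component_vectors["blue color"])
    + 0.8 * unit(component_vectors["year 2025"])
    - unit(model.encode("high tops"))  # subtract the exclusion term
)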
5. Search the Vector Database
The last step is easy. Pass the vector into the database and get a semantic similarity-ranked list of results. After we dereference the product IDs in the metadata, we present the 2025 Brooks Women’s running shoes in blue.
Desire met: conversion, revenue, retention.
The Beauty of Compositional Semantic Search
The beauty of this approach lies in how it mirrors the intuition we have as humans for understanding communicated desires. By layering specific desires on top of a base meaning, and by subtracting the undesirable from it, we better understand the true needs of consumers. The same holds true for vector algebra on semantic embeddings. This is an essential point in understanding why semantic search, when implemented carefully, is groundbreaking.
Other Example (Multimodal) Search Phrases
Tie current-generation search engines into knots with these. Consider how you would apply a compositional semantic search strategy to each:
“Pizaria wth bar and hpy hour Friday” - misspellings, inclusion terms, and hours
“Running track near swimming pool in Coronado” - proximity, inclusion term, and location
“best place fpr diesal no truckstop open now 92103” - reviews, misspellings, exclusion terms, hours, near location
“Nopfas pots not made in China” - word smash, abbreviations, and exclusion terms
“Find coupons for this place” (with picture of a grocery store) - multimodal
“batteries for garage opnr” (with a picture of a Genie X400) - multimodal
“Best bagels on the island Snake Plissken escaped from” - reviews, obscure cultural reference.
Factors Driving the Semantic Search Revolution
The enabling technology stack is finally in place. Every piece of the system has been refined and commoditized to a point that makes commercial adoption realistic.
Cost - The competitive land-grab for developers to test and adopt generative AI models has driven cost-per-token ever-lower despite their breathtaking power.
Marketplaces - Managed AI services and marketplaces offer two critical opportunities:
They lower the expertise and friction for iterating on model and vector store selection.
They create transparency on features and pricing, driving free-market economic benefits.
Consumer expectations - People will vote with their feet (and downloads) as they get a taste of the potential of new systems leveraging these technologies. This will force product managers to make priority decisions for their apps.
Learning from Whence We Came
We are here today because of what came before. Historical context is essential to drawing the full arc of how consumer search has adapted as technology advances. Where it goes next is up to all of us.
In Part 2, we’ll step back through the milestones that shaped search and show how semantic comprehension is becoming the new strategic high ground. History won’t dictate the future, but it will sharpen our judgment as we shape it.
Stay tuned for “The History of Search and the Rise of Semantic Comprehension.”
[1] Tomas Mikolov et al., “Efficient Estimation of Word Representations in Vector Space.” arXiv (2013).
[2] Yu. A. Malkov and D. A. Yashunin, “Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs.” arXiv (2016).
[3] Cindy Gordon, “ChatGPT Is the Fastest Growing App in the History of Web Applications.” Forbes (February 2, 2023).
[4] Yu. A. Malkov and D. A. Yashunin, “Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs.” arXiv (2016).
[5] OpenAI, “CLIP: Connecting Text and Images.” OpenAI (January 5, 2021). Accessed June 15, 2025.
[6] Dan Jurafsky and James H. Martin, “Vector Semantics and Embeddings” (section 6.10, “Semantic Properties of Embeddings”), in Speech and Language Processing, 3rd ed. draft (Stanford University, 2024). Accessed June 15, 2025.
[7] Stevan Harnad, “The Symbol Grounding Problem.” Physica D: Nonlinear Phenomena 42, no. 1–3 (1990): 335–46. Accessed June 15, 2025.