


Abstract
E-commerce search aims to connect users with the products they seek quickly and effectively, while also offering opportunities for discovery and inspiration. Achieving this involves understanding user intent, refining search criteria, and introducing relevant product suggestions. However, traditional e-commerce search systems face significant challenges, particularly in accurately interpreting user queries and comprehending product details. These challenges highlight the need for better solutions, such as LLMs, to enhance search accuracy and personalization and to offer a more intuitive and satisfying shopping experience. The stakes are considerable: the e-commerce market was valued at USD 25.93 trillion in 2023 and is projected to grow to USD 83.26 trillion by 2030, at a CAGR of 18.9% from 2024 to 2030 [1].
Goals of E-Commerce Search
E-commerce search aims to efficiently and effectively connect users with the products they seek, while also offering opportunities for discovery and inspiration. The primary goals include:
- Express: Quickly finding exactly what the user intends to purchase, particularly in grocery and low-consideration non-grocery categories.
  - Example: A user searches for "milk," and the system immediately shows them available milk options in their preferred grocery store.
- Refine: Helping users refine their search criteria by understanding and anticipating their needs.
  - Example: A user searches for "laptop," and the system suggests filters like price range, brand, and specifications to narrow down the options.
- Shape: Introducing users to new products or ideas that align with their interests, often delighting them with unexpected but relevant suggestions.
  - Example: A user searches for "living room furniture," and the system recommends trendy decor items that complement their style, even if they hadn't specifically searched for them.
Balancing these goals requires not only showing users what they wanted but also presenting items that they might like based on their behavior and preferences.
How Does E-Commerce Search Work?
E-commerce search typically operates through several key components:
- Intent understanding: The first step is to accurately interpret the user’s query, whether it is short, long-tail, or broad.
- Indexing and retrieval:
  - Initial catalog pull: Once the intent is understood, the system retrieves a set of potentially relevant products from the structured data in the product catalog.
  - Layers of reranking: These initial results are then refined through multiple reranking layers, which may weigh factors such as relevance, personalization, and business objectives (e.g., promoting ads).
- Relevance evaluation:
  - Human review of XXX,XXX queries/month, rated on degree of relevance, usually judged on full attribute match.
  - Gen AI evals
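The stages listed above can be sketched as a small retrieve-then-rerank pipeline. This is a minimal illustration, not a description of any production system: the `Product` fields, the term-overlap relevance measure, and the blending weights are all assumptions chosen for readability.

```python
from dataclasses import dataclass


@dataclass
class Product:
    title: str
    popularity: float        # hypothetical signal in [0, 1]
    sponsored: bool = False  # hypothetical flag for a promoted listing


def retrieve(query: str, catalog: list[Product]) -> list[Product]:
    """Initial catalog pull: keep products sharing at least one query term."""
    terms = set(query.lower().split())
    return [p for p in catalog if terms & set(p.title.lower().split())]


def relevance(query: str, product: Product) -> float:
    """Fraction of query terms that appear in the product title."""
    terms = set(query.lower().split())
    if not terms:
        return 0.0
    return len(terms & set(product.title.lower().split())) / len(terms)


def rerank(query: str, candidates: list[Product]) -> list[Product]:
    """Blend relevance, popularity, and a business boost (weights are illustrative)."""
    def score(p: Product) -> float:
        boost = 0.1 if p.sponsored else 0.0
        return 0.7 * relevance(query, p) + 0.2 * p.popularity + boost
    return sorted(candidates, key=score, reverse=True)


catalog = [
    Product("whole milk 1 gallon", popularity=0.9),
    Product("oat milk unsweetened", popularity=0.6, sponsored=True),
    Product("milk chocolate bar", popularity=0.8),
    Product("sourdough bread", popularity=0.7),
]
results = rerank("oat milk", retrieve("oat milk", catalog))
for product in results:
    print(product.title)
```

In this toy run the bread never enters the candidate set, and the reranker places the full-match oat milk ahead of the partial matches. A real system would likely use several reranking layers rather than a single blended score.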
Challenges in E-Commerce Search
E-commerce search faces four primary challenges. First, accurately interpreting the user's intent behind short search queries. Second, comprehending the products in the catalog, which requires either extensive world knowledge or a detailed product knowledge base. These first two support the Express goal described above and are the baseline that every search product must meet. Third, using customer and session context to meet customers where they are in their shopping journeys. Fourth, using LLMs to enhance, and eventually replace, the majority of manual relevance evaluations.
Challenge 1: Intent Understanding
In traditional information retrieval, the primary challenge of query understanding involves extracting relevant information from a query and mapping it to specific database attributes such as brand, product name, or features like "organic." However, even with a simple search query like "blue dress under $50," many e-commerce search engines struggle to interpret the user's intent correctly. For instance, the system might return items that are blue but not dresses or items priced exactly at $50, missing the broader context of "under $50." These are common issues in query understanding, and they highlight the need for specialized algorithms and models to handle the many edge cases that arise. As a result, query understanding remains a significant challenge in industrial e-commerce search systems.
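To make the "blue dress under $50" example concrete, the sketch below maps a query onto structured attributes and applies the price bound as a strict filter. It is a deliberately simple rule-based illustration: the `COLORS` and `PRODUCT_TYPES` vocabularies, the `ParsedQuery` fields, and the item dictionary keys are hypothetical, and production systems typically rely on learned models rather than hand-written patterns.

```python
import re
from dataclasses import dataclass
from typing import Optional

COLORS = {"blue", "red", "black", "green"}      # hypothetical vocabulary
PRODUCT_TYPES = {"dress", "shirt", "laptop"}    # hypothetical vocabulary
PRICE_PATTERN = re.compile(r"\b(under|below|over|above)\s+\$?(\d+(?:\.\d+)?)")


@dataclass
class ParsedQuery:
    product_type: Optional[str] = None
    color: Optional[str] = None
    max_price: Optional[float] = None
    min_price: Optional[float] = None


def parse_query(query: str) -> ParsedQuery:
    """Extract product type, color, and a price bound from a free-text query."""
    text = query.lower()
    parsed = ParsedQuery()
    match = PRICE_PATTERN.search(text)
    if match:
        bound = float(match.group(2))
        if match.group(1) in ("under", "below"):
            parsed.max_price = bound
        else:
            parsed.min_price = bound
        # Remove the price phrase so it is not re-read as an attribute.
        text = text[:match.start()] + text[match.end():]
    for token in re.findall(r"[a-z]+", text):
        if token in COLORS and parsed.color is None:
            parsed.color = token
        elif token in PRODUCT_TYPES and parsed.product_type is None:
            parsed.product_type = token
    return parsed


def matches(parsed: ParsedQuery, item: dict) -> bool:
    """Apply every parsed constraint; price bounds are strict ("under" excludes the bound)."""
    if parsed.product_type and item["type"] != parsed.product_type:
        return False
    if parsed.color and item["color"] != parsed.color:
        return False
    if parsed.max_price is not None and not item["price"] < parsed.max_price:
        return False
    if parsed.min_price is not None and not item["price"] > parsed.min_price:
        return False
    return True


parsed = parse_query("blue dress under $50")
print(parsed)
```

With this parse, a blue shirt or a dress priced at exactly $50 is filtered out, which are the two failure modes described above. The edge cases the text mentions (synonyms, misspellings, ranges such as "between $30 and $50") are exactly where such rules stop scaling.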
Challenge 2: Product Understanding
Every product in a product database (catalog) is defined by attributes such as product type, brand, size, and more. Identifying the most relevant products for a particular query using traditional term-based information retrieval (IR) methods can be challenging. Although recent advancements in embedding-based retrieval techniques have enhanced semantic matching, they still struggle to effectively address queries like “healthy snacks for kids” or “alternatives to ice cream.”
E-commerce search is inherently knowledge-intensive. Offering customers recommendations, ideas, and inspiration often requires information beyond what is contained in the product database. To address this, companies like Amazon and Baidu invest heavily in creating product knowledge graphs to capture comprehensive knowledge about their products.
However, building effective product knowledge graphs has proven to be difficult. Two major challenges stand out. First, explicitly modeling this knowledge in a product knowledge graph requires a highly complex schema or ontology, which makes it harder for algorithms to navigate the graph and access the necessary information. Second, creating a general-purpose algorithm capable of transforming unstructured product data into a structured format is challenging, as each product category may need specialized algorithms to detect relevant patterns.
Challenge 3: Local and Global Contextual Journeys
Modern e-commerce search must cater to increasingly sophisticated customer journeys that span both local (within the same platform) and global (across multiple platforms) contexts. Understanding and leveraging these contexts presents several challenges:
Local Contextual Journeys
Local journeys occur entirely within a single e-commerce platform, such as Amazon or Walmart. The challenge here is to maintain continuity and relevance as users engage with the platform in non-linear ways. For instance:
- A user might search for "kitchen appliances" and then click on related categories, such as "blenders" or "air fryers," before returning to their original search intent.
- Personalized recommendations based on session history or prior purchases must feel intuitive, not intrusive, and adapt as user preferences shift during the journey.
- Ensuring contextual relevance across different touchpoints—search, category browsing, product pages—requires robust session tracking and real-time intent modeling.
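One simple way to picture session tracking with real-time intent modeling is a running set of category weights that decay as the session moves on. The sketch below is an assumption-laden illustration: the `SessionIntent` class, the decay factor of 0.5, and the stronger weight given to explicit searches are all hypothetical choices, not a documented technique from any platform.

```python
from collections import defaultdict
from typing import Optional


class SessionIntent:
    """Tracks which category a shopper currently seems most interested in."""

    def __init__(self, decay: float = 0.5):
        self.decay = decay                  # how quickly older signals fade
        self.weights = defaultdict(float)   # category -> current interest

    def observe(self, category: str, strength: float = 1.0) -> None:
        """Fade all earlier signals, then add the new one."""
        for key in self.weights:
            self.weights[key] *= self.decay
        self.weights[category] += strength

    def top(self) -> Optional[str]:
        """Return the category with the highest current weight, if any."""
        if not self.weights:
            return None
        return max(self.weights, key=self.weights.get)


session = SessionIntent()
session.observe("kitchen appliances", strength=2.0)  # explicit search
session.observe("blenders")                          # category click
session.observe("air fryers")                        # category click
after_browsing = session.top()

session.observe("kitchen appliances", strength=2.0)  # user returns to the search
after_return = session.top()
print(after_browsing, "->", after_return)
```

The model follows the detour into "air fryers" and then snaps back when the user returns to the original search, mirroring the non-linear journey in the first bullet.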
Global Contextual Journeys
Global journeys extend across platforms and marketplaces, requiring a search system to integrate external signals. For example:
- A user searches for "noise-canceling headphones" on Google Shopping, browses options on Amazon, and then checks reviews or price comparisons on Best Buy or smaller retailer websites.
- Interpreting these signals and offering cohesive recommendations across ecosystems requires sophisticated cross-platform data integration.
- Challenges arise in maintaining privacy and compliance (e.g., GDPR), as platforms may not share user data directly. Innovations like federated learning and anonymized insights are becoming critical for building holistic understanding without violating user privacy.
In both contexts, fragmented user data remains a major hurdle. AI-driven techniques, such as embedding models that unify structured (catalog) and unstructured (user behavior, reviews, social trends) data, are emerging as solutions. However, real-time personalization at this scale demands significant computational resources and well-coordinated data pipelines.
Challenge 4: Relevance Evaluation, Scoring, and Prediction
Evaluating the relevance of e-commerce search results is crucial to delivering a seamless user experience. However, existing methods for relevance evaluation often fall short due to their reliance on manual processes, which are time-intensive, error-prone, and difficult to scale.
Human-in-the-Loop Labeling (HITL)
Companies like Scale AI have popularized HITL systems for relevance evaluation, which combine automated models with human oversight. However, even this hybrid approach comes with its own challenges:
- Consistency: Humans evaluating relevance may have different interpretations of what constitutes a “good” result, especially for subjective queries like “best birthday gifts.”
- Bias: Annotators may unintentionally introduce biases that skew the scoring, such as favoring certain brands or categories.
- Speed vs. Quality Tradeoff: HITL systems often incentivize annotators based on task completion speed, potentially reducing the quality of relevance evaluations.
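The consistency problem can be quantified with an inter-annotator agreement statistic. The sketch below computes Cohen's kappa, a standard chance-corrected agreement measure for two raters; the label names and the sample ratings are made up for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a: list[str], labels_b: list[str]) -> float:
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must rate the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: share of items where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if each annotator labeled at random with their own label rates.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical ratings for eight results returned for "best birthday gifts".
annotator_1 = ["relevant", "relevant", "irrelevant", "relevant",
               "irrelevant", "irrelevant", "relevant", "irrelevant"]
annotator_2 = ["relevant", "irrelevant", "irrelevant", "relevant",
               "irrelevant", "relevant", "relevant", "irrelevant"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 2))
```

Here the annotators agree on 6 of 8 items (75%), but because half of that agreement is expected by chance, kappa is only 0.5. Tracking a statistic like this per query type is one way to spot the subjective queries where guidelines need tightening.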
LLM-Powered Evaluations
Recent advancements in large language models (LLMs) offer promising alternatives for relevance evaluation. LLMs can:
- Simulate human-like reasoning to assess the relevance of search results based on query intent.
- Provide detailed feedback on results, such as identifying missing attributes (e.g., no size options shown for a “men’s winter coat” search).
- Enhance scalability by automating evaluations for millions of queries without fatigue.
However, deploying LLMs introduces its own set of challenges:
- Accuracy and Hallucination: LLMs may occasionally generate confident but incorrect assessments, especially for niche or ambiguous queries.
- Context Integration: LLMs must incorporate real-time contextual signals (e.g., user location, search history) to make accurate evaluations.
- Cost: Running LLM-based evaluations at scale can be expensive, especially for e-commerce giants that process billions of queries annually.
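An LLM-based evaluator can be wrapped so that two of the risks above are at least contained: unparseable or off-scale answers are routed to human review rather than trusted, and repeated query-product pairs are cached to limit cost. The sketch below is model-agnostic by design. The `llm` argument is any callable that takes a prompt and returns text, and `fake_llm`, the prompt wording, and the three-point `LABELS` scale are hypothetical stand-ins, not a real provider API.

```python
from typing import Callable, Optional

LABELS = ("exact", "partial", "irrelevant")   # hypothetical rating scale


def build_prompt(query: str, product_title: str) -> str:
    """Assemble the grading instruction sent to the model."""
    return (
        "Rate how well the product matches the shopper's query.\n"
        f"Query: {query}\n"
        f"Product: {product_title}\n"
        f"Answer with one word: {', '.join(LABELS)}."
    )


def parse_label(raw: str) -> Optional[str]:
    """Accept only answers on the rating scale; anything else is rejected."""
    cleaned = raw.strip().lower().rstrip(".")
    return cleaned if cleaned in LABELS else None


def evaluate(query: str, product_title: str,
             llm: Callable[[str], str], cache: dict) -> str:
    """Grade one query-product pair, reusing cached grades to save model calls."""
    key = (query, product_title)
    if key not in cache:
        label = parse_label(llm(build_prompt(query, product_title)))
        cache[key] = label or "needs_human_review"
    return cache[key]


calls = []


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model call; records each prompt it receives."""
    calls.append(prompt)
    if "Product: men's wool winter coat" in prompt:
        return "Exact."
    return "probably fine?"   # off-scale answer, should not be trusted


cache: dict = {}
first = evaluate("men's winter coat", "men's wool winter coat", fake_llm, cache)
repeat = evaluate("men's winter coat", "men's wool winter coat", fake_llm, cache)
unclear = evaluate("men's winter coat", "beach towel", fake_llm, cache)
print(first, repeat, unclear, len(calls))
```

Injecting the model as a plain callable keeps the harness testable without network access. It does not address hallucinated but well-formed answers, which still require spot checks against human ratings.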
Towards Predictive Scoring
The ultimate goal of relevance evaluation is not only to assess past results but also to predict relevance for future queries:
- Predictive scoring models leverage historical query performance, user feedback, and contextual signals to rank products dynamically.
- These models must balance multiple objectives, including revenue optimization, user satisfaction, and fairness (e.g., surfacing smaller brands or diverse product categories).
Integrating LLMs into this predictive framework can enhance the system’s ability to anticipate user needs and improve search result rankings. However, ensuring transparency and interpretability of LLM-driven predictions remains an ongoing research challenge.
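The balancing act described in this section is often expressed as a weighted combination of objectives. The sketch below shows the idea in its simplest form and how the ranking shifts when the weights change. The `Candidate` fields, the weight values, and the binary small-brand fairness term are illustrative assumptions; real systems would learn or tune these and use richer fairness measures.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    predicted_relevance: float   # assumed output of a model trained on historical queries
    expected_margin: float       # assumed revenue signal, normalized to [0, 1]
    small_brand: bool            # assumed fairness attribute

# Illustrative weights; in practice these would be tuned against business metrics.
WEIGHTS = {"relevance": 0.6, "revenue": 0.3, "fairness": 0.1}
RELEVANCE_ONLY = {"relevance": 1.0, "revenue": 0.0, "fairness": 0.0}


def blended_score(c: Candidate, weights: dict) -> float:
    """Combine the three objectives into a single ranking score."""
    return (weights["relevance"] * c.predicted_relevance
            + weights["revenue"] * c.expected_margin
            + weights["fairness"] * (1.0 if c.small_brand else 0.0))


def rank(candidates: list[Candidate], weights: dict) -> list[str]:
    """Return candidate names ordered from highest to lowest blended score."""
    ordered = sorted(candidates, key=lambda c: blended_score(c, weights), reverse=True)
    return [c.name for c in ordered]


candidates = [
    Candidate("A", predicted_relevance=0.9, expected_margin=0.2, small_brand=False),
    Candidate("B", predicted_relevance=0.8, expected_margin=0.5, small_brand=True),
    Candidate("C", predicted_relevance=0.5, expected_margin=0.9, small_brand=False),
]
print(rank(candidates, RELEVANCE_ONLY))
print(rank(candidates, WEIGHTS))
```

Under relevance-only weights the order is A, B, C. With the blended weights, B overtakes A because its margin and small-brand status outweigh a slightly lower relevance score. A linear blend like this is also easy to inspect, which matters given the interpretability concern raised above.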
Startup Landscape
The search landscape is evolving rapidly, with startups tackling distinct challenges in unique ways. These companies can be grouped into three categories: e-commerce-focused startups, search-enabling tools for the web, and non-e-commerce search startups.
E-commerce-focused startups like Curatle and Kart AI stand out for their direct approach to solving pain points in online shopping. Curatle, a reimagined Google Shopping built natively with generative AI, tackles multiple challenges in e-commerce search: intent understanding, product comprehension, and contextual journeys. By aggregating and filtering the vast, fragmented universe of online products, it creates a unified discovery platform tailored to user needs. This innovation has the potential to revolutionize how consumers shop across platforms, addressing gaps left by incumbents.
Kart AI, on the other hand, democratizes Amazon's conversational AI shopping assistant, enabling any e-commerce site—no matter how small—to implement AI-powered search and personalized shopping experiences. With generative AI breakthroughs and the widespread adoption of e-commerce, the timing is perfect to equip countless brick-and-mortar stores and smaller retailers with cutting-edge search capabilities. This unlocks new possibilities for retailers previously excluded from such technologies.
Meanwhile, search-enabling tools like Exa and Perplexity offer scalable, general-purpose frameworks for search across the web. These tools are powerful but lack the domain-specific expertise to address e-commerce-specific challenges, such as catalog standardization and intent refinement. Non-e-commerce startups, like Serra and Happenstance, focus on adjacent domains such as recruitment, where personalized discovery and relevance are critical. While not directly in e-commerce, their solutions often provide valuable insights that could inspire future innovation.

Curatle
Curatle is uniquely positioned to redefine e-commerce search by tackling the fragmented experiences that plague both consumers and sellers. With 42% of online sellers operating through four or more marketplaces [2], the shopping journey is increasingly scattered across platforms. Google Shopping has attempted to address this need but falls short of delivering a cohesive solution. Curatle’s ambitious generative AI-powered platform aims to surpass these limitations by not only refining how consumers shop but also how sellers manage their presence across multiple marketplaces.
At its core, Curatle solves Challenge 1: Intent Understanding and Challenge 2: Product Understanding, ensuring that users can seamlessly find the right products, regardless of platform, and that sellers can efficiently showcase their inventory. These foundational goals establish Curatle as a direct improvement over existing solutions like Google Shopping. However, its long-term vision positions it to also tackle Challenge 3: Global Contextual Journeys, a problem that few companies—such as Perplexity and Google itself—are even equipped to attempt. By unifying cross-platform discovery for consumers and enabling sellers to consolidate their operations globally, Curatle has the potential to be a truly transformative product in e-commerce.
Additionally, as Curatle models itself after Google Shopping and Google Search, it must also address Challenge 4: Relevance Evaluation to deliver accurate, personalized, and scalable results. This ambition to tackle all four core challenges makes Curatle one of the most unique and visionary products in the e-commerce space. While its goals are highly ambitious, the convergence of fragmented shopping experiences and the rapid adoption of generative AI creates the perfect "why now" moment for Curatle to succeed. By solving both immediate pain points and positioning itself to lead in the long term, Curatle is poised to revolutionize the global retail ecosystem.
Kart AI
Kart AI is at the forefront of transforming e-commerce for retailers looking to compete in a rapidly digitalizing world. With in-store sales accounting for 85% of U.S. retail revenue in 2023, the majority of physical retailers are still underrepresented online [3]. Kart AI bridges this gap with tools that bring advanced personalization, intent understanding, and relevance optimization to e-commerce platforms, allowing any retailer to deliver Amazon-level shopping experiences.
The core of Kart AI’s innovation lies in its Sector-Specific Reranker, which addresses Challenge 4: Relevance Evaluation by predicting which products a customer is most likely to purchase. These AI-driven models are tailored for individual stores, offering precise and low-latency product ranking that significantly enhances conversion rates. Meanwhile, Kart AI’s Personalization Agents and Granular Personalization tackle Challenges 1 and 2: Intent Understanding and Product Understanding by training large AI models to adapt store layouts, FAQs, and product recommendations to each customer’s unique preferences and behavior. This enables retailers to create highly customized shopping experiences on a store-by-store basis.
Kart AI’s existing tools position it as a leader in personalization, but its long-term potential is even more exciting. By expanding its capabilities to address Challenge 3: Local and Global Contextual Journeys, Kart AI could help retailers unify customer experiences across marketplaces, brands, and geographies. This would enable a seamless shopping journey that spans platforms, connecting millions of retailers globally while maintaining deep, individualized personalization. With its ability to solve immediate e-commerce challenges and scale to broader market needs, Kart AI is set to play a pivotal role in the future of retail.
What Existing Competitors Get Right and Wrong
Competitors like Glean and Exa have laid strong foundations in AI-powered search through advanced language models and scalable architectures. Their technology excels in understanding queries and delivering relevant results across diverse use cases. However, these platforms lack the domain-specific expertise and data partnerships required for effective e-commerce search. Challenges like unstructured product data, fragmented catalogs, and shopper-specific behaviors demand solutions that general-purpose frameworks struggle to provide. Additionally, the real-time, high-scale demands of e-commerce add cost and latency challenges that these platforms are not optimized to handle.
E-commerce giants such as Amazon, Walmart, and Baidu operate with significant advantages, including vast datasets, proprietary knowledge graphs, and control over end-to-end shopping ecosystems. Amazon leads with innovations like Rufus, a conversational shopping assistant, and its robust product recommendation systems. Walmart leverages cross-category personalization in groceries and general merchandise and is in the early stages of building its own gen AI search experience. However, these players are constrained by legacy systems, add-to-cart-driven monetization strategies, and organizational inertia, which limit their ability to adopt transformative technologies like LLMs at scale. For instance, Amazon’s focus on ad revenue compromises organic search relevance, while Walmart’s historically traditional business is not built for rapidly launching new experiences to stay competitive.
Why Can’t or Won’t They Do This?
While general-purpose search platforms have the technical potential to enter e-commerce, they lack access to the proprietary datasets and retail-specific insights needed to compete effectively. Building the necessary integrations, such as SKU mapping and supply chain synchronization, is both costly and time-intensive, creating significant barriers to entry.
Conversely, e-commerce giants face their own challenges. Their reliance on stable revenue streams, such as ads or in-store synergies, disincentivizes disruptive innovation. Implementing LLM-driven advancements in search also risks breaking existing workflows and user experiences. Giants like Baidu, though technically capable, are often focused on regional markets or specific ecosystems that do not naturally extend into global e-commerce.
Conclusion
The divide between technical capability and domain expertise creates a gap in the e-commerce search space. General-purpose platforms like Glean and Exa lack the depth needed for effective product search, while giants like Amazon and Walmart are constrained by their existing systems and revenue models. This gap presents an opportunity for specialized startups to innovate, particularly in areas such as personalized discovery and contextual journeys, which remain underserved by both incumbents and general-purpose players.
The challenges in e-commerce search—intent understanding, product comprehension, contextual journeys, and relevance evaluation—present significant opportunities for innovation. Among the most promising startups tackling these issues are Curatle and Kart AI. These companies embody a compelling "why now" moment, as advancements in AI and LLMs have made it feasible to deliver best-in-class search experiences across platforms. Curatle aims to unify product search online with unparalleled filtering and aggregation, while Kart AI democratizes conversational shopping assistants, empowering retailers to offer Amazon-level personalization. While challenges like relevance evaluation and scaling remain, these startups are best positioned to redefine e-commerce search, leveraging cutting-edge technology to address long-standing pain points and unlock untapped value in the market.
Citations
[1] Grand View Research, "E-commerce Market Size, Share and Growth Report, 2030," 2023. [Online]. Available: https://www.grandviewresearch.com/industry-analysis/e-commerce-market
[2] PYMNTS, "61% of online sellers increased marketplace usage in the past year," Nov. 18, 2022. [Online]. Available: https://www.pymnts.com/news/ecommerce/2022/the-data-point-61-pct-online-sellers-increased-marketplace-usage-in-past-year/
[3] Capital One Shopping, "Retail Statistics," Nov. 6, 2024. [Online]. Available: https://capitaloneshopping.com/research/retail-statistics/