
RAG Frustration: Is Vector Search Giving Your LLM Junk Context?
Hey PH Makers & Builders using LLMs!
Anyone else building cool stuff like PDF Q&A or custom bots with RAG, but finding the context retrieval step... frustrating?
Most AI app data stacks these days use vector search (Pinecone, Weaviate, etc.) to grab text chunks for the LLM. But sometimes it feels like it surfaces chunks that are superficially similar to the query while totally missing the actual point the user asked about. That's where those slightly weak or "confidently wrong" LLM answers come from.
So, what's your single biggest headache right now in getting truly relevant context for your RAG pipeline? Is it nailing the chunking strategy? The embedding choice? Hitting the limits of pure vector similarity? Getting re-ranking right? What part is giving you the most trouble?
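For anyone newer to this, the failure mode I mean looks something like this toy sketch. The vectors are hand-made stand-ins for real embeddings, and the dict stands in for a vector store; nothing here is any particular vector DB's API:

```python
import math

# Toy sketch of the plain vector-similarity step most RAG stacks use.
# In a real pipeline the vectors come from an embedding model and the
# store would be Pinecone / Weaviate / etc.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made embeddings: two "refund-flavored" chunks, one about shipping.
chunks = {
    "refund policy: refunds within 30 days": [0.9, 0.1, 0.2],
    "shipping times for EU orders":          [0.2, 0.8, 0.1],
    "refund-adjacent marketing copy":        [0.85, 0.15, 0.25],
}

def retrieve(query_vec, k=2):
    # Pure similarity: returns the k nearest chunks, relevant or not.
    ranked = sorted(chunks.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

query = [0.88, 0.12, 0.2]  # pretend this embeds "how do I get a refund?"
print(retrieve(query))
```

The marketing-copy chunk scores almost as high as the actual policy, so with k=2 it rides along into your prompt. That's the kind of near-miss that pure similarity happily serves up, and why people bolt on re-ranking.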
(Context: We hit this wall hard internally. Similarity search wasn't cutting it for the precision we needed, so we ended up building our own retrieval engine, Spykio, focused on 'understanding' context. It made a difference for us, so we're now exploring it as an API. Transparency: I'm on the team, and we are launching on PH soon!)
But enough about our journey! Thinking about that crucial step where you plug retrieved context into your LLM prompt for RAG – what's the ideal handoff from the retrieval engine to make your life easier? When Spykio (or any retriever) finds the relevant info, what format would be most useful for you to feed the LLM?
Would you prefer the entire relevant document(s)?
Just the specific, most relevant paragraphs or chunks?
Or ideally, would you want something more processed, maybe closer to a direct answer to the query based on your data?
What level of granularity and processing from the retrieval step would best streamline your RAG workflow? Curious about your preferences!
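To make the three options concrete, here's a toy sketch of what each handoff shape might look like feeding a prompt. The field names and the `build_prompt` helper are hypothetical, purely for illustration, not Spykio's actual API:

```python
# Three hypothetical handoff granularities from a retriever, as discussed above.
retrieved = {
    "full_document": "...entire refund policy document...",          # option 1
    "chunks": ["Refunds are issued within 30 days.",                 # option 2
               "After 30 days, store credit is offered instead."],
    "synthesized": "Refunds: within 30 days; store credit after.",   # option 3
}

def build_prompt(question, context):
    # Minimal RAG prompt template; real ones add instructions, citations, etc.
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Option 2 (specific chunks) joined into the prompt:
prompt = build_prompt("How do refunds work?", "\n".join(retrieved["chunks"]))
print(prompt)
```

Option 1 burns context window, option 3 shifts work (and trust) onto the retriever, and option 2 is the usual middle ground. Curious which trade-off people actually want.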
P.S. - If playing with different retrieval ideas sounds useful after chatting here, the thing we built (Spyk.io) has a free trial to experiment with – $25 in credits, no CC needed. If you're testing and need more runway, comment and I'll add a few more credits :)