Smart Recall: Enhancing Local LLM Conversations with Embedding-Aware Context Retrieval
Lucas Jeanniot • Location: TUECHTIG • Haystack EU 2024
How can you make your local LLM feel less forgetful? This session will introduce a practical service architecture for improving contextual continuity in chat applications using locally stored conversation history. We’ll walk through a Python-based approach that dynamically retrieves and rewrites prior turns based on semantic similarity, using embeddings, token limits, and summarisation to give your model a relevant memory window. Attendees will learn how to structure past interactions, filter for importance, and integrate efficient recall mechanisms so that local LLMs stay coherent, concise, and contextually aware.
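To make the retrieval idea concrete, here is a minimal sketch of selecting a memory window by similarity under a token limit. It is not the speaker's implementation. Every name in it (`Turn`, `embed`, `cosine`, `count_tokens`, `select_memory_window`) is hypothetical. The `embed` function is a toy bag-of-words stand-in for a real embedding model, tokens are approximated by whitespace-separated words, and the summarisation step is left out.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Turn:
    role: str  # "user" or "assistant"
    text: str


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def count_tokens(text: str) -> int:
    """Assumption: whitespace-separated words approximate the token count."""
    return len(text.split())


def select_memory_window(history: list[Turn], query: str, token_budget: int) -> list[Turn]:
    """Pick the past turns most similar to the query that fit the token budget."""
    query_vec = embed(query)
    scored = [(cosine(embed(turn.text), query_vec), i) for i, turn in enumerate(history)]
    # Most similar first; ties go to the more recent turn.
    scored.sort(key=lambda pair: (-pair[0], -pair[1]))

    chosen, used = [], 0
    for score, i in scored:
        if score <= 0.0:
            break  # the remaining turns share nothing with the query
        cost = count_tokens(history[i].text)
        if used + cost > token_budget:
            continue  # too large for the remaining budget; try a smaller turn
        chosen.append(i)
        used += cost
    # Return the selected turns in their original conversational order.
    return [history[i] for i in sorted(chosen)]


history = [
    Turn("user", "My dog Biscuit is a corgi."),
    Turn("assistant", "Corgis are lovely dogs."),
    Turn("user", "I prefer tea over coffee in the morning."),
    Turn("assistant", "Noted, tea it is."),
]

window = select_memory_window(history, "What breed is my dog Biscuit?", token_budget=8)
for turn in window:
    print(f"{turn.role}: {turn.text}")
```

In a real service the toy `embed` would be replaced by a locally hosted embedding model and `count_tokens` by the target model's tokenizer. Turns that score well but exceed the budget are natural candidates for summarisation rather than being skipped, as they are here.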
Lucas Jeanniot
Eliatra