Voice-Interactive Semantic Search Interface with Vector Databases
Result identifiers
Result code in IS VaVaI
<a href="https://www.isvavai.cz/riv?ss=detail&h=RIV%2F49777513%3A23520%2F24%3A43973057" target="_blank" >RIV/49777513:23520/24:43973057 - isvavai.cz</a>
Result on the web
<a href="https://svk.fav.zcu.cz/download/proceedings_svk_2024.pdf" target="_blank" >https://svk.fav.zcu.cz/download/proceedings_svk_2024.pdf</a>
DOI - Digital Object Identifier
—
Alternative languages
Result language
English
Title in the original language
Voice-Interactive Semantic Search Interface with Vector Databases
Result description in the original language
Semantic search offers significant advantages over full-text search, particularly because it allows users to formulate queries in natural language without needing to know the precise indexed key phrases. By using vector databases that store and index data as high-dimensional vectors, we can search through large datasets in real time. In this work, we present a custom web-based interface for state-of-the-art semantic search on arbitrary textual data. Additionally, we integrate our in-house speech technologies, automatic speech recognition (ASR) and text-to-speech (TTS), to enhance user interaction. The interface supports two modes: 1) searching based on retrieval-augmented generation (RAG), with an LLM generating answers in a chat-like format, and 2) raw semantic matching against the indexed data. In both modes, the original PDF file is shown and the exact source of the retrieved information is provided.
Title in English
Voice-Interactive Semantic Search Interface with Vector Databases
Result description in English
Semantic search offers significant advantages over full-text search, particularly because it allows users to formulate queries in natural language without needing to know the precise indexed key phrases. By using vector databases that store and index data as high-dimensional vectors, we can search through large datasets in real time. In this work, we present a custom web-based interface for state-of-the-art semantic search on arbitrary textual data. Additionally, we integrate our in-house speech technologies, automatic speech recognition (ASR) and text-to-speech (TTS), to enhance user interaction. The interface supports two modes: 1) searching based on retrieval-augmented generation (RAG), with an LLM generating answers in a chat-like format, and 2) raw semantic matching against the indexed data. In both modes, the original PDF file is shown and the exact source of the retrieved information is provided.
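As a rough illustration of the raw semantic matching mode described above, the following minimal Python sketch ranks indexed passages by cosine similarity to a query vector and returns each hit with its source location. It is not the authors' implementation: the three-dimensional vectors, the `INDEX` contents, the source labels, and the function names `cosine` and `search` are all hypothetical, and a real system would obtain the vectors from an embedding model and store them in a vector database.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query_vec, index, top_k=2):
    """Return the top_k (score, source, text) entries, best match first."""
    scored = [(cosine(query_vec, vec), source, text) for vec, source, text in index]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]

# Hypothetical toy index: hand-made vectors standing in for real embeddings.
INDEX = [
    ([0.9, 0.1, 0.0], "doc.pdf, p. 1", "Vector databases index embeddings."),
    ([0.1, 0.9, 0.0], "doc.pdf, p. 2", "Speech recognition converts audio to text."),
    ([0.0, 0.2, 0.9], "doc.pdf, p. 3", "Speech synthesis reads answers aloud."),
]

# Hypothetical query vector; in practice this would be the embedded user query.
results = search([0.8, 0.2, 0.0], INDEX, top_k=2)
for score, source, text in results:
    print(f"{score:.3f}  {source}  {text}")
```

Keeping the source label next to each vector is what lets an interface point back to the exact place in the original PDF; in a RAG setup, the same retrieved passages would instead be passed to an LLM as context.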
Classification
Type
O - Other results
CEP field
—
OECD FORD field
20205 - Automation and control systems
Result linkages
Project
—
Linkages
S - Specific university research
Other
Year of implementation
2024
Data confidentiality code
S - Complete and truthful project data are not subject to protection under special legal regulations