Maksymenko, D.; Turuta, O. Interpretable Conversation Routing with Latent Embeddings Approach. Preprints 2024, 2024102295. https://doi.org/10.20944/preprints202410.2295.v1
APA Style
Maksymenko, D., & Turuta, O. (2024). Interpretable Conversation Routing with Latent Embeddings Approach. Preprints. https://doi.org/10.20944/preprints202410.2295.v1
Chicago/Turabian Style
Maksymenko, D. and Oleksii Turuta. 2024. "Interpretable Conversation Routing with Latent Embeddings Approach" Preprints. https://doi.org/10.20944/preprints202410.2295.v1
Abstract
Large language models (LLMs) are rapidly being integrated into question answering and support systems to automate customer experience across all domains, including medical use cases. Models in such environments must handle several kinds of tasks, such as general knowledge questions, queries to external sources, function calling, and many others. Some cases might not even require full text generation, and different cases may need different prompts or even different models. All of this can be managed by a routing step. This paper focuses on interpretable few-shot approaches to conversation routing, such as latent embeddings retrieval. The work presents a benchmark, a thorough analysis, and a set of visualizations of how latent embeddings routing works for long-context conversations in a multilingual, domain-specific environment. The results show that a latent embeddings router can achieve performance on par with LLM-based routers while offering additional interpretability and a higher level of control over model decision making.
Computer Science and Mathematics, Artificial Intelligence and Machine Learning
Copyright:
This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.