[RFC] 061 - Multiple RAG Strategies Support #4204

cy948 · 2024-09-29T03:30:31Z


Background

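RAG is widely used to improve LLM performance in scenarios where the model lacks the internal knowledge it needs, for example tasks that are sensitive to domain knowledge [1] and hallucination mitigation [2]. Recent research on RAG pipelines has focused mainly on improving three core modules: Retrieval, Augmentation, and Generation [3]. This RFC therefore has two goals: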

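Refactor the existing code to support a wider range of RAG strategies.
Beyond naive RAG, support some RAG strategies that show "good performance" on public datasets.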

Approach

Backend: refactor the retrieve part of the RAG pipeline.
Frontend: provide a UI that lets users choose among the available RAG strategies.
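
One possible shape for the backend refactor is a small strategy registry, so that the frontend only needs to send a strategy name. The sketch below is an assumption for illustration, not code from lobe-chat: `Chunk`, `RetrieveStrategy`, `retrieveWith`, and the `keyword-boost` strategy are all hypothetical names, and a real implementation would likely be async and sit on top of the existing vector search.

```typescript
// Hypothetical types: none of these names come from the lobe-chat codebase.
interface Chunk {
  id: string;
  text: string;
  score: number; // similarity score from an earlier vector search
}

interface RetrieveStrategy {
  name: string;
  // Returns the chunks to pass to the LLM, best first.
  retrieve(query: string, candidates: Chunk[], topK: number): Chunk[];
}

// Naive RAG: keep the top-K candidates by their existing similarity score.
const naiveStrategy: RetrieveStrategy = {
  name: "naive",
  retrieve: (_query, candidates, topK) =>
    [...candidates].sort((a, b) => b.score - a.score).slice(0, topK),
};

// Illustrative alternative: add a small bonus for each query term found in the chunk text.
const keywordBoostStrategy: RetrieveStrategy = {
  name: "keyword-boost",
  retrieve: (query, candidates, topK) => {
    const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
    const boosted = candidates.map((c) => {
      const hits = terms.filter((t) => c.text.toLowerCase().includes(t)).length;
      return { ...c, score: c.score + 0.1 * hits };
    });
    return boosted.sort((a, b) => b.score - a.score).slice(0, topK);
  },
};

// Registry keyed by strategy name; new strategies are added here.
const strategies = new Map<string, RetrieveStrategy>();
strategies.set(naiveStrategy.name, naiveStrategy);
strategies.set(keywordBoostStrategy.name, keywordBoostStrategy);

function retrieveWith(name: string, query: string, candidates: Chunk[], topK = 3): Chunk[] {
  const strategy = strategies.get(name);
  if (!strategy) throw new Error(`Unknown RAG strategy: ${name}`);
  return strategy.retrieve(query, candidates, topK);
}
```

With this layout, supporting another strategy means registering one more object, and the existing naive behaviour stays available under its own name.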

2024/10/6

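Because different downstream tasks call for many different CoT strategies, it may be worth providing a flexible CoT template library (or Market) where users can customize templates for their own reasoning needs, with lobe shipping a built-in retrieval module that the LLM can control. This approach may raise security issues, however.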
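
To make the template-library idea concrete, here is a minimal sketch of what a template entry and its rendering step could look like. Everything in it is an assumption: the `CoTTemplate` shape, the `{{name}}` placeholder syntax, and `renderTemplate` are hypothetical and do not describe an existing lobe-chat feature. Because templates would be user-supplied, the renderer only accepts variables that are own properties of the input object and fails on anything missing.

```typescript
// Hypothetical shape of a CoT template; not an existing lobe-chat type.
interface CoTTemplate {
  id: string;
  description: string;
  // Placeholders use {{name}} syntax, e.g. {{question}} and {{context}}.
  body: string;
}

const templates: Record<string, CoTTemplate> = {
  "step-by-step": {
    id: "step-by-step",
    description: "Generic step-by-step reasoning over retrieved context",
    body: "Context:\n{{context}}\n\nQuestion: {{question}}\nThink step by step, then answer.",
  },
};

// Fills every {{name}} placeholder from `vars`; throws if a variable is missing.
function renderTemplate(template: CoTTemplate, vars: Record<string, string>): string {
  return template.body.replace(/\{\{(\w+)\}\}/g, (_match, key: string) => {
    // Own-property check so keys like "constructor" cannot resolve via the prototype.
    const value = Object.prototype.hasOwnProperty.call(vars, key) ? vars[key] : undefined;
    if (value === undefined) throw new Error(`Missing template variable: ${key}`);
    return value;
  });
}
```

A caller would pick a template by id, pass the user question and the retrieved chunks as `question` and `context`, and send the rendered string to the model.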

Progress

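2024/9/29: RFC first proposed; experimenting with candidate approaches.
2024/10/13: Located the backend RAG strategy definition; preparing to start the refactor.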

lobe-chat/src/server/routers/lambda/chunk.ts

Line 129 in 9a369ac

semanticSearchForChat: chunkProcedure

2024/10/16: At the maintainer's suggestion, first integrate Dify, which offers similar functionality, in order to support the Agent architecture. See [RFC] 064 - Dify Integration | Dify 接入 #4412

Reference

[1] N. Kandpal, H. Deng, A. Roberts, E. Wallace, and C. Raffel, “Large language models struggle to learn long-tail knowledge,” in International Conference on Machine Learning. PMLR, 2023, pp. 15696–15707.
[2] Y. Zhang, Y. Li, L. Cui, D. Cai, L. Liu, T. Fu, X. Huang, E. Zhao, Y. Zhang, Y. Chen et al., “Siren’s song in the AI ocean: A survey on hallucination in large language models,” arXiv preprint arXiv:2309.01219, 2023.
[3] Y. Gao, Y. Xiong, X. Gao et al., “Retrieval-augmented generation for large language models: A survey,” arXiv preprint arXiv:2312.10997, 2023.

Issues

[Request] Add full-text upload and stronger RAG performance #4201

l0g1n · 2024-10-08T09:24:20Z


It would be great to support text-embeddings-inference for embedding and rerank:
https://github.com/huggingface/text-embeddings-inference
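
For reference, a rough sketch of how a rerank step against a text-embeddings-inference server might be wired in. It assumes the server's `/rerank` route accepts `{ query, texts }` and returns a list of `{ index, score }` items, which should be checked against the TEI documentation for the deployed version. The base URL, the `PostJson` transport type, and all function names are hypothetical; the HTTP call is injected so the sketch does not depend on a particular fetch implementation.

```typescript
// Assumed response item of TEI's /rerank route: an index into `texts` plus a relevance score.
interface RerankResult {
  index: number;
  score: number;
}

// Hypothetical transport: posts a JSON body and resolves with the parsed JSON response.
type PostJson = (url: string, body: unknown) => Promise<unknown>;

// Builds the request for the rerank route, tolerating a trailing slash on the base URL.
function buildRerankRequest(baseUrl: string, query: string, texts: string[]) {
  return { url: `${baseUrl.replace(/\/+$/, "")}/rerank`, body: { query, texts } };
}

// Reorders the original texts by descending rerank score and keeps the top K.
function applyRerank(texts: string[], results: RerankResult[], topK: number): string[] {
  return [...results]
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map((r) => texts[r.index])
    .filter((t): t is string => t !== undefined); // drop out-of-range indices
}

async function rerank(
  post: PostJson,
  baseUrl: string,
  query: string,
  texts: string[],
  topK = 3,
): Promise<string[]> {
  const { url, body } = buildRerankRequest(baseUrl, query, texts);
  const results = (await post(url, body)) as RerankResult[];
  return applyRerank(texts, results, topK);
}
```

In a RAG pipeline this would run after the vector search: the retrieved chunk texts go in as `texts`, and only the reranked top K are passed to the LLM.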
