大模型

构建专属知识库：利用llama3和langchain技术，基于RAG模型实现个性化知识管理

优质文章学习记录

06 Feb 2025 — 8 min read

LLM存在时效性和幻觉问题，在 [如何用解决大模型时效性和准确性问题？RAG技术核心原理]一文中我介绍了RAG的核心原理，本文将分享如何基于llama3和langchain搭建本地私有知识库。

先决条件

安装ollama和llama3模型，参看 [超越GPT-3.5!Llama3个人电脑本地部署教程]
安装python3.9
安装langchain用于协调LLM
安装weaviate-client用于向量数据库

pip3 install langchain weaviate-client

RAG实践

RAG需要从向量数据库检索上下文然后输入LLM进行生成，因此需要提前将文本数据向量化并存储到向量数据库。主要步骤如下：

准备文本资料
将文本分块
嵌入以及存储块到向量数据库

新建一个python3项目以及index.py文件，导入需要用到的模块：

from langchain_community.document_loaders import TextLoader # 文本加载器
from langchain.text_splitter import CharacterTextSplitter # 文本分块器
from langchain_community.embeddings import OllamaEmbeddings # Ollama向量嵌入器
import weaviate # 向量数据库
from weaviate.embedded import EmbeddedOptions # 向量嵌入选项
from langchain.prompts import ChatPromptTemplate # 聊天提示模板
from langchain_community.chat_models import ChatOllama # ChatOllma聊天模型
from langchain.schema.runnable import RunnablePassthrough
from langchain.schema.output_parser import StrOutputParser # 输出解析器
from langchain_community.vectorstores import Weaviate # 向量数据库
import requests

下载&加载语料

这里使用拜登总统2022年的国情咨文作为示例。文件链接[raw.githubusercontent.com/langchain-a…]

# 下载文件
url = "https://raw.githubusercontent.com/langchain-ai/langchain/master/docs/docs/modules/state_of_the_union.txt"
res = requests.get(url)
with open("state_of_the_union.txt", "w") as f:
    f.write(res.text)
# 加载文件
loader = TextLoader('./state_of_the_union.txt')
documents = loader.load()

语料分块

由于原始文档过大，超出了LLM的上下文窗口，需要将其分块才能让LLM识别。LangChain 提供了许多内置的文本分块工具，这里用CharacterTextSplitter作为示例：

text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)

嵌入以及存储到向量数据库

为了对语料分块进行搜索，需要为每个块生成向量并嵌入文档，最后将文档和向量一起存储。这里使用Ollama&llama3生成向量，并存储到Weaviate向量数据库。

client = weaviate.Client(
    embedded_options=EmbeddedOptions()
)
print("store vector")
vectorstore = Weaviate.from_documents(
    client=client,
    documents=chunks,
    embedding=OllamaEmbeddings(model="llama3"),
    by_text=False
)

检索 & 增强

向量数据库加载数据后，可以作为检索器，通过用户查询和嵌入向量之间的语义相似性获取数据，然后使用一个固定的聊天模板即可。

# 检索器
retriever = vectorstore.as_retriever()
# LLM提示模板
template = """You are an assistant for question-answering tasks. 
   Use the following pieces of retrieved context to answer the question. 
   If you don't know the answer, just say that you don't know. 
   Use three sentences maximum and keep the answer concise.
   Question: {question} 
   Context: {context} 
   Answer:
   """
prompt = ChatPromptTemplate.from_template(template)

生成

最后，将检索器、聊天模板以及LLM组合成RAG链就可以了。

llm = ChatOllama(model="llama3", temperature=10)
rag_chain = (
        {"context": retriever, "question": RunnablePassthrough()} # 上下文信息
        | prompt
        | llm
        | StrOutputParser()
)
# 开始查询&生成
query = "What did the president mainly say?"
print(rag_chain.invoke(query))

上面的示例中我问了LLM总统主要说了什么，LLM回答如下：

The president mainly talked about continuing efforts to combat COVID-19, including vaccination rates and measures to prepare for new variants. They also discussed investments in workers, communities, and law enforcement, with a focus on fairness and justice. The tone was hopeful and emphasized the importance of taking action to improve Americans' lives.

可以看到还是像那么回事的，LLM使用的输入预料的内容答复了一些关于新冠疫情以及工作、社区等内容。

langchain支持多种LLM，有需要的读者可以尝试下使用OpenAI提供的LLM。

读者可以根据需要替换下输入预料，构造自己的私有知识检索库。

本文所有代码如下：

from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
import weaviate
from weaviate.embedded import EmbeddedOptions
from langchain.prompts import ChatPromptTemplate
from langchain_community.chat_models import ChatOllama
from langchain.schema.runnable import RunnablePassthrough
from langchain.schema.output_parser import StrOutputParser
from langchain_community.vectorstores import Weaviate
import requests
# 下载数据
url = "https://raw.githubusercontent.com/langchain-ai/langchain/master/docs/docs/modules/state_of_the_union.txt"
res = requests.get(url)
with open("state_of_the_union.txt", "w") as f:
    f.write(res.text)
# 加载数据
loader = TextLoader('./state_of_the_union.txt')
documents = loader.load()
# 文本分块
text_splitter = CharacterTextSplitter(chunk_size=500, chunk_overlap=50)
chunks = text_splitter.split_documents(documents)
# 初始化向量数据库并嵌入目标文档
client = weaviate.Client(
    embedded_options=EmbeddedOptions()
)
vectorstore = Weaviate.from_documents(
    client=client,
    documents=chunks,
    embedding=OllamaEmbeddings(model="llama3"),
    by_text=False
)
# 检索器
retriever = vectorstore.as_retriever()
# LLM提示模板
template = """You are an assistant for question-answering tasks. 
   Use the following pieces of retrieved context to answer the question. 
   If you don't know the answer, just say that you don't know. 
   Use three sentences maximum and keep the answer concise.
   Question: {question} 
   Context: {context} 
   Answer:
   """
prompt = ChatPromptTemplate.from_template(template)
llm = ChatOllama(model="llama3", temperature=10)
rag_chain = (
        {"context": retriever, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
)
# 开始查询&生成
query = "What did the president mainly say?"
print(rag_chain.invoke(query))

如何学习AI大模型？

我在一线互联网企业工作十余年里，指导过不少同行后辈。帮助很多人得到了学习和成长。

我意识到有很多经验和知识值得分享给大家，也可以通过我们的能力和经验解答大家在人工智能学习中的很多困惑，所以在工作繁忙的情况下还是坚持各种整理和分享。但苦于知识传播途径有限，很多互联网行业朋友无法获得正确的资料得到学习提升，故此将并将重要的AI大模型资料包括AI大模型入门学习思维导图、精品AI大模型学习书籍手册、视频教程、实战学习等录播视频免费分享出来。