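Lesson 6 (with a UI application): The essential concept of pairing a search engine with an LLM - RAG Techniques: Building Intelligent Assistants in Practice - Cupoy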

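This lesson uses Streamlit and LangChain to build a chatbot application with search capability. The goal is to let users interact with an agent that integrates several tools (Arxiv, Wikipedia, and DuckDuckGo) and handles their queries. A step-by-step breakdown with annotations follows.

1. Loading the libraries

    import streamlit as st
    from langchain_groq import ChatGroq
    from langchain_community.utilities import ArxivAPIWrapper, WikipediaAPIWrapper
    from langchain_community.tools import ArxivQueryRun, WikipediaQueryRun, DuckDuckGoSearchRun
    from langchain.agents import initialize_agent, AgentType
    from langchain.callbacks import StreamlitCallbackHandler
    import os
    from dotenv import load_dotenv

• Streamlit: a Python library for building interactive web apps; here it provides the chatbot's front end.
• LangChain modules:
  • ChatGroq: a chat model interface that uses the Groq API as its backend.
  • ArxivAPIWrapper and WikipediaAPIWrapper: wrappers that call the Arxiv and Wikipedia APIs and process the query results.
  • ArxivQueryRun and WikipediaQueryRun: the tool classes that actually run queries against Arxiv and Wikipedia.
  • DuckDuckGoSearchRun: the tool class that runs DuckDuckGo web searches.
  • initialize_agent: the function that initializes a LangChain agent, combining several tools into one agent.
  • StreamlitCallbackHandler: a callback handler that displays the agent's execution (its thoughts and actions) in the Streamlit app.

2. Loading the .env file

    # The path is left empty here; set it to the location of your .env file.
    load_dotenv(dotenv_path="")

• This call loads the .env file at the given path. Such a file typically stores sensitive information such as API keys.
• dotenv_path specifies the location of the .env file. Reading environment variables from a file this way avoids hard-coding sensitive information in the program.

3. Setting up the query tools

    arxiv_wrapper = ArxivAPIWrapper(top_k_results=1, doc_content_chars_max=200)
    arxiv = ArxivQueryRun(api_wrapper=arxiv_wrapper)

    api_wrapper = WikipediaAPIWrapper(top_k_results=1, doc_content_chars_max=200)
    wiki = WikipediaQueryRun(api_wrapper=api_wrapper)

    search = DuckDuckGoSearchRun(name="Search")

• Arxiv and Wikipedia API wrappers: two wrappers are created, each limited to one result per query (top_k_results=1) and to at most 200 characters of content per document (doc_content_chars_max=200).
• DuckDuckGo search tool: a DuckDuckGo query tool named Search, used for web searches.

4. Title and introduction of the Streamlit app

    st.title("🔎 LangChain - Chat with search")
    """
    In this example, we're using `StreamlitCallbackHandler` to display the thoughts and actions of an agent in an interactive Streamlit app.
    Try more LangChain 🤝 Streamlit Agent examples at [github.com/langchain-ai/streamlit-agent](https://github.com/langchain-ai/streamlit-agent).
    """

• This part shows the title "LangChain - Chat with search" on the app's main page and explains what the app does, namely interacting with several search tools.
• The bare triple-quoted string is rendered on the page by Streamlit as Markdown. It gives more background on the app and points users to further examples.

5. Sidebar settings

    st.sidebar.title("Settings")
    api_key = st.sidebar.text_input("Enter your Groq API Key:", type="password")

• Sidebar settings: the sidebar asks the user for a Groq API key, which is required later to talk to the ChatGroq model.
• type="password": masks the key as it is typed, protecting the sensitive value.

6. Initializing chat message state

    if "messages" not in st.session_state:
        st.session_state["messages"] = [
            {"role": "assistant", "content": "Hi, I'm a chatbot who can search the web. How can I help you?"}
        ]

• st.session_state: Streamlit's built-in object for storing application state. The chat messages are stored here so that they persist across reruns of the app.
• If session_state has no messages yet, it is initialized with a welcome message in the assistant role that asks what the user needs. The role must be spelled "assistant" for st.chat_message to render it as the assistant.

7. Displaying existing chat messages

    for msg in st.session_state.messages:
        st.chat_message(msg["role"]).write(msg["content"])

• This loop iterates over all messages in st.session_state and displays them in the chat window. st.chat_message() renders each message according to its role ("user" or "assistant").

8. Receiving user input

    if prompt := st.chat_input(placeholder="What is machine learning?"):
        st.session_state.messages.append({"role": "user", "content": prompt})
        st.chat_message("user").write(prompt)

• st.chat_input() shows a chat input box where the user can type a question (for example "What is machine learning?").
• Once the user submits something, the input is appended to session_state and displayed in the app.

9. Handling the query with the ChatGroq model and the tools

    llm = ChatGroq(groq_api_key=api_key, model_name="Llama3-8b-8192", streaming=True)
    tools = [search, arxiv, wiki]

    # handle_parsing_errors=True lets the agent recover when the LLM output cannot be parsed.
    search_agent = initialize_agent(
        tools,
        llm,
        agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
        handle_parsing_errors=True,
    )

Reference: langchain.agents.agent_types.AgentType - 🦜🔗 LangChain 0.2.16 (you can choose the agent's planning and acting algorithm yourself).

• llm = ChatGroq(...): instantiates the ChatGroq model with streaming output enabled, using the model named "Llama3-8b-8192" and authenticating with the api_key entered in the sidebar.
• tools: the list of tools available to the agent: the DuckDuckGo, Arxiv, and Wikipedia query tools.
• initialize_agent(...): the core step. It combines the tools and the ChatGroq model into one agent, which picks a suitable tool for each user query.
• This code, like step 10, should sit inside the `if prompt := ...` block from step 8, so that the agent runs only after the user submits a question.

10. Handling and displaying the agent's response

    with st.chat_message("assistant"):
        st_cb = StreamlitCallbackHandler(st.container(), expand_new_thoughts=False)
        response = search_agent.run(st.session_state.messages, callbacks=[st_cb])
        st.session_state.messages.append({"role": "assistant", "content": response})
        st.write(response)

• StreamlitCallbackHandler: this callback handler displays the agent's "thoughts" and "actions" while it works on the query, so the user can follow its decision process in real time.
• search_agent.run(...): the agent receives the chat messages, uses the defined tools to handle the query, and returns the final result.
• The result is appended to session_state and displayed in the chat window as the "assistant" reply.

Taken together, the code builds a complete interactive chatbot that lets users ask questions against several information sources (Arxiv, Wikipedia, and DuckDuckGo). The agent selects the tool best suited to each query and uses it to produce the answer.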
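Step 10 hands the whole st.session_state.messages list to search_agent.run. Because the agent ultimately works from a text prompt, one option is to flatten the recent history into a single string first. The following is only a minimal sketch of that idea, not part of the original app: format_history and max_turns are hypothetical names introduced for illustration and belong to neither Streamlit nor LangChain.

```python
from typing import Dict, List


def format_history(messages: List[Dict[str, str]], max_turns: int = 6) -> str:
    """Flatten the most recent chat messages into one prompt string.

    Hypothetical helper for illustration; `max_turns` is an assumed knob
    that caps how many of the latest messages are included.
    """
    if max_turns <= 0:
        return ""
    recent = messages[-max_turns:]
    # One "role: content" line per message, oldest first.
    return "\n".join(f"{m['role']}: {m['content']}" for m in recent)


# Same message shape as the entries stored in st.session_state["messages"].
history = [
    {"role": "assistant", "content": "Hi, I'm a chatbot who can search the web. How can I help you?"},
    {"role": "user", "content": "What is machine learning?"},
]
prompt_text = format_history(history)
print(prompt_text)
```

In the app, the flattened text could then replace the raw list, for example search_agent.run(format_history(st.session_state.messages), callbacks=[st_cb]). Capping the number of turns keeps the prompt short, at the cost of the agent forgetting older messages.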