Building a RAG system using the HyDE method


I built a RAG system using HyDE (Hypothetical Document Embeddings), a technique for improving retrieval in RAG. This post summarizes what I tried. I used gpt-4o-mini as the LLM to keep costs down.


  1. "LLM Fine Tuning and RAG": I learned about HyDE from this book and tried the samples in it.
  2. "RAG with langchain and Databricks (I) Learn RAG : RAG with HyDE": I used the code in this article as a reference.
  3. "[RAG] Try HyDE with LangChain": I also referred to the code in this article.
  4. "AutoHyDE: Next Generation Methodology for RAG Development (Introduction to AutoHyDE, an extension of HyDE)": An article explaining AutoHyDE, an advanced form of HyDE, which I would like to try in the future. It is a Japanese explanation of the original AutoHyDE article.


Many parts of the code were adapted from this post.

Load the vector database

WIKI_DB = "../20240813_RAG_MakeDB/wiki_vs.db"
import os
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_community.vectorstores import FAISS

model_name = "intfloat/multilingual-e5-large"
model_path = f"/workdir/models/{model_name}"

embeddings = HuggingFaceEmbeddings(
    model_name = model_path,
    model_kwargs = {'device':'cuda:0'},
)

# Load the pre-built vector database
if os.path.exists(WIKI_DB):
    db = FAISS.load_local(WIKI_DB, embeddings, allow_dangerous_deserialization=True)
else:
    print("You need to make vector database")

Build a retriever

# Build the retriever
# Retrieve the top 2 documents by vector search (k=2)
retriever = db.as_retriever(search_kwargs={'k':2})
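
As a rough illustration of what k=2 means: the retriever embeds the query and returns the k stored documents whose vectors are closest to it. The sketch below uses made-up 3-dimensional vectors and two hypothetical helpers (cosine, top_k) in plain Python. It is not how FAISS works internally, which relies on its own index structures and distance settings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, store, k=2):
    """Return the texts of the k entries most similar to query_vec."""
    ranked = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy "vector store": the vectors are invented for illustration only.
store = [
    ("doc about stars", [0.9, 0.1, 0.0]),
    ("doc about cooking", [0.0, 0.2, 0.9]),
    ("doc about elements", [0.7, 0.6, 0.1]),
]

print(top_k([1.0, 0.3, 0.0], store, k=2))
```

With k=2, only the two closest documents are passed on as context. A larger k gives the LLM more material but also more noise and more tokens.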

Set API key

# Set the OpenAI API key
import os

os.environ['OPENAI_API_KEY'] = 'xxxxxxxxxxxxxxxxx'
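
A key written directly into the code is easy to leak when the script is shared. A minimal alternative, assuming the key has already been exported in the shell as OPENAI_API_KEY, is to read it and fail early if it is missing. require_env is a hypothetical helper, not part of LangChain or the OpenAI SDK.

```python
import os

def require_env(name):
    """Return the value of environment variable `name`, or raise if it is unset or empty."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Set {name} before running this script")
    return value

# Usage (commented out so the snippet does not fail where the key is not set):
# api_key = require_env("OPENAI_API_KEY")
```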

Configure gpt-4o-mini as LLM

# Configure the LLM
from langchain_openai import ChatOpenAI
from langchain.chains import LLMChain

llm = ChatOpenAI(model_name="gpt-4o-mini")

Prepare a HyDE template

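# HyDE template (prompt that generates the hypothetical document used for retrieval)

from langchain_core.prompts.prompt import PromptTemplate
from langchain.retrievers import RePhraseQueryRetriever

# The prompt stays in Japanese because the question and the answer are in Japanese.
# In English: "Please write an answer to the following question. / Question: {question}"
hyde_prompt_template = """以下の質問の回答を書いてください。
質問: {question}
"""

# HyDE prompt
hyde_prompt = PromptTemplate.from_template(hyde_prompt_template)

# HyDE retriever
rephrase_retriever = RePhraseQueryRetriever.from_llm(
    retriever = retriever,
    llm = llm,
    prompt = hyde_prompt,
)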
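
The idea behind this step can be sketched without LangChain. Instead of searching with the question itself, HyDE first asks the LLM for a hypothetical answer and searches with that text, because an answer-shaped passage tends to sit closer to the real documents than a short question does. In the sketch below, fake_llm, bag_of_words, overlap, and the two documents are stand-ins invented for illustration. The real pipeline uses gpt-4o-mini and multilingual-e5-large embeddings.

```python
def fake_llm(prompt):
    """Stand-in for the LLM: returns a canned hypothetical answer."""
    return "The B2FH paper explains how elements are synthesized in stars"

def bag_of_words(text):
    """Stand-in for an embedding model: the set of lowercase words."""
    return set(text.lower().split())

def overlap(a, b):
    """Jaccard similarity between two word sets."""
    return len(a & b) / len(a | b)

def hyde_search(question, docs):
    # 1. Generate a hypothetical document that answers the question.
    hypothetical = fake_llm(f"Write an answer to the question.\nQuestion: {question}")
    # 2. Embed the hypothetical document, not the question.
    query_vec = bag_of_words(hypothetical)
    # 3. Return the stored document closest to the hypothetical one.
    return max(docs, key=lambda d: overlap(query_vec, bag_of_words(d)))

docs = [
    "Synthesis of the elements in stars is a landmark paper on how elements are made",
    "A recipe for tomato pasta",
]
print(hyde_search("Tell me about B2FH", docs))
```

The hypothetical answer may contain factual errors. That is acceptable here because it is only used as a search key, and the final answer is generated from the retrieved documents.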

Prepare a RAG template

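# Prepare the template
# Assumed minimal template: it only needs the {context} and {question}
# placeholders that the chain fills in.

template = """
{context}

{question}
"""
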
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template(template)

Define a function to format Documents

# Function to format Documents

def doc_to_str(docs):
    return "\n---\n".join(doc.page_content for doc in docs)

from langchain_core.output_parsers import StrOutputParser
output_parser = StrOutputParser()
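
To see what doc_to_str produces, here is a small check that uses a stand-in for LangChain's Document. FakeDoc is hypothetical. The only attribute doc_to_str needs is page_content.

```python
from dataclasses import dataclass

@dataclass
class FakeDoc:
    """Minimal stand-in for a LangChain Document."""
    page_content: str

def doc_to_str(docs):
    # Same function as above: join the page contents with a separator line.
    return "\n---\n".join(doc.page_content for doc in docs)

print(doc_to_str([FakeDoc("first passage"), FakeDoc("second passage")]))
# first passage
# ---
# second passage
```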

Build Chain

# Build the chain with LangChain (LCEL)

from langchain_core.runnables import RunnableParallel, RunnablePassthrough

chain = (
    {"context": rephrase_retriever | doc_to_str, "question": RunnablePassthrough()}
    | prompt
    | llm
    | output_parser
)


# "Please tell me about the B2FH paper"
question = "B2FH論文について教えてください"
print(chain.invoke(question))


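B2FH論文(B²FH論文)は、元素の起源に関する記念碑的な論文で、題名は "Synthesis of the Elements in Stars" です。著者はマーガレット・バービッジ、ジェフリー・バービッジ、ウィリアム・ファウラー、フレッド・ホイルの4名で、彼らの頭文字を取って「B2FH」として知られています。この論文は1955年から1956年にかけてケンブリッジ大学とカリフォルニア工科大学で執筆され、1957年にアメリカ物理学会の査読付き学術誌"Reviews of Modern Physics"で発表されました。

In English: The B2FH paper (B²FH paper) is a landmark paper on the origin of the elements, titled "Synthesis of the Elements in Stars". Its four authors are Margaret Burbidge, Geoffrey Burbidge, William Fowler, and Fred Hoyle, and it is known as "B2FH" after their initials. The paper was written at the University of Cambridge and the California Institute of Technology between 1955 and 1956, and was published in 1957 in "Reviews of Modern Physics", a peer-reviewed journal of the American Physical Society.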


The response is almost identical to the opening of the corresponding Wikipedia entry.
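
For readers new to LCEL, the data flow of the chain above can be imitated in plain Python. The dict step sends the same question to every value and collects the results, and each | passes the output of one step to the next. Step and parallel below are hypothetical stand-ins, not LangChain classes, and the lambdas replace the retriever, prompt, and LLM.

```python
class Step:
    """Tiny stand-in for an LCEL runnable: wraps a function and supports `|`."""
    def __init__(self, fn):
        self.fn = fn

    def __or__(self, other):
        # a | b  ->  a new step that runs a, then feeds its output to b
        return Step(lambda x: other.fn(self.fn(x)))

    def invoke(self, x):
        return self.fn(x)

def parallel(mapping):
    """Like the dict at the head of the chain: every value receives the same input."""
    return Step(lambda x: {key: step.invoke(x) for key, step in mapping.items()})

# Stand-ins for: rephrase_retriever | doc_to_str, RunnablePassthrough(), prompt, llm | output_parser
retrieve = Step(lambda q: f"[documents about {q}]")
passthrough = Step(lambda q: q)
fill_prompt = Step(lambda d: f"Context: {d['context']}\nQuestion: {d['question']}")
echo_llm = Step(lambda p: p.upper())

demo_chain = parallel({"context": retrieve, "question": passthrough}) | fill_prompt | echo_llm
print(demo_chain.invoke("b2fh"))
```

The point of the sketch is that the question string enters once, is fanned out to the "context" and "question" branches, and the resulting dict is what fills the prompt placeholders.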


In the book (Source 1), HyDE was implemented with HypotheticalDocumentEmbedder, but this time I used the from_llm() method of the RePhraseQueryRetriever class, following Source 2 and others. The result is a very simple implementation built as an LCEL chain.

I would like to try AutoHyDE in the future, and to deepen my understanding of LCEL a little more.