Motivation
In a previous post, I built GraphRAG with Neo4j and LangChain as an experiment. I am currently studying Amazon Bedrock from a book, and since I have only just started working with AWS, I decided to build GraphRAG on AWS as well.
Sources
1. Amazon Bedrock Part 3. This article was written a year ago; the package (library) structure has changed since then, so I changed some of the code.
2. Announced LlamaIndex support for building GraphRAG applications in Amazon Neptune. An AWS introductory article on building GraphRAG with LlamaIndex.
3. LlamaIndex + Amazon Neptune GraphRAG: I tried GraphRAG built using LlamaIndex and Neptune. A Qiita article.
Building GraphRAG with Neptune
Following source 1, I proceeded with the steps below.
- Start and configure Neptune
- Create the database
- Create a notebook (JupyterLab) from Neptune
- Enter code in JupyterLab and try it out
The data to be stored in the database is text data about U.S. movies written in Cypher.
Building Neptune
Start Neptune from the management console
Select the “Neptune” service from the management console of your IAM account and start Neptune by selecting “Start Amazon Neptune”.
Database Creation
In the management console, go to Neptune > Databases > Create Database and configure the settings.
The IAM role specified in these settings did not appear to be applied. I later entered the same notebook settings again when creating the notebook separately (see “Creating a notebook” below).
Now click on “Create Database” to create the database.
The following error occurred during database creation.
An error occurred while creating DB cluster graph-llm-1.
The DB Subnet group doesn't meet Availability Zone (AZ) coverage requirement. Current AZ coverage: us-east-1a. Add subnet to cover at least 2 AZs.
The cause seems to be that the VPC I was using had subnets in only one AZ. Neptune apparently requires subnets in multiple AZs to keep the database highly reliable. I learned something new here. Referring to this article, I added a subnet in a second AZ. When creating it, I paid attention to the IP address range (CIDR): the first private subnet I had created for the Bedrock study was 10.0.128.0/20, so I set the new one to 10.0.192.0/20.
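The CIDR choice above can be checked with Python's standard `ipaddress` module. This is a minimal sketch: the two subnet ranges come from the text, while the VPC range `10.0.0.0/16` is an assumption, since the post does not state it.

```python
import ipaddress

# Assumed VPC range; the post does not state it.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Subnet ranges taken from the text above.
existing_subnet = ipaddress.ip_network("10.0.128.0/20")
new_subnet = ipaddress.ip_network("10.0.192.0/20")

# A new subnet must sit inside the VPC range and must not overlap an existing subnet.
inside_vpc = new_subnet.subnet_of(vpc)
overlaps = new_subnet.overlaps(existing_subnet)

print(f"inside VPC: {inside_vpc}, overlaps existing subnet: {overlaps}")
```

Under that assumption, the new range is inside the VPC and does not collide with the first private subnet.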
I then created the database again.
This time I got an error message saying “could not create notebook aws-neptune-graph-llm-1,” but I ignored it and moved on, planning to create the notebook later.
The message said the database was created successfully.
Starting a notebook
Creating a notebook
From the Neptune side menu, I selected “Notebooks” and applied the same settings as in the “Notebook Settings” section shown earlier.
Notebook aws-neptune-graph-llm-1 was created, but not with the role name “graph-llm-DELETE” that I had specified; its IAM role was named “AWSNeptuneNotebookRole-1730423203115”. In the policy editor for that role, I added a Bedrock permission (“bedrock:*”).
Open JupyterLab
From Neptune notebooks, click on the notebook created above and open JupyterLab by clicking “Actions” > “Open JupyterLab”.
Try the code
From here, run the code using the usual JupyterLab operations.
Installing the package (library)
Source 1 installs boto3, botocore, and langchain, but that was unnecessary here because they are already installed in the notebook environment built with the procedure above. I installed only the following packages.
%pip install -U langchain-aws langchain-community
Prepare Neptune graphs
from langchain_community.graphs import NeptuneGraph
# Neptune endpoint (partly masked), port, and HTTPS setting
host = "graph-llm-1.cluster-ro-******.us-east-1.neptune.amazonaws.com"
port = 8182
use_https = True
graph = NeptuneGraph(host=host, port=port, use_https=use_https)
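For reference, the three connection parameters map onto a URL roughly as in the sketch below. The helper name `describe_neptune_endpoint` and the sample host names are hypothetical. The `/openCypher` path and the `cluster-ro` marker for the read-only reader endpoint reflect how Neptune endpoints are laid out, to the best of my understanding.

```python
def describe_neptune_endpoint(host: str, port: int = 8182, use_https: bool = True) -> dict:
    """Build the openCypher HTTP URL and flag reader (read-only) endpoints."""
    scheme = "https" if use_https else "http"
    return {
        "url": f"{scheme}://{host}:{port}/openCypher",
        # Reader endpoints carry "cluster-ro-" in the host name.
        "read_only": ".cluster-ro-" in host,
    }

# Hypothetical host name, used only for illustration.
info = describe_neptune_endpoint("example.cluster-ro-abc123.us-east-1.neptune.amazonaws.com")
print(info)
```

A reader endpoint is enough for question answering, which only reads from the graph.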
Populating a graph database
Populate the graph database with statements written in openCypher syntax.
%%oc
CREATE (TheMatrix:Movie {title:'The Matrix', released:1999, tagline:'Welcome to the Real World'})
CREATE (Keanu:Person {name:'Keanu Reeves', born:1964})
・・・
This was the first time I had come across the “%%oc” magic command!
Prepare LLM
from langchain_aws import BedrockLLM
from langchain.chains import NeptuneOpenCypherQAChain
# Bedrock model ID and inference parameters for Claude v2
modelId = 'anthropic.claude-v2'
model_kwargs = {
    "max_tokens_to_sample": 512,
    "temperature": 0,
    "top_k": 250,
    "top_p": 1,
    "stop_sequences": ["\n\nHuman:"]
}
# Use the BedrockLLM class imported above (the name Bedrock is not imported)
llm = BedrockLLM(
    model_id=modelId,
    model_kwargs=model_kwargs
)
The first line, “from langchain_aws import BedrockLLM”, differs from the code in source 1.
Answering questions with the graph database
chain = NeptuneOpenCypherQAChain.from_llm(llm=llm, graph=graph, verbose=True)
chain.run("who played in Top Gun ?")
Running the above raises a “ValueError”. I looked into it but have not yet found the cause, so I gave up at this point.
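Since the cause is still unknown, one first step would be to capture the exception's full message and traceback, which usually name the failing check. The sketch below uses a hypothetical helper, `describe_error`, and a stand-in function in place of the real chain call so that it runs on its own.

```python
import traceback

def describe_error(func, *args, **kwargs):
    """Call func and return (result, None), or (None, report) if it raises."""
    try:
        return func(*args, **kwargs), None
    except Exception as exc:
        # Keep the exception type, its message, and the full traceback text.
        report = {
            "type": type(exc).__name__,
            "message": str(exc),
            "traceback": traceback.format_exc(),
        }
        return None, report

# Stand-in for the real chain call; it only imitates a failing step.
def failing_step(question):
    raise ValueError(f"example failure for: {question}")

result, report = describe_error(failing_step, "who played in Top Gun ?")
print(report["type"], "-", report["message"])
```

In the notebook, the same helper could wrap chain construction and the query separately, to see which of the two steps fails. One unverified possibility is that recent langchain-community releases make graph QA chains raise a ValueError unless `allow_dangerous_requests=True` is passed to `from_llm`. If so, the captured message would say it.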
Execute an openCypher query
%%oc
MATCH (p:Person)-[:ACTED_IN]->(m:Movie {title:'Top Gun'})
RETURN p.name
The Cypher query returns results successfully, so the graph database seems to be ready.
Summary
This time, I was not able to get the LLM to answer questions using the information in the graph database, but I was able to set up Neptune and build the graph database. Next, I would like to investigate the cause of the failure and build a working system by referring to sources 2 and 3.
After this hands-on exercise with AWS, I realized I need to learn more about AWS IAM roles and related topics.