Building GraphRAG with Amazon Neptune and LangChain (not yet completed)

Motivation

In this post, I built GraphRAG with Neo4j and LangChain on a trial basis. I am currently studying Amazon Bedrock with this book, and since I’ve just started working with AWS, I decided to try building GraphRAG on AWS as well.

Sources.

  1. Amazon Bedrock Part 3: This article was written a year ago. The package (library) structure has changed since then, so I modified some of the code.
  2. Announced LlamaIndex support for building GraphRAG applications in Amazon Neptune: An AWS introductory article on building GraphRAG with LlamaIndex.
  3. LlamaIndex + Amazon Neptune GraphRAG: A Qiita article that tries out GraphRAG built using LlamaIndex and Neptune.

Building GraphRAG with Neptune

Referring to source 1., I proceeded with the following steps.

  1. Start and configure Neptune

  2. Create database

  3. Create a notebook (JupyterLab) from Neptune

  4. Input code from JupyterLab and try it out.

    The data to be stored in the database is text data about U.S. movies written in Cypher.

Building Neptune

Start Neptune from the management console

Select the “Neptune” service from the management console of your IAM account and start Neptune by selecting “Start Amazon Neptune”.

Database Creation

In the management console, specify Neptune > Databases > Create Database and configure as follows.

Neptune1

Neptune2

Neptune3

Neptune4

Neptune5

It turned out that the IAM role specified above was not applied. I later re-entered the same settings when creating the notebook separately (see below).

Neptune6

Now click on “Create Database” to create the database.

The following error occurred during database creation.

An error occurred while creating DB cluster graph-llm-1.
The DB Subnet group doesn't meet Availability Zone (AZ) coverage requirement. Current AZ coverage: us-east-1a. Add subnet to cover at least 2 AZs.

The cause seems to be that the VPC I was using had subnets in only one AZ. Neptune apparently requires the subnet group to span multiple AZs for high database reliability. I learned one more thing here. I added a subnet in a second AZ, referring to this article. When creating it, I paid attention to the IP address range (CIDR) so that it would not overlap with the existing one. The first private subnet I had created for the Bedrock study was 10.0.128.0/20, so I set the new one to 10.0.192.0/20.
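The CIDR choice above can be sanity-checked before creating the subnet. The following is a minimal sketch using Python's standard ipaddress module. The two /20 ranges are the ones from this post, while the 10.0.0.0/16 VPC range and the helper name is_valid_new_subnet are assumptions made for illustration.

```python
import ipaddress

# The two /20 subnets come from the post; the /16 VPC range is an assumption.
vpc = ipaddress.ip_network("10.0.0.0/16")
existing_private = ipaddress.ip_network("10.0.128.0/20")
new_private = ipaddress.ip_network("10.0.192.0/20")


def is_valid_new_subnet(new, existing, vpc):
    """Return True if `new` lies inside the VPC and overlaps no existing subnet."""
    inside_vpc = new.subnet_of(vpc)
    no_overlap = all(not new.overlaps(subnet) for subnet in existing)
    return inside_vpc and no_overlap


print(is_valid_new_subnet(new_private, [existing_private], vpc))
```

A candidate range that sits inside 10.0.128.0/20, such as 10.0.128.0/24, would be rejected by the same check because it overlaps the existing subnet.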

I then tried creating the database again.

This time, I got an error message saying “could not create notebook aws-neptune-graph-llm-1,” but I ignored it and proceeded, planning to create the notebook later.

The message said the database was created successfully.

Neptune7

Starting a notebook

Creating a notebook

From the Neptune side menu, I selected “Notebooks” and applied the same settings as in the “Notebook Settings” section shown in the screenshot above.

Notebook aws-neptune-graph-llm-1 was created, but not with the “graph-llm-DELETE” role name I had intended to specify. Instead, its IAM role was named “AWSNeptuneNotebookRole-1730423203115”. I added Bedrock permissions to this role in the policy editor, which resulted in a “bedrock:*” action being added to the policy.
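For reference, a policy statement allowing “bedrock:*” might look roughly like the following. This is only a sketch of the general IAM policy document shape, built as a Python dict so it can be printed as JSON. The Resource value of "*" and the helper name bedrock_policy are assumptions, not what the policy editor actually generated for me.

```python
import json


def bedrock_policy(actions=("bedrock:*",), resource="*"):
    """Build an IAM policy document allowing the given Bedrock actions.

    The resource scope is an assumption for illustration.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": resource,
            }
        ],
    }


print(json.dumps(bedrock_policy(), indent=2))
```

Passing a narrower action such as bedrock:InvokeModel instead of the wildcard would probably be enough for this experiment, though I have not verified that.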

Open JupyterLab

From Neptune notebooks, click on the notebook created above and open JupyterLab by clicking “Actions” > “Open JupyterLab”.

Try the code

From here, try the code by the usual JupyterLab operations.

Installing the package (library)

In source 1., boto3, botocore, and langchain are installed, but that was not necessary here because they were already installed in the notebook environment I built with the above procedure. I installed only the following packages.

%pip install -U langchain-aws langchain-community

Prepare Neptune graphs

from langchain.graphs import NeptuneGraph

host = "graph-llm-1.cluster-ro-******.us-east-1.neptune.amazonaws.com"
port = 8182
use_https = True

graph = NeptuneGraph(host=host, port=port, use_https=use_https)
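Note that the host above contains “cluster-ro-”, which in Neptune's endpoint naming indicates the reader endpoint rather than the cluster (writer) endpoint. The following is a small sketch that classifies an endpoint host by that naming convention and shows the URL the three settings describe. The helper name describe_endpoint is hypothetical, and the URL format is my assumption about how host, port, and use_https combine.

```python
def describe_endpoint(host, port=8182, use_https=True):
    """Return (url, kind) for a Neptune endpoint host.

    kind is inferred only from the naming convention of the second DNS label:
    "cluster-ro-..." -> reader, "cluster-..." -> cluster, otherwise instance.
    """
    scheme = "https" if use_https else "http"
    url = f"{scheme}://{host}:{port}"

    labels = host.split(".")
    second = labels[1] if len(labels) > 1 else ""
    if second.startswith("cluster-ro-"):
        kind = "reader"
    elif second.startswith("cluster-"):
        kind = "cluster"
    else:
        kind = "instance"
    return url, kind


# Host string exactly as written in this post (the ID is masked).
host = "graph-llm-1.cluster-ro-******.us-east-1.neptune.amazonaws.com"
print(describe_endpoint(host))
```

The reader endpoint is intended for read queries, which is what the question-answering chain needs. Write statements would normally go to the cluster endpoint.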

Populating a graph database

Populate the graph database with statements written in openCypher syntax.

%%oc
CREATE (TheMatrix:Movie {title:'The Matrix', released:1999, tagline:'Welcome to the Real World'})
CREATE (Keanu:Person {name:'Keanu Reeves', born:1964})
・・・

This was the first time I had come across the magic command “%%oc”!

Prepare LLM

from langchain_aws import BedrockLLM
from langchain.chains import NeptuneOpenCypherQAChain

modelId = 'anthropic.claude-v2' 
model_kwargs = {
    "max_tokens_to_sample": 512,
    "temperature": 0, 
    "top_k": 250, 
    "top_p": 1, 
    "stop_sequences": ["\n\nHuman:"] 
}

llm = BedrockLLM(
    model_id=modelId,
    model_kwargs=model_kwargs
)

The first line, “from langchain_aws import BedrockLLM”, was changed from the code in source 1. to match the current package structure.

Graph database to answer questions

chain = NeptuneOpenCypherQAChain.from_llm(llm=llm, graph=graph, verbose=True)

chain.run("who played in Top Gun ?")

When I ran the above, I got a “ValueError”. I looked into it, but so far the cause is unknown, so I gave up at this point.
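For the follow-up investigation, it may help to capture the exception type, message, and full traceback instead of just noting that a “ValueError” occurred. The following is a generic sketch using only the standard library. run_and_report and failing_call are hypothetical names, and failing_call is merely a stand-in for the real chain call, which cannot run outside the Neptune notebook.

```python
import traceback


def run_and_report(func, *args, **kwargs):
    """Call func and return (result, report); report is None on success."""
    try:
        return func(*args, **kwargs), None
    except Exception as exc:
        report = {
            "type": type(exc).__name__,
            "message": str(exc),
            "traceback": traceback.format_exc(),
        }
        return None, report


def failing_call(question):
    # Stand-in for the chain call; the message here is made up.
    raise ValueError("example failure for: " + question)


result, report = run_and_report(failing_call, "who played in Top Gun ?")
print(report["type"], "-", report["message"])
```

In the notebook, passing the actual chain call in place of failing_call would keep the full error message available for searching.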

Execute an openCypher query

%%oc
MATCH (p:Person)-[:ACTED_IN]->(m:Movie {title:'Top Gun'})
RETURN p.name

Neptune8

The openCypher query returns the expected results, so the graph database itself seems to be ready.
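To make clear what the query above asks for, here is a plain Python sketch of the same pattern match, (Person)-[:ACTED_IN]->(Movie {title}), over a tiny in-memory graph. The nodes and edges are toy data I picked for illustration, not the contents of the Neptune database, and actors_in is a hypothetical helper.

```python
# Toy graph: node id -> properties (illustrative data only).
nodes = {
    "tom": {"label": "Person", "name": "Tom Cruise"},
    "keanu": {"label": "Person", "name": "Keanu Reeves"},
    "topgun": {"label": "Movie", "title": "Top Gun"},
    "matrix": {"label": "Movie", "title": "The Matrix"},
}

# Directed edges as (source id, relationship type, destination id).
edges = [
    ("tom", "ACTED_IN", "topgun"),
    ("keanu", "ACTED_IN", "matrix"),
]


def actors_in(title):
    """Return names of Person nodes with an ACTED_IN edge to the given movie."""
    return [
        nodes[src]["name"]
        for src, rel, dst in edges
        if rel == "ACTED_IN"
        and nodes[src]["label"] == "Person"
        and nodes[dst]["label"] == "Movie"
        and nodes[dst].get("title") == title
    ]


print(actors_in("Top Gun"))
```

The QA chain is supposed to generate this kind of query from the natural-language question and then let the LLM phrase the answer from the returned names.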

Summary

This time, I was not able to get the LLM to answer questions using the information in the graph database, but I was able to set up Neptune and build the graph database. In the future, I would like to investigate the cause of the failure and build a working system by referring to sources 2. and 3.

After working through this hands-on exercise on AWS, I felt that I need to learn a little more about AWS IAM roles and related topics.