Here I summarize what I have learned through research and hands-on work in my areas of interest.

First steps to RAG using knowledge graphs

Introduction

A while ago, in this post, I described how I installed neo4j in my local environment (as a Docker container) in order to work with knowledge graphs.

In this post, I would like to summarize the simple knowledge graph that I built and used for RAG, following an article I found on the Internet. I titled this post "first steps" because I did little more than follow what that article said.
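The full walkthrough is behind the link. To give a flavor of what "using the graph for RAG" means in practice, here is a minimal sketch of pulling graph context into a prompt with the official neo4j Python driver; the URI, credentials, schema, and entity name are placeholder assumptions, not the article's actual setup:

```python
# Minimal sketch: fetch facts around an entity from neo4j to use as RAG context.
# Connection details, schema, and the entity name are assumptions for illustration.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

cypher = (
    "MATCH (a {name: $name})-[r]->(b) "
    "RETURN a.name AS source, type(r) AS relation, b.name AS target LIMIT 10"
)
with driver.session() as session:
    # Each (source, relation, target) triple becomes one line of LLM context.
    context = "\n".join(
        f"{rec['source']} -{rec['relation']}-> {rec['target']}"
        for rec in session.run(cypher, name="Andromeda Galaxy")
    )
driver.close()
print(context)  # pass this to the LLM prompt as retrieved knowledge
```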

[Read More]

Try PLaMo Beta Version

Introduction

I read this article on August 8th. According to the article, Preferred Elements (PFE), a subsidiary of Preferred Networks, would begin offering a free trial of its LLM PLaMo, whose Japanese-language performance reportedly exceeds that of GPT-4, prior to offering a commercial version.

I immediately applied for the free trial, received an acceptance email, and waited for the account issuance notification. The notification e-mail arrived on August 9th, but I overlooked it and completely forgot that I had applied. Recently, after reading this post, I remembered the free trial, rechecked my email, and found the account notification.

In this post, I will summarize what I tried with the free trial version.

[Read More]

Install neo4j and try knowledge graphs

Motivation

So far, I have built a RAG system using FAISS and BM25. Although vector search is relatively easy to set up, there are cases where the necessary information does not appear in the top-k retrieved documents, and I was looking for ways to improve accuracy. I happened to read this article, became interested in knowledge graphs, and decided to try them myself.
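For context, the baseline I am trying to improve on looks roughly like the hybrid setup below; this is a sketch with placeholder data, assuming LangChain's EnsembleRetriever, not my actual code:

```python
# Sketch of a FAISS + BM25 hybrid baseline; corpus and weights are placeholders.
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

texts = ["...", "..."]  # placeholder corpus

bm25 = BM25Retriever.from_texts(texts)  # keyword search
bm25.k = 4
faiss = FAISS.from_texts(texts, OpenAIEmbeddings()).as_retriever(
    search_kwargs={"k": 4}  # vector search returns only the top-k documents
)

# Weighted fusion of both result lists. Anything outside both top-k lists
# is still invisible to the LLM, which is the limitation noted above.
hybrid = EnsembleRetriever(retrievers=[bm25, faiss], weights=[0.5, 0.5])
docs = hybrid.invoke("some question")
```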

In this post, I will summarize the process of installing neo4j in my local environment and trying it out from a browser, as a first step toward using knowledge graphs.

[Read More]

Build RAG system

Introduction

As of yesterday, I had extracted astronomy-related entries from Wikipedia and created a vector database and a keyword index for RAG. Here, I will use those databases to build the RAG system.

The LLMs used are ChatGPT (gpt-4o) and Llama-3-ELYZA-JP-8B.
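For orientation, a RAG chain of this kind can be wired up roughly as follows. This is a sketch assuming LangChain; `retriever` stands in for the databases built in the previous posts and is not defined here, and the question is only illustrative:

```python
# Sketch of a RAG chain (LangChain); `retriever` is assumed to wrap the
# vector/keyword databases built earlier and is not defined in this snippet.
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI

def format_docs(docs):
    # Join the retrieved documents into a single context string.
    return "\n\n".join(doc.page_content for doc in docs)

prompt = ChatPromptTemplate.from_template(
    "Answer the question using only the following context.\n\n"
    "Context:\n{context}\n\nQuestion: {question}"
)
llm = ChatOpenAI(model="gpt-4o")  # a local model endpoint could be swapped in

chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
print(chain.invoke("When was the Andromeda Galaxy discovered?"))
```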

[Read More]

Creating text data for RAG from Wikipedia dump data

Motivation

I am experimenting with RAG using LangChain, and while considering what data to use for testing, I decided on Wikipedia dump data. Since the full dump is large, I decided to use only data from the astronomy-related categories that I am interested in.

Here, I summarize the series of steps for extracting only specific categories of data from the Wikipedia dump.
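The full steps are in the post. As a rough sketch of the general approach (not my actual script; the dump file name and the category string are placeholders), the dump can be streamed and filtered like this:

```python
# Sketch: stream pages from a Wikipedia XML dump and keep those whose wikitext
# mentions a target category. Dump path and category marker are placeholders.
import bz2
import xml.etree.ElementTree as ET

DUMP = "jawiki-latest-pages-articles.xml.bz2"  # placeholder dump file
TARGET = "[[Category:天文学"                    # placeholder category marker

def pages(path):
    """Yield (title, wikitext) pairs without loading the whole dump."""
    with bz2.open(path, "rb") as f:
        for _, elem in ET.iterparse(f):
            if elem.tag.rsplit("}", 1)[-1] == "page":  # drop the XML namespace
                title = elem.findtext(".//{*}title")
                text = elem.findtext(".//{*}text") or ""
                yield title, text
                elem.clear()  # free memory as we stream

for title, text in pages(DUMP):
    if TARGET in text:
        print(title)  # candidate page in the target category
```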

[Read More]

llama-cpp-python - impact of NumPy version upgrade

Introduction

NumPy 2.0.0 was released on June 16. I first noticed this the other day when I tried RAG using LangChain and got an error while building the Docker container. Later, I encountered another error in CMake when trying to incorporate llama-cpp-python. This article summarizes my responses to the two errors I recently experienced.

Dealing with errors related to NumPy 2.0.0

Background

I recently decided to learn RAG properly, so I purchased a Japanese book called [LLM fine tuning and RAG](https://www.

[Read More]

Try RAG with LlamaIndex

Motivation

In this post, where I tested the Chatbot UI, I mentioned that one of my future challenges was to work with RAG (Retrieval-Augmented Generation). Here, I summarize how to implement RAG using LlamaIndex.

Actually, I tried RAG using LangChain late last year. Since then, I have kept coming across the keyword LlamaIndex, so this time I decided to implement RAG using LlamaIndex.
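The details are in the full post. For reference, the minimal LlamaIndex pattern looks like this (the directory name and the question are placeholders):

```python
# Canonical minimal LlamaIndex flow: load files, index them, ask a question.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()  # load local text files
index = VectorStoreIndex.from_documents(documents)     # embed and index them
query_engine = index.as_query_engine()                 # retrieval + generation
print(query_engine.query("What is this document about?"))
```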

[Read More]

Try the Chatbot UI

Introduction

In a recent post, I ran the ELYZA 7B model in a local environment using llama-cpp-python. In that post, I mentioned as future work that I would like to build a system that can chat like ChatGPT.

This time, I built a system that can chat like ChatGPT in a Docker container, and I summarize the details here.
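Details are in the full post. As one hedged illustration of the moving parts: a ChatGPT-style UI typically talks to an OpenAI-compatible chat API, and such a backend can be smoke-tested from Python as below; the endpoint, API key, and model name are placeholders, not necessarily my setup:

```python
# Hypothetical smoke test against an OpenAI-compatible chat endpoint.
# Base URL, API key, and model name are placeholder assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="dummy")
resp = client.chat.completions.create(
    model="local-model",  # placeholder model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)
```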

[Read More]

Running Elyza models on GPU using llama-cpp-python

Motivation

Quantization is essential for running an LLM on a local workstation (12-16 GB of GPU memory). In this post, I summarize my attempts to make maximum use of the GPU using llama-cpp-python.

The content includes some of my missteps, as I ran into trouble in places due to my lack of understanding.
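For reference, the key knob in llama-cpp-python is `n_gpu_layers`. A minimal sketch follows; the model path is a placeholder, and llama-cpp-python must be built with CUDA support for the offload to take effect:

```python
# Minimal GPU-offload sketch with llama-cpp-python; model path is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./elyza-7b-instruct.Q4_K_M.gguf",  # placeholder quantized model
    n_gpu_layers=-1,  # offload all layers to the GPU
    n_ctx=2048,       # context window size
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "自己紹介してください。"}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```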

[Read More]