Run Japanese LLMs on an on-premise environment

Motivation

Looking back on 2023, it was a year in which many Japanese LLMs (Large Language Models) were released. I have tried running some of them in my home environment, and I summarize the results here.

[Read More]

Modify CNN training code to work with Horovod

Introduction

With Try Horovod in Docker, Horovod can now be used in my own (on-premises) environment under Docker. The next thing to do is to modify training code that runs on a single server so that it can be used for distributed training with Horovod. As a first step, I modified a relatively simple CNN training script for distributed training with Horovod, which is summarized in the following article.
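
For reference, the changes Horovod requires are fairly small. The following is only a minimal sketch for a PyTorch CNN, assuming the horovod.torch API and using a toy model and random data in place of the real training script; the actual modifications are described in the linked article.

    import torch
    import horovod.torch as hvd

    # Initialize Horovod and pin this process to one GPU per local rank.
    hvd.init()
    torch.cuda.set_device(hvd.local_rank())

    # A toy CNN and random data stand in for the real model and dataset.
    model = torch.nn.Sequential(
        torch.nn.Conv2d(1, 8, 3), torch.nn.ReLU(),
        torch.nn.Flatten(), torch.nn.Linear(8 * 26 * 26, 10)).cuda()
    dataset = torch.utils.data.TensorDataset(
        torch.randn(1024, 1, 28, 28), torch.randint(0, 10, (1024,)))

    # Partition the dataset across workers and scale the learning rate by the worker count.
    sampler = torch.utils.data.distributed.DistributedSampler(
        dataset, num_replicas=hvd.size(), rank=hvd.rank())
    loader = torch.utils.data.DataLoader(dataset, batch_size=64, sampler=sampler)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())

    # Broadcast the initial state from rank 0 so every worker starts identically,
    # then wrap the optimizer so gradients are averaged across workers.
    hvd.broadcast_parameters(model.state_dict(), root_rank=0)
    hvd.broadcast_optimizer_state(optimizer, root_rank=0)
    optimizer = hvd.DistributedOptimizer(
        optimizer, named_parameters=model.named_parameters())

    for epoch in range(3):
        sampler.set_epoch(epoch)  # reshuffle the data partitioning each epoch
        for x, y in loader:
            optimizer.zero_grad()
            loss = torch.nn.functional.cross_entropy(model(x.cuda()), y.cuda())
            loss.backward()
            optimizer.step()

A script like this would then be launched with something like horovodrun -np 2 python train.py.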

[Read More]

Try Horovod in Docker

Motivation

I have been interested in distributed training for about a year and have been experimenting with Horovod, a distributed training framework, on multiple machines equipped with TITAN V GPUs. I finally got a distributed training sample working, so I am posting it here.

[Read More]

Uninstall Rootless Docker

Introduction

In May of this year, I posted the article Building Rootless Docker, but I have since decided to uninstall rootless docker. The following is a summary of the uninstallation procedure. I will post the details of the circumstances that led to the uninstallation later.

[Read More]

Trying NVIDIA Modulus - Introduction to PINNs

Introduction

Over a month ago, I became interested in NVIDIA Modulus at a seminar I attended, How to Speed Up Simulation with AI Surrogate Models?, so I bought a book and started studying it. As a prerequisite for further study of Modulus, I installed it in my environment and summarized the installation process here as “Introduction to PINNs”.

[Read More]

Running rinna 3.6b on a docker container

Motivation

I wanted to try out a large language model (LLM) for Japanese, so I used rinna's model, which was released in May. To save installation time, I ran rinna in a docker container environment.

I ran into some problems in doing so, which are summarized below.
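
As a rough sketch of what running rinna looks like (the actual container setup and the problems I hit are in the article): the model can be loaded through Hugging Face transformers. The snippet below assumes the rinna/japanese-gpt-neox-3.6b checkpoint and a CUDA-capable GPU.

    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    # rinna's 3.6B model ships a sentencepiece tokenizer, so the slow tokenizer is used.
    tokenizer = AutoTokenizer.from_pretrained("rinna/japanese-gpt-neox-3.6b", use_fast=False)
    model = AutoModelForCausalLM.from_pretrained(
        "rinna/japanese-gpt-neox-3.6b", torch_dtype=torch.float16).to("cuda")

    # Generate a short continuation of a Japanese prompt.
    inputs = tokenizer("こんにちは、", return_tensors="pt").to("cuda")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs, max_new_tokens=50, do_sample=True, temperature=0.8,
            pad_token_id=tokenizer.pad_token_id)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))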

[Read More]

Install ubuntu 22.04 LTS

Introduction

It has been a year since Ubuntu 22.04 LTS (Jammy Jellyfish) was released, and I decided to switch from my previous 20.04 to 22.04, thinking it was a good time to get used to the various software and other changes.

I installed ubuntu 22.04 Japanese Remix on several workstations so that I could use Japanese easily. I got a bit stuck trying to install ubuntu 22.04 on an NVIDIA GPU-equipped workstation, so here is the situation and how I dealt with it.

[Read More]

Build a private docker registry

Motivation

In a previous post I summarized how to start a docker container in user mode.

When using multiple PCs (workstations, hereafter WS), I need some mechanism for sharing containers among them. In the case of singularity, we could store sif files on an NFS server and use them from the other WS.

In the case of docker, I decided to build a registry server, thinking that I could set up and operate a private registry within my home network.
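
As a rough illustration of the idea (not necessarily the exact procedure from the article), a private registry is essentially the official registry:2 image listening on a port that the other WS can reach. The sketch below uses the docker Python SDK (docker-py); the host port 5000 is an arbitrary choice.

    import docker

    client = docker.from_env()

    # Run the official registry image and expose it on port 5000 of this host.
    registry = client.containers.run(
        "registry:2",
        name="registry",
        ports={"5000/tcp": 5000},
        restart_policy={"Name": "always"},
        detach=True,
    )
    print(registry.name, registry.status)

    # Other WS can then tag and push images as <this-host>:5000/<image>:<tag>.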

[Read More]

Building a Rootless Docker

Motivation

I have been using singularity containers instead of docker. The reason I have avoided docker is that it requires root privileges to launch containers (although, since I am only running it at home, this is not something I strictly need to worry about).

I no longer remember how I came across it, but I found out that docker has a rootless mode that allows containers to be run as a normal user, so I decided to give it a try this time.

[Read More]

Speed up learning and inference on PyTorch

Motivation

In my home environment, where JupyterLab notebooks are stored on NFS, I measured performance with either a Raspberry Pi or an HP Z240 as the NFS server, and found that within the training loop (i.e. once epochs are running) there is no significant difference between notebooks stored on the NFS server and notebooks stored locally.

Therefore, I took on the challenge of speeding up the training itself, and I summarize the progress and results here.
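
One common way to speed up a PyTorch training loop, not necessarily the approach taken in the article, is automatic mixed precision. A minimal sketch with a toy model and random data:

    import torch

    # Toy model and data; the article presumably uses its own network and dataset.
    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10)).cuda()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    loader = torch.utils.data.DataLoader(
        torch.utils.data.TensorDataset(
            torch.randn(1024, 1, 28, 28), torch.randint(0, 10, (1024,))),
        batch_size=64, num_workers=2, pin_memory=True)

    scaler = torch.cuda.amp.GradScaler()  # keeps fp16 gradients numerically stable

    for x, y in loader:
        x, y = x.cuda(non_blocking=True), y.cuda(non_blocking=True)
        optimizer.zero_grad()
        with torch.cuda.amp.autocast():  # run the forward pass in mixed precision
            loss = torch.nn.functional.cross_entropy(model(x), y)
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()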

[Read More]