Using an LLM based on DeepSeek-R1-Distill-Qwen-14B/32B with additional Japanese language training

Motivation

I have been studying PINNs and related OpenFOAM topics for a while, but yesterday there was big news in the LLM world, so I decided to try DeepSeek-R1, which made an impact not only on the LLM field but also on stock prices. Since I could not run it as-is in my environment, I used a compact, quantized version of the model instead.

This time I used Ollama and Open WebUI so that the quantized model can be used from a browser, and this post summarizes the steps.

Sources

  1. Impact of DeepSeek-R1, the Chinese AI that surpasses ChatGPT - from ascii.jp; cited here as one example of the articles covering DeepSeek-R1.
  2. Announcement from CyberAgent - CyberAgent's announcement that they have released LLMs based on DeepSeek-R1-Distill-Qwen-14B/32B with additional training on Japanese data.
  3. mmnga/cyberagent-DeepSeek-R1-Distill-Qwen-14B-japanese-gguf - Quantized version of the CyberAgent model above, published on Hugging Face.
  4. Conversing with ollama’s LLM using Open WebUI as a front end - I wrote this up two months ago but had forgotten most of it and had a hard time recalling the details. Open WebUI provides a good bridge between the Ollama server and a web browser.
  5. Using ollama to run an LLM in a local environment - An article from the day before the post above, about running Ollama with Docker.

Procedure

Downloading the model

The 4-bit quantized model “Q4_K_M”, published in Source 3, was downloaded with the following command.

$ wget https://huggingface.co/mmnga/cyberagent-DeepSeek-R1-Distill-Qwen-14B-Japanese-gguf/resolve/main/cyberagent-DeepSeek-R1-Distill-Qwen-14B-Japanese-Q4_K_M.gguf

Incorporating the model into the Ollama server

First, start Ollama as described in “Creating docker-compose.yml” in Source 5.

$ sudo docker compose up -d

Next, enter the Ollama container started above, as described in “Downloading and Executing the LLM Model” in Source 5.

$ sudo docker exec -it ollama /bin/bash

The following operations are performed inside the container.

# cd /root/.ollama
# ls -l
total 8777476
-rw-rw-r-- 1 1000 1000         68 Jan 28 05:20 Modelfile
-rw-rw-r-- 1 1000 1000 8988110464 Jan 28 05:19 cyberagent-DeepSeek-R1-Distill-Qwen-14B-Japanese-Q4_K_M.gguf
# cat Modelfile
FROM ./cyberagent-DeepSeek-R1-Distill-Qwen-14B-Japanese-Q4_K_M.gguf
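
As a side note, a Modelfile is not limited to the FROM line; it can also set inference parameters. The following is only a sketch of what that could look like (the num_ctx and temperature values are arbitrary examples, not what was used here):

FROM ./cyberagent-DeepSeek-R1-Distill-Qwen-14B-Japanese-Q4_K_M.gguf
PARAMETER num_ctx 4096
PARAMETER temperature 0.7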

Place the downloaded model (the .gguf file) and the Modelfile as shown above so that they can be referenced from inside the container. Once everything is in place, use the following command to incorporate them into the Ollama server.

# ollama create cyberagent-DS-14b-japanese -f Modelfile

For the incorporation part, see also “Incorporating a gguf file as a model” in Source 5.
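
As a quick check, still inside the container, the newly created model should now appear in the model list, and it can be tried from the CLI before going through the browser:

# ollama list   # the new model should be listed here
# ollama run cyberagent-DS-14b-japanese   # interactive prompt from the CLI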

Launch Open WebUI

$ ls -l
drwxrwxr-x 5 kenji kenji 4096 11月 23 11:19 data
-rw-rw-r-- 1 kenji kenji  243 11月 26 08:33 docker-compose.yml
$ cat docker-compose.yml
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    environment:
      - OLLAMA_BASE_URL=http://192.168.11.4:11434
    volumes:
      - ./data:/app/backend/data
    ports:
      - 3000:8080
$ sudo docker compose up -d

For the part about starting Open WebUI, please also refer to “Starting Open WebUI” in Source 4.
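
Before moving to the browser, it may be worth checking that the Ollama server specified in OLLAMA_BASE_URL is reachable from the machine running Open WebUI. Assuming curl is available, querying Ollama's /api/tags endpoint should return the registered models as JSON:

$ curl http://192.168.11.4:11434/api/tags   # lists the models known to the Ollama server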

Connecting from a Web browser

Since the Ollama server and Open WebUI are now both running, open “http://192.168.11.8:3000/” in a browser and the Open WebUI start-up screen will appear. After entering a name, e-mail address, and password, you are taken to a screen that accepts questions.

Once connected, switch the LLM model in the upper-left part of the screen and ask a question. The following is an excerpt of the answer.

(Screenshot: the answer displayed in the browser)
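
Incidentally, the model can also be queried without a browser by calling Ollama's /api/generate endpoint directly. The prompt below is just a placeholder example:

$ curl http://192.168.11.4:11434/api/generate -d '{
    "model": "cyberagent-DS-14b-japanese",
    "prompt": "Briefly introduce yourself.",
    "stream": false
  }'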

Summary

I was surprised by how long the answer to my first question was, but when I asked other questions, the answers were relatively compact.

I tried the quantized version of CyberAgent's Japanese DeepSeek-R1 model using the Ollama and Open WebUI setup I had built about two months ago. I had forgotten many things, such as how to incorporate the model and the start-up procedure, so it helped a lot that I had written them down at the time.