The ChatGLM-6B model posts impressive results and can run on our current 12 GB GPU, so we decided to download and install it while waiting for our new 48 GB GPU to arrive by boat in Anguilla. Note that it was trained on both Chinese and English, and it is still impressive in English.
To download the model from huggingface.co we made an insecure script, hugginface.download.py, that you can download and then use. It is insecure because it will download and run code, so don't use it on a computer that holds secrets or anything important. To have our script download roughly 12 GB of weights and supporting files, run:
python3 hugginface.download.py THUDM/chatglm-6b
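For reference, here is a minimal sketch of what such a downloader can look like, using Hugging Face's official huggingface_hub client library. The helper names are our own and this is not the actual script, just the general shape:

```python
import sys

def hf_file_url(repo_id: str, filename: str, revision: str = "main") -> str:
    """Build the direct download URL for a single file in a Hugging Face repo."""
    return f"https://huggingface.co/{repo_id}/resolve/{revision}/{filename}"

def download_repo(repo_id: str) -> str:
    """Fetch every file in the repo (about 12 GB for chatglm-6b) and return the local path."""
    # huggingface_hub is Hugging Face's official client (pip install huggingface_hub);
    # imported inside the function so the sketch can be read without it installed
    from huggingface_hub import snapshot_download
    return snapshot_download(repo_id=repo_id)

if __name__ == "__main__" and len(sys.argv) > 1:
    print(download_repo(sys.argv[1]))
```

Note this only fetches files; it is the later from_pretrained call with trust_remote_code=True that actually runs code from the repo, which is where the security caveat comes from.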
Then we used the English README file from GitHub as instructions, and from the right directory (so the checkout lines up with the Hugging Face weights) ran:
git clone https://github.com/THUDM/ChatGLM-6B
Our GPU can only handle the INT8 version of this model, so we had to change one line of cli_demo.py from:
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()
to:
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().quantize(8).cuda()
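The reason for the INT8 change is simple arithmetic: at FP16 every parameter takes 2 bytes, so the weights of a roughly 6-billion-parameter model nearly fill a 12 GB card before any activations, KV cache, or CUDA overhead. A back-of-the-envelope check:

```python
# Back-of-the-envelope VRAM estimate for the weights of a ~6B-parameter model.
params = 6e9                     # roughly 6 billion parameters
fp16_gib = params * 2 / 2**30    # FP16: 2 bytes per parameter
int8_gib = params * 1 / 2**30    # INT8: 1 byte per parameter
print(f"FP16 weights: {fp16_gib:.1f} GiB, INT8 weights: {int8_gib:.1f} GiB")
# → FP16 weights: 11.2 GiB, INT8 weights: 5.6 GiB
```

At INT8 the weights leave several gigabytes of headroom on a 12 GB GPU, which is why quantize(8) makes the model fit.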
Then we needed to install NVIDIA CUDA, so we went to NVIDIA's CUDA downloads page at developer.nvidia.com/cuda-downloads.
We chose "Linux / x86_64 / WSL-Ubuntu" as the OS, since we run Ubuntu under Microsoft's WSL on Windows.
Then we picked "deb (network)" as the installer type and followed the commands it showed. The download and install took more than half an hour.
Then we could run it with:
python3 cli_demo.py
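If you would rather drive the model from your own code than from the bundled demo, the core of it is only a few lines. This is a sketch based on the model card's documented chat API, not a tested program; transformers and a CUDA build of PyTorch are assumed installed, and the imports are kept inside the function so the file can be read without them:

```python
def load_chatglm(model_path: str = "THUDM/chatglm-6b"):
    """Load ChatGLM-6B quantized to INT8, as in our cli_demo.py edit above."""
    # transformers and a CUDA-enabled torch are assumed installed;
    # trust_remote_code=True runs the model's own Python code from the repo
    from transformers import AutoModel, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
    model = AutoModel.from_pretrained(model_path, trust_remote_code=True)
    model = model.half().quantize(8).cuda().eval()
    return tokenizer, model

# Usage sketch (needs the GPU and the downloaded weights):
#   tokenizer, model = load_chatglm()
#   response, history = model.chat(tokenizer, "Hello", history=[])
#   print(response)
```

model.chat returns the reply together with the updated conversation history, which you pass back in on the next turn to keep context.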
It takes a minute to start up, but then it runs fast and does well for a 6B model running at only INT8.
It does make up a lot of stuff, though.