ChatGLM-6B Install

 

The ChatGLM-6B model has impressive results and can run on our current 12 GB GPU, so we decided to download and install it while waiting for our new 48 GB GPU to come on the boat to Anguilla. Note that it was trained on both Chinese and English and is still impressive in English.

To download the model from HuggingFace.co we made an insecure script, hugginface.download.py, that you can download and then use. It is insecure because it will download and run code, so don't use it on a computer with any secrets or important stuff. To use our script to download maybe 12 GB with all the weights etc., do:

 python3 hugginface.download.py THUDM/chatglm-6b
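If you would rather not run our script, here is a minimal sketch of a different approach, not what hugginface.download.py does. It assumes the huggingface_hub package is installed and uses its snapshot_download function, which only fetches the repo files into the local cache and does not execute them. The helper names repo_id_from_argv and download_repo are ours, made up for illustration.

```python
import sys

DEFAULT_REPO = "THUDM/chatglm-6b"


def repo_id_from_argv(argv):
    """Return the repo id given on the command line, or the default."""
    return argv[1] if len(argv) > 1 else DEFAULT_REPO


def download_repo(repo_id):
    """Fetch every file of a Hugging Face model repo into the local cache.

    Returns the local folder that holds the downloaded snapshot.
    """
    # Imported here so this file still loads when huggingface_hub
    # (an assumed dependency) is not installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)


# Example use (this downloads many GB, so it is left commented out):
# print(download_repo(repo_id_from_argv(sys.argv)))
```

Note that loading the model later with trust_remote_code=True still runs the model code that ships in the repo, so the warning about secrets applies at that step either way.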

Then we used the English README file from GitHub as instructions and, from the right directory to line up with the Hugging Face weights, did:

git clone https://github.com/THUDM/ChatGLM-6B

Our GPU can only handle the INT8 version of this model, so we had to change one line of cli_demo.py from:
 

model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().cuda()

To:
model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().quantize(8).cuda()
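A rough back-of-the-envelope check shows why a 12 GB card is too tight for FP16 but fine for INT8. This sketch assumes about 6.2 billion parameters (our approximation for a "6B" model) and counts weights only. Activations, the attention cache, CUDA overhead, and any layers that stay unquantized all come on top, so real usage is higher than these numbers.

```python
# Assumed parameter count; treat it as approximate.
N_PARAMS = 6.2e9

# Bytes needed to store one weight at each precision.
BYTES_PER_PARAM = {"fp16": 2, "int8": 1}


def weight_gib(n_params, bytes_per_param):
    """Size of the weights alone, in GiB (2**30 bytes)."""
    return n_params * bytes_per_param / 2**30


for name, size in BYTES_PER_PARAM.items():
    print(f"{name}: about {weight_gib(N_PARAMS, size):.1f} GiB of weights")
```

At FP16 the weights alone nearly fill a 12 GB card, which leaves no room for anything else, while INT8 needs roughly half of that.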

 

Then we needed to install NVIDIA CUDA, so we went to:

https://developer.nvidia.com/cuda-downloads
 

We picked "Linux/X86-64/WSL-Ubuntu" as the type of OS, as we use Microsoft's WSL for running Ubuntu on Windows.
Then we clicked "deb(network)" as the type of installer and did what it said. This took more than half an hour to download and install.
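Before starting the demo it can help to confirm that the GPU is visible from inside WSL. This is a small sketch that uses only the Python standard library and calls nvidia-smi if it is on the PATH. The function name gpu_summary is ours, and we assume nvidia-smi is provided by the NVIDIA driver rather than by the CUDA toolkit, so it tells you the GPU is visible, not that the toolkit install succeeded.

```python
import shutil
import subprocess


def gpu_summary():
    """Return one line per GPU (name, total memory), or None if unavailable."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # nvidia-smi is not on the PATH, so we cannot query the GPU this way.
        return None
    result = subprocess.run(
        [exe, "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # The tool exists but could not talk to a GPU or driver.
        return None
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


gpus = gpu_summary()
if gpus is None:
    print("No GPU visible through nvidia-smi")
else:
    for gpu in gpus:
        print(gpu)
```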


Then we could run it with:

cd ChatGLM-6B
python3 cli_demo.py 
 
It takes a minute to start up, but then it runs fast and does well for a 6B model at only INT8.
It does make up lots of stuff.
  
 
