“No GPU… Yet I Ran an LLM” — My Local AI Story 🚀

I had a simple computer — no GPU, just a CPU.

Initially, I thought of using free cloud resources and hosting an LLM there.

Then a friend casually said:

“You don’t even need a GPU. You can run an LLM locally.”

At first, I didn’t believe it.

But after doing a little research, I realized — he was completely right.

💡 Realization Moment

Not all LLMs are heavyweight.

Some models are small enough to run on a CPU, and larger ones can be quantized: their weights are stored at lower precision (for example 4 bits instead of 16), which makes them smaller and usually faster to run.
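To get a feel for why quantization matters on a CPU-only machine, here is a rough back-of-the-envelope sketch. It estimates only the size of the weights (parameters times bits per weight). The helper name weights_mb is made up for this post, and real quantized files come out somewhat larger because they also store scaling data next to the weights.

```shell
# Rough size of the model weights alone, in MB (using 1 GB = 1000 MB).
# Runtime overhead is ignored, so treat the result as a lower bound.
# Usage: weights_mb <parameters_in_billions> <bits_per_weight>
weights_mb() {
  # 1 billion parameters at 8 bits (1 byte) each is 1000 MB
  echo $(( $1 * 1000 * $2 / 8 ))
}

echo "7B at 16-bit: $(weights_mb 7 16) MB"
echo "7B at 8-bit:  $(weights_mb 7 8) MB"
echo "7B at 4-bit:  $(weights_mb 7 4) MB"
```

By this estimate a 7B model drops from about 14 GB at 16-bit to about 3.5 GB at 4-bit, which is the difference between not fitting and fitting in the RAM of an ordinary laptop.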

The question then was:

If I’m running an LLM locally, which model will actually be useful for me?

🧠 Model Choice

My focus was on coding — debugging, code generation, and explanations.

So I chose: CodeLlama 7B

Code Llama is Meta’s code-focused model family, built on Llama 2 and further trained on programming data. The 7B variant is the smallest one.

Perfect tool for a developer.

🔧 Why Ollama?

There were many options for local LLMs, but I chose Ollama because it provides a zero-drama setup: install, pull the model, and run.

This time, I decided:

“Let’s run it on Docker — clean, isolated, and professional setup.”

🐳 Running an LLM on Docker (Step-by-Step)

Step 1: Install Docker

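If Docker is not installed yet (these commands assume Ubuntu or Debian):


# Refresh the package index, then install Docker from the distro repository
sudo apt update
sudo apt install docker.io -y
# Start Docker now and make it start on every boot
sudo systemctl start docker
sudo systemctl enable docker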

📸 Screenshot idea:

Terminal showing docker --version ✅

Step 2: Pull the Ollama Docker Image


docker pull ollama/ollama

📸 Screenshot idea:

Docker image pulling progress ✅

Step 3: Run the Ollama Container


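docker run -d \
 --name ollama \
 -v ollama:/root/.ollama \
 -p 11434:11434 \
 ollama/ollama

This means:

Ollama runs in the background (-d)
Downloaded models are stored in the ollama volume, so they survive removing and recreating the container
Port 11434 is published, so the Ollama API is reachable at http://localhost:11434
Your system stays clean (no conflicts)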
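A quick way to confirm the container is actually serving requests is to hit the published port. This is a minimal sketch, assuming curl is installed on the host and the container publishes port 11434 on localhost. check_ollama is just an illustrative helper name, not part of Ollama.

```shell
# Report whether an Ollama server answers at the given base URL.
# Usage: check_ollama [base_url]   (defaults to http://localhost:11434)
check_ollama() {
  url="${1:-http://localhost:11434}"
  # -f: treat HTTP errors as failures, -sS: quiet but still report errors,
  # --max-time: give up after 5 seconds instead of hanging
  if curl -fsS --max-time 5 "$url/" >/dev/null 2>&1; then
    echo "up"
  else
    echo "down"
  fi
}

echo "Ollama at localhost:11434 is $(check_ollama)"
```

If this prints down, docker ps and docker logs ollama are usually the next places to look.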

Step 4: Pull & Run CodeLlama 7B


docker exec -it ollama ollama run codellama:7b

⏳ Note:

The first time you run this, it will take a while — the model is being downloaded.

During this time, grabbing a coffee is completely justified ☕😄

📸 Screenshot idea:

Model download progress and the >>> prompt appearing ✅
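If you would rather download the model first and chat later (handy in scripts), the download and the chat can be separated. A small sketch, assuming the container is named ollama as in Step 3. The helper names and the DOCKER override are my own additions for illustration, while ollama pull and ollama list are standard Ollama commands.

```shell
# Which docker binary to call. Set DOCKER=echo to preview the commands
# instead of running them.
DOCKER="${DOCKER:-docker}"

# Download a model into the running container without opening a chat session.
pull_model() {
  $DOCKER exec ollama ollama pull "$1"
}

# Show the models that are already downloaded in the container.
list_models() {
  $DOCKER exec ollama ollama list
}
```

Calling pull_model codellama:7b and then list_models should show the model once the download has finished.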

Step 5: Talk to the Model 💬

Now you can type your requests directly at the >>> prompt (type /bye to leave the session):

Explain this Python error
Write a REST API in Flask
Optimize this SQL query
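The same prompts can also be sent from a script through the HTTP API on port 11434, without opening the interactive session. Below is a minimal sketch using Ollama's /api/generate endpoint. It assumes curl is installed and the model from Step 4 is already downloaded. The function names are hypothetical, and the payload builder does no JSON escaping, so it only suits simple prompts.

```shell
# Build the JSON body for /api/generate.
# No escaping is done: the prompt must not contain double quotes or backslashes.
# Usage: generate_payload <model> <prompt>
generate_payload() {
  printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

# Send one prompt to codellama:7b and print the raw JSON reply.
# With "stream": false the answer arrives as one object, in its "response" field.
# The generous timeout is there because CPU-only generation can be slow.
ask_ollama() {
  curl -sS --max-time 300 http://localhost:11434/api/generate \
    -d "$(generate_payload codellama:7b "$1")"
}
```

For example, ask_ollama "Write a REST API in Flask" prints a JSON object whose response field holds the generated text.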

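If the model feels slow:

Try a smaller coding model (CodeLlama itself starts at 7B)
Or use a more heavily quantized variant (Ollama's default tags are usually already 4-bit quantized)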

🎯 Final Thoughts

This whole journey taught me one thing:

LLMs are not just for people with big GPUs.

With a little understanding, the right tools, and a smart setup — powerful AI can run on a CPU too.

Whether you’re a developer, a student, or just curious —

local LLM + Docker = future-proof skill 💪

Mutahir Shahzad
