How to Install and Run Ollama in Termux on Android (2026 Guide)

For a long time, running a Large Language Model (LLM) on your phone meant compiling complex C++ code or using highly specialized, locked-down apps. But it is 2026, and you can now run a full-fledged local terminal-based AI experience on Android.

By combining Termux (the powerful Android terminal emulator) and Ollama (the standard for local LLM management), you can run models like Qwen 3.5, Llama 3, and Gemma 3 directly in your pocket. This setup is 100% private, works offline in Airplane Mode, and costs nothing.

In this guide, I will take you step-by-step through installing and configuring Ollama inside Termux, running your first model, and troubleshooting common mobile errors.


Step 1: Prepare the Termux Environment

To ensure a smooth installation, use the Termux build from F-Droid or GitHub. The Google Play Store version is severely outdated and is likely to cause package installation errors.

  1. Download and install Termux from F-Droid.
  2. Open Termux and update your packages to the latest versions by running:
    pkg update && pkg upgrade -y
  3. Install a few basic utilities (curl and git for network downloads, proot for running programs that expect a standard Linux file layout):
    pkg install curl git proot -y
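Before moving on, it can help to confirm that the device is actually a 64-bit ARM phone, since the rest of this guide assumes ARM64. The snippet below is only a sketch: is_arm64 is a helper name made up for this example, and it simply inspects the architecture string that uname -m reports.

```shell
# Hypothetical helper: succeeds only for the 64-bit ARM names that `uname -m` reports.
is_arm64() {
  case "$1" in
    aarch64|arm64) return 0 ;;
    *) return 1 ;;
  esac
}

# Check the current device and print a short verdict.
arch="$(uname -m)"
if is_arm64 "$arch"; then
  echo "OK: $arch is a 64-bit ARM device"
else
  echo "Warning: $arch is not ARM64; the steps below may not apply"
fi
```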

Step 2: Install Ollama in Termux

Ollama supports ARM64, and Termux provides it as a package, so installation takes a single command.

Run the following command to download and install the package:

pkg install ollama -y

⚠️ Permission Warning: Keep your Ollama working directory and model storage inside the Termux home directory (~). If you try to save or run models from shared storage (such as /sdcard), Android’s file system restrictions will trigger Permission Denied errors.
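As a rough way to catch this mistake early, you could check where models will be stored before downloading anything. This is a sketch under a few assumptions: on_shared_storage is a hypothetical helper, and the default location is assumed to be ~/.ollama/models unless the OLLAMA_MODELS environment variable points somewhere else.

```shell
# Hypothetical helper: succeeds if a path lies on Android shared storage.
on_shared_storage() {
  case "$1" in
    /sdcard|/sdcard/*|/storage/emulated/*) return 0 ;;
    *) return 1 ;;
  esac
}

# OLLAMA_MODELS overrides the model directory; otherwise assume ~/.ollama/models.
models_dir="${OLLAMA_MODELS:-$HOME/.ollama/models}"
if on_shared_storage "$models_dir"; then
  echo "Move models out of shared storage: $models_dir"
else
  echo "Model directory looks fine: $models_dir"
fi
```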


Step 3: Run the Ollama Server

Unlike a desktop machine where Ollama starts automatically in the background, you must launch the Ollama server manually in Termux.

  1. Start the Ollama background daemon:

    ollama serve &

    The & symbol pushes the process to the background, allowing you to keep using the same terminal session. Server log messages will still be printed to that session.

  2. Verify that the server is active by running:

    ollama list

    You should see only the table header with no models listed, indicating the server is running and ready to download models.
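If ollama list is run too soon after starting the server, it can fail simply because the server has not finished starting. One possible way to handle that is a small retry loop. The sketch below is an illustration only: wait_until is a made-up helper, and the ollama.log file name in the usage comment is an assumption, chosen to keep server logs out of the terminal.

```shell
# Hypothetical helper: retry a command up to N times, one second apart.
wait_until() {
  tries="$1"
  shift
  while [ "$tries" -gt 0 ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    tries=$((tries - 1))
    if [ "$tries" -gt 0 ]; then sleep 1; fi
  done
  return 1
}

# Usage in Termux (commented out so the helper can be loaded on its own):
#   ollama serve > "$HOME/ollama.log" 2>&1 &
#   wait_until 15 ollama list && echo "Ollama is ready"
```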


Step 4: Download and Chat with a Model

For mobile devices, you should stick to lightweight models designed for edge hardware. Models under 3 billion parameters (3B) generally offer a good balance between token generation speed and battery drain.

I highly recommend starting with the Qwen 3.5 0.8B model, which performs exceptionally well for its size.

To download and immediately start chatting with the model, run:

ollama run qwen3.5:0.8b

Once the download completes, you will be dropped into an interactive chat prompt. Type your question and press Enter. To exit the chat at any time, type /bye or /exit.
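Besides the interactive prompt, the running server also answers HTTP requests on port 11434, which is handy for scripts. The sketch below builds the JSON body for the /api/generate endpoint (fields model, prompt, and stream). The helper names json_escape and build_generate_payload are invented for this example, and the escaping only handles backslashes and double quotes, so it assumes a single-line prompt.

```shell
# Hypothetical helper: escape backslashes and double quotes for use inside a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Hypothetical helper: build a non-streaming /api/generate request body.
build_generate_payload() {
  printf '{"model":"%s","prompt":"%s","stream":false}' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# Usage in Termux (commented out; needs the server from Step 3 running):
#   curl -s http://127.0.0.1:11434/api/generate \
#     -d "$(build_generate_payload qwen3.5:0.8b 'Why is the sky blue?')"
```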

Model Name    | Size   | Best Use Case               | Expected Speed
Qwen 3.5 0.8B | ~1GB   | General Q&A / Fast Chat     | 10–18 t/sec
LFM 2.5 1.2B  | ~731MB | Thinking, for on-device use | 8–14 t/sec
Gemma 4 E2B   | ~7GB   | Multi-Modal                 | 6–10 t/sec
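To get a feel for what these speeds mean in practice, here is a small back-of-the-envelope calculation. It is only a sketch: seconds_for_tokens is a hypothetical helper, the 200-token answer length is an arbitrary example, and real speeds vary by device.

```shell
# Hypothetical helper: whole seconds (rounded up) to generate N tokens at a given rate.
seconds_for_tokens() {
  tokens="$1"
  rate="$2"
  echo $(( (tokens + rate - 1) / rate ))
}

# A 200-token answer at the slow and fast ends of the 10-18 t/sec range.
echo "10 t/sec: $(seconds_for_tokens 200 10) s"   # prints "10 t/sec: 20 s"
echo "18 t/sec: $(seconds_for_tokens 200 18) s"   # prints "18 t/sec: 12 s"
```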

You can find more models over at https://ollama.com/library


Connecting a Graphical Interface (Web UI)

If you don’t want to use the command line for everyday chatting, you can connect a graphical client.

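Since your Ollama server in Termux listens on http://127.0.0.1:11434, you can install standard mobile clients like PocketPal (Android) on the phone itself. That address is only reachable from the phone, so to point your homelab’s Open-WebUI client at the phone, start the server with OLLAMA_HOST=0.0.0.0 so it accepts connections from the local network, then use your phone’s IP address with port 11434.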
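A quick way to test the connection from another machine is to request the model list over HTTP. The sketch below makes several assumptions: ollama_url is a made-up helper, 192.168.1.50 is a placeholder for your phone's real IP address, and the server has been started with the OLLAMA_HOST environment variable set so it listens beyond loopback. Keep in mind that this exposes the server to every device on the same network.

```shell
# Hypothetical helper: build an Ollama API URL from a host and an API path.
ollama_url() {
  printf 'http://%s:11434%s\n' "$1" "$2"
}

# On the phone (commented out here): listen on all interfaces instead of loopback only.
#   OLLAMA_HOST=0.0.0.0 ollama serve &
# On another device on the same network, list the phone's models.
# 192.168.1.50 is a placeholder address, not a real default.
#   curl -s "$(ollama_url 192.168.1.50 /api/tags)"
ollama_url 127.0.0.1 /api/tags   # prints "http://127.0.0.1:11434/api/tags"
```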

You can read more about different ways to run LLMs locally on a phone in my blog post “Top 4 ways to Run LLM locally on Android and iOS”, and about repurposing older hardware in “Running 24/7 Local AI on an Old Android without Overheating”.


Frequently Asked Questions (FAQ)

Can I run a local LLM in Termux on Android?

Yes. By installing the ollama package with pkg install ollama in Termux, you can host a fully local, offline Ollama server directly on an ARM64 Android device. This allows you to run quantized GGUF models privately with zero cloud fees.

How do I resolve permission denied or build errors in Termux?

To avoid permission errors, ensure that your models and configurations are located in Termux’s isolated home directory (~ or /data/data/com.termux/files/home). Avoid external directories like /sdcard or /storage/emulated/0, as Android blocks execution permissions on shared storage.

What is the best lightweight local LLM for Termux in 2026?

The best lightweight model for mobile is Qwen 3.5 0.8B. It requires only about 1.2GB of RAM, downloads in minutes, and generates 10–18 tokens per second even on older devices while retaining strong logical comprehension.

Do I need internet access to use Ollama in Termux?

Internet access is only required for the initial setup (installing the packages) and the first time you pull a model with ollama run or ollama pull. Once the model is downloaded, you can enable Airplane Mode and chat with the local AI completely offline.
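Before switching to Airplane Mode, you may want to confirm that the model is really stored on the device. The sketch below reads the output of ollama list and looks for the model name in the first column. has_model is a hypothetical helper, and it assumes the output is a header row followed by one row per model.

```shell
# Hypothetical helper: reads `ollama list` output on stdin and succeeds
# if the first column of any row after the header matches the given model name.
has_model() {
  awk -v want="$1" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
}

# Usage in Termux (commented out; needs the server from Step 3 running):
#   ollama list | has_model qwen3.5:0.8b && echo "Safe to go offline"
```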
