Downshift Blog

A blog about travel, technology, and life at a slower pace. The journey is as important as the destination.

Run AI Offline on Your Phone: HOWTO Guide for Mistral, Llama2 and Gemma LLMs on Xiaomi 13 Ultra

(Reading time: 4 - 8 minutes)
Thursday, 22 February 2024 16:00

High 5!

Welcome to this step-by-step guide on how to run AI locally and offline on your Xiaomi 13 Ultra.

This method is applicable to both rooted and non-rooted smartphones, not just the Xiaomi 13 Ultra.

Run AI on your phone - hardware limitations

To run the bigger (and smarter) LLMs, for example models with 7 billion parameters, you need 12 GB of RAM for smooth operation.

On my Xiaomi 13 Ultra I have 12 GB RAM and the 7B LLMs are working perfectly.

I have tested the procedure below on my old Redmi Note 10 Pro, which has only 6 GB of RAM, and the 7B-parameter models are NOT usable there ... they do run, but EXTREMELY SLOWLY, because of the low RAM.

For such a low-RAM smartphone I recommend testing a smaller model like Gemma 2B.
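If you're not sure how much RAM your phone actually reports to Linux, you can check from the console once you're inside Termux or Ubuntu. This is a minimal sketch that reads the standard /proc/meminfo file; the 12 GB threshold is just the rule of thumb from this guide, not a hard requirement:

```shell
# Read total RAM in kB from /proc/meminfo and convert to whole GB.
MEM_KB=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
MEM_GB=$((MEM_KB / 1024 / 1024))
echo "Total RAM: ${MEM_GB} GB"

# Rough rule of thumb from this guide: 7B models want ~12 GB.
if [ "$MEM_GB" -ge 12 ]; then
  echo "7B models (Mistral, Llama 2 7B) should run smoothly."
else
  echo "Consider a smaller model such as gemma:2b."
fi
```

Note that the value reported here is usually a little lower than the marketing number, since some RAM is reserved by the hardware.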

I have tested it on the Redmi Note 10 Pro and it works, as you can see in the second video below.

In this guide, I'll provide detailed instructions on how to run advanced Large Language Models (LLMs) such as Llama 2 7B, Gemma 7B, Mistral 7.3B, and many others locally and completely offline on your smartphone.

These instructions can be used to run any open-source LLM AI currently available.

For this demonstration, we'll use Termux, an amazing app available for both rooted and non-rooted Android users, as our foundation.

Once Termux is installed on the Xiaomi 13 Ultra, we'll install and run Ubuntu 20.04.

Following that, we'll install Ollama, a freely available open-source software package.

With this setup, you can run any advanced LLM with up to 7 billion parameters (or smaller) on your phone, such as:

LLMs for your phone available on Ollama

Mistral 7.3B: An LLM distributed under the Apache license, which has outperformed Llama 2 13B in all benchmarks.

Llama 2: An open-source LLM with certain restrictions, released by Meta Platforms, Inc.

Llama 2 Uncensored: An uncensored Llama 2 model by George Sung and Jarrad Hope.

Gemma 2B/7B: A new open model developed by Google and its DeepMind team, also open-source but with some restrictions.

LLaVA: The Large Language and Vision Assistant.

For a complete list of all available LLMs in the Ollama project, visit the Ollama model library.

AI Installation on your smartphone Step by Step:

Install Termux from GitHub or F-Droid

Install Ubuntu 20.04 in Termux following the step-by-step Termux-Ubuntu guide.

In Termux, execute the following command to install a console-only version of Ubuntu 20.04:

NOTE: For all commands below, the easiest way is to open this HOWTO in the browser on your phone, copy the commands from the HOWTO, and paste them into the console.

pkg install wget curl proot tar -y && wget <installer-script URL> && chmod +x <installer script> && bash <installer script>

During the Ubuntu installation you have to answer the questions on the console with Y a few times.

The process on the console is fully automated and takes a few minutes.

Just before the end you have to enter a username and password for your user, and you also have to enter a root password.

After each input hit Enter on the keyboard to continue.

After the Ubuntu 20.04 installation finishes, you will be logged in to Ubuntu automatically.

The NEXT time, in order to start the Ubuntu OS, you have to run this command on the Termux console:


After that, the first time you are in Ubuntu, run these commands on the console:

sudo apt update

sudo apt upgrade

sudo apt install curl

Answer Y to all questions on the console.

At the time of writing this HOWTO, you can install Ollama by executing this command:

curl -fsSL <Ollama install-script URL> | sh

It's highly advisable to visit the Ollama website and follow the Linux installation instructions there.
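Before moving on, it's worth confirming the install actually succeeded. A small sketch: check that the ollama binary landed on your PATH and print its version if so.

```shell
# Verify the ollama binary is installed and on the PATH.
if command -v ollama >/dev/null 2>&1; then
  echo "ollama installed: $(ollama --version)"
  OLLAMA_INSTALLED=yes
else
  echo "ollama not found - re-run the install script"
  OLLAMA_INSTALLED=no
fi
```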

Running the Ollama Service and LLMs. Start the Ollama service with this command:

ollama serve > /dev/null 2>&1 &
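Since the command above hides all output, you may want a quick way to confirm the service is actually up. The Ollama server listens on port 11434 by default, so a guarded check could look like this (a sketch; it only probes the default local port):

```shell
# Check whether the local Ollama server is answering on its default port.
if curl -fs --max-time 3 http://127.0.0.1:11434/ >/dev/null 2>&1; then
  OLLAMA_UP=yes
  echo "Ollama server is up."
else
  OLLAMA_UP=no
  echo "Ollama server is not responding - start it with: ollama serve > /dev/null 2>&1 &"
fi
```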

Download on your smartphone and run the desired LLM.

For this guide, we'll install and test two different LLMs:

Mistral 7.3B: a 7.3-billion-parameter LLM.

Ensure you have 4.1 GB of free space on your phone's storage.
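If you want to check your free space from the console before starting the download, something like this works (a minimal sketch; it checks the filesystem that holds your home directory, where Ollama stores its models by default):

```shell
# Report free space (in whole GB) on the filesystem holding $HOME.
FREE_GB=$(df -Pk "$HOME" | awk 'NR==2 {print int($4 / 1024 / 1024)}')
echo "Free space: ${FREE_GB} GB"

if [ "$FREE_GB" -ge 5 ]; then
  echo "Enough room for the ~4.1 GB Mistral download."
else
  echo "Free up some space before pulling the model."
fi
```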

The first execution of the following command downloads the LLM.

Subsequent executions run the already downloaded LLM:

ollama run mistral:instruct

The download time depends on your internet connection speed.

Once downloaded, the command automatically starts the LLM, presenting a console prompt for your queries.

After the initial LLM download, you will see something like this on the terminal:

pulling e6836092461f... 100% ▕▏ 42 B

pulling ed11eda7790d... 100% ▕▏ 30 B

pulling f9b1e3196ecf... 100% ▕▏ 483 B

verifying sha256 digest

writing manifest

removing any unused layers


>>>

This is the prompt where you can enter your questions to the AI LLM.
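Besides the interactive prompt, ollama run also accepts a one-shot question as a command-line argument, which is handy for scripting. A guarded sketch (it assumes the model from above is already downloaded, and skips gracefully otherwise):

```shell
# Ask a single question non-interactively; ollama prints the answer and exits.
if command -v ollama >/dev/null 2>&1; then
  ANSWER=$(ollama run mistral:instruct "In one sentence, what is Termux?")
else
  ANSWER="(ollama not installed - skipping)"
fi
echo "$ANSWER"
```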

Ollama prompt options

Ending a session with the chatbot is as simple as typing /bye at the prompt.

For a full list of available prompt parameters, type /help.

To completely close the Ollama session, enter /bye,

and then exit your Ubuntu session in Termux with:


Test the Gemma 2B LLM AI on your phone

With the command below you can install the much smaller and faster Gemma LLM with 2 billion parameters, released as an open model by Google:

ollama run gemma:2b

This LLM is three times smaller and runs much faster, making it suitable for older smartphones.
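Ollama also exposes a small HTTP API on its local port, so you can query any downloaded model from scripts instead of the interactive prompt. A hedged sketch: the /api/generate endpoint and JSON fields below follow Ollama's published REST API, but check the documentation for your installed version.

```shell
# Query gemma:2b through Ollama's local REST API (the server must be running).
if curl -fs --max-time 3 http://127.0.0.1:11434/ >/dev/null 2>&1; then
  RESPONSE=$(curl -s http://127.0.0.1:11434/api/generate \
    -d '{"model": "gemma:2b", "prompt": "Say hello in five words.", "stream": false}')
else
  RESPONSE="(Ollama server not running - skipping)"
fi
echo "$RESPONSE"
```

With "stream": false the server returns one JSON object whose "response" field holds the model's full answer.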

That's all, folks!

You're now equipped to explore the vast universe of LLMs right from your smartphone.

Happy experimenting, and may your queries always lead to enlightening answers.

Good luck!


P.S.1 My personal choice for Ollama LLMs to test:

1. Mistral 7.3B: ollama run mistral

At the time of writing this HOWTO, Mistral is probably the best 7B LLM that you can install and run on your phone with no problems!

2. Llama 2 7B: ollama run llama2:7b

Very good high quality LLM!

3. Llama 2 7B UNCENSORED :) Run: ollama run llama2-uncensored

The uncensored Llama 2 7B will answer any question, like "How to make Tylenol" ... and so on :)

4. Gemma 2B: ollama run gemma:2b - very small and super FAST LLM.

Unfortunately, the bigger 7B Gemma LLM available on the day of its release was a JOKE ... it answered me in some strange mix of languages :(

I hope the developers on the Google DeepMind team will soon release a fixed, smoothly working version of the Gemma 7B LLM.

P.S.2 Yoda the Hacker wisdom by Mistral 7B LLM AI 

To wrap up this AI HOWTO, I would like to share with you some Yoda-the-Hacker wisdom given to me by the Mistral 7B LLM installed on my Xiaomi 13 Ultra :)

Lupo Prompt: Tell me some Yoda the hacker wisdom

Mistral 7B LLM: Yoda, the wise Jedi Master from Star Wars, might not have been a hacker in the modern sense, but he certainly had valuable wisdom to share regarding efficiency, problem-solving, and adaptation.

Here are some Yoda-inspired "hacker" quotes:

1. "Luminous beings are we, not this crude matter." (Embrace the intangible nature of code and data.)

2. "Do or do not. There is no try." (Stay focused and persistent in your hacking endeavors.)

3. "Patience you must have, my young padawan." (Be patient while debugging issues and learning new skills.)

4. "Fear is the path to the dark side. Fear leads to anger. Anger leads to hate. Hate leads to suffering." (Avoid fear in your hacking journey and maintain a positive mindset.)

5. "Always in motion is the future." (Adapt and evolve with the ever-changing landscape of technology and hacking.)

6. "Much to learn, you still have." (Stay curious and continue learning new skills and techniques.)

7. "Size matters not." (Regardless of your age, experience, or resources, you can make a difference in the world of hacking.)

8. "Hmmm… A good question. You must ask yourself what does it mean to truly 'hack'?" (Reflect on the ethical implications and motivations behind hacking.)

Lupo :)