High 5!
Welcome to this step-by-step guide on how to run AI locally and offline on your Xiaomi 13 Ultra.
This method is applicable to both rooted and non-rooted smartphones, not just the Xiaomi 13 Ultra.
Run AI on your phone - hardware limitations
To run the bigger (and smarter) LLMs with 7 billion parameters, for example, you need 12 GB of RAM for smooth operation.
My Xiaomi 13 Ultra has 12 GB of RAM and the 7B LLMs work perfectly on it.
I have tested the procedure below on my old Redmi Note 10 Pro, which has only 6 GB of RAM, and the 7B parameter models are NOT usable ... actually they do run, but EXTREMELY SLOWLY.
Yes, on the Redmi Note 10 Pro everything runs too slowly because of the low RAM.
For such a low-RAM smartphone I recommend testing a smaller model like Gemma 2B.
I have tested it on the Redmi Note 10 Pro and it works, as you can see in the second video below.
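If you are not sure how much RAM your phone has, you can check it straight from the Termux console before going any further; this is just a quick sketch that reads the kernel's memory info:
# total installed RAM as reported by the kernel, in kilobytes
grep MemTotal /proc/meminfo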
Ollama LLM Mistral:7B and Llama2:7b run on Xiaomi 13 Ultra
In this guide, I'll provide detailed instructions on how to run advanced Large Language Models (LLMs) such as Llama2:7B, GEMMA:7B, Mistral 7.3B, and many others locally and completely offline on your smartphone.
These instructions can be used to run any open-source LLM AI currently available.
For this demonstration, we'll use Termux, an amazing app available for both rooted and non-rooted Android users, as our foundation.
Once Termux is installed on the Xiaomi 13 Ultra, we'll install and run Ubuntu 20.04.
Following that, we'll install Ollama, a freely available open-source software, from Ollama.com.
With this setup, you can run any advanced LLM with up to 7 billion parameters on your phone, such as:
LLMs for your phone available on Ollama
Mistral 7.3B: An LLM distributed under the Apache license, which has outperformed Llama 2 13B in all benchmarks.
Llama 2: An open-source LLM with certain restrictions, released by Meta Platforms, Inc.
Llama 2 Uncensored: An uncensored Llama 2 model by George Sung and Jarrad Hope.
Gemma 2B/7B: A new open model developed by Google and its DeepMind team, also open-source but with some restrictions.
LLaVA: The Large Language and Vision Assistant.
For a complete list of all available LLMs in the Ollama project, visit Ollama.com/library.
AI Installation on your smartphone Step by Step:
Install Termux from GitHub or F-Droid
Install Ubuntu 20.04 on Termux using this step-by-step guide: Termux-Ubuntu Guide:
https://github.com/tuanpham-dev/termux-ubuntu
In Termux, execute the following command to install a console-only version of Ubuntu 20.04:
NOTE: For all the commands below, the easiest way is to open this HOWTO in the browser on your phone, copy the commands from the HOWTO, and paste them into the console.
pkg install wget curl proot tar -y && wget https://raw.githubusercontent.com/tuanpham-dev/termux-ubuntu/master/ubuntu.sh && chmod +x ubuntu.sh && bash ubuntu.sh
During the Ubuntu installation you have to answer Y a few times to the questions on the console.
The process on the console is fully automated and will take a few minutes.
Just before the end you have to create a user name and password for your user, and you also have to enter a root password.
After each input hit Enter on the keyboard to continue.
When the Ubuntu 20.04 installation finishes, you will be logged in to Ubuntu automatically.
The NEXT time you want to start Ubuntu, run this command in the Termux console:
./start-ubuntu20.sh
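If you do not want to type this every time, one option is to start Ubuntu automatically whenever Termux opens; this is just a sketch, assuming Termux's default bash shell and that the script lives in your Termux home directory as start-ubuntu20.sh:
# append the start command to Termux's ~/.bashrc so Ubuntu launches with every new Termux session
echo './start-ubuntu20.sh' >> ~/.bashrc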
After that, the first time only, run these commands in the Ubuntu console:
sudo apt update
sudo apt upgrade
sudo apt install curl
Answer Y to all questions on the console.
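If you want to confirm that you are really inside the Ubuntu 20.04 environment (and not still in plain Termux), you can print the release information:
# shows the distribution name and version of the environment you are currently in
cat /etc/os-release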
At the time of writing this HOWTO, you can install Ollama by executing this command:
curl -fsSL https://ollama.com/install.sh | sh
It's highly advisable to visit the Ollama.com website and follow the current Linux installation instructions there.
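Once the installer finishes, a quick sanity check (assuming the install script put the ollama binary on your PATH) is:
# print the installed Ollama version to confirm the binary is available
ollama --version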
Running the Ollama Service and LLMs
Start the Ollama service with this command:
ollama serve > /dev/null 2>&1 &
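To verify that the service is up before pulling any models, you can query its local HTTP endpoint; this is a minimal check that assumes Ollama is listening on its default port 11434:
# the server answers with "Ollama is running" when the service is up
curl http://127.0.0.1:11434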
Download and run the desired LLM on your smartphone.
For this guide, we'll install and test two different LLMs:
Mistral 7.3B - a 7.3 billion parameter LLM:
Ensure you have 4.1 GB of space on your memory card.
The first execution of the following command downloads the LLM.
Subsequent executions run the already downloaded LLM:
ollama run mistral:instruct
The download time depends on your internet connection speed.
Once downloaded, the command automatically starts the LLM, presenting a console prompt for your queries.
After the initial LLM download you will see something like this in the terminal:
pulling e6836092461f... 100% ▕▏ 42 B
pulling ed11eda7790d... 100% ▕▏ 30 B
pulling f9b1e3196ecf... 100% ▕▏ 483 B
verifying sha256 digest
writing manifest
removing any unused layers
success
This is the prompt where you can enter your questions to the AI LLM.
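Besides the interactive chat, you can also ask a single question straight from the shell; the model prints its answer and exits. This is just a small sketch, assuming the mistral:instruct model from above is already downloaded:
# one-shot, non-interactive prompt: the answer is printed to the terminal
ollama run mistral:instruct "Explain in one sentence what a large language model is."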
Ollama prompt options
Ending a session with the chatbot is as simple as typing /bye at the prompt.
For a full list of available prompt parameters, type /help.
To completely close the Ollama session, enter /bye,
and then exit your Ubuntu session in Termux with:
exit
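For later sessions, the whole start-up sequence is just a recap of the commands already shown above: the first one is typed in the Termux console and drops you into Ubuntu, and the other two are typed inside Ubuntu:
# in Termux: enter the Ubuntu 20.04 environment
./start-ubuntu20.sh
# inside Ubuntu: start the Ollama service in the background
ollama serve > /dev/null 2>&1 &
# inside Ubuntu: open the chat prompt with the already-downloaded model
ollama run mistral:instruct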
Ollama LLM Mistral:7B and Gemma:2b run on Redmi Note 10 Pro 6GB RAM
Test the Gemma 2B LLM AI on your phone
With the command below you can install the much smaller and faster Gemma LLM with 2 billion parameters, released as an open-source project by Google:
ollama run gemma:2b
This LLM is three times smaller and runs much faster, making it suitable for older smartphones.
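To see which models are already downloaded on your phone and how much storage each one takes, Ollama has a list command:
# list downloaded models with their sizes
ollama list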
That's all, folks!
You're now equipped to explore the vast universe of LLMs right from your smartphone.
Happy experimenting, and may your queries always lead to enlightening answers.
Good luck!
Lupo
P.S.1 My personal choice for Ollama LLMs to test:
1. Mistral 7.3B: ollama run mistral
At the time of writing this HOWTO, Mistral is probably the best 7B LLM that you can install and run on your phone with no problems!
2. Llama 2 7B: ollama run llama2:7b
Very good high quality LLM!
3. Llama 2 7B UNCENSORED :) Run: ollama run llama2-uncensored
The uncensored Llama 2 7B will answer almost any question, like "How to make Tylenol", etc. :)
4. Gemma 2B: ollama run gemma:2b - very small and super FAST LLM.
Unfortunately, the bigger Gemma 7B LLM that was available immediately on release day was a JOKE ... it answered me in some strange mix of languages :(
I hope the Google DeepMind team will soon release a fixed, smoothly working version of the Gemma 7B LLM AI. If you have already pulled the broken 7B version, you can free up the space as shown below.
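Removing a model you no longer want is a single command; here it is for the Gemma 7B tag as an example (it assumes you actually pulled gemma:7b earlier):
# delete the downloaded model files and free the storage they used
ollama rm gemma:7b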
P.S.2 Yoda the Hacker wisdom by Mistral 7B LLM AI
To finish this AI HOWTO, I would like to share some Yoda the Hacker wisdom that the Mistral 7B LLM AI installed on my Xiaomi 13 Ultra shared with me :)
Lupo Prompt: Tell me some Yoda the hacker wisdom
Mistral 7B LLM: Yoda, the wise Jedi Master from Star Wars, might not have been a hacker in the modern sense, but he certainly had valuable wisdom to share regarding efficiency, problem-solving, and adaptation.
Here are some Yoda-inspired "hacker" quotes:
1. "Luminous beings are we, not this crude matter." (Embrace the intangible nature of code and data.)
2. "Do or do not. There is no try." (Stay focused and persistent in your hacking endeavors.)
3. "Patience you must have, my young padawan." (Be patient while debugging issues and learning new skills.)
4. "Fear is the path to the dark side. Fear leads to anger. Anger leads to hate. Hate leads to suffering." (Avoid fear in your hacking journey and maintain a positive mindset.)
5. "Always in motion is the future." (Adapt and evolve with the ever-changing landscape of technology and hacking.)
6. "Much to learn, you still have." (Stay curious and continue learning new skills and techniques.)
7. "Size matters not." (Regardless of your age, experience, or resources, you can make a difference in the world of hacking.)
8. "Hmmm… A good question. You must ask yourself what does it mean to truly 'hack'?" (Reflect on the ethical implications and motivations behind hacking.)
Lupo :)