Why Choose Local AI Mode?
Cloud AI is convenient, but it means every conversation you have with FEAIA passes through a third-party server. For a desktop companion, that's a meaningful privacy concern — you might share your emotions, personal challenges, and daily thoughts in those conversations.
The case for local mode:
- Complete privacy: conversation data never leaves your machine
- No network dependency: works offline, anywhere
- Low latency: no network round-trip, and on capable hardware first-token responses can arrive in roughly 100–300ms
- No API costs: no OpenAI account or subscription required
System Requirements
| Requirement | Minimum | Recommended |
|---|---|---|
| OS | Windows 10 64-bit | Windows 11 |
| RAM | 8GB | 16GB+ |
| GPU (optional) | — | NVIDIA RTX series (accelerated inference) |
| Disk Space | 10GB (model files) | 40GB+ |
FEAIA itself uses less than 150MB of RAM — most memory consumption comes from the language model running in Ollama.
Step 1: Install Ollama
- Go to ollama.com and download the Windows installer (~60MB)
- Run the installer with default settings
- Ollama starts automatically in the background (look for the tray icon)
Verify the installation:
Open Command Prompt (Win + R, type cmd) and run:
ollama --version
If you see a version number like ollama version 0.3.x, you're good to go.
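If you prefer to check from code, Ollama also exposes its version over HTTP on its default port. A minimal sketch using only the standard library (this is the same service FEAIA will connect to in Step 3):

```python
import json
import urllib.request
import urllib.error

def check_ollama(base_url: str = "http://localhost:11434", timeout: float = 2.0):
    """Return the Ollama version string if the service is reachable, else None.

    Queries Ollama's GET /api/version endpoint; a connection error means
    the background service isn't running (or the URL is wrong).
    """
    try:
        with urllib.request.urlopen(f"{base_url}/api/version", timeout=timeout) as resp:
            return json.load(resp).get("version")
    except (urllib.error.URLError, OSError):
        return None

if __name__ == "__main__":
    version = check_ollama()
    print(f"Ollama version: {version}" if version else "Ollama service not reachable")
```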
Step 2: Download an AI Model
Ollama supports many open-source models. Here's the FEAIA recommended lineup:
Which model should I choose?
| Model | Size | Best For | Rating |
|---|---|---|---|
| qwen2:7b | 4.4GB | Chinese conversation, natural dialogue | ⭐⭐⭐⭐⭐ |
| llama3:8b | 4.7GB | English conversation, general use | ⭐⭐⭐⭐⭐ |
| mistral:7b | 4.1GB | English, strong reasoning | ⭐⭐⭐⭐ |
| llama3:70b | 40GB | Flagship experience (requires high-end GPU) | ⭐⭐⭐ |
In Command Prompt, run (using llama3:8b as an example):
ollama pull llama3:8b
When the download completes, you'll see a success message. Download time depends on your connection speed, typically 5–30 minutes.
Step 3: Connect Ollama to FEAIA
- Open FEAIA and click the ⚙ Settings icon in the top right
- Go to the AI Engine tab
- In the "AI Provider" dropdown, select Ollama (Local)
- The service address defaults to http://localhost:11434; leave it as is
- In "Model Name", enter the model you downloaded (e.g. llama3:8b)
- Click Test Connection; if you see ✅ Connected, you're all set
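Under the hood, a connection test of this kind typically asks Ollama which models are installed via GET /api/tags (exactly what FEAIA's Test Connection does is internal to the app, so treat this as an illustration). A small sketch of parsing that endpoint's response:

```python
import json

def installed_models(tags_json: str) -> list:
    """Extract model names from the JSON returned by Ollama's GET /api/tags.

    If the configured model name appears in this list, the
    Ollama-side half of the setup is working.
    """
    return [m["name"] for m in json.loads(tags_json).get("models", [])]

# Abbreviated example of the /api/tags response shape:
sample = '{"models": [{"name": "llama3:8b"}, {"name": "qwen2:7b"}]}'
print(installed_models(sample))  # ['llama3:8b', 'qwen2:7b']
```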
FAQ
Q: Connection test fails with "Unable to connect to Ollama service" — what do I do?
Make sure Ollama is running in the background. Open Task Manager and look for the ollama.exe process. If it's not there, restart Ollama from the Start menu.
Q: The model is responding very slowly. What can I do?
Without a dedicated GPU, the model runs on CPU (approximately 5–15 tokens/second). An NVIDIA RTX-series GPU enables GPU acceleration and typically delivers an 8–20x speed improvement.
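To put those numbers in perspective, here is the arithmetic for a typical chat reply (the 150-token reply length and the 10x GPU figure are illustrative assumptions; 10x sits inside the 8–20x range above):

```python
def reply_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time to generate a reply at a given decode speed."""
    return tokens / tokens_per_second

# A typical 150-token chat reply:
for label, tps in [("CPU, low end", 5), ("CPU, high end", 15), ("RTX GPU (~10x CPU)", 100)]:
    print(f"{label}: {reply_seconds(150, tps):.1f}s")
# CPU at 5 tok/s takes 30s; a GPU brings the same reply under 2s.
```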
Q: Can I switch between cloud and local mode?
Yes. FEAIA supports saving multiple AI configuration profiles — switch between them anytime. For example: use local llama3:8b for everyday conversation, switch to GPT-4 for long-form writing assistance.
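Conceptually, a profile is just a named bundle of provider, endpoint, and model. FEAIA's actual profile format is internal; this sketch (with hypothetical field names) only illustrates the switching idea:

```python
from dataclasses import dataclass

@dataclass
class AIProfile:
    """Hypothetical shape of an AI configuration profile (not FEAIA's real format)."""
    name: str
    provider: str
    base_url: str
    model: str

profiles = {
    "daily":   AIProfile("daily", "ollama", "http://localhost:11434", "llama3:8b"),
    "writing": AIProfile("writing", "openai", "https://api.openai.com/v1", "gpt-4"),
}

active = profiles["daily"]
print(active.model)   # llama3:8b
active = profiles["writing"]
print(active.model)   # gpt-4
```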
Q: Which FEAIA plan is required for local AI mode?
Local AI mode is available on all plans, including the free tier. Privacy should never be a paid feature — that's a core design principle for FEAIA.
What to Try Next
- Enable long-term memory in Memory Settings (Pro and above)
- Browse 50+ Live2D skins to give your local AI companion a unique look
- Join the FEAIA Community to share your model configuration setups
For any issues during setup, visit the FEAIA Help Center or ask in the community.
