Advanced: Ollama Setup¶
Ollama allows you to run large language models locally on your computer, providing completely private, offline AI inference. This guide covers installing Ollama and configuring it for use with MedicalFactChecker.
Who Is This For?
This guide is for advanced users who want to:
- Run AI models locally without cloud API costs
- Keep all data on their own hardware
- Use MedicalFactChecker in air-gapped environments
- Access Ollama from mobile devices on their network
Part 1: Installing Ollama¶
macOS Installation¶
- Download Ollama
    - Visit ollama.com/download
    - Click Download for macOS
    - Requires macOS 14 Sonoma or later
- Install the Application
    - Open the downloaded Ollama.dmg
    - Drag Ollama to your Applications folder
    - Launch Ollama from Applications
- Verify Installation. Open Terminal and run:
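ollama --version

If the installation succeeded, this prints the installed version number.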

Windows Installation¶
- Download Ollama
    - Visit ollama.com/download/windows
    - Download the Windows installer
- Run the Installer
    - Double-click the downloaded .exe file
    - Follow the installation wizard
    - Ollama will start automatically
- Verify Installation. Open Command Prompt or PowerShell:
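ollama --version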
Linux Installation¶
Run the installation script:
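curl -fsSL https://ollama.com/install.sh | sh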
Or install manually:
# Download the binary
curl -L https://ollama.com/download/ollama-linux-amd64 -o ollama
chmod +x ollama
sudo mv ollama /usr/local/bin/
# Start the service
ollama serve
Part 2: Downloading Models¶
Recommended Models for Medical Fact-Checking¶
| Model | Size | RAM Required | Quality | Speed |
|---|---|---|---|---|
| llama3.1:70b | 40GB | 48GB+ | Excellent | Slow |
| llama3.1:8b | 4.7GB | 8GB+ | Good | Fast |
| mixtral:8x7b | 26GB | 32GB+ | Excellent | Medium |
| mistral:7b | 4.1GB | 8GB+ | Good | Fast |
| qwen2.5:32b | 18GB | 24GB+ | Very Good | Medium |
Downloading a Model¶
# Download the recommended model (requires 48GB+ RAM)
ollama pull llama3.1:70b
# For systems with less RAM
ollama pull llama3.1:8b
# Or Mistral for good balance
ollama pull mistral:7b

Verifying Downloaded Models¶
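To see which models are installed, run:

ollama list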
Output example:
NAME ID SIZE MODIFIED
llama3.1:8b a2c89ceabc78 4.7 GB 2 hours ago
mistral:7b f974a74358d6 4.1 GB 1 day ago
Part 3: Using Ollama with MedicalFactChecker¶
macOS App Configuration¶
- Ensure Ollama is running (check for the llama icon in the menu bar)
- Open MedicalFactChecker
- Go to Settings (⌘+,)
- Select Ollama as the AI Provider
- Set the URL to http://localhost:11434
- Select your model from the dropdown

Python Desktop App Configuration¶
Option 1: Environment Variable
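A minimal sketch, assuming the desktop app uses the standard ollama Python client, which reads the OLLAMA_HOST environment variable (check the app's own documentation for the exact variable it expects):

# Point the client at the local Ollama server (assumed variable name)
export OLLAMA_HOST=http://localhost:11434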
Option 2: Settings GUI
- Open the Settings panel
- Navigate to AI Configuration
- Select Ollama as provider
- Enter the host URL and model name
Testing the Connection¶
In Terminal, verify Ollama is responding:
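curl http://localhost:11434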
Expected output:
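Ollama is running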
Part 4: Network Access (Advanced)¶
By default, Ollama only accepts connections from the local machine (127.0.0.1). To use it from other devices (like an iPhone on the same network), you need to configure network access.
Security Considerations
Exposing Ollama to your network allows any device on that network to use your AI models. Only do this on trusted networks, and consider additional security measures for sensitive environments.
macOS: Enabling Network Access¶
The standard approach of setting environment variables in .zshrc does not work for GUI applications on macOS, because apps launched from Finder do not read your shell profile. Use one of the methods below instead:
Method 1: LaunchAgent (Recommended - Persistent)
- Create a LaunchAgent plist file:
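# Any filename works; this one matches the Label used in the plist below
mkdir -p ~/Library/LaunchAgents
nano ~/Library/LaunchAgents/com.ollama.env.plist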
- Add this content:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.ollama.env</string>
    <key>ProgramArguments</key>
    <array>
        <string>sh</string>
        <string>-c</string>
        <string>launchctl setenv OLLAMA_HOST 0.0.0.0:11434; launchctl setenv OLLAMA_ORIGINS "*"</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>
- Load the LaunchAgent:
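launchctl load ~/Library/LaunchAgents/com.ollama.env.plist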
- Restart Ollama (quit from menu bar and reopen)
Method 2: Manual (Temporary - Until Restart)
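Run these commands in Terminal (they apply the same settings as the LaunchAgent):

launchctl setenv OLLAMA_HOST "0.0.0.0:11434"
launchctl setenv OLLAMA_ORIGINS "*"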
Then quit and restart Ollama from the menu bar.
Linux: Enabling Network Access¶
For systemd-managed installations:
- Create an override file:
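sudo systemctl edit ollama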
- Add these lines:
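[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_ORIGINS=*"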
- Reload and restart:
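sudo systemctl daemon-reload
sudo systemctl restart ollama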
Alternative: Manual Start
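If you are not using systemd, set the variables for a single session and start the server directly:

OLLAMA_HOST=0.0.0.0:11434 OLLAMA_ORIGINS="*" ollama serve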
Windows: Enabling Network Access¶
- Open Settings → System → About → Advanced system settings
- Click Environment Variables
- Under User variables, click New
- Add:
    - Variable name: OLLAMA_HOST
    - Variable value: 0.0.0.0:11434
- Add another:
    - Variable name: OLLAMA_ORIGINS
    - Variable value: *
- Click OK and restart Ollama
Verifying Network Access¶
From another device on your network:
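# Should return a JSON list of the models installed on your computer
curl http://YOUR_MAC_IP:11434/api/tags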
Replace YOUR_MAC_IP with your computer's local IP address (e.g., 192.168.1.100).
Part 5: Connecting Mobile Devices¶
Finding Your Computer's IP Address¶
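Run the command for your platform:

# macOS (replace en0 with your active network interface if needed)
ipconfig getifaddr en0

# Linux
hostname -I

# Windows (look for the IPv4 Address entry)
ipconfig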
iOS App Configuration¶
- Ensure your iPhone/iPad is on the same WiFi network as your computer
- Open MedicalFactChecker
- Go to Settings → AI Provider
- Select Ollama
- Enter the URL: http://YOUR_COMPUTER_IP:11434
- Select your model

iOS Limitation
iOS does not support running Ollama locally. You must connect to an Ollama instance running on another computer.
Troubleshooting Connection Issues¶
"Cannot connect to Ollama"
- Verify Ollama is running on your computer
- Check that both devices are on the same network
- Verify the IP address is correct
- Ensure OLLAMA_HOST is set to 0.0.0.0
- Check your firewall settings
macOS Firewall
System Settings → Network → Firewall → Options → Ensure Ollama is allowed
Linux Firewall
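If ufw is your active firewall, allow Ollama's port (adjust the command for firewalld or other tools):

sudo ufw allow 11434/tcp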
Part 6: Ollama Cloud Services¶
For remote access outside your local network, consider these options:
Tailscale (Recommended)¶
Tailscale creates a secure mesh VPN between your devices:
- Install Tailscale on your Ollama server and mobile devices
- Connect both to your Tailscale network
- Use the Tailscale IP address (100.x.x.x) in your Ollama URL
Benefits:
- End-to-end encryption
- No public internet exposure
- Works across networks
- Free for personal use
ngrok¶
For temporary public access:
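ngrok http 11434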
This provides a public URL like https://abc123.ngrok.io that tunnels to your local Ollama.
Security Warning
Public URLs expose your Ollama instance to the internet. Use authentication and rate limiting for production deployments.
Performance Tips¶
Memory Management¶
- Close unnecessary applications when running large models
- Models are loaded into RAM/VRAM on first use
- Unused models are unloaded after 5 minutes by default
GPU Acceleration¶
Ollama automatically uses GPU when available:
- macOS: Apple Silicon GPU (Metal)
- Linux/Windows: NVIDIA CUDA GPUs
- Fallback: CPU (slower but works everywhere)
Optimizing for Medical Queries¶
For best medical fact-checking results:
- Use larger models when possible (70B > 8B)
- Increase context length if your system supports it
- Keep models loaded by running periodic queries
# Keep model loaded
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "keep_alive": "24h"
}'
Troubleshooting¶
"Model not found"¶
Out of Memory¶
- Try a smaller model (8B instead of 70B)
- Close other applications
- Check available RAM: free -h (Linux) or Activity Monitor (macOS)
Slow Performance¶
- Ensure GPU acceleration is working
- Check CPU/GPU usage during inference
- Consider a smaller, faster model
- Reduce max tokens in your requests
Ollama Won't Start¶
# macOS: Check logs
log stream --predicate 'subsystem == "com.ollama"' --level debug
# Linux: Check service status
sudo systemctl status ollama
journalctl -u ollama -f