Advanced: Ollama Setup¶
Ollama allows you to run large language models locally on your computer, providing completely private, offline AI inference. This guide covers installing Ollama and configuring it for use with MedicalFactChecker.
Who Is This For?
This guide is for advanced users who want to:
- Run AI models locally without cloud API costs
- Keep all data on their own hardware
- Use MedicalFactChecker in air-gapped environments
- Access Ollama from mobile devices on their network
Part 1: Installing Ollama¶
macOS Installation¶
- Download Ollama
    - Visit ollama.com/download
    - Click Download for macOS
    - Requires macOS 14 Sonoma or later
- Install the Application
    - Open the downloaded Ollama.dmg
    - Drag Ollama to your Applications folder
    - Launch Ollama from Applications
- Verify Installation. Open Terminal and run:
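ollama --version

If the installation succeeded, this prints the installed version number.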

Windows Installation¶
- Download Ollama
    - Visit ollama.com/download/windows
    - Download the Windows installer
- Run the Installer
    - Double-click the downloaded .exe file
    - Follow the installation wizard
    - Ollama will start automatically
- Verify Installation. Open Command Prompt or PowerShell:
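ollama --version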
Linux Installation¶
Run the installation script:
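curl -fsSL https://ollama.com/install.sh | sh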
Or install manually:
# Download the binary
curl -L https://ollama.com/download/ollama-linux-amd64 -o ollama
chmod +x ollama
sudo mv ollama /usr/local/bin/
# Start the service
ollama serve
Part 2: Downloading Models¶
Recommended Models for Medical Fact-Checking¶
| Model | Size | RAM Required | Quality | Speed |
|---|---|---|---|---|
| llama3.1:70b | 40GB | 48GB+ | Excellent | Slow |
| llama3.1:8b | 4.7GB | 8GB+ | Good | Fast |
| mixtral:8x7b | 26GB | 32GB+ | Excellent | Medium |
| mistral:7b | 4.1GB | 8GB+ | Good | Fast |
| qwen2.5:32b | 18GB | 24GB+ | Very Good | Medium |
Downloading a Model¶
# Download the recommended model (requires 48GB+ RAM)
ollama pull llama3.1:70b
# For systems with less RAM
ollama pull llama3.1:8b
# Or Mistral for good balance
ollama pull mistral:7b

Verifying Downloaded Models¶
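To see which models are installed, run:

ollama list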
Output example:
NAME ID SIZE MODIFIED
llama3.1:8b a2c89ceabc78 4.7 GB 2 hours ago
mistral:7b f974a74358d6 4.1 GB 1 day ago
Part 3: Using Ollama with MedicalFactChecker¶
macOS App Configuration¶
- Ensure Ollama is running (check for the llama icon in the menu bar)
- Open MedicalFactChecker
- Go to Settings (⌘+,)
- Select Ollama as the AI Provider
- Set the URL to http://localhost:11434
- Select your model from the dropdown

Python Desktop App Configuration¶
Option 1: Environment Variable
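A minimal sketch, assuming the desktop app uses the standard ollama Python client, which reads the OLLAMA_HOST environment variable (check the app's own documentation for the exact variable it expects):

# Point the client at the local Ollama server (assumed variable name)
export OLLAMA_HOST=http://localhost:11434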
Option 2: Settings GUI
- Open the Settings panel
- Navigate to AI Configuration
- Select Ollama as provider
- Enter the host URL and model name
Testing the Connection¶
In Terminal, verify Ollama is responding:
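curl http://localhost:11434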
Expected output:
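Ollama is running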
Part 4: Network Access (Advanced)¶
By default, Ollama only accepts connections from the local machine (127.0.0.1). To use it from other devices (like an iPhone on the same network), you need to configure network access.
Security Considerations
Exposing Ollama to your network allows any device on that network to use your AI models. Only do this on trusted networks, and consider additional security measures for sensitive environments.
macOS: Enabling Network Access¶
The standard approach of setting environment variables in .zshrc does not work for GUI applications on macOS, because apps launched from Finder do not read your shell profile. Use one of the methods below instead:
Method 1: LaunchAgent (Recommended - Persistent)
- Create a LaunchAgent plist file:
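# Any filename works; this one matches the Label used in the plist below
mkdir -p ~/Library/LaunchAgents
nano ~/Library/LaunchAgents/com.ollama.env.plist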
- Add this content:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.ollama.env</string>
    <key>ProgramArguments</key>
    <array>
        <string>sh</string>
        <string>-c</string>
        <string>launchctl setenv OLLAMA_HOST 0.0.0.0:11434; launchctl setenv OLLAMA_ORIGINS "*"</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>
- Load the LaunchAgent:
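launchctl load ~/Library/LaunchAgents/com.ollama.env.plist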
- Restart Ollama (quit from menu bar and reopen)
Method 2: Manual (Temporary - Until Restart)
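Run these commands in Terminal (they apply the same settings as the LaunchAgent):

launchctl setenv OLLAMA_HOST "0.0.0.0:11434"
launchctl setenv OLLAMA_ORIGINS "*"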
Then quit and restart Ollama from the menu bar.
Linux: Enabling Network Access¶
For systemd-managed installations:
- Create an override file:
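sudo systemctl edit ollama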
- Add these lines:
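[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_ORIGINS=*"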
- Reload and restart:
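sudo systemctl daemon-reload
sudo systemctl restart ollama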
Alternative: Manual Start
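If you are not using systemd, set the variables for a single session and start the server directly:

OLLAMA_HOST=0.0.0.0:11434 OLLAMA_ORIGINS="*" ollama serve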
Windows: Enabling Network Access¶
- Open Settings → System → About → Advanced system settings
- Click Environment Variables
- Under User variables, click New
- Add:
    - Variable name: OLLAMA_HOST
    - Variable value: 0.0.0.0:11434
- Add another:
    - Variable name: OLLAMA_ORIGINS
    - Variable value: *
- Click OK and restart Ollama
Verifying Network Access¶
From another device on your network:
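# Should return a JSON list of the models installed on your computer
curl http://YOUR_MAC_IP:11434/api/tags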
Replace YOUR_MAC_IP with your computer's local IP address (e.g., 192.168.1.100).
Part 5: Connecting Mobile Devices¶
Finding Your Computer's IP Address¶
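Run the command for your platform:

# macOS (replace en0 with your active network interface if needed)
ipconfig getifaddr en0

# Linux
hostname -I

# Windows (look for the IPv4 Address entry)
ipconfig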
iOS App Configuration¶
- Ensure your iPhone/iPad is on the same WiFi network as your computer
- Open MedicalFactChecker
- Go to Settings → AI Provider
- Select Ollama
- Enter the URL: http://YOUR_COMPUTER_IP:11434
- Select your model

iOS Limitation
iOS does not support running Ollama locally. You must connect to an Ollama instance running on another computer.
Troubleshooting Connection Issues¶
"Cannot connect to Ollama"
- Verify Ollama is running on your computer
- Check that both devices are on the same network
- Verify the IP address is correct
- Ensure OLLAMA_HOST is set to 0.0.0.0
- Check your firewall settings
macOS Firewall
System Settings → Network → Firewall → Options → Ensure Ollama is allowed
Linux Firewall
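If ufw is your active firewall, allow Ollama's port (adjust the command for firewalld or other tools):

sudo ufw allow 11434/tcp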
Part 6: Ollama Cloud Services¶
For remote access outside your local network, consider these options:
Tailscale (Recommended)¶
Tailscale creates a secure mesh VPN between your devices:
- Install Tailscale on your Ollama server and mobile devices
- Connect both to your Tailscale network
- Use the Tailscale IP address (100.x.x.x) in your Ollama URL
Benefits:
- End-to-end encryption
- No public internet exposure
- Works across networks
- Free for personal use
ngrok¶
For temporary public access:
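ngrok http 11434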
This provides a public URL like https://abc123.ngrok.io that tunnels to your local Ollama.
Security Warning
Public URLs expose your Ollama instance to the internet. Use authentication and rate limiting for production deployments.
Performance Tips¶
Memory Management¶
- Close unnecessary applications when running large models
- Models are loaded into RAM/VRAM on first use
- Unused models are unloaded after 5 minutes by default
GPU Acceleration¶
Ollama automatically uses GPU when available:
- macOS: Apple Silicon GPU (Metal)
- Linux/Windows: NVIDIA CUDA GPUs
- Fallback: CPU (slower but works everywhere)
Optimizing for Medical Queries¶
For best medical fact-checking results:
- Use larger models when possible (70B > 8B)
- Increase context length if your system supports it
- Keep models loaded by running periodic queries
# Keep model loaded
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "keep_alive": "24h"
}'
Troubleshooting¶
"Model not found"¶
Out of Memory¶
- Try a smaller model (8B instead of 70B)
- Close other applications
- Check available RAM: free -h (Linux) or Activity Monitor (macOS)
Slow Performance¶
- Ensure GPU acceleration is working
- Check CPU/GPU usage during inference
- Consider a smaller, faster model
- Reduce max tokens in your requests
Ollama Won't Start¶
# macOS: Check logs
log stream --predicate 'subsystem == "com.ollama"' --level debug
# Linux: Check service status
sudo systemctl status ollama
journalctl -u ollama -f