Advanced: Ollama Setup

Ollama allows you to run large language models locally on your computer, providing completely private, offline AI inference. This guide covers installing Ollama and configuring it for use with MedicalFactChecker.

Who Is This For?

This guide is for advanced users who want to:

  • Run AI models locally without cloud API costs
  • Keep all data on their own hardware
  • Use MedicalFactChecker in air-gapped environments
  • Access Ollama from mobile devices on their network

Part 1: Installing Ollama

macOS Installation

  1. Download Ollama
     • Visit ollama.com/download
     • Click Download for macOS
     • Requires macOS 14 Sonoma or later

  2. Install the Application
     • Open the downloaded Ollama.dmg
     • Drag Ollama to your Applications folder
     • Launch Ollama from Applications

  3. Verify Installation
     Open Terminal and run:

    ollama --version

[Screenshot: Ollama macOS Installation]

Windows Installation

  1. Download Ollama
     • Visit ollama.com/download/windows
     • Download the Windows installer

  2. Run the Installer
     • Double-click the downloaded .exe file
     • Follow the installation wizard
     • Ollama will start automatically

  3. Verify Installation
     Open Command Prompt or PowerShell and run:

    ollama --version


Linux Installation

Run the installation script:

curl -fsSL https://ollama.com/install.sh | sh

Or install manually:

# Download the binary
curl -L https://ollama.com/download/ollama-linux-amd64 -o ollama
chmod +x ollama
sudo mv ollama /usr/local/bin/

# Start the service
ollama serve
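
If you started the server manually, you can confirm it is reachable before moving on; a quick request to the version endpoint of Ollama's HTTP API (default port 11434) should return a small JSON version string:

# Confirm the Ollama API is listening on the default port
curl http://localhost:11434/api/version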

Part 2: Downloading Models

Model          Size     RAM Required   Quality      Speed
llama3.1:70b   40GB     48GB+          Excellent    Slow
llama3.1:8b    4.7GB    8GB+           Good         Fast
mixtral:8x7b   26GB     32GB+          Excellent    Medium
mistral:7b     4.1GB    8GB+           Good         Fast
qwen2.5:32b    18GB     24GB+          Very Good    Medium

Downloading a Model

# Download the recommended model (requires 48GB+ RAM)
ollama pull llama3.1:70b

# For systems with less RAM
ollama pull llama3.1:8b

# Or Mistral for good balance
ollama pull mistral:7b
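
Once a pull finishes, a one-off prompt from the command line is a quick sanity check that the model loads and responds (the model name here assumes the 8B pull above; swap in whichever tag you downloaded):

# One-shot test of the downloaded model
ollama run llama3.1:8b "Reply with OK if you are working."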

[Screenshot: Ollama Model Download]

Verifying Downloaded Models

ollama list

Output example:

NAME              ID              SIZE    MODIFIED
llama3.1:8b       a2c89ceabc78    4.7 GB  2 hours ago
mistral:7b        f974a74358d6    4.1 GB  1 day ago


Part 3: Using Ollama with MedicalFactChecker

macOS App Configuration

  1. Ensure Ollama is running (check for the llama icon in menu bar)
  2. Open MedicalFactChecker
  3. Go to Settings (⌘+,)
  4. Select Ollama as the AI Provider
  5. Set URL to http://localhost:11434
  6. Select your model from the dropdown

[Screenshot: macOS Ollama Settings]

Python Desktop App Configuration

Option 1: Environment Variables

export OLLAMA_HOST="http://localhost:11434"
export OLLAMA_MODEL="llama3.1:8b"

Option 2: Settings GUI

  1. Open the Settings panel
  2. Navigate to AI Configuration
  3. Select Ollama as provider
  4. Enter the host URL and model name

Testing the Connection

In Terminal, verify Ollama is responding:

curl http://localhost:11434/api/tags

Expected output:

{"models":[{"name":"llama3.1:8b","modified_at":"2024-01-15T..."}]}


Part 4: Network Access (Advanced)

By default, Ollama only accepts connections from the local machine (127.0.0.1). To use it from other devices (like an iPhone on the same network), you need to configure network access.

Security Considerations

Exposing Ollama to your network allows any device on that network to use your AI models. Only do this on trusted networks, and consider additional security measures for sensitive environments.

macOS: Enabling Network Access

Setting environment variables in .zshrc does not affect GUI applications on macOS, so the menu-bar Ollama app will never see them. Use one of the methods below instead:

Method 1: LaunchAgent (Recommended - Persistent)

  1. Create a LaunchAgent plist file:

mkdir -p ~/Library/LaunchAgents
nano ~/Library/LaunchAgents/com.ollama.env.plist

  2. Add this content:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.ollama.env</string>
    <key>ProgramArguments</key>
    <array>
        <string>sh</string>
        <string>-c</string>
        <string>launchctl setenv OLLAMA_HOST 0.0.0.0:11434; launchctl setenv OLLAMA_ORIGINS "*"</string>
    </array>
    <key>RunAtLoad</key>
    <true/>
</dict>
</plist>

  3. Load the LaunchAgent:

launchctl load ~/Library/LaunchAgents/com.ollama.env.plist

  4. Restart Ollama (quit from the menu bar and reopen)
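
To confirm the variables actually reached launchd after loading the agent and restarting Ollama, you can read them back; both commands should echo the values set in the plist:

# Read back the launchd environment variables
launchctl getenv OLLAMA_HOST
launchctl getenv OLLAMA_ORIGINS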

Method 2: Manual (Temporary - Until Restart)

launchctl setenv OLLAMA_HOST 0.0.0.0:11434
launchctl setenv OLLAMA_ORIGINS "*"

Then quit and restart Ollama from the menu bar.

Linux: Enabling Network Access

For systemd-managed installations:

  1. Create an override file:

sudo systemctl edit ollama

  2. Add these lines:

[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
Environment="OLLAMA_ORIGINS=*"

  3. Reload and restart:

sudo systemctl daemon-reload
sudo systemctl restart ollama
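
To verify the override took effect, check the environment systemd passes to the service and confirm the port is bound on all interfaces rather than only 127.0.0.1 (ss is part of iproute2, which most distributions ship by default):

# Show the environment systemd passes to the service
sudo systemctl show ollama --property=Environment

# Confirm the listener is on 0.0.0.0 (or *) rather than 127.0.0.1
ss -tln | grep 11434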

Alternative: Manual Start

sudo systemctl stop ollama
OLLAMA_HOST=0.0.0.0:11434 OLLAMA_ORIGINS="*" ollama serve

Windows: Enabling Network Access

  1. Open Settings → System → About → Advanced system settings
  2. Click Environment Variables
  3. Under User variables, click New
  4. Add:
     • Variable name: OLLAMA_HOST
     • Variable value: 0.0.0.0:11434
  5. Add another:
     • Variable name: OLLAMA_ORIGINS
     • Variable value: *
  6. Click OK and restart Ollama
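
If you prefer the command line, the same user-level variables can be set from Command Prompt with setx (built into Windows); restart Ollama afterwards so it picks them up:

setx OLLAMA_HOST "0.0.0.0:11434"
setx OLLAMA_ORIGINS "*"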

Verifying Network Access

From another device on your network:

curl http://YOUR_MAC_IP:11434/api/tags

Replace YOUR_MAC_IP with your computer's local IP address (e.g., 192.168.1.100).


Part 5: Connecting Mobile Devices

Finding Your Computer's IP Address

macOS:

ipconfig getifaddr en0

Or: System Settings → Network → WiFi → Details → IP Address

Windows:

ipconfig

Look for "IPv4 Address" under your active adapter.

Linux:

hostname -I | awk '{print $1}'

iOS App Configuration

  1. Ensure your iPhone/iPad is on the same WiFi network as your computer
  2. Open MedicalFactChecker
  3. Go to Settings → AI Provider
  4. Select Ollama
  5. Enter the URL: http://YOUR_COMPUTER_IP:11434
  6. Select your model

[Screenshot: iOS Ollama Settings]

iOS Limitation

iOS does not support running Ollama locally. You must connect to an Ollama instance running on another computer.

Troubleshooting Connection Issues

"Cannot connect to Ollama"

  1. Verify Ollama is running on your computer
  2. Check that both devices are on the same network
  3. Verify the IP address is correct
  4. Ensure OLLAMA_HOST is set to 0.0.0.0
  5. Check your firewall settings
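
One quick way to tell a bind-address problem from a network problem is to query the server from the Ollama computer itself, but using its LAN IP instead of localhost (replace the example address with your own):

# Run on the computer hosting Ollama, substituting its own LAN IP
curl http://192.168.1.100:11434/api/tags
# If this fails, OLLAMA_HOST was not applied; if it works but the phone
# still cannot connect, look at the network or firewall instead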

macOS Firewall

System Settings → Network → Firewall → Options → Ensure Ollama is allowed

Linux Firewall

sudo ufw allow 11434/tcp

Part 6: Remote Access Options

For remote access outside your local network, consider these options:

Tailscale

Tailscale creates a secure mesh VPN between your devices:

  1. Install Tailscale on your Ollama server and mobile devices
  2. Connect both to your Tailscale network
  3. Use the Tailscale IP address (100.x.x.x) in your Ollama URL
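
For example, if your server's Tailscale IP were 100.101.102.103 (a made-up address for illustration), the Ollama URL in MedicalFactChecker would be http://100.101.102.103:11434, and you could verify it from any device on your tailnet:

# Check the Ollama API over Tailscale (substitute your server's 100.x address)
curl http://100.101.102.103:11434/api/tags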

Benefits:

  • End-to-end encryption
  • No public internet exposure
  • Works across networks
  • Free for personal use

ngrok

For temporary public access:

ngrok http 11434

This provides a public URL like https://abc123.ngrok.io that tunnels to your local Ollama.

Security Warning

Public URLs expose your Ollama instance to the internet. Use authentication and rate limiting for production deployments.
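
If you do use ngrok, its built-in basic auth (an ngrok v3 flag) is a simple first layer of protection; note that MedicalFactChecker would then need to send those credentials with each request, which may not be supported:

# Require a username and password on the tunnel (ngrok v3)
ngrok http 11434 --basic-auth "user:a-strong-password"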


Performance Tips

Memory Management

  • Close unnecessary applications when running large models
  • Models are loaded into RAM/VRAM on first use
  • Unused models are unloaded after 5 minutes by default

GPU Acceleration

Ollama automatically uses GPU when available:

  • macOS: Apple Silicon GPU (Metal)
  • Linux/Windows: NVIDIA CUDA GPUs
  • Fallback: CPU (slower but works everywhere)
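
To confirm a loaded model is actually using the GPU, ollama ps reports the processor split for each loaded model (the output below is illustrative):

ollama ps
# NAME           ID              SIZE      PROCESSOR    UNTIL
# llama3.1:8b    a2c89ceabc78    6.2 GB    100% GPU     4 minutes from now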

Optimizing for Medical Queries

For best medical fact-checking results:

  1. Use larger models when possible (70B > 8B)
  2. Increase the context length if your system supports it
  3. Keep the model loaded between requests, as in the example below:

# Keep llama3.1:8b loaded for 24 hours after this request
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1:8b",
  "keep_alive": "24h"
}'
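
Alternatively, the default five-minute unload timeout can be raised for all models with the OLLAMA_KEEP_ALIVE environment variable, set wherever you configure Ollama's other environment variables (shell profile, LaunchAgent, or systemd override):

# Keep models in memory for 24 hours after their last use
export OLLAMA_KEEP_ALIVE=24h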

Troubleshooting

"Model not found"

# List available models
ollama list

# Pull the model if missing
ollama pull llama3.1:8b

Out of Memory

  • Try a smaller model (8B instead of 70B)
  • Close other applications
  • Check available RAM: free -h (Linux) or Activity Monitor (macOS)

Slow Performance

  • Ensure GPU acceleration is working
  • Check CPU/GPU usage during inference
  • Consider a smaller, faster model
  • Reduce max tokens in your requests

Ollama Won't Start

# macOS: Check the server log
cat ~/.ollama/logs/server.log

# Linux: Check service status
sudo systemctl status ollama
journalctl -u ollama -f

Additional Resources