---
title: "Ollama Model Testing Log"
description: "Testing log tracking Ollama model evaluations with performance observations, VRAM requirements, and suitability ratings for different use cases on a 16GB GPU workstation."
type: reference
domain: development
tags: [ollama, llm, model-testing, vram, gpu, deepseek, glm]
---
# Ollama Model Testing Log
Track models tested, performance observations, and suitability for different use cases.

---
## Quick Summary
| Model | Date Tested | Primary Use Case | Rating | Notes |
|-------|-------------|------------------|--------|-------|
| GLM-4.7:cloud | 2026-02-04 | General purpose | ⭐⭐⭐⭐ | Cloud-hosted, fast, good reasoning |
| deepseek-v3.1:671b-cloud | 2026-02-04 | Complex reasoning | ⭐⭐⭐⭐⭐ | Cloud, very capable, slower response |
---
## Model Testing Details
### GLM-4.7:cloud
**Date Tested:** 2026-02-04

**Model Info:**
- Size/Parameters: Unknown (cloud)
- Quantization: N/A (cloud)
- Base Model: GLM-4.7 by Zhipu AI
**Performance:**
- Response Speed: Fast
- RAM/VRAM Usage: Cloud (local minimal)
- Context Window: 128k
**Testing Use Cases:**
- [x] Code generation
- [x] General Q&A
- [ ] Creative writing
- [x] Data analysis
- [ ] Task planning
- [ ] Other:
**Observations:**
- Strengths: Fast response, good at general reasoning
- Weaknesses: Cloud dependency
- Resource requirements: Minimal local resources
- Output quality: Solid for most tasks
- When to use this model: Daily tasks, coding help, general assistance
**Verdict:** ⭐⭐⭐⭐

---
### deepseek-v3.1:671b-cloud
**Date Tested:** 2026-02-04

**Model Info:**
- Size/Parameters: 671B (cloud)
- Quantization: N/A (cloud)
- Base Model: DeepSeek-V3.1 by DeepSeek
**Performance:**
- Response Speed: Moderate (671B model)
- RAM/VRAM Usage: Cloud (local minimal)
- Context Window: 128k+
**Testing Use Cases:**
- [x] Code generation
- [x] General Q&A
- [ ] Creative writing
- [x] Data analysis
- [x] Task planning
- [ ] Other:
**Observations:**
- Strengths: Very capable, excellent reasoning, great with complex tasks
- Weaknesses: Slower response, cloud dependency
- Resource requirements: Minimal local resources
- Output quality: Top-tier, handles complex multi-step reasoning well
- When to use this model: Complex coding tasks, deep analysis, planning
**Verdict:** ⭐⭐⭐⭐⭐

---
## Models to Test
### Local Models (16GB GPU Compatible)
**Small & Fast (2-6GB VRAM at Q4):**
- [ ] phi3:mini - 3.8B params, great for quick tasks ~2.2GB
- [ ] llama3.1:8b - 8B params, excellent all-rounder ~4.7GB
- [ ] qwen2.5:7b - 7B params, strong reasoning ~4.3GB
- [ ] gemma2:9b - 9B params, Google's small model ~5.5GB
**Medium Capability (6-10GB VRAM at Q4):**
- [ ] mistral:7b - 7B params, classic workhorse ~4.1GB
- [ ] phi4 - 14B params, higher quality ~9.1GB
- [ ] qwen2.5:14b - 14B params, strong multilingual ~8.1GB
**Specialized:**
- [ ] deepseek-coder-v2:lite - 16B params, optimized for coding ~8.7GB
- [ ] codellama:7b - 7B params, coding specialist ~4.1GB
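Before pulling any of these, the fit against the 16GB card can be sanity-checked with a short script. This is a rough sketch: the ~3GB reserve for context and system overhead is an assumption, and the footprints are the approximate Q4 figures noted for several of the candidates above.

```python
GPU_VRAM_GB = 16.0
RESERVED_GB = 3.0  # assumed headroom for KV cache / context + desktop overhead

# Approximate Q4 footprints (GB) for several candidates listed above
candidates = {
    "phi3:mini": 2.2,
    "llama3.1:8b": 4.7,
    "qwen2.5:7b": 4.3,
    "gemma2:9b": 5.5,
    "mistral:7b": 4.1,
    "qwen2.5:14b": 8.1,
    "deepseek-coder-v2:lite": 8.7,
    "codellama:7b": 4.1,
}

budget = GPU_VRAM_GB - RESERVED_GB
for model, size_gb in candidates.items():
    status = "fits" if size_gb <= budget else "too large"
    print(f"{model}: {size_gb} GB -> {status}")
```

With a 3GB reserve, every candidate above lands comfortably under the 13GB working budget, which matches the "fits comfortably in 16GB VRAM" goal below.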
---
## General Notes
*Any overall observations, preferences, or patterns discovered during testing.*

**Initial Impressions:**
- Cloud models (GLM-4.7, DeepSeek-V3) provide excellent quality without local resources
- Planning to test local models for privacy and offline use, and to compare quality/speed trade-offs
- Focus will be on models that fit comfortably in 16GB VRAM for smooth performance
**VRAM Estimates at Q4 Quantization:**
- 3B-4B models: ~2-3GB
- 7B-8B models: ~4-5GB
- 14B models: ~8-9GB
- Leaves room for context window and system overhead
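The estimates above follow a simple rule of thumb: at Q4 quantization each parameter costs roughly half a byte, plus runtime overhead for buffers and cache. A minimal sketch of that heuristic (the 20% overhead factor is an assumption; actual usage varies with quant variant and context length):

```python
def estimate_q4_vram_gb(params_billions: float, overhead: float = 1.2) -> float:
    """Rough Q4 VRAM estimate: ~0.5 bytes per parameter,
    scaled by an assumed ~20% overhead for runtime buffers."""
    return params_billions * 0.5 * overhead

for name, params in [("phi3:mini", 3.8), ("llama3.1:8b", 8.0), ("qwen2.5:14b", 14.0)]:
    print(f"{name}: ~{estimate_q4_vram_gb(params):.1f} GB")
```

The outputs (~2.3, ~4.8, and ~8.4 GB) line up with the 2-3GB / 4-5GB / 8-9GB bands listed above.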
---
*Last Updated: 2026-02-04*