| title | description | type | domain | tags |
|---|---|---|---|---|
| Ollama Model Testing Log | Testing log tracking Ollama model evaluations with performance observations, VRAM requirements, and suitability ratings for different use cases on a 16GB GPU workstation. | reference | development | |
# Ollama Model Testing Log

Track models tested, performance observations, and suitability for different use cases.

## Quick Summary
| Model | Date Tested | Primary Use Case | Rating | Notes |
|---|---|---|---|---|
| GLM-4.7:cloud | 2026-02-04 | General purpose | ⭐⭐⭐⭐ | Cloud-hosted, fast, good reasoning |
| deepseek-v3.1:671b-cloud | 2026-02-04 | Complex reasoning | ⭐⭐⭐⭐⭐ | Cloud, very capable, slower response |
## Model Testing Details

### GLM-4.7:cloud

**Date Tested:** 2026-02-04

**Model Info:**
- Size/Parameters: Unknown (cloud)
- Quantization: N/A (cloud)
- Base Model: GLM-4.7 by Zhipu AI

**Performance:**
- Response Speed: Fast
- RAM/VRAM Usage: Cloud (local minimal)
- Context Window: 128k

**Testing Use Cases:**
- Code generation
- General Q&A
- Creative writing
- Data analysis
- Task planning
- Other:

**Observations:**
- Strengths: Fast response, good at general reasoning
- Weaknesses: Cloud dependency
- Resource requirements: Minimal local resources
- Output quality: Solid for most tasks
- When to use this model: Daily tasks, coding help, general assistance

**Verdict:** ⭐⭐⭐⭐
### deepseek-v3.1:671b-cloud

**Date Tested:** 2026-02-04

**Model Info:**
- Size/Parameters: 671B (cloud)
- Quantization: N/A (cloud)
- Base Model: DeepSeek-V3.1 by DeepSeek

**Performance:**
- Response Speed: Moderate (671B model)
- RAM/VRAM Usage: Cloud (local minimal)
- Context Window: 128k+

**Testing Use Cases:**
- Code generation
- General Q&A
- Creative writing
- Data analysis
- Task planning
- Other:

**Observations:**
- Strengths: Very capable, excellent reasoning, great with complex tasks
- Weaknesses: Slower response, cloud dependency
- Resource requirements: Minimal local resources
- Output quality: Top-tier, handles complex multi-step reasoning well
- When to use this model: Complex coding tasks, deep analysis, planning

**Verdict:** ⭐⭐⭐⭐⭐
## Models to Test

### Local Models (16GB GPU Compatible)

**Small & Fast (2-6GB VRAM at Q4):**
- phi3:mini - 3.8B params, great for quick tasks ~2.2GB
- llama3.1:8b - 8B params, excellent all-rounder ~4.7GB
- qwen2.5:7b - 7B params, strong reasoning ~4.3GB
- gemma2:9b - 9B params, Google's small model ~5.5GB
**Medium Capability (6-10GB VRAM at Q4):**
- mistral:7b - 7B params, classic workhorse ~4.1GB
- llama3.1:14b - 14B params, higher quality ~8.2GB
- qwen2.5:14b - 14B params, strong multilingual ~8.1GB
**Specialized:**
- deepseek-coder-v2:lite - 16B params, optimized for coding ~8.7GB
- codellama:7b - 7B params, coding specialist ~4.1GB
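A quick way to sanity-check which candidates leave comfortable headroom on a 16GB card is to filter the list by approximate Q4 footprint plus slack for context and system overhead. This is a minimal sketch: the per-model sizes come from the list above, but the 3GB headroom figure is an assumption for illustration, not a measurement.

```python
# Candidate local models with rough Q4 VRAM footprints (GB), taken
# from the "Models to Test" list above. Figures are approximate.
CANDIDATES = {
    "phi3:mini": 2.2,
    "llama3.1:8b": 4.7,
    "qwen2.5:7b": 4.3,
    "gemma2:9b": 5.5,
    "mistral:7b": 4.1,
    "llama3.1:14b": 8.2,
    "qwen2.5:14b": 8.1,
    "deepseek-coder-v2:lite": 8.7,
    "codellama:7b": 4.1,
}

def fits_in_vram(budget_gb: float, headroom_gb: float = 3.0) -> list[str]:
    """Return models whose weights, plus an assumed headroom for KV
    cache and system overhead, fit within the given VRAM budget."""
    return sorted(m for m, gb in CANDIDATES.items()
                  if gb + headroom_gb <= budget_gb)

print(fits_in_vram(16.0))  # every model in the list fits a 16GB card
print(fits_in_vram(8.0))   # only the smaller models fit an 8GB card
```

With these numbers, all nine candidates clear a 16GB budget even with 3GB of headroom, which matches the "fit comfortably" goal below.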
## General Notes

Any overall observations, preferences, or patterns discovered during testing.

**Initial Impressions:**
- Cloud models (GLM-4.7, DeepSeek-V3) provide excellent quality without local resources
- Planning to test local models for privacy, offline use, and comparing quality/speed trade-offs
- Focus will be on models that fit comfortably in 16GB VRAM for smooth performance
**VRAM Estimates at Q4 Quantization:**
- 3B-4B models: ~2-3GB
- 7B-8B models: ~4-5GB
- 14B models: ~8-9GB
- Leaves room for context window and system overhead
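The estimates above follow from simple arithmetic: Q4 stores roughly half a byte per weight (slightly more once per-block quantization metadata is counted), plus a flat allowance for KV cache and runtime buffers. A minimal sketch of that calculation — the 4.5 bits/weight and 1GB overhead figures are ballpark assumptions, not measured values:

```python
def estimate_q4_vram_gb(params_billions: float, overhead_gb: float = 1.0) -> float:
    """Rough VRAM needed for a Q4-quantized model.

    Assumes ~4.5 bits per weight (4-bit values plus quantization
    metadata such as per-block scales) and a flat overhead allowance
    for KV cache and runtime buffers at modest context lengths.
    """
    gb_per_billion_params = 4.5 / 8  # ~0.56 GB of weights per 1B params
    return params_billions * gb_per_billion_params + overhead_gb

for size in (4, 8, 14):
    print(f"{size}B model: ~{estimate_q4_vram_gb(size):.1f} GB")
```

The outputs land in the same ranges as the table above (roughly 3GB for 4B, 5-6GB for 8B, and ~9GB for 14B), so the 14B class still leaves several gigabytes free on a 16GB card.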
**Last Updated:** 2026-02-04