--- title: "Ollama Model Testing Log" description: "Testing log tracking Ollama model evaluations with performance observations, VRAM requirements, and suitability ratings for different use cases on a 16GB GPU workstation." type: reference domain: development tags: [ollama, llm, model-testing, vram, gpu, deepseek, glm] --- # Ollama Model Testing Log Track models tested, performance observations, and suitability for different use cases. --- ## Quick Summary | Model | Date Tested | Primary Use Case | Rating | Notes | |-------|-------------|------------------|--------|-------| | GLM-4.7:cloud | 2026-02-04 | General purpose | ⭐⭐⭐⭐ | Cloud-hosted, fast, good reasoning | | deepseek-v3.1:671b-cloud | 2026-02-04 | Complex reasoning | ⭐⭐⭐⭐⭐ | Cloud, very capable, slower response | | | | | | | --- ## Model Testing Details ### GLM-4.7:cloud **Date Tested:** 2026-02-04 **Model Info:** - Size/Parameters: Unknown (cloud) - Quantization: N/A (cloud) - Base Model: GLM-4.7 by Zhipu AI **Performance:** - Response Speed: Fast - RAM/VRAM Usage: Cloud (local minimal) - Context Window: 128k **Testing Use Cases:** - [x] Code generation - [x] General Q&A - [ ] Creative writing - [x] Data analysis - [ ] Task planning - [ ] Other: **Observations:** - Strengths: Fast response, good at general reasoning - Weaknesses: Cloud dependency - Resource requirements: Minimal local resources - Output quality: Solid for most tasks - When to use this model: Daily tasks, coding help, general assistance **Verdict:** ⭐⭐⭐⭐ --- ### deepseek-v3.1:671b-cloud **Date Tested:** 2026-02-04 **Model Info:** - Size/Parameters: 671B (cloud) - Quantization: N/A (cloud) - Base Model: DeepSeek-V3 by DeepSeek **Performance:** - Response Speed: Moderate (671B model) - RAM/VRAM Usage: Cloud (local minimal) - Context Window: 128k+ **Testing Use Cases:** - [x] Code generation - [x] General Q&A - [ ] Creative writing - [x] Data analysis - [x] Task planning - [ ] Other: **Observations:** - Strengths: Very capable, excellent reasoning, great with complex tasks - Weaknesses: Slower response, cloud dependency - Resource requirements: Minimal local resources - Output quality: Top-tier, handles complex multi-step reasoning well - When to use this model: Complex coding tasks, deep analysis, planning **Verdict:** ⭐⭐⭐⭐⭐ --- ## Models to Test ### Local Models (16GB GPU Compatible) **Small & Fast (2-6GB VRAM at Q4):** - [ ] phi3:mini - 3.8B params, great for quick tasks ~2.2GB - [ ] llama3.1:8b - 8B params, excellent all-rounder ~4.7GB - [ ] qwen2.5:7b - 7B params, strong reasoning ~4.3GB - [ ] gemma2:9b - 9B params, Google's small model ~5.5GB **Medium Capability (6-10GB VRAM at Q4):** - [ ] mistral:7b - 7B params, classic workhorse ~4.1GB - [ ] llama3.1:14b - 14B params, higher quality ~8.2GB - [ ] qwen2.5:14b - 14B params, strong multilingual ~8.1GB **Specialized:** - [ ] deepseek-coder-v2:lite - 16B params, optimized for coding ~8.7GB - [ ] codellama:7b - 7B params, coding specialist ~4.1GB --- ## General Notes *Any overall observations, preferences, or patterns discovered during testing.* **Initial Impressions:** - Cloud models (GLM-4.7, DeepSeek-V3) provide excellent quality without local resources - Planning to test local models for privacy, offline use, and comparing quality/speed trade-offs - Focus will be on models that fit comfortably in 16GB VRAM for smooth performance **VRAM Estimates at Q4 Quantization:** - 3B-4B models: ~2-3GB - 7B-8B models: ~4-5GB - 14B models: ~8-9GB - Leaves room for context window and system overhead --- *Last Updated: 
---

*Last Updated: 2026-02-04*