04-09-2025, 02:24 AM
I'm running it at Q4_K_M with only 16 GB of VRAM, and on top of that I set a 32,000-token context window, and it works just fine. That means basically endless memory: I could prompt something like 200 pages of a book, then ask it about the first sentence, and it would remember it.
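
For anyone who wants to sanity-check how much text a 32,000-token window holds, here is a rough sketch. The words-per-page and tokens-per-word figures are my own assumptions, not measured for this model or its tokenizer, and the function names are made up for illustration.

```python
# Rough estimate of how much book text fits in a context window.
# ASSUMPTIONS (not from the post, not measured for any specific model):
WORDS_PER_PAGE = 300          # typical paperback page, assumed
TOKENS_PER_100_WORDS = 130    # about 1.3 tokens per English word, assumed


def tokens_needed(pages: int) -> int:
    """Approximate token count for a given number of pages (rounded up)."""
    words = pages * WORDS_PER_PAGE
    return -(-words * TOKENS_PER_100_WORDS // 100)  # ceiling division


def pages_that_fit(context_tokens: int) -> int:
    """Approximate number of whole pages that fit in the context window."""
    tokens_per_page = tokens_needed(1)
    return context_tokens // tokens_per_page


if __name__ == "__main__":
    context = 32_000  # the context window size mentioned in the post
    print(pages_that_fit(context))   # pages that fit under the assumptions
    print(tokens_needed(200))        # tokens that 200 pages would need
```

Under those assumptions the window holds roughly 82 pages, and 200 pages would need about 78,000 tokens, so the real number depends a lot on the tokenizer and how dense the pages are.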

