If you're taking requests, I could use 4.0 bpw and 3.0 bpw quants. On my 3060, I can fit 4k context at 4.0 bpw, and likely 6144 at 3.0 bpw.
done