I benchmarked 3 local LLMs on 50 factual questions - here's what failed

· Dev.to