Testing Local LLMs via Rustlings

Using Rustlings as a simple benchmark to find the smartest local LLMs (at least for me).

While I haven't even tried to wrap my head around the world of coding agents (or agents in general), and I don't have a single AI plugin in my Neovim config, I usually keep a browser tab open for Claude or Gemini. I find them very helpful for small tasks like finding typos, untangling error messages, generating some code or locating that one specific word that's on the tip of my tongue.

Recently, I stumbled upon a post titled "Friends Don't Let Friends Use Ollama". I had Ollama installed but rarely touched it, and the post sparked my curiosity, not just about alternatives but also about whether a local LLM could actually work for me. So I decided to give it a try. The tool I went with was LM Studio, mainly because it was really easy to install on my current OS.

Too many options, too many opinions

When there are too many options and too many opinions about something, I sometimes just ignore them all. Instead, I try to focus on what the "real goal" or purpose is, find the heuristics or rules of thumb I can use and then, and only then, test the waters again.

Something else that really helped: amidst the chaos of model variants out there, LM Studio offers recommendations based on your hardware, as well as staff picks, and I found both to be a great starting point.

The "vibe check"

For those of you who don't know, Rustlings is a small collection of exercises that are meant to be completed as you advance through the Rust book.

Now, I avoid LLMs when I'm actually learning, especially when it comes to a new programming language or framework. But I was working on a few exercises when I thought that maybe they were a great way to check which models gave me the best results. A practical rule of thumb.

The method is simple: load a model, paste a couple of broken exercises, and see what it spits out. But here is the important bit: then I ask for explanations, alternatives and justifications.

At the end of the day, Rustlings is public and the solutions are available all over the Internet. Just checking that the LLM writes down the right code and calling it a day would be a bit meaningless. But by probing the reasoning and checking the alternatives it suggests, I got a real taste of the model's actual flexibility and "intelligence".
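To make that concrete, here is a rough sketch of the kind of thing I mean. The exercise below is a made-up one in the style of Rustlings (it is not copied from the real set, and the function names are mine). The broken version is what gets pasted to the model; the two functions are the sort of answers I'd hope to get back, first the minimal fix and then an alternative when I ask for one.

```rust
// Hypothetical exercise in the style of Rustlings, written for illustration.
//
// Broken version, as it would be pasted to the model:
//
//     fn total_length(words: &[String]) -> usize {
//         let total = 0;
//         for w in words {
//             total += w.len(); // does not compile: `total` is not declared `mut`
//         }
//         total
//     }

// Answer 1: the minimal fix, make the accumulator mutable.
fn total_length(words: &[String]) -> usize {
    let mut total = 0;
    for w in words {
        total += w.len();
    }
    total
}

// Answer 2: an alternative a model might suggest when asked for one,
// an iterator chain that needs no mutable variable at all.
fn total_length_iter(words: &[String]) -> usize {
    words.iter().map(|w| w.len()).sum()
}

fn main() {
    let words = vec![String::from("rust"), String::from("lings")];
    // Both versions should agree on the summed byte length of the words.
    println!("{}", total_length(&words));
    println!("{}", total_length_iter(&words));
}
```

The interesting part is not whether the model produces the first answer, but whether it can explain why `mut` is needed, offer something like the second answer, and justify when it would pick one over the other.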

In my case, the best one has been google/gemma-4-e4b. It's small and fast enough that I don't need the power-hungry dedicated GPU of my laptop, and it has been working really well for me.

Final Thoughts

What I want to say with this article is that:

  1. Local LLMs can do the work perfectly fine if your requirements are realistic and focused (like code linting or some "rubber ducking").
  2. Instead of leaderboards, focus on your specific use case and workflows.