💻 Use AI language models free of charge without the internet
PLUS: Be careful when analyzing data with AI
Hi AInauts,
Welcome to the latest issue of your favorite newsletter.
Today's issue is packed with practical tips on language models: how to use them offline for free, and why it's important to know the strengths and weaknesses of each AI language model - especially if you want to work with them!
Here are the topics:
🤖 Your own AI supercomputer from NVIDIA
💻 Use AI language models for free without the internet
⚠️ Be careful when analyzing data with AI
Here we go!
🤖 Your own AI supercomputer from NVIDIA
We of course have to start with NVIDIA CEO Jensen Huang's keynote at CES in Las Vegas.
As always with NVIDIA, it was often very technical and definitely above our pay grade. Still, we kept thinking, "Wow, this is magic!" - it's extremely fun to listen to a genius like him.
For all non-techies (like us), here's a summary of the most important points ... and if you want all the details, here's the whole keynote and here's the supercut in 10 minutes.

The picture above shows very well what stage of development we are at in terms of AI. While most of us are still busy with generative AI - i.e. content creation, marketing and chatbots - NVIDIA is already two steps ahead.
The next phase is AI agents, which we will also be focusing on this year. And because all the prerequisites are in place, physical AI is already at work at NVIDIA. In other words: robots, self-driving cars, automated factories that understand the world around them.
It's so cool to see what is already reality today and how quickly the field is developing.
Using AI locally with your own supercomputer
The most exciting topic for us, however, was "Project Digits". This is NVIDIA's AI supercomputer for individuals.

We'll spare you the full spec sheet - 1 petaflop of compute power, 128 GB of memory... Just remember: this little thing has serious power!

You can use it to run many AI language models, image models, etc. locally on your own hardware - without the cloud or internet - at a decent speed.
Best of all, it is due to be launched on the market in May 2025. Of course, with an estimated starting price of 3,000 dollars, it's not a bargain - but it's an interesting investment for real nerds.
And if you don't want to invest 3,000 dollars - or simply want to start using AI locally on your own computer today - read on. 👇
💻 Use AI language models free of charge without the internet
We regularly receive questions from the community about the best way to run AI models locally on your own computer.
In other words: 100% data-protection-compliant, no internet required, etc. Before we describe the easiest way to do this today, one important caveat:
AI needs good hardware!
Without a good computer, you won't have much fun. In our opinion, the best deal in terms of price vs. AI power can be found at Apple of all places:

The cheapest version of the Mac Mini with the M4 chip should be more than sufficient for most requirements. It gives you incredible power for language and image models, at a very fair price.
Smaller language models and image models also run on much older computers. Make sure to give it a try before you invest.
How to quickly set up a language model locally
The easiest and fastest free option right now is Ollama.

Simply download the app and install it. Then open your terminal or console and enter the following command:
ollama run llama3.2
This automatically downloads the Llama 3.2 model and you can chat with it locally and free of charge directly in the console.
And at just under 2 gigabytes, the model doesn't take up much space on your hard disk either.
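By the way, once the model is running, Ollama also exposes a local REST API (by default on port 11434), so you can script your prompts instead of typing them into the console. Here's a minimal sketch using only the Python standard library, assuming a default local Ollama install; the helper names are our own:

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot generation
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(prompt, model="llama3.2"):
    # stream=False asks Ollama to return the full answer as one JSON object
    return {"model": model, "prompt": prompt, "stream": False}

def ask_ollama(prompt, model="llama3.2"):
    data = json.dumps(build_payload(prompt, model)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires Ollama running locally):
# print(ask_ollama("Explain local LLMs in one sentence."))
```

Everything stays on your machine - the request never leaves localhost.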

On the Ollama website you can see many other models that you can download and use immediately. You'll also find Mistral and DeepSeek, which are ideal for programming, and many, many more.
How to set up a proper chat interface
If you don't feel like chatting in a console window and are looking for a ChatGPT-style tool instead, take a look at Chatbot Ollama or Lobe Chat.

Both are open source and free of charge. If you are not so technically savvy, it is best to use the Pinokio browser and install the tools in 2 minutes.
⚠️ Caution when analyzing data with AI
Let's move on to another practical use case. We have often written about the fact that you should use different chatbots and language models depending on the use case.
For example, in our opinion, Anthropic's Claude 3.5 Sonnet is still the best model for advertising and marketing copy.
For complex questions, it is best to use OpenAI's o1.
For extensive data and long conversations, use Google Gemini 2.0 Flash.
For math topics and programming, it is best to use the inexpensive DeepSeek V3 or Meta Llama 3.3.
And if you are looking for a rather uncensored model with few guardrails, Nous Hermes or xAI Grok 2 are good choices.
The importance of model selection becomes clear when we analyze different types of data.

Here's a practical case: filter duplicates and possible test leads from a (previously anonymized) lead list, then determine the total number per lead type and time period.
In our example below, the correct result is 42.
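To make the task concrete, here's what such a cleanup looks like in code - a minimal sketch with a tiny hypothetical lead list (the field names, values, and dedup rule are our own illustrative assumptions, not the actual dataset from the test):

```python
from collections import Counter

# Hypothetical mini lead list; fields and values are illustrative only.
leads = [
    {"email": "a@example.com",    "type": "webinar",  "month": "2024-11"},
    {"email": "a@example.com",    "type": "webinar",  "month": "2024-11"},  # duplicate
    {"email": "test@example.com", "type": "download", "month": "2024-11"},  # test lead
    {"email": "b@example.com",    "type": "download", "month": "2024-12"},
    {"email": "c@example.com",    "type": "webinar",  "month": "2024-12"},
]

def clean_and_count(leads):
    seen = set()
    counts = Counter()
    for lead in leads:
        key = (lead["email"], lead["type"])          # duplicate = same email + type
        if "test" in lead["email"] or key in seen:   # drop test leads and duplicates
            continue
        seen.add(key)
        counts[(lead["type"], lead["month"])] += 1   # tally per type and period
    return counts

print(clean_and_count(leads))
# Counter({('webinar', '2024-11'): 1, ('download', '2024-12'): 1, ('webinar', '2024-12'): 1})
```

A deterministic script like this gets the count right every time - which is exactly the bar the language models below have to clear.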
How the different models perform
After a thoroughly plausible analysis, Claude confidently arrives at 41.

After asking again whether this is really the final answer, the model counts again and corrects itself to 38. 😅

Gemini is more conservative and comes in at 39.

When asked again, Gemini finds one more lead.

Grok also comes up with 39, and remains confident, even after being asked if this is the final answer.

Did any model get everything right? Yes: OpenAI's o1 arrived at the correct result and was not dissuaded, even when we asked again.

Ok, sure, that's a bit unfair - o1 is a reasoning model, the others are not. But Gemini's reasoning model still arrives at 45 leads instead of 42.
In practice, however, we see time and again that the same model is used for all kinds of requirements. As the saying goes: "If your only tool is a hammer, every problem looks like a nail."
How can errors and poor results be minimized?
There is (currently) no 100% certainty or guarantee of truth when using LLMs. But you can significantly improve your results in three simple ways.
Use different models, depending on the use case and based on their strengths.
We use model stacking in our automations: one model produces a result, and a second model then checks it for plausibility and improves it if necessary.
For one-off tasks, you can also use OpenRouter to send a request to multiple models at once and compare the answers directly in the chat.
This approach should give you better results.
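The stacking idea can be sketched as a simple two-stage pipeline. This is a toy illustration with stand-in functions in place of real model calls (in practice each stage would be an API request to a different model):

```python
def draft_model(task):
    # Stand-in for the first model's answer; in reality this would be
    # an API call to e.g. Claude or Gemini.
    return {"answer": 41, "reasoning": "counted rows after dedup"}

def review_model(task, draft):
    # Stand-in for a second model that re-checks the draft for plausibility
    # and corrects it if necessary.
    checked = dict(draft)
    checked["answer"] = 42      # the reviewer catches the miscount
    checked["verified"] = True
    return checked

def stacked(task):
    draft = draft_model(task)
    return review_model(task, draft)

result = stacked("Count unique leads per type and period")
print(result["answer"])  # 42
```

The pattern is cheap insurance: even if the first model slips, the second pass catches implausible results before they reach your report.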

We made it! But no need to be sad. The AInauts will be back soon, with new stuff for you.
Reto & Fabian from the AInauts
P.S.: Follow us on social media - that motivates us to keep going 🙏!
Twitter, LinkedIn, Facebook, Insta, YouTube, TikTok
Your feedback is essential for us. We read EVERY comment - just reply to this email. Tell us what was (not) good and what is interesting for YOU.
🙏 Please rate this issue: Your feedback is our rocket fuel - to the moon and beyond!