- AInauten.net
- Posts
- 🤖 This AI can do more than just speak, hear and see
🤖 This AI can do more than just speak, hear and see
PLUS: Our new favorite (AI) app

AI-HOI and good morning, AInauts!
Ready for some exciting new AI topics and some practical insights?
That's what we have in store for you today:
🤖 This AI can do more than just speak, hear and see
🤓 Why everyone should work with AI APIs
🎧 New favorite (AI) app
📰 AI-News-Quickie: The HAI-lights
Let's go!
🤖 This AI can do more than just speak, hear and see
Large language models (LLMs) have basically read the entire Internet and devoured millions of books.
But if you ask them for details about current events in the real world, they are not very useful for the most part …
The start-up Archetype.ai is here to change that.

Their goal is to make AI useful not only for the digital world, but also for the physical world.
They’ve just presented Newton, "the first fundamental model that understands the physical world". Say what now? Let’s unpack this …
Why is it useful?
Imagine a factory with 100 sensors. Previously, you had to evaluate them all individually to check whether everything was running smoothly. Newton on the other hand can understand, evaluate and contextualize them all simultaneously in real time.
The cool thing is that you can interact with the model in very simple language, thanks to LLMs! Which could lead to fewer experts being needed in factories, because you can simply have a conversation in natural language.

Newton can record data from radar, motion detectors, chemical and environmental sensors and combines this input with a language model. It combines all the data from the various sensors into a model of the physical world.
Archetype has already found some practical applications and apparently counts Volkswagen and Infineon among its customers.
Combine this with humanoid robots and AI has really arrived in the physical world!
Check out this video to get a better understanding of the capabilities.
🤓 Why everyone should work with AI APIs
We're biiiiiiig fans of everything automation.
AI in particular opens up the floodgates to an incredible amount of possibilities!
There is a specialized tool for practically every use case - but at some point you have so many tool subscriptions that it costs a fortune.
This is where APIs come into play. Almost all major AI providers offer APIs. And don't worry, these three letters are half as technical as you might think at first glance.

APIs are a type of code interface which allow you to directly access the desired models such as GPT-4, or Claude Opus etc. Providers like Replicate make many models conveniently available via API.
The big advantage of APIs in general is that you pay based on usage and are not trapped with monthly subscription fees. They are generally very inexpensive.
In addition, with a few APIs, you can quickly put together many tool functions yourself. In other words, you no longer need these pricey tools!
The big disadvantage: Until now, you had to have a technical understanding on how to work with APIs.
Now, this is where no-code apps come into play! Anyone can use them, without much technical understanding.
You can build automations and workflows - and stitch together different AI models, tools, etc. with each with a few clicks. All without programming!

Here is a current example …
A friend of ours unfortunately doesn't speak English. But he really wanted to have some English lectures translated and listen to them in German, instead of just reading them.
How can this be achieved? The easiest and best way is: ElevenLabs
Just upload the audio file, select a target language, and off you go.
The disadvantage: You need a subscription with ElevenLabs, and since some lectures are quite long, this can get pricey.
This is where APIs come in handy! We built a very simple workaround for our friend with Zapier and OpenAI, in just 5 minutes.
This is how the workflow looks like:

Whenever he has a new lecture, he simply uploads the file to a special Google Drive folder.
We use OpenAI Whisper to create a transcript of the audio file.
ChatGPT then translates the transcript into German.
After that, Whisper creates a German audio file based on the transcript.
This MP3 file is then uploaded back to Google Drive and our friend can listen to it.
And all we need for this workflow is a Zapier account and an OpenAI API key!
This makes a difference in terms of price. We would have paid about 5 dollars for our test with ElevenLabs. On the other, this is how much it was with OpenAI:

Thanks to the APIs, we could also easily try out Claude Opus (the best Anthropic model) instead of OpenAIs GPT, without having a monthly subscription.
This was just a quick test, but as you can see, the combination of no-code tools and APIs is an incredibly powerful one. You can automate entire companies this way!
If you are interested in this type of content, please let us know and reply to this e-mail. We want to get a better feeling if this is relevant for you.
P.S. Sure, Zapier also costs something if you want to use the so-called Premium Apps. Make is cheaper, and there are also lesser known alternatives or open source tools. Our approach is that we prefer to pay for one tool that can be used to build any type of scenario - instead of paying for 50 subscriptions, which all just cover narrow and specific use cases.
🎧 New favorite (AI) app
We debated if his app was "newsletter-worthy" …
However, since we are absolute fans and use it every day, we thought this could also be of interest to you. We are talking about Snipd.

Snipd is a podcast player that makes it extremely easy for you to digest learnings from podcasts.
At the center is a non-AI feature that allows you to save short podcast highlights that you want to remember.
In other words: if you are listening to something exciting, simply press the snip button or activate it via your headset.
Then the AI comes into play.
Snipd automatically creates a transcript of the snippet, generates a summary and gives it a title.

Other cool features round off the tool:
Readable and audible summaries of entire podcasts
Chapter divisions also for podcasts that do not yet do this
Easily export your highlights to your favorite note-taking app.
Here's the app again. (The free version is also quite usable, and available for Android and iPhone.)
We love the appt, and it's a great example of how AI makes everyday life a bit better. At least for geeks like us.
📰 AI-News-Quickie: The HAI-lights
Before we come to a close, here are a some easily digestible news tidbits!
krea.ai has released a cool update that lets you combine up to three images and weight them in real time. Check out the video,
announcing multi-image prompts in real-time.
now you can use up to 3 images to condition your generations with our new "HD" model. twitter.com/i/web/status/1…
— KREA AI (@krea_ai)
6:36 PM • Apr 4, 2024
How good is OpenAIs Sora really? Here's an interview with the makers of the viral "Air Head" clip.
'Reggaeton Be Gone'! A resourceful programmer has built an AI machine out of necessity that automatically mutes his neighbor's music 😁 …
Cool OpenAI Sora video - with music! However, the music is not generated via AI, but rather from the musician August Kamp.
Video: What is Google doing in the healthcare field? Quite a lot, it turns out - and, of course, with the help of AI. Nevertheless, Big G has lost a lot of ground in the AI field if you look at this analysis by the Financial Times.
Apple Vision Pro Personas can now roam freely in other apps with VisionOS 1.1. These "Spatial Personas" convey the feeling of being in the same room as other users.

And then there was Higgsfield ... a new video AI app that we'll be keeping on our radar (iOS app here).
Hi! We’re Higgsfield - a Video AI company that's democratizing social video creation to everyone.
Our game changing foundational model excels at creating personalized characters with lifelike motion - with just 1 selfie and all on mobile.
We bring any story to life. Watch👇
— Higgsfield AI (@higgsfield_ai)
2:00 PM • Apr 3, 2024
The Indiana Pacers NBA basketball team used Snapchat AI filters to make it look like Los Angeles Lakers fans were crying during the game 😂.
🚨AI IN REAL-TIME
NBA Pacers used Snapchat AI filters to make it look like Los Angeles Lakers fans were crying during the game. twitter.com/i/web/status/1…
— Mario Nawfal (@MarioNawfal)
7:46 PM • Mar 30, 2024
Ghost Autonomy, an OpenAI-backed start-up for autonomous driving software, has thrown in the towel - it failed due to technical and financial hurdles.
Waymo, on the other hand, is making progress with self-driving cars. After the offering in California was recently expanded from San Francisco to LA, the company is now cooperating in Phoenix, Arizona, with Uber Eats as an autonomous delivery service.
🤭 Last, but not least … some AI-Fun
YAI!
That’s it for today! But no need to be sad. We’ll be back soon, with fresh insights and content for you.
See you soon, Fabian & Reto
Your feedback is essential for us. We read EVERY comment and feedback, just respond to this email. Tell us what was (not) good and what is interesting for YOU.
🌠 Please rate this issue:Your feedback is our rocket fuel - to the moon and beyond! |