
🎧 OpenAI now speaks dialect - and sounds amazingly real!

PLUS: New superpowers for Claude and uproar in Hollywood

Hello AInauts,

Welcome to the latest issue of your favorite newsletter!

There are a few updates that are very exciting, especially in the larger context: the latest OpenAI text-to-speech models are really strong, Claude is catching up with ChatGPT, China is shaking up the competition with predatory pricing and Hollywood is in a tizzy because Silicon Valley bosses want looser copyright rules ...

Exciting times! This is what we have in store for you today:

  • 🗣️ The best text-to-speech models - test them now!

  • 🔥 Claude gets two new superpowers

  • 🇨🇳 Cheapest Chinese AI escalates feud between Silicon Valley and Hollywood

  • 😸 AI fun: Yarn cat

Here we go!

🗣️ The best text-to-speech models - test them now!

OpenAI has brought several innovations to its API in recent weeks, including the Agents SDK and three new audio models: two speech-to-text models that even outperform Whisper, and a new text-to-speech model.

You can try it out directly at openai.fm and download the result as an MP3 - very cool! And we thought it was even cooler that a Swiss German text was rendered in the best Bernese dialect. Here is an example of an exuberant Italian chef.
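If you would rather call the model from your own code than click through openai.fm, the request could look roughly like the sketch below. It only uses the Python standard library and targets OpenAI's speech endpoint. The model name "gpt-4o-mini-tts", the voice "coral" and the style instruction are our assumptions for illustration and are not taken from this issue, so check them against the current API documentation before relying on them.

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/audio/speech"


def build_tts_request(text, api_key,
                      model="gpt-4o-mini-tts",   # assumed model name
                      voice="coral",             # assumed voice name
                      instructions="Speak like an exuberant Italian chef."):
    """Build (but do not send) a POST request for the speech endpoint."""
    payload = {
        "model": model,
        "voice": voice,
        "input": text,
        "instructions": instructions,  # free-text style hint (assumption)
        "response_format": "mp3",
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(text, api_key, out_path="speech.mp3"):
    """Send the request and write the returned audio bytes to out_path."""
    request = build_tts_request(text, api_key)
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
    return out_path
```

Calling synthesize("Buongiorno, AInauts!", api_key) with a valid key should leave you with an MP3 file, much like the download button on openai.fm. Building the request is kept separate from sending it so you can inspect the payload without spending any credits.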

OpenAI declares war on market leader ElevenLabs

  • OpenAI has released the new models at extremely attractive prices!

  • It's much cheaper than the market leader ElevenLabs - even without a subscription!

  • While you pay between 72 and 132 dollars for 10 hours of audio at ElevenLabs, OpenAI only wants 36 dollars.
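For those who like to check the math, here is a tiny sketch that turns the figures above into a price per hour and a saving in percent. The dollar amounts are the ones quoted in this issue; the function names are made up for illustration.

```python
HOURS = 10
ELEVENLABS_RANGE_USD = (72.0, 132.0)  # cost of 10 hours of audio, as quoted above
OPENAI_USD = 36.0                     # cost of 10 hours of audio, as quoted above


def per_hour(total_usd, hours=HOURS):
    """Price for one hour of generated audio."""
    return total_usd / hours


def savings_percent(competitor_usd, openai_usd=OPENAI_USD):
    """How much cheaper OpenAI is than the competitor price, in percent."""
    return (1 - openai_usd / competitor_usd) * 100


if __name__ == "__main__":
    print(f"OpenAI: ${per_hour(OPENAI_USD):.2f} per hour")
    for price in ELEVENLABS_RANGE_USD:
        print(f"ElevenLabs at ${price:.0f}: ${per_hour(price):.2f} per hour, "
              f"OpenAI is {savings_percent(price):.1f}% cheaper")
```

In other words: OpenAI comes in at 3.60 dollars per hour, which is 50% cheaper than the lower ElevenLabs figure and roughly 73% cheaper than the upper one.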

But why pay at all when the model with the most natural voice output has now even been released as open source? We introduced Sesame.com a few weeks ago - now you can integrate it into your apps.

Our favorite tool: Now finally available for Windows!

Trick question: What is our favorite AI tool? You probably think it's ChatGPT, Claude, or something along those lines. Wrong, it's Flow from Wispr!

If you're at home in the Microsoft Windows world, there's good news: the tool is finally available for you, too!

Over the last few months, this little voice-to-text helper has given us a real productivity boost. The words you read here were recorded with it - and we "write" almost everything with it. Super helpful and highly recommended!

🔥 Claude gets two new superpowers

Anthropic is FINALLY upgrading Claude and giving it what has long been the standard for others: real-time web search!

Claude responds with a short text and shows the links to the sources used underneath. Direct quotations are also integrated, which makes verification easier - similar to what we are used to from Perplexity and ChatGPT Search.

The feature is currently only available as a preview for paying Claude users in the US and only works with Claude 3.7 Sonnet. The Brave Search Index is apparently used in the background. An expansion to other countries and free accounts is planned.

Why is it relevant?

  • For one thing, Claude has addressed its biggest shortcoming compared to ChatGPT. For another, Anthropic is also working on voice functionality for Claude, which will bring it even more in line with the features of its competitors.

  • There are references in the code to further functions and even an agent called Harmony! The new "Think" command definitely makes the model smarter.

  • The next few weeks should be exciting. OpenAI has seen the writing on the wall and has already lifted the credit system for Sora as a little treat - you can now generate unlimited videos (10 seconds and 720p).

  • Anthropic has also just completed a large financing round and received 3.5 billion dollars in fresh capital at a valuation of 61.5 billion dollars. Incidentally, Amazon holds over 35%, Google 14% and the founders together hold 17.5%.

Web search in chatbots becomes standard - what it means for Google

Now that web access is actually a standard feature of major chatbots, this raises the question of what this means for the future of web search.

Google itself is countering with "AI Mode" (currently only in English - it was OK in our initial tests, but we're sticking with Perplexity for now). Still, websites are losing advertising revenue and visitors - the classic search engine business model is imploding in slow motion before our eyes.

But even if chatbots like Perplexity and ChatGPT send far fewer visitors to websites than traditional search engines, the traffic is often better qualified - and the majority of users still primarily use Google.

Speaking of chatbots: take a look at https://duck.ai! There are some powerful models available to you free of charge and anonymously - or use them directly via the DuckDuckGo browser for Android, iOS, macOS and Windows.

via Duck.ai

🇨🇳 Cheapest Chinese AI escalates feud between Silicon Valley and Hollywood

After DeepSeek R1 and Manus, the Chinese AI giant Baidu has followed up with ERNIE 4.5 and ERNIE X1.

What makes the new Chinese models interesting?

"So what ... these are just two more Chinese models?"

Yes, but ... they are state-of-the-art according to benchmarks, and dirt cheap in comparison:

  • ERNIE 4.5 (note the version!) is a new multimodal model that understands text, images, video and audio. It is supposed to be roughly on par with GPT 4.5, but costs only 1% as much. ONE PERCENT!

  • ERNIE X1 is comparable to the DeepSeek R1 reasoning model - it also solves complex problems, but is only half the price of DeepSeek R1.

  • The ERNIE bot is free for everyone to use, and Baidu has also announced that it will make the models open source at the end of June.

Price collapse of intelligence, but benchmarks ≠ practical use

Although the models are at the top of the benchmarks, in practical tests they sometimes fail at the simplest tasks (not least, perhaps, because of the incredibly small context window of 8k tokens). In addition, API access is very cumbersome for non-Chinese users.

But even if these models cannot be compared one-to-one with those of OpenAI and co., these dramatic price differences make one thing clear above all: intelligence is getting cheaper by the week, and we are in a price war!

The new o1-pro model is 270x (!) more expensive than the DeepSeek R1 model (and correspondingly over 500x more expensive than ERNIE X1). However, despite the supposedly comparable performance, Baidu is struggling to gain widespread adoption.
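How do you get from 270x to "over 500x"? The two ratios quoted in this issue simply chain together, as this small sketch shows. The numbers come from the text above; the variable and function names are ours.

```python
O1_PRO_VS_R1 = 270.0  # o1-pro costs 270x as much as DeepSeek R1 (quoted above)
X1_VS_R1 = 0.5        # ERNIE X1 costs half as much as DeepSeek R1 (quoted above)


def o1_pro_vs_x1(o1_pro_vs_r1=O1_PRO_VS_R1, x1_vs_r1=X1_VS_R1):
    """Price ratio of o1-pro to ERNIE X1, derived by chaining the two ratios."""
    return o1_pro_vs_r1 / x1_vs_r1


if __name__ == "__main__":
    print(f"o1-pro costs about {o1_pro_vs_x1():.0f}x as much as ERNIE X1")
```

That works out to a factor of 540, which is where the "over 500x" comes from.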

Nevertheless, China's ambition to take pole position from the USA in the AI race is unbroken. On Friday, the Hunyuan T1 model was released, which also shines in various benchmarks. And in the next few weeks, DeepSeek R2 will be released, which will probably cause a sensation once again...

The price only goes in one direction: down! Intelligence, on the other hand, is climbing steeply.

More and more voices are therefore calling for a complete ban on Chinese models... which seems kind of ridiculous for an open source model that runs on your own hardware.

Hollywood vs. Silicon Valley - copyright becomes a contentious issue

If only it weren't for the issue of copyright: last week, OpenAI and Google begged the Trump administration to declare the training of AI "fair use" in light of the copyright infringement lawsuits. Otherwise, according to OpenAI, the race with China is over, because regulation slows down innovation!

On the other side is Hollywood. More than 400 celebrities have signed an open letter to the White House warning that this could threaten the creative industry. Their message: AI companies must pay for the use of copyrighted works, just like everyone else.

One thing is certain: the decisions on copyright will have far-reaching consequences for the future of the (American) AI industry!

via imgflip.com

😸 AI fun: Yarn cat

Alright, after all the feature updates and behind-the-scenes wrangling, let's wrap up today with this gem we found on Reddit. Have a nice day!

We made it! But no need to be sad. The AInauts will be back soon, with new stuff for you.

Reto & Fabian from the AInauts

P.S.: Follow us on social media - that motivates us to keep going 😁!
X, LinkedIn, Facebook, Insta, YouTube, TikTok

Your feedback is essential for us. We read EVERY comment and every piece of feedback - just reply to this email. Tell us what was (not) good and what is interesting for YOU.
