Weekly AI news: Did you miss it?!
The most important AI updates at a glance
AI-HOI and happy weekend, dear AInauts!
Today we've designed the weekend edition a little bit differently.
We thought we'd briefly summarize each of our posts (instead of reproducing them in full) - and also give you an overview of the most important news updates. Everything is happening sooo fast ...
Please reply to this email and let us know if you like this format with short news items!
A quantum leap in voice interaction, but still blocked in the EU! We're talking about ChatGPT's Advanced Voice Mode. With a VPN on your phone and a US IP address, though, you can easily get access to the new features. Voice Mode offers numerous cool use cases, e.g. learning assistance, conversation role play and coaching. You really have to try it out! Currently only for Plus users, but soon also available for free accounts.

OpenAI is in the spotlight! Sam Altman publishes his vision of the future; new models outperform human intelligence, and the ChatGPT Advanced Voice mode is rolled out.
At the same time, there is drama: top executives such as CTO Mira Murati resign, and rumors circulate about OpenAI becoming a for-profit company. What's more, a fresh 6.6 billion US dollar cash injection (!) is now in the bag.
Sam is playing 4D chess, but what does that mean for the original mission of developing AI for the benefit of humanity? His blog post "The Intelligence Age" highlights the dilemma: technological progress vs. ethical responsibility ...

The most important things from OpenAI Dev Day
At Dev Day, OpenAI presented the new Realtime API, which enables low-latency speech-to-speech apps. It processes speech directly (without converting it to text first) and responds quickly, including emotions.
The new interface is already available and promises exciting new possibilities for voice applications. This means that practically any app can now be equipped with Advanced Voice Mode! The A(P)I thus clears the last hurdle to replacing call center employees almost completely ...
The ElevenLabs Reader app on cell phones is positioned as an alternative to Audible. It offers out-of-the-box access to various content and books that can be read aloud directly. But you can also import your own URLs and texts and convert them directly into audio files. The app supports several celebrity voices and is constantly evolving. Install and use it now!
We use Loom for asynchronous working and documentation - perfect for boosting your productivity. It records your screen and camera and automatically creates transcripts, chapters and summaries. You can use it to easily generate SOPs and document processes. Even cooler: By combining it with ChatGPT and automation tools such as Zapier or Make, tasks can be completely automated. A really valuable helper for more efficient work that we wouldn't want to be without!

AI news quickie: HAI highlights from the industry
OpenAI
Sam is consolidating his power and the biggest venture capital deal of all time has been finalized: 6.6 billion US dollars for OpenAI, at a valuation of 157 billion.
Elon had recently raised "only" 6 billion for xAI. Bubble? This is what OpenAI Chair Bret Taylor thinks.
Canvas was released directly after Dev Day (initially only for paying users) - the answer to Claude's Artifacts, so to speak. This allows you to work even more collaboratively with ChatGPT. Very exciting!
Two open-source libraries were used in the development of Canvas - and OpenAI is sponsoring the developer behind them.
Advanced Voice Mode is also coming for free users - and for the EU there is at least the update: "Stay tuned!"
Even more brain drain: Sora Research Lead Tim Brooks moves to Google DeepMind, and former OpenAI co-founder Durk Kingma joins Anthropic.
The researchers behind o1 provide insights into their research. Can you feel the AGI?
Video & Images
PIKA 1.5 has caused a stir with a clever new release - you can now apply effects such as melting, blowing up, kneading or crushing objects - or even cutting them up like a cake. They have strategically captured a niche here: TikTok, Insta and co. will be flooded with such videos.
Hyper-fast AI videos are available with the LumaLabs Dream Machine v1.6 - full quality clips can be created in less than 20 seconds (for paying subscribers and API users).
HeyGen has released a new Avatar Looks feature - allowing you to put your digital twin in all kinds of situations. Practically indistinguishable from a "real" video ...
Meta presents Movie Gen - and the entire AI influencer gang is full of superlatives. But see for yourself, it turned out really well - more details here. But you can't use it yet ...
Hedra has released Character-2. Horizontal and vertical videos are now supported, and the quality is better. Up to 4 minutes of video for free!
KLING has also received new updates, such as improved lip-syncing.
ByteDance, the TikTok parent company, has introduced the new PixelDance video generator - check out the sample video here, impressive.
Want to make videos like these dancing shoelaces, too? You can find instructions in this Reddit thread and here on YouTube. Have fun!
Black Forest Labs has struck again and released the new FLUX 1.1 image model - accessible via Replicate and fal.ai, but also via their own API. 6x faster and even better image quality!
Adobe Photoshop and Premiere get more AI updates - and if you use Canva, you should get to know these features.
Chatbots & voice models
You can find out which model performs best at https://artificialanalysis.ai - for text, image and voice models.
Claude Artifacts can now fix its mistakes with one click - nice!
Google's Gemini is being rolled out more widely and is available on more devices in over 40 languages - including German, French, ...
Presumably, there will soon be an o1 competitor from Google that can think and act like humans.
Google Lens can process questions about videos, and Gmail is getting more Gemini power.
The Google Gemini 1.5 Flash 8B model is now "production-ready".
Microsoft pushes the ubiquitous AI assistants into their product range and onto Windows - and takes another run at it with the "Recall" feature that records your screen. Copilot takes center stage, with voice and vision - more details here.
Industry & Research
NVIDIA CEO Jensen Huang finds clear words: "We are at the beginning of the next wave of AI!"
In any case, prices are already getting cheaper from week to week. Since the last OpenAI Dev Day, the cost per token has been reduced by 98% (from GPT-4 to 4o-mini)!
Google DeepMind's chip design method AlphaChip uses AI to design new chips!
Macron warns: If the EU does not catch up to the USA in the field of AI, it could have serious consequences!
Wikipedia is struggling with the flood of AI-generated content and has launched the "AI Cleanup" project.
Would it be good if AI replaced doctors? Opinions are divided, but the data shows which way it is going.
Your AInauts,
Fabian & Reto
Your feedback is essential for us. We read EVERY comment and piece of feedback - just reply to this email. Tell us what was (not) good and what is interesting for YOU.