
🟡 11Labs AI Sound Effects In-Depth Review: Farewell to the Silent Film Era?

😀 I got beta access to ElevenLabs' AI sound effects!

Sora's text-to-video continues to stir up the AI community. Everyone is eagerly awaiting the release of OpenAI's Sora, and every demo released so far has been shared widely!

AI has moved from text to images and now to video, signaling the imminent arrival of the AI era. At the same time, another element has been gaining prominence recently: sound effects!

https://player.bilibili.com/player.html?aid=1151575440&bvid=BV1uZ421Y77V&cid=1468661973&p=1&high_quality=1&autoplay=0

  • On February 18th, ElevenLabs released a semi-automatic AI sound effects video, bringing "sound" to the Sora universe.
  • On February 27th, Pika (an AI video production platform) introduced sound lip sync functionality.
  • On March 10th, Pika integrated AI sound effects.

https://player.bilibili.com/player.html?aid=1051540048&bvid=BV1eH4y157Pa&cid=1468662147&p=1&high_quality=1&autoplay=0

Just as I was considering "investing" in Pika to try its sound effects and hear the "cyber-fried bacon 🥓," ElevenLabs surprised me with an invitation (internal beta access) to its newly updated feature, "Sound Effects."

Given the matching name, plus Pika's earlier lip sync feature, it's hard not to connect the "Sound Effects" beta I received with the technology behind Pika.

My first thought was to add sound effects to all the recent Sora videos, including the new Sora and Apple Vision Pro experiences.

Sora's Latest Videos in March:

  • "A dragon made of bubbles, perfectly rendered 8k."

https://player.bilibili.com/player.html?aid=1001613338&bvid=BV18x4y1Q7M8&cid=1468661963&p=1&high_quality=1&autoplay=0

  • "A transparent landscape turtle crawls on the beach."

https://player.bilibili.com/player.html?aid=1401690844&bvid=BV1Ur421J7gL&cid=1468661970&p=1&high_quality=1&autoplay=0

  • "An alien blending in naturally with New York City, paranoia thriller style, 35mm film."

https://player.bilibili.com/player.html?aid=1851698528&bvid=BV16W421c7RN&cid=1468662035&p=1&high_quality=1&autoplay=0

Isn't it impressive? Sora's video scenes generated from prompts are realistic, hiding any trace of AI, yet the content is creatively mind-blowing!

However, the absence of sound effects makes the videos feel incomplete!

Hands-On Experience

Next, let me give you a sneak peek into the magic of AI sound effects!

Above is the interface you see after opening ElevenLabs Sound Effects. There are no usage instructions, only a box for entering a prompt, apparently for generating sound effects!

Pika works differently from ElevenLabs: it either takes a video or image as input and generates a description from which it produces sound effects, or it reuses the video's own prompt to generate sound effects while the video is being generated.

Given that, I want to tailor the sound effects for the newly released "dreamy bubble-breathing dragon" video.

ElevenLabs Sound Effects Deep Test

  1. First, I took the simplest approach and reused the existing Sora video prompt as-is.

🤠 A dragon made of bubbles, perfectly rendered 8k.

https://player.bilibili.com/player.html?aid=1301710453&bvid=BV1tu4m1g7BY&cid=1468662155&p=1&high_quality=1&autoplay=0

After a few seconds, 5 sound effects were generated. Although all 5 are related to bubbles, they don't fit the video very well, and some clips are only 1 second long.

So is ElevenLabs similar to Midjourney v5, where I should use the most representative words to describe the sound effects I want?

Writing prompts this way did work better (while writing this article, I noticed that the prompt box itself says "describe your sound effect"): both the sound of bubbles bursting and the sound of dripping water came through.

https://player.bilibili.com/player.html?aid=1901672117&bvid=BV1Dm411d7cF&cid=1468661975&p=1&high_quality=1&autoplay=0

Generating the sounds this way, one for each group of frames, also makes them easier to adjust.

ElevenLabs + Sora!

Following this idea, we can also put the sound descriptions for different shots into a single prompt and generate the sound effects for a whole video in one go.

  • Gentle rustling sand, subtle shell movement, soft sea breeze, rolling wave whispers, distant bird calls.

https://player.bilibili.com/player.html?aid=1351631404&bvid=BV1T6421c7ZJ&cid=1468661969&p=1&high_quality=1&autoplay=0
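To make the "one prompt for the whole video" idea concrete, here is a minimal sketch in Python. It only assembles the text that would be pasted into the Sound Effects prompt box; the helper name `build_sound_prompt` is made up for illustration and is not part of any ElevenLabs tool.

```python
def build_sound_prompt(shot_descriptions):
    """Join per-shot sound descriptions into one comma-separated prompt.

    Hypothetical helper: it only builds the prompt text and does not
    call any ElevenLabs service.
    """
    # Drop empty entries and trailing punctuation so the join stays clean.
    cleaned = [d.strip().rstrip(".,") for d in shot_descriptions if d.strip()]
    if not cleaned:
        return ""
    return ", ".join(cleaned) + "."


# One short sound description per shot of the turtle video.
shots = [
    "Gentle rustling sand",
    "subtle shell movement",
    "soft sea breeze",
    "rolling wave whispers",
    "distant bird calls",
]

prompt = build_sound_prompt(shots)
print(prompt)
```

Keeping the descriptions in a list makes it easy to swap out a single shot's sound and rebuild the prompt.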

In fact, it is more convenient to adjust each shot separately! This 20s video is divided into 5 different sound effects.

  1. City background noise: car horns, crowd noises, subway rumbles, pedestrian footsteps.
  2. The sound of silent footsteps and rubber friction.
  3. Electronic buzz, low-frequency buzz, rustling of clothes.
  4. The heart beats faster and the atmosphere is tense.
  5. The sound of sirens, the horns of fire engines, and the roar of helicopters.
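The per-shot approach above can be sketched as a simple timeline. This is only an illustration: it assumes the 20s video is split into five equal 4s slots, which is a simplification (the real shot lengths are not given here), and `SoundSegment` and `plan_segments` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class SoundSegment:
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    prompt: str   # text to enter in the Sound Effects prompt box


def plan_segments(prompts, total_seconds):
    """Give each prompt an equal slice of the video (an assumption)."""
    if not prompts:
        return []
    length = total_seconds / len(prompts)
    return [
        SoundSegment(i * length, (i + 1) * length, p)
        for i, p in enumerate(prompts)
    ]


prompts = [
    "City background noise: car horns, crowd noises, subway rumbles, pedestrian footsteps.",
    "The sound of silent footsteps and rubber friction.",
    "Electronic buzz, low-frequency buzz, rustling of clothes.",
    "The heart beats faster and the atmosphere is tense.",
    "The sound of sirens, the horns of fire engines, and the roar of helicopters.",
]

segments = plan_segments(prompts, total_seconds=20.0)
for seg in segments:
    print(f"{seg.start:4.1f}s - {seg.end:4.1f}s  {seg.prompt}")
```

Each generated clip can then be placed at its `start` time in a video editor, and any single segment can be regenerated without touching the others.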

Reinterpret this video through sound!

https://player.bilibili.com/player.html?aid=1101687647&bvid=BV1Dw4m1d7Av&cid=1468662142&p=1&high_quality=1&autoplay=0

Final Thoughts

AI video keeps filling in the missing pieces of its jigsaw puzzle: from image-to-video to text-to-video, text-to-music, and text-to-sound-effects.

I believe that, like many AI products we have seen before, these tools will be optimized and integrated very quickly and become part of the AI video workflow.

When that happens, we will have "one-click" AI video generation in the true sense!

💡 If you have questions about installing or using 11Labs, feel free to leave a message in the comment area at the bottom so we can discuss~