
Meta’s Multisensory AI Model: Linking Six Types of Data for Immersive Experiences

Hello, my tech-loving followers, it's your favorite funny guy Nuked here to talk about Meta's latest announcement. They've just open-sourced a new AI model that combines six different types of data into a single representation, a step toward truly multisensory AI. This research project is a glimpse into the future of generative AI systems, and it shows that Meta is still sharing their AI research while rivals like OpenAI and Google have become more secretive.

The core concept of this research is to link multiple types of data into a single multidimensional index, also known as an embedding space, which is the same concept behind the recent boom in generative AI. Multimodal AI models are at the heart of this boom: tools like DALL-E, Stable Diffusion, and Midjourney all rely on systems that link together text and images during the training stage.
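To make that "shared embedding space" idea concrete, here's a minimal, hypothetical sketch of the CLIP-style contrastive step these systems build on. The encoders, feature dimensions, and temperature are toy values invented for illustration, not Meta's actual code: two encoders are trained so that matching text/image pairs land close together in one vector space while mismatched pairs get pushed apart.

```python
import torch
import torch.nn.functional as F

# Toy encoders: real systems use large transformers/CNNs. These are
# single linear layers, just to show the shapes involved.
text_encoder = torch.nn.Linear(300, 128)    # 300-dim text features -> shared 128-dim space
image_encoder = torch.nn.Linear(2048, 128)  # 2048-dim image features -> same 128-dim space

def contrastive_loss(text_feats, image_feats, temperature=0.07):
    # Project both modalities into the shared space and L2-normalize,
    # so similarity is a plain dot product (cosine similarity).
    t = F.normalize(text_encoder(text_feats), dim=-1)
    i = F.normalize(image_encoder(image_feats), dim=-1)

    # Pairwise similarity matrix: row k should peak at column k,
    # because the k-th caption describes the k-th image.
    logits = t @ i.T / temperature
    targets = torch.arange(len(logits))

    # Symmetric cross-entropy pulls matching pairs together and pushes
    # mismatched pairs apart -- this is what "links" the two modalities.
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.T, targets)) / 2

# One fake training step on a batch of four caption/image pairs.
loss = contrastive_loss(torch.randn(4, 300), torch.randn(4, 2048))
loss.backward()
```

ImageBind's contribution is scaling this two-modality recipe up to six.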

Meta's new model, ImageBind, is the first to combine six types of data into a single embedding space: visual (both image and video), thermal (infrared images), text, audio, depth information, and movement readings from an inertial measurement unit (IMU). Folding all of this into one space means future AI systems will be able to cross-reference these modalities the same way current systems cross-reference text inputs.
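Because the model is open-sourced, you can poke at that shared space yourself. The sketch below is lightly adapted from the usage example in Meta's ImageBind repository (github.com/facebookresearch/ImageBind); the media file paths are placeholders, and the exact module layout may vary with the repo version you install.

```python
import torch
from imagebind import data
from imagebind.models import imagebind_model
from imagebind.models.imagebind_model import ModalityType

device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Load the pretrained model (downloads the checkpoint on first use).
model = imagebind_model.imagebind_huge(pretrained=True)
model.eval()
model.to(device)

# Placeholder inputs -- swap in your own text, images, and audio clips.
text_list = ["a dog", "a car", "ocean waves"]
image_paths = ["dog.jpg", "car.jpg", "waves.jpg"]
audio_paths = ["dog.wav", "car.wav", "waves.wav"]

# Each modality has its own preprocessing, but every forward pass
# lands in the same embedding space.
inputs = {
    ModalityType.TEXT: data.load_and_transform_text(text_list, device),
    ModalityType.VISION: data.load_and_transform_vision_data(image_paths, device),
    ModalityType.AUDIO: data.load_and_transform_audio_data(audio_paths, device),
}

with torch.no_grad():
    embeddings = model(inputs)

# Cross-modal retrieval is a dot product between embeddings:
# e.g., which audio clip best matches each image?
print(torch.softmax(
    embeddings[ModalityType.VISION] @ embeddings[ModalityType.AUDIO].T, dim=-1))
```

Notably, audio and text are never paired directly during training: each modality is aligned against images, and that image-paired data is what "binds" the others together, which is where the name comes from.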

Imagine a virtual reality device that generates not only audio and visual input but also your environment and movement on a physical stage. You could ask it to emulate a long sea voyage, and it would place you on a ship: the noise of the waves in the background, the rocking of the deck under your feet, the cool ocean breeze in the air.

Meta notes in their blog post that other streams of sensory input could be added in future models, such as touch, speech, smell, and brain fMRI signals. This research brings machines one step closer to humans’ ability to learn simultaneously, holistically, and directly from many different forms of information.

While that's all very speculative, the immediate applications of this research will likely be more limited. What matters right now is that Meta is open-sourcing the underlying model, a practice that's coming under increasing scrutiny in the AI world. Opponents of open-sourcing argue it's harmful to creators, since rivals can copy their work, and potentially dangerous, since malicious actors could take advantage of state-of-the-art AI models. Advocates respond that open access lets third parties scrutinize these systems for faults and ameliorate some of their failings.

Meta has been firmly in the open-source camp, though not without difficulties: their latest language model, LLaMA, leaked online earlier this year. With ImageBind, they're continuing that strategy and giving us a glimpse of a future where generative AI systems create immersive, multisensory experiences.

Spread the AI news in the universe!
Nuked
