Google’s Gemini 1.5 and Meta’s V-JEPA Redefine AI Standards


Google and Meta have announced new advancements in artificial intelligence (AI). The two tech giants are each pushing the field forward with innovative new models.

Google’s latest offering, Gemini 1.5, was recently unveiled by Demis Hassabis, CEO of Google DeepMind. The new AI model, built on a Transformer-based Mixture of Experts (MoE) architecture, is better at understanding large amounts of information from different sources. Whereas its predecessor, Gemini 1.0, had a context window of 32,000 tokens, Gemini 1.5 comes with a substantially larger 128,000-token context window. Tokens are the pieces of text or other data the model processes, and a larger context window lets it draw on more information at once to give better, more helpful results. Google also announced a version of Gemini 1.5 with a context window of up to 1 million tokens, catering to select developers and enterprise clients in a private preview.
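To make the idea of a context window concrete, here is a minimal, purely illustrative Python sketch. The whitespace "tokenizer" and the truncation rule are stand-ins for explanation only, not Gemini's actual tokenizer or behavior; the window sizes match the figures quoted above.

```python
# Toy illustration of a context window: a model can only attend to at most
# `context_window` tokens of its input at once. Whitespace splitting is a
# stand-in for a real subword tokenizer.

def tokenize(text: str) -> list[str]:
    # Real models use learned subword tokenizers; this is only illustrative.
    return text.split()

def fit_to_context(tokens: list[str], context_window: int) -> list[str]:
    # Keep only the most recent tokens that fit inside the window.
    return tokens[-context_window:]

document = "word " * 200_000  # a hypothetical very long input
tokens = tokenize(document)

print(len(fit_to_context(tokens, 32_000)))     # Gemini 1.0-sized window -> 32000
print(len(fit_to_context(tokens, 128_000)))    # Gemini 1.5-sized window -> 128000
print(len(fit_to_context(tokens, 1_000_000)))  # 1M-token preview window -> 200000 (all of it fits)
```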

Meta introduced a new model called V-JEPA, which helps machines learn more effectively from video. It is a significant step forward in how machines understand the world through images and video. Unlike traditional generative AI models, V-JEPA functions as a teaching method, enabling machine learning systems to comprehend and model the physical world by analyzing videos. Through a masking technique in which parts of a video are strategically hidden across both time and space, V-JEPA learns to efficiently predict the missing content of current and subsequent frames. The model is particularly good at fine-grained action recognition, such as telling the difference between small movements like picking something up and setting it down.
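The core of that description is the spatiotemporal masking step: hide a contiguous block of the video across both space and time, then ask a predictor to infer what is hidden from what remains visible. The NumPy sketch below illustrates only that masking idea with made-up array shapes; it is not Meta's implementation, and the block coordinates are arbitrary.

```python
# Toy sketch of the masking idea behind masked video prediction: hide a
# contiguous region of a video across space and time, then train a predictor
# to recover the hidden content from the visible patches. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)

# A tiny "video": 16 frames, each an 8x8 grid of patch features (not raw pixels).
frames, height, width = 16, 8, 8
video_patches = rng.normal(size=(frames, height, width))

# Build a spatiotemporal mask: hide one spatial block over a span of
# consecutive frames, so no single frame reveals the missing region.
mask = np.zeros((frames, height, width), dtype=bool)
t0, h0, w0 = 4, 2, 2                      # start of the masked block (arbitrary)
mask[t0:t0 + 8, h0:h0 + 4, w0:w0 + 4] = True

visible = np.where(mask, 0.0, video_patches)  # what the encoder gets to see
targets = video_patches[mask]                 # what the predictor must infer

print(f"masked patches: {mask.sum()} of {mask.size}")
```

In V-JEPA the prediction happens in a learned representation space rather than over raw pixels, which is part of what makes the approach efficient.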

Although V-JEPA currently relies solely on visual data, Meta plans to integrate audio input into the ML model in the future. The company also aims to improve the model so it can handle longer videos more effectively.

Google’s Gemini 1.5 and Meta’s V-JEPA are big steps forward in AI research.
