Yesterday, Microsoft Xbox took the wraps off Muse, which it describes as a “generative AI model designed for gameplay ideation.” To accompany the announcement, the company published an article on Nature.com, along with a detailed blog post and a YouTube video. If you’re scratching your head over what “gameplay ideation” actually means, Microsoft defines it as the process of generating “game visuals, controller actions, or both.” In practice, though, Muse’s applications look fairly limited, and it doesn’t come close to replacing an actual game development pipeline.
Nevertheless, some of the details are intriguing. The model was trained on H100 GPUs, and it took approximately a million training updates before it could extend a single second of real gameplay into nine more seconds of simulated gameplay that stays true to the engine. The bulk of the training data was drawn from existing multiplayer gaming sessions.
Instead of simply running the game on one PC, Microsoft had to use a cluster of 100 Nvidia H100 GPUs for training, which is far more expensive and power-hungry. Even so, this setup only managed to produce a low-resolution output of 300×180 pixels for roughly nine additional seconds of extrapolated gameplay.
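To put that resolution in perspective, here is a rough back-of-the-envelope sketch comparing the pixel count of a 300×180 frame with that of a standard 1080p frame. The 1080p comparison point is our own assumption for illustration; it does not come from Microsoft's materials.

```python
# Per-frame pixel counts. Only the 300x180 figure comes from the article;
# the 1920x1080 comparison target is an assumption chosen for illustration.
MUSE_WIDTH, MUSE_HEIGHT = 300, 180
FULL_HD_WIDTH, FULL_HD_HEIGHT = 1920, 1080


def pixels(width: int, height: int) -> int:
    """Return the number of pixels in a single frame."""
    return width * height


muse_pixels = pixels(MUSE_WIDTH, MUSE_HEIGHT)
full_hd_pixels = pixels(FULL_HD_WIDTH, FULL_HD_HEIGHT)
ratio = full_hd_pixels / muse_pixels

print(f"Muse frame:    {muse_pixels:,} pixels")
print(f"1080p frame:   {full_hd_pixels:,} pixels")
print(f"1080p has {ratio:.1f}x as many pixels per frame")
```

By this count, a single 1080p frame carries about 38 times as many pixels as one of Muse's frames.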
The most captivating demonstration from the team showed Muse duplicating existing props and enemies within a scene while replicating their functionality. But it is hard to justify the hefty hardware costs, the electricity consumption, and the AI training involved when conventional development tools can easily spawn enemies or props.
Sure, it’s interesting to see that Muse can maintain object permanence and mimic the original game’s behavior. But stacked up against proven video game development methods, the approach looks needlessly extravagant and inefficient.
Future iterations of Muse might offer more groundbreaking capabilities, but for now the project joins a long list of other AI efforts that aim to simulate the entire gaming experience. It’s commendable that some engine accuracy and object permanence have been retained, yet the approach still seems like an impractical way to create, test, or play a video game. Even after hours of sifting through the details, it’s baffling why anyone would choose this method.