Meta has announced its own AI image generation engine, ‘Make-a-Scene’, joining the likes of OpenAI’s DALL-E and Google’s Imagen. The engine can generate 2048 x 2048 resolution images by taking sketches and phrases as inputs.
Meta’s AI stands out because it gives the user more control over an image’s composition: the user provides a rough sketch as input alongside a phrase.
Different AI engines use various techniques to create images from a person’s text input. Yet even after training on millions of images to learn the association between words and their corresponding images, AI engines still aren’t spot-on when turning a phrase into an image.
Relying on text alone also means that the user doesn’t have precise control over the output image’s composition. With only a phrase, there is no easy way to say which angle objects should be viewed from, whether the described objects should be large or small compared to one another, or which direction the lights and shadows should fall.
With Make-a-Scene, users can convey their artistic vision to the AI engine with greater specificity, using patterns, arrangements, depth of field, and their own composition and structures. Meta’s AI engine uses a novel intermediate representation of the scene layout that the user can control, enabling detailed inputs like sketches.
The model focuses on learning the key aspects of an image that are likely to matter most to the creator. According to Meta, this technique improved generation quality as measured by the FID (Fréchet Inception Distance) score, a metric used to evaluate the quality of images generated by AI engines.
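For readers curious how an FID-style score is computed, here is a minimal sketch, not Meta’s evaluation code. FID compares the statistics of features extracted from real images with those from generated images, treating each set as a Gaussian. The sketch below assumes the features are already extracted (real FID uses an Inception network for that step) and simplifies further by assuming diagonal covariances; the function name and the toy feature values are hypothetical.

```python
import math
from statistics import fmean, pvariance


def frechet_distance_diagonal(feats_a, feats_b):
    """Frechet distance between Gaussians fitted to two feature sets.

    Assumption: each feature dimension is treated independently
    (diagonal covariance). Real FID uses full covariance matrices
    over Inception features, so this is only an illustration.
    """
    dims = len(feats_a[0])
    total = 0.0
    for d in range(dims):
        col_a = [row[d] for row in feats_a]
        col_b = [row[d] for row in feats_b]
        mu_a, mu_b = fmean(col_a), fmean(col_b)
        var_a, var_b = pvariance(col_a), pvariance(col_b)
        # Squared difference of the means in this dimension.
        total += (mu_a - mu_b) ** 2
        # Covariance term: var_a + var_b - 2 * sqrt(var_a * var_b).
        total += var_a + var_b - 2.0 * math.sqrt(var_a * var_b)
    return total


# Toy, made-up feature vectors standing in for image features.
real_feats = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
generated_feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

same = frechet_distance_diagonal(real_feats, real_feats)
shifted = frechet_distance_diagonal(real_feats, generated_feats)
print(f"identical sets: {same:.3f}, shifted sets: {shifted:.3f}")
```

A lower score means the generated images are statistically closer to the real ones, so an improvement in FID shows up as a smaller number.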
In Meta’s testing, a panel of human evaluators rated images generated from both text and a sketch as better aligned with the original input than text-only images 99.54% of the time.
Meta has partnered with AI artists, including Sofia Crespo, Scott Eaton, Alexander Reben, and Refik Anadol, who are using the engine and providing feedback to help develop Make-a-Scene further. Currently, there’s no word on when the engine will be available to the general public.