AWS announces family of LLMs for text, image, video generation

Amazon’s cloud computing division, Amazon Web Services (AWS), announced its latest family of multimodal generative AI models, Nova. It includes four text-generating models (Micro, Lite, Pro, and Premier) along with Canvas and Reel, which generate images and video, respectively. All models are available on Amazon Bedrock starting December 3, except Premier, which arrives in early 2025.

AWS is pretty late to the AI party at this point, and almost every major tech manufacturer or service provider now has its own horse in the race. However, Amazon hopes to tilt things in its favour with cost and latency. The company claims that Nova models are among the fastest in their class while also being the least expensive to run, by as much as 75 percent.

The announcement also claims support for as many as 200 languages. That said, some languages are better optimized than others, and users can expect the best results in English, German, Spanish, French, Italian, Japanese, Korean, Arabic, Simplified Chinese, Russian, Hindi, Portuguese, Dutch, Turkish, and Hebrew. The content generation models, however, only support English prompts.

Following the industry trend, Amazon’s text generation models offer generative AI at four different price points and latencies. Micro, the first model in the Nova family, can only process text, but it also delivers the lowest latency and lowest cost in the family.

Nova Lite compared to its major rivals. | Source: Amazon

Lite is one tier up and can process image, video, and text inputs, albeit slightly slower than Micro. Nova Lite also matched or beat GPT-4o Mini on 17 of 19 benchmarks. Pro is the next step up and balances speed, cost, and capability. When measured against OpenAI’s GPT-4o model, it performed equally or better on 17 out of 20 benchmarks.

Last but not least, Premier is the most capable model in the family and can do everything the other three can. However, AWS is marketing it as a model better suited for creating custom models than for standalone use.

The Micro model has a 128,000-token context window, which works out to around 100,000 words. Lite and Pro both have 300,000-token context windows, meaning these models can tackle as many as 225,000 words, up to 15,000 lines of code, or around 30 minutes of footage. These context windows are also growing: AWS says that in early 2025, certain Nova models will support over two million tokens.
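For a rough sense of how those token counts translate into words, here is a minimal Python sketch. It assumes the common rule of thumb of about 0.75 English words per token; that ratio and the `estimate_words` helper are illustrative assumptions, not figures or tooling from AWS, and real ratios vary by language and tokenizer.

```python
# Assumption: roughly 0.75 English words per token (a rule of thumb, not an AWS figure).
WORDS_PER_TOKEN = 0.75

# Context window sizes as reported in the article.
CONTEXT_WINDOWS = {"Micro": 128_000, "Lite": 300_000, "Pro": 300_000}


def estimate_words(tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Return an approximate word count for a given number of tokens."""
    return round(tokens * words_per_token)


for model, tokens in CONTEXT_WINDOWS.items():
    print(f"Nova {model}: {tokens:,} tokens is roughly {estimate_words(tokens):,} words")
```

Under that assumption, Micro's window comes to about 96,000 words, close to the 100,000 quoted above, while Lite and Pro land on the 225,000 figure.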

Canvas, the image generation model AWS debuted, allows users to generate and even edit images using simple prompts, much like just about every other AI image generator on the market. It does, however, offer some control over the generated image, including colour schemes and layouts.

Reel is arguably the more ambitious of the two models. It can generate videos of up to six seconds from either a text prompt or reference images. Users can also adjust camera motion to add pans, rotations, and zooms, among other movements. A version that can create videos up to two minutes long is coming soon, according to AWS.

Users can add camera motion to static images using Nova Reel. | Source: Amazon

Regarding security, both Canvas and Reel have built-in controls for responsible use, including but not limited to watermarking and content moderation. AWS hasn’t disclosed exactly what these preventive measures are, but the announcement states:

The Amazon Nova foundation models are built with protections that match its increased capabilities. Amazon Nova extends our safety measures to combat the spread of misinformation, child sexual abuse material (CSAM), and chemical, biological, radiological, or nuclear (CBRN) risks.

As with almost every other generative AI model, we don’t quite know what Nova has been trained on. AWS continues to be vague about the training data, but the company did tell TechCrunch that the training data is a combination of proprietary and licensed data. That said, AWS is offering an indemnification policy that covers its users in case one of its AI models infringes someone else’s copyright.

With speech-to-speech and any-to-any models queued up for launch in 2025, Amazon has settled well into the ongoing madness of the AI market. It remains to be seen how well the Nova family will perform and how protected it is from the problems that have derailed just about every major AI model so far.

Yadullah Abidi

Yadullah is a Computer Science graduate who writes/edits/shoots/codes all things cybersecurity, gaming, and tech hardware. When he's not, he streams himself racing virtual cars. He's been writing and reporting on tech and cybersecurity with websites like Candid.Technology and MakeUseOf since 2018. You can contact him here: yadullahabidi@pm.me.
