Blog post 1 – AI Film making #1

Intro

Artificial intelligence has been developing at such a rapid pace over the last few years that I needed to reevaluate the direction of my master’s thesis. Initially I had planned to use simple text-to-image models to create style frames, mood boards and storyboards, and possibly even AI-generated image textures, for a short film project. The goal was to find a genuinely new way of working and breathe fresh air into the motion graphics and media design industry.

However, multibillion-dollar companies as well as smaller teams and creatives around the world have beaten me to it in spectacular ways. Adobe has implemented many AI-assisted tools directly into its software, and companies like Curious Refuge have established fully fledged AI workflows for film making.

What this is

For these reasons I have abandoned the idea of creating a genuinely new approach to AI film making. Instead, I will keep researching the technological state of AI going forward and aim to create a short film project using cutting-edge AI tools.

This blog post is meant to be a repository of the most advanced tools available at the moment. I want to keep this list up to date, though I’m unsure whether I should come back and edit this post or duplicate it. Time will tell.

In any case, whenever I decide to start work on the practical part of my Master’s thesis, I will use whatever tools will be available at that time to create a short film.

List of tools

Text To Image

  • DALL-E 3
  • Firefly 2
  • Midjourney
  • Adobe Photoshop & Illustrator

Curious Refuge seems to recommend Midjourney for the most cinematic results. I’ll be following the development of both Midjourney and ChatGPT, which can work directly with DALL-E 3, to see what fits best.

Adobe Firefly also seems to produce images of fantastic quality and even offers camera settings in its prompts, information that is crucial to the creative decisions behind shots. Moreover, Firefly is, in my opinion, the most morally sound option, since according to Adobe the model was trained only on content the company has the rights to use, such as licensed Adobe Stock images and public domain material. This is an important point for me, as I am thinking about putting an emphasis on moral soundness in my paper.

Adobe’s Photoshop and Illustrator tools are remarkable as well. I have already planned dedicated blog posts testing their new features and will definitely integrate them into my daily workflow as a freelance motion designer, but I am unsure how they could fit into my current plan of making a short film for my master’s thesis.

Scripting & Storyboarding

  • ChatGPT 4 directly integrated with DALL-E 3

At the moment, ChatGPT seems to be by far the most promising text-based AI. With the brand new ChatGPT 4 working directly with DALL-E 3, this combination is likely to be the most powerful one for the conceptualisation phase. It is also a tool that I would confidently use in its current state.

Prompt to Video

  • Pika Labs with Midjourney

Both work through Discord servers, and I am not sure how well this can work as a specialised workflow, although Midjourney has since published a web application. On the other hand, running both through Discord makes the combination of Pika Labs and Midjourney quite efficient, as users don’t need to switch applications as much. Results are still rough, since Pika Labs is still in its early development stages, and a lot of post-processing and careful prompting is needed to achieve usable results.

3D Models (Image to 3D & Text to 3D)

  • NVIDIA MasterpieceX
  • Wonder3D
  • DreamCraft3D

As far as 3D asset creation is concerned, a lot has happened since my last blog posts on the topic. There is a multitude of promising tools, the most notable of which is MasterpieceX by NVIDIA, as it seems to be capable of generating fully rigged character models that could work well with AI-powered animation tools. How well the rigs work still needs to be tested, but visually the output of all three tools seems advanced enough for film making workflows, at least stylised ones.

3D Animation

  • ChatGPT 4 & Blender
  • AI-powered motion capture
    • DeepMotion
    • Rokoko Vision
    • MOVE.AI

As with the 3D models, many AI-assisted motion capture tools already seem very capable of delivering usable results. I am not yet sure which one is the best, but time will tell. Animation that is not based on motion capture seems to have few limits with the help of ChatGPT, as it can write scripts, finish existing animations and create new ones from scratch in a variety of tools.

Gaussian Splatting

  • Polycam

Gaussian splatting is a very new technology that will surely spawn many other iterations. Using simple video footage, it can reconstruct accurate and photorealistic 3D environments and models. Some developments have even shown it working in real time. While I am excited to see what the future of this technology holds, and I expect it to play a huge role in the world of VFX, I am not sure how I would use it in my short film project.

Post

  • Topaz Gigapixel Image / Video AI
  • Premiere Pro

Unfortunately, if I wanted to use video AI tools in their current state, a lot of post processing would need to be done to the results to make them usable.

However, there is another, more traditional point to be made in favour of Topaz Labs: its upscaling features can save a lot of time in almost any production phase, since working at lower resolutions generally speeds up processes, regardless of application. Due to its price tag of 300 USD I am not sure if I will use it for my educational purposes, but I am convinced it is a must for commercial projects simply because of the time saved.
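To put a rough number on the time saved, here is a minimal Python sketch. It assumes, purely for illustration, that processing time scales roughly with the number of pixels per frame, which will not hold exactly for every application. The two resolutions (Full HD and UHD 4K) are my own example values, not figures from any specific tool.

```python
def pixel_count(width: int, height: int) -> int:
    """Number of pixels in a single frame."""
    return width * height


# Example resolutions (assumed for illustration)
full_hd = pixel_count(1920, 1080)   # working resolution
uhd_4k = pixel_count(3840, 2160)    # delivery resolution after upscaling

# How many times more pixels a 4K frame contains than a Full HD frame
ratio = uhd_4k / full_hd

print(f"Full HD frame: {full_hd} pixels")
print(f"UHD 4K frame:  {uhd_4k} pixels")
print(f"A 4K frame has {ratio:.0f}x the pixels of a Full HD frame")
```

Under that assumption, working in Full HD and upscaling at the end means every render, simulation or preview only has to handle a quarter of the pixels of the final 4K frame.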

Premiere Pro’s new features are impressive to say the least, but I feel they work best in a production that uses live-action footage and a more traditional media design workflow. I’m unsure how I could use Premiere’s AI features to their fullest extent, but my work will need to go through editing software of some kind, so I will see.

Conclusions

After today’s research session, it has become even more apparent that the world of AI is developing at mind-boggling speed. On the one hand, it’s amazing what the technology is already capable of, and even more exciting to think about the future. On the other hand, the moral and legal implications of AI tools are increasingly concerning, and with AI having become a household name, I fear that the novelty of the technology will have worn off by the time I start to write my thesis.

So today I am left with a bittersweet mixture of feelings: excitement about the wonderful possibilities of AI, and concern that my thesis will be lacking in substance, uniqueness or, worst of all, scientific relevance. I will definitely need to spend some time thinking about the theoretical part of my paper.

As far as the practical part of the paper goes, I must not succumb to FOBO (fear of better options), and I need to decide how far I want to leave my comfort zone for this project. I fear that if I lean too far in the AI direction, my work will not only become harder, but also less applicable to real-world motion and media design scenarios. Whatever solution I come up with, I want to maximise real-world application.
