Episode 5: AI Production - Creation of a Holiday Video

 


In this episode of The Immersive Lens, hosts Paul Engin and Dave Ghidiu discuss the escalating competition in the AI sector, specifically OpenAI's "Code Red" response to the dominance of Google’s Gemini 3 model. They detail OpenAI's development of a new model, code-named "Garlic," which reportedly relies on user feedback telemetry rather than professional evaluations to regain ground. They also highlight significant industry milestones, such as the announcement that the AI short "Critters" will become the first feature-length AI film next year, and the release of Meta's Hyperscape Capture, which allows users to create digital twins of physical spaces using Quest 3 headsets.

They go on a deep dive with engineer Jeff Kidd into the creation of an AI-generated holiday greeting for Finger Lakes Community College, a project that required between 30 and 40 hours of work rather than a simple "prompt and done" solution. The team utilized a complex "tech stack" including Google’s Nano Banana for generating character sheets, Google Flow for animation, ElevenLabs for audio isolation, and traditional tools like Photoshop and After Effects for "blue screening" characters into scenes. Ultimately, the hosts conclude that while AI provides animation efficiencies, producing professional-quality work still requires significant human intervention and traditional production skills to resolve issues like character consistency and lip-syncing.


Key Topics

OpenAI’s Strategic Pivot in Response to Competition. Following the release of Google’s Gemini 3, which dominated AI performance leaderboards, OpenAI CEO Sam Altman declared a "Code Red" to accelerate the development of a new model codenamed "Garlic". To regain its competitive edge, OpenAI is reportedly shifting its training methodology to rely on user feedback telemetry rather than professional evaluations, a controversial move designed to cater to the preferences of average daily users rather than strict technical benchmarks.

The Complexity of High-Quality AI Video Production. Contrary to the perception that AI video creation is a simple "prompt and done" process, Paul and Jeff detail a holiday greeting project that required between 30 and 40 hours of labor to produce a two-minute clip. The production required a sophisticated "tech stack" that combined AI tools such as Google’s Nano Banana for imaging, Flow for video interpolation, and ElevenLabs for audio isolation. After all that, they still relied on traditional post-production software like Adobe Photoshop, Premiere, and After Effects.

Bridging AI Limitations with Traditional Animation Techniques. To overcome the inherent inconsistency of AI-generated characters, Paul and Jeff utilized traditional animation concepts, such as generating "character sheets" via Nano Banana to maintain visual continuity across different angles and ages. When the AI refused to follow specific negative prompts (such as instructions not to open a door), the team resorted to "blue screening" characters in Photoshop to manually composite them into scenes, proving that human production skills remain essential to bypass AI errors.

Emerging Technologies in Spatial Computing and Cinema. They highlight significant industry milestones, including the announcement that the AI short "Critters" will be expanded into the first feature-length AI film next year, offering potential efficiency gains in producing animation footage. Additionally, the hosts discuss Meta's "Hyperscape Capture," a new feature for the Quest 3 headset that allows users to scan physical rooms to create sharable digital twins, with broad implications for distance learning, game design, and real estate.


Transcript

Click here to view transcript
Paul Engin: Uh, Dave, this is fire. Let's cook, man. We're chopped. We're chopped.

Dave Ghidiu: Okay. Well, at least we're not 30 minutes in.

Paul Engin: Yeah. Can we fix that in post? Welcome to The Immersive Lens, the podcast exploring the technologies reshaping how we live, work, and learn. From AI and virtual reality to creative media and design, we are diving into the tools and ideas shaping our connected world. Join us as we uncover the people and ideas driving the next wave of interactive experiences. This is The Immersive Lens. So, Dave, what's going on this week?

Dave Ghidiu: Code Red, Paul. Code Red.

Paul Engin: What happened? What did I do? Alert. What did I do?

Dave Ghidiu: Uh, you didn't do anything.

Paul Engin: Oh, okay. What what's what's Code Red?

Dave Ghidiu: Code Red is at OpenAI, which is the company that has ChatGPT. Yeah. The CEO, Sam Altman, called a code red. So two weeks ago, if you'll recall, Google Gemini, and we talked about it on the show, Google Gemini 3 came out. And it just dominated all the leaderboards for AI performance. And that had Veo 3, Veo 3.1. Yeah.

Paul Engin: For uh video production, right?

Dave Ghidiu: It had everything, and it was just bonkers. So Sam Altman called a code red over at OpenAI and said, "Folks, drop everything. This includes Sora 2," which you had talked about, the video generation. And he's all, stop what you're doing, we have got to make our chatbot bigger and better. So they're working on it right now. If you go to ChatGPT, at least at the time of recording, 5.1 is the model that people are using, ChatGPT 5.1. So 5.2 is in development right now, and that's codenamed Garlic.

Paul Engin: Okay. Well, uh, it's going to be stinky.

Dave Ghidiu: I hope not.

Paul Engin: And it's going to be spicy.

Dave Ghidiu: Spicy. All right. All right. And Garlic, it has a few different things. They're saying it's going to be better than Google Gemini 3. But one thing, and this is a little controversial. I'm taking this right from the Wall Street Journal; the article dropped on December 8th, okay? And we are recording on December 9th. And it said Altman was calling for turning the crank on a controversial source of training data, including signals based on feedback from users rather than the evaluations from professionals of chatbots' responses. And the reason they're doing that is because they're not necessarily winning the race against Claude and Gemini for the Microsoft or enterprise crowd. So their meat and potatoes is the average daily users, like you and I and everyone listening, all our fans. So they are trying to cater to that. They're using the telemetry from the users more than the professionals.

Paul Engin: Oh that's interesting. So just a different approach in their their training for their base model.

Dave Ghidiu: Correct.

Paul Engin: And is that going to be more efficient or you think more focused on the user versus um a general pool of training?

Dave Ghidiu: Yeah. And remember, they're walking a tightrope with being too sycophantic, which was some of the feedback from one of their earlier models. So I'm eager to see what this looks like when it comes out, and it should be coming out any time: today, tomorrow, next week.

Paul Engin: Yeah. I mean, it's crazy how fast they're doing production revs on these these models. I mean, it's almost almost every week there's something new.

Dave Ghidiu: It's it's bonkers. And it really is an arms race between all these companies. So, if this really does eclipse Gemini 3, and I am blown away by Gemini 3. If this eclipses it, I'll be like, "Whoa, man. What's next?" You know.

Paul Engin: I'm very impressed with Gemini 3 and what you can produce with it, and I know we'll talk about that more later.

Dave Ghidiu: All right, what's your hot take? What's going on in Paul world?

Paul Engin: So I just read yesterday, and I know we talked about Critters, which was an AI short documentary-style film, um, of critters in a forest, like a documentary.

Dave Ghidiu: You talked about that early on in this podcast, right?

Paul Engin: Yeah. So apparently now they're going to try to do a feature-length film with Critters. That'll be the first AI feature-length film. And next year.

Dave Ghidiu: So it's going to be a feature. So that's like 88 minutes or more, that's feature length? Something like that.

Paul Engin: Yeah, I'm guessing it's going to be like 60, cuz I don't know if they can. We'll see. We'll see.

Dave Ghidiu: When you say feature-length AI film, does this mean it's something you or I in our basement could make, or is this going to be still, you need a level of production?

Paul Engin: You still need a production company. I think it's still, um, you get a lot more efficiencies, like everything else, with the technology that's out there. You know, they're always advancing, making things easier and quicker.

Dave Ghidiu: Sure.

Paul Engin: Um, but you still need people who understand how you're going to be compositing, and how to create the characters. So they still have character designers and storyboard artists; it just expedites the process, if that makes sense.

Dave Ghidiu: Yeah. So is this an exercise in seeing what's possible or there still are some cost savings to be realized like your budget's a lot smaller?

Paul Engin: Yeah. Because then you can maybe not have as many animators, but I think you'll still have animators on staff. Um, or there might be efficiency gains where, even if you had the same amount of animators, maybe you now have the ability to produce 5 minutes of footage instead of 5 seconds of footage. So you have a larger pool that you'll be able to edit from, if that makes sense.

Dave Ghidiu: Yeah. And I think what I'm eager to see is what tools emerge from this process, you know? So, maybe they have another tool that's making AI in the background, then they have to come up with an AI tool to take the background and put the foreground. I don't know what's going to happen, right?

Paul Engin: And and I know we're going to talk about this, but we we did this in a smaller scale and uh and we're going to get into it in a little bit, but um I think that a lot of things will come out of it. And this is also OpenAI driven, too. Obviously, they're going to be using the OpenAI tools and working with the studio that created Critters. Even with the code red, it's, you know, that's probably the catalyst cuz they probably saw what Gemini was coming out with and they were like, uh, we got.

Dave Ghidiu: Yeah, Gemini doesn't have that.

Paul Engin: That's right. Um, the other thing that I saw um that I'm excited about is uh something called uh Hyperscape Capture, which is from uh Meta.

Dave Ghidiu: Okay.

Paul Engin: Um it's using uh the Quest 3 uh headsets.

Dave Ghidiu: That's like a VR headset you put on your head.

Paul Engin: The virtual reality headsets. Um and it's really cool. What you can do now is you can scan a room and it can just by like looking around.

Dave Ghidiu: Yep. You just put it on and you scan and walk around the room, and it'll create the space, and then you can import it into Horizon Worlds, which is their social 3D platform, and then you can invite people into your space.

Paul Engin: So for instance um I'll be able to go upstairs we'll go into our TV studio and I'll be able to map the entire studio out.

Dave Ghidiu: Sure.

Paul Engin: And it's something that we have to play with, but I should be able to then integrate it into our Horizon Worlds space, which again is a platform you would have to log into. But students, or people who are here, we can all be in the studio together, virtually or in person, using the VR headsets, and they'll look like they're in the same studio as we are.

Dave Ghidiu: That's so wild.

Paul Engin: Yeah.

Dave Ghidiu: So, this has some implications clearly for say like distance learning and online learning.

Paul Engin: Yes. Yes. I'm I'm really excited because otherwise in my head I'm like I have to model this space. I have to you know.

Dave Ghidiu: Can you imagine Zillow in like 6 months from now?

Paul Engin: Oh, it's good. Everyone's going to be walking around with the headsets scanning houses and um but yeah, that's actually another that's probably another big I mean huge, right? Yeah. And it'll be affordable to do it cuz it's these the quests are 800 bucks, right? Not even 500.

Dave Ghidiu: I want to say they're like $399 or something.

Paul Engin: I would be able to make, like, a digital twin of my house for three or four hundred bucks.

Dave Ghidiu: Yes.

Paul Engin: And put it online and everyone in the world can come into my house.

Dave Ghidiu: Yeah. Again, there's a few steps to it, but Yes. That that is the goal. That is the goal.

Paul Engin: That also seems to have some implications for say like video game design. So instead of the artists who would be creating the rooms, now you can just go in and scan a room like have a set and scan it in.

Dave Ghidiu: Yeah.

Paul Engin: Oh wow.

Dave Ghidiu: Yep. And, uh, so it's $499.

Paul Engin: $499 for the Meta Quest.

Paul Engin: Yeah, Quest 3 right now. But, um, yeah, so I think that's something that I'm really... We have a few Quests here, so I'm going to be playing with that and see what we can do with it.

Dave Ghidiu: Bonkers, man. But yeah.

Dave Ghidiu: All right. Well, today's deep dive is AI video holiday greetings.

Paul Engin: All right. I like it. And it's kind of related to critters in a much smaller scale.

Dave Ghidiu: It is. That was a real nice teaser for this. So, here at FLCC, our president typically makes a video holiday greeting card for the college community and kind of sends it out in email. And this year, he came to the AI think tank and asked for help using AI to make part of the greeting. And I've seen the final draft.

Paul Engin: I saw it. You sent it to me and I was blown away. It really took me back to the nostalgia of growing up with the California Raisins claymation Christmas specials and the Rankin/Bass Rudolph the Red-Nosed Reindeer stop motion.

Dave Ghidiu: And so I'm going to level with everyone: I know nothing about how this was done. This is so outside my bailiwick, uh, video creation. So today I really want to hear about the experience, so I'm just going to kind of interrogate you about it. So let's, uh, let's set the table by having...

Paul Engin: Let me start by saying that I'm going to bring Jeff into this, because it started with Jeff, our engineer, and then, um, Jim Perry. Uh, Jeff, are you on?

Jeff Kidd: I'm on.

Paul Engin: Oh, so ladies and gentlemen, Jeff.

Dave Ghidiu: Jeff, can we get like the applause on?

Paul Engin: Okay. Um so Jeff, do you want to talk about how um this started and then um we can kind of go through the production?

Jeff Kidd: Yeah, sure. So Jim and I have been tasked annually, for a few years now, to create a holiday card at the behest of our president, our college president.

Paul Engin: And Jeff, real quick: Jim Perry is the technical theater and auditorium manager in Visual and Performing Arts, right?

Jeff Kidd: Yes. He and I have collaborated on many projects, video related projects for uh the president and other areas of the college uh over the years. And he came to us and said, "Yep, I want to do a holiday card this year and we said, "Okay, sure. Uh, what do you want it to be about? What's it about? What do you want to do?" And AI was a hot topic. He's like, "Yeah, something with AI." And Jim, he's the writer of the two of us.

Dave Ghidiu: He I've seen screenplays before. They're great.

Jeff Kidd: Yeah.

Paul Engin: Yeah.

Jeff Kidd: He came up with the idea: what if we did, like, an AI-generated, claymation kind of style of card, where it's AI-generated and it somehow ties into the college, with a head elf that looked kind of like Dr. Nye, our college president? What the message was and all that stuff, we hadn't quite figured out yet, but what about that? And they loved it, and they said, "Okay, what do you need from us?" And that's when we came to the two of you, like, okay, we want to do this; what do we need to generate this idea, right?

Paul Engin: And um Dave you took it upon yourself to say first.

Dave Ghidiu: I said these guys can do it do a great job.

Paul Engin: Yeah, right.

Dave Ghidiu: I'm the executive producer of this uh the greeting card for this uh holiday season. So, my my concern was uh consistency and the ability to capture the likeness and you produced a a quick little image sample, right?

Paul Engin: Yeah. I took a photo of Dr. Nye from the FLCC website and I had Google Gemini, you know, the image model, Nano Banana.

Dave Ghidiu: Nano banana.

Paul Engin: And I said, "Convert this, please, to an elf." And it was great. It already did that whimsical 3D claymation type thing; I don't think I even had to prompt for it. And then I went to Google Veo 3 and I said, "Hey, can you make this elf do something?" It was maybe some of the script from Jim. And that was all I did for this project, right? It was very much an "is this even feasible?" test. And I think when I saw that, I was like, "All right, let's do this."

Paul Engin: Right.

Dave Ghidiu: Yeah, let's do it.

Paul Engin: Um so after that, um Jim basically wrote the script.

Dave Ghidiu: Yeah. And actually, before that, I do want to say: you had mentioned character consistency, and that's something I think we've talked about on the show before. Cuz the video clips are only like 8 or 10 seconds, and it's very difficult to do another 8 seconds with the same exact character. Usually there's some degradation or some changes.

Paul Engin: Yeah. Subtle things like uh their nose might kind of shift or they might have a a spot on their face or.

Dave Ghidiu: Blue eyes versus green eyes or something.

Paul Engin: Y so that was a risk.

Dave Ghidiu: Yes. But I thought your proof of concept was good enough to try to move forward with it.

Paul Engin: And so Jim wrote this great script. We spoke to him, and Jeff, you can chime in here, but we spoke to him about trying to keep it simple, trying to limit the amount of shots we have. You know, we don't want the camera moving, cuz it's difficult to direct some of the camera motion and keep all the characters consistent. So he kind of wrote it with some things in mind; it wasn't like a blank slate where we were working with no limitations. Does that sound right to you, Jeff?

Jeff Kidd: Yeah.

Paul Engin: Okay.

Dave Ghidiu: So he must have known about the 8-second limitation. Was that one of the other constraints that he had to keep in mind when he was doing the script?

Paul Engin: Yes. Now, I didn't want him to be limited by the 8 seconds, because I figured we could do it like cuts and extend it out. But as far as the environment, we weren't shifting through a whole bunch of different environments. We were kind of locking it down a little bit; I mean, we were on a very condensed timeline for this, too. So, yeah, it started with Jim, who wrote the script. Then we all thought it worked. We got permission from Dr. Nye, the president of the college, to use him, and we used him as a starting point, similar to what you did. We used his image, and then we created a character from him, of the head elf, using Nano Banana.

Dave Ghidiu: Using nano banana.

Paul Engin: Um, and I'll tell you all the tools we used, and then we'll kind of highlight each of these. But, um, we used Veo 3.

Dave Ghidiu: That's the Google video.

Paul Engin: Google video, um, and it was interesting, because I started with Veo 3 and it created a nice baseline for me. Um, but there's no way to create real consistency in Veo 3.

Dave Ghidiu: Okay.

Paul Engin: So then I moved over to Flow, which uses Veo 3, but they have their own special editor.

Dave Ghidiu: That's also a Google product, right?

Paul Engin: It is. And I'll explain how that works um in a little bit. Um, I use the Google Nano Banana.

Dave Ghidiu: That's their image generation.

Paul Engin: Yep. And that um help create the characters. And again, I'll explain that in more detail.

Dave Ghidiu: No wonder there's a code red over at OpenAI.

Paul Engin: Um, we used ElevenLabs. And again, these are all things we'll talk about, but it's another audio-based AI system, not Google. So you can capture someone's voice, edit voices, create music, a bunch of different things.

Dave Ghidiu: Music, too.

Paul Engin: Yeah.

Dave Ghidiu: Okay.

Paul Engin: Um, and then, uh, we use the Adobe production suite. So, we use Photoshop, Premiere, and After Effects. Um, so one of the things I want to stress about this is that this is not a prompt and done type production.

Dave Ghidiu: Sounds complicated.

Paul Engin: You need you need to have a production background to really still produce this. Um, so.

Dave Ghidiu: Can you estimate how long this took, like how many person-hours were involved?

Paul Engin: Uh, I should have tracked it, but it was a lot of hours.

Dave Ghidiu: Like more than 20.

Paul Engin: Yeah.

Dave Ghidiu: Okay. More than 30.

Paul Engin: Um, I think between 30 and 40 hours, for, like, a one-and-a-half or two-minute video.

Dave Ghidiu: Two. What is it, Jeff? Two and...

Jeff Kidd: Right now it's the 2 minutes.

Paul Engin: 2 minutes. And that there's also.

Dave Ghidiu: And that's not Jim's time or Jeff's time. Goodness.

Paul Engin: So this is just my time.

Paul Engin: And there's some fact-finding in here. So if you were to do this again, there are lessons learned. Yes, absolutely. And if you were to do this in the real world, it would have taken months. By real world, I mean not using AI: using true stop-motion characters, or modeling.

Dave Ghidiu: Okay. So there was a an economy of like you saved some time.

Paul Engin: Oh, 100%.

Dave Ghidiu: And you had said Jim wrote the script, and when I saw it last, early on, it was more like a screenplay. It gives you the camera shots, and it paints a picture; it's very descriptive.

Paul Engin: Yes.

Dave Ghidiu: So did that help in.

Paul Engin: Cuz then it helped me with the prompts for the different pieces. So I think the way I can structure this is: how we started, some roadblocks we ran into, and then what Jeff did afterwards with some of the production stuff, and we can talk through how we used multiple tools.

Dave Ghidiu: That's what I want that's why I came today I want to hear all that stuff.

Paul Engin: So the way we started out was, for consistent characters, when you're doing animation, you create these things called character sheets. It's taking one character and creating multiple views of that character. This way, in theory, if I was modeling him in 3D, I could do a profile, a straight-on view, over the shoulder, and so on.

Dave Ghidiu: Yeah. And you can have these so you can model him.

Paul Engin: If you're doing 2D animation, you need all of these these character sheets so you know the dimensions, proportions, and the different angles. So when you're drawing that.

Dave Ghidiu: Yeah.

Paul Engin: You can draw a consistent character. So what I did is I used Nano Banana and I created a character sheet.

Dave Ghidiu: Did you really?

Paul Engin: Yeah.

Paul Engin: So, the character sheets: um, I did one for the head elf. And then I used a prompt that created a younger version, but it was still the same head. So I used the head elf and I said, create a younger version with brown hair. And then I used that elf and said, create an even younger, rounder-faced, blond-haired character.

Paul Engin: So they all kind of look the same a little bit.

Dave Ghidiu: Interesting.

Paul Engin: So cuz we wanted to play on the president's look. So the head elf looks like the president and then a younger version of the president and then a real young version of the president.

Dave Ghidiu: Oh, that's wild.

Paul Engin: But they're all elves. So that was a it you might not see it, but that was the the thought process. Anyway, um so we created character sheets for all three of those characters.

Dave Ghidiu: This is just making me think that AI is not going to replace animators anytime soon, because you need to have that experience. Like, I never would have thought of doing a character sheet. That was probably, like, second nature to you, right?

Paul Engin: Right. Well, because I just wanted to make sure that if we were to get a different angle, I had that character ready to go. Um, so then I had those characters, but I didn't have the Abominable Snowman character yet. And I am really going to be a geek here, but I have the Rankin/Bass Rudolph the Red-Nosed Reindeer figurines in my house.

Dave Ghidiu: Nerd alert.

Paul Engin: So, uh I had a Bumble figurine. Bumble is that character.

Dave Ghidiu: Okay.

Paul Engin: And so I took a picture of Bumble and then I brought him into Photoshop and, you know, cut out the background.

Dave Ghidiu: Oh, so you did that in Photoshop before even going to Nano Banana.

Paul Engin: Correct.

Dave Ghidiu: Okay.

Paul Engin: And then I brought that into Nano Banana and I said create a character sheet and make it more claymation and have it match the style of the head elf.

Dave Ghidiu: Oh wow.

Paul Engin: So this way the style is still consistent.

Dave Ghidiu: Was that prompt enough to get a really good result?

Paul Engin: Really?

Paul Engin: Yep. Um, so once I got it... all right, it took me two or three prompts, cuz there was slight variation. Yeah. Um, once I was happy with it, then I created a character sheet of the Bumble, or the Abominable. So now I had the character sheets for all the characters.

Dave Ghidiu: Okay. Wow. Oh, that's wild. I had no idea.

Paul Engin: So, um, yeah, that was my starting point. Then what I did is I created a background, and I used Nano Banana for that as well.

Dave Ghidiu: Really? Oh, because when you do the video, when I did the proof of concept, it had the background from behind the desk, but you need to do the rest of the like workshop area.

Paul Engin: Correct. So I prompted... at the time, we were talking about characters walking in, so I needed a door. So I was prompting, you know, a big door in Santa's workshop that leads to the head elf's office. And it created a background for me, and then I brought that into Photoshop, and then I layered the three characters into that scene so that I'd have their positioning. Can you picture that?

Dave Ghidiu: So you took the images Nano Banana created, brought them into Photoshop, arranged them, and then brought that into Flow for the video production, right? Wow. Your tech stack is wild already.

Paul Engin: So then we did Flow. Uh, I started with Veo 3 first, but there was no way to keep a consistent theme.

Dave Ghidiu: Okay.

Paul Engin: And what Flow does is it allows you to put in a first frame and a last frame.

Dave Ghidiu: Oh, so if you can somehow get those from Nano Banana or Photoshop...

Paul Engin: You got it. So with Photoshop, for instance, I put the three characters together, made them small, put them by the door, and that's my first frame.

Dave Ghidiu: Yeah.

Paul Engin: And then I made them big, with the cropping that I wanted, in front of the desk, and that's the last frame. And then I said, you know, have the elves walking in or whatnot, and they go from that first frame to that last frame. And it would do all the in-betweens for me. So that might have been like a second and a half.

Dave Ghidiu: Yeah. And it's a two-minute video. So you had to do that kind of process for many of those scenes.

Paul Engin: Right. Right.

Dave Ghidiu: Wow.
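[Editor's note: the first-frame/last-frame workflow described above is, conceptually, classic in-betweening: given two keyframes, fill in the intermediate frames. A toy Python sketch of the idea, using simple linear interpolation of a character's screen position; this is purely illustrative, and Flow's generative interpolation is far more sophisticated than this.]

```python
def inbetween(first_pos, last_pos, num_frames):
    """Linearly interpolate a character's (x, y) position
    between a first and a last keyframe."""
    frames = []
    for i in range(num_frames):
        # t sweeps from 0.0 (first frame) to 1.0 (last frame)
        t = i / (num_frames - 1) if num_frames > 1 else 0.0
        x = first_pos[0] + t * (last_pos[0] - first_pos[0])
        y = first_pos[1] + t * (last_pos[1] - first_pos[1])
        frames.append((round(x, 2), round(y, 2)))
    return frames

# Characters start small by the door, end up in front of the desk.
path = inbetween((10, 50), (90, 50), 5)
```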

Paul Engin: And I'll talk about that one scene because what um and this is the tough thing is it's tough to give direction to some of these characters. Right. Yeah. So um when I was talking and Jeeoff, I don't know if it was you or Jim or I don't know who brought it up, but we need to bring them in in a different way than walking.

Jeff Kidd: It was Jim.

Paul Engin: It was Jim.

Dave Ghidiu: Jim wanted to like manifest sort of like in a puff of evaporate. Yeah.

Paul Engin: Yeah. So I was like, "Okay, I think I can do this." So I put them in... I just had a frame without them in it, but with the door and everything, and then a frame with them in their final position. And I prompted, you know, a puff of smoke made out of snow and snowflakes. But every iteration, because there was a door in the scene, had the door opening up and them coming through the door. And I said, even in the prompt, I was like, "No door. I do not want the door to open." It didn't matter. I mean, I guess that's what you get when you deal with AI, right?

Dave Ghidiu: Yeah. That must have been frustrating.

Paul Engin: Yeah. So, the good thing about this is that I was like, you know what? We're going to go old school. And what I did is I put the characters on a blue screen.

Dave Ghidiu: Did you really?

Paul Engin: And so, in Photoshop, I created a chroma-blue screen. I put the characters in, and then I created a screen that was just chroma blue, so there was no background; it was just blue behind the characters. And then it created the effect. Yeah. First try. Boom. And then I said, "Okay, now I can composite this in After Effects or in Premiere." And compositing is when you combine those two things with the background. And then I was able to do it. So those are the things where you need to have some kind of production background, to know how you can accommodate some of those things.

Dave Ghidiu: I wouldn't even have thought of the blue screen. I wouldn't have thought of the character sheets. I wouldn't have thought of all that stuff. And I think people are going to see this video and they're going to love it, and it's 2 minutes, and it took 40 hours to make. So when you say it's made with AI, people are going to go, "Oh, it's one prompt and done." And that's not it. That's not it.
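[Editor's note: the blue-screen fix described here is standard chroma keying: any pixel close enough to the key color is treated as transparent, and the background shows through. A minimal per-pixel sketch in Python; real keyers in After Effects or Premiere add edge feathering and spill suppression, and the tolerance value here is an arbitrary illustration.]

```python
def chroma_key(fg, bg, key=(0, 0, 255), tol=60):
    """Composite a foreground over a background: any pixel within
    `tol` of the key color (per channel) is treated as transparent."""
    out = []
    for f, b in zip(fg, bg):
        if all(abs(fc - kc) <= tol for fc, kc in zip(f, key)):
            out.append(b)   # close to chroma blue: show background
        else:
            out.append(f)   # keep the foreground pixel
    return out

# An elf pixel (brown) survives; a chroma-blue pixel is replaced.
fg = [(120, 80, 40), (10, 5, 250)]
bg = [(255, 255, 255), (200, 0, 0)]
composited = chroma_key(fg, bg)
```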

Paul Engin: So there's more. I mean, so with every video that was created, it created a voice over.

Dave Ghidiu: Yeah.

Paul Engin: So we prompted it to have a voiceover, because the video generation does voices and background sounds and stuff. But the problem is that every time you did it, it created a different music track behind it, and sometimes different mixes of the voice. So the voice isn't consistent. So what we had to do is strip all of the audio out. And I'm going to throw this over to Jeff, because, uh, Jeff did what?

Jeff Kidd: We did ADR. That's automated dialogue replacement.

Dave Ghidiu: Okay. So it's like taking an existing recording and swapping it with a better one.

Paul Engin: Yeah. And this is done in a lot of productions. Like even in Star Wars, when they're in a simulation, like flying in the air, they have the wind blowing and you can't really hear them and capture them. So they go back into the studio and try to lip-sync what's happening. But I'll let Jeff talk.

Jeff Kidd: So, yeah. Well, at that point we basically needed to do two things. The voices of the two smaller elves, nobody liked. So we actually didn't use any of that original audio at all. And Jim actually came in and pitched his voice a little bit, gave some cadence to his voice, to make a different-sounding voice for each elf. But one of the things was, I think the length of the video at that point was about 2 and 1/2 minutes, and Dr. Nye actually wanted to come back on camera. The idea was, we transition from the AI video to Dr. Nye at his desk, so in the end we come back to the real-life Dr. Nye and he can say a special message directly from him. And we knew we needed that piece. And one of the things that we had to keep an eye on was the total length, cuz the original target was 30 seconds. Like, originally it was going to be like 30 seconds, and okay, yeah, that's fine. But then we're reading the script and it's like, there's no way; this is going to be 2 minutes, easy. So I think, Paul, the original length was well over 2 minutes.

Paul Engin: Yes it was. I think it was almost three.

Jeff Kidd: Yeah. So one of the things we decided — or I decided — was, well, let's speed up the footage so we could get through it a little faster, since we had to redub everybody anyway. For the voices of the two smaller elves, overall I think it was 125%: I basically cut about 20% of the running time by speeding it up 25%, if that makes any sense. So it's going slightly faster, and we didn't use any of their audio at all. So Jim has to lip-sync to those elves. We'd do one take at a time in Premiere, by the way — this is in Premiere.
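The speed-up arithmetic Jeff describes can be sketched out: playing footage at 125% speed shrinks the running time to 1/1.25 = 80% of the original, i.e. a 20% cut, which is how a roughly 2.5-minute edit gets close to 2 minutes. A minimal sketch (the durations are illustrative, not the actual cut's timings):

```python
# Speeding footage up by 25% (playback rate 1.25x) shortens the running
# time to 1/1.25 = 80% of the original — a 20% cut, not a 25% one.
def sped_up_duration(original_seconds: float, rate: float = 1.25) -> float:
    """Duration of a clip after a constant playback-rate change."""
    return original_seconds / rate

original = 150.0  # ~2.5 minutes, roughly the length Jeff mentions
print(sped_up_duration(original))                 # 120.0 -> about 2 minutes
print(1 - sped_up_duration(original) / original)  # 0.2 -> 20% of time removed
```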

Dave Ghidiu: That's Adobe, right?

Jeff Kidd: Adobe Premiere, yes. So I mute all the tracks except the one we're on. And what's nice about Premiere is that when you mark an in and an out point — an in point is where you want your clip to start, and an out point is where you want it to end — it'll cue it up a little. It'll actually back up three seconds and give you a countdown. So we'd start recording him, and he'd try to watch the lips a little. It got a little distracting for him, but we just kept doing multiple takes until the lip-sync was right. And by speeding it up, it didn't matter, because we weren't using their audio — we didn't have to worry about the pitch of their audio or the music the AI had put in. Any of that. So that was actually the easy part. Now, the voice of the head elf in the video we actually liked. We liked it a lot. But there was music under it — some clips had music, some didn't.
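The cueing behavior Jeff describes — roll three seconds early from the in point and count down into the take — can be sketched as a small timeline helper. This is a generic illustration of the idea, not Premiere's actual recording logic:

```python
# ADR-style cueing: given a clip's in/out points on the timeline,
# recording starts a few seconds early so the performer gets a countdown.
def adr_cue(in_point: float, out_point: float, preroll: float = 3.0):
    """Return (record_start, countdown_marks, record_end), all in seconds."""
    record_start = max(0.0, in_point - preroll)
    # One countdown mark per whole second of preroll before the in point.
    countdown = [in_point - t for t in range(int(preroll), 0, -1)]
    return record_start, countdown, out_point

start, countdown, end = adr_cue(in_point=10.0, out_point=18.0)
print(start)      # 7.0 -> recording rolls 3 s early
print(countdown)  # [7.0, 8.0, 9.0] -> 3...2...1 into the take
```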

Paul Engin: That's just the way the AI made the video.

Jeff Kidd: Yeah, it was just random. So Paul found ElevenLabs, and we used ElevenLabs to literally extract just the voice. If there were sound effects, it took them out; if there was music, it took it out — and it was clean. I have to say, the thing I was most impressed with was how cleanly it did it: you can't hear anything else, it's just that voice.

Paul Engin: And it's funny, because I think to myself — and that's why I love doing these little productions; I think Jeff's probably the same way — "Oh my gosh, if we shoot video and the audio isn't clear, we could separate it using ElevenLabs and clear out the noise."

Dave Ghidiu: That's better than, like, Audacity or whatever other Adobe tools are out there.

Paul Engin: I mean, there are tools and you could work with them, but like you said, this makes it so easy. What did you do — upload his audio clip?

Jeff Kidd: Yeah. So I basically exported just the audio for that clip out of Premiere and uploaded that MP3 to ElevenLabs. Hit a button — and it's only eight seconds, because they're 8-second clips — it didn't take very long; within seconds it was there.

Dave Ghidiu: And what's ElevenLabs — like 10 or 20 bucks a month?

Jeff Kidd: Yeah.

Dave Ghidiu: Okay. Yeah, that's insane.

Jeff Kidd: And then you download that.

Dave Ghidiu: So cheap.

Jeff Kidd: And because it's identical in length, all I have to do is bring it into Premiere. So I export it as an MP3, download the cleaned version from ElevenLabs, bring it back into Adobe Premiere, and drop it into that audio track. Easy.

Dave Ghidiu: That's awesome. So, let me ask you two a question — and I've never really used the Adobe suite before; I've tried a few times, but I'm terrible at it. Is this something I could have done? And if so, at what level? I'm assuming I couldn't have done it to the caliber you two did, right?

Paul Engin: So, I think you could do something like what we produced, but not to the same level, if that makes sense.

Dave Ghidiu: I'm not insulted. No, I teed you up for that one.

Paul Engin: Because I really think — this is our whole philosophy, right? — AI is a thought partner. We're not assuming it's going to take something over; you work with it, and it enhances whatever your objective is. But it's not a one-and-done type of production. I don't know what you think, Jeff — do you agree?

Jeff Kidd: Oh, no — these are great. It's a great tool, right?

Dave Ghidiu: ElevenLabs sounds like it's fantastic.

Paul Engin: And we used ElevenLabs to create the background music for it as well.

Dave Ghidiu: And was that — I saw an early version with the outside of the village. Was that ElevenLabs music, or was that music from Gemini?

Paul Engin: That's all ElevenLabs, because we wanted to keep it consistent. Otherwise the feel would have been different — I'd like the audio track, but then it cuts to different music — so we needed to keep it consistent.

Dave Ghidiu: All right. So what were some of the other pain points? Consistent characters sound like a big one, the character sheets helped, then you had the blue-screening, and the audio seemed like an issue — but were there other pain points you two experienced?

Paul Engin: Do you want to go, Jeff, or do you want me to?

Jeff Kidd: You can start.

Paul Engin: Okay. Because, believe me, no one put more work and time into this than you.

Dave Ghidiu: I want to go on record and say Paul sent Jim and me a version of this video at 1:30 in the morning.

Paul Engin: I'm not shocked at all. The problem is I love doing this stuff, so it's tough to stop.

Dave Ghidiu: And this really will help you in the future — I assume for the courses you're teaching. Now you know the tools intimately.

Paul Engin: Yes. And I think you have to be thrown into this — I try to push this on my students, too. Sometimes you've got to be thrown into it, figure out what worked and what didn't, so you know for next time. And you're in a safe educational environment where you can make these mistakes, learn from them, and hopefully remedy them in the future. One of the things we ran into that I never knew or thought we would: remember how I said we went to a younger elf?

Dave Ghidiu: Yeah.

Paul Engin: So after a while, several of my prompts were "make the younger elf speak," "make the younger elf do this," "make the younger elf do that." Eventually it would not do any audio for the younger elf — it started blocking any audio when I referenced the younger elf. I even tried just "younger." Something else I didn't realize is that these AI systems don't recognize left or right, so I can't say "the elf on the right" or "the elf on the left." And I ran into issues with dialogue: one elf would talk, then the Bumble would talk, and then their mouths would all be moving at the same time.

Dave Ghidiu: Yeah, it would just be all over the place. So you had all these weird things happening.

Paul Engin: So I sometimes had to crop in — and this is the stuff you do right in Photoshop — I would crop in on just the one elf so the AI system knew I wanted that elf to talk.

Dave Ghidiu: Gotcha. So, that's how you worked around that.

Paul Engin: Yeah. But there was no workaround for the younger elf. I don't know what I could have done differently, but near the end we just had to deal with whatever the lip-sync was.

Paul Engin: Yeah. And then we just tried to make it work with the audio recording. So that was an interesting thing. And the other interesting thing was that one of the scenes required a screenshot. Right now there's an iPad that kind of comes out in front of the screen, and Jeff, being so kind, said, "You know, it's kind of blank. It'd be nice if there was something on it."

Dave Ghidiu: Way to go, Jeff.

Paul Engin: I mean, it's like, "Here, show him what you've done." And he turns the tablet around so the screen's facing the audience — and there's nothing on there. So the narrative was kind of broken. So I needed to do 10 more hours of work.

Jeff Kidd: Not really. See, this is where Paul's experience comes in over mine. I don't have experience with After Effects; he does. So he was able to put an image on the screen that he generated with another AI, and then he keyframed it. So as it goes toward the audience, there's something on there.

Dave Ghidiu: All right. So I saw that scene, and I was wondering how it was done, because I was like, "Oh, if this is AI, it's actually really good." So it wasn't — it was done manually.

Paul Engin: Yes, it was done manually. I used Nano Banana to create the infographic that would go onto it, based on the stuff the head elf was saying — I put that in the prompt — and then I did traditional keyframing and mapped it to the screen so it would integrate. So I think control was an issue. Well, I'll let you go, Jeff, and then I'll follow up.
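Traditional keyframing like Paul describes boils down to interpolating a layer's transform between keyframes so the overlay tracks the tablet as it moves. A toy version of linear keyframe interpolation — the times and scale values here are made up for illustration, not taken from the actual project:

```python
# Linear interpolation between transform keyframes: the basic idea behind
# manually matching an overlay to a moving object on screen.
def interpolate(keyframes, t):
    """keyframes: sorted list of (time, value) pairs; returns value at time t."""
    if t <= keyframes[0][0]:
        return keyframes[0][1]
    if t >= keyframes[-1][0]:
        return keyframes[-1][1]
    for (t0, v0), (t1, v1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            f = (t - t0) / (t1 - t0)  # fraction of the way between keyframes
            return v0 + f * (v1 - v0)

# Scale of the inserted screen image as the tablet swings toward camera.
scale_keys = [(0.0, 40.0), (2.0, 100.0)]  # percent scale at 0 s and at 2 s
print(interpolate(scale_keys, 1.0))  # 70.0 -> halfway between the keyframes
```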

Jeff Kidd: Well, a lot of it is that you want to get nuance out of a performance, or out of an edit — or it can be something simple. A lot of the time the elves, the characters, are looking off in a different direction: the head elf is asking them a question and they should be looking straight ahead, but they're looking somewhere else. It's something simple like, "Hey, have them look this way." Paul already talked about it — no matter how many different ways you say it, and I've played with this too, it doesn't get it.

Dave Ghidiu: Yeah. What are they going to do with Tilly Norwood, that AI actor you were talking about?

Paul Engin: I don't know. But I think — and this is what we're getting at, right? — when OpenAI does the Critters movie, they're going to find efficiencies.

Dave Ghidiu: Yeah.

Paul Engin: Like, the way Flow is structured right now is not very conducive for a production person like myself. I understood it after I worked with it for a while, but there was no method for me to really do the type of edit I wanted or give the direction I wanted. In a perfect world, it'd be nice to be able to do an overlay — circle a character, point an arrow, and say, "Look in this direction." That'll happen, right?

Dave Ghidiu: I think at some point it will. And, you know, even with this production it would have been nice to take it into another Adobe product and fine-tune the animation. But for the initial idea, I'm really happy with what we produced — it's fantastic. And Jeff actually did the filming of Dr. Nye, too. So this is a bunch of different worlds coming together; it's not just one person sitting in front of a prompt, spewing it out, and it produces everything for you. This is still production. It just creates an efficiency — just like, in the past, Content-Aware Fill created an efficiency in Photoshop: you select something, run Content-Aware Fill, and it replaces the selection with whatever's in the background. So if you and I were in a picture, you could click me, run Content-Aware Fill, and I'd just be the background, right?

Paul Engin: You'd be gone, right? And a lot of people were like, "That used to be my job at ad agencies — I used to have to do all this cleanup and removal," and it made production faster overnight, so I could do that myself.
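Content-Aware Fill is a far more sophisticated inpainting algorithm than this, but the core idea the hosts describe — replace selected pixels with plausible values drawn from the surrounding background — can be illustrated with a naive average-of-neighbors fill on a tiny grayscale grid (a sketch of the concept only, not Photoshop's actual method):

```python
# Naive inpainting sketch: fill each masked pixel with the average of its
# unmasked neighbors. Real Content-Aware Fill uses patch synthesis, but the
# goal is the same: make a selection disappear into the background.
def naive_fill(grid, mask):
    """grid: 2D list of brightness values; mask: set of (row, col) to replace."""
    filled = [row[:] for row in grid]
    for r, c in mask:
        neighbors = []
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < len(grid) and 0 <= nc < len(grid[0]) and (nr, nc) not in mask:
                neighbors.append(grid[nr][nc])
        filled[r][c] = sum(neighbors) // len(neighbors)
    return filled

# A flat background (value 10) with one "person" pixel (value 99) to remove.
background = [[10, 10, 10], [10, 99, 10], [10, 10, 10]]
print(naive_fill(background, {(1, 1)}))  # the 99 blends into the background
```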

Dave Ghidiu: Well, I thought you said you don't get into it, though.

Paul Engin: Well, I know, I don't — it's still too hard. So, we're going to wrap this up. Jeff, do you have any last-minute piece of advice for people out there trying this?

Jeff Kidd: Well, to your point, Dave, it's definitely not there yet for someone who doesn't have experience in video production.

Dave Ghidiu: But there's hope for me, right?

Jeff Kidd: As Paul keeps saying over and over again, it's only going to get better.

Dave Ghidiu: Okay.

Jeff Kidd: So it's definitely coming. And I invite the listening and viewing audience — I don't know, Paul, if you're going to throw the video in for the people watching on YouTube, but throw it in there — I invite people to watch it. It's really cool.

Dave Ghidiu: It is very cool. It's something else.

Jeff Kidd: To Paul's point: even if we'd done this as CG on a computer, or in claymation — I think they came to us with this idea in November — there's no way we would have been able to do this properly, with this vision, without AI.

Dave Ghidiu: Awesome. Thanks, Jeff. What about you, Paul? Last words?

Paul Engin: Yeah, I definitely see the efficiencies. There's still a lot that needs to be done to make it friendlier for a general audience, but for what we used it for, I think it was perfect, because it's a production tool for me. I looked at it like a new software upgrade in 3D that lets me auto-rig my character — an efficiency I appreciated. My hesitation is that Jim, as the writer, wanted more expression, and me, as an animator, I wanted the same thing. But you have to accept what it is, and — Jeff says it and I say it all the time — this is the worst it's going to be. So I think to myself, this is amazing, the direction it's going, and I'm really excited to keep exploring this area.

Dave Ghidiu: So, what I'm hearing is next year I can do the holiday greeting all by myself.

Paul Engin: Oh, we should do that. We should do that. You're not allowed to use any Adobe products — let's see what you can do.

Dave Ghidiu: Perfect. All right. So, tune in next year, folks.

Paul Engin: All right. Well, that's all the time we have today. My name is Paul Engin.

Dave Ghidiu: I'm Dave Ghidiu. If you enjoyed today's conversation, be sure to subscribe so you never miss an episode, and share it with your friends and colleagues. Let's be careful out there, folks. Until next time: stay curious, stay connected, and thanks for looking through The Immersive Lens with us. This episode was engineered by Jeff Kidd and recorded at the Finger Lakes Community College podcast studios, located in beautiful Canandaigua, New York, in the heart of the Finger Lakes region — offering more than 55 degrees, certificates, micro-credentials, and workforce training programs. Thank you to Public Relations and Communications, Marketing, and the FLX AI Hub. Eager to delve into your passion? Discover exciting and immersive opportunities at www.flcc.edu. As part of our mission here at FLCC, we are committed to making education accessible, innovative, and aligned with the needs of both students and employers. The views expressed in this podcast are those of the hosts and guests and do not necessarily reflect the official position of Finger Lakes Community College. Music by Den from Pixabay. Jeff, thank you very much for joining us. And this is The Immersive Lens.

Jeff Kidd: It's official. It's official.

Paul Engin: Awesome. And nice job, guys. Jeff, can you speed us up like the Micro Machines guy? Because this is kind of a nostalgia episode, so at the end it would just be nostalgic — you know, like the Micro Machines guy in the commercial.

Jeff Kidd: Oh yeah. Oh yeah.

Paul Engin: Really. So now I'm going to have to restate it. You see, this is what I've got to deal with, people.


