New AI: Next Level Video Editing! 🤯

  • Published on Jan 12, 2022
  • ❤️ Train a neural network and track your experiments with Weights & Biases here:
    📝 The paper "Layered Neural Atlases for Consistent Video Editing" is available here:
    🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
    Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrecht, Michael Tedder, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Peter Edwards, Rajarshi Nigam, Ramsey Elbasheer, Steef, Taras Bobrovytsky, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
    If you wish to appear here or pick up other perks, click here:
    Thumbnail background design: Felícia Zsolnai-Fehér -
    Wish to watch these videos in early access? Join us here:
    Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord:
    Károly Zsolnai-Fehér's links:
    Instagram: twominutepa...
    Twitter: twominutepapers
  • Science & Technology

Comments • 199

  • Soos
    Soos 4 months ago +224

    This is so insane. Editing this stuff manually takes forever; this will be used in the future to save a lot of time. Can't wait to see this AI improve.

    • Martin Smolik
      Martin Smolik 4 months ago

      @Lewis Tyner "semi-intelligent person in their basement" Oh, so even you?

    • Harry Potter
      Harry Potter 4 months ago

      @Lewis Tyner 😂 right, i agree 💯

    • Lewis Tyner
      Lewis Tyner 4 months ago +3

      @Harry Potter you know that’s true. It’s really incredible what can be done with this new technology that used to take teams of people many hours to accomplish. Gotta think about NASA back in the 60’s and 70’s till now. Thru as much trouble and cost as it took them to mislead us, and now it can be done by one semi-intelligent person in their basement. 😂

    • Lewis Tyner
      Lewis Tyner 4 months ago

      @toluar what are you referring to? The drops or the folks following them or the folks that don’t buy into it? I never can tell what folks are talking about when they say qanon.

    • Harry Potter
      Harry Potter 4 months ago +3

      And also the jobs of editors and the VFX industry will be severely impacted by this, when even a solo creator can easily make a professional-level video

  • Pkmatrix
    Pkmatrix 4 months ago +117

    I feel like I just watched the end of chroma key compositing. What need is there for green screens if you can just easily swap out any background automatically like this? This is going to do wonders for small indie filmmakers who may struggle to get even green screen to work properly. No need to pay for even a jury-rigged setup!

    • Nejc Lampič
      Nejc Lampič Month ago

      @Martin Smolik Until it's improved, which it will be with time, I guess not yet.

    • Kevin Bissinger
      Kevin Bissinger 3 months ago +1

      lol at people implying green screens and chroma key are reliable

    • Ross Judson
      Ross Judson 4 months ago

      I feel like I just watched the end of truth.

    • Tsz Fung Li
      Tsz Fung Li 4 months ago +1

      @Martin Smolik Yes, good for youtubers. When it passes most youtubers' standard, Movies will follow

    • Martin Smolik
      Martin Smolik 4 months ago +5

      Honestly, this looks great. But it is very unreliable, and since it is machine learning, there is no telling if it will ever be reliable enough to replace human experts. It is already good enough to use in concept art, but for anything more it will probably always need a human expert to touch up the results.
      However, since human animators and CGI artists often say that touching up bad CGI is worse than making it from scratch, I am a bit sceptical that we will see this in movies soon.
      On the other hand, for home videos it is indeed an absolute gamechanger.

  • Idkartist
    Idkartist 4 months ago +26

    As someone who does a lot of vfx work, I legitimately jumped back and said "WHOA" at 2:25. It's obviously not perfect, but adding in that reflection normally would be quite a task. As a long time viewer, it's moments like those that keep me coming back -- the more time goes on, the more I'm amazed at the pace of progress in AI. I can't wait to see what incredible (probably world-changing) things it will be able to do in just ten years time!

    • Dan Fontaine
      Dan Fontaine 4 months ago

      In 10 years IMO we’ll probably see personal assistant software that doesn’t suck as a staple in society. A semi-sentient assistant that can be customized to do any kind of work you can do on a computer or phone, short of actually composing thoughts into text.
      Depending on your profession this could be insanely helpful, as in "not needing to hire two other people" helpful.

    • Zizo
      Zizo 4 months ago +4

      It's not a reflection... The artist painted the goose AND the reflection. After Effects has some pin/puppet tools. You pin the whole thing onto the goose and use either the pins to animate the reflection or maybe displacement tools.

  • Kavriel
    Kavriel 4 months ago +68

    The warping is pretty bad actually IMO. Removing items from videos is the only example that works "perfectly", and there has been stylization that worked much better than this in previous papers...
    hoping to see more about stylization, that could be very cool in games. Play doom like it's a nintendo game

    • Kas Berkhof
      Kas Berkhof 4 months ago

      I only look at these video manipulations through the lens of VR, which is an extremely unforgiving display because of the 3D and high resolution. Any pixels popping in and out or even at the wrong depth will stand out like a headache. So unless the AI is that perfect, it's just a gimmick which sorta looks okay on a flat screen.

    • veggiet2009
      veggiet2009 4 months ago +1

      The worst example was the boat, but honestly that wasn't too bad. The water was impressive enough. If I had this today I'd take the water and the boat and then track in the icy land behind in the old fashioned way

    • Cosmia's Stash
      Cosmia's Stash 4 months ago +1

      good enough for memes!

    • Yo Él
      Yo Él 4 months ago +23

      You are forgetting the first rule of papers my friend

    • Exilum
      Exilum 4 months ago +8

      warping is not that bad of an issue. that's the easiest thing to improve two more papers down the line.

  • Luciano Rivera
    Luciano Rivera 4 months ago +26

    So it's a mix of overlaying in screen mode, and it would seem that so far it "only" works with pictures superimposed on video.
    Still, the fact that it can track the source footage and keep to some of those tracked points is amazing. I can't wait to see what they will be able to do in a few years with video layered over video.

  • DJ Kent
    DJ Kent 4 months ago +79

    Computer Science major here, with a game design concentration. Recently I've become super fascinated by computer graphics since so much of it is based on math. Kinda funny how this entire channel gained newfound respect and significance for me when you realize all the crazy stuff that goes into computer graphics.

  • Bill R
    Bill R 4 months ago +8

    2:26 "...the reflection of the swan is also computed correctly" - The swan's reflection was not rendered correctly. The downward swooping lines at the back should be reflected as 'upward' swooping in the mirror-image of the water's surface. The AI does not understand reflection. It mimics it.

    • Agnes Nutter
      Agnes Nutter 4 months ago +4

      And it wasn’t even the AI who made the reflection, it was a human artist, as he explained at the end…

  • João Gabriel
    João Gabriel 4 months ago +21

    Is it possible to have a high level understanding of what it means to separate the background from the foreground in practice, or is it a black box?

    • Potato Face
      Potato Face 4 months ago +10

      Study their research paper linked in video description. It's heuristics. The neural network is manually taught what to focus on and is trained on a data set of many scenarios.

    • MrGTAmodsgerman
      MrGTAmodsgerman 4 months ago +1

      I think its possible when you look at what Photogrammetry does.

    • W0tch
      W0tch 4 months ago +7

      It probably has to do with optical flow estimation, foreground having bigger flow than background

  • Burger FPV
    Burger FPV 4 months ago +1

    Very impressive! I hope we get a text to video model similar to clip/glide this year 😼

  • realcygnus
    realcygnus 4 months ago +3

    Nifty ! It really is amazing what math/science/technology gives us. I'd literally pay with years off my life for a single glimpse into the far future.

  • Marret W. Hosalser
    Marret W. Hosalser 4 months ago +14

    A lot of these clips are rough, and many are things we’ve seen other algorithms do just as well, if not better, like removing details from an image or video. The impressive thing is changing the background or foreground artistically, which means the AI is doing a good job of reinterpreting a 2D video as a 3D scene without actually digitally recreating it. Impressive! I don’t know how much better it can get without actually simulating the 3D space, which would probably make it much more expensive to run. I’m excited to see how much more this develops, if at all. I feel like this is the sort of thing that’ll get dropped for a year until some app like Snapchat rediscovers it, and reappropriates it for recreational use. Who knows!

    • Martin Smolik
      Martin Smolik 4 months ago

      I would love to see this on instagram or snapchat. Even right now it seems very impressive for recreational use.

  • SpicyMelon
    SpicyMelon 4 months ago

    This is exactly what I recommended to Google to make as their next project after they made the Teachable Machine. My idea was a bit different though, but basically the same. I wanted to have a program that could be given two images, one real and one with the modifications, and the AI would figure out the temporally consistent change that was made. Of course it would get more accurate with more real and modified images to train from, but the example I gave was to have a dog with no spot on its eye, and then a picture of the dog with a spot on its eye (like the dog from the TARGET store). Then you could put in a video and the AI would generate images with the spot in the correct place.
    Very difficult to accomplish, but knowing Google they could definitely do it. Seems like they were beaten to it.

  • BrickStar X
    BrickStar X 4 months ago +3

    Wow!!! Is this " feature " already out!?!? I need to get my hands on it 🙂 does it have a name? If yes could someone please tell me? Also this is revolutionary every day people can finally start making professional video 😃😃

  • Zizo
    Zizo 4 months ago

    Well, for a first paper it ain't so great. In the future though this can be amazing.
    Depending on the software you use you can achieve the same or better results doing it manually.
    For simple fx like these, tracking a shot takes seconds, rotoscoping takes minutes.

  • Iamwolf134
    Iamwolf134 4 months ago +2

    This serves to make movie making that much more economical in the long run.

  • A Peckx
    A Peckx 4 months ago +20

    The reflection of the swan still isn't perfect though. The reflection doesn't actually match up to what it's reflecting. It's just carrying on the pattern below the water

    • Nelson Amaral
      Nelson Amaral 4 months ago

      Great, so we're already at secondary details most people fail to notice even upon review.
      Can't wait until we're going pixel by pixel comparison, filtering as if we're making TheXvid UFO videos.
      It's evolving fast, very fast.

    • Rexodel
      Rexodel 4 months ago +2

      The reflection itself was part of the part that was manually stylized, so the artist just put the same pattern on the reflection, as is shown later in the video

    • Ardusk
      Ardusk 4 months ago

      @F M And it totally missed the neck

    • F M
      F M 4 months ago +1

      You're right. I don't think it's handling reflections at all, it's just applying the pattern to swan + swan reflection.

  • penguinista
    penguinista 4 months ago +1

    I think it is more impressive to consider where we were two papers ago than to try to imagine where we will be in two papers.
    The latter feels like science fiction, but the former makes me struggle to believe.

    • TTTrouble
      TTTrouble 4 months ago

      This is such a good point. It's why I really enjoy when these videos highlight prior methods in many of his other videos

  • asmithgames
    asmithgames 4 months ago

    On one hand, this is truly amazing AI and it's a wonderful time to be alive. On the other hand, this is going to thrust us even more into a post-truth society once the government and other organizations learn how they can abuse this technology.

  • Srinivasan Thirumalai
    Srinivasan Thirumalai 4 months ago

    Incredible! Thanks yet again for bringing this up.

  • AmazingHistoryOf Vlogging
    AmazingHistoryOf Vlogging 4 months ago +3

    A very appropriate use of 🤯. Truly amazing stuff.

  • MrVipitis
    MrVipitis 27 days ago

    It took about 1 year to get depth matte estimation into phone apps, and two years later it's in professional post-production suites like DaVinci Resolve from Blackmagic Design. Also the "magic mask" segmentation.
    This seems to be a step ahead of compositing, but maybe we will know in two years

  • delpinsky
    delpinsky 4 months ago

    It never ceases to impress what AI can do... and we are still at the beginning of this process! We are creating something that is going to create a new reality in the end 😅 Soon it will be difficult to tell what's real and what not...

  • Про ИЗО
    Про ИЗО 4 months ago

    Amazing and incredible even in this state of quality!

  • TiagoTiago
    TiagoTiago 4 months ago

    I remember a similar work from a long time ago; was presented on Siggraph I believe, I remember they had a shot in the end where they wrote "Siggraph" on a Giraffe (or whatever it was the name of the conference). I don't think they were using AI though.
    edit: Found it, the technique is called "Unwrap Mosaics" seems it was presented in 2008

  • Alex Fraser
    Alex Fraser 4 months ago

    I like that this video had a little more technical detail than usual. A bit more, please? I'd love to see an overview of the network architecture.

  • Ansuman Mahapatra
    Ansuman Mahapatra 4 months ago

    Please create a tutorial on how to use Weights & Biases with a training and testing example. Thank you

  • Nelson Amaral
    Nelson Amaral 4 months ago

    I can't wait for the AI we'll need to develop soon to figure out what is AI and what isn't.
    We're screwed!

  • Andrius Valatavicius
    Andrius Valatavicius 4 months ago +3

    Another Two Minute Paper, what a time to be alive!

  • Erik Žiak
    Erik Žiak 4 months ago

    I wish there would be some AI that could convert vertical videos to horizontal ones. I hate vertical video and do not want to see it anytime again.

  • Yvana Luz
    Yvana Luz 4 months ago

    Finally, one good single reason that's worth the ridiculous price of their creative cloud services.

  • nightmisterio
    nightmisterio 4 months ago

    Now we need easy demos so all can use

  • tomahzo
    tomahzo 4 months ago +1

    What is the performance, though? I doubt this is realtime so how long does it take to render this?

  • Perceivedshift
    Perceivedshift 4 months ago +3

    This is great for creating clean plates!

  • Lettendo
    Lettendo 4 months ago +2

    Is it already possible to turn a 4:3 video into 16:9 by generating the missing image information on the sides by AI?

  • TheGamingChad
    TheGamingChad 4 months ago

    i wonder what AI is gonna be capable of doing in 10 years from now...

  • Umar Farooq Sakauloo
    Umar Farooq Sakauloo 4 months ago

    Amazing technology

  • Shayne Weyker
    Shayne Weyker 4 months ago

    That parallax on the fake far mountain in the boat video was too strong. Also, until they get the warping problems fixed, people doing video editing are going to need to be allowed to, and willing to, manually tweak control points in the mask separating foreground from background. The dog tracking thing is cool and we're already seeing that kind of thing making its way into the Resolve video editing software.

  • psiga
    psiga 4 months ago

    Liking, commenting, and already subscribed! Yet another work of magic!

  • Emma Gamma
    Emma Gamma 4 months ago

    I feel like we're only about 2-4 papers away from seeing generated video edits, based on a normal human sentence/paragraph and input video, in which the results are indistinguishable from reality to us...

  • Jason S
    Jason S 4 months ago +2

    The robots are coming for my job! Let's see them deconstruct 1.5 hours of vlog footage into a meaningful ten minute video though 😅

    • Jason S
      Jason S 4 months ago +1

      @Water is Eternal then I just have to start using it before my boss finds out about it 😂

    • Water is Eternal
      Water is Eternal 4 months ago +1

      They already got AI summarising large articles into a few sentences ... soon it'll be able to review and summarise your video 😏

  • Potato Face
    Potato Face 4 months ago +2

    Goddamn. Didn't think i could be so amazed 😍. The future is gonna be insane

  • IamSH1VA
    IamSH1VA 4 months ago +1

    I really loved this, amazing 💪🤩.
    But... how can we detect that a video has been tampered with, and what are the consequences?

  • JayDizzy
    JayDizzy 2 months ago +2

    Amazing, but is this available for the public to use?

    • JayDizzy
      JayDizzy 2 months ago

      I dug through the links and found that the code they used is available for people to try, but that’s literally another language to me 😭 I’m praying a brave soul someday turns this into a program that has a few buttons like EBsynth 🙏

  • Andrew Rothman
    Andrew Rothman 4 months ago

    Can someone clarify for me the difference between a “paper” and an actual “application”? In my mind, a paper is just a write up about something, not an actual piece of functioning code. At what point is a “paper” actually a useable product?

  • dhillaz
    dhillaz 4 months ago

    0:27 Kasten et al: "We can conveniently forget about this speeding motorcycle"
    Tesla self driving AI: "You did what!?"

  • mizzleton mc
    mizzleton mc 3 months ago

    we need to be able to create a rough cut by an ai and then a human corrects the errors and warping manually. then it will be perfect.

  • Sevak Fair
    Sevak Fair 4 months ago

    That is the purpose after all, anything that improves chromakey is good. On the other hand, while great, the overhead involved is massively higher and not applicable to most workflows, yet.

  • Robert McGarry (mansion)
    Robert McGarry (mansion) 4 months ago +4

    Today, we enhance reality! So grab your papers.

  • Homeyworkey
    Homeyworkey 4 months ago

    At 2:28 it's worth noting that the reflections don't work for the neck of the swan for some reason. I wonder why that is. Regardless, this is very promising work :O

    • Hernan Soto
      Hernan Soto 4 months ago

      That's because of this: 3:50 - 4:08. It seems that the swan and its reflection were hand-painted, and whoever painted it forgot about the neck. The AI seems to only stick and warp the new image onto the video

  • Dennis C
    Dennis C 4 months ago

    Amazing, but still not usable for professional use

  • Ardusk
    Ardusk 4 months ago

    Ok, so I understand that the developers of this have put a lot of work into this technique, but it feels kinda like 2019 technology. Can anybody explain how this is new or exciting for my monkey brain?

  • RmaN
    RmaN 4 months ago

    I wonder why audio world is so much behind in advancements of AI compared to video 😐🤕🥺

  • Random Human
    Random Human 4 months ago

    Love your videos!

  • paresh bhangale
    paresh bhangale 4 months ago

    Can you please make a few videos on how to learn and develop our own AI?

  • Subham Burnwal
    Subham Burnwal 4 months ago

    Great! We are going to solve global warming with some awesome editing!
    (Take this as a joke, guys.)

  • Chris Hayes
    Chris Hayes 4 months ago

    We're that much closer to bringing Black Mirror to life 😁

  • John Grey
    John Grey 4 months ago

    It's funny that in the digital age, many still call them "papers". Some older folks still say "taping" when they refer to recording video. There are many ancient relics in our language, hopefully they will phase out soon.

  • oisiaa
    oisiaa 4 months ago

    I don't know that AI will bring us much good, but I do believe that AI will sow mass chaos when it comes to what is to be believed as real vs. false. Trust will be hard to distinguish from fantasy for the rest of human history.

    • Shaun A
      Shaun A 4 months ago

      Do people believe everything they read? What about every tabloid picture you see? Ever watch a movie and suspect that dragon is not real? Many are critical of what they read and view already. I agree though that there needs to be education on this matter.

  • Ser Ta
    Ser Ta 4 months ago +1

    So awesome 🤩 👏

  • Unknown
    Unknown 4 months ago +1

    I laughed loudly when you said "This feels like BlackMagic" xd

  • Lardzor
    Lardzor 4 months ago

    I don't think the reflection on the swan was computed correctly. The pattern of stripes above water was black above white above grey. It was the same in the reflection, when the order should have been reversed.

  • epizan Other
    epizan Other 3 months ago +1

    Does anyone know what software was used? I know he said it was by Adobe.

  • RoliTheOne
    RoliTheOne 4 months ago

    "This feels like Blackmagic"... "Sir, this is Adobe". :D

  • aqryllic
    aqryllic 4 months ago

    Isn't this basically cryptomatte but for videos? Mind-blowing!

  • PhelPer
    PhelPer 4 months ago +1

    Would be nice if you could tell us how to do it or name the programs you use…

  • Adil Zia
    Adil Zia 4 months ago

    Wait so when is this coming to After Effects again?

  • Ben North
    Ben North 4 months ago

    So is this possible for me to use on shots i have?

  • Scott Simmons
    Scott Simmons 4 months ago

    Can't 90% of this be done with Lockdown in After Effects right now...?

  • John Coburn
    John Coburn 4 months ago

    Is there any way to use some of these techniques now?? 👀

  • Shikyo Kira
    Shikyo Kira 4 months ago

    2:28 the neck of the swan isn't computed correctly though. Still, half-done work is better than doing it from scratch. I'll take whatever is free

  • Tünde Eszlári
    Tünde Eszlári 4 months ago +1

    The video turned out really well.

    • Dóra Dávid
      Dóra Dávid 4 months ago +1

      The usual good quality. He makes very interesting videos.

  • Jorge C. M.
    Jorge C. M. 4 months ago

    I tried to hold onto my papers but the AI made them disappear!

  • Dhinesh Kaviraaj
    Dhinesh Kaviraaj 4 months ago

    What a time it is to be alive !!!

  • Anas Lee Joon Gi
    Anas Lee Joon Gi 4 months ago

    How expensive is this AI software? Can we use it?

  • Secret Tutorials
    Secret Tutorials 4 months ago +7

    Adobe is the last company that I thought would bring nice AI stuff 😅 Nice to see that they finally caught up

    • Tom
      Tom 4 months ago +1

      Don't they usually do quite a bit of AI research? I mean considering they have things like content aware filling and moving for photoshop and a while back they were working on synthesising voices. Seems like it makes sense for them to be pouring money into this sort of research considering they can actually integrate them into their existing products.

  • MrDuhVinci
    MrDuhVinci 4 months ago

    Can the AI generate a depth map of, say, a person's face?

    • AEJuice
      AEJuice 4 months ago

      Yes, we will have a tool to do that next week

  • Neo
    Neo 4 months ago

    So this is EbSynth's open source alternative.

  • brian bouf
    brian bouf 4 months ago

    I think when stuff like this becomes more accessible the film industry will collapse, because everyone can make a movie without even using real humans.

  • loginvidea
    loginvidea 4 months ago

    Everyone hunts megapixels and framerate. How is it that we still don't have a video format with acceleration and orientation data built into every other phone? It would make these so much better.

  • Los Merengues
    Los Merengues 4 months ago +1

    My favorite research channel

  • Andrés Felipe Ramírez Bonilla

    How is this not trending already??? It has more than 1.1M views

  • James
    James 4 months ago +1

    Is the processing in real time?

    • Malchir
      Malchir 4 months ago

      Of course not.

  • fischX
    fischX 4 months ago

    It somehow sounds like Károly was also replaced with an AI

  • RxbLxx
    RxbLxx 4 months ago

    now take this to the level the tech companies are using it and then u will never know what is real and what isnt ..hmmm sounds like the us for the past 20 years

  • Mercy
    Mercy 4 months ago

    Indistinguishable from magic

  • zander
    zander 4 months ago

    So we are in the matrix, creating another matrix 😎

  • Stiekeme Henk
    Stiekeme Henk 4 months ago

    The swan's neck reflection wasn't stylized, just the body.

  • SERVANT TO FRIEND
    SERVANT TO FRIEND 4 months ago +1

    So, what program is being used here? Was that said?

  • Quast
    Quast 4 months ago

    2:38 looks like the neck reflection wasn't processed

    • Hernan Soto
      Hernan Soto 4 months ago

      4:06 looks like the reflection was handpainted (or made with a filter, but that it wasn't the AI)

  • Bill R
    Bill R 4 months ago

    At around 1:15, in the flowers-on-dress example, did the research really need to rainbowize the bench? What's the agenda here.

  • Arthur Arcturus
    Arthur Arcturus 4 months ago

    At what price do we add flowers to this dress? No longer having video evidence. This won't end well.

  • Kryplus
    Kryplus 4 months ago

    And that's how the authenticity of videos lost its value forever...

  • dwfalex
    dwfalex 4 months ago

    The reflection of the swan's neck was not computed correctly ;)

  • Not Oltrex clearly
    Not Oltrex clearly 2 months ago

    Ebsynth can get stylization better

  • superpos
    superpos 4 months ago

    "...Do not look at where we are. Look at where we will be..."
    Károly Fehér

  • #Digiverse
    #Digiverse 4 months ago

    Where can I get this software?

  • ooww
    ooww 4 months ago

    As big companies buy up the patents, democratization might not look the way you want it to

  • Ashtree81
    Ashtree81 4 months ago

    AI refreezes polar caps

  • Ser Ta
    Ser Ta 4 months ago

    Oh wow unbelievable