Google’s New AI: This is Where Selfies Go Hyper! 🤳
- Published on Dec 22, 2021
- ❤️ Check out Weights & Biases and say hi in their community forum here: wandb.me/paperforum
📝 The paper "A Higher-Dimensional Representation for Topologically Varying Neural Radiance Fields" is available here:
hypernerf.github.io/
🙏 We would like to thank our generous Patreon supporters who make Two Minute Papers possible:
Aleksandr Mashrabov, Alex Balfanz, Alex Haro, Andrew Melnychuk, Angelos Evripiotis, Benji Rabhan, Bryan Learn, Christian Ahlin, Eric Martel, Gordon Child, Ivo Galic, Jace O'Brien, Javier Bustamante, John Le, Jonas, Kenneth Davis, Klaus Busse, Lorin Atzberger, Lukas Biewald, Matthew Allen Fisher, Mark Oates, Michael Albrecht, Michael Tedder, Nikhil Velpanur, Owen Campbell-Moore, Owen Skarpness, Peter Edwards, Rajarshi Nigam, Ramsey Elbasheer, Steef, Taras Bobrovytsky, Thomas Krcmar, Timothy Sum Hon Mun, Torsten Reil, Tybie Fitzhugh, Ueli Gallizzi.
If you wish to appear here or pick up other perks, click here: www.patreon.com/TwoMinutePapers
Thumbnail background design: Felícia Zsolnai-Fehér - felicia.hu
Wish to watch these videos in early access? Join us here: thexvid.com/channel/UCbfY...
Meet and discuss your ideas with other Fellow Scholars on the Two Minute Papers Discord: discordapp.com/invite/hbcTJu2
Károly Zsolnai-Fehér's links:
Instagram: twominutepa...
Twitter: twominutepapers
Web: cg.tuwien.ac.at/~zsolnai/
The possibilities of AI keep blowing my mind with things I couldn't even conceive were possible with computers. What a time to be alive!
Some commenters here do not understand what appeal to authority and appeal to popularity logical fallacies are.
@Bob Smithy Free will does not exist in science, simply because free will cannot be put into an algorithm and cannot be calculated. And that is exactly what differentiates us from AI. We do have free will; machines will never ever get free will. It is not possible. Machines will always be nothing else but advanced calculators. If you think that humans are nothing but calculators, then think about what you did with that old Texas X-66 calculator from 76 when you got a newer one. Or what you do to your old phone when you get a new one. Taking machines apart is not immoral. Taking humans apart *_absolutely is._*
@Muslims remember Apostacy Day 22nd of August Not believing in souls does not imply that they don't believe in free will. Additionally, it seems like you believe that the only way to have free will is to have a "soul", which is debatable.
Either way, whether we have free will or not, you will still feel the exertion of effort to do things; you still feel happiness and sadness, frustration and anger. The experience of all these things is real. Regardless of whether souls exist or not, I do know my experience of things is real, even if we're all just in a simulation.
Personally speaking, I'd say we do have free will, as I can choose whether or not to do something like lift up my bag, but perhaps it's just an illusion. Anyhow, I think it really depends on a lot of things like: what you consider free will to be, what you define the act of choosing to be, your opinions on the conscious and subconscious mind, etc...
I suggest all ML papers include nerfies of their authors making silly faces from now on. This should just be standard practice.
The mind boggles at what this technique could achieve with existing footage and old movies. When I was a child I knew that one day you would be able to walk into films and interact with the people in them. Technology like this, upscaling, 60fps conversion, 3D and AI based character animation used collectively could make that a reality!
@Anton Liakhovitch Your very point supports Nestor's worry. If human error is already an issue with recounting history, imagine how that problem would compound if AI-estimation errors were also a factor.
@Prince Westerburg Gotta tell that to Japan first. With their advanced experience and techniques in filmmaking, soon they'll find a way to let the world forget Pearl Harbor and the other nasty stuff Imperial Japan once did.
I imagine upscaling to VR would be possible if you can offset images and such with this AI... 🤔
If this technique could stabilize all the overly shaky fight scenes in modern movies that would be enough for me.
The possibility to interactively change the higher hyper space coords of the scene is really mind-blowing! Wondering how fine the resolution is for the different states, or how much data can get analysed and placed into these hyper space dimensions.
@denden Even if, two more papers down the line, it still can't run in real time, you could still theoretically take a nerfie with your phone and then wait a few seconds for your phone to do some processing. Then it can generate a map of 3D pictures for you like in the demo. These NeRF papers always have so much awesome potential!
@denden Could be! Not gonna lie, haven't checked yet - but regardless, the fact is that the "parameter" exists in the network, so ultimately it COULD be 'easily' implemented if it wasn't :)
@denden I think that's more a result of having to put it on the web. I suspect the underlying space is continuous - but you don't want the browser trying to run the model for every frame! So they pre-render a few points in the space and interpolate in real time.
It’s actually a cool trick by the authors to make it look like you’re “interactively” changing the hyper coordinates and viewing the result in real time. Open the developer tools of your browser on the project page, move your mouse over the rendered results to the right of the coordinate selector, and you’ll see it’s actually a bunch of pre-rendered images. Still pretty cool tho!
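For what it's worth, here's a minimal sketch of how such a demo might work, assuming the page keeps a small grid of pre-rendered frames indexed by two hyper coordinates and just snaps to the nearest one per mouse move; the grid size, file layout, and names below are my assumptions, not the authors' actual implementation:

```python
# Hypothetical sketch: serve an "interactive" hyper-coordinate viewer from
# pre-rendered frames instead of evaluating the network for every mouse move.

GRID = 5  # assumed: a 5x5 grid of pre-rendered samples of the 2D hyper space

def frame_path(i, j):
    # assumed file layout: one PNG per pre-rendered (i, j) grid point
    return f"renders/hyper_{i}_{j}.png"

def pick_frame(u, v):
    """Map mouse coordinates u, v in [0, 1] to the nearest pre-rendered frame."""
    i = int(round(u * (GRID - 1)))
    j = int(round(v * (GRID - 1)))
    return frame_path(i, j)

print(pick_frame(0.37, 0.82))  # -> renders/hyper_1_3.png
```

A fancier version could blend the four surrounding frames instead of snapping, but either way nothing heavier than an image lookup runs in the browser.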
Directors today: "Stop shaking the camera, I want a smooth path exactly like this!"
Directors tomorrow: "SHAKE THE CAMERA MORE! I need more data for when I direct the cinematography in post later!"
With the depth information, maybe you can see selfies, videos, etc in virtual reality, right?
that will be cool.
exactly what i was thinking
@Anton Liakhovitch I mean there is no need for a mesh and textures as long as you don't want them. But I do... how do you save or manipulate your models? I want to have things in a format which is not proprietary and still usable in a couple of years. Just because of that, I think texture + mesh is desirable - apart from the obvious rendering benefits... but I get your point. For a movie or some sort of Snapchat app this is perfect and you don't need the 3D reconstruction. And you save yourself a ton of compute work.
@thoeby With this technique, there is no need for a mesh and textures. Just get it to run fast enough and output two viewports at once. VR goggles are just screens, after all - they don't care if you render with polygons or with some crazy NeRF technique from the future.
@Sanane Farketmez 🤓
Yeah, there is actually a thing like that and it's called AR (augmented reality). But you need lots of computational power to compute all the point cloud data.
The accuracy of the depth pass is mind-blowing, especially when dealing with glass/transparent materials and holes! This is awesome! I wonder if it is able to extract depths as accurately on stills or videos with less parallax?
I may sound like an ignorant moron, but I don't think this is that interesting unless everything in the video used single-camera perspective
Is it using stereo depth cameras though? Or is it a single camera in this video?
I hope these are baby steps towards fixing the projection and parallax errors found in vr180 stereo captures and playback in a few years.
“Hold on to your papers + What a time to be alive” are my 2 favorite sentences of 2021
And now 2022 :D
And, hotyp, they even make perfectly readable acronyms. Wattba!
I can not wait for a future version of this method to be used to generate a smoother transition between google street view images.
Yeah this is a nice idea. It wasn’t really obvious to me as to where this would be currently useful but I might just have poor imagination.
Oh wow, that's actually a really smart implementation of these. If this could run in real time in a browser, then with a bit more data or an algorithm better at more intensive interpolation, this would be amazing.
Would be interesting to note how fast the processing times are for each technique. When making things appear real time on a youtube video, it should be noted if the data was rendered ahead of time. Thank you.
All of these were taken in some closed space with walls somewhat near. But how about if you're in nature, on top of a mountain. Would it recognize the massive distances? Also, could this be used in astronomy? I think it could be very useful for astronomers and space research! :)
@lystfiskerlars That's pretty cool!
@lystfiskerlars Sadly I don't think this type of stuff can run in realtime yet
@YTDekus No use at all in astronomy; stars barely move and, compared to a live scene, represent very, very few points. The grunt-work brute-force maths has been done with an effective aperture equivalent to the diameter of the Earth's orbit around the Sun. There simply isn't any more data or processing available to be done by any system that would improve our existing knowledge.
News just in: the James Webb Space Telescope will effectively have a larger aperture and better optics. Since we are now idle couch potatoes, it's likely machines will be used to refine our data. Unlikely to be AI though, since brute-force maths is easy in this case.
Or completely replace LIDAR for self driving cars... oh wait... :)
I really love how fast NeRFs are growing as a research direction. Kinda wish more of them would be trained on Raw images, but I guess the only thing that's really standing in the way of that is the relative lack of training data as most images are gonna be jpegs
0:31 NeRF looks like some Matrix rotational effect! Very cool looking, even if it isn't realistic.
Seems like a great way to do digital video stabilization without any warping caused by the parallax effect.
I just can't wait for AI photogrammetry! It will take 3d scanning to a higher, much more affordable level!
What would be even more amazing is to be able to choose a moment from the video, rather than a point in hyperspace, to parameterise the nerfie, ie to get a 3d reconstruction for every moment of the video!
thats amazing! it would be cool to send someone a video that they could move around a bit spatially, captured just like a normal video
What would be a good way to bring nerfies or HyperNerfs into the physical world? Something like a digital picture frame with an IMU/orientation sensor? Or with a touch interface to move the point of view?
Light field displays?
One of those looking glass displays :O
Does this technology need stereo depth cameras to work? Could it also be implemented with lidar cameras like the L515 from Intel ?
"A powerful deadline experience" is priceless, very relatable!
Instead of doing volumetric frames in time, they do just the two spatial dimensions plus a fourth channel, a depth channel, since the whole architecture is RGB in, RGBD out.
You can use the time dimension as a high-dimensional latent space, which you simply animate through a loop.
Great to see such estimations getting better.
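As a rough, hypothetical sketch of that last idea - conditioning a NeRF-style field on an extra latent/ambient coordinate and then sweeping that coordinate in a loop - here's a toy model. The layer sizes, names, and the RGB+density output are my own illustration, not the authors' actual architecture or the RGBD predictor described above:

```python
import torch
import torch.nn as nn

class TinyLatentField(nn.Module):
    """Toy NeRF-style field: a 3D position plus a latent/ambient coordinate go in,
    RGB color and density come out. Purely illustrative."""
    def __init__(self, latent_dim=2, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4),  # 3 color channels + 1 density
        )

    def forward(self, xyz, latent):
        out = self.mlp(torch.cat([xyz, latent], dim=-1))
        rgb = torch.sigmoid(out[..., :3])    # colors constrained to [0, 1]
        density = torch.relu(out[..., 3:])   # non-negative density
        return rgb, density

# "Animate through a loop": keep the query points fixed and sweep the latent
# coordinate to replay the captured deformation states.
field = TinyLatentField()
xyz = torch.rand(1024, 3)
for t in torch.linspace(0, 1, steps=5):
    latent = torch.full((1024, 2), float(t))
    rgb, density = field(xyz, latent)
```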
Dr. Károly, I love your videos! One of the best decisions of my life was clicking on a random video of yours that YT recommended; ever since, it always brightens my day when I see there's a new one. I think it's a noble thing that you share these brilliant papers. I'll keep following this series, and I hope the holidays go well for you! ^^
(P.S.: apologies if it bothered you that I addressed you informally)
I would love to view one of these Nerfies in VR
It'd be super cool for this to go mainstream!
They will, just in time for the metaverse. This will be the selfie evolution for next-gen meta users and will become a staple of every meta device.
Not something I'd ever use for Selfies but I can see this having applications elsewhere.
just brutal! it just keeps triggering new thoughts like "everything's gonna be possible"
This could have tons of amazing applications in photoscans, What a time to be alive!
I have a bad feeling that these kinds of wonderful neural networks will at some point end up being Snapchat filters or some similar crap
They already are; what do you think is funding these research papers? :)
I remember seeing this kind of stuff in Star Trek tng and thinking “this could never be real”
I always remember that scene from Enemy of the State where they rotate around Will Smith and show he has an extra deformation in his bag where the Gameboy was.
I always remember people saying "that's impossible, it's just making things up, that'll never happen!" lol
Technology sure showed them. Equally the minor differences in reflected light is enough to help recreate imagery out of sight to a decent degree, even with a static 2D camera sensor and not a lightfield sensor setup. (but the latter sure improves things considerably!)
I kind of want to watch that again now that I remember it.
Could you try keeping the new method on the same side for the whole video when doing comparisons? It can be confusing if we don't read the label all the time.
But when can we use this in a product? This seems so good for VR.
This is so amazing... I can't wait to see how many papers I drop next time!
You gotta squeeze those papers
Hope to see the next generation of iPhone use this technology in the camera, so people can change the picture's angle after the shot.
wow this is incredible!! can't wait for it to show up on my smartphone
Can this be used for putting a 3d scan of an area into modelling program or game engine?
Can this be used to 3D scan stuff?
As a VRChat user - anybody who works on the next paper down the line, we need more 3D scan options just with a video!
A powerful paper deadline experience :)
This man just keeps uploading bangers after bangers
Incredible work
Would be great for looking around old movies and changing the expressions of the film stars; imagine Bela Lugosi fang shots.
haha the authors part! Really sells how fluid this system can be
In theory you could use this to create VR 3D photos from movie stills or live performances from the past.
Can you crop a HyperNeRF to preserve the relevant/interesting bits easily, rather than show the whole frame, and perhaps save on memory footprint? Can you zoom in to any extent rather than just wobble about? Could you screen capture a portion of an FPS game or movie and turn it into a HyperNeRF, or is certain picture data behaviour expected?
This is mind-blowing, but inventors never know what their creations will be used for.
do you think this could be used for making 2D video 3D?
I like seeing this specific application develop
Now we can simply extract models from games! Content creators will be so happy.
I see, so the new way to make movies is to create a hyperspace from which we can render the actions and camera movements.
Love the consistent uploads nearly every day!
Now we just need a common format to share these things on the web like GIFs
WEBN, coming soon.
OK, this was trippy as hell. It can record the space but at the same time the video itself, and merge both.
8 brilliant people making something I never knew I didn't need. What a time to be alive.
Now use those depth maps to actually create wireframe renditions, and you have essentially reinvented a 3D scanner.
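A minimal sketch of that idea, assuming a simple pinhole camera with known intrinsics (the depth map below is random placeholder data, and the function name is made up): back-project each pixel's depth into a 3D point cloud, which a meshing tool could then turn into wireframe geometry.

```python
import numpy as np

def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project an H x W metric depth map into N x 3 camera-space points,
    assuming a pinhole camera with intrinsics fx, fy, cx, cy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Placeholder example: a fake 480x640 depth map with plausible intrinsics.
depth = np.random.uniform(0.5, 3.0, size=(480, 640))
points = depth_to_point_cloud(depth, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
print(points.shape)  # (307200, 3)
```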
What a time to be alive!
What a time to be alive!
omg I love the "researcher storyline" add-ins 🤣
This is great for cheap DIY Matrix-style levitating-camera 360 rotations, probably without having to have an expensive circle of cameras and such.
Again, this will be brilliant in VR
Real-time virtual 3D soijacks. What a time to be alive 😲🤳
I don’t know why this stuff never makes it to mobile apps. Let the general public have fun too!
Clearly this is in a marketable state; idk why this hasn't been implemented in anything, because this is the future of capturing and storing "memories". First were drawings, then writing, then pictures, then video. Only a matter of time until we can literally store the space-time of a memory on a device in 3D.
Am I the only one that immediately grabs any paper nearby when he says "hold on to your papers"?
Basically we can make 3d pictures from video now! Incredible!
I hear our future AI overlords chuckling over our progress in AI development already ;-)
If only Google's engineers actually worked on things that matter, like electricity storage!
Video editing is now fun and easy with this AI research.
This is incredible
Wow, Fellow Scholars, I'm so excited! This will be a benefit to mankind.
Does it need your SS# to work or does it just capture every physical feature that makes up your identity?
Can this be used to "undo" the motion of the camera? aka. stabilize the video? Would be a little more useful than selfies ;)
Yes, or create a new camera motion within the recorded volume, and/or create a new animation sequence (using the recorded expressions in a different order).
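As a rough sketch of the stabilization idea (my own illustration, not anything from the paper): smooth the estimated per-frame camera positions and re-render the reconstructed scene along the smoothed path. The rendering step is left as a commented-out stub, since it would call into whatever volume renderer you have; a real stabilizer would also smooth rotations.

```python
import numpy as np

def smooth_camera_positions(positions, window=9):
    """Moving-average smoothing of per-frame camera positions (N x 3)."""
    kernel = np.ones(window) / window
    return np.stack(
        [np.convolve(positions[:, i], kernel, mode="same") for i in range(3)],
        axis=-1,
    )

# Hypothetical usage: fake jittery path -> smoothed path -> re-render each frame.
shaky_path = np.cumsum(np.random.randn(120, 3) * 0.02, axis=0)
stable_path = smooth_camera_positions(shaky_path)
# for pose in stable_path:
#     frame = renderer.render(pose)  # placeholder: your reconstructed-scene renderer
```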
OK, the tech is awesome, but the use case is questionable - even more selfies really isn't what the world needs ^^
(yes, I know it's only one application)
Oh you dummy. Didn't you see the food pictures? ;)
Too bad that none of these amazing applications I've seen on this channel are available to people.
That is the most beautiful depthmap of a breaking cookie I have ever seen.
this will be huge for VR videos (of a certain kind ;) )
Two minute papers: incredible!
Dear Fellow scholars: incredible!
Me: You're sleepwalking and programming your digital open-air prison
For better or worse, this paper has to do with entertainment, so I have a feeling it will get tons of attention. People will get to know about this more than most papers. It's a bit disappointing, but hey, it works.
holy shit, this is the craziest thing I've seen here in a while, can't wait for Google to put it on the pixel 12
So... Are we going to see people waving their phones in the future? Jokes aside, this method can be used to record artifacts three dimensionally quickly. What a time to be alive!
Couldn't this be used to convert 2d video into 3d video?
I finally found out how Rick Deckard managed to discover that evidence from the picture in Blade Runner. Nerfies!
Give them a 3D scan of your face, body, and your house; it's good for your privacy when they connect it with other technology that can see through walls and track movement using WiFi and other waves. And get a 5G phone so they'll have better resolution with millimeter waves. Then they can have, like, a 3D tour of your house.
This is really incredible
I am not sure if the "What a time to be alive!" will hold up to the nefarious human nature... but I guess we'll see.
Man, besides a nice HyperNeRF video, idk where we will be two more papers down the line from here!
Inspirational as usual!
Technology man!💟
I wonder if this technology will arrive on the market as a brand exclusive deal. Only for iPhones and stuff because _Apple_ bought the rights, the company, the people.
NeRF is really nice. But it's still slow. Let's see if someone can make it faster.
This paper will be even more important in the future.
This literally means that we can turn old movies into 3D VR movies. And not just that: we can use our own family videos and turn them into 3D VR movies. It's like time-traveling back into our childhood, or other precious moments in our lives, or the lives of others.
If this can be done in nearly real time, we can very easily create a fake hologram.
Humans have been around for thousands of years, AI only a few years. They will do great things soon; they already are.
Wow, it's like the iPhone's Live Photos, but revived... amazing!!
let me have this on my phone now haha. awesome paper
Incredible!
That is some real Blade Runner CSI $#!+. Everyone laughed about how impossible it is to just yell ENHANCE and zoom into the picture and around corners. Turns out we can, within reason.
So we could render Doom-style sprite characters by interpolating four nerfies, yeah? Compared to the original sprites for 8 directions.
this channel always surprises me
I can't tell if it has it or not - but the ability to be able to modify the animation space of multiple entities at once would be amazing.
More so, being able to animate them separately would be even more amazing. Animate being a keyword as well: being able to retain the person's animated face in real time (when it was recorded), or play it backwards, or jump around the timeline (something that looks natural instead of going instantly from closed to open, or from poured to unpoured).
You could have a lot of fun with reality, or just correct a "wedding photo". Detecting body locations and being able to lock them in place at a specific time while leaving the rest untouched would be amazing. Creepier still, you could animate them all smiling at the same time, even if they never did. The latter might require a lot more work though.
I can imagine this just taking a lot more work and computing power, give it about 6 months
I can't imagine what it will be in ten or twenty years...
00:43 Thank god that guy wore his facemask otherwise his phone could get a cold. What a good person.
Goddamn, that's literally the kind of photograph they show on the resumes of the Rocinante crew in the Expanse season 5. What the hell.