The result is quite simply breathtaking. It looks like something shot for a movie using a stabilised dollycam; the fact that they were able to achieve the same thing using nothing but a GoPro, their software, and likely a week of post-processing on a high-end desktop PC is simply amazing.
I hope we see this technology actually become readily available. There might still be work to be done, but in general if they can reproduce the demo videos with other content then they're on to something people would want.
It appears that what they're doing here is simply extracting keyframes from the video, using them to compose a photosynth, then converting the autoplay of the synth to a video. If you load a photosynth and press "c", you can even see the same point clouds and scene reconstruction seen on the research page.
To me it seems like they are just taking frames subject to two constraints and an objective: on average one frame every 10 frames, a maximum gap of say 80 frames, and an aggregate frame-to-frame distance that is minimized. In other words, minimizing that metric subject to those two constraints. That's all. It's a nonlinear minimization problem.
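Roughly what I mean, as a hypothetical sketch: a plain dynamic program over a made-up frame_cost() transition metric (pixel difference, estimated camera motion, whatever). This is my guess at the naive approach, not the paper's method:

    import numpy as np

    def naive_select(num_frames, frame_cost, target_speedup=10, max_gap=80):
        """Pick ~num_frames/target_speedup frames, minimizing total transition cost,
        while never letting the gap between chosen frames exceed max_gap."""
        k = num_frames // target_speedup            # average-spacing constraint
        INF = float("inf")
        # dp[j][i] = best cost of a selection that uses j+1 frames and ends at frame i
        dp = np.full((k, num_frames), INF)
        back = np.zeros((k, num_frames), dtype=int)
        dp[0][0] = 0.0                              # always start at the first frame
        for j in range(1, k):
            for i in range(j, num_frames):
                for p in range(max(j - 1, i - max_gap), i):   # max-gap constraint
                    c = dp[j - 1][p] + frame_cost(p, i)
                    if c < dp[j][i]:
                        dp[j][i], back[j][i] = c, p
        # Trace the optimal selection back from the last frame.
        path, i = [num_frames - 1], num_frames - 1
        for j in range(k - 1, 0, -1):
            i = back[j][i]
            path.append(i)
        return path[::-1]

It's O(k * num_frames * max_gap), so slow for long videos, but it captures the "minimize the metric subject to those two constraints" idea.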
EDIT: After reading their description, I agree they are going the photosynth route. Why not, they have the technology that you worked on. And they say that the naive subsampling I described above doesn't work...
It's striking but it's far more believable when you realize that they need to play at a much faster speed than the source, so they have tons of extra samples from which to extract information. They basically use all that data in the extra frames (that would otherwise simply get tossed away in a regular time-lapse) to construct a 3D scene. This wouldn't look nearly as good if they had to play it at normal speed.
Camera movement between frames would be minimal, though, so there'd be a lot of overlap between frames and thus little extra information. I'd guess the key to improving this result would be multiple cameras at different angles; I imagine it only works as well as it does because the GoPro uses a fisheye lens.
Wow. Actually, if they add that technique to the mix it might solve the deformed "pop" effect you see in some videos, like the deformed building around 16 seconds into this video:
Cool, I'd not seen the updated work. I wonder how much can be done in realtime. I have no idea what the compute split is between the different processes.
With sensors (gyros etc.) the camera path would be trivial, instead of recovering it from the video. Rendering the results would be possible on a mobile GPU. That would leave just the conversion of frames to a point cloud as the main cost in compute and memory.
Maybe some scheme where you downsample the input frames to create the deformation mesh, then apply that to the full-size frame, would be the way to go.
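A hypothetical sketch of that idea with OpenCV, simplifying the "mesh" to a single homography (the function name, scale factor, and feature choices are all made up):

    import cv2
    import numpy as np

    def warp_from_downsampled(prev_full, cur_full, scale=0.25):
        # Estimate the warp on small frames (cheap), then lift it to full resolution.
        prev_small = cv2.cvtColor(cv2.resize(prev_full, None, fx=scale, fy=scale),
                                  cv2.COLOR_BGR2GRAY)
        cur_small = cv2.cvtColor(cv2.resize(cur_full, None, fx=scale, fy=scale),
                                 cv2.COLOR_BGR2GRAY)

        # Match ORB features between the downsampled frames.
        orb = cv2.ORB_create(1000)
        k1, d1 = orb.detectAndCompute(prev_small, None)
        k2, d2 = orb.detectAndCompute(cur_small, None)
        matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
        src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
        dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
        H_small, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)

        # Conjugate by the scaling matrix so the homography works on full-res pixels.
        S = np.diag([1 / scale, 1 / scale, 1.0])
        H_full = S @ H_small @ np.linalg.inv(S)
        h, w = cur_full.shape[:2]
        return cv2.warpPerspective(prev_full, H_full, (w, h))

The expensive feature work happens at quarter resolution; only the final warp touches full-size pixels.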
> With sensors (gyros etc) the camera path would be trivial, instead of recovering that from the video.
Well... not quite trivial. They're calibrated differently per model, and it's actually quite tricky to reconstruct the path from accelerometers and gyroscopes alone. There's also the likely issue of synchronising the data from these sensors with the video input. If you solve that second issue, however, it could in theory at least help with recovering the path from the video, for example by giving better predictions of where the point cloud has moved to.
It's certainly way better than the source video, but it's nothing close to what would come from a steadycam or dolly. You couldn't use this finished product in any kind of real production.
That depends heavily on your definition of "real production," and probably quickly devolves to no true Scotsman. I absolutely think this could be used in productions I would consider real, particularly documentary/travel/reality programs and sports.
Yep, it has a definite 'look' to it, and it appears to work better for some types of material than others (the bicycling footage was far more watchable to my eyes than the climbing stuff), but the effect is engaging and not unpleasant to view at all.
There was a mildly annoying effect somewhat reminiscent of the pop-in seen when terrain geometry goes from lower to higher detail in video games. It was particularly evident here:
Yeah, I noticed that too. I wonder how much of that is really an artifact of the lighting and (relatively) low resolution of the camera. Something shot with a better camera and lighting that reveals more terrain might give the algorithms something better to latch onto so the terrain models more cleanly.
Yeah, it looks amazing. If the video is shot at a faster speed (like 10x), then they can get a smooth real-time result when slowing it down in post-processing.
Because they drop frames, they aren't stabilizing; they're throwing away frames that move too much.
This is good stuff, I like it, but it isn't as wow as the structure from motion work.
And for the folks saying just up the framerate: that won't really help, because the head needs to be back in the same position as in a previous frame. It's a function of the amplitude and frequency of the motion you want to remove.
You think a week of post-processing? I didn't read the paper or anything, so I could be way off base, but I would assume the algo simply has to choose which frames to keep and which to toss. I doubt this would take an enormous amount of time, even with HD video. The algo is most likely just really clever in how it chooses a good frame vs a bad one.
On the other hand, if it is actually generating a lot of "best guess" images to bridge gaps that are too large (too many bad frames in a row), I could see that taking a bit longer, but not a week.
It does 3D scene and camera path reconstructions then re-renders the scene from different perspectives. It's not just "picking the best frames". The technical explanation video goes into the details: https://www.youtube.com/watch?v=sA4Za3Hv6ng
No, see table 1: "input duration (minutes and seconds), input frame count". It says the source file is 13 minutes 11 seconds and that it has 27,000 frames. In table 2 it says that source selection took 1 minute per frame. That's where I got 27,000 minutes from.
I now think we both got it wrong (but me more so than you): Table 2 specifies "1 min/frame", but the source frame selection happens for output frames, not input frames. Table 1 lists a total of 2189 output frames for the 23700 input frames of the "BIKE 3" sequence, so I guess we're looking at 2189 minutes (roughly 36 hours)?
Interesting that the final video ( mostly the rock climbing ) resembles a video game, where shapes and textures "pop-in" as they are rendered. The technical explanation video was really well done.
If the MSR researchers are here -- I'm curious what does it look like when bordering hyperlapse with regular input? i.e., if there were a video consisting of input frames at the beginning and the end, with a stretch of hyperlapse in the middle, what does the transition look like? Does it need much smoothing?
Also, you probably saw this over the past week: http://jtsingh.com/index.php?route=information/information&i... (disregarding the politics of that). Whatever he's doing (I assume a lot of manual work), it has a very similar effect, and it has these beautiful transitions between speeds.
This would be possible. Although it would require providing some UI so the user could specify which parts should be sped up.
I've seen the Pyongyang video; it is beautiful work. It requires very careful planning and shooting, and a lot of manual work, to create such nice results. We're trying to make this easier for casual users, but it's still FAR away from the quality of a professional hyperlapse.
Some of the videos demonstrate unusual "popping" effects and deformations when standing still - especially notable in this video, top right, sixteen seconds in:
I understand how the extreme situation of climbing is a challenge, but what is it about standing still that causes this? Do you have any thoughts on how you might tackle this problem in future work? (although it appears you already combine an amazing breadth of techniques, so I'm not sure how many options you haven't looked at)
It would be cool to mark segments for different speeds of hyperlapse, or normal input. Thinking about that I see what you are saying about a casual product. Anything beyond a uniform render becomes complicated. People would want to tweak the knobs and see the results, so possibly a faster "preview" algo that allows you to see the timings. Or a feature to quickly render just the (t-N, t+N) around the borders.
Hi spindritf -- I work for YouTube and have been looking into some mixed content issues with embeds. Mind if I email you and ask for some details about this scenario?
Barrym is probably right: HTTPS Everywhere forces SSL for the site, but not for the embedded videos. Feel free to shoot me an e-mail though, no need to ask.
Since the page is also available via HTTPS, using protocol-relative URLs for the embeds should fix the issue (src="//www.youtube.com/embed/SOpwHaQnRSY" instead of src="http://www.youtube.com/embed/SOpwHaQnRSY").
As mentioned in another comment, we recommend using scheme-relative embeds, e.g. "//www.youtube.com/embed/...". Regardless, these embeds should still work when embedded over HTTP.
I walked around Boston once with some friends for 7 hours. When I remember it I see it as the hyperlapse, not moment for moment or sped up. Super interesting work.
Okay please sign me up. I'm willing to pay hundreds of dollars for this software. I have hundreds of gigabytes of time lapse that I've taken that is just sitting there because of lack of ability to do something. I'd easily pay $200+ for this software right now just so I can have those videos and free up massive hard drive space.
I'm not sure that this software is what you want. It takes regular-speed video and converts it to high speed. If you start with regular time lapse you'll need to speed it up even more, maybe as much as 10x. Now you'll have super fast time lapse.
Actually, yes it's exactly what I want, I'm not sure why you would think I misunderstood what the software did. I have hundreds of gigabytes of 1 sec timelapse photos that could convert into pretty neat movies given what I saw.
I think what NeilSmithline is trying to say is: the Microsoft technology is designed to convert regular 24fps video into 2.4fps video (a 10x time lapse). If you feed it 1-sec timelapse photos (i.e., video at 1fps), the output would be 0.1fps.
I take 1 second time lapse photos and feed them at 60 frames per second for a 60x speed up. I'm not sure if you've made time lapse videos before, but it wouldn't make sense to film something at 1 fps and then play it back at 1 fps.
The frames per second I quote are all recorded frames per second (eg, for every second of real-life time that passes, how many frames in the final video were taken during that second?). Play-back rate is always 24fps (or some other fixed constant).
What we're trying to say is, if you feed your video to the program, you're going to get output that is sped-up 600x compared to real life. That's a ridiculously high speedup.
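For what it's worth, the arithmetic behind that number, taking the 60 fps playback you mentioned above and the 0.1 recorded fps you'd get after a 10x decimation:

    $\frac{60\ \text{fps playback}}{0.1\ \text{recorded fps (1 fps input decimated } 10\times)} = 600\times$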
The difference is that this requires you to record at full speed. With your timelapse you're probably taking a photo every 1 to 5 seconds or so. This system requires you to take 24fps video as your input.
I wonder how far you can get by using a "naive" timelapse of selecting frames from the video, but being smarter about which frames you choose. Rather than just choosing every nth frame, try to choose visually consistent frames by making the intervals between the frames loose, then apply conventional stabilization after the fact.
This was my initial thought about how they were doing this, but I don't think it's as applicable as it would seem. At 10x speed up, that's still ~3 frames from every second. I'd imagine a biker would spend at least a second turning to look down an intersecting road before continuing through. So that would be at least three frames where the perspective was heavily modified. It would have to select for right before and right after the head turn and ignore everything in between, which would probably create quite a jagged warp effect.
The processing is quite cheap for a company with its own datacenters and computation clusters. Not so cheap for an individual though. So a user could pay a dollar while Microsoft is only spending pennies.
That's obviously the most useful solution for us. I don't know if that's the best solution for Microsoft. I'm surprised there weren't a bunch of logos and catch phrases like "Only on Windows" or "Powered by Microsoft!"
They said they'll offer it as a Windows app, and I imagine that's for very corporate reasons.
Historically, Microsoft Research has been very disconnected from Microsoft corporate and doesn't position/frame its work in terms of a profit motive. Its work sometimes influences or filters its way into Microsoft products, but I've never seen them do anything like what you're suggesting.
From the paper listed on the page it looks like it takes about 305 hours to process a 10 minute video. The vast majority of that is during the "source selection" phase which takes 1 minute per frame of video.
It looks like after the frame-selection step, the rest of the process never refers to the discarded frames. Is that right? Do you think making the frames available for blending in the later steps would results in smoother blends?
Prediction: When Microsoft releases this as an app it will be heavily leveraged with Azure to create fast results. Especially if the bulk of the work is in the source selection which, I believe, can be easily done in parallel.
> "In this work, we were more interested in building a proof- of-concept system rather than optimizing for performance. As a result, our current implementation is slow. It is difficult to measure the exact timings, as we distributed parts of the SfM reconstruction and source selection to multiple machines on a cluster. Table 2 lists an informal summary of our computation times. We expect that substantial speedups are possible by replacing the incremental SfM reconstruction with a real-time SLAM system [Klein and Murray 2009], and finding a faster heuristic for preventing selecting of oc- cluded scene parts in the source selection. We leave these speed-ups to future work."
Ouch. I'll be going on a long bike ride in a month, and I was wondering if I could generate a hyperlapse of it... but it's going to be at least four hours, which doesn't bode well.
Back of the envelope, I get 7680 frames to process, one minute each, which is just under $1300 for a medium Windows Azure instance. Not cheap, but probably doable. I'd bet you could spend a few hours fiddling with large-memory instances versus medium and find a sweet spot.
Set aside $50-100 a month, it'll probably be a lot cheaper in a year. (assuming optimizations and cheaper cloud services)
Or you could just buy a computer and stick it in a corner running 24-7. After it has finished selecting/rendering, you have both a video and a computer you can use afterwards.
Sounds like something that would be good for a cloud service to provide. Upload your video and let their farm of tuned servers churn on it for a while.
Upload video, generate hyperlapse, generate a URL, and view the higher-bitrate video on iPhone, Android, or Windows. Considering GoPro/drone videos generate lots of interest, this would be a very useful service.
This is so insanely cool. I plan to get a GoPro some day soon and will take it on hikes in the Pacific NW. If I could turn my hikes into beautiful time-lapses like these, I'd be blown away.
I guess that it's not a proximity problem. For example, in the first video at 3:08 a gray mountain with snow appears in the upper right corner and replaces a piece of sky. I think a big rock was occluding the view of the mountain in this frame, so the algorithm had to choose a texture from another frame to fill the void, and it made a mistake.
If you watch the technical video, they say that they couldn't use the scene reconstruction for the climbing video as there were too many artefacts. This is why the rendering isn't as good as the others.
Mind-blowing results! Although the name Hyper-lapse doesn't really convey the goal; it should be named Smooth-Lapse, because that's what it's doing. Too much hyper-x already.
With the timelapse crowd, "hyperlapse" generally means a timelapse with a moving camera (e.g. reddit.com/r/timelapse). In that sense they are using the term correctly.
This is intriguing. What would you need to model as the continuous input to try to get from the timelapse? With the imagery, a model was made of the 3D scene, which was then used for the 2D final output.
I'm having difficulty imagining what this more in-depth model would represent, and how you'd strategically take the clips to "paint" this model.
I would use this for sure. I do timelapses of runs I do and set as challenges for our social running group. The source is a head-mounted GoPro.
The problem with them is that a straightforward pick-every-nth-frame approach gives a blurry, motion-sickness-inducing video. If you could extract the frame from the top of each stride, when the camera is most steady, I would imagine it would be much more watchable.
This is incredible. By watching the hyperlapse versions of the mountain climbing, you can clearly see which path is taken and get glimpses of whatever other paths are available. This would be a huge advantage for people learning how to rock climb. I can imagine a similar situation would occur for many other activities. Great work!
This is great. I have weeks of footage from a camera that I wear around and would love to use that video to make a hyperlapse. I would also be interested in seeing how well this does with photos taken every few seconds as opposed to video. Although, after reading the paper, it looks like there would be a lot of optimization that would need to happen to make it more efficient. (Their original implementation took a few hours on a cluster.) Luckily, as they stated in the technical video, they haven't tried to do anything more than a proof of concept; so there is plenty of room to optimize. I'd be interested to see how well a single-machine OpenCL or CUDA implementation does compared to the CPU clusters they were using in the paper.
Ah cool. Over the years Microsoft Research has also released demos of other cool stuff, like recreating a 3d view using multiple photos from different angles. How about those projects?
This is for panoramas, not 3d objects right? I meant merging photos of an object from different angles.
Edit: It's not even working. I photographed something from different angles, synthed it, and it appears on their website as a slideshow. Like a normal jQuery slideshow, except you need Silverlight®.
I'm not sure why the site isn't working, but the software is definitely for 3D objects, not just panoramas. Maybe your photos didn't have enough in common for it to work?
Very nice! Is there software available I can use, or do I have to implement the algorithms from the paper myself?
I make a lot of 4K hyperlapse movies. It is tedious, as After Effects' Warp Stabilizer is useful only in a small fraction of cases, Deshaker is more consistent but also not perfect, and the only option in the end is multi-pass manual tracking and stabilizing, which is very time-consuming and tricky for long panning shots.
As others have commented, the videos look great, and much closer to how people remember journeys. However, there appear to be some image persistence problems (many street poles simply dissolve as they get closer to the camera).
I'm curious to see what happens if they insert more action-packed footage. An MTB course with trees, switchbacks, and jumps would be an interesting stress test of this technique.
They do frame selection first, then create a 3D scene from the fewer frames. So that light pole might not have been in enough of the chosen frames to get a good 3D model of it. And the first stage has a pretty low-res 3D model because the number of points just gets crazy, so the light pole probably wouldn't have been modeled anyway.
You track points in the scene across frames. This helps infer their 3D positions, though getting good accuracy is hard. Basically, parallax gets created over time as the camera moves.
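Something like the following OpenCV sketch, which is the textbook two-view version of that idea (not MSR's actual pipeline; it assumes you already know the camera intrinsics K):

    import cv2
    import numpy as np

    def triangulate_pair(img1, img2, K):
        """Track features from img1 to img2, recover the camera motion, and
        triangulate a sparse 3D point cloud (in the first camera's frame)."""
        g1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
        g2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)

        # Track good features between the two frames (Lucas-Kanade optical flow).
        pts1 = cv2.goodFeaturesToTrack(g1, maxCorners=500, qualityLevel=0.01, minDistance=7)
        pts2, status, _ = cv2.calcOpticalFlowPyrLK(g1, g2, pts1, None)
        ok = status.ravel() == 1
        pts1, pts2 = pts1[ok], pts2[ok]

        # Recover the relative camera motion from the tracked points.
        E, _ = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC)
        _, R, t, _ = cv2.recoverPose(E, pts1, pts2, K)

        # Triangulate: more camera movement means more parallax and better depth.
        P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
        P2 = K @ np.hstack([R, t])
        pts4d = cv2.triangulatePoints(P1, P2, pts1.reshape(-1, 2).T, pts2.reshape(-1, 2).T)
        return (pts4d[:3] / pts4d[3]).T   # N x 3 array of scene points

A full system chains this over many frames and refines everything with bundle adjustment, but the parallax-over-time intuition is the same.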
I recall those childhood days when we could not explain why the moon appeared to come along as our car moved. :-) Those differences in apparent speed betray something about the distance of the object/point under consideration.
Close one eye. Move your head slightly sideways. The difference in speed of each element on the "scene" tells you how far away those objects are. Your mind makes note of all this and prepares a mental map of your surrounding area. That way you know, however roughly, where the exit door is.
This is what this system does: it uses the movement of the camera to infer a 3D map of the area around the path taken, as well as the original images that best fit the part of the 3D map being viewed.
I downloaded the .mp4 file and watched at half-speed. It looks great at half-speed too, and much more realistic. I wonder why they couldn't have slowed it down a notch. Perhaps, as a research result, they are just staying true to their algorithm's output frame rate.
Couldn't it be done with a wider angle lens, a better imaging sensor and conventional image stabilization techniques? If such captures become commonplace, it's easy to imagine capturing with a wider field of view so that the stabilizer would not be so overtaxed.
Szeliski also has written a comprehensive Computer Vision book for those looking to learn the field. It's a fairly good one (and available for free online - http://szeliski.org/Book/).
Awww. I always have mixed feelings about new Microsoft-funded tech: something really, really cool that I will never get to use/exploit/play with/anything.
The technical video breaks down some of the techniques they used. Global match graph is particularly interesting. This technique alone could lead to a big improvement in timelapses, by trying to select consistent changes between frames.
I'll be the immature one and say I'm curious how funny porn would turn out with this speeding it up by 10x. They don't show much of how it deals with people, and I imagine the results would be terribly funny looking, but perhaps awesome.
Also, I will pay $$$ for this to use with my motorcycle footage from GoPros.
My guess is it doesn't work well with footage with cuts in it, it's probably meant for continuous shots (there are a few long ones in Hugo and Gravity for example, so this might be interesting input material). From how the climbers appear, porn would probably decompose into static images.
Now we need a little facial recognition so you can scan to where you meet your tagged friends... of course, there are other surveillance opportunities too.
One of the by-products of this algorithm is a fully textured 3D model representing the filmed environment. Offering that as a pure data dump, or even a manual process allowing the user to control the camera, would be as valuable as a fully automatic one-off timelapse no one ever watches (except maybe your granny).
What sounds better - a video tour of a house, or a 3D model of a house you can traverse however you like?
I wonder if three-letter agencies have better structure-from-motion implementations, a la "Enemy of the State" (isn't it sad that this film turned out to be a documentary?). I suspect something like a 3D reconstruction of the Boston Marathon (the FBI did collect all video footage of the event) would have been very helpful to the investigation.
Generating a 3D model of an environment from the output of a moving camera has been done. There is obviously a lot of improvement to be done in that department, and those projects are neat, but I think it's appropriate for this project to focus on what it adds to the scene, which is camera path smoothing.
Video stabilization + more FPS / slower rate than the "every 10 frames timelapse" + feel good inspirational music = this
I would guess that I could upload a shaky video to YouTube to get it smoothed out, download it, speed it up to a rate similar to theirs, and get similar results. The timelapse they show that looks so much worse uses far fewer frames of the raw footage (every 10th frame?) and goes way faster than their "hyperlapse". It isn't a fair comparison.
Video stabilization algorithms could conceivably help create smoother hyper-lapse videos. Although there has been significant recent progress in video stabilization techniques (see Section 2), they do not perform well on casually captured hyper-lapse videos. The dramatically increased camera shake makes it difficult to track the motion between successive frames. Also, since all methods operate on a single-frame-in-single-frame-out basis, they would require dramatic amounts of cropping. Applying the video stabilization before decimating frames also does not work because the methods use relatively short time windows, so the amount of smoothing is insufficient to achieve smooth hyper-lapse results.
And later on (section 7.1):
As mentioned in our introduction, we also experimented with traditional video stabilization techniques, applying the stabilization both before and after the naive time-lapse frame decimation step. We tried several available algorithms, including the Warp Stabilizer in Adobe After Effects, Deshaker, and the Bundled Camera Paths method [Liu et al. 2013]. We found that they all produced very similar looking results and that neither variant (stabilizing before or after decimation) worked well, as demonstrated in our supplementary material. We also tried a more sophisticated temporal coarse-to-fine stabilization technique that stabilized the original video, then subsampled the frames in time by a small amount, and then repeated this process until the desired video length was reached. While this approach worked better than the previous two approaches (see the video), it still did not produce as smooth a path as the new technique developed in this paper, and significant distortion and wobble artifacts accumulated due to the repeated application of stabilization.
>I would guess that I could upload a shaky video to youtube to get it smoothed out, download it, and speed it up with similar to their rate and get similar results.
No you certainly wouldn't. Watch the technical video at the bottom of the page. It will explain why this is not trivial to do and why standard stabilisation technologies aren't useful to smooth out time lapses.
Well, I admit that I was pretty ignorant about the work being done on this project in regards to time-lapsed video. I guess I could add to my previous statement that they also cut out irrelevant frames (parts of the video that aren't in the camera path). I don't think this would be THAT difficult to do manually, but I admit that the technical video showing how they were able to graph/visualize the irrelevant frames is pretty cool, and the interesting resulting effects people are discussing in this thread (disappearing/appearing objects, the video game loading effects) are amusing.
I never said that it was trivial, just that similar stuff has already been done and made into a "standard stabilization technology", automatically and easily, just by uploading to YouTube. It seems that YouTube's techniques aren't necessarily completely different: there's a screenshot of a Google paper in this video [1] called "Auto-Directed Video Stabilization with Robust L1 Optimal Camera Paths". However, I do appreciate and shouldn't disrespect the specialized work being done for time-lapsed video. My apologies.