Hacker News

I'm not super familiar with your company. Does the approach of building the smallest nucleus of your tech, like Sam described, work for Pair? A glance at Crunchbase suggests you're seed stage, so maybe this is a real pain point right now.

> Very often, the first thing we do is help hard tech founders find a small project within their larger idea that fits the model of quick iteration and requires a relatively small amount of capital. This project is often the smallest subset of their technology that still matters to some user or customer. It may at first look like a detour, but it’s a starting point that lets founders build measurable momentum–for themselves, for recruiting employees, and for attracting investors.

Edit: The realtime 3D object placement in the demo video [1] is incredibly cool. I've done enough computer vision to realize you have some non-trivial technical work, but I'm not sure if I would label it hard tech personally. I think the gray area is all around the real-time constraint.

[1]: https://angel.co/pair-2




> Does the approach of building the smallest nucleus of your tech like Sam described work for Pair?

So far it is working, but as you say it is a real pain point.

The biggest problem we have is that venture people focus only on our first product, so they look at traditional metrics such as MAU, revenue, exits, etc. That's reasonable if they assume Pair the commerce platform is all we ever want to do, but nothing else we're doing seems real to them. The commerce platform alone is a big product, yet it's maybe 1/1000th of the whole stack we're building.

> but I'm not sure if I would label it hard tech personally. I think the gray area is all around the real-time constraint.

Without going into a long technical explanation, our mobile monocular SLAM implementation, semantic segmentation, structure from motion and relocalization (loop closure) systems are firmly in the Machine Learning/Hard Tech sphere. Our team is ML/CV people and we're focused on building fundamental infrastructure for future computing platforms.

None of that is obvious from our marketing materials, because it would just be confusing for those in our first market/vertical.

As I said before, this approach is doubly hard compared to building just a product, because you have to build the product on top of all the other stuff.

Trust me, if I had tens of millions of dollars, we probably wouldn't be working on this commerce platform (even though we've had good success with it and it's a great tool).


> Without going into a long technical explanation, our mobile monocular SLAM implementation, semantic segmentation, structure from motion and relocalization (loop closure) systems are firmly in the Machine Learning/Hard Tech sphere.

I am sure it was hard for you guys, but aren't these systems no longer technical unknowns? There are lone PhD students who managed to build them and open-sourced their work. [1]

This is no longer hard in the sense of "I'll need a research team and five years -- and we might not figure it out." It seems more like academic technology transfer, where you need people who can read the latest papers and implement them.

[1] http://vision.in.tum.de/research/vslam/lsdslam (The author is now at Oculus Research.)


Well, first off, no lone PhD student has built a SLAM system. LSD-SLAM itself took years and a whole team. The same goes for ORB-SLAM.

Notice also that those systems are built for robots and workstation-class machines, not mobile handsets. That sounds like a small difference if you aren't in the field, but it is critically harder, so the approach is actually different. It's not about just "reading the latest papers and implementing them," and it's somewhat offensive to assume that's the case. I'd challenge you to find a mobile monocular SLAM system that matches our capabilities, let alone one that's usable. The teams that built them have all been acquired (by Apple, Facebook, Intel). The reason you can't just copy-paste implementations is that they are non-deterministic in their optimization.

Second, SLAM isn't the only thing we do. In fact, it's not even the hardest thing we do. The majority of what we do isn't something I'll go into in depth, but it genuinely falls into the category of "I'll need a research team and five years -- and we might not figure it out" -- though now we're in year three and have made enough progress that it's starting to come out of the realm of "we might not figure it out."


> It's not about just "reading the latest papers and implement them." Somewhat offensive to assume that is the case.

I wouldn't take too much offense to it. From my (much smaller) experience in computer vision and pattern recognition, everything in CV sounds easy until you actually do it AND make it work in the real world. Real data, poor lighting, low contrast, realtime, etc. There are just so, so many factors that make this an extremely challenging field.

When I was in grad school (2012–2013), the textbooks we used said things like "generalized object recognition is an unsolvable problem". In the years since, a lot of those "unsolvable" problems have been solved. The field is just changing that rapidly.


If you've never tried to implement academic papers, I think you'll find reimplementations have a very high failure rate. Some of it is probably due to different datasets, some of it is due to papers not disclosing the entire algorithm or crucial implementation choices, and some of it is probably due to the bits in the code where "magic goes here" that never get published: crucial tweaks to optimization algorithms, smart choices of initialization for iterative algorithms, etc. E.g., just saying "Levenberg–Marquardt (L-M)" doesn't really tell a practitioner precisely what you did; it's more a family of techniques.
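To make the "family of techniques" point concrete: even a textbook method like Levenberg–Marquardt leaves the damping schedule, the Jacobian computation, and the step-acceptance rule up to the implementer, and papers rarely spell those out. Here's a minimal sketch of one such set of choices (the ×10/÷10 damping heuristic, forward-difference Jacobian, and toy exponential fit are all illustrative assumptions, not anything from this thread):

```python
import numpy as np

def levenberg_marquardt(residual, x0, lam0=1e-3, iters=50):
    """Minimal L-M loop. The damping update (multiply/divide by 10) is
    one common heuristic among many -- exactly the kind of unpublished
    implementation choice that makes reimplementation hard."""
    x = np.asarray(x0, dtype=float)
    lam = lam0
    for _ in range(iters):
        r = residual(x)
        # Forward-difference numerical Jacobian (another silent choice:
        # analytic vs. numerical, and the step size eps).
        eps = 1e-6
        J = np.empty((r.size, x.size))
        for j in range(x.size):
            dx = np.zeros_like(x)
            dx[j] = eps
            J[:, j] = (residual(x + dx) - r) / eps
        # Damped normal equations: (J^T J + lam*I) step = -J^T r
        A = J.T @ J + lam * np.eye(x.size)
        step = np.linalg.solve(A, -J.T @ r)
        if np.sum(residual(x + step) ** 2) < np.sum(r ** 2):
            x = x + step
            lam /= 10.0  # step accepted: lean toward Gauss-Newton
        else:
            lam *= 10.0  # step rejected: lean toward gradient descent
    return x

# Toy problem: fit y = a * exp(b * t) to synthetic noiseless data.
t = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(1.5 * t)
res = lambda p: p[0] * np.exp(p[1] * t) - y
p = levenberg_marquardt(res, [1.0, 1.0])
print(p)  # converges to roughly [2.0, 1.5]
```

Swap the damping schedule or the initialization and the convergence behavior changes, which is why two "L-M" implementations of the same paper can behave very differently.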


You're correct that it wasn't obvious (at least to me), and given your explanation it makes much more sense, especially from knowing more about the implementation.

The video reminded me of Ikea's virtual furniture app [1]. Tech-wise the comparison is superficial, since your tech is much harder and more sophisticated, but my first thought as a consumer was "sounds like an iteration of that."

I wish I had suggestions. Computer vision research on the edge is hard x starting a startup is hard. That's tough. Maybe someone else will have more insight.

[1]: https://www.wired.com/2013/08/a-new-ikea-app-lets-you-place-...



