
Dropbox was hard-tech when it came out; they reverse engineered the Finder so that your Dropbox folder would appear like a normal Mac folder. They may seem like a commodity now because everybody's doing stuff like that (and in many cases, APIs have appeared that make it trivial), but when they first launched, they did stuff most people assumed was impossible. Twitch/Justin.tv was also hard-tech when it came out, though now the hard parts are built into every iPhone.

Other than that, I largely agree. The last software hard-tech company to strike it really big was Google. There've been a number of open-source projects doing what I'd consider hard-tech software, though - Bitcoin, git/Mercurial/Darcs, Bittorrent, TensorFlow, etc.

(My definition of hard tech, as applied to software, is "Software where you need to use scientific-method trial-and-error to build core pieces of the product." If you can read an online tutorial or reference manual and build the product, it's not hard tech. If you need to poke around at things, observe the responses, and build your own model of how things work, it is.)




"Dropbox was hard-tech when it came out; they reverse engineered the Finder so that your Dropbox folder would appear like a normal Mac folder."

This, IMHO, is a really bad example.

This is pretty much just a few days of sitting in GDB for the right engineer[1]. Now, maybe it requires people experienced with debugging tools, but it's really not "hard tech". Productionizing it so it works on all versions is a bit trickier, but again, none of this is at the level of, say, "understanding how to make custom bacteria that do a thing". If this is your example of "did stuff people assumed was impossible", then I strongly disagree.

""Software where you need to use scientific-method trial-and-error to build core pieces of the product.""

This, IMHO, is way too low a bar. By this definition, the Clang compiler we built for Windows is "hard-tech". While it requires time, energy, and trial and error, it is not hard, in the same way the Dropbox stuff is not hard.

It is known to be possible, and requires the reasonable application of good engineering skill. That engineering skill may often involve scientific-method trial and error, but you know you will eventually get there.

The same is true of Dropbox, and in particular, your Finder example. The only unknown is the timeline, and even that you can take a reasonable stab at if you have good enough engineers.

[1] I did it before they did, and I wasn't even the first. Plenty of people have made this happen :)


I like the xkcd definition "I'll need a research team and five years" with the implication made explicit -- "and we might not figure it out."


Yeah, this is pretty much what I'd say qualifies as hard software tech.


It's unlikely Google would have qualified under that definition, but I would certainly have put Google into the hard software tech category.


If we're talking about Google before Brin's PhD thesis, I think it would have qualified. It was not at all clear back then that using backlink data would yield more useful results than mere textual analysis of page content. One can definitely imagine a scenario where you try to build a search engine based on going down the rabbit hole of natural language processing as the key feature and then end up with something that doesn't work all that great.
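To make the backlink idea concrete, here's a minimal sketch (my own toy code, not anything from the actual thesis) of power-iteration PageRank over a tiny hand-made link graph. The 0.85 damping factor is the value quoted in the original paper; the graph and everything else are made up for illustration.

```python
DAMPING = 0.85  # damping factor from the original PageRank paper

def pagerank(links: dict[str, list[str]], iters: int = 50) -> dict[str, float]:
    """Power-iteration PageRank over a dict of page -> outgoing links."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        new = {p: (1 - DAMPING) / n for p in pages}
        for page, outs in links.items():
            if not outs:  # dangling page: spread its rank evenly
                for p in pages:
                    new[p] += DAMPING * rank[page] / n
            else:
                for out in outs:
                    new[out] += DAMPING * rank[page] / len(outs)
        rank = new
    return rank

# Toy graph: "c" is linked from both "b" and "d", so it ranks highest,
# regardless of what text appears on any of the pages.
graph = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))  # c
```

The point the sketch makes is the one above: the ranking falls out of the link structure alone, with no textual analysis of page content at all.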


Perhaps I'm just not as conversant in reverse-engineering as some of the people here, but my understanding is that if a key part of your product relies on patching somebody else's software for which you don't have the source code, this is also fraught with potential dead-ends and uncontrollable risks. What if they're using ASLR? What if they change the functions involved in the next version? What if the functions you're trying to patch have side effects that you can't afford to ignore?

That's why I prefer to put the dividing line at "must figure out things by poking at them rather than by reading documentation". The definition of "research" can be pretty vague - is a security team poking at a product conducting research? How about a UX team trying to figure out how their users behave? A search team doing language modeling? All of these would count in my head, and if a startup built their product around one of these results I would consider it "hard tech", but evidently not everyone agrees.


I think Research = Observe-Theorize-Experiment works. Reverse engineering is not research because someone already has the answer.


> Reverse engineering is not research because someone already has the answer.

Someone potentially had the answer at one point. That person/organization may be dead/defunct, or the knowledge may otherwise be lost to time.


Yes, but an answer exists.


...an answer always exists, even in pure scientific research. Nothing is inexplicable.


Why not? The research just happened in grad school [1] before the startup.

[1]: http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.109.4...


"Research team and 5 years" != "Two [1] grad students and 2.5 years".

Dropbox, BTW, was 4-5 engineers and 1.5 years.

[1] Actually more like 4-5, Larry & Sergey had help.


Google's other big innovation was building with commodity machines instead of high-end servers. That definitely puts them in the hard software tech category.


I'm personally sympathetic to "I'll need a research team and five years" missions. I think both university research and xkcd are great! But it's critically different from YC's definition that:

Hard tech = "There is doubt that the technology can be built at all."

Though many pieces of technology face huge doubts, what's key is that a team can often get to a working prototype or partial release in way less than 5 years! (E.g., Dropbox.) YC is exactly the type of environment to refocus a 'research-style' team exclusively on demonstrable progress.

The problem with the xkcd definition (if attempted in a startup) is very few research teams can continue to fundraise for 5 years without a product or significant prototype.

A partial solution to the "doubted" tech is often good enough to build a great company.


>This is pretty much just a few days of sitting in GDB for the right engineer[1]. Now, maybe it requires people experienced with debugging tools, but...

As someone sort-of familiar with gdb (but not extensively so) I have no idea how I'd do that. Can you point me in the right direction?


I don't use a Mac, but assuming that when Finder is viewing a folder and items are added to that folder by another process, Finder updates its view to reflect this:

Use dtrace and create a lot of such events. Finder is presumably using kqueue or some other event mechanism to be notified when a file arrives. Do this with many file types if they look different in Finder. Somewhere in there should also be a read that corresponds to the dirent. You can break on these things.

Attach the debugger and create the events. Step through the code to find where these things are read. Attempt to discern how what is read differs between file types. Do stuff like making files with conspicuous attributes (e.g., file size), because those are easier to correlate from traces. The data is probably a file containing file metadata somewhere.

This is probably mostly looking at the bytes coming off the read. dtrace makes this easy because you can trigger it to set a flag when the kqueue event fires and then just dump bytes and locations from file reads/opens. If it's more integrated into the OS, Finder would have to have its own special syscalls to read stuff off inodes or whatever. You'd be able to see those happening too.

Once you think you know how it works, give it a try. Rinse and repeat.
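The "conspicuous attributes" trick above can be illustrated in isolation. This is a toy simulation, not real Finder or dtrace output: plant a value you control, such as an unusual file size, then scan the captured byte dump for its encoding to locate the metadata record.

```python
import struct

# Toy illustration of correlating a planted, conspicuous file size with a
# captured byte dump. The "trace" here is fabricated; in practice it would
# be bytes dumped by dtrace from the reads you broke on.

MARKER_SIZE = 0x00C0FFEE  # a deliberately unusual file size, in bytes

def find_marker(dump: bytes, value: int) -> list[int]:
    """Return every offset where value appears as a 32-bit little-endian int."""
    needle = struct.pack("<I", value)
    offsets, start = [], 0
    while (i := dump.find(needle, start)) != -1:
        offsets.append(i)
        start = i + 1
    return offsets

# Fabricated trace: padding, the marker at offset 10, then more padding.
dump = b"\x00" * 10 + struct.pack("<I", MARKER_SIZE) + b"\xff" * 20
print(find_marker(dump, MARKER_SIZE))  # [10]
```

A real trace would have false positives and endianness/width guesses to rule out, which is exactly the "rinse and repeat" part.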

Now it may be you have surprises here and there and it's kind of annoying, but I'd be surprised if I couldn't do it.


My definition of hard tech, as applied to software, is "Software where you need to use scientific-method trial-and-error to build core pieces of the product."

I think that's a great definition actually and fits with how ML systems are built.


How about things like

* the JVM -- it wasn't clear you could build a VM fast enough to compete with C++;

* the C2 compiler inside the JVM -- also not clear you could build one fast enough, or one that optimized enough to matter;

* Azul Systems / Zing -- could you build a GC that tolerates terabit allocation rates w/o stop-the-world pauses?


Sorry, no, the JVM was not even innovative, let alone "hard tech". There were lots and lots of earlier examples of doing the same as (and better than) the JVM; see, for example, the Smalltalk and Lisp worlds.

IMO, the only thing on your list that comes close is the Azul GC which, in my limited understanding, actually advanced the state of the art.



