Stacksort – Searches StackOverflow for sorting functions and runs them (2013)

dahart · on Nov 5, 2018

Hilariously awesome.

I'm curious whether the multiple warnings about running untrusted code in the browser are necessary. I feel like all websites are already untrusted code, and the browser is quite well sandboxed and protected from anything too bad happening. What is the worst case scenario here for the user within the JS ecosystem, under known avenues of attack, not counting an unknown zero day browser exploit?

zeven7 · on Nov 5, 2018

Some worst case scenarios:

- Your tab freezes due to an infinite loop for a while until your browser notices and asks you if you want to kill the script.

- The script downloads illegal content, and your ISP notifies authorities.

- CSRF attack against a site you're signed into that's not properly secured.

All of these things are things any site you go to can do. There's nothing special about `eval` that makes this site more dangerous.

Uehreka · on Nov 5, 2018

Given that this site is not hosted on a domain where you have important session cookies (or where you’ve granted permissions like camera or location usage) I don’t see much problem with eval-ing untrusted code. Most of the common XSS threats just don’t apply here.

astrodust · on Nov 5, 2018

It's all fun and games until the next JavaScript sandbox busting zero day drops.

kpcyrd · on Nov 5, 2018

If I had that kind of exploit I surely wouldn't post it on stackoverflow.

jsmith99 · on Nov 5, 2018

Why not? They even have a tag for it

https://stackoverflow.com/questions/tagged/exploit

crystaln · on Nov 5, 2018

Maybe someone should create a site like this that runs all security exploits on SO in your browser.

duxup · on Nov 5, 2018

Reminds me of "security expert" warnings in the mass media where every time something happens they tell people effectively the same thing "don't do things unless you know / trust it".

Most people have no clue what they're running and can't possibly take the time to know enough to really have a clue.

ndnxhs · on Nov 5, 2018

Trust is fairly worthless as well. Malware devs buy trusted software to turn into malware, so something that was good and trusted for years is now bad with no warning.

jaco8 · on Nov 6, 2018

Exactly my thought too. Trust, however, is a weakness. This weakness is overcome by the average web surfer due to knowledge and time constraints or unintended clicks on links and ads. The future hopefully will have security-audited browsers which increasingly minimize dangers arising from these actions.

pc86 · on Nov 5, 2018

I can't speak to the worst case scenario as I'm not very knowledgeable about browser sandboxing or what JS can['t] do in a given context. But as a developer I do think there's a pretty significant difference between running code that someone wrote on their own website and loading arbitrary code from a third party then executing that without any human looking at the code.

dahart · on Nov 5, 2018

That’s more or less what all iframed ads are. Plus most popular web sites load libraries for services where the site owners can’t easily see what’s happening under the covers. Google analytics would be an example of this, but there are many, and lots of sites use multiple such services.

Uehreka · on Nov 5, 2018

Of note though, cross-domain iframes have intense restrictions on them to prevent them from XSS-ing the host page or its domain.

Of course, if you’re running 3rd party ad scripts on your actual page, you’re at their mercy.

pc86 · on Nov 5, 2018

I see what you're saying and I agree for the most part, but all that code is written by someone, reviewed (maybe), and put to some form of a production environment. The code in StackOverflow is usually just typed right into the answer box and submitted. I know I've edited answers where there were compilation errors in the accepted solutions! Not strictly relevant to JS of course.

seba_dos1 · on Nov 5, 2018

That's exactly why I would rather trust code from random SO answers than from the ad network.

wlesieutre · on Nov 5, 2018

That's how it should work with ad networks, but you really never know what code you're going to end up serving.

https://www.trendmicro.com/vinfo/vn/security/news/cybercrime...

template3 · on Nov 5, 2018

This is unintentionally fucking hilarious. You already concede that review is a “maybe”... so we have to assume it won’t happen. So basically your argument boils down to web code meeting the high standards of being “written by someone” and put in “some form” of a production environment. Any http server on a routable ip address qualifies as “some form” of production environment.

VBprogrammer · on Nov 5, 2018

I'd go further. I'd be much more trusting of running random JS found on Stack Overflow than the JS found on your average production environment.

Stack overflow is heavily moderated. I think it'd be a challenge to put something malicious on there which lived for more than an hour or so.

Compare that to your run of the mill PHP shared hosting. I know which side I'd rather take my chances with.

pc86 · on Nov 5, 2018

You don't think there's the slightest bit of difference between production code written at even very dysfunctional organizations with mediocre developers, and random answers on StackOverflow?

And you've only been here for a few weeks so I'll give you the benefit of the doubt, but calling something "unintentionally...hilarious" is usually taken as just another way of calling someone an idiot. I'd suggest you try to find ways to make your point without deriding those who don't share your view, especially on something so inconsequential.

nothrabannosir · on Nov 5, 2018

> You don't think there's the slightest bit of difference between production code written at even very dysfunctional organizations with mediocre developers, and random answers on StackOverflow?

I do; a random SO answer will be miles better than anything coming out of an average organisation.

To write an SO answer, you need to actually care about answering SO questions. To write code in an average company, you need to... have a pulse, it seems.

PhrosTT · on Nov 5, 2018

I'm sure at the very least some of the sorts are incorrect and will brick your browser tab from an infinite loop or something.

retbull · on Nov 5, 2018

That's crazy talk; everything on Stack Overflow is correct because it has the most upvotes.

nicoburns · on Nov 5, 2018

To be fair, I'd imagine the code in SO answers is less likely to be malicious than the code in adverts on most news websites...

CPLX · on Nov 5, 2018

But what if it’s the answer to the question how do you brick your browser tab from an infinite loop?

toomuchtodo · on Nov 5, 2018

You've discovered a new unit test.

jxcole · on Nov 5, 2018

The standard attack in this case is that someone could steal your cookies (if they are JS-accessible cookies) or execute instructions on gkoberger.github.io without your permission, since that is the context.

The way you would do this is create an answer to one of these questions on stack overflow that includes malicious javascript.

However, it's kind of unlikely there is much you can do with it. Sometimes a site will share cookies with subdomains, but this is not likely for github because you are allowed to publish arbitrary js there, so that would be a huge security hole.

Uehreka · on Nov 5, 2018

This is why GitHub uses a separate domain (github.io as opposed to github.com) for user sites.

floil · on Nov 5, 2018

Moreover, github.io is in the public suffix list, so it is effectively a TLD (foo.github.io is a different site from bar.github.io; they can't become same origin by means of document.domain).
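To make the "effectively a TLD" point concrete, here is a rough sketch of how a public-suffix lookup separates sites. The suffix set below is a tiny hard-coded excerpt chosen for illustration, and `registrableDomain` and `sameSite` are hypothetical helper names; the real Public Suffix List is much larger and has wildcard and exception rules that this ignores.

```javascript
// Hypothetical, hard-coded excerpt of public suffixes (assumption for this sketch).
const PUBLIC_SUFFIXES = new Set(["com", "io", "github.io"]);

// Returns the public suffix plus one label to its left, or null if the
// hostname is itself a public suffix or no known suffix matches.
function registrableDomain(hostname) {
  const host = hostname.toLowerCase();
  if (PUBLIC_SUFFIXES.has(host)) return null;
  const labels = host.split(".");
  // Longest candidate suffix first, so "github.io" wins over plain "io".
  for (let i = 1; i < labels.length; i++) {
    const candidate = labels.slice(i).join(".");
    if (PUBLIC_SUFFIXES.has(candidate)) {
      return labels.slice(i - 1).join(".");
    }
  }
  return null;
}

// Two hosts count as the same site when their registrable domains match.
function sameSite(a, b) {
  const da = registrableDomain(a);
  return da !== null && da === registrableDomain(b);
}

console.log(sameSite("foo.github.io", "bar.github.io")); // false
console.log(sameSite("gist.github.com", "github.com"));  // true
```

Because `github.io` is treated as a suffix, each user subdomain becomes its own registrable domain, which is why the two user sites above do not count as the same site.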

The risk of eval() is giving control of the site data of foo.github.io to the author of a stackoverflow comment.

The warning is part of the fun, though.

_puk · on Nov 5, 2018

GitHub hasn't always used a separate domain for user sites [0]. They do now to directly address the GP's concerns.

[0] http://homakov.blogspot.com/2013/03/hacking-github-with-webk...

Seirdy · on Nov 5, 2018

I run a Flatpak'd browser with limited privileges, so the worst that can happen is my `~/Downloads` folder gets messed up.

gkoberger · on Nov 5, 2018

Hey, creator here! I built this a few years ago on a whim, and am surprised how well it still works. Thanks for sharing again :)

(Psst, if you're an engineer and like dev tools, I'm hiring! https://readme.io/careers)

hyperman1 · on Nov 5, 2018

You might want to update the warning:

Is it safe?

Uh… it evals both user input and random code, unchecked, from an external site. This is what the security-minded folks writing anti virus software would refer to as: hey, our unpacker does that too! In kernel mode if we're from symantec! Must be perfectly safe!

Yeah, cheap shot, I know ;-)

https://googleprojectzero.blogspot.com/2015/06/analysis-and-...

munk-a · on Nov 5, 2018

Is it terrible that I am highly tempted to use this as an API, forcing the value in via headless chrome then printing the page to a PDF and using OCR & regular expressions to extract a sorted list? I'm pretty sure that's an O(1) (ish, not really) which I always heard was the best kind.

creatornator · on Nov 5, 2018

It's not O(1). You are only calling the API once, but if you put in a larger list the time will increase in accordance with whatever the complexity is of the sorting happening behind the scenes. Calling this O(1) would be like saying qsort is O(1) because you are only calling the function once.

your-nanny · on Nov 6, 2018

Dude, it's called a joke.

legohead · on Nov 5, 2018

do you hire remote?

_ZeD_ · on Nov 5, 2018

This reminds me of 4chan's sleepsort: for each number $n in the array, spin up a thread that sleeps $n and then appends $n to the result array.
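A minimal sketch of that idea in JavaScript, with timers standing in for threads. The name `sleepSort` and the `msPerUnit` scale factor are assumptions for illustration; it only makes sense for non-negative numbers, and timers that fire very close together are not guaranteed to run in value order.

```javascript
// Sleep sort sketch: one timer per element, each firing after a delay
// proportional to its value. Hypothetical helper, not taken from the thread.
function sleepSort(numbers, msPerUnit = 10) {
  return new Promise((resolve) => {
    const result = [];
    if (numbers.length === 0) {
      resolve(result);
      return;
    }
    for (const n of numbers) {
      setTimeout(() => {
        result.push(n); // smaller values wake up, and get appended, first
        if (result.length === numbers.length) resolve(result);
      }, n * msPerUnit);
    }
  });
}

sleepSort([3, 1, 2]).then((sorted) => console.log(sorted));
```

The total wall-clock time is driven by the largest value in the input rather than by the number of elements.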

tyingq · on Nov 5, 2018

Adapted for JS: https://jsfiddle.net/dheu9q7f/

jhardy54 · on Nov 5, 2018

Even worse: https://jsfiddle.net/8ta2pfny/

diabeetusman · on Nov 5, 2018

Would the algorithmic runtime of this be O(N)?

mortehu · on Nov 5, 2018

The O(N log N) best case only applies to comparison sorts. What's described here is not a comparison sort, and could easily be made O(N) with radix sort if the size of the numeric type is constant.
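For reference, a sketch of the radix sort alternative, assuming the inputs are non-negative integers below 2^32 (the fixed-size numeric type mentioned above). The function name `radixSortUint32` is made up for this example; values outside that range would be wrapped by `Uint32Array`.

```javascript
// LSD radix sort sketch for non-negative integers below 2^32.
// Four counting passes of one byte each, so the work is O(N) for a fixed key width.
function radixSortUint32(values) {
  let src = Uint32Array.from(values);
  let dst = new Uint32Array(src.length);
  for (let shift = 0; shift < 32; shift += 8) {
    const counts = new Array(257).fill(0);
    // Histogram of the current byte, stored one slot to the right so that
    // the prefix sum below leaves each bucket's start index in counts[byte].
    for (const v of src) counts[((v >>> shift) & 0xff) + 1]++;
    for (let b = 0; b < 256; b++) counts[b + 1] += counts[b];
    // Stable scatter into the destination buffer.
    for (const v of src) dst[counts[(v >>> shift) & 0xff]++] = v;
    [src, dst] = [dst, src]; // the freshly written buffer becomes the next input
  }
  return Array.from(src);
}

console.log(radixSortUint32([170, 45, 75, 90, 802, 24, 2, 66]));
```

No element is ever compared with another, which is why the O(N log N) comparison-sort bound does not apply.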

bananasbandanas · on Nov 5, 2018

No, the runtime complexity is just hidden in the scheduler of your OS
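One way to see this is to simulate the scheduler. The sketch below models pending timers as a binary min-heap keyed by wake-up time; this is an illustrative assumption, not a description of how any particular OS or event loop is implemented, and the names `TimerQueue` and `simulatedSleepSort` are invented for the example.

```javascript
// Simulated timer queue: a binary min-heap keyed by wake-up time.
class TimerQueue {
  constructor() {
    this.heap = [];
  }
  get size() {
    return this.heap.length;
  }
  // Insert a wake-up time and sift it up: O(log N).
  schedule(wakeAt) {
    const h = this.heap;
    h.push(wakeAt);
    let i = h.length - 1;
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (h[parent] <= h[i]) break;
      [h[parent], h[i]] = [h[i], h[parent]];
      i = parent;
    }
  }
  // Remove the earliest wake-up time and sift down: O(log N).
  popEarliest() {
    const h = this.heap;
    const top = h[0];
    const last = h.pop();
    if (h.length > 0) {
      h[0] = last;
      let i = 0;
      for (;;) {
        const l = 2 * i + 1;
        const r = l + 1;
        let m = i;
        if (l < h.length && h[l] < h[m]) m = l;
        if (r < h.length && h[r] < h[m]) m = r;
        if (m === i) break;
        [h[m], h[i]] = [h[i], h[m]];
        i = m;
      }
    }
    return top;
  }
}

function simulatedSleepSort(numbers) {
  const timers = new TimerQueue();
  for (const n of numbers) timers.schedule(n); // "sleep n"
  const result = [];
  while (timers.size > 0) result.push(timers.popEarliest()); // wake in time order
  return result;
}

console.log(simulatedSleepSort([5, 3, 8, 1, 3]));
```

With this model, sleep sort is heapsort in disguise: N insertions and N removals at O(log N) each, all performed by the "scheduler".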

johnfn · on Nov 5, 2018

I don’t think this fully resolves the problem. If you had n computers and had each of them sleep for one of the numbers, and then append to a shared list, you could sort in O(n) without a scheduler.

My favorite part about this algorithm is that you can speed it up by a factor of k - for any k! - by simply dividing the time you sleep by k.

saurabh20n · on Nov 5, 2018

Comparison-based sorts have O(1) complexity on the size of the elements being sorted. I.e., comparison between two numbers is a single instruction assuming numbers fit within a word. Even if you assume very very large numbers then comparison would be O(k), where k is number of bits needed to represent a number. So a more accurate runtime of O(nlogn) algorithms should be O(k.nlogn).

The runtime of this algorithm is O(n.2^k).

delinka · on Nov 5, 2018

Is it just me, or do none of them sort this array correctly?

["zebra","apple","banana","5","bapple","banana","banana"]

Correction, the page finally found this algorithm that sorts my array[1]. So I am disappointed with the verifier function on this web page and may need to submit a PR

1 - http://stackoverflow.com/questions/3730510/#3730579

kenjackson · on Nov 5, 2018

Maybe they just have a different definition of "correctly" sorted.

hirsin · on Nov 5, 2018

That conversation about for..in - any JS experts that can say whether modules or other new advances will help with this? Not being able to use the standard functions on primitives because random other code messes with them sounds like pogo sticking in a minefield.

tomjakubowski · on Nov 5, 2018

    for (let x of xs)

in more recent versions of JS is how you iterate an array without facing that prototype pollution problem.

See https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
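A small demonstration of the difference, assuming some other script has added an enumerable method to `Array.prototype` (the name `shuffle` is made up for the example):

```javascript
// Simulate a third-party script adding an enumerable property to Array.prototype.
Array.prototype.shuffle = function () {};

const xs = ["a", "b", "c"];

const seenByForIn = [];
for (const key in xs) seenByForIn.push(key); // keys, including inherited enumerable ones

const seenByForOf = [];
for (const x of xs) seenByForOf.push(x); // values produced by the array iterator only

delete Array.prototype.shuffle; // undo the pollution

console.log(seenByForIn); // [ '0', '1', '2', 'shuffle' ]
console.log(seenByForOf); // [ 'a', 'b', 'c' ]
```

Note that `for...of` only sidesteps stray enumerable properties; it does not help if other code replaces the built-in methods themselves.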

hirsin · on Nov 6, 2018

Good to know, thanks!

keerthiko · on Nov 5, 2018

the website doesn't verify the accuracy of sort output. it just gives you the first SO answer that a) has a code snippet b) which is error-free c) and has a function that can be extracted and run ...for your input format, and returns the output of passing your input code into it.

It's true to the xkcd mission, which is just to find SO answers until it returns something, not to actually confirm it was sorted correctly.

delinka · on Nov 5, 2018

The alt-text on the linked comic says "...until the list is sorted." Ergo, one must verify that it is sorted before knowing whether to execute the next code snippet.

CarVac · on Nov 5, 2018

It only checks that the output is sorted, not that it's a sorted version of the original string.

The first answer that apparently works has a comment stating that it fails with more than 2 duplicates; indeed I tried a list with three 3s and the resulting passing answer only had one of those threes.

amelius · on Nov 5, 2018

So a function that always returns an empty list passes the test?
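A sketch of the difference between the two checks, assuming the site's test is a plain sortedness check as described above (this is not its actual source; `isSorted` and `isSortedVersionOf` are hypothetical names):

```javascript
// Sortedness-only check: says nothing about where the elements came from.
function isSorted(xs) {
  for (let i = 1; i < xs.length; i++) {
    if (xs[i - 1] > xs[i]) return false;
  }
  return true;
}

// Stricter check: the output must also contain exactly the input's elements,
// duplicates included.
function isSortedVersionOf(input, output) {
  if (input.length !== output.length || !isSorted(output)) return false;
  const counts = new Map();
  for (const x of input) counts.set(x, (counts.get(x) || 0) + 1);
  for (const x of output) {
    const remaining = counts.get(x);
    if (!remaining) return false; // not in the input, or used too many times
    counts.set(x, remaining - 1);
  }
  return true;
}

const input = [3, 1, 3, 2, 3];
console.log(isSorted([]));                                // true
console.log(isSortedVersionOf(input, []));                // false
console.log(isSortedVersionOf(input, [1, 2, 3]));         // false
console.log(isSortedVersionOf(input, [1, 2, 3, 3, 3]));   // true
```

Under the weaker check, both the empty list and the de-duplicated output pass; the multiset comparison rejects them.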

gr3yh47 · on Nov 5, 2018

That's because that answer is for how to find unique values in an array. It works as intended with any duplicates; stacksort just chose an answer that wasn't only sorting.

colinprince · on Nov 5, 2018