Whatruns: Identify technologies used on any website (whatruns.com)
430 points by mcone on Aug 25, 2017 | 233 comments



Hi, Hacker News!

We are truly stunned to see us on top of HN today! :)

WhatRuns is a free browser extension that shows you what runs a website – from ad networks and developer tools to fonts and Wordpress plugins. You can also follow websites and get notified when they add or remove technologies.

We soft-launched a couple of weeks back and were lucky enough to be picked up by the Chrome team. We were featured on the Chrome Webstore, landing us 12k active users in one week. It was a huge validation and helped us tremendously in squashing bugs and making a finished product. We realise we have a long way to go, and our little team is working round the clock to make it happen. We also launched on ProductHunt today: https://www.producthunt.com/posts/whatruns

Would love to hear what you think :)

UPDATE:

Thank you for all the feedback!

Sorry about the occasional false detections. We are looking into this. It is largely because we detect a considerably larger number of technologies/plugins than our counterparts, which leaves lots of room for false pattern matches. Rest assured our team is working round the clock to improve accuracy and add more technologies/plugins.

Also, our servers are getting a bit cranky due to the huge traffic we are experiencing today. New websites (those that were not loaded on WhatRuns before) are now queued up and might experience a 2-3 second delay. This is to ensure the best experience for our active users.

Thank you so much for such a great response!


SUPER ANNOYING BUG.

It seems to only look at the second-level domain, and thinks that all websites under the same parent domain are the same.

They are not.


WR currently considers subdomains as a part of the main domain.

Most users like to know the full tech stack of a website. If there is a blog at blog.company.com and it is using Intercom, that can be useful data. I hope this makes sense.

Anyway, we will definitely address this concern and think about adding an option for subdomain separation.


So as an example of where this is a problem - All NZ government websites are hosted underneath the govt.nz namespace, e.g. dia.govt.nz, treasury.govt.nz, ssc.govt.nz and so on.

But all of them are evaluated as one, govt.nz, even though they are all hosted and operated quite separately and use differing technologies.

Looks like this works fine for other similar situations e.g. .co.uk.

This is a really cool service, thanks for making it available for use!


govt.nz is listed in https://publicsuffix.org just like co.uk , so ideally software should treat them the same.
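A minimal sketch of what treating them the same could look like: find the longest matching public suffix, then keep one more label to its left. The suffix set below is a tiny hand-picked subset for illustration, and the function name is hypothetical. The real list also has wildcard and exception rules, which this ignores.

```python
# Hypothetical, hard-coded subset of the Public Suffix List; the real list is much larger.
PUBLIC_SUFFIXES = {"com", "nz", "uk", "govt.nz", "co.uk"}


def registrable_domain(hostname: str) -> str:
    """Return the longest matching public suffix plus one label to its left."""
    labels = hostname.lower().rstrip(".").split(".")
    # Candidates run from longest to shortest, so the first hit is the longest suffix.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            # i == 0 means the hostname is itself a public suffix.
            return ".".join(labels[max(i - 1, 0):])
    return ".".join(labels)  # no known suffix: leave the name unchanged


for host in ("dia.govt.nz", "treasury.govt.nz", "blog.example.co.uk"):
    print(host, "->", registrable_domain(host))
```

With govt.nz in the set, dia.govt.nz and treasury.govt.nz come out as separate sites, while blog.example.co.uk still rolls up to example.co.uk.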


We frequently update our list of supported TLDs to avoid this problem. It's important considering the plethora of fancy new TLDs that are popping up all the time.

Thanks, and glad you liked the product!



Wow, very helpful. Added to Trello. Thanks!


Update: WR just started supporting subdomains (version 1.5.4). It was the most requested feature; hope you like it!


How does this differ from BuiltWith?


Copying my previous reply: Comparing with Wappalyzer and BuiltWith, here's why we think we have something different (and maybe better):

1. WhatRuns detects fonts, Wordpress plugins and themes (tens of thousands of them).

2. Ability to follow sites (and know what techs websites started using/ditched).

3. Very lightweight compared to our counterparts, and arguably better UI ;)

4. More accurate data. BuiltWith can be very inaccurate as you might've already noticed. Wappalyzer is fairly accurate, but limited in technologies. WhatRuns is trying to be the best of both worlds.


Others to consider are SiteStacks (http://sitestacks.com) and Wappalyzer (http://wappalyzer.com)


Not sure if it's related to load, but whenever I try it on a long-tail site (i.e. not one you would have precached) it comes back with information about google.com instead of the domain I'm on.


Surprised to see google.com runs jquery - can't possibly be true.

https://www.whatruns.com/website/google.com


We just tried to replicate this issue but failed. It's working fine on our end. Can you share the URL in question so that we can take a look? Thanks!


Not the GP, but I have seen this - http://www.thelittlebeettable.com/


Very strange. It is working fine on our side: https://cl.ly/1x150t1m0922 I'll check this with our developer and get on it right away. Thank you so much for taking time to report this!


Plugin just spins forever with no results on one of my client's old websites:

https://www.ideasenfoto.com/

I tried a great many sites using similar tech, and an infinite spin is all I get in all cases.


Strange, we are getting instant results: https://cl.ly/262c2l300g3g

Please drop me a line if you're still experiencing this issue: jijo [at] whatruns.com. Thanks!


Huh, seems to be working now.

Looks like it's not picking up Django! ;)


Just check https://www.whatruns.com/website/vuejs.org and you're showing the site as using EmberJS. Looks like you're not launch ready.


Sorry you feel that way. We truly understand your frustration with detection accuracy, but when there are tens of thousands of technologies to detect, the only solution is to break things and move fast.

We were featured on the Chrome Webstore a few weeks back and got a great response (12k+ active users), which helped us enormously in improving the accuracy and efficiency, and I'm sure the HN and PH launches will be even more helpful in improving the product.


Sorry, I don't mean to be rude, and I understand there are tens of thousands of technologies to detect. I was kind of expecting your engine to detect front-end JS frameworks more easily than backends, at least the popular ones. In case you didn't notice, the site I'm pointing to is the home page of VueJS, and you're showing it as using a competitor's tech.


We're on it. Thank you so much for bringing this to our attention.


Well, at least you've got the business-speak down and can fix your bugs quickly. That'll get you far, I think.


Haha, I'll take that as a compliment.


You will probably want to add GNU Social. I pointed it at my site https://kwat.chat and it came up with nothing.

Also, Wordpress doesn't seem to be detected, either, on my other website: https://johnrockefeller.net


> Looks like you're not launch ready.

It's 2017 man, things are launched with kinks all the time now. Your definition of launch ready doesn't have to apply to everyone else.


It's kinda ridiculous that we've just accepted things are going to be broken at launch.


Things are always broken at launch; you can’t prevent all the bugs and unexpected things always happen.

IMHO it’s a very good thing we’ve "accepted things are going to be broken at launch"; the "ship early, release often" model works a lot better than "wait 10 years before release so that everything is perfect but you’re 9 years late".


Around ~2000 it became common knowledge to download the latest driver from the net instead of using the supplied one on the CD for hardware you just bought.

Isn't that the assumption that products ship broken, 17 years avant la lettre?


I totally understand your point, but WhatRuns is not a broken extension. Google blessed us with more than 10k active users by featuring us on the Chrome Webstore front page :) and we have been improving the technology ever since.

However, there is a lot of manual labour involved in correcting detection inaccuracies, which our team is working full-time on. Rest assured WR will only improve from here on. Thank you for dropping by!


> WhatRuns is not a broken extension

Try running it on www.example.com and let me know how many of those are accurate.


https://github.com/discourse/discourse

  > Built With:
  > - Ruby on Rails
  > - Ember.js
  > ...
and it's used on the subdomain https://forum.vuejs.org/.

From another comment I understand that it shows everything detected across all subdomains, so that explains this case.


They discovered Discourse, which uses Ember.js.


I would find use for products like this but I'm emphatically not enabling a chrome extension unless I've built up significant trust with the company providing it.


You can get a web version by editing urls: https://www.whatruns.com/website/google.com


Nice find! :) However, WR only publicizes websites that were loaded through the extension at least once. If you have a new website and it hasn't been opened with WhatRuns before, this lookup won't do the trick.


In that case you could just boot up a VM and run Chrome with WR from there.


"Just?" That's a fair sum of work. Do you have purpose built VMs just lying around?


I do, actually. If you're privacy-conscious but still need to use certain high-risk programs every once in a while (such as closed-source Chrome extensions), keeping a VM instance around for that purpose isn't the most complicated thing you could do.


Right, but I guess my question is: how would this be an intersection of usability and utility? Might be if Qubes ran on any computer I own.

It doesn't, so this is more of a parlor trick than a practical way to police extensions.


This should be the top comment.


I keep a separate chrome user profile for testing extensions. You are right that chrome extensions can read all browser data.

Here is how to set up a separate user profile

https://support.google.com/chrome/answer/2364824


How about 15k happy users, featured by Google Chrome, transparent extension and top of HN and PH? ;)

On a serious note, I understand your point and realise how new extensions can be dangerous. However, we have a very good team that is trying to solve all the concerns we had with our counterparts.

I hope you'll give us the benefit of the doubt! :)


But why do you need to run on my PC?

Just add a "insert url here" box and do it on your server.

I implicitly distrust anyone who insists on running code on my PC that they could run elsewhere.

Especially.. code they can remotely update


We started with extension as developers/designers found it especially handy for a quick look-up while working on their projects. Not to worry though - we're working on something for the web as well!

Also, our counterparts got a majority of their traction from browser extensions which made it our obvious priority (even though it wasn't the easiest of options).


15k is nothing and you can buy 'featured' status. Let me know when it gets closer to 500,000.


Chrome team handpick their featured products; I don't think we can buy the status.

We know 15k is not a lot (that comment was meant in jest), but it looks like a good start :)


The permission required for this means the extension can read and change all websites.

I don't know much about extension development. Wouldn't a finer permission, like reading the browser URL, be sufficient to achieve the functionality? Or better, a button in the extension options to read only the current URL?

How do you assure users that you are not selling their data?

Having said the above: compared to the Wappalyzer extension (which I had), this gives so much more information!! Really cool.


@samblr First off, thank you so much for your kind words. Glad you liked the info we're providing.

To address your concern about privacy, WhatRuns does not collect or log any visitor information – including IP address, location etc. We receive anonymous website data and match it with our database to display the results. Hope this clarifies.


Yes, but do you enable two-factor auth on your developer account(s)?

https://chrispederick.com/blog/web-developer-for-chrome-comp...


Thanks for introducing PH! That site is so great!


No problem. It sure is! :)


I think your level of skepticism is good but most people will just hit that install button and not think twice.


Looks to be very similar to https://builtwith.com/


True. Comparing with Wappalyzer and BuiltWith, here's why we think we have something different (and maybe better):

1. WhatRuns detects fonts, Wordpress plugins and themes (tens of thousands of them).

2. Ability to follow sites (and know what techs websites started using/ditched).

3. Very lightweight compared to our counterparts, and arguably better UI ;)

4. More accurate data. BuiltWith can be very inaccurate as you might've already noticed. Wappalyzer is fairly accurate, but limited in technologies. WhatRuns is trying to be the best of both worlds.


At least they do it properly; as a site.

Your extension only works on Chrome, and it is for a feature that is not used commonly. There is no good reason to install it on a web browser.

Installing it in a browser is also a threat that the extension might do more than just scanning sites, and even if it doesn't affect privacy it still encourages installing extra junk on web browsers.


Theirs also works as a site. For example https://www.whatruns.com/website/minecraft.net. But as far as I know you would have to type that URL manually, since there is no web interface.


Which makes things look even worse. If all the extension does is just redirect to that page, why is it even needed?


Extensions show you all the technologies used on a website. It does not redirect you to this page. However, if you click on a particular technology from the extension, you'll be taken to the respective tech's page which has a small description and list of websites using it. We hope this is useful.


We started with extension as developers/designers found it especially handy for a quick look-up while working on their projects. Not to worry though - we're working on something for the web as well!

BTW WhatRuns works on all major browsers.


The extension does not seem to work on Firefox for Android, even though it's installable.


I only see Chrome buttons on the site.


That is if you access from Chrome. Please head over to our 'Download' page for Firefox: https://www.whatruns.com/downloads/ We haven't publicised other extensions due to lack of demand.


I'm on firefox and just checked it sends proper user agent (for firefox 55)


I don't see how it'll work on any browser on iOS.


Yeah, how dare you make something free and give great and useful information in return for a bit of client gathered data.


Haha, we're truly sorry!


I'd put an asterisk next to #4 for now. On a couple ruby/rails apps we've built you listed the backend tech as cowboy/erlang. I saw your comment above about how hard it is to be accurate with thousands of frameworks but rails? We're using jwplayer, segment, and facebook (all of which you correctly detected, woohoo!) so maybe that is confusing things?

[edit] to be fair other options I tried don't detect the backend at all. This is a single page app with rails api so I get that might be harder than a rails app with server rendering and full page reloads.


If there are no headers or obvious tells at a framework level, it can be hard to detect server-side code. Maybe Ruby-specific serialization in session cookies, or the name of session cookies, use of HTML templates or code gen or URL patterns... but there can be tons of false positives. Client-side is much easier and a whole different story. Same with pre-built client code like CSS in WordPress templates, or standard admin login pages.


How is WhatRuns more accurate? Are you doing something different to get your information?


We are using a deep learning algorithm to improve the detection. We also have a built-in module that automatically detects new web patterns – which we then manually curate to ensure accuracy.


> We are using a deep learning algorithm to improve the detection.

No, you are not.


@korzun

I'll explain.

We use several signals like code snippets, filename, directory name, header info and several others to accurately identify technologies. However, there are many ways this can go wrong even with a few signals correct. Every time we detect a technology, we calculate a probability of its accuracy and filter out the rest. This system self-learns and improves the identification over time. Hope this helps.
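To make the idea of scoring signals and filtering concrete, here is a minimal sketch. It is not WhatRuns' actual method: the signal names, the reliability weights, the noisy-OR combination and the 0.75 threshold are all assumptions chosen for illustration, and nothing here "self-learns".

```python
from math import prod

# Hypothetical reliabilities: chance that a matched signal really indicates the technology.
SIGNAL_WEIGHTS = {"code_snippet": 0.6, "filename": 0.5, "directory": 0.4, "header": 0.7}


def confidence(matched_signals):
    """Combine matched signals with a noisy-OR: 1 - product of (1 - weight)."""
    return 1 - prod(1 - SIGNAL_WEIGHTS[s] for s in matched_signals)


def filter_detections(detections, threshold=0.75):
    """Keep only technologies whose combined confidence reaches the threshold."""
    scored = {tech: confidence(signals) for tech, signals in detections.items()}
    return {tech: score for tech, score in scored.items() if score >= threshold}


candidates = {
    "WordPress": ["code_snippet", "header"],  # two independent signals agree
    "Ember.js": ["directory"],                # a single weak signal
}
print(filter_detections(candidates))
```

Under these made-up weights, a technology backed by two agreeing signals passes the filter, while one backed by a single weak signal is dropped.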


I like your website but what you are describing is not a deep learning algorithm. I'm not a big fan of people who do something well and then start going overboard with buzz words.

You most likely have a system that consumes content and compares that content against known hashed variants. If there is no match, you diff against known variants and check if the output matches any of the 'minimal' implementations.

If you can't match anything, you simply stage that content for a manual review.


Thanks for telling us what we do at Whatruns. I’ll let the team know ;)

On a serious note, I'm with you on how new start-ups go overboard with buzz words. As for WhatRuns, we intentionally avoided jargon when advertising our product on the website, Product Hunt or Hacker News, so that it does not lose its charm.

For a new startup to achieve this scale in technology identification and accuracy compared to established players with more than a decade of development (and data), it is self-evident that manual labour would not yield such a result. In fact, technology breakthroughs and an excellent technical team were the reason why we decided to give this a shot in the first place.

We plan to publish a comparative study on our experience with the effectiveness and superior prediction quality of deep learning vs normal pattern identification on our blog (which we will soon move to Medium). Stay tuned! :)


What is self-learning? What kind of algorithms if I may ask?


The one advantage Wappalyzer has is that it changes its icon to show the primary technology the site is using (i.e. JS framework) so I can glance over and see "Oh, this is built with React".

If Whatruns had that feature, I'd seriously consider switching. But otherwise, there's just no way I could. It's way too convenient.






My understanding is that StackShare doesn't detect the technologies that are used but simply displays information that has been manually recorded.

Edit: Looks like I'm probably wrong since I see they have a tool named "Stack Scanner"


Noob question: Looking at your competitors' traffic with SimilarWeb, they all have OK-to-low traffic, and none of them are really growing. So it might be a hard business to grow since a lot is SEO driven/organic.

However, BuiltWith is selling some plans which also include SEO-related features like keyword reports. I understand that some might pay for the latter, but there is even more competition in that space.

What I don't get: Who should pay for your stuff? It's of course interesting to see other stacks but honestly it's not a crucial thing. My CTOs and I know what they are doing and of course we like to get inspired but yeah, at the end of the day tons of research, years of experience, debate and the individual use case decide our stack and not what some random website does. Same for design-related stuff; btw, finding a font-face is just a Command-Option-I away.

So no offense, but I am just wondering why you start a business which is already there, which is hard to scale and which is hard to get paid for.

Guess I missed something and happy to hear your view.


Our business model will be similar to BuiltWith's, i.e. selling lists of websites using a particular technology. E.g., a list of websites using Drift chat (https://www.whatruns.com/technology/drift) will be super-useful competitive intelligence for other live-chat startups.

Also, we are planning to introduce a predictive sales system which will suggest clients based on their technology adoption. E.g., if a company migrates to Magento, they are a potential client for Magento extension developers.


Ok thanks, this makes sense: a leadgen tool for B2B sales.


Or for hackers looking to exploit technologies with known vulnerabilities.


This was the first use case that popped into my head


People are paying for services like this because it's a valuable tool for lead generation.

If you are selling a Wordpress plugin, you don't want to contact website owners that run on e.g Joomla.


Neat, but still some kinks. I'm not seeing Angular for https://fonts.google.com/, but can quickly find the tell-tale ng- attributes in the HTML.

BuiltWith has been around for a while and has its own Chrome extension [0]. It correctly identified fonts.google.com as using Angular.

[0] https://chrome.google.com/webstore/detail/builtwith-technolo...


Has anyone used both this and Wappalyzer [1]?

The latter is what I've been using and seems to have more users with higher ratings

[1] - https://chrome.google.com/webstore/detail/wappalyzer/gppongm...


Comparing with Wappalyzer and BuiltWith, here's why we think we have something different (and maybe better):

1. WhatRuns detects fonts, Wordpress plugins and themes (tens of thousands of them).

2. Ability to follow sites (and know what techs websites started using/ditched).

3. Very lightweight compared to our counterparts, and arguably better UI ;)

4. More accurate data. BuiltWith can be very inaccurate as you might've already noticed. Wappalyzer is fairly accurate, but limited in technologies. WhatRuns is trying to be the best of both worlds.


Noted! :) We'll look into this right away.


Ehm sorry.. but I refuse to clutter my browser with silly extensions which could and really should entirely live on a website as a service.


Hi Dustin, we started with extension as developers/designers found it especially handy for a quick look-up while working on their projects. Not to worry though - we're working on something for the web as well!


HN is remarkably fickle; a browser extension is a perfectly reasonable user-friendly mechanism for the service given the choices out there. There are privacy concerns given the coarse level of granularity that Chrome provides, but until Google changes that ("Read and change all data on websites you visit" shouldn't be the same thing as "give the current URL to the browser extension when I click its button"), that's just what we're stuck with for user friendliness.


I mainly use Wappalyzer, after finding it to be more reliable than BuiltWith, but I've given this a quick go on some of the sites I work on, and I have the following feedback:

1. All in all, this looks really tidy, so nice work!

2. Sadly, it looks a bit limited on detecting anything .NET/Windows. I pointed it at a few Umbraco sites running on Azure, and none of it was picked up.

3. It doesn't look like it works for subdomains.

4. Wappalyzer does a good job of detecting Angular 2, whereas this seems to struggle.

These issues aside, I'll probably keep it running at work, and if these things can be resolved I can see this being my preferred choice.


Awesome. Thank you so much for your kind words.

Addressing your concerns,

1. Thank you ;)

2. Devs are looking into this. Neglecting .Net/Windows wasn't intentional. We will work on this.

3. Yes, WR currently considers subdomains as a part of the main domain. Most users like to know the full tech stack of a website. If there is a blog at blog.company.com and it is using Intercom, that can be useful data. I hope this makes sense.

Anyway, we will definitely address this concern and think about adding an option for subdomain separation.

4. Noted!


I had seen Wappalyzer recommended in the context of webapp pen testing and have used it since. I'd recommend it as well.


FWIW, none of the static media (images, css, js) seems to be loading for me - I'm just getting a bunch of 404s. This is happening in all browsers on my system, including when plugins are disabled.

Might be network weirdness on my end, I dunno. (Or a HN "hug of death"?) Anyway, wanted to let you know.

Congrats on the project :)


Thank you so much! Yes, HN 'hug of death' seems to be the culprit here :) We're experiencing occasional downtimes. We're on it.

New websites (those that were not loaded on WhatRuns before) are now queued up and might experience a 2-3 second delay.


Btw, how much traffic does HN's "hug of death" mean approximately?


Not sure. I'll make sure to post an update here afterwards :)


Glad to help :)


Same here


Congrats, WhatRuns looks very accurate to my tests so far and indeed better in UI terms.

I only have one extra UI recommendation that I think Wappalyzer got right, which you could enable as an option.

When a popular CMS/language/server OS is detected, Wappalyzer will use its icon in place of Wappalyzer's plugin icon. E.g. if Joomla is detected, Wappalyzer's icon on the plugins' toolbar will switch to Joomla's logo.

There's a specific order to this preference that looks to go from the CMS used (e.g. Joomla, WordPress etc.) down to the framework (e.g. Laravel), programming language (e.g. PHP), webserver (e.g. Nginx) and finally the server OS. In other words, if Joomla is detected, it will be displayed first, not PHP.

The above is extremely helpful for anyone developing for the CMS communities (like myself).
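The preference order described above can be sketched as a simple priority lookup. This is only an illustration of the observed behaviour, not Wappalyzer's actual code; the category keys and function name are hypothetical.

```python
# Hypothetical category keys, ordered from most to least preferred for the toolbar icon.
CATEGORY_PRIORITY = ["cms", "framework", "language", "webserver", "os"]


def primary_technology(detected):
    """Pick the technology from the highest-priority category that was detected.

    `detected` maps a category key to the technology name found for it.
    """
    for category in CATEGORY_PRIORITY:
        if category in detected:
            return detected[category]
    return None  # nothing detected: keep the extension's own icon


site = {"language": "PHP", "webserver": "Nginx", "cms": "Joomla"}
print(primary_technology(site))
```

For the example site, the CMS wins over the language and webserver, so the icon would be Joomla's rather than PHP's.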

Of course, to maintain your identity as a plugin, you could use a double logo (a mashup of your own and the dominant/higher-level technology detected).

* UPDATE: You should also consider providing a way for anyone to easily suggest new frameworks, apps, CMS extensions/plugins etc. to be detected, by providing a name, icon, description and the way to be detected (e.g. HTTP header, pattern in the HTML output or even HTML comment, linked source etc.).


Thanks @fevangelou! Appreciate you taking time to drop your suggestions.

Dynamic icon - I agree with you that it can be quite convenient to display the top technology (preferably the CMS). We will think about this as an option in the future updates!

Technology submission - That's a great idea. We are adding this to our roadmap. Thank you so much.


As a counterpoint, I really don't like that Wappalyzer does this. It always takes me a couple of seconds to find the correct extension to click because its icon always changes.

If you keep your icon the same I will switch from Wappalyzer.


We got the same response from a ProductHunt user as well. Like we mentioned above, it will only be introduced as an 'option', which means you can happily switch :)


Nice. I'll check it out today.

Update: You've already won me over with the better layout (compared to Wappalyzer), separating them into technologies. The fonts section is a nice addition too.


Awesome! Glad you like it.


I still like it, but I'm starting to see a lot of things that are obviously incorrect. For example, facebook.com uses Google Analytics? Hacker News uses React? Most sites I check out have obviously incorrect technologies, so I don't know which ones I can trust. Unfortunately that makes this tool pretty much useless.

Once you've fixed this I'll be sure to use it regularly.


I'm leery of Chrome Extensions. They are basically just a plot to collect your usage data and sell it to marketing companies. I have disabled almost all Chrome extensions and locked down my browser. I got tired of the super targeted, annoying advertisements that were being thrown at me.

Check out the privacy policy before installing any Chrome Extension.

https://www.whatruns.com/privacy


To address your concern about privacy, WhatRuns does not collect or log any visitor information including IP address, location etc. We receive anonymous website data and match it with our database to display the results. Hope this clarifies.


I would greatly appreciate being able to test your technology against a given site on your website before installing your extension.


I agree.. tbh, I'm not sure why I'd want this as an extension. Seems like I'd use it too sporadically to justify keeping it in chrome.


We totally understand your point. We started with extension as developers/designers found it especially handy for a quick look-up while working on their projects. Not to worry though - we're working on something for the web as well!


as mentioned in the earlier comments, you can test on their url https://www.whatruns.com/website/<domain name>


I wondered how long it'd take for the BuiltWith competitors to appear after the article a few months ago (https://news.ycombinator.com/item?id=10316060).

The golden rule of business: If you're onto a sweet money-maker, don't shout about it.

I'm currently working on a competitor to a site I read about that bragged about their business model, and if they'd have kept it to themselves they'd be facing one less competitor...


They do say that imitation is the sincerest form of flattery, which holds up almost as well as the "imitate then innovate" mantra.


That article wasn't the first broad mention. I remember reading about BuiltWith being an outstanding one man project a couple of years ago, so I am in fact surprised the copycats took so long to show up.


Wappalyzer [0] might be a good open source alternative.

[0] https://github.com/AliasIO/Wappalyzer/


As mentioned in my previous comments, here's why we think we have something different (and maybe better) than our counterparts like Wappalyzer:

1. WhatRuns detects fonts, Wordpress plugins and themes (tens of thousands of them).

2. Ability to follow sites (and know what techs websites started using/ditched).

3. Very lightweight compared to our counterparts, and arguably better UI ;)

4. More accurate data. BuiltWith can be very inaccurate as you might've already noticed. Wappalyzer is fairly accurate, but limited in technologies. WhatRuns is trying to be the best of both worlds.


This is pretty nifty. Cracking the extension open reveals a fairly basic API you can use to skip the extension. Here's some code to use it.

https://gist.github.com/9b/f5fe434bf9965d673963884b56d93d9a

On the privacy side, I could see concern from those using the extension. When the site is not found in their database, the full HTML of the page appears to be submitted to the servers and processed. This is a bit of what you would expect, but may present some concern for cases where a new site is submitted and PII is sent to WhatRuns servers.


Great hack. We will release the API very soon! :)

On the privacy side, I'm sure you noticed that we filter out any text content before sending HTML tags ‘anonymously’ for technology identification.
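For readers wondering what filtering out text content could look like, here is a minimal sketch using Python's standard `html.parser`. It is an assumption about the general technique, not the extension's real code, and the class and function names are hypothetical. Note that attribute values are kept as-is, so they could still carry personal data.

```python
from html.parser import HTMLParser


class TagSkeleton(HTMLParser):
    """Collect tags and attributes while dropping all text nodes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # Attribute values are re-emitted without escaping; fine for a sketch only.
        rendered = "".join(
            f' {name}="{value}"' if value is not None else f" {name}"
            for name, value in attrs
        )
        self.parts.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    # handle_data is deliberately not overridden: the default ignores text content.


def strip_text(html):
    """Return the tag structure of `html` with text content removed."""
    parser = TagSkeleton()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


print(strip_text('<div class="wp-block"><p>Jane Doe, jane@example.com</p></div>'))
```

The class names and structure that detection relies on survive, while the visible text between tags does not.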


Does anyone know what they are using to detect Wordpress?

Unfortunately some sites that I am responsible for running in production are WP and we try our best to hide this fact and block all admin functionality to the public due to WP's less-than-stellar history of security vulnerabilities. This is the first tool I've seen that has detected it and now I'm stumped.


> we try our best to hide this fact

Can't really hide WordPress, because any time there's a new vulnerability, scrapers spam every site on the internet attempting to use it anyway, regardless of what tech they're built on.


I'm curious what efforts you've made to "hide" WordPress. Can you share any of your techniques? I assume it's stuff like:

- Rename paths to eliminate "wp-" prefixes and recognizable folder structure (wp-content, wp-includes, etc)

- Remove or rename any common plugins that inject recognizable WP-specific code into the page

- Rewrite requests to bare paths instead of e.g. index.php

I assume you'd also try to do as much handling as possible at the Apache/NGINX layer instead of letting requests hit the WP application.

Seems like a HUGE amount of effort, and I'm probably not even getting everything. Is there a more efficient way of securing/locking-down a WP site?


For cyph.com/blog, we have a WordPress instance accessible only by SSH tunnel, and what gets deployed publicly is a static site generated using a plugin called Simply Static (with a little bit of additional processing).


How long does it usually take for a small site to be generated using Simply Static? I tried it once before, and wasn't very impressed by the performance (I don't think it's a problem with the plugin, but maybe PHP itself).


Simply Static itself takes about a minute, but it's actually a decent amount longer because we have to simulate a browser and run retry logic to handle failures. All in all, with post-processing included, the static blog generation is the single longest part of our deployment process.

Ultimately it isn't a huge deal for us though, since it runs concurrently with other build/deployment steps that in total (sequentially) take a similar amount of time.


I've run it on a simple bog-standard out-of-the-box Wordpress install with no obfuscation just now and it said "No apps found". Not sure what the issue is.

One thought I had was perhaps it uses some cached batch parser and shows "No apps found" for all sites on first-run until it finishes analysing in the background? It doesn't seem to work at all on a few very obvious but small/obscure CMS sites but works fine on all well-known high-traffic sites.


It could be something as simple as the class names of elements on the page. WP has some defaults that are recognizable.

Also, most WP pages will be loading scripts from the wp-includes directory. There are probably others I'm overlooking, and some WP plugins probably also drop recognizable script tags into your pages.

Since this is the first tool that has detected it, it's very possible you've already covered all of the things I mentioned.


Yes, `wp-content`, `wp-includes` and basically anything else prefixed with `wp-` is a very clear sign that WordPress is behind the site.


Well, you'll certainly be interested in wpscan.


The WP REST API is a new way to detect if a site is running WordPress. If you hit the homepage of a WordPress site it will return a link header with a location to the REST API. They can also just hit /wp-json/, or /xmlrpc.php, or many other files that WordPress requires. Like looking for assets served from wp-content, or wp-includes.
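
A rough sketch of the two checks mentioned here, assuming you already have the homepage HTML and response headers in hand. The function name `wordpress_signals` is hypothetical, and this is not a claim about how WhatRuns does it. The `rel="https://api.w.org/"` value is what WordPress puts in its REST API discovery `Link` header.

```python
import re

# Asset paths that a default WordPress install exposes.
WP_PATH = re.compile(r"/wp-(?:content|includes)/", re.IGNORECASE)
# Relation type WordPress uses when advertising its REST API.
WP_API_LINK = re.compile(r'rel="https://api\.w\.org/"', re.IGNORECASE)


def wordpress_signals(html: str, headers: dict) -> list:
    """Return human-readable reasons to suspect WordPress (hypothetical helper)."""
    signals = []
    if WP_PATH.search(html):
        signals.append("asset path under wp-content or wp-includes")
    # Header names are case-insensitive, so look the Link header up that way.
    link = next((v for k, v in headers.items() if k.lower() == "link"), "")
    if WP_API_LINK.search(link):
        signals.append("Link header advertising the REST API")
    return signals
```

Probing `/wp-json/` or `/xmlrpc.php` directly would need extra requests, which is why this sketch sticks to what a single homepage response already reveals.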


You really shouldn't be relying on security by obscurity to prevent attacks to your websites. If you check your access logs you'll see countless attacks that are unconditional, they'll just try the attacks without any kind of sanity checking.


I don't know what they're using, but Wappalyzer uses regular expressions over the HTML. You can intentionally mislead the scanner without much effort.
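
To illustrate why regex-over-HTML detection is easy to fool, here is a toy version. The pattern table and `detect` function are invented for this example and are far simpler than Wappalyzer's real rule set.

```python
import re

# Hypothetical fingerprint table: technology name -> pattern searched in raw HTML.
FINGERPRINTS = {
    "WordPress": re.compile(r"wp-content|wp-includes"),
    "jQuery": re.compile(r"jquery[^\"']*\.js", re.IGNORECASE),
}


def detect(html: str) -> list:
    """Return the sorted names of every fingerprint that matches the HTML."""
    return sorted(name for name, pattern in FINGERPRINTS.items() if pattern.search(html))


# A static page with a decoy comment is enough to trigger a false positive.
decoy_page = "<!-- /wp-content/themes/decoy/ --><p>static page</p>"
```

Because the patterns only look at raw text, `detect(decoy_page)` reports WordPress even though nothing on the page is served by it. The reverse also holds: rename the paths and the pattern stops matching.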


Could it be as simple as checking if certain .php files respond to web requests?


The best way to hide WP is to stop using that pile of garbage.


From these comments, I didn't realize how much dislike there is for chrome extensions.


I can’t speak for others, but I don’t like the clutter and potential security/privacy issues. I am not saying there are such issues, but it “feels” like there could be. I don’t have the time or desire to heavily vet extensions so I tend to avoid them. What they say they do and what they actually do — hard for me to quickly be able to verify them.


To address your concern with the privacy, WhatRuns does not collect or log any visitor information including IP address, location etc. We receive anonymous website data and match it with our database to display the results. Hope this clarifies.


> To address your concern with the privacy, WhatRuns does not collect or log any visitor information including IP address, location etc

That's not the whole truth - you are using Google Analytics to track visitors and you fail to disclose this in your privacy policy, despite this being mandatory under the Google Analytics T&C's.

Well done launching what looks like a very cool project, and I hope you can further improve it by informing visitors that you are using Google Analytics to track them (or even drop GA completely in favor of something privacy friendly).


We are using Google Analytics only on the website; it does not collect (and has no access to) extension user data. However, you are right that we should've mentioned this in our privacy policy. We're on it :)


That’s nice. However, with a website, I don’t have to even worry about it as much. Basically “trust us” is a high bar to clear because the potential to gather data is still there.

Good luck with your thing. I am sure you did a ton of work; I am just naturally risk-averse when it comes to installing extensions that have the potential to do things I might not want.


Why does this need to be a browser extension? No, thanks.


Because that's how they make money. They go about creating a database of what websites use what technologies. They later sell that info to sales people as leads.

I'm not sure what extra tracking they do beyond that!


You can build a database by accepting URLs submitted by users, too. It just baffles me that people willingly install these extensions that—on the tin!—say that they can "Read and change all your data on the websites you visit". INSANE


Yeah that.

I would try it as a bookmarklet but I never install Chrome extensions that ask for all data on all websites. It's just an insane permission for what should only get URLs when I explicitly ask it to.

I wish Chrome would add a permission like this "website URL of the current page with your express permission every invocation".


Because even though they can, that's not what they're doing.

I see comments like yours on this site pretty often, and it is tiring. There are many reasons people behave the way they do, and probably the most common reason is that their behaviors haven't caused them any harm as far as they know.

The warning "Read and change all your data on the websites you visit" is perhaps scary the first time you see it, but then it becomes insignificant as time goes by and as extensions get installed without causing any visible harm.


> The warning […] is perhaps scary the first time you see it, but then it becomes insignificant as time goes by…

Which is exactly why it's dangerous. Granting access like this without a thought to the potential consequences is just asking for a bad character to take advantage of the blind trust people place in extension authors.

The core issue is the options Chrome gives extension authors. Offering the ability to grant permissions per-site and per-use would greatly reduce the threat. Even just a per-use "Are you sure?" confirmation would help.


> The warning "Read and change all your data on the websites you visit" is perhaps scary the first time you see it, but then it becomes insignificant as time goes by and as extensions get installed without causing any visible harm.

LOL, what matters is the threat itself and not your waning level of apprehension over the threat. This is really a very, very strange comment. The point is there is no need for this to be a browser extension. Putting an input element and some AJAX on the page is trivial, so I really don't buy the excuse that they haven't had time to put together a web app yet.


> It just baffles me that people willingly install these extensions that—on the tin!—say that they can "Read and change all your data on the websites you visit".

It's disappointing you can't have finer-grained permissions for Chrome extensions. What's the alternative, though, if you can't make it a web service? An Electron or native app, for example, would have even more permissions and could read any file on your computer.


They do not need to worry about websites and CDNs that would flag their spiders as such. They get free scrapers thanks to that.


I understand your concern, but as I mentioned in my previous comments, we started with extension as developers/designers found it especially handy for a quick look-up while working on their projects.

Also, our counterparts got a majority of their traction from browser extensions which made it our obvious priority (even though it wasn't the easiest of options).

Not to worry though - we're working on something for the web as well!


Heh, good luck doing this for HN. You might say "Arc," but it's been modified for a decade.

I wonder if the mods would ever be interested in being interviewed or talking about some of the tech. The last bit of Arc info we got was https://news.ycombinator.com/item?id=11240681, which was awesome.

It's pretty unique. I don't think any other large website in the world has written their own stack from top to bottom. Even Facebook uses php.


That link was pretty interesting.

Having to explicitly declare thread local access is a clever hack.

I also wonder what database they use (if any).

Did they also build their own http stack?


They use in-memory hash tables, which works since the whole site can be in memory.

Originally they did build their own http stack but switched to nginx for a reverse proxy. On the other hand I'm not sure how much they lean on nginx's facilities.


Useful, thanks!

Noticed that it doesn't report correctly for subdomains - one of the sites I built is at foo.megacorp.com, but the extension reports the results for megacorp.com which is a separate property.


WR currently considers subdomains as a part of the main domain.

Most users like to know the full tech stack of a website. If there is a blog at blog.company.com and it is using Intercom, that can be useful data. I hope this makes sense.

Anyway, we will definitely address this concern and think about adding an option for subdomain separation.
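
For anyone curious how the subdomain collapsing described in this thread might come about, here is a minimal sketch. It assumes the lookup key is simply the last two labels of the hostname; `naive_site_key` and `full_host_key` are hypothetical names, not anything taken from the extension.

```python
from urllib.parse import urlsplit


def naive_site_key(url: str) -> str:
    """Collapse a URL to its last two hostname labels (assumed behaviour)."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def full_host_key(url: str) -> str:
    """Alternative key that keeps subdomains separate."""
    return urlsplit(url).hostname or ""
```

With the naive key, `https://app.mycompany.com` and `https://mycompany.com` share one database entry, which matches the reports here. Taking the last two labels also breaks on suffixes such as `co.uk`, so a real implementation would more likely consult the Public Suffix List.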


Edit: About 15 minutes after posting, this seems to work fine on my site now. Sorry for the confusion!

The spinner does not stop and it gives no results for my site (https://myhikes.org) - this is with both the Firefox and Chrome extensions. Seems to work great for everything else though.

Just replying in case you're looking for new edge cases to debug!


Works for me on your site! (Chrome)


Very weird, now it works after reading this and re-checking.

Oddly enough I restarted Chrome and Firefox twice before posting the comment in hopes that it was just my machine. Thanks for the sanity check!


It could be because of the traffic peak that HN and PH created yesterday. We had to queue all new URLs so that URLs we had already processed (almost all of the top 1M websites) kept working fine. There were also minor downtimes between server upgrades.
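
The "serve known sites immediately, queue new ones" behaviour could be sketched roughly like this. The class and method names are invented for illustration and say nothing about WhatRuns' real backend.

```python
from collections import deque


class LookupService:
    """Answers from a cache and queues unseen domains for later analysis."""

    def __init__(self, known: dict):
        self.cache = dict(known)   # domain -> list of detected technologies
        self.pending = deque()     # domains waiting to be analysed

    def lookup(self, domain: str):
        if domain in self.cache:
            return self.cache[domain]      # fast path for already-processed sites
        if domain not in self.pending:
            self.pending.append(domain)    # new site: enqueue once, answer later
        return None

    def process_next(self, analyse):
        """Analyse the oldest queued domain with the supplied callable."""
        if not self.pending:
            return None
        domain = self.pending.popleft()
        self.cache[domain] = analyse(domain)
        return domain
```

A `None` result from `lookup` would correspond to the endless spinner some people saw: the result only appears once a worker has called `process_next` for that domain.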


I can't comment on the detection accuracy because this extension makes an important mistake -- it ignores the actual URL you are on and always performs detection on the root domain. So if I point the extension to a webapp at app.mycompany.com I get results for our marketing site at mycompany.com, which uses completely different (and more boring) tech.


Yes, WR currently considers subdomains as a part of the main domain.

Most users like to know the full tech stack of a website. If there is a blog at blog.company.com and it is using Intercom, that can be useful data. I hope this makes sense.

Anyway, we will definitely address this concern and think about adding an option for subdomain separation.


+1 for having a privacy policy linked from their home page that addresses both their browser extension and their website.


Thank you, first thing we did before the launch!


The Chrome extension is a nonstarter for me.


Why is this a company? It's one Chrome addon.

Seriously people.


Their idea is to index all the site tech and sell it to salespeople.


I like the idea, but I have one big problem with WhatRuns:

It is capturing my browsing behaviour, because it sends every URL I browse to the whatruns.com server in the background, even when I don't want to know what software the page is running (that is, without my clicking the icon). So WhatRuns gets a full browsing history from me (and you even set a UUID cookie to track unique users!).

This is a huge privacy issue! Imagine Whatruns is starting to sell this data!

To replicate simply open the dev-console for the extension and click the network tab.


Thanks! No visitor information is ever logged. We don't collect or log any visitor information including IP address, location etc. We receive anonymous website data and match with our database to display the results. I'm sure you noticed that we filter out any text content before sending HTML tags ‘anonymously’ for technology identification. This speeds up the process and is responsible for the speed WhatRuns has achieved.

Hope this clarifies. Drop us a line if you still have any concerns, would love to clear it for you: hello [at] whatruns.com


Fun. I'm getting a false positive for Rails (I'm using Passenger, but not Rails), and Elevio for documentation (never heard of it!). Other than that it guesses right.


It says ycombinator.com uses React, jQuery, "Vis JS", a Facebook tracking pixel, CloudFlare and Cloudfront ...

https://www.whatruns.com/website/ycombinator.com

That sounds strange. Can these claims be backed up somehow? I cannot see anything in the source that would confirm these.

It also says Facebook uses Google Analytics \o/


Have you checked the headers that these site(s) are sending/receiving? There's usually a couple of indicators in them that point towards whatever tech stack is running.
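
If you want to check the headers yourself, something along these lines works with just the standard library. The list of "interesting" headers is my own pick and is not exhaustive; both function names are made up.

```python
import urllib.request

# Response headers that commonly leak server-side technology (assumed shortlist).
INTERESTING = ("server", "x-powered-by", "x-generator", "via")


def header_hints(headers: dict) -> dict:
    """Keep only the headers that tend to reveal the stack, with lowercased names."""
    return {k.lower(): v for k, v in headers.items() if k.lower() in INTERESTING}


def fetch_headers(url: str, timeout: float = 10) -> dict:
    """Issue a HEAD request and return the response headers as a dict."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return dict(response.headers.items())
```

Calling `header_hints(fetch_headers("https://example.com"))` prints whatever the server chooses to disclose. Many sites strip or fake these headers, so treat the result as a hint rather than proof.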


It could also be data collected from Hacker News, as WhatRuns considers subdomains part of the main domain.

Most users like to know the full tech stack of a website. If there is a blog at blog.company.com which is using Intercom, that can be useful data. I hope this makes sense.

Anyway, we will definitely address this issue and consider introducing an option for subdomain separation.


How are they detecting other technologies apart from javascript? By requesting the companies to share the tech stack manually?


Interesting choice of domain name. At first I thought this is WhatRunsWhere. [1] I checked a few WordPress sites that use CloudFlare, and it didn't detect WordPress. Let me know if you need the URLs.

[1] https://www.whatrunswhere.com/


That would be great! Please share the URLs in question so that we can take a look. We are squashing bugs one at a time ;) Email: hello [at] whatruns.com. Thanks!


It hangs on an old website, with the following error in the console: TypeError: Cannot read property 'hostname' of undefined at Object.setNoAppsFoundText (chrome-extension://cmkdbmfndkfgebldhnkbfhlneefdaaip/js/popup_final.js:153:22)


If you prefer the command-line, whatweb has been around for a while (first public release in 2009): https://github.com/urbanadventurer/WhatWeb


Tried on a site whose backend I know doesn't contain any ruby on rails and it says web framework is ruby on rails. Maybe they use Bayesian inference and don't worry about not having any data...


Looks cool, but care to share how you manage to process the entire web, which is around 200 million domains and billions of web pages? Sounds like a herculean task for a startup your size.


Yes, it is challenging to index and process billions of web pages on a regular basis. Our current architecture can be scaled to handle 500+ million web-pages. However, to increase the crawl frequency and introduce more services, we are building a distributed computing network - called Newton Network (newton.network). We hope this will give us enough processing power to power our ambitions ;)


I tried it on a website of mine running on localhost using Python and it said languages "Python Node.js PHP Ruby", which seems a bit overenthusiastic, as none of the non-Python stuff was running.


WhatRuns is currently not working on localhost. It is on our roadmap and we will definitely give it more weight! :)


Doesn't seem to handle my simple little kid quotes project. Just spins, and spins with no result. ¯\_(ツ)_/¯

https://kidisms.com/


Sorry, we experienced a minor downtime earlier today, which could be the reason why. Can you try again now?

If you're still facing the issue, please drop us a line with the URL in question so that we can take a look: hello [at] whatruns.com. Thanks!


Off topic, but, has anyone else been able to identify software frameworks by the behavior the application presents before?

I find myself getting slightly better at this as I spend more time in web development.


I ran it on my site and it didn't find anything.

I run jQuery and nginx, have Google Analytics, and have my SSL certificate from Let's Encrypt. All stuff that builtwith.com found without any issues.


Our servers are going a bit cranky due to the huge traffic we are experiencing today. New websites (that were not loaded on WhatRuns before) are now queued up and might experience a few seconds of delay. This is to ensure the best experience for our active users.


Testing this against a number of websites whose stack I know, it seems to not only miss information but also regularly report things never used on those sites.


Copying the comment I previously posted:

We truly understand your frustration with detection accuracy, but when there are tens of thousands of technologies to detect, the only solution is to break things and move fast.

We were featured on Chrome Webstore a few weeks back and got a great response (12k+ active users), which helped us enormously in improving the accuracy and efficiency, and I'm sure the HN and PH launch will be even more helpful in improving the product.


What runs whatruns? - since it does not work at the moment


Sorry, we experienced occasional downtimes earlier today, which could be the reason why. Can you try again now?

If you're still facing the issue, please drop us a line with the URL in question so that we can take a look: hello [at] whatruns.com. Thanks!


I can't add this to Opera, it seems, though there is a button that suggests I could. All extensions disabled, latest version. Chrome works fine.


Sorry about this. We have only publicized Chrome and Firefox for now, considering the demand. We will release the rest within a week's time. Thanks for dropping by!


All cool, thx for your effort! Good Luck!


Wow.... Thanks to my website's Content-Security-Policy, I blocked Whatruns' javascript!!!


Awesome tool! I have often wondered which tech sites use, but almost never bother with checking the source etc.


Awesome. Thank you so much for your kind words!


Please do make the extension open source. It has access to all website data, so mind the privacy and show us the code.


You don't need their permission to look at the code, just use something like http://crxextractor.com/ to fetch the CRX from the Chrome Web Store, then rename to .zip and extract.
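
The "rename to .zip and extract" step can also be done in a few lines of Python. As far as I know, a CRX file is a ZIP archive with an extra header in front, and the `zipfile` module copes with that prefix because it locates the archive's directory from the end of the file. The function names below are just for illustration.

```python
import io
import zipfile


def list_crx_files(crx_bytes: bytes) -> list:
    """List the files packed inside a CRX given its raw bytes."""
    with zipfile.ZipFile(io.BytesIO(crx_bytes)) as archive:
        return archive.namelist()


def extract_crx(crx_path: str, destination: str) -> None:
    """Unpack a downloaded .crx file into a directory for inspection."""
    with zipfile.ZipFile(crx_path) as archive:
        archive.extractall(destination)
```

After extracting, the extension's `manifest.json` and its JavaScript files are ordinary files you can read, although the scripts may be minified.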


Not likely, unfortunately: "Our proprietary pattern recognition algorithm efficiently detects even the latest web technologies and plugins used on websites." (https://www.whatruns.com/about)


Very unlikely that it does this client side. I presume the extension basically makes an API call with the url of the current site.


Really inaccurate. Just tried and it reported React for Vue/Nuxt.js, CloudFlare and nginx for Zeit now.sh.


We truly understand your frustration with detection accuracy, but when there are tens of thousands of technologies to detect, the only solution is to break things and move fast.

We were featured on Chrome Webstore a few weeks back and got a great response (12k+ active users) which helped us enormously in improving the accuracy and efficiency, and I'm sure HN and PH launch will be even more helpful in improving the product.


Love the design. Nice work guys. Are you going to be selling leads based on tech info as well?


Thanks! I'll share this comment with our designer :)

Our business model will be similar to BuiltWith's, i.e. selling lists of websites using a particular technology. For example, a list of websites using Drift chat (https://www.whatruns.com/technology/drift) will be super-useful competitive intelligence for other live-chat start-ups.

Also, we are planning to introduce a predictive sales system which will suggest clients based on their technology adoption. For example, if a company migrates to Magento, they are a potential client for Magento extension developers.


Doesn't seem to work on subdomains, unfortunately. Just does the main domain instead


It is working on subdomains, but you are right that the primary domain is prioritised.

Most users like to know the full tech stack of a website. If there is a blog at blog.company.com and it is using Intercom, that can be useful data. I hope this makes sense.

Anyway, we will definitely address this concern and think about adding an option for subdomain separation.


Pretty cool. Something like this was overdue. Loved it.


Thanks Amit, glad you like it!


Looks similar to stackshare


How good is this compared to Wappalyzer?


Comparing with Wappalyzer and BuiltWith, here's why we think we have something different (and maybe better):

1. WhatRuns detects fonts, Wordpress plugins and themes (tens of thousands of them).

2. Ability to follow sites (and know what techs websites started using/ditched).

3. Very lightweight compared to our counterparts, and arguably better UI ;)

4. More accurate data. BuiltWith can be very inaccurate as you might've already noticed. Wappalyzer is fairly accurate, but limited in technologies.

WhatRuns is trying to be the best of both worlds. :)
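
As a small illustration of the "follow sites" idea in point 2, spotting what a site started using or ditched is essentially a set difference between two scans. This is only a sketch with a made-up function name, not the product's implementation.

```python
def diff_technologies(previous, current) -> dict:
    """Compare two scans of the same site and report what changed."""
    before, after = set(previous), set(current)
    return {
        "added": sorted(after - before),     # technologies seen only in the new scan
        "removed": sorted(before - after),   # technologies that disappeared
    }
```

For example, comparing `["WordPress", "jQuery"]` with `["Magento", "jQuery"]` reports Magento as added and WordPress as removed. A false detection in either scan would show up as a spurious change, so accuracy matters even more for notifications.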


Nobody mentioned netcraft!


No need to install anything, just follow this url:

https://www.whatruns.com/website/reddit.com


Thanks for sharing that link. Here's a bookmarklet (as opposed to the Chrome extension) to launch this on whatever site you're on:

   javascript:void(window.open('https://www.whatruns.com/website/'+window.location.hostname));


This doesn't seem to work for all sites, for example my site https://charlieegan3.com doesn't work: https://www.whatruns.com/website/charlieegan3.com


I think the link is just returning a cached result that was already identified with the extension.


Why no input on the page itself for something like that?


We started with the extension as developers/designers found it especially handy for a quick look-up while working on their projects. Not to worry though - we're working on something for the web as well!


This is beautiful! While there are similar alternatives, I love the looks of Whatruns so I'll stick with it.

The URL has to be publicly accessible from the Internet, right?


Thank you so much! I'll share this comment with our designer ;)

Addressing your question, all URLs once passed through WhatRuns will be publicly accessible. You will have to use the extension for new sites for now.


Choosing a data vendor based upon the UI seems odd. Personally I'd choose whichever provider has the most accurate, up-to-date information, but that's just me.


Yup, if they are basically the same but this is more intuitive and easier why not? As I don't depend on this (otherwise I'd agree) and it's just a "for fun" thing, the differences between them are negligible for me.


Would it be possible for you to add OVP & CDN detection?


OVPs and CDNs like Cloudflare, Amazon Cloudfront and most others are already detected by WR.


I see this now. Thank you very much for your reply.



