WordPress base configuration files on GitHub (github.com/search)
165 points by ishener on Aug 17, 2015 | 84 comments



At what point do developers get criticized/held responsible for using public repositories for private websites? I get it, people like github but when you can get a private repo on bitbucket for free there's no excuse for this.


> people like github but when you can get a private repo on bitbucket for free there's no excuse for this.

Absolutely, Github public repos have always been a blackhat gold mine. But I guess a lot of people have never heard of Bitbucket, since Github is advertised everywhere: on education blogs, in books, ... I'm sure some noobs using it don't even realize repositories are public and searchable.


or just a bare repo on your server. I personally don't see the appeal of Github for private projects at all.


Many hosted solutions integrate with private Github (OpsWorks, Codeship, CircleCI, CodeClimate, and many more).


PRs, Issues, Wikis, etc.


This isn't a problem with public/private repositories. It's possible to commit the config files for Wordpress without exposing your passwords - you just have to not be lazy about it, and store the actual values in the environment or pull them from a file you don't commit with everything else.
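
A minimal sketch of the "store the actual values in the environment" approach, written in Python for illustration rather than PHP. The variable names mirror the WordPress constants DB_NAME, DB_USER, DB_PASSWORD and DB_HOST; mapping them one-to-one onto environment variables, and the helper name load_db_config, are assumptions made for this example.

```python
import os

# Assumed convention: one environment variable per WordPress DB constant.
REQUIRED_KEYS = ("DB_NAME", "DB_USER", "DB_PASSWORD", "DB_HOST")


def load_db_config(environ=None):
    """Return database settings read from environment variables.

    Raises RuntimeError naming every missing variable, so a misconfigured
    deployment fails loudly instead of falling back to a committed default.
    """
    env = os.environ if environ is None else environ
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

The committed config file then contains only the lookup logic, and the values themselves live wherever the host keeps its environment.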


Github does have a wider set of services that integrate with it. That said, if you have to go Github, private repos aren't that expensive. (And more services seem to be recognizing that Bitbucket is an increasingly popular option.)


They are if you have to manage a lot of them.

I love GH, but I only use it for my public projects. Limiting private repos ($200 for 125?) seems insane to me, and it will drive people to make public items that shouldn't be.

For any private projects, or ones involving clients, I use BitBucket and make all repos private. It's a difference of $190 for me (I use the $10/mth plan with BB and host well over 125 private repos).


> I love GH, but I only use it for my public projects. Limiting private repos ($200 for 125?) seems insane to me, and it will drive people to make public items that shouldn't be.

But it's not really github's fault that people have public repos. For $20/month I can set up a VPS with my own source code hosting service (full management like gitlab) and host all the repos I want. I get that people love the features of github - but they never really use them.

I would say people use public repos on github because they are lazy. And when people get nailed for uploading their Amazon AWS keys - they should really think about an alternative solution for their git repo needs.


150,000,000+ database passwords, of which 99.9999%+ are from local development servers.


Excluding localhost and some obvious cases where the values are in a local config file still leaves around 111,000:

https://github.com/search?p=1&q=filename%3Awp-config.php+DB_...


localhost in the case of Wordpress just means the database is running on the same machine as the web server. Practically every WP instance is set up that way.


> localhost in the case of Wordpress

Uh, the concept of localhost is not unique to Wordpress in the slightest.


That's not what he meant. What he meant was reading that the database is on localhost doesn't mean it's a development system. Many production instances of Wordpress run the database on the same host. Therefore localhost can also mean production. That might not be true for other services, but for Wordpress that's common. This is what he meant.


Excluding localhost will exclude a lot of legitimate installs.


Most wordpress devs support multiple environments in their .wpadmin (dev -> localhost, prod -> some server). So this is creating a lot of false negatives.


Which would be 99.926% :-)


How can you tell that? Just because it specifies "localhost" doesn't necessarily mean it's a dev box.


Exactly. Also remember that on many many wordpress servers you also have phpMyAdmin. You can use the username + password to login, and you'll get access even when it's restricted to localhost.


It's pretty scary to have phpMyAdmin public facing. IIRC some older versions of mysql-server even had a bug where it'd let you in with a random password in 1/256 chance!


It should be really bad, since phpmyadmin get attempts are the most frequent on my home webserver and I don't even have it installed. Maybe older vulnerable versions are still around though.


I agree with mahouse, and of course some of these passwords are legit. But this is nothing new: don't store sensitive data in git. Everyone knows you can search for this stuff on GitHub, and if we look back, Google was a nice password search engine too (and still is today).


Don't store sensitive data in git, or don't store sensitive data on public github repos?


Don't even store sensitive data in git, it can be a bad idea: http://www.jamiembrown.com/blog/one-in-every-600-websites-ha...

Store credentials in environment variables.


How would you go about making a repeatable, automated deployment if you don't store configuration information in source control to load into the environment variables?


Depends on what you're using for your deployments, for example if you were using Puppet you might use something like https://github.com/TomPoulton/hiera-eyaml


Sorry, how does this prevent you requiring configuration in source control, or are you just suggesting that those credentials should be encrypted?


Consul and Vault go a long way to achieving that.

https://consul.io/

https://vault.io


Doesn't this just push the problem up (down?) a level in the hierarchy? I mean, you still need to deploy these and configure them with the information the rest of your deployment requires right?


That link is talking about a problem with e.g. .htaccess and basic directory permissions, not a problem using git per se. But yeah, put that stuff in envars.


If there's any question, I think the rule has to be the former. There are standard, auditable ways to keep sensitive data out of git: .gitignore, environmental vars, etc. Once it's in git, any attempts to keep it out of a public repo will probably be manual and ad hoc.

If the organization is "closed" by default, i.e. it only rarely releases code to the public, this may not matter as much.
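
One way to make the "auditable" part concrete is a check that runs before a commit and flags literal secrets. The following is only a sketch: the function name find_hardcoded_secrets is hypothetical, the list of constants is limited to DB_PASSWORD and FTP_PASS as mentioned in this thread, and the regular expression only understands simple define('NAME', 'value') lines.

```python
import re

# Constant names taken from the thread; illustrative, not exhaustive.
SENSITIVE_CONSTANTS = ("DB_PASSWORD", "FTP_PASS")

# Matches define('NAME', 'literal') with either quote style.
DEFINE_RE = re.compile(
    r"""define\(\s*['"](?P<name>\w+)['"]\s*,\s*['"](?P<value>[^'"]*)['"]\s*\)"""
)


def find_hardcoded_secrets(php_source):
    """Return (line_number, constant_name) for each non-empty literal secret."""
    hits = []
    for lineno, line in enumerate(php_source.splitlines(), start=1):
        for match in DEFINE_RE.finditer(line):
            if match.group("name") in SENSITIVE_CONSTANTS and match.group("value"):
                hits.append((lineno, match.group("name")))
    return hits
```

A line such as define('DB_PASSWORD', getenv('DB_PASSWORD')); is not flagged, because the second argument is not a quoted literal.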


If you're the sort of developer who puts a wp-config.php file in a git repo (e.g. no proper deploy process, otherwise that file wouldn't exist in the repo, or no reference to the config file in .gitignore), you're probably the sort of developer who'll use the same password on your local dev machine and your live site because "setting up MySQL users is hard."


And how many of those MySQL passwords are also used by the WordPress admin account?

Even if it's just 0.1%, that would be still 150'000 valid passwords.


And the other 95% doesn't allow external connections.


Found several working username+password for FTP accounts.


I'd guess that a large portion of those which aren't localhost are also firewalled or otherwise inaccessible from the public Internet.


Keep clicking, because I found what appear to be two valid passwords in 5 different config files (with public hosts). Obviously I didn't test them, so you might be right that they change or are in dev mode, but that smells like a lot more than 6 9s.


How many of those change when they go live?


Hmm. If you alter the search to "filename:wp-config.php FTP_PASS" you start getting some that look like ... legit. For those who don't know, WordPress has some level of access to hosting server via FTP, for upgrades and plugin installs.

Pertinent config globals are FTP_BASE, FTP_CONTENT_DIR, FTP_PLUGIN_DIR, FTP_PUBKEY, FTP_PRIKEY, and of course, FTP_USER, FTP_PASS, FTP_HOST.


In a similar vein, things like this https://github.com/search?utf8=%E2%9C%93&q=filename%3Aid_rsa... are also why passwording your private keys is very important. Tons of these keys (why are people committing these to public repositories??) aren't passworded. It astounds me that someone has the technical knowledge to create an ssh key/pair, commit to github, and manage to send their unencrypted private key off into the public sphere.
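
For anyone wanting to check their own keys, here is a rough sketch of telling a passphrase-protected private key from an unprotected one by looking at the file itself. It assumes three layouts: the legacy PEM form (an encrypted key carries a "Proc-Type: 4,ENCRYPTED" header), PKCS#8 (an encrypted key starts with "BEGIN ENCRYPTED PRIVATE KEY"), and the newer OpenSSH container, where the cipher name follows the "openssh-key-v1" magic and is "none" for an unprotected key. The function name private_key_is_encrypted is made up for this example.

```python
import base64

OPENSSH_MAGIC = b"openssh-key-v1\x00"


def private_key_is_encrypted(key_text):
    """Return True if the private key text appears passphrase-protected."""
    lines = [line.strip() for line in key_text.strip().splitlines()]
    header = lines[0]
    if header == "-----BEGIN ENCRYPTED PRIVATE KEY-----":
        return True  # PKCS#8 encrypted container
    if header == "-----BEGIN OPENSSH PRIVATE KEY-----":
        blob = base64.b64decode("".join(lines[1:-1]))
        if not blob.startswith(OPENSSH_MAGIC):
            raise ValueError("not an openssh-key-v1 blob")
        offset = len(OPENSSH_MAGIC)
        # The cipher name is a length-prefixed string (4-byte big-endian length).
        name_len = int.from_bytes(blob[offset:offset + 4], "big")
        cipher = blob[offset + 4:offset + 4 + name_len]
        return cipher != b"none"
    # Legacy PEM: encrypted keys announce it in a header line.
    return any(l.startswith("Proc-Type:") and "ENCRYPTED" in l for l in lines)
```

This only inspects headers; it does not validate the key or say anything about how strong the passphrase is.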



I love how, for me, the first result is a backup of a/the MIT website


Does anyone else get a CAPTCHA on the next search after browsing through those results?


I got it too. I think it's flagged because some people crawl Google with those search terms looking for vulnerable systems.


I didn't.



Thought it was going to be template stuff for a min, clicked on the first one and saw "/** MySQL database password */ define('DB_PASSWORD', 'JasxkvpY72KKCdttdBqt');"


There are also salts/hashes in many of these configs...not such a great place to store those =)


Another one I posted about sometime ago is filezilla config files. Found lots of FTP servers with their passwords in the filezilla config files committed on github. [0]

[0]: https://www.google.co.in/search?q=inurl%3Afilezilla+inurl%3A...


"Passwords" in the title is a bit misleading. Most of these are staging files with little or no sensitive information there. However there is the odd bit of interesting data there if you look hard enough.

Github search is an untapped resource, just like Algolia Search is on Hacker News. In fact, I have largely replaced my Google searches with these for more refined and curated results.


What do you mean by staging files? What is not sensitive about the username and password of the database?


Well a Wordpress production site is a rare and precious thing to find on Github. There are some that exist, but then even if I do find it:

1.) Password will be changed

2.) Possible honeypot

3.) Boring site is boring. No need to hack it. Not popular enough

Same goes for other databases on there. An enormous amount of cruft to wade through to get anything remotely juicy/interesting. And the same heuristics apply above: is it really so great that I logged into a boring MYSQL database that is probably being monitored and has nothing interesting in there in the first place?


I think they mean that a majority of these are instances where someone installed WP to play with it and these are the testing files rather than an actual website they are using. When I installed WP recently, I know I just used a dummy password for testing.


Apart from putting your wp-config on github, it's also a terrible idea to use short passwords like 'p@ss12' for a database password that will be sent from one machine/program to another most of the time - such passwords should at the very least look like 'jm0Y/ZGjxYZay2yraskQ5AbZ8Qe0r0pRVDdnEkaIvHU', computers can remember strings that long and developers can copy-paste if it's stored in a file already.
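
A password of that shape is cheap to produce. As a sketch, assuming the example above is unpadded base64 of 32 random bytes (its 43 characters are consistent with that, but that is a guess), Python's standard secrets module can generate one; generate_db_password is a name invented for this example.

```python
import base64
import secrets


def generate_db_password(num_bytes=32):
    """Return unpadded base64 of num_bytes cryptographically random bytes."""
    raw = secrets.token_bytes(num_bytes)
    return base64.b64encode(raw).decode("ascii").rstrip("=")
```

With the default of 32 bytes the result is 43 characters long. Standard base64 can include "+" and "/", so secrets.token_urlsafe(32) may be the more convenient choice if the password ends up in a URL or shell command.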


A majority of these results aren't actually wp-config.php files. If you sort by date indexed, you'll see that the results include all manner of files.

    filename:"wp-config.php" "define('DB_NAME'," extension:php
seems to give better results


Just as bad, I see many developers leaving their Rails app with production secret key.

It just takes more time.


Security experts, I have a question: if a database server only allows connections from a whitelist (trusted IPs), is exposing database passwords in a git repository still a problem?


In any case where having passwords is relevant to security rather than just a hindrance to usability, exposing passwords is a security problem. If exposing a password isn't a security problem, you shouldn't require a password in the first place.


"Defense in depth" is a commonly accepted security principle that suggests otherwise:

https://en.wikipedia.org/wiki/Defense_in_depth_(computing)


> "Defense in depth" is a commonly accepted security principle

Indeed.

> that suggests otherwise:

Except that it does no such thing. If you have passwords for defense in depth, they both exist for security reasons and it is a security problem to expose them (because you've just eliminated part of your depth.)

Defense in depth means that the problems of any one layer being violated are mitigated by additional layers of security, it doesn't mean it suddenly ceases to be a security problem if one of your measures is compromised. It just reduces the likely immediate severity of such a compromise, providing a greater chance of being able to effectively address it before it leads to an actual breach.


It's not clear from your question if this hopefully hypothetical database server is exposed to the internet or if it's on a private network, but:

Yes, it's still a problem, because then you have to depend on the whitelist staying valid and never having an admin accidentally turn it off. And you also have to depend on none of the machines on the trusted IPs being compromised either. And you have to depend on many other things not happening as well.

Instead, you want what is called defense in depth--several layers of security so that an attacker needs to breach several defensive layers in order to get access to what they're looking for. Relying on just an IP whitelist as a single layer of defense is not considered to be a good practice.

https://en.wikipedia.org/wiki/Defense_in_depth_(computing)


I mean, if a database can be accessed only from a certain IP, like in Microsoft Azure.


[I'm not saying I am an expert] It is always a problem to expose passwords. Sure, they can't use it from the outside, but what's the point of even having the password if it's public? The password is there to prevent unauthorized access. If an attacker is able to get onto a box and pivot, that password becomes useful to them.


Maybe the password is the same for the wp-admin user. Password reuse is a really serious and common problem.


Definitely a good idea to whitelist ips, and, I'm certainly no security expert, but I still would avoid putting any type of credentials in a repository. Someone could have an ip reset, use a VPN tunnel, or any number of scenarios. I think it is better practice to use environment variables or something similar. Correct me if I'm wrong for those of you who are in security.


UNLOQ.io increases the security of your digital properties through a distributed authentication system that doesn’t require your users to remember any passwords.


This is one reason we don't use cloud-based source code hosting. All it takes is one idiot fork or an accident and wham, code everywhere.


It's a good case for private repos, but an even better case for not committing passwords to a repository in the first place.


I don't even bother committing any part of wordpress core. I just commit the wp-content directory because unless you're a mad man, you shouldn't be modifying anything outside of that directory. First rule of WordPress is don't touch core! wp-config is definitely an exception, but better safe than sorry to avoid issues like this.


I've found something else in WordPress with this simple search method that I'd argue is worse.


You do know that you can play the same game with other languages as well?

https://github.com/search?utf8=%E2%9C%93&q=filename%3Asettin...

I feel like people don't accept the fact that people do stupid stuff in other languages.


I think you're being oversensitive about PHP.

Wordpress, and by extension, its predictably-named settings file, is an easy search-target because it's very popular among novice/new developers.


> I think you're being oversensitive about PHP.

I'm just tired of seeing the same search on github for PHP config files.

> its predictably-named settings file

So does every other popular application and framework on the planet. This isn't something specific to wordpress - we can play this game all day with different applications, frameworks, and languages.

https://github.com/search?q=mysql+user&type=Code&utf8=%E2%9C...


Not to mention that a knock on Wordpress isn't necessarily a knock on PHP, since many non-developers (or companies that do zero dev in PHP) install Wordpress.


I found this search more interesting than someone pushing their wp-config to a repo; also warning, some are nsfw https://github.com/search?p=100&q=filename%3Atits.jpg+&ref=s...


That link is a great example to demonstrate how much Github search sucks now. You've explicitly searched for filenames of "tits.jpg", but it's showing you a complete mishmash of different files.


I was surprised that it worked at all; I assumed some filtering would kill it right away.


What is there to say... developers, don't dump your projects on github public repositories... use bitbucket and free private repos if you can't afford to pay, FOR GOD'S SAKE!!!

How many of them use the same credentials for their email? Facebook? Twitter? Their AWS account? This is a nightmare.


The first few repos I peeked at were several years since their last commit.


Which is why I don't use any database password if the database is listening on localhost only, which is the case most of the time.


I don't think this is a good idea, even if the database is just listening to localhost. Say a malicious script gets uploaded to the machine, it will be able to dump the entire database without any need to seek out credentials.


Agree... it's better to still have credentials, but ALSO only listen locally. At least that way the credentials need to be found first!


Is there a way to scrape this? Is there an API for github search?

Regards


That's why I am using https://Coding.net (Chinese only), a Chinese startup that provides free and unlimited private repository hosting with lots of features like code reviews, custom domains, WebIDE... Go private, guys!



