The Basics of Web Application Security (martinfowler.com)
223 points by vog on March 30, 2016 | 41 comments



I've taken a different approach to explaining the basics of application security.

These are the things that are most likely to cause your application to be insecure:

    1. Data being treated as code.
    2. Invalid (or missing) logic.
    3. Insecure operating environment. (e.g. outdated packages)
    4. Cryptography failures.
Data being treated as code covers the obvious things like SQL Injection and Cross-Site Scripting, and makes it easier to conceptualize the solution: Never mix the two. That's what prepared statements accomplish.
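
To make the "never mix the two" point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `users` table, the `find_user` helper, and the payload are assumptions made up for illustration, not anything from the article.

```python
import sqlite3

def find_user(conn: sqlite3.Connection, name: str) -> list:
    # The SQL text is fixed; `name` travels separately as a bound parameter,
    # so the database never parses it as SQL.
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()

# Hypothetical in-memory table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])

payload = "x' OR '1'='1"

# Data stays data: no user is literally named "x' OR '1'='1".
safe_rows = find_user(conn, payload)

# For contrast only: string interpolation lets the payload rewrite the query.
unsafe_rows = conn.execute(
    f"SELECT id, name FROM users WHERE name = '{payload}'"
).fetchall()
```

With the bound parameter the payload matches no rows, while the interpolated query returns every row in the table.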

Invalid/missing logic is the realm of "missing access controls" and "confused deputy" problems. (This is also where I'd place CSRF vulnerabilities.)

Insecure operating environment: A secure application on an insecure server is not secure. Patch your systems. Use HTTPS. WordPress can't save you from CVE-2012-1823. Apache can't save you from Shellshock.

Cryptography failures are their own category, where the phrase "padding oracle" is routinely uttered with a straight face, a single mistake can cripple your application, and comparing strings the normal way programmers compare strings is considered dangerous. The solution here for people who aren't interested in becoming an expert is to hire one. Or a team of them.
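
On the string comparison point: an ordinary `==` can return as soon as it finds a mismatching byte, which may leak timing information about a secret. A rough Python sketch of the usual alternative follows; the `sign` and `verify` helper names are my own, and HMAC-SHA256 is just an assumed choice of MAC.

```python
import hashlib
import hmac

def sign(key: bytes, message: bytes) -> bytes:
    # Compute an HMAC-SHA256 tag over the message.
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    expected = sign(key, message)
    # compare_digest is designed to take time independent of where the
    # inputs differ, unlike a plain `expected == tag`.
    return hmac.compare_digest(expected, tag)
```

This only covers the comparison itself; key management and protocol design are still the part where you want an expert.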

These descriptions are an attempt to address the fundamental problems with how insecure code is written, rather than to arm the reader with a checklist of specific instances of vulnerabilities they can check off.

I generally dislike checklists (OWASP Top 10, SANS Top 25).

How many people are well-armed against SQL Injection, but pass user data to PHP's unserialize() function with reckless abandon? This is a data-as-code issue.
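
The same class of problem exists outside PHP. As an analogy only (this is Python's pickle, not PHP's unserialize()), the sketch below shows a serialized blob choosing which callable runs during deserialization. The `Gadget` class and the harmless `6 * 7` payload are made up for the demonstration.

```python
import json
import pickle

class Gadget:
    # __reduce__ tells pickle how to rebuild the object; whoever creates
    # the bytes controls which callable is invoked on load.
    def __reduce__(self):
        return (eval, ("6 * 7",))

attacker_bytes = pickle.dumps(Gadget())

# Deserializing runs eval("6 * 7"): the data was treated as code.
result = pickle.loads(attacker_bytes)

# A data-only format such as JSON can only ever yield plain values.
safe = json.loads('{"user": "alice"}')
```

A real payload would call something far less friendly than arithmetic, which is why untrusted input should go through a data-only format plus validation.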

How many people use str_shuffle() for password reset token generation? This is a cryptographic failure.
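
For comparison, here is roughly what token generation looks like when it draws from the operating system's CSPRNG. This sketch uses Python's secrets module; `new_reset_token` and the 32-byte default are assumptions of mine, not a recommendation from the article.

```python
import secrets

def new_reset_token(nbytes: int = 32) -> str:
    # token_urlsafe reads from the OS CSPRNG and returns URL-safe Base64
    # text, so the token can go straight into a reset link.
    return secrets.token_urlsafe(nbytes)
```

In PHP the rough equivalent would be building on random_bytes() rather than str_shuffle().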

I wrote about this at length here: https://paragonie.com/blog/2015/08/gentle-introduction-appli...

If anyone's interested, I also curate this reading list on Github: https://github.com/paragonie/awesome-appsec


This is an amazing resource! https://github.com/paragonie/awesome-appsec


I say this with a lot of respect, but this seems pretty noddy for a Martin Fowler article. There's not a lot of meat here and I really don't consider myself an app sec expert :-)


It says 'basics' right there in the title. Everyone has to start somewhere.


I can imagine him muttering to himself, "I can't believe I have to write this." But then I think of all the people I've worked with, and then think of all the "developers" ThoughtWorks must have worked with...


Article was well worth while just to learn about Little Bobby Tables. Still laughing. https://xkcd.com/327/


I was a bit disappointed that it's not called "Patterns of Web Application Security".


Isn't this article going to be updated though? ala http://martinfowler.com/bliki/EvolvingPublication.html


Yes, there will be more sections added over time. From the bottom of the page:

> This article is an Evolving Publication. Our intention is to describe more basic techniques that developers could, and should, use to reduce the chances of a security breach.


Indeed. Even my boss knows this stuff :)


It's a good start, covering SQLi and XSS vulnerabilities. It would be great if it covered CSRF vulnerabilities too. I think those are possibly the three most common dangerous vulnerabilities in web apps, and this page would be a perfect resource if it had those three.


Or things like JSONP, CORS, and directory traversal :-/.



Some notes:

1) Never blacklist. Seriously, unless you are in the business of writing and securing browser parsers, you are never going to catch everything. There are many vectors not listed on OWASP's XSS Filter Evasion Cheat Sheet. See https://www.w3.org/TR/html5/scripting-1.html#restrictions-fo... for an idea of what you're up against.

Because of the incredible number of ways you can cause security vulnerabilities through injection, in 2016 unless you're sending Content-Type text/plain always try and use secure templates. https://github.com/cure53/DOMPurify is nice, but there are likely good options for your language.

I say use secure templating because you need highly contextual encoding. As this article points out, escapes for HTML will not work in Javascript. Neither will they work for URLs. Single quotes are not escaped in most URLencoding schemes for example.

Many an application has been hacked due to a well-meaning engineer trying to prevent open redirect by only allowing urls with '/' at the front in links, not realising that '//x.com' also takes you out to x.com, or preventing '/' at all, not realising that '@x.com' will take you to x.com, or that '../../..' injection can cause requests to any endpoint on the same domain.
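
A hedged sketch of a stricter redirect check, in Python. The `is_safe_redirect` name and its exact rules are my own assumptions; it accepts only same-site absolute paths and does nothing about '../' traversal within the same domain, which needs separate handling.

```python
from urllib.parse import urlsplit

def is_safe_redirect(target: str) -> bool:
    """Return True only for same-site absolute paths such as '/account'."""
    # Reject backslashes and control characters, which some browsers
    # strip or treat as '/' before resolving the URL.
    if "\\" in target or any(ord(ch) < 0x20 for ch in target):
        return False
    # Must be an absolute path, but not scheme-relative ('//x.com').
    if not target.startswith("/") or target.startswith("//"):
        return False
    # Belt and braces: the parsed URL must carry no scheme and no host.
    parts = urlsplit(target)
    return parts.scheme == "" and parts.netloc == ""
```

Where possible, redirecting to a server-side allowlist of known destinations is simpler to reason about than validating arbitrary URLs.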

2) Javascript is not your only enemy, and it is not by any means your most fierce. You can use CSS to exfiltrate secret tokens: http://mksben.l0.cm/2015/10/css-based-attack-abusing-unicode...

Practically any injection can 'redress' your page so it appears drastically different to a user, which is potentially more powerful than just script injection. You can bypass same-origin policy boundaries by coercing the user into typing input into an invisible iframe, if the stars are aligned.

3) Though this article talks about input validation a lot, it doesn't talk about how to actually do it defensively. You need to construct terse expressions that accept only input you know to be safe. Many an application has been hacked by specifying /#[A-Fa-f]{3,6}|.*/ for hex colour codes, where the trailing |.* alternative matches anything at all.
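
A minimal Python sketch of the stricter version. I am assuming the intent was 3 or 6 hex digits including 0-9 (the pattern quoted above has no digits), and `is_hex_colour` is a name I made up.

```python
import re

# Exactly three or exactly six hex digits after '#', nothing else.
HEX_COLOUR = re.compile(r"#(?:[0-9A-Fa-f]{3}|[0-9A-Fa-f]{6})")

def is_hex_colour(value: str) -> bool:
    # fullmatch requires the whole string to match, so no trailing
    # payload (or trailing newline) can ride along.
    return HEX_COLOUR.fullmatch(value) is not None

# The loose pattern from the comment, for contrast: the '|.*' branch
# accepts arbitrary input.
LOOSE = re.compile(r"#[A-Fa-f]{3,6}|.*")
```

Using fullmatch rather than match or search matters as much as the pattern itself.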

Wherever you can, do not deal with raw DBMS language strings (SQL, Javascript) and string interpolation; that's a huge red flag. Instead, use a driver or wrapper that provides injection safety, and use prepared statements.

4) Unless you really want to invest time into learning web security and doing CTFs, don't try to write your own filters. For example, the suggested approach of escaping "'" to "\'" can be bypassed simply by adding a "\" just before the quote: "'" becomes "\'", but "\'" becomes "\\'", which closes whatever single-quoted string you're in.
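
The backslash bypass is easy to reproduce. In this Python sketch, `naive_escape` is a deliberately flawed filter and `first_unescaped_quote` is a hypothetical scanner that reads a string the way a backslash-escaping parser would; both names are mine.

```python
def naive_escape(s: str) -> str:
    # Flawed on purpose: escapes quotes but not the escape character itself.
    return s.replace("'", "\\'")

def first_unescaped_quote(s: str) -> int:
    """Index of the first quote not neutralised by a backslash, or -1."""
    i = 0
    while i < len(s):
        if s[i] == "\\":
            i += 2  # a backslash consumes the character after it
            continue
        if s[i] == "'":
            return i
        i += 1
    return -1

benign = naive_escape("it's")             # it\'s  : the quote stays escaped
attack = naive_escape("\\' OR 1=1 -- ")   # \\' OR 1=1 --  : the quote is live
```

The attacker's own backslash pairs up with the one the filter adds, leaving the quote free to terminate the string.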

5) Serialization is not secure escaping in some contexts. For example, if your endpoint identifies as text/html, but instead returns JSON, I'm going to send some HTML in the JSON string and send a user to the endpoint directly to get XSS.

6) HTML is not your only enemy. XML documents can exfiltrate secrets through XXE and can be coerced into XSS with XHTML islands. Flash is a terrifying thing because it for the most part ignores Content-Types (see: https://miki.it/blog/2014/7/8/abusing-jsonp-with-rosetta-fla...). Requests that flash applications make are just as injectable as any other.

Unfortunately, though this article might cover a fair portion of encoding security, huge issues like effective and secure CSRF protection are not discussed. That's a story for another day. One day there will be a thorough and true guide to security in webapp development, but that is not this day.


> I say use secure templating because you need highly contextual encoding. As this article points out, escapes for HTML will not work in Javascript

I just wanted to reinforce this comment because I work in infosec (mostly web app security) and many of our clients make mistakes in this area. If you don't encode depending on context, you're going to have a bad time.

Most developers are aware of URL encoding and HTML encoding but then you also need to consider other encoding techniques for contexts such as JavaScript and CSS. As an example the single quote when:

URL encoded is %27

HTML encoded is &#39;

JS encoded is \x27

CSS encoded is \000027

It gets way too difficult and cumbersome to manually write encoding code for all these different contexts and deal with the endless number of edge cases. Don't go hunting for single quotes or angle brackets and then replace them with what you think they should be. Rather, you should rely on encoding libraries available in your framework. As an example, in .NET you can import the AntiXSS package and then you have a number of library functions such as CSSEncode(), HTMLEncode() and JavascriptEncode() at your disposal. Similar libraries exist for other major development frameworks.
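
To illustrate how the same character needs a different encoding per context, here is a small Python sketch. The HTML and URL cases use the standard library; `js_string_escape` and `encode_for` are hypothetical helpers written for this example, and in real code a maintained encoder or an auto-escaping template engine is the better choice. Note that Python's html.escape emits the hexadecimal entity &#x27; for a single quote.

```python
import html
from urllib.parse import quote

def js_string_escape(value: str) -> str:
    """Hypothetical helper: hex-escape everything except ASCII letters and digits."""
    out = []
    for ch in value:
        cp = ord(ch)
        if ch.isascii() and ch.isalnum():
            out.append(ch)
        elif cp < 0x100:
            out.append(f"\\x{cp:02x}")
        else:
            out.append(f"\\u{{{cp:x}}}")  # ES2015 code point escape
    return "".join(out)

def encode_for(context: str, value: str) -> str:
    # Pick the encoder by output context; an unknown context raises KeyError
    # rather than silently falling back to something unsafe.
    encoders = {
        "html": lambda v: html.escape(v, quote=True),
        "url": lambda v: quote(v, safe=""),
        "js": js_string_escape,
    }
    return encoders[context](value)
```

The point of routing through one function is that the caller has to name the context explicitly, instead of reusing an HTML escape everywhere.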


To add as well: html attribute encoding vs html encoding as another context.


> Never blacklist.

http://www.ranum.com/security/computer_security/editorials/d...

"The Six Dumbest Ideas in Computer Security"

    #1) Default Permit
    ...
    #2) Enumerating Badness
    ...
The unfortunately common assumption that security is about preventing badness has done an incredible amount of damage to security. We see it everywhere. A huge number of so-called "security" products (such as antivirus) are a futile exercise in trying to enumerate all of the bad things in the world.

The better approach is using proper recognizers built from explicit, formal grammars whenever possible. Meredith and Sergey explained this problem very well in their talk[1] at 28c3. Define the grammar for valid inputs, use a parser generator to avoid bugs, and move on to the next problem instead of endlessly adding checks for bad input.

[1] https://media.ccc.de/v/28c3-4763-en-the_science_of_insecurit...


So, in other words..

    for bad_thing in life:
        delete(bad_thing)

    # program crashes



I like this, but the glaring omission of auth and handling auth information is striking.


"Omission"? This article is about to be extended, and we don't yet know which topics they plan to cover. From the bottom of the page:

> This article is an Evolving Publication. Our intention is to describe more basic techniques that developers could, and should, use to reduce the chances of a security breach.

http://martinfowler.com/bliki/EvolvingPublication.html


wip, but agree.


I'd love to see these more basic security guidelines included in the documentation of frameworks and packages using these technologies, and in the online resources teaching them.

In the last few years plenty of fantastic, user-friendly online learning resources have popped up teaching all the cool things you can do with code but very few of them ever mention security. And all too rarely do github repos mention 'Watch out! This could be dangerous!'.

Teaching security always feels like someone else's problem. I'm all for many more articles like this one!


To be fair, the authors of frameworks probably know more about security than your average user of the framework. The first time I heard of CSRF was when I couldn't get my Django forms working and had to read up on why.


I am not sure about ng-bind-html being dangerous.

"Note: If a $sanitize service is unavailable and the bound value isn't explicitly trusted, you will have an exception (instead of an exploit.) " Source - https://docs.angularjs.org/api/ng/directive/ngBindHtml


A bit of a bummer that http://evil.martinfowler.com/ resulted in Server Not Found.


This is just a wild guess which I can't verify at the moment, but have you tried setting the evil bit in your outbound packet headers, as per RFC3514?


Is this the same Martin Fowler that came up with REST?


Nice, now you understand why some people don't trust REST.

REST is based on a lot of optimism. Security is based on a lot of pessimism.

Building is a sane balance between both. Basically, you build slowly to secure stuff.

Remember, cathedrals were built over centuries by dedicated generations of small teams of highly skilled, uncompromising masons.

That's the way you build security. At the right speed.

OpenBSD is going a tad too fast, and is a tad too big, but it is still quite secure and functional.

The problem is that the industry wishes to make it go faster by throwing in more unqualified manpower, creating a Tower of Babel effect.

Well, it does not seem to work.


Not sure why you think this shows why I shouldn't trust REST...

I can't see how REST is fundamentally insecure! Could you elaborate?


It is not REST that is insecure by itself. It is people following ideas without critical thinking.

Maybe I only had _under-brained_ CTOs, but I always had problems in the details. And even though I have been an architect/sysadmin, recently I went back to coding for others, because I prefer coding.

I saw a lot of mess in signaling error codes, or in changing how they are returned (is it a 5XX error or a { "status" : "bla", "error" : ... }?), or even in using PUT/GET/POST/DELETE correctly.

Wrong handling of cache too... of HTTP headers, media type... authentication ...

Also trying to make distributed transactional systems without synchronized clocks on the servers (and with an insecure NTP source, because editing the default ntpd.conf is sooooo hard). And denying the need for an FSM for distributed REST servers because REST is supposed to be stateless. (lol)

Or the specs of REST limited to a nice HTML page with one example of the "right case", no descriptions of the types, and no information on how errors are handled.

On paper REST is great. In practice ... Oh boy. It is the PHP of messaging.


I think you probably need to read up on RFC 2616 again then, and in particular section 9, method definitions [1].

In terms of status codes, the RFC is very clear: "Response status codes beginning with the digit "5" indicate cases in which the server is aware that it has erred or is incapable of performing the request" [2], so if you are returning a 200 OK when using REST and the server is indicating an error, then you are doing it wrong.

In terms of caching, OPTIONS is not cacheable; GET is cacheable if it conforms to section 13 [3]; HEAD can be cacheable; POST is not cacheable at all; responses to PUT are not cacheable, likewise with DELETE.

As for REST being stateless - well, yes! HTTP is "a generic, stateless, protocol" - and indeed every request you send is an independent transaction unrelated to any previous request.

As for producing a distributed transaction system, sure if you have a distributed set of servers processing REST requests, without some sort of FSM it's going to be hard to implement... I just don't see this as a fundamental issue with REST.

As for "Or the specs of rest limited to a nice HTML page with one example of the "right case" and no descriptions of the types and no informations on how errors are handled": aside from the HTTP RFC, REST is detailed in Roy Thomas Fielding's doctoral dissertation [4], which is a really great read. I recommend reading it if you haven't already, as it covers many of the things you raise as issues.

1. https://tools.ietf.org/html/rfc2616#section-9

2. https://tools.ietf.org/html/rfc2616#section-10.5

3. https://tools.ietf.org/html/rfc2616#section-13

4. https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arc...


> REST is based on a lot of optimism.

REST is based on the observation of the communications patterns developed for the web. Optimism? About what?

And by the way, OpenBSD includes a RESTful server in its core projects.


REST works well over HTTP as it handles hypertext, but it's interesting that Fielding wrote the following:

REST does not restrict communication to a particular protocol, but it does constrain the interface between components, and hence the scope of interaction and implementation assumptions that might otherwise be made between components. For example, the Web's primary transfer protocol is HTTP, but the architecture also includes seamless access to resources that originate on pre-existing network servers, including FTP, Gopher, and WAIS. Interaction with those services is restricted to the semantics of a REST connector. This constraint sacrifices some of the advantages of other architectures, such as the stateful interaction of a relevance feedback protocol like WAIS, in order to retain the advantages of a single, generic interface for connector semantics. In return, the generic interface makes it possible to access a multitude of services through a single proxy. If an application needs the additional capabilities of another architecture, it can implement and invoke those capabilities as a separate system running in parallel, similar to how the Web architecture interfaces with "telnet" and "mailto" resources. [1]

1. https://www.ics.uci.edu/~fielding/pubs/dissertation/rest_arc...


And never run eval() on any user input.


Avoid running eval(), unserialize(), etc. because you're wrong about what isn't user input.
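
If the goal is only to turn text back into plain values, a parser that cannot execute anything is the safer tool. A minimal Python sketch, assuming the input is expected to be a Python literal; `parse_literal` is a hypothetical wrapper name of my own.

```python
import ast

def parse_literal(text: str):
    """Parse a Python literal (numbers, strings, lists, dicts, ...) without executing code."""
    try:
        # literal_eval accepts only literal syntax; calls, attribute
        # access, and names like __import__ are rejected.
        return ast.literal_eval(text)
    except (ValueError, SyntaxError) as exc:
        raise ValueError(f"not a plain literal: {text!r}") from exc
```

Even this is not meant for hostile input of unbounded size, since very large or deeply nested literals can still exhaust memory or the parser, so a data format like JSON with length limits is usually the better default.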


never run eval on anything, unless you have a very good reason.

reminds me of this gem google gave me as a top result once: https://gist.github.com/40/3192269


Seems like an intro to the basics of web app sec. :-(

First step... do not try to do application security until you know it all: until then, find someone who does know all about it.

Before you even start development, research your risks and expectations:

- legal requirements (financial, data protection, healthcare: fines or criminal charges aren't fun, although who knows anyone who got investigated, never mind lost? Enforcement's a joke!)
- ethical (it's not just about money)
- your targets (customers, staff, browsers, servers, network)
- financial (data, financial/payment systems, IP, etc.)

Then:

- Work out what you must do (legal)
- Work out what would be crazy not to do (financial)
- Work out what you should do (ethical)
- Work out what the hell you're capturing that you don't need to (mother's maiden names belong with financial institutions and your mother's family; do you really need them to check for your user? Is a Captcha and some 2FA setup with a mobile and email not good enough for you?)
- Don't have a business model where you catch all and work it out later (Google has billions for lawyers so it can just screw everyone, we're not all Google)
- Document the rest, so you have a baseline for any future analysis.

Other todos:

- keep libraries up to date (OWASP CVE checking helps); if there's a known attack on the mapping library you are using then escaping might not be enough
- worry about URL content (screw REST when it comes to PII: demanding /user/hilary@homeserver.com is not a good idea; never put PII or security tokens there if they might be bookmarked or sent as referers)
- understand HTTP headers (especially referers, CORS, etc.)
- understand how to set up HTTPS servers/clients (that actually authenticate certificates)
- be wary of social engineering risk (one-factor auth is easy to commit to GitHub; no devs should be exposing test servers bound to 0.0.0.0 on Starbucks wifi)
- test your security! System tests and integration tests: definitely not just external testing companies (they're terrible unless they can see your source code)
- don't use security frameworks without reading the documentation (some are dumb enough to have default private keys in there: Apache Shiro, I'm looking at you)

Most importantly of all....

Document and share your security expectations with all of your team. When Graduate Bob walks into work and doesn't know enough about application security, he shouldn't be committing to master in your web app until he does. Ensure expectations are regularly audited and that code reviews and testing reflect the expectations.

If you have a problem with this, leave sensitive data alone: there's lots of jobs in IT that don't base their business model around capturing as much personal data as possible (well, there's a few left).


Somewhat OT: I really dislike the terms "blacklist" and "whitelist". Are there alternatives?


How about denied & allowed?


ayelist/naylist



