The Design of Software is a Thing Apart (pathsensitive.com)
198 points by nancyhua on Jan 16, 2018 | 117 comments



> Those who speak of “self-documenting code” are missing something big: the purpose of documentation is not just to describe how the system works today, but also how it will work in the future and across many versions. And so it’s equally important what’s not documented.

Documentation also (can) tell you why the code is a certain way. The code itself can only answer "what" and "how" questions.

The simplest case to show this is a function with two possible implementations, one simple but buggy in a subtle way and one more complicated but correct. If you don't explain in documentation (e.g. comments) why you went the more complicated route, someone might come along and "simplify" things to incorrectness, and the best case is they'll rediscover what you already knew in the first place, and fix their own mistake, wasting time in the process.
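A tiny, hypothetical illustration of the pattern (the function, the comment, and the numbers are made up):

    import math

    def mean(xs):
        # The "obvious" version is sum(xs) / len(xs), but plain sum() drops
        # small terms next to huge ones (e.g. [1e16, 1.0, -1e16] sums to 0.0).
        # math.fsum tracks partial sums, so the result is correctly rounded.
        # Please don't "simplify" this back to sum().
        return math.fsum(xs) / len(xs)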

Some might claim unit tests will solve this, but I don't think that's true. All they can tell you is that something is wrong, they can't impart a solid understanding of why it is wrong. They're just a fail-safe.


Probably four times a year I find out that defending my bad decision in writing takes more energy than fixing it.

You start saying you did X because of Y, and Y is weird because of Z, and so X is the way it is because you can’t change Z... hold on. Why can’t I change Z? I can totally change Z.

Documentation is just the rubber duck trick, but in writing and without looking like a crazy person.


I do the same. If I'm finding it too hard to describe what I'm trying to achieve in a one-liner comment, I'm probably doing something wrong.

Also, writing it down lets someone else, later, take the role of the duck and benefit from your explanation to yourself.


> If you don't explain in documentation (e.g. comments) why you went the more complicated route, someone might come along and "simplify" things to incorrectness, and the best case is they'll rediscover what you already knew in the first place, and fix their own mistake, wasting time in the process.

I've done this to myself. It sucks. Revisiting years-old code is often like reading something written by someone else entirely, and when looking at an overly complex solution you can be tempted to think that you were just confused when you wrote it and it's easily simplified (which can be true! We hopefully grow and become better as time goes on), instead of realizing that you're missing extra complexity in the problem which is just out of sight.


I have once or twice embarked on what I was sure was a well-thought-out, solid refactoring job, only to find that after a long process of cleaning, pulling out common code, and adding the special-case logic, I had refactored myself in a giant circle.

Every step of the process seemed like a local improvement, and yet I ended up where I started. It was like the programming equivalent of the Escher staircase: https://i.ytimg.com/vi/UCtTbdWdyOs/hqdefault.jpg.


Reminds me of the theoretical model of stored-data encryption - it's the same as encrypting communication, only you're sending data from your past self to your future self :-D


> Some might claim unit tests will solve this

Yes. Tests will solve this. Your point is perfect for tests.

If another experienced coder cannot comprehend from the tests why something is wrong, then improve the tests. Use any mix of literate programming, semantic names, domain driven design, test doubles, custom matchers, dependency injections, and the like.

If you can point to a specific example of your statement, i.e. a complex method that you feel can be explained in documentation yet not in tests, I'm happy to take a crack at writing the tests.
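For instance, a minimal pytest sketch of the "semantic names" idea, using a hypothetical mean()-that-must-use-fsum example (all names invented):

    import math
    import pytest

    def mean(xs):
        return math.fsum(xs) / len(xs)

    # The test name and the failure message, not just the body, carry the "why".
    def test_mean_uses_fsum_because_plain_sum_drops_small_terms_next_to_huge_ones():
        xs = [1e16, 1.0, -1e16]   # plain sum() would make this 0.0
        assert mean(xs) == pytest.approx(1.0 / 3), \
            "regression: did someone 'simplify' math.fsum back to sum()?"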


How do you express "X is a dead end; we tried it and it didn't work because Y, so this is Z" as a unit test? The strength of prose is that it can express concepts with an efficiency and fluency that syntactically-correct code cannot match.

Sometimes you just have to pick the right tool for the job, and sometimes that tool is prose. I think if you get too stuck on using one tool (e.g. unit tests), you sometimes get to the point where you start thinking that anything that can't be done with that tool isn't worth doing, which is also wrong.


Literate commit messages. The tool Trac did a great job of surfacing project activity into timelines and exposing views like that. It's possible with GH, but I'm usually the only one on projects who writes commit messages that aren't dismissive like "words" or "fdsafdasfas"... soooo.... Release Notes are the best I can do for now.

Bigger still is what happens when a project spills beyond a single repo, but not even Google is that big :) :). Apache projects are good models for that kind of documentation IMO, even though the pages have ugly CSS.


> Literate commit messages.

That's a form of written-in-prose documentation like the OP was arguing for, and unlike writing unit tests. Documentation doesn't have to be in a separate file.


Here I'd distinguish between system/integration and unit tests. Unit tests as a whole tend to amount to a mirror of the code base. If a given function f returns '17' and a test validates that fact, all we've done is double check our work -- which has some value, but doesn't protect against the case in which f is _supposed_ to return 18 and both the code and the test are wrong.

OTOH, system tests provide a realm where external implicit/explicit requirements may actually be validated. Perhaps.


IMO tests should never do this unless the function's role is obvious. Testing internals like that prevents internal restructuring (which is bad!). What you should instead prefer is end-to-end testing.

So for a lexer, you'd create a dummy program, with the expected output being each token on a newline, and then test that the tokenizer's output for that input matches what you expect. But you shouldn't test whether the internal functions themselves are correct. The test cases should be designed to throw up any bugs in any of the internal functions, and the end-to-end tests should catch them.

Otherwise you end up documenting what the system currently is, rather than whether the goal of the system is met.
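(A rough Python sketch of the lexer example above; the tokenize() entry point, its module, and the token shapes are invented for illustration.)

    from mylexer import tokenize   # hypothetical module under test

    # End-to-end: drive the whole tokenizer through its public entry point and
    # compare against expected output, without testing internal helpers.
    SOURCE = "let x = 42;"
    EXPECTED = [
        ("KEYWORD", "let"),
        ("IDENT", "x"),
        ("EQUALS", "="),
        ("NUMBER", "42"),
        ("SEMICOLON", ";"),
    ]

    def test_tokenizer_end_to_end():
        assert list(tokenize(SOURCE)) == EXPECTED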


Completely disagree here. If you test this way you'll be sure to have a correct implementation for this dummy program. Further, if the test fails you now have to try and figure out what component caused the failure.

A unit test should test a specific unit; commonly a class. You should have unit tests for the interface of that unit (never private or even protected methods). This absolutely does not prevent internal restructuring, but it drives you toward seeing your classes as individual service providers with an interface.


> If you test this way you'll be sure to have a correct implementation for this dummy program.

That's the point. If the dummy program fails then your implementation is bad.

> Further, if the test fails you now have to try and figure out what component caused the failure.

Have you heard of logging?

> A unit test should test a specific unit; commonly a class. You should have unit tests for the interface of that unit (never private or even protected methods).

This is almost exactly what I suggested. I stated 'dummy programs' for individual tests because it's more modular and closer to actual usage than mocking. If your language has an equivalent, use that.


>Have you heard of logging?

So your solution is to dig through log files to find the problem instead of read the unit test name(s) that demonstrate the failure?

>This is almost exactly what I suggested.

Well, it wasn't clear from your wording. It sounded like you were advocating integration testing instead of unit testing (which is a thing that people commonly do, so that's why I reacted to it).


> a complex method that you feel can be explained in documentation yet not in tests, I'm happy to take a crack at writing the tests.

// This implementation is unnatural but it is needed in order to mitigate a hardware bug on TI Sitara AM437x, see http://some.url

(that's an obvious one; but there are plenty of other cases where documentation is easier than a test)


Turns out this is a great example of why tests are better than documentation.

One way to implement this is by using two methods: one normal and one with the hardware mitigation code, with a dispatch that chooses between them.

This separation of concerns ensures that your normal code runs normally on normal hardware, and your specialized code runs in a specialized way on the buggy hardware.

This separation also makes it much clearer/easier to end-of-life the mitigation code when it's time.
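(A sketch of the dispatch idea in Python pseudocode; is_am437x() and both implementations are placeholders, not a real API.)

    def read_timer():
        # Dispatch once, at the seam, so the normal path stays clean and the
        # errata workaround is easy to find and to retire later.
        if is_am437x():                          # placeholder platform check
            return _read_timer_am437x_errata()   # mitigation for the hardware bug
        return _read_timer_normal()              # normal code on normal hardware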


A. You took it too literally (maybe it's not a hw bug on some exotic processor, maybe it's a mitigation for a security vulnerability that affects a broad set of OS distros).

B. Your solution is not obviously better, it's a different trade-off. And it's not immediately apparent to me how exactly you would write the test for the affected hardware, in the first place. What if the hardware bug is extremely hard to reproduce?


Fair points.

A. When I encounter areas that need specialized workarounds (such as a mitigation for a security vuln) then I advocate using two methods: one for the normal condition and one for the specialized workaround. Same reasons as above, i.e. separation of concerns, easier/clearer normal path, easier to end-of-life the workaround.

B. In my experience tests are better than documentation for any kind of mitigation code for an external dependency bug, because these are unexpected cases, and also the mitigation code is temporary until the external bug is fixed.

Think of it this way: if a team uses just documentation, not tests, then what's your ideal way for the team to track when the external bug is fixed, and to phase out the mitigation code?


> Think of it this way: if a team uses just documentation, not tests, then what's your ideal way for the team to track when the external bug is fixed, and to phase out the mitigation code?

- not all external bugs are fixed, some are permanent (e.g. the hardware issues).

- tests can tell you when something is wrong, but not when something is working. I'm not really sure how a test could tell you that "external bug is now fixed" - and even in cases where that _might_ work, I'm not sure it's a good idea to use a test for that.


Do you think this still holds true if you name all your tests in the format test1, test2 ... testN? If not, then you're in the realm of documentation, not tests, and the descriptive names (which is a form of metadata, just as comments are) of the tests are what is communicating these special cases, and not the test content itself.

Combining the two is good, but let's not act like the tests themselves immediately solve the problem.


My opinion is that test names, function names, variable names, constant names, high-level languages, literate programming, and well-written commit messages all help code to be understandable; I'm fully in favor of using all of these in source code and also in commit messages.

My experience is that documentation is generally a shorthand word that means non-runnable files that do not automatically get compared to the application source code as it changes.

Of course there are some kinds of blurred lines among tests and documentation, such as Cucumber, Rational, UML, etc.; but that's not what the parent comment was talking about when they described the function with a naive/buggy implementation vs. an enhanced implementation that handles a subtle case.

> but let's not act like the tests themselves immediately solve the problem

I'm saying that yes, the tests do immediately solve the problem in the parent comment's question: a test for the "subtle" case in the parent comment immediately solves the problem of "how do we ensure that a future programmer doesn't write a simplified naive implementation that fails on this subtle case?"


That wasn't the parent's case though. The parent's case was an odd implementation that provided correct-enough approximations of the answer in a much more performant way. Neither is incorrect, but the "simplified" version would be a step backwards.

As noted, the comments are for the why, which tests don't tell you without some additional information.


Sometimes I wonder if we should mark tests as "documentation", "internal" or similar. When I get a pile of unit tests I find it really hard to tell which ones are for the overall system vs. testing just a detail that may change.
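pytest, for one, lets you do roughly that with custom markers (a sketch; the marker names are made up and would need to be registered in your pytest config):

    import pytest

    @pytest.mark.documentation   # contract-level: describes the system's promises
    def test_invoices_round_to_whole_cents():
        ...

    @pytest.mark.internal        # detail-level: may change with a refactor
    def test_cache_key_includes_tenant_id():
        ...

    # run only the contract-level tests:  pytest -m documentation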


  import cPickle  # Python 2; on Python 3 use pickle
  def transform(data):
    # pickle files are binary, so open in 'rb' mode
    with open('my.pkl', 'rb') as f:
      model = cPickle.load(f)
      return model.predict(data)


If you change code and break tests doing something that is known to be wrong, you are wasting your time and everyone else's.


Old and somewhat contrived example, but the first thing to pop into my head is the famous fast inverse square root function.

    float FastInvSqrt(float x) {
      float xhalf = 0.5f * x;
      int i = *(int*)&x;         // evil floating point bit level hacking
      i = 0x5f3759df - (i >> 1);  // what the fuck?
      x = *(float*)&i;
      x = x*(1.5f-(xhalf*x*x));
      return x;
    }
I can't think of a way to write a test that sufficiently explains "gets within a certain error margin of the correct answer yet is much much faster than the naive way."

The only way to test an expected input/output pair is to run the input through that function. If you test that, you're just testing that the function never changes. What if the magic number changed several times during development, do you recalculate all the tests?

You could create the tests to be within a certain tolerance of the number. Well how do you stop a programmer from replacing it with

    return 1.0/sqrt(x);
And then complaining when the game now runs at less than 1 frame per second?

Here's a commented version of the same function from betterexplained.com.

    float InvSqrt(float x){
        float xhalf = 0.5f * x;
        int i = *(int*)&x;            // store floating-point bits in integer
        i = 0x5f3759df - (i >> 1);    // initial guess for Newton's method
        x = *(float*)&i;              // convert new bits into float
        x = x*(1.5f - xhalf*x*x);     // One round of Newton's method
        return x;
    }
It's still very magic-looking to me, but now I vaguely get that it's based on Newton's method and what each line is doing, if I ever needed to modify them.

I actually just found this article [0] where someone is trying to find the original author of that function, and no one on the Quake 3 team can remember who wrote it, or why it was slightly different than other versions of the FastInvSqrt they had written.

> which actually is doing a floating point computation in integer - it took a long time to figure out how and why this works, and I can't remember the details anymore

This made me chuckle. The person eventually tracked down as the closest to having written the original thing had to rederive how the function works in the first place, and can't remember exactly how it works now.

I think the answer is both tests and documentation. Sometimes you do need both. Sometimes you don't, but the person after you will.

[0] https://www.beyond3d.com/content/articles/8/


For example, here's one way to write a test that sufficiently explains "gets within a certain error margin of the correct answer yet is much much faster than the naive way".

Using Ruby and its built-in minitest gem:

1. Write a test that does minitest assert_in_epsilon(x,y,e)

2. Write a minitest benchmark test that compares the speed of the fast function with the speed of the naive function.

Notice the big advantage for long term projects: if the hack ever ceases to work then you'll know immediately. This actually happens in practice, such as math hacks that use 32-bit bit shifts that started failing when chip architecture got wider.

> no one on the Quake 3 team can remember who wrote it

Exactly. We have the code file, but not any documentation separate from the code, such as notes, plans, attempts, reasoning, etc.


> Exactly. We have the code file, but not any documentation separate from the code, such as notes, plans, attempts, reasoning, etc.

I agree with you that it can be tested. But it doesn't explain anything about why it works, or what methods the author tried that didn't work as well. If you ever had to make it faster, or make it work better on different hardware, you'd be starting from scratch again.


> If you ever had to make it faster, or make it work better on different hardware, you'd be starting from scratch again.

Optimizations like these are a great area for tests, because the test files can keep all the various implementations and can benchmark them as you like.

This enables the tests to prove that a new implementation is indeed an improvement over all previous implementations, and continues to be even when there are changes in external dependencies such as hardware, libraries, etc.

> But it doesn't explain anything about why it works

IMHO it does, for all the areas expressed in the original link and the parent comment.

Here are the original link examples:

1. "What [TDD] doesn’t do is get you to separate the code’s goals from its implementation details."

IMHO first write the code's goals as tests, then write the method implementations. The tests may need new kinds of instrumentation, benchmarking, test doubles, etc.

2. "[D]oes the description of what Module 3 does need to be written in terms of the description of what Module 1 does? The answer is: it depends."

IMHO write modules with separation of concerns, and with clear APIs that use API tests. If you want to integrate modules, then write integration tests.

3. "Quick! What does this line of code accomplish? return x >= 65;"

IMHO write code more akin to this pseudocode:

    return age >= US_RETIREMENT_AGE

    return letter >= ASCII_A


Write a property-based test, i.e. generate a bunch of random inputs, then assert that all of them are within some (loose) margin of error.
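Something like this, say, with Python's hypothesis library (fast_inv_sqrt() is the hypothetical function under test, e.g. a port or wrapper of the C routine above):

    import math
    from hypothesis import given, strategies as st

    @given(st.floats(min_value=1e-3, max_value=1e6))
    def test_fast_inv_sqrt_stays_within_one_percent(x):
        exact = 1.0 / math.sqrt(x)
        # loose relative-error bound; the bit hack's worst case is well under 1%
        assert abs(fast_inv_sqrt(x) - exact) / exact < 0.01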


> Write a property-based test, i.e. generate a bunch of random inputs, then assert that all of them are within some (loose) margin of error.

So someone comes along later, looks at your test, and wonders: why did you go through all that trouble? You can definitely write tests for a lot of that stuff, but they still don't fluently communicate the why of your choices.


This doesn't satisfy the time constraint though.

    return 1.0f / sqrt(x)
Passes a property-based test, but now your game doesn't actually run because it's much too slow an operation on the hardware of that time.

You can also test execution time too, but that's finicky and doesn't help explain how to fix it if you break that test (if there's no accompanying documentation).


But at this point all you're saying is "I can't think of a way of testing performance".


Performance can be tested in a unit test. You just need to measure the time needed to compute the function on a given set of numbers, then measure the time needed to compute 1.0f / sqrt(x) on the same set of numbers. The test succeeds if your function is 10x faster. In the future, the test may fail because sqrt has improved and this trick is no longer needed.
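A rough sketch of that kind of test (Python with timeit; fast_inv_sqrt() is the hypothetical function under test, and the 10x threshold is the one from the comment above):

    import math
    import timeit

    def test_fast_inv_sqrt_is_still_10x_faster_than_naive():
        xs = [float(i) for i in range(1, 10000)]
        fast = timeit.timeit(lambda: [fast_inv_sqrt(x) for x in xs], number=100)
        naive = timeit.timeit(lambda: [1.0 / math.sqrt(x) for x in xs], number=100)
        # if sqrt (or the hardware) improves enough, this fails and the trick
        # can be retired
        assert fast * 10 < naive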


No I'm saying a test for performance doesn't accurately describe the reasoning behind it without accompanying documentation.

Speed is the entire purpose of this method, not the numerical accuracy.


The "why" is actually more relevant to the point made in the article title than "how it will work in the future and across many versions."


Peter Naur's "Programming as Theory Building" also addresses this topic of a "theory" which is built in tandem with a piece of software, in the minds of the programmers building it, without actually being a part of the software itself. Definitely worth a read: http://pages.cs.wisc.edu/~remzi/Naur.pdf


The biggest problem is when users of software, programmers of software, and the software code itself have 3 different incompatible theories of how it works.

Sometimes it gets worse still: you can have different theories according to (a) scientists doing basic research into physics or human perception/cognition, (b) computer science researchers inventing publishable papers/demos, (c) product managers or others making executive product decisions about what to implement, (d) low-level programmers doing the implementation, (e) user interface designers, (f) instructors and documentation authors, (g) marketers, (h) users of the software, and finally (i) the code itself.

Unless a critical proportion of the people in various stages of the process have a reasonable cross-disciplinary understanding and effective communication skills, models tend to diverge and software and its use go to shit.


This is why dogfooding is so important - you're updating the programmers' model to align with the users' model, reducing the total problem space (and thus the available avenues to get it wrong) by many degrees of freedom.


This is why during software design the first thing I talk about is not the UX flow or the software architecture, but the user's mental model (or the mental model we want to give them).


Seconded. In fact I would suggest it instead of this article - it is much better.


Thirded. It is a classic and was far ahead of its time.

There are some great comments about this buried in https://hn.algolia.com/?query=naur%20theory&sort=byPopularit....


The article looks good but I don't see why that should take away from the OP.

The OP makes excellent points concerning the relative independence of design and code in the context of the "extreme programming" paradigm having become very common if not dominant.


Thanks for linking this, don't think I'd ever seen it before and as someone who's a second generation holder of a very large system's underlying theory it feels extremely accurate, but puts things into terms I never considered before.


Nice, that's great! I hadn't known of it before.


"Bad programmers worry about the code. Good programmers worry about data structures and their relationships."

-- Linus Torvalds


> I shall use the word programming to denote the whole activity of design and implementation of programmed solutions.

Isn't this definition circular, using "programmed" in defining "programming"?


I think for circularity you'd need a pair of definitions -- "programming: making a program" and "program: the result of programming". In this case, we already know what a program (or a "programmed solution") is -- that is, we can tell that something is a program without necessarily knowing how it was made. So the definition at least provides some new information on top of that -- the name for the activity of creating programs.[1] Also, by including the concept of "design", it lets you know when the author says "programming", he doesn't just mean the acts of writing source code, or typing it in.

[1] You could have probably guessed that the name was going to be "programming", but it might not have been.


I would love to see the pendulum swing back around to _good design_ again.

It matters more when designing libraries/frameworks than one-off apps.

Switching to a new framework/platform/language at the point where the old one has finally matured enough that the need for good design is hard to ignore doesn't actually help: you'll still end up back there eventually.


There have been articles over the last few years that have highlighted the dangers of idolizing and prioritizing innovation. I too hope for increased attention to craft and design of software in the future.


The OP doesn't seem to understand TDD:

"So you update the code, a test fails, and you think “'Oh. One of the details changed.'"

Some of the concerns they raise about writing tests are covered by Uncle Bob here: http://blog.cleancoder.com/uncle-bob/2017/10/03/TestContrava... and here: http://blog.cleancoder.com/uncle-bob/2016/03/19/GivingUpOnTD...


Good design is difficult. That's the easy-to-understand part. What is difficult for me to understand is why this skill is just ignored. There are lots of skills that are difficult, but people still persist in learning them. Not so for software design. It-works-somehow-for-now seems to look good enough for most. This also results in "OOD is difficult, FP will save us. Oh no, FP does not really save us, FRP for sure will". Sorry guys, you will need to break some eggs to make an omelette.


Seems to me there might be ways to program that convey more information. For example flow-based programming (FBP) seems like it might help and should help make the flow of the program explicit and obvious. That is, inherent to the code is a high level overview of what it does.

From my own limited experience it can make explaining a program to someone new almost trivial. You just use the various flows defined as almost-visual guides to what is happening. I don't want to say FBP is a silver bullet, but I think it points to the idea that it is possible to capture much more of the theory and design of the program in the code.


We’ve increased our productivity by quite a lot over a five year period by ditching most testing on smaller applications.

Basically our philosophy is this: a small system like a booking system which gets designed with service design, and developed by one guy, won't really need to be altered much before its end of life.

We document how it interfaces with our other systems, and the as-is + to-be parts of the business that it changes, but beyond that we basically build it to become obsolete.

The reason behind this was actually IoT. We’ve installed sensors in things like trash cans to tell us when they are full. Roads to tell us when they are freezing. Pressure wells to tell us where we have a leak (saves us millions each year btw). And stuff like that.

When we were doing this, our approach was "how do we maintain these things?". But the truth is, a municipal trash can has a shorter lifespan than the IoT sensor, so we simply don't maintain them.

This got us thinking about our small scale software, which is typically web-apps, because we can’t rightly install/manage 350 different programs on 7000 user PCs. Anyway, when we look at the lifespan of these, they don’t last more than a few years before their tech and often their entire purpose is obsolete. They very often only serve a single or maybe two or three purposes, so if they fail it’s blatantly obvious what went wrong.

So we’ve stopped worrying about things like automatic testing. It certainly makes sense on systems where “big” and “longevity” are things but it’s also time consuming.


the information of a program’s design is largely not present in its code

And that's the problem. We need ways to make those higher level designs (~architecture) code.


This is the problem that I've run into trying to use formal methods.

I love them, I can express some things very concisely and even clearly. But there's no direct connection to the code and so keeping things synchronized (like keeping comments synchronized with code) is nigh impossible.

We need the details of these higher level models encoded in the language in a way that forces us to keep them synced. Type driven development seems like one possible route for this, and another is integrating the proof languages as is done with tools like Spark (for Ada).

This will reduce the speed of development, in some ways, but hopefully the improvement in reliability and the greater ability to communicate purpose of code along with the code will also improve maintainability and offset the initial lost time.

And by keeping it optional (or parts of it optional) you can choose (it has to be a conscious choice) to take on the technical debt of not including the proofs or details in your code (like people who choose to leave out various testing methodologies today).


>use formal methods.

My admittedly very brief experience with formal methods was that they were actually less close to the "design" of the software than the code. So not sure that's a direction that will get us anywhere we need to go.

>in a way that forces us to keep them synced.

Why "synced"? Wouldn't it be better if those higher level designs were actually coded up and simply part of the implementation, but at a level of abstraction that is appropriate to the design.

We used to increase our level of abstraction, but now we appear to have been stuck for the last 30-40 years or so. At least I don't see anything that's as much of a step up from, let's say, Smalltalk as Smalltalk was from assembly language.


Gilad Bracha wandered off to work on progressively typed languages after he’d had enough of trying to fix Java’s type system.

I think if you took something like JSDoc and gave it more teeth, you could do something like this in just about any of the dynamically typed languages.


Well, he also "wandered in" from doing optionally statically typed languages...see Strongtalk and Newspeak. :-)


Especially in the functional space there are some good ways to do DDD, although I don't see why we limit this to functional languages.

https://fsharpforfunandprofit.com/series/designing-with-type...


See my recent comment:

> We need model based editing environments that will allow us to have a much richer set of software building blocks.

https://news.ycombinator.com/item?id=16117668


I've had similar thoughts, notably as a way of side-stepping the composability limits of current parsing theory. But these limitations are increasingly worked around...

And looking at rust, I can start to imagine a future where macros are powerful enough to support a lot of declarative coding.

When coding javascript today I write code like:

    // can be imported, and api.router() mounted in express
    let api = new API(...);
    module.exports = api;    

    api.declare({
      method: 'get',
      route: '/hello-world',
      description: `bla bla bla...`,
      scopes: {AnyOf: ['some-permission-string']},
      // (more properties)
    }, (req, res) => {...});
Effectively making large parts of the app declarative. It's still far from powerful enough. But I'm not sure giving up text is the way to get more powerful building blocks.

Declaring JSON + function is super powerful in JS. In rust macros might allow us to make constructs similar to my "API" creator, but with static typing. And who knows maybe macros can expose meta-information to the IDE...


Nice.

But also a workaround, right? Because it's not actually declarative, it's APIs that are made to look declarative-ish.

So there's several layers of mismatch, for example most of the active ingredients being strings, meaning you're coding mostly in the string-language embedded into JS.

We do that a lot.

Probably time to start looking at our workarounds (and Macros are another workaround) and figure out what we are actually trying to do.


> But also a workaround, right? Because it's not actually declarative, it's APIs that are made to look declarative-ish.

I'm not entirely sure what you mean...

In an ideal world "API" would be a keyword, similar to how "class" is a keyword for declaring/defining classes. Example:

    API MyApi {
      constructor(defaultName) {
        this.defaultName = defaultName;
      }
    
      /** some description */
      method: get
      route:  /hello-world
      scopes: some-permission-string || other-permission-string
      {
        let name = request.query.name;
        if (!name) {
          name = this.defaultName;
        }
        response.send('hello ' + name);
      }
    
    }
    module.exports = MyApi;  // similar to how you would export a class in JS

I think something _like_ this will be doable using rust macros in the future (if not already on nightly).

Whilst my JS code, where I create an API object and call API.declare(...) for each route isn't as neat as having an "API" keyword and code-generation for said keyword, it's pretty close.

I just write the API.declare(...) as something that collects arguments, and then those can be instantiated multiple times... Similar to how a class can be instantiated multiple times.

I do similar things for loading components that depend on each other: declare dependencies, define function for loading using said dependencies -- then have some library code construct a function that loads any desired component with maximum concurrency (by analysing the DAG).
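A rough sketch of that loader pattern, in Python asyncio for brevity (the names and shapes are made up; assumes the dependency graph is acyclic and every dependency name also has a loader):

    import asyncio

    async def load_all(deps, loaders):
        # deps: name -> list of dependency names; loaders: name -> async factory
        # that receives its resolved dependencies as arguments.
        tasks = {}

        def ensure(name):
            if name not in tasks:                  # each component loads only once
                tasks[name] = asyncio.ensure_future(load(name))
            return tasks[name]

        async def load(name):
            resolved = await asyncio.gather(*(ensure(d) for d in deps.get(name, [])))
            return await loaders[name](*resolved)

        for name in loaders:                       # start everything up front so
            ensure(name)                           # independent parts run concurrently
        loaded = await asyncio.gather(*(tasks[n] for n in loaders))
        return dict(zip(loaders, loaded))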


We can't and don't have to give up text.

Text is essential for humans and could be a significant part of the interface (both reading and typing).

The problem is more about the underlying unit being text.

The problematic part of text is when we just write it freely and then have to parse it back and make sense of it which has very severe consequences.

This is because in the programming environment world we are forever stuck with a "text editor" mindset.

This means we shift the complexity away from the environment and pass it on to the compiler and the programmer which is not a good place for it to be.

So you can use "nano" to write very complex programs. Some people consider that a benefit, and in one way it is. However it is also simultaneously very problematic. Because it perfectly demonstrates just how far removed the authoring tool is from the problem domain.

The way I see it is that to unlock a new generation of software development experiences there's no way but to accept that we can't forever confine the act of programming to the primitive act of writing text like a book in a text editor.

The environment needs to be much more closely connected to the problem domain so that it can further empower you to do stuff.


> The problem is more about the underlying unit being text.

I think richer editing, in effect editing the AST rather than the concrete syntax, is interesting.

But I'm not convinced that it's a blocker for making more declarative code. I think macros will get us very very far. The procedural stuff in rust, has me really excited about the future.

The downside of not using text as the underlying unit of truth is that you have to reinvent an IDE, version control, conflict resolution, review tools, etc. That takes a LOT of buy-in.

If you can get 90% of the syntactic sugar required from macros (and the like), you can work on the actual abstractions/declarations instead of the compiler toolchain.

Just my two cents.


I don't think it's possible (at least with today's technology). The high-level design/specification is intentionally vague; if it wasn't vague, we wouldn't need the low-level code, we could have a compiler generate it from the high-level specification.

As far as we can tell, the technology that can create a piece of exact code from a vague specification is called strong AI.

Heck, we don't even have a language to describe vague specifications without loss of fidelity. We don't know if such a language can exist.


I think there is at least one and probably several layers between "code as is written now" and "vague intents".

Of course I could be wrong.


I'm not sure that's possible in any truly meaningful way. Design is a very high level of abstraction that expresses a world, a particular view of that world with regards to a general set of problem domains, and a set of principles and theories about acting within that world. Code is a means (and not the only means) of achieving those actions.

This is not unlike the domains of philosophy, morality, ethics, and law. Attempting to express or enforce philosophy and morality via legalism is an exercise in futility, and even ethics which appears to be on the same level as law actually isn't since the presumption of ethics is behavior even in the absence of a law.


It seems kind of magical to be able to encode the intent of a program outside of its actual function. Theoretically this is what comments are for, but obviously those have zero enforcement value at the compiler level.


That's because the 'why' is more powerful; from it you can infer the 'what' and 'how'.


I guess that is at least sort of what they are working on at VPRI.

http://www.vpri.org/


Absolutely, they're probably very much at the top of people trying to solve this.


Executable user stories?


This is a lovely article. Software is possibly a) an errant and b) a misinterpreted operational semantics of some other semantic horizons of contractual or implicit expectations. Knuth's Literate Programming was onto something. We inhabit a world of word problems and even faulty realizations of rarer formal specifications. Claims concerning "phenomena in the world" drive maintenance and enhancement regimens.


Worse still, most of us walk around under the delusion that we know what we want while others can see it doesn’t make us happy.

How do you get the product you want when you don’t know what you want?


The hard part is reaching a committed niche in the user base, whether paying customer or audience kinds. Software tools generalize from and for niche sponsors with more specific needs. Software growth is then the "consensual delusion" of feature set and paying constituency accretion. Platform heterogeneity "churn" offers both big risks and big opportunities. Never a dull moment in software development.


Just buy Apple. They know what you want.


You may be joking, but I think the way in which this is true explains Apple’s success, even though they’ve generally released products that are less featureful and significantly more expensive than their competition.



Wouldn't it be better to use data abstraction instead of abusing primitive types?

For instance dates are often abstracted as a Date type instead of directly manipulating a bare int or long, which can be used internally to encode a date.

So age, which isn't conceptually an int (should age^3 be a legal operation on an age?), could be modelled with an Age type. This, on top of preventing nonsense operations, also allows automatic invariant checking (age > 0) and encapsulation of the representation (for instance, changing it from an int representing the current age to a date of birth).
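A minimal sketch of such an Age type in Python (the retirement-age constant is just the example from the article and earlier comments):

    from dataclasses import dataclass

    @dataclass(frozen=True, order=True)
    class Age:
        years: int

        def __post_init__(self):
            if self.years < 0:                 # invariant checked at construction
                raise ValueError("age cannot be negative")

    US_RETIREMENT_AGE = Age(65)

    def is_retirement_age(age: Age) -> bool:
        return age >= US_RETIREMENT_AGE        # no age**3 or other nonsense ops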


return x >= ‘A’;

Would be better than

return x >= ASCII_A;

surely. ASCII_A could be set incorrectly, or have a dumb type, and is more verbose anyway. By using the character directly, the code speaks its purpose.


>ASCII_A could be set incorrectly, or have a dumb type, and is more verbose anyway. By using the character directly, the code speaks its purpose.

I disagree. ASCII_A speaks its purpose (we purposefully want an ASCII A stored here). And one can check the constant's definition and immediately tell if it's correct. E.g.

  const ASCII_A = 'A' // correct

  const ASCII_A = 'E' // wrong
So:

  return x >= ASCII_A
tells us the intention of the code's author.

Whereas:

  return x >= ‘A’;
only tells us what the code does, which might or might not be correct (and we have no way of knowing, without some other documentation).

So, by those two lines:

  const ASCII_A = 'E';
  (...)
  return x >= ASCII_A;
We know what the code is meant to do, AND that it does it wrongly (and thus, we know what to fix).

This line, on the other hand:

  return x >= ‘A’;
tells us nothing. Should it be 'A'? Should it be something else? We don't know.


How do you know that it's the "E" that is wrong, and not the ASCII_A? Maybe it should be ASCII_E.

(If you say it's because it's written twice, well, that's only a valid clue if ASCII_E doesn't happen to be defined too.)


>How do you know that it's the "E" that is wrong, and not the ASCII_A? Maybe it should be ASCII_E.

Ultimately you don't, but ASCII_A requires double the intentional actions to name it and have it also be 'A', whereas 'A' vs 'E' or whatever else is a much easier typo.

It's the whole idea behind NOT having magic values in your code. That is, that:

  if (temp > 212)
tells us much less than:

  if (temp > WATER_BOILING_TEMP)
and that we can more easily spot an error with:

  WATER_BOILING_TEMP = 275
than with:

  if (temp > 275)


Ultimately you don't, but ASCII_A requires double the intentional actions to name it and have it also be 'A', whereas 'A' vs 'E' or whatever else is a much easier typo.

Unless, as I wrote after, you have both ASCII_A and ASCII_E declared, which wouldn't be surprising.

I don't find the "spot the error" argument to be very convincing; I still name stuff, but just for the semantic value.


275°F is about right at a little over 3 bar.

Or 275°C at around 60 bar.


return x >= "A"; // ascii A

Gets the whole message across in one line, as does using 65 with the comment.


Comments aren't so good - now you have 2 things to change when the program changes. In the real world, often the comment won't be updated and will become actively misleading/wrong/bad.


(Ignoring the typo "A" != 'A')

return x >= 'A';

already and only means ascii A. Is there a C compiler anywhere, now or likely in the future, where 'A' in C is NOT ascii A? The comment is redundant if correct, and could be wrong after an edit, so it has no value.


>return x >= 'A'; already and only means ascii A.

See, here's where you are wrong.

  ASCII_A = "A"

  alphas = ["Α", "А", "Ꭺ", "ᗅ", "ꓮ", "A", "𐊠", "A", "𝐀", "𝖠", "𝙰", "𝚨", "𝝖"]

  for c in alphas:
      print c == ASCII_A
Output?

  False
  False
  False
  False
  False
  False
  False
  True
  False
  False
  False
  False
  False
Several of the numerous possible UTF-8 alphas. Those are not A in different fonts -- they are different Unicode characters that look like A. And depending on your font, they could look absolutely the same as plain ASCII A (of which only one, towards the middle of the list, is). And depending on your locale and keyboard language settings, one of them could be as easy to type as the regular English A in ASCII.


I deliberately used the character literal ‘A’ and not any of your UTF8 strings. I think you are mistaken to confuse a character with your strings. Is this wrong?


You can have a unicode character literal -- and depending on the language there's no distinction between character and string (at the type level), a character is just a string of length 1.


I was assuming C, where there is a difference.

int main() { printf( "%d %d\n", 'A', "A" ); return 0; }

produces: 65 197730221

since the value of string "A" is its base address.


Be careful with your quotes (depending on which pseudo-language this is.)


Without the convenience of autocomplete and re-use in other places in the code, and with a comment that can always get out of sync with what the code does much easier than a named constant.


My comment was a bit weak. Putting something more of a requirement or design intent in the comment is better. Having it all there can be better than a well described constant with a definition somewhere else. Sure, they could get out of sync but at least you'll be able to see the discrepancy right there on that line if you're looking. But to each their own.


You must be one of those people who writes stuff like #define TWO 2


In this strawman example, perhaps. However, code is usually surrounded by other code. So you could have the 'A' in multiple places. By using an explicit identifier you are protecting yourself against typos (depending on the language, it could be a compile-time error or at worst a very clear runtime error instead of a logic error). The other benefit of ASCII_A is that you are signalling that you are doing ASCII comparisons as opposed to using 'A' as a placeholder for a special value of 65 & thus be confusing the reader (e.g. some spec says 65 is some kind of magic value). Finally, by having an ASCII_A it provides you with the opportunity to add documentation explaining why this constant is the way it is (why not 'B'). The benefits scale with the number of instances (e.g. if that specific 'A' appears multiple times in a file, you wouldn't be able to document it in 1 spot).

Of course, all of this is likely overkill for your specific example. If I'm writing a to_hex routine, I'm not going to extract those constants as the context & commonplaceness of the algorithm makes it redundant. For the same reason that one might write i++ in a for loop instead of i += ONE. However, extracting inline constants to named variables is frequently something I look out for in code review, especially the more frequently the same constant appears in multiple places, the more difficulty a reader might have trying to understand why that value is the way it is (or if there's any discussion at all), or if it's a value that will potentially change over time. The negative drawbacks of extracting constants is typically minimal & with modern-day refactoring it's a very small ask of the contributor.


> The negative drawbacks of extracting constants is typically minimal

> ASCII_A

It comes down to naming and purpose.

The example, ASCII_A, is terrible because it doesn't describe the purpose with its name.

What will end up happening in any large codebase is ASCII_A will get reused in dozens of different places for dozens of different reasons.

If it was named minValidLetterForAlgorithmX it would convey intent and be more likely to be used correctly.


I'm partial to ALPHA_START or FIRST_LETTER. While it's true that 'A' is both, the naming helps communicate that the context is range-testing for alphabets inside a larger character set.


> In this strawman example, perhaps.

I'm not so sure it's a straw man; I often see defining constants like this cargo-culted even when there are only one or two uses. In that case 'A' is great because its value is right there: I don't have to look at the assignment and then go look up what the actual value is, so it's more readable.

When it's used in several disparate places then ASCII_A is better and your arguments about correctness should take precedence, we sacrifice some readability but it's worth it.


It's a strawman in the sense that it's completely devoid of context with a contrived example. FWIW, I found 0 instances of something like this on GitHub (https://github.com/search?q=%22ASCII_A%22&type=Code&utf8=%E2...). I concur that cargo culting it to the extreme can lead to absurdness, but that's true of all maintenanability rules of thumb. Any rule of thumbs can be over-applied. However, in my experience the inverse is generally more true.


Sure, I understand. The surrounding code would include the type of x, which, if char, would help understanding even more.

But you’re channeling some crazy madness suggesting that someone would use ‘A’ to mean 65. Shudder. I guess we’ve all seen some horrors over the years.


>But you’re channeling some crazy madness suggesting that someone would use ‘A’ to mean 65.

Or just an encoding scheme.


Where 65 means ‘A’? Madness.


You could have a binary file format with a header of ABBA. You could choose to check the signature by doing an integer comparison of 65666665, 0x41424241 or "ABBA". Like I said, ASCII_A is a bit silly, but the maintenance value of extracting constant literals to constant variables with an explicit name & documentation explaining where the constant comes from is pretty solid, at least in my experience.


A is 65 in ASCII. http://www.asciitable.com


I know. It was a joke.


He says that in the article:

> ASCII_A (usually spelled just 'A')

Of course, they are not the same thing. In the last 6 months I've worked on a very old system that uses not-quite-ASCII. 'A' was 65 but '#' wasn't 35.


There's the theory that any constant hardcoded directly in code is a bad idea. It may be used more than once, or used only once now but more than once in the future, or the value may change in the future; if it's used in more than one place, this is a source of issues.


I get that using hard-coded constants is a bad idea, but using ASCII_A instead of 'A' is about as sensible as using SIXTY_FOUR instead of 64.

If A signifies something else, use that name; otherwise just use plain 'A': it already gives us as much information as needed, and has one less place where the programmer can screw up.


I get that in general. It depends if the code is meant to inspect the character x on this machine right now, or really the ASCII character x.

As an aside, if someone changes the constant value of ‘A’ now, the world will be broken for a while. (But my code would recompile correctly unchanged with the new standard header.)



I wish websites wouldn't change the browser's default scrolling behavior.


I misread the title as "the design of software is a thing of the past". I welcome the actual title and content though.


I lost interest after 'that is a fatal mistake'.

Fatal mistake? Really? An unrecoverable failure?

So, none of the software I've written in the last decade worked, despite all evidence to the contrary?

Right.





