Long day, lots of driving, and then a two hour BJJ class, and then a Spanish study session. My brain is exhausted so this is a bit rambling, sorry:
No, they aren't talking about pure functions. Look into design by contract, which brings this idea to the forefront. Invariants are properties that you state about a system that should always hold true over some portion of code. They are related to pre- and postconditions (which are more commonly understood and used, sometimes even explicitly, whereas invariants tend to be covered less explicitly in code).
Pre- and postconditions are properties that hold true before or after, respectively, some segment of code (often defined at function boundaries). Like: an input, even though its type may be int, should always be positive. That's a precondition: you can encode it in some languages, cover it with an assert, or just "know" it (far more common). Postconditions are similar: at the end of the computation, some properties hold about the returned value or the way the state of the system has changed.
Some languages (SPARK/Ada, for instance) even let you prove that if some properties are true (basically, preconditions become assumptions about the system) then your code will result in the postcondition being true (though not for everything you may want; they're working on pushing the boundaries of what can be proved). And if the system can't prove it, but you believe it to be true, you add other properties about the system or tell it to assume certain things that you may be able to prove by hand (or whatever) but not programmatically. Or perhaps the system's proof attempt reveals a problem where the computations will violate one of the conditions. Languages like Idris permit encoding many (most?) of these ideas into the type system, which has its own accompanying proof system.
int foo(int n) {
    // pre:  1 <= n && n <= 10
    // post: 1 <= result && result <= 10
    return n * 10;
}
A prover (or anyone with common sense; it's a short example, use your imagination for how it could be more usefully extended) can show that, given the assumption of the precondition and the action of the code, the postcondition is not guaranteed to hold: any n greater than 1 yields a result greater than 10. In some languages, without the ability to prove it (or even with it), you may use asserts (often people disable these for "release" builds because they can be a performance hit, so hope your testing and code reviews are good):
#include <assert.h>

int foo(int n) {
    assert(1 <= n && n <= 10);           // pre
    int result = 10 * n;
    assert(1 <= result && result <= 10); // post: fires for any n > 1
    return result;
}
Invariants are related to those: they are properties that should hold over some period (perhaps the entire program, perhaps just parts). A common application is the idea of a loop invariant, something that is expected to be true about the state of execution of a loop across all iterations. As a trivial example, take a sorting algorithm with two indexes (quadratic time worst case; let's not worry about the efficiency of this, brain tired). Your outer loop determines the lower bound of the next inner loop:
for i from 1 to size
    min = i
    for j from i+1 to size
        if a[j] < a[min] then min = j // not a good sort, don't do this at home kids
    swap a[i] and a[min]
Then you have two invariants. First, the portion of the sequence from 1 to i is sorted (the element at position i may not be its final one until after the swap, but the prefix is sorted); this holds both before and after the swap. Second, min is always the index of the smallest value in the subsequence from i to size by the time we reach the swap. Like I said, this one is trivial, but it's about all my brain can handle right now.
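To make that concrete, here's a minimal C sketch of the same sort with both invariants checked as asserts. It's 0-indexed (so the prefix runs 0 to i), and is_sorted and selection_sort are my names, not anything from the pseudocode:

#include <assert.h>
#include <stddef.h>

// 1 if a[lo..hi] (inclusive) is in nondecreasing order
static int is_sorted(const int *a, size_t lo, size_t hi) {
    for (size_t k = lo; k < hi; k++)
        if (a[k] > a[k + 1]) return 0;
    return 1;
}

void selection_sort(int *a, size_t size) {
    for (size_t i = 0; i < size; i++) {
        size_t min = i;
        for (size_t j = i + 1; j < size; j++)
            if (a[j] < a[min]) min = j;
        // invariant 2: a[min] is the smallest of a[i..size-1]
        for (size_t j = i; j < size; j++)
            assert(a[min] <= a[j]);
        // invariant 1 holds before the swap...
        assert(is_sorted(a, 0, i));
        int tmp = a[i]; a[i] = a[min]; a[min] = tmp;
        // ...and after it
        assert(is_sorted(a, 0, i));
    }
}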
You can also see invariants in the way abstract data types are used. For a simple example, suppose you had (for reasons) a structure called unitary_complex with these two fields:
unitary_complex { real, imaginary }
We can establish an invariant: the length of this complex number will always be 1 (making it unitary). A reason OO languages often encourage the use of getters and setters (setters, in particular) is precisely to ensure that invariants like this are enforced. Raw access to the struct means the responsibility to maintain the invariant is scattered across the codebase: everywhere real is modified, imaginary may or may not need to be modified as well. So everywhere you use this type you have to be aware of the invariant and maintain it, and that's a great deal of complexity to introduce. And if somewhere you do:
ui.real = 10; ui.imaginary = 20;
There is no stopping you. But if you have to go through a method (in OO languages) to modify it (for instance, you can only rotate the pair; math happens behind the scenes), or if the type is "opaque" (meaning that outside its primary module, and perhaps a few limited "friend" modules, the internals cannot be seen or modified, which can be done in C and most languages, really), then you can contain the enforcement of the invariant in one place, or at least fewer places. Again, tired brain, use your imagination to consider any type you may want to constrain by some property that should always hold. Many data structures in your everyday programming language have this kind of invariant (a set never has more than 1 of an item; a dictionary never has more than 1 of a key; a sorted collection is, well, sorted).
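As a minimal sketch of the opaque-type version in C (the uc_* names and the header/implementation split are mine, purely for illustration): callers only ever see a pointer, so the only code that can touch real and imaginary, and thus the only code that must maintain the invariant, lives in one file.

/* unitary_complex.h -- callers see only an opaque handle */
typedef struct unitary_complex unitary_complex;

unitary_complex *uc_create(void); /* starts at 1 + 0i */
void uc_rotate(unitary_complex *uc, double radians);
double uc_real(const unitary_complex *uc);
double uc_imaginary(const unitary_complex *uc);

/* unitary_complex.c -- the ONLY place the invariant must be maintained */
#include <math.h>
#include <stdlib.h>

struct unitary_complex { double real, imaginary; };

unitary_complex *uc_create(void) {
    unitary_complex *uc = malloc(sizeof *uc);
    if (uc) { uc->real = 1.0; uc->imaginary = 0.0; } /* length 1 */
    return uc;
}

/* Rotation preserves length, so the invariant holds by construction. */
void uc_rotate(unitary_complex *uc, double radians) {
    double r = uc->real, im = uc->imaginary;
    uc->real      = r * cos(radians) - im * sin(radians);
    uc->imaginary = r * sin(radians) + im * cos(radians);
}

double uc_real(const unitary_complex *uc)      { return uc->real; }
double uc_imaginary(const unitary_complex *uc) { return uc->imaginary; }

Now ui.real = 10; won't even compile outside that one file, so the invariant is enforced in a single place instead of everywhere the type is used.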
Invariants are sometimes "relaxed", that is, there may be a portion of the program where they don't quite hold. In a more complex example, where you cannot guarantee a sequence of statements executes atomically, you may only be able to declare something invariant at a point before or after the sequence (which makes it similar to preconditions and postconditions in some sense):
total_cost = 0
bag = {}
// invariant: total_cost is always the sum of the costs of the items in the bag
// adding an item:
bag.add(tv)
total_cost += 1000 // only after this is the invariant true again
The invariant is not actually true at the moment between those two statements, which is helpful to recognize. In a concurrent system it suggests we need some kind of control to ensure those two statements are treated atomically (they both complete together). In fact, this is the sort of "trivial" thing that TLA+ and similar systems can help you discover. For a real-world case, I used TLA+ to show that the communication model between two processors using shared memory (it was the system architecture we had, not something we could alter) could permit, basically, a section to be marked "ready" (for reading) when it wasn't actually ready, which meant the reader could end up with garbage data. Nothing to do with pure functions, just defining invariants, preconditions, and postconditions and observing what the available actions could do (and in that case did) that would violate them.
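A sketch of that "some kind of control" in C, assuming a mutex is an acceptable mechanism here (the store struct and its fields are mine, and the bag is just a fixed-size array of costs to keep it short):

#include <pthread.h>

typedef struct {
    pthread_mutex_t lock; /* initialize with pthread_mutex_init() before use */
    int total_cost;       /* invariant: sum of item_costs[0..item_count-1]   */
    int item_costs[64];   /* toy bag: just costs, fixed capacity, no checks  */
    int item_count;
} store;

void store_add_item(store *s, int cost) {
    pthread_mutex_lock(&s->lock);
    /* The invariant is false between these two statements, but holding
       the lock means no other thread can observe that broken state. */
    s->item_costs[s->item_count++] = cost;
    s->total_cost += cost;
    pthread_mutex_unlock(&s->lock);
}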
https://en.wikipedia.org/wiki/Design_by_contract

PSA: The above DbC Wikipedia page contains a link to the paper "Applying Design by Contract" by Bertrand Meyer, which is THE paper to read to learn usage.