On Programming Language Complexity, or Our Brain as a CPU with 7+-2 Registers

	Author:	“No Bugs” Hare
	Job Title:	Sarcastic Architect
	Hobbies:	Thinking Aloud, Arguing with Managers, Annoying HRs, Calling a Spade a Spade, Keeping Tongue in Cheek

After writing about how convoluted (and often counter-intuitive) the results of integer conversions are in C/C++ [NoBugs18], I found myself in more than one discussion where the main line of argument was along the lines of “hey, as long as the behavior is well-defined in a consistent manner, it is ok; it is our job as developers to know it”. While trying to articulate an IMO-very-obvious counterargument (which boils down to “certain things, while being strictly defined in a very formal way, are utterly unusable”), I came up with an interesting analogy which I want to share.

Miller’s Law and 7+-2

In 1956, a seminal article [Miller56] was published, which is often interpreted as “the number of objects we can keep in working memory is 7+-2”.

Since then, a lot of further research has been published (arguing that it is not 7+-2, but rather 4+-1, with the numbers 5 and 6 also being mentioned in this context); much more complicated models have also been proposed, with debates about subtle differences between “short-term memory” and “working memory”. However, all the research I know about points to an extremely limited size of both short-term memory and working memory, and this is all that really matters for our purposes; arguing whether it is really 4 or 9 won’t change our reasoning in any significant manner.¹

As “7+-2” sounds nice (and as none of the other numbers mentioned above would make any difference to our reasoning), for the purposes of this article let’s use “7+-2” as a synonym for “a very limited number”.

¹ As a side note: most of the criticism of how Miller’s Law is interpreted in the industry comes from those UX people who’re arguing for longer-than-7-item menus etc.; I don’t know for sure whether Miller’s 7+-2 applies there, but what I do know for sure is that having more than half a dozen non-ordered-non-structured items in a user-presented mandatory choice usually qualifies as a Really Bad Idea(tm); so does giving in to management’s desire to have the user fill in a 20-field form merely to start playing a free game.

#define Brain “CPU with 7+-2 Registers”

As we’re programmers, for us the simplest way to represent this concept of short-term (or “working”) memory with 7+-2 slots is to consider our brain as a CPU, which, as usual, consists of the control unit, ALU, and registers. And, according to our interpretation of Miller’s Law, the number of registers in our “brain CPU” is limited to 7+-2.

Of course, it is an extremely simplified view – but for our purposes, it will do.

Unnecessary Complexity in Programming Languages Eats All-Valuable Registers in Our Brain CPU

Now, armed with this approximation of our brain – it becomes easy to explain why having not-so-obvious rules in the language is a Bad Thing(tm) for developer’s performance:

it is just because we need to remember about these rules all the time, which takes one of our Brain CPU Registers

That’s it. What happens next is a process which is extremely well-known to any developer of high-performance C/C++ code: if we’re eating even one register with something-which-is-not-really-necessary, our compiler will find a way around it, but usually at the cost of spilling values to memory, i.e. lots of PUSH/POP-style stores and loads, which are relatively expensive on modern CPUs.² As a result, eating even one single register out of our available ones will cause a significant slowdown (BTW, the number of registers in our brain is similar to that of 32-bit x86 with its 8 general-purpose registers, where such effects were very pronounced).

² compared to register-register ops, which take less than 1 cycle on average, a read/write even to L1 cache tends to cost 3-5 cycles

Back to C/C++ Conversion Rules

Applying this to the case at hand (C/C++ integer conversion rules): sure, it is possible to remember this set of rules; however, as it isn’t intuitive, we’ll have to keep it in mind all the time. This, in turn, will eat one of our “brain CPU” registers, causing a significant slowdown in our development.
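To make this less abstract, here is a minimal sketch of the kind of rule in question. It is an illustration written for this section rather than an example taken from [NoBugs18]; the function names are made up, and the "fixed" variant assumes the usual platforms where long long is wider than unsigned int. When an int meets an unsigned int in an arithmetic expression, the int is silently converted to unsigned before the operation, so a small negative delta turns into a huge positive number:

```cpp
// Sketch only: function names are hypothetical, chosen for illustration.

// Looks like "delta times count", but the usual arithmetic conversions turn
// delta into unsigned int first, so the product is computed modulo 2^N
// (N = number of bits in unsigned int) and only then widened to long long.
long long scaled_naive(int delta, unsigned count) {
    return delta * count;
}

// Widening both operands to a signed type *before* multiplying gives the
// mathematically expected result (assumes long long is wider than unsigned int).
long long scaled_fixed(int delta, unsigned count) {
    return static_cast<long long>(delta) * static_cast<long long>(count);
}
```

With a 32-bit unsigned int, scaled_naive(-1, 2) returns 4294967294, while scaled_fixed(-1, 2) returns -2. Nothing here is undefined or even unspecified; it is just one more thing to keep in a register while writing every mixed-sign expression.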

To make things worse, depending on our project guidelines, it can easily be not the only silly thing which we have to remember about (such silly things may include everything from being-safe-from-signed-overflow-UB all the way to order-of-obtaining-mutexes to avoid deadlocks), so in practice we’ll be using more than one of our CPU registers for reasons which are not directly related to the things-we’re-really-working-on.

To summarize – I contend that

“Ability to comprehend” a programming language is not sufficient for it to be usable; it is “ability to comprehend it on autopilot” which is really desirable, at least if we take developers’ productivity into account.

As an extreme example of a theoretically-comprehensible but utterly-unusable programming language, I refer to [brainfuck]; while it is very simple (and is even Turing-complete), I hope there is no doubt that rewriting even a 10K-LoC real-world program into [brainfuck] isn’t really feasible.

I rest my case.

Practicalities

BTW, I am very far from saying that we shouldn’t use C++ because of conversions; what I am arguing for from a practical standpoint is

having guidelines which prevent the very need to think about such silly (~=”not-directly-related to our task-at-hand”) things

In the case of thinking about order-of-acquiring-mutexes, my favorite guideline is “do NOT use mutexes at app-level, at all” <wink />; as practice has shown (on more than one million-LoC project handling billions of user interactions per day), it works like a charm. As for preventing a signed-overflow-in-a-non-critical-part from becoming a disaster, I found myself arguing for using -fwrapv in production (with lots of checks in critical places, and with potential testing with -ftrapv).³

As for C/C++ integer conversions, the answer is not that obvious, though I’ve seen policies of making-everything-signed (including container wrappers using some analogue of ssize_t) working reasonably well.⁴ And BTW, some help in this regard from WG21 would be very nice too (it would move C++ one step farther from being a [brainfuck], which is IMNSHO always a Good Thing(tm)).
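One possible shape of such a make-everything-signed helper, as a minimal sketch: ssize_of and count_adjacent_equal are hypothetical names for this illustration, std::ptrdiff_t stands in for “some analogue of ssize_t”, and the cast assumes the container never holds more elements than std::ptrdiff_t can represent (C++20 ships a similar std::ssize).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: the element count as a signed integer.
template <class Container>
std::ptrdiff_t ssize_of(const Container& c) {
    return static_cast<std::ptrdiff_t>(c.size());
}

// Counts positions i where v[i] == v[i + 1].
// Because the bound is signed, "size - 1" is simply -1 for an empty vector
// and the loop body never runs.
int count_adjacent_equal(const std::vector<int>& v) {
    int pairs = 0;
    for (std::ptrdiff_t i = 0; i < ssize_of(v) - 1; ++i) {
        if (v[i] == v[i + 1]) {
            ++pairs;
        }
    }
    return pairs;
}
```

With the unsigned v.size() - 1 as the bound, the same loop would run off the end of an empty vector, because 0 - 1 wraps around to a huge value; with the signed helper there is simply nothing to remember.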

³ I wouldn’t argue for it for code-which-controls-a-nuclear-reactor, but for more-or-less-usual business-level code it has been seen to work better than the alternatives

⁴ that is, when used in conjunction with -fwrapv policy mentioned above

References

Acknowledgement

Cartoons by Sergey Gordeev from Gordeev Animation Graphics, Prague.