Abstraction Driven Design - The Essence Of Software Development

29th Jun 2024
20 min read
Tags:
essay,
software,
programming

Over time I have talked to multiple people who genuinely seem to follow some ritualized approach to software development, such as TDD or Scrum, living by laws such as “Red, Green, Refactor”, doing “daily stand-ups” or following recipes from some design pattern bible, either still being in the honeymoon phase with that idea, or really believing they have found some kind of silver bullet. Maybe it’s just me, but something inside me automatically revolts whenever some new religion comes up and its evangelists start proselytizing.

Alongside reflections on my own stance concerning software development paradigms and practices, I try to argue in the following that the mechanical use of any development approach does not necessarily lead to good results, and that, conversely, one can write perfectly fine software without the constraints of rigid ceremony – at least if you are the kind of person who has enough passion for the craft and is committed to internalizing and following good practice in spirit, if not to the letter.

Paradigms are Shadows of Someone’s Experience

It is well known that software development is both an art and a craft. The craft is, like any other, only learned by doing, and the art is learned through the experience gained by doing it for a long time. Nothing replaces having done innumerable stupid things, having explored many different approaches, and having felt all the joy and frustration caused in the process. Experience cannot be taught or substituted.

All formalized methodology is an attempt by its creators, usually very senior developers, to capture the core intuitions and insights they gained and to convey the lessons they learned to others - a mere shadow of someone else’s painfully earned experience, carefully crafted into a dogma for others to live by.

While you cannot truly teach experience, you can train behavior. So unlike in biology, where “form follows function”, here we have that “function follows form”. The promise of any methodology or paradigm is: if you follow the steps as you are told, you will end up with a result that has certain desirable properties, by design.

All these methodologies, paradigms and recipes are social, mental and technical tools, created to capture and disseminate some good practice or successful pattern - something that empirically and reproducibly results in high quality output in a suitable environment. Each paradigm is an abstraction of someone’s ineffable practical experience, their personal glimpse into the Tao of software development, earned by the daily fight with concrete problems they were solving. Sadly, these ideas, even if good, often turn into some kind of cult, but probably this is unavoidable.^[1]

The Art of Picking the Right Tool for the Job

One popular religion, Scrum, is about managing work, time and communication. TDD is about consistently creating well-designed, modular and loosely coupled interfaces while ensuring that they actually work. BDD is about focusing on actual requirements and usability to guide the design of the system. DDD is about extracting, isolating and shielding the core of your system from ever-changing requirements and environment parameters, and so on.

People following a dogmatic system too zealously tend to be overjoyed by the good experiences they have, if that system works for them, but miss out on the insights and valuable lessons of knowing other ways, of feeling the difference and understanding the reason behind each rule on an intuitive level. Each methodology has intrinsic costs and trade-offs, just like every piece of software it helps us build. Given some problem we want to solve, the approach or design dictated by someone else’s dogmatic system is:

not necessarily the best
not the only reasonable option
sometimes not even feasible or useful

I strongly believe in eclecticism. Learning from everybody and everything, soaking it all up. Learning the trade-offs - the good, the bad and the ugly - of every architecture used for managing code or people, and of each programming paradigm. Trying not to fall for hype, researching all the reasons why it might actually suck, identifying situations where it is not applicable or actively harmful. As it is often said - don’t use a saw when you need to put a nail in the wall. When learning some new approach, ask - what is its spirit, its essence? What does it try to teach me, what does it offer, is there any evidence? In which cases can this be useful, and when is it better to stay away? What is the history and context of it? Try to read the mind of those who came up with it and guess what they actually wanted to share with us.

Some ideas have been historically and empirically proven to be horrible or naive, and software development is not only too fast-paced and young, but also much too diverse and fluid a discipline to pin it all down to a few easy-to-follow rules that almost ensure a solid result. That is why it is still more of a performance art than a true “engineering” discipline, compared to the standardization level and maturity of other fields.

Yes, we might have “coding standards” and the like, but many conventions and rules we train ourselves to follow are just workarounds for limitations of the often old programming languages we currently use, so these kinds of rules do not count. Often we even disagree on the set of conventions to use – they vary widely across projects and programming languages. What we do not have are reasonably fail-proof higher-order rules or recipes for building large complex systems that work well and remain maintainable over long periods of time. Software projects are always a moving target. The closest thing we have are rough architectural patterns, but even those need very careful consideration.

Honestly, I don’t want it any other way. It makes programming an intrinsically explorative and creative occupation. If everything were already figured out and the work could be done by following some human-level algorithm, I would probably rather do other things with my life. However, I do think that amidst the noise and buzz of the current trends there is something timeless and universal to understand, learn and apply. You just need to see or intuit it.

Universality of Structural Beauty

Learning to program is not difficult, much like learning to speak and write, so people mostly tend to see the linguistic and communicative side of code. That is why we now have a flood of bootcamp coders who think they became developers by learning some syntax, while at the same time skilled engineers are a scarce resource.

Software development is less about reading or writing, it is much more about thinking. Good prose does not depend on the language it is written in, or format used for printing. We value it for well-constructed deep thoughts (and feelings), expressed with a masterful, lucid and creative use of language, whichever the author chose to use. In fact, all other carrier mediums of art can be seen as just other languages for self-expression - be it music, paintings, or computer code. Beauty is in symmetry, beauty is in good structure, beauty is in balance, beauty is in depth.

I went to university not with the goal of getting a degree, but to learn about computer science beyond the fun and shiny linguistic interface that programming languages provide. If my academic education gave me anything of value at all, then it is the ability to think deeply and clearly, and a sense and taste for mathematical beauty.

Code sits somewhere between natural language and mathematical equations, and I think we would all be better off approaching programming more like mathematicians, who construct beautiful, elegant formalisms and theorems, and less like authors of bad books with incoherent plots and too many convoluted, but empty words.

There is a nice quote from Banach I stumbled on:

A mathematician is a person who can find analogies between theorems; a better mathematician is one who can see analogies between proofs and the best mathematician can notice analogies between theories. One can imagine that the ultimate mathematician is one who can see analogies between analogies.

I hope you can see the analogy that makes this quote equally applicable to developers.

Digression: An Ode to Haskell

If there is something that had a strong influence on me as a developer, apart from getting an academic education and diving into theory, then it was picking up Haskell. Once you know a few programming languages, each next language feels easy, mostly like learning new words for the same things. That is what I thought before I tried to learn Haskell. Haskell was the first one that genuinely taught me something new, going way deeper than syntax. It’s like instead of going to a neighboring country, you suddenly find yourself 100 years into the future exploring a different planet.

I spent a lot of time working and fighting with Haskell, and it really is the most enlightening and still practically usable programming language one can try to learn. I might be a mediocre Haskell programmer, but I believe it made me a vastly better programmer in any other language – by breaking my brain and reconfiguring it in new ways.

There are many other languages doing some flavor of FP, but Haskell is so much more than that – due to its exceptionally powerful type system and the abstractions it enables. It is the only language I know that is actually suitable and used for real-world problems and at the same time can still feel like doing abstract math. It is beautiful, and still full of ideas that very slowly, over decades, leak into mainstream languages. Usually they are re-discovered or absorbed by the mainstream under some less scary names and in a simplified, more easily digestible form:

addition of lambdas as primitives (for frictionless function composition)
the rise of general higher order combinators, such as maps, folds, zips
optional type, instead of using null values (the Maybe type)
variant types, instead of cumbersome ad-hoc visitor patterns (the Either type)
futures and promises as special common cases of abstract monadic computation
powerful mechanisms for parametric polymorphism (“generics”, “templates”)
preference of generic function and type composition over inheritance
preference of immutable structures and isolation of mutable state
preference of controlling side effects and separating them from pure functions

These are just a few examples. Most concepts that get adopted have a good power-to-weight ratio, apparently sitting at some sweet spot between simplicity and near-universal applicability.
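To make a few of these ideas concrete, here is a small sketch of how they tend to look once absorbed into a mainstream language. Python is used purely for illustration, and every name in it (`compose`, `map_optional`, `Ok`, `Err`, `parse_int`) is hypothetical and not taken from any library; it loosely mirrors function composition, folds, the Maybe type and the Either type, without claiming to match Haskell's semantics.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Callable, Optional, TypeVar, Union

A = TypeVar("A")
B = TypeVar("B")
C = TypeVar("C")


def compose(g: Callable[[B], C], f: Callable[[A], B]) -> Callable[[A], C]:
    """Function composition: compose(g, f)(x) == g(f(x))."""
    return lambda x: g(f(x))


def map_optional(f: Callable[[A], B], value: Optional[A]) -> Optional[B]:
    """Apply f only if a value is present (roughly 'fmap' over Maybe)."""
    return None if value is None else f(value)


# A tiny variant type in the spirit of Either: a result is Ok or Err.
@dataclass(frozen=True)
class Ok:
    value: int


@dataclass(frozen=True)
class Err:
    message: str


Result = Union[Ok, Err]


def parse_int(text: str) -> Result:
    """Return the failure as a value instead of raising or returning None."""
    try:
        return Ok(int(text))
    except ValueError:
        return Err(f"not an integer: {text!r}")


# Lambdas, a map and a fold working together: increment, double, then sum.
increment_then_double = compose(lambda x: x * 2, lambda x: x + 1)
total = reduce(lambda acc, x: acc + x, map(increment_then_double, [1, 2, 3]), 0)
```

The caller of `parse_int` has to look at which variant came back before touching the value, which is the point of using a variant type over a null or an exception.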

Sure, Haskell has its downsides, difficulties and design flaws as well – try to insert a harmless print statement in a random place, or to understand its memory consumption. It will probably never leave the niche it comfortably sits in, and it does not want to^[2]. Just like Jodorowsky’s Dune movie that was never made, Haskell is the most influential programming language that almost nobody ever used, but that already affected dozens of modern popular languages and enlightened generations of receptive developers. What could be a bigger success?

The OOP vs. FP war is over, and everybody won

I’ve seen some of the fruit and read enough stories of classical, deeply inheritance-based OOP, manifested in its purest form in Java. It is a naive idea: taking the common-sense way of conceptualizing the world, and trying to mimic it in code. But the real world is a mess and hard to reason about, so that is exactly what we get: a messy entangled global state of interacting entities, a nightmare of our own making. It looks like the world agrees by now that this idea has failed miserably.

FP, on the other hand, is more mathematical in its very nature - you break the world down into atoms, simple building bricks of computation. It is the approach of physicists, who capture the essence and dynamics of the universe in a bunch of equations concerning a dozen or so fields, by uncovering and exploiting the symmetries and the surprising homogeneity and predictability underlying all the mess we see at the macro level. They figured out long ago that the global dynamics emerge from a myriad of simple local interactions, each individually perfectly understood and transparent (or very close to that).

Modern OOP, as it is preached and practiced these days, has been mostly stripped of its naive assumptions and blunders, teaching us to use composition over inheritance, to keep entanglement between objects low, and to think more about clear ownership of data. Either on purpose or by some accidental conceptual convergence, this tamed and revised form of OOP is actually pretty close to being FP, presented in object-shaped clothing. The main difference is that the emphasis remains on the data entities over the operations, whereas the FP perspective naturally does the opposite - usually emphasizing functions consuming abstract generic interfaces, instead of concrete data types.

Both views have value; in fact, they are complementary. There is data, and there is code. Sometimes they are even one and the same and barely distinguishable (that lesson and its power are best taught by Lisp), whereas sometimes we prefer to create more rigid boundaries (embodied by languages with strong and expressive type systems). Both perspectives are equally important and must be balanced in order to achieve our actual goals. What we all want is to write less code that does more, code that has elusive qualities such as “loose coupling, but high cohesion”, which is just fancy OOP slang to express our desire for:

simplicity (small and obvious pieces that are almost trivially correct)
generality (you can use pieces in many contexts)
modularity (you can use pieces in isolation)
composability (you can build many different things with the pieces you have)

In a system with these properties, functions and data structures are like razor-sharp tools of haiku-like beauty. I believe that code written like this is naturally closer to mathematics, closer to theoretical computer science.

The code best exemplifying all these ideals can often be found in standard libraries. What code is more atomic, generic, composable and modular than the vocabulary the language creators provide to the masses of developers building all the required systems for practical tasks? And which developers are more qualified and experienced than those creating programming languages and writing powerful libraries? If you want to know how to write well-designed software, good standard libraries are a place to start learning.

The one true method(tm) - aggressive abstraction

I find the cage of routine, dogma and ceremony to be repulsive, but I believe I do have something like an intuitive methodology. When I approach a problem, I do the theoretician’s trick - try to generalize, solve the simpler general problem, then specialize to the concrete context and fill in missing details. It is just the recursive use of tasteful abstraction, which can be described interchangeably as:

aggressively identifying recurring patterns and unifying them
compressing information, while reducing or externalizing noise
decreasing the “entropy” of the system, while increasing its “potential energy”

I try to do this across all the levels and scopes of a project, manifesting this as:

pretending to build meta-software for the kind of problem I have, instead of coming up with a one-off solution for a singular problem
pretending to write code for a generic library or framework, even if the code will most likely never leave that project
actually refactoring out a (possibly internal) library, if there is something of more general value that could be reused
thinking in terms of generic, small, simple functions and data types with good properties (as discussed further above)

If you get used to thinking like this, you naturally gravitate toward code that is easy to test, understand, reuse, (de-)compose and extend. All of this boils down to performing the creative art of finding the maximal level of abstraction that is useful at any given level, but not more than that.

The process involves throwing away as many details as possible that obstruct the view of the big picture, clearly identifying invariants, symmetries and dependencies, and adding as many degrees of freedom as possible, reducing the number of needed assumptions and constants coming from practical requirements and externalizing them to parameters. Maybe this can be seen as a vast generalization of the SOLID principles, which are consistent with this way of thinking and, I believe, follow quite naturally from it.
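As a rough illustration of externalizing assumptions into parameters, the following Python sketch starts from a one-off function with its constants baked in, generalizes it, and then recovers the original as a specialization. The log-filtering scenario and all names (`long_error_lines`, `select`, `contains`, `longer_than`) are invented for this example and are not from any real project.

```python
from functools import partial
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


# One-off solution: the keyword "ERROR" and the limit 80 are hidden constants.
def long_error_lines(lines: Iterable[str]) -> List[str]:
    return [line for line in lines if "ERROR" in line and len(line) > 80]


# Generalized building block: knows nothing about logs, strings or limits.
def select(predicates: Iterable[Callable[[T], bool]], items: Iterable[T]) -> List[T]:
    """Keep the items that satisfy every predicate."""
    checks = list(predicates)
    return [item for item in items if all(check(item) for check in checks)]


# Small, reusable predicate factories; the former constants are now parameters.
def contains(needle: str) -> Callable[[str], bool]:
    return lambda text: needle in text


def longer_than(limit: int) -> Callable[[str], bool]:
    return lambda text: len(text) > limit


# Specialization: the concrete requirement is just one configuration of the
# general solution.
long_error_lines_v2 = partial(select, [contains("ERROR"), longer_than(80)])
```

Whether the second version is actually better depends on how often the requirements vary; for a filter that is needed exactly once, the one-off function may well be the more tasteful choice.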

The art of abstraction is where pure mathematicians tend to excel, but of course, there is also some risk of falling into the trap of over-abstraction - when your general solution has so many parameters and moving parts that it looks not simpler and more manageable, but more convoluted and complicated. This means that you missed the right point to stop. Even abstraction, the most powerful cognitive tool humanity ever created and perfected, is still just a tool. It should not be fetishized, and requires some taste and experience. As usual, everything is only good in moderation, and only if approached with healthy pragmatism.

We are not abstracting just for the sake of it or for showing off, but with the goal of finding a clean, elegant, well-designed and reasonable solution to a concrete problem we face. A true virtuoso must know when to indulge in some fancy wizardry, and when it is best to stick with something simple and straightforward. Just like a good musician could solo all day, but will humbly play and stick to a basic melody or rhythm, if that is what the song or band needs.

How I usually write software in practice

When I sit down to solve some fresh problem from scratch, after understanding the most important side constraints and true requirements, I try to break it down, come up with nice loosely coupled building blocks, starting from sketches of bigger components and subsystems, then decomposing those subsystems down to focused, small functions and data types. I might draw some conceptual sketches and type signatures on a whiteboard or paper.

Then, piece by piece, I start writing out the functions and structures, more or less simultaneously encoding my intentions and assumptions about their behavior in unit tests. All of this is then semi-consciously mapped to idioms and constraints of a programming language that I have to or choose to use, balancing the style and patterns the language was designed for (as far as I am aware of them), and my own tastes and preferences.

If I can, I extensively use a REPL to interact with pieces of code. It keeps the gap between mental model and concrete implementation as small as possible, the code becomes more tangible - almost like a physical object. If I do an experiment often enough to notice and be annoyed by a pattern, it’s time to pin it down as an automated test. If I have no REPL, I use a test framework as a poor person’s REPL, writing out and evaluating my exploratory interactions with the function I am writing in pseudo test cases, later shaping and refining them into proper tests that check all relevant edge cases and invariants.
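The path from exploratory poking to proper tests might look roughly like the Python sketch below. The function `chunks` and the test names are hypothetical, chosen only to show the two stages: first a pseudo test that records one interaction as it would be typed into a REPL, then refined tests that state invariants and an edge case. Plain `assert` statements are used so the sketch does not depend on any particular test framework.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def chunks(items: Sequence[T], size: int) -> List[Sequence[T]]:
    """Split items into consecutive pieces of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [items[i:i + size] for i in range(0, len(items), size)]


# Stage 1: a "pseudo test" that merely records an exploratory interaction.
def test_scratch_example() -> None:
    assert chunks([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5]]


# Stage 2: the same experiments, refined into invariants over many inputs.
def test_chunks_invariants() -> None:
    for items in ([], [1], list(range(7))):
        for size in (1, 2, 3, 10):
            parts = chunks(items, size)
            # Concatenating the pieces gives back the original sequence.
            assert [x for part in parts for x in part] == items
            # Every piece except possibly the last one is exactly `size` long.
            assert all(len(part) == size for part in parts[:-1])
            assert all(0 < len(part) <= size for part in parts)


# Edge case: a non-positive size is rejected instead of looping or misbehaving.
def test_chunks_rejects_nonpositive_size() -> None:
    try:
        chunks([1, 2, 3], 0)
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for size 0")
```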

At no point in all this do I use any trendy methodology, or follow any rigid disciplined approach or ritualized practice that has an established label. In fact, I feel that any rigid routine very quickly becomes not an aid, but a cage. Maybe a well-intentioned cage, but still a cage. Fixed paradigms that someone else created are, to loosely quote Wittgenstein, the ladder that you need to use just once, in order to get to some elevated, difficult-to-reach place. The ladder becomes obsolete and can be thrown away once you attain the insights you can get from the new vantage point.

Once you internalize the essence of a concept, it becomes a part of your thinking, a part of your being. You learn to voluntarily stay within the constraints outlined by a useful conceptual framework, but remain open-minded enough to recognize situations when breaking some rules or following other rules is the better option.

Am I any good at all this? I don’t know, I think I’m doing fine and hope that I am getting better over time. What I think I do know for sure is where the light is, and I always try to reach towards it. Once you start believing you are already there, you stop learning and growing. All I can hope is that code I write next year is better than last year, and that I can gracefully look back at code from last year, forgive myself, and be thankful that I apparently have evolved. I guess, another implicit rule I follow is: try to write code that future-you will not be ashamed of or hate you for.

Programming is the closest substitute for Magic

When I am absorbed in the process of creation, the only rule I follow is my gut feeling, intuition, and desire for crystalline mathematical beauty of well-crafted structure, reflected in the virtues of good code discussed above. It all just naturally flows - from the ephemeral platonic realm of perfect concepts and ideas, channelled and filtered by my imperfect thoughts, tainted by the mess and incidental complexity of real world problems and constraints, and finally projected out, pouring the spirit that came ex nihilo, right into the silicon flesh in our physical realm - executable code on a hard drive, created to do something useful.

Probably it is the closest we can ever get to the fantasy of wielding supernatural powers and using powerful magic spells that can bring a golem to life. For me, tapping into the mathematical void, source of all abstraction and creation, channeling and shaping it to manifest something new in the world that unifies both beauty and value - this is all the joy and bliss of software development.

Now the somewhat amusing punchline, and starting point for thinking about all this, was that the typical outcome of my intuitive “abstraction driven design” process is probably not too different, maybe even observationally indistinguishable, from results that others would achieve with TDD and other more formalized approaches. Which, I guess, confirms that these approaches, when used correctly, are actually working pretty well! I just prefer staying a liberal agnostic and really dislike any kind of orthodoxy, that is all.

So closing on a less serious note - to not appear to others as a wild and uncivilized barbarian, I figured that when talking to other people, maybe I should say: yes, in fact I also do some form of “TDD”! Only without mentioning that the “T” in my personal “TDD” is not for “Testing”, but for “Tao” – because I intuitively follow the Tao of structural beauty and tasteful abstraction. If the results are decent enough, who will be able to tell the difference?

[1] If it did not turn into a cult, it would mean most people already know better. As they don’t know better and thus become cult warriors, the original idea their cult is based on is apparently not obvious or intuitive enough for them.
[2] “Avoid success at all costs” is the motto of the language; correctly parsing the statement to get the intended meaning is left as an exercise for the reader.