My Interpretation of Unix Philosophy

by drummyfish, 2020, released under CC0 1.0, public domain

I am a follower and advocate of Unix Philosophy (UP) and ideas related to it (suckless SW, countercomplex SW etc.). If you are interested in these topics, I would like to share with you my interpretation of the core idea behind UP.

UP is formulated very vaguely and so leaves a lot of space for different interpretations. Most people follow it just intuitively. Many will obey it just to some degree because following it strictly, to the extreme, seems to be counterproductive and even betray its own promises. UP will always stay a fuzzy set of rules stated in imperfect human language but we can bring more clarity to it by creating more specific interpretations.

I won't be talking about the whole UP and all of its recommendations but rather just the one that in my experience causes the most confusion and debate. It is this statement:

"Do one thing and do it well."

We understand the spirit of this rule, but few people can define precisely what "one thing" and "do well" mean. In my opinion these two things actually represent an upper and a lower bound on the complexity of programs we should write. Please read on, this needs an elaboration.

Let's start with the first part – "do one thing". Firstly, there is no such thing as only ever doing one thing – doing anything always involves doing other things too. For example even such a basic operation as adding two numbers requires performing bit additions, comparisons, reads and writes. So "one thing" isn't to be taken literally, it is something subject to interpretation.

Let's try this: Does a 3D MMORPG game with AI and realistic physics count as one thing, a game? Microsoft could say they're doing just one thing, an operating system. Maybe it is so, but these clearly don't feel like the correct one thing.

So, when you ask a UP enthusiast "what does one thing really mean?", he (yes, it definitely won't be she) will very likely answer something like this: One thing means one thing that isn't too complex. In other words, he tells you there is an upper limit on complexity you shouldn't pass and when your project crosses this line, you should split it into two independent sub-projects. This will help your program become more elegant, manageable and reusable. This point is true and UP programmers basically agree on it.

Where exactly is this upper limit? This is subjective, fuzzy and depends on individuals, environment etc. There also isn't a single bulletproof complexity metric, but for the sake of simplicity let's just use lines of code (LOC). You as an individual can e.g. set your upper limit by saying "I won't ever write a program that's over 10 KLOC". But this limit depends on the programming language and only applies to compiled programs; the limit may get much lower e.g. for subroutines and functions that are part of a compiled program – e.g. 100 LOC. It is just important to have some sense of an upper limit.

Now for the more controversial part. Besides having an upper limit, UP people tend to simply try very hard to write programs that are as simple as possible and take "do one thing" to the extreme. One example of this is the true utility, which is implemented in a single line of effective code. I've tried to go down this path and discovered that going to the extreme would really mean writing hundreds of thousands of programs, most of which would just serve to print out a single individual Unicode character (utilities like print_a, print_b etc.). This clearly isn't good for performance, readability, efficiency or portability. Going to the extreme breaks the promise of UP.

What is the reasonable amount of simplicity here? Again, if you ask a UP programmer, he may answer like this: Don't do completely trivial things; rather than making a program for printing every single character, make a utility for printing strings. This will still be a simple program, it will cover printing of any character and string, and it's also not bad to add in a few features like capitalizing the string, printing it in reverse etc. This seems reasonably simple.
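To make the advice concrete, here is a minimal sketch of such a string-printing utility. It is only an illustration: the choice of Python, the names `render` and `main`, and the `--upper` and `--reverse` flags are all assumptions, and "capitalizing" is read here as converting the whole string to upper case.

```python
import argparse


def render(text, upper=False, reverse=False):
    """Return text with the optional transformations applied (hypothetical helper)."""
    if upper:
        text = text.upper()   # "capitalizing" interpreted as upper-casing
    if reverse:
        text = text[::-1]     # reverse the order of characters
    return text


def main(argv):
    """Parse an explicit argument list and print the resulting string."""
    parser = argparse.ArgumentParser(description="Print a string.")
    parser.add_argument("text", help="string to print")
    parser.add_argument("-u", "--upper", action="store_true",
                        help="print the string in upper case")
    parser.add_argument("-r", "--reverse", action="store_true",
                        help="print the string reversed")
    args = parser.parse_args(argv)
    print(render(args.text, args.upper, args.reverse))


# Example invocation with explicit arguments instead of the real command line.
main(["hello", "--upper", "--reverse"])  # prints "OLLEH"
```

One small program like this replaces the whole imagined family of print_a, print_b etc. utilities while staying easy to read in one sitting.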

When we think about this advice, we find it actually says there is a lower limit on complexity too: don't do completely trivial things, rather merge them into a program that is simple but still reasonably complex. And again, we can, for the sake of being more specific, express this lower bound with LOC. According to this interpretation, you should set some lower limit, such as 50 LOC. I expect a considerable number of UP folks to disagree with me here.

However, I think this is what "do it well" really means – a lower limit. If your program is 3 LOC, whatever you have chosen your one thing to be, you're not doing it well enough because even though you achieve the thing, you lose so much on overhead and lack of flexibility that your program is hardly useful.

If we consider e.g. a cat program (a program that writes out the content of a file), we see it can be implemented in fewer than 10 LOC. But that is too little to be a good program. If your basic implementation is so simple, you should add options that are likely to be useful to the users, such as fitting the text into a given number of columns, printing only a specific subsection of the file etc. So I, maybe a little controversially, say: Keep adding features until you're above the lower limit (e.g. 50 LOC). But indeed, don't get carried away and cross the upper limit!
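As a rough illustration of how small the basic program is and how one such option might be added, here is a sketch in Python. The function name `cat` and the `first` and `last` parameters are hypothetical, chosen only for this example; a real cat would also need error handling and command line parsing.

```python
import sys


def cat(path, out=sys.stdout, first=1, last=None):
    """Write lines first..last (1-based, inclusive) of the file at path to out.

    With the defaults this is a plain cat; first and last are the kind of
    "subsection" option the text suggests adding.
    """
    with open(path, "r") as f:
        for number, line in enumerate(f, start=1):
            if last is not None and number > last:
                break            # past the requested subsection, stop reading
            if number >= first:
                out.write(line)  # lines keep their own newline characters
```

Calling `cat("notes.txt")` would write the whole (hypothetical) file to standard output, while `cat("notes.txt", first=10, last=20)` would write only lines 10 to 20.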

So, to sum everything up: One thing can really be anything, even a game or an operating system, but you have to fit it within general complexity limits, both lower and upper, which is why you won't find many UP MMO games. If your program is over the upper limit of complexity, it is no longer considered doing one thing and you should split it. If your program is below the lower complexity limit, you are not doing your thing well.
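The two limits can be sketched as a simple check. This is only an illustration under stated assumptions: the constants reuse the example figures from the text (50 LOC and 10 KLOC), LOC is crudely counted as non-blank lines, and the names `count_loc` and `judge` are made up for this example.

```python
UPPER_LIMIT_LOC = 10_000  # example upper limit from the text (10 KLOC)
LOWER_LIMIT_LOC = 50      # example lower limit from the text


def count_loc(source):
    """Count non-blank lines; a crude stand-in for a real complexity metric."""
    return sum(1 for line in source.splitlines() if line.strip())


def judge(loc, lower=LOWER_LIMIT_LOC, upper=UPPER_LIMIT_LOC):
    """Classify a program size against the lower and upper limits."""
    if loc > upper:
        return "too complex: no longer one thing, split it"
    if loc < lower:
        return "too trivial: not doing the thing well"
    return "within limits"


print(judge(3))       # too trivial: not doing the thing well
print(judge(800))     # within limits
print(judge(25_000))  # too complex: no longer one thing, split it
```

The exact numbers are personal and depend on the language, so they are passed as parameters rather than fixed.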

At the end I'd like to make one thing clear: you don't have to follow UP always, just when it matters – when you're writing a serious program that's meant to last, be used and improved. When you want to achieve a thing quickly, it is okay to write a quick throwaway 3 LOC script that you delete right when it's finished running, or you may take a big, bloated and ugly piece of copy-pasted spaghetti code from the Internet and use it for something you immediately need to achieve. It's just not how you should normally write programs that are to be a part of a Unix ecosystem.