2014-07-10

Which Python is this about?

There is only one Python. This might come as a surprise to many people, so let's explain.

When the BDFL started developing Python, he was influenced by ideas from many other programming languages. Despite what you might think after lazily toying with some of its sexier, fun-to-tinker-with aspects, language design is hard. Not even the BDFL gets it right every time. But unlike most other language creators, he's not afraid to say so when it happens, and to fix things.

Example: a long time ago, Python had int and long, two different types representing the same thing (mathematical integers), just because one was "native" (fixed size, directly supported by the underlying architecture) and thus fast, while the other was correct. Of course, it's well known that in programming, most integers are used as indices into small or moderately sized arrays, where the native type surely suffices.

But that's solving the wrong problem. Having a type that, because of its name among other things, suggests it represents the counting numbers (and their opposites), but is unable to represent the number of humans currently alive, is just wrong. C has two strong reasons for sticking to that point of view (its bare-metal approach, and value semantics), neither of which has anything to do with Python. Python's types are supposed to be representations of real-life concepts, not of processor- and memory-induced abstractions in another programming language. [Of course, if one models the behavior of some C program in Python, int32 would be a perfectly appropriate type. But that's surely too narrow for a built-in.]
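To make the bracketed aside concrete: if you ever do want C's int behavior in Python -- say, to model that C program -- it takes only a few lines. The name to_int32 below is my own illustrative choice, not a built-in:

```python
def to_int32(n):
    """Wrap an arbitrary Python integer into C's signed 32-bit range."""
    n &= 0xFFFFFFFF                 # keep only the low 32 bits
    if n >= 0x80000000:             # values past the sign bit wrap negative
        n -= 0x100000000
    return n

print(to_int32(2**31))              # -2147483648: C-style overflow
print(to_int32(7000000000))         # -1589934592: "humans alive" doesn't fit
```

A library function like this is the right home for such a deliberately narrow concept; the built-in int stays correct.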

Guido realized that, and since Python 3 there is only one int type, which chooses its internal representation according to its size; the whole mechanism is hidden from the user pretty well. From the programmer's perspective, 2**3 and 2**99 call the same method, int.__pow__ (it's even the same bound method, (2).__pow__), and the result has the same type as the operands. Of course, this required some very deep interventions in the implementation and documentation, but the BDFL boldly did it. There were many other such huge changes (Unicode, true division, delayed functionals, ...), which were treated just as boldly.
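You can check that uniformity yourself; this is just an interactive sketch:

```python
small, big = 2 ** 3, 2 ** 99

print(type(small), type(big))        # both <class 'int'>
print(type(small) is type(big))      # True: one type, whatever the magnitude
print((2).__pow__(99) == 2 ** 99)    # True: the operator and the bound method agree
```

No overflow, no silent type switch visible from the outside -- the representation choice lives entirely below the surface.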

How could he be so bold? It's really simple: there is such a thing as ideal Python implementation. All real implementations are just shadows of that Platonic idea, and although we can't always intuit the real idea from the shadow a priori, the thing works really well a posteriori: once you see the true thing, it's obvious.

So, Python 3.4 (or whatever it is at the moment you read this) is the most accurate existing approximation of true Python, which is the real subject of this blog. People usually ask me why I hate Python 2... I don't hate it. The same way I don't hate Newtonian mechanics -- it's wonderfully useful, up to a few warts here and there. But it isn't deeply, really the way the Universe works. Of course, neither is General Relativity (and neither is Python 3) -- it's just the best thing we've got. And it's pretty damn good.

2014-05-08

What's this all about?

I am not so special. I have just decided to write this because I see so many people getting it terribly wrong. Yes, I know a lot about Python, but what enables me to write authoritatively here is that I understand Python. Why? Mostly because I have read George Orwell. Language is important because it shapes our thoughts; it deeply affects the way we think.

Today, probably as a consequence of the proliferation of a great multitude of languages, people too often think that all languages are the same - we just have to learn the differences in vocabulary. printf in C, cout in C++, ? in BASIC, writeln in Pascal, System.out.println in Java, WRITE in FORTRAN, DISPLAY in COBOL, document.write in JavaScript, . in FORTH,  in APL, putStrLn in Haskell, write in Prolog, SELECT in SQL, content: in CSS, \write in LaTeX - they are all the same, right? Of course, there are syntactic differences - cout must be followed by that weird <<, WRITE has that strange (*,*) after it, and  is not found on a normal keyboard, for example - but that's all. Just superficial differences. No?

[Note: yes, I really did put SQL and CSS in there. On purpose. I hope you realize that CSS content: is not the same as FORTRAN WRITE. Well, guess what: neither is Python print. If you treat them the same, you're missing the point. And the fact that CSS "isn't a programming language" doesn't really matter. If you know that difference, and still fail to see the others, that's even sadder.]

If we want to learn a language, learning only the words and grammar is really missing the point. Words and grammar rules are just consequences of something much more fundamental: the culture from which the language emerges and which it feeds. And it really shows in a language. It cannot be hidden, and you shouldn't want to hide it anyway. I'm sure you know at least some of the very good reasons why you need to type 1 character in BASIC and 18 characters in Java to do similar things. Without them, BASIC wouldn't be BASIC, and Java certainly wouldn't be Java.

Here, I'll try to show you that the language matters. You shouldn't write C in Python, because it sounds just as bad as Chinglish. Native speakers will understand you, but will also - despite good manners - find what you say really facepalm-worthy.

Graham in trouble

You know about Paul Graham. If you don't, please read some of his stuff. His writings are really thought-provoking. One of his best pieces, IMO, is Beating the Averages. There, he writes (emphasis mine):
to explain this point I'm going to use a hypothetical language called Blub. Blub falls right in the middle of the abstractness continuum. [...] 
As long as our hypothetical Blub programmer is looking down the power continuum, he knows he's looking down. Languages less powerful than Blub are obviously less powerful, because they're missing some feature he's used to. But when our hypothetical Blub programmer looks in the other direction, up the power continuum, he doesn't realize he's looking up. What he sees are merely weird languages. He probably considers them about equivalent in power to Blub, but with all this other hairy stuff thrown in as well. Blub is good enough for him, because he thinks in Blub.
He is, of course, completely correct, as he is every time he writes about general things. The power of his mind to grasp truly general concepts and put them in simple words is astonishing. But now look at this (from Re: Revenge of the Nerds, again emphasis mine):
 I know that Python currently imposes these restrictions. What I'm asking is what these restrictions buy you. How does it make Python a better language if you can't change the value of a variable from an outer scope, or put more than one expression in a lambda? What does it buy you to distinguish between expressions and statements?
Later, he seems even more desperate:
I was actually surprised at how badly Python did. I had never realized, for example, that a Python lambda-expression couldn't contain the same things as a named function, or that variables from enclosing scopes are visible but not modifiable. Neither Lisp nor Perl nor Smalltalk nor Javascript impose either restriction. I can't see what advantage either restriction brings you.
Of course you can't see it, Paul. Of course you can't. Just read Beating the Averages again, and everything will be clear. Yes, the day has come: now you're the Blub programmer. It's really kind of funny that you don't see it. Your "I was actually surprised at how badly Python did" clearly shows you don't get it. And your pleas of "what does it buy you" fit your own description - "about equivalent in power to Blub, but with all this other hairy stuff thrown in as well" - perfectly.
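For concreteness, here is a minimal sketch of the two restrictions Graham complains about, as they look in Python 3 (the helper names are mine). Python 3 does let you rebind a variable from an enclosing scope - it just insists you say so explicitly, with nonlocal; and a lambda body stays a single expression, so anything larger gets a name:

```python
def make_counter():
    count = 0
    def bump():
        nonlocal count      # explicit opt-in; without this line,
        count += 1          # count += 1 raises UnboundLocalError
        return count
    return bump

counter = make_counter()
print(counter(), counter())  # 1 2

# A lambda body is one expression - statements don't fit, so anything
# bigger gets def'd and named, which is exactly the point.
square = lambda x: x ** 2
print(square(12))            # 144
```

Whether these are restrictions or deliberate design is precisely what this blog intends to argue about.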

I'll explain all these things. On this blog, we will first unlearn every wrong concept from the thorny history of programming, and then we'll see how programming should be done. In a language whose concepts are powerful enough to confuse even Paul Graham.