jrising | [salon, muse] Salon Discussion, February 6, Changes in Programming

I really need to get back in the habit of making thorough notes shortly after the Salon-- I'm losing too many good discussion threads. One of our biggest topics at the Salon concerned recent changes in programming, which I've wanted to write about for a while. Here are my thoughts on it, informed by the Salon discussion, plus some other discussion topics below. Feel free to remind me of other topics in the comments, and I'll record what I remember about them.

Programming has changed enormously since computers were invented. I don't just mean that assembly gave way to higher-level procedural languages, which gave way to object-oriented languages, although that mirrors the shift I'm interested in. In the days before C, programming languages had a fairly small, well-defined collection of building blocks, and it was the programmer's responsibility to construct whatever they needed. With the shift to libraries and then object-oriented languages, the programmer's job has become more about connecting pieces constructed by other people.

The pieces are also changing. They're becoming more intelligent, more communicative, and more accepting of ambiguity. Programmers have realized the power in-- and the need for-- type-fluidity. Currently that's instantiated in "typeless" (dynamically typed) languages, but these still form a kind of antithesis waiting for a new synthesis with traditional statically typed languages.

The things we're programming are different too. The programmer is no longer a craftsman. In the past, people designed programs to do a certain thing well. Now, people realize that they are really engineering experiences or "ways of understanding". We like one program over another not because it does something better, but because it allows us to conceive of our task differently.

Which is exactly what different programming languages themselves do. With plug-in designs, programs themselves are allowing users to construct the context for their own experience.

The way we think of technology is in such incredible flux right now. With Web 2.0 ideas (participatory, dynamic content; new kinds of social networking), the internet is changing and becoming the necessary context of all computer use. With mobile devices, the personal computer, our interface to it, and the ways we use it are changing. In another 10 years, programming will be vastly different; in another 20, it probably won't exist as we currently conceive it.

Anyway, we also talked about Digital Rights Management, specifically relating to Apple's decision to drop DRM-protection tying iTunes to iPods, and how artists should be "rewarded" for their work. And we talked about the nature of Salons, and the possibility of having a kind of "party-salon", which is more like the kind of gathering that was found in Paris.


From:

g-w-s.livejournal.com

Though the range of possible programming experiences may broaden, there are still plenty of craftsmen. Working with embedded systems / signal processing in the wireless communications arena, I can't yet afford to use the languages that allow for more ambiguity - I might enjoy being able to experiment more, but in the end the bottom line of mips/memory efficiency always comes back.

Does this mean that I will be forever pigeon-holed, most valued for an archaic set of skills that is no longer taught?

You say that in 20 years it probably won't exist as we conceive of it. Though the interfaces may change, I think there will always be applications that crave as many cycles as can be thrown at them - and the competition will force you to use whatever is available in order to maximize the potential of a piece of hardware. The shiniest language you could conceive will not be able to map onto the hardware as precisely as would be necessary to 'win', except possibly in an extreme time-to-market race.

Sadly, while it is possible to make an okay compiler, it is by no means a trivial task to make software that translates between languages with a great deal of cycle efficiency. As you introduce ambiguity, it seems that you must also introduce penalties.

Did you discuss the Macrovision response to Jobs's open letter? :)

From:

jducoeur

"I can't yet afford to use the languages that allow for more ambiguity - I might enjoy being able to experiment more, but in the end the bottom line of mips/memory efficiency always comes back."

While I've made this argument myself, I confess that I believe it less and less these days. Compilers are getting *very* smart about these things, and the result is that it isn't at all obvious that you're losing significant efficiency. For instance, the new LINQ stuff coming out of MS is *apparently* typeless, but when you dig under the surface it's nothing of the sort. The compiler is simply doing fairly complex type inference, even constructing novel types when necessary. But it's still strongly typed under the hood, and pretty much as efficient as the underlying language.

"I think there will always be applications that crave as many cycles as can be thrown at them - and the competition will force you to use whatever is available in order to maximize the potential of a piece of hardware."

While I'm willing to grant this is *possible*, is there actually reason to believe that it is so? The evidence from the history of the field is that we always keep moving up the abstraction chain. The high-performance side trails the cutting edge, but it *does* follow along. Most videogames (generally the biggest speed-hogs there are) are built in C++ these days, and people are seriously playing with higher-level languages.

Indeed, it's pretty clear to me that it's not going to be *possible* to write the highest-performance code with low-level languages in the future. And I don't mean in 20 years, I mean in five. The new terascale processors that are coming out in the fairly near future, with massive numbers of cores, are well beyond anyone's ability to program well. It's going to take specialized high-performance, high-level languages to use them effectively, and it's going to take giving up a *lot* of control to the compiler. See Fortress for the most intriguing example I see at the moment -- fascinating language, which takes away many of the primal assumptions of programming, like order of execution.

So while I doubt that programming is going away any time soon, I think it's fair to say that "programming as we know it" probably is. And it's going away precisely *because* of performance. Only by accepting that these things are getting too complex to do in your head are you going to be able to really utilize the new, massively-parallel architectures...

From:

g-w-s.livejournal.com

Again, I cite a particular application. In my space, we can't use C++ - we have measured the overhead for our specific algorithms, and using C++ would mean we cannot sell our product. A lot of our code is still in assembly because the compiler just can't optimize as well as the best hand-coders - and I know some of the guys that work on the compiler. They're good, but optimizing EVERY case is a real trick.

With other architectures, where there is competition to provide a compiler that produces more cycle-efficient output, this may be different.

In my field (DSP, specifically the ruthless competition to improve wireless channel capacity per watt), history hasn't shown any change. Any improvements to the hardware are immediately translated into channel capacity figures - and failing to meet those figures would prevent us from competing.

I agree that programming in lower-level languages is more complex - so you have a riddle: is it better to write more (cycle-)efficient code in a lower-level language and have to deal with longer coding times (reflected in the time to market), code maintenance problems, and poor scaling to future architectures, or is it better to use a higher-level language that may improve upon everything but the cycle efficiency?

If you are in my position and you are facing a hard bottom line related to cycle efficiency, you (unfortunately) have to compromise on every other aspect of the language. If you can count on rapidly increasing clock speeds across varied product lines, or if one of the other benefits of a higher-level language is your priority, then it's obvious which way that will go.

If most of the world's programmers migrate to higher-level languages, I would refer back to my original question:

Does this mean that I will be forever pigeon-holed, most valued for an archaic set of skills that is no longer taught? =)

From:

jducoeur

Hmm. Okay, it's true that the embedded space has to run *much* further behind the rest when it comes to programming languages, by necessity.

Still, I have to expect that even there hand-programming is *eventually* going to become problematic. The thing is, cycle efficiency per core is ceasing to be the gating question: computer architecture has been taking a radical left turn since last year. After years of everyone knowing that multi-core would eventually become necessary, there was a rather sudden consensus that last year was the time -- that single-core architectures had reached their limit, and the only way to squeeze out more speed was to go multi-core.

More relevant to your point, multi-core is basically what's driving per-watt efficiencies now, as well. Part of what's been driving up the energy cost per unit of speed has been the relentless march of on-chip optimization, and those optimizations are horribly expensive. So instead, everyone is making a real leap, to more, simpler cores on each chip. Those cores are both significantly cooler and slower than the ones that preceded them; in theory, the speed is being made up for by the fact that there are more of them.

In the short run, I don't expect that to change your life dramatically: you'll just hand-code to the separate cores. But eventually, I have to question whether that's going to be practical. You can hand-code to four cores without real difficulty, but making efficient use of, say, 80 of them (and they are talking about numbers like that in the not *terribly* distant future) seems less plausible to me. I don't know the embedded world *nearly* as well as I do the personal/server space, but it feels to me like a paradigm shift is going to become a flat necessity eventually.
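One way to put rough numbers on the "four cores versus 80" doubt is Amdahl's law. The comment above does not invoke it; it is brought in here as a back-of-envelope model, and the function name and the 95% figure are purely illustrative assumptions. If a fraction p of the work can be spread across n cores, the best-case speedup is 1 / ((1 - p) + p / n).

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Best-case speedup under Amdahl's law.

    parallel_fraction: share of the run time that can be spread across cores (0..1).
    cores: number of identical cores available.
    Both parameters are hypothetical inputs for illustration, not measurements.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is not sped up at all; the parallel part is divided across cores.
    return 1.0 / (serial_fraction + parallel_fraction / cores)
```

Under the assumed 95% figure, amdahl_speedup(0.95, 4) comes out at roughly 3.5, while amdahl_speedup(0.95, 80) is only about 16: twenty times the cores for under five times the gain. That is the sense in which the remaining serial work, and the effort of shrinking it by hand, starts to dominate as core counts climb.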

"If most of the world's programmers migrate to higher-level languages, I would refer back to my original question:
Does this mean that I will be forever pigeon-holed, most valued for an archaic set of skills that is no longer taught?"

The short answer is yes; indeed, it's probably largely so already. If you're operating at the C/assembler level, I'd guess that most current graduates really can't relate to what you do. (It kind of threw me when I started to realize that most of the kids coming out with CS degrees had never done *any* assembler, but it's true, and they regard C as quaint if they know it at all.)

From:

g-w-s.livejournal.com

In my space, there are two things that will keep multiple cores simple:

- The system architecture already uses multiple (many) cores - only the inter-core bandwidth is improving. On-chip, we do hand-code to a couple of cores, but then often the same images will be shared amongst several cores in a multi-core chip because they will be performing the same functions in parallel. Whether the core aggregating the data is handling 4 or 80 sets of streams is just a matter of bandwidth and memory.

- Deterministic processing (desirable in telecom) requires that core n perform x and y - and only x and y. You wouldn't believe how skeptical and freaked out some of our larger customers were when we made it that much less deterministic by adding *cache* to our chips.

So we certainly do trail other technology, but it's because our bottom line is dictated by how many chips we can sell - which is in turn determined by our pricing (yield from the fab), our time to market (vs our competitors), and our efficiency (mips/watt).

We have never employed on-chip optimization, as it would adversely affect both TTM and efficiency. Sadly, our compiler team has not yet produced the perfect compiler. ;-)

It's an interesting problem. I am very interested as an engineer in optimization problems (in general, I like making things efficient), so it seems to be a good fit for me. I had done some assembly and C in college, but I think the courses have been largely replaced by Java/C++ in many schools. It's obvious that C/assembly are not going to go away - and though I am in something of a niche, the talent pool is most likely going to shrink as demand grows. At least, that's what my bank account hopes for. =)

From:

jrising.livejournal.com

I suspect this is exactly where things will go. Higher-level languages give compilers the information and freedom they need to do optimization. Java is often able to do operations on blocks of memory faster because it can make assumptions about pointers and the size of the memory that you can't in C.
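The "assumptions about pointers" issue is usually called aliasing. Here is a minimal sketch of why it matters, written in Python purely for readability (the thread itself is about C and Java) and using a hypothetical helper name. The same element-by-element copy gives different answers depending on whether the two buffers overlap. A C compiler that cannot rule out overlap therefore has to preserve the exact order of loads and stores, unless it is told otherwise (for example through C99's restrict qualifier) or adds its own runtime overlap check.

```python
def shift_copy(dst, src, count):
    """Copy src[0:count] into dst[1:count+1], one element at a time, front to back.

    Hypothetical helper for illustration only.
    """
    for i in range(count):
        dst[i + 1] = src[i]
    return dst


# Distinct buffers: the loop behaves like a plain block copy.
separate = shift_copy([0, 0, 0, 0, 0], [1, 2, 3, 4], 4)

# Same buffer passed twice: each store feeds the next load,
# so the first element gets smeared across the whole buffer.
shared = [1, 2, 3, 4, 0]
overlapping = shift_copy(shared, shared, 4)
```

Here separate ends up as [0, 1, 2, 3, 4] while overlapping ends up as [1, 1, 1, 1, 1]. Whether a given Java runtime actually beats C on block operations is the claim under discussion, not something this sketch demonstrates.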

I spoke too broadly to claim that lower-level languages will disappear. Assembly language will probably take more features from high-level languages, but it will always exist as a necessary "most direct control" of the processor. But I suspect that its use will be limited to compiler-writers and a very few others.

From:

g-w-s.livejournal.com

Ahh, but that's just it - higher level languages *don't* give compilers the information they sometimes need. Even in C (which, lest we forget, IS a high-level language), you can see this in the use of compiler-specific #pragma directives intended to provide hints to the compiler for optimization purposes. I can't imagine a higher-level language knowing exactly what sort of data it is going to be fed and optimizing without SYSTEM knowledge.

I am curious about your statement that Java performs certain memory operations faster than C. Could you point me to an example? (I swear, I'm not being an ass - I don't know Java at all, and I am having a hard time getting my head around that statement.)

From:

jducoeur

This is related to the "JIT" concept that has emerged in recent years. JIT refers to "Just In Time" compilation, and it's really where the interesting work is nowadays.

Basically, the idea is that compilers only have limited information to work from, because they have a static understanding of the behaviour of the system. But in fact, you can learn vastly more at *runtime*, by observing the actual behaviour and re-optimizing based on that. For example, if you can see the operation of your loops in action, you can sometimes determine that extra ones want to be unrolled. Also, you know more about the specific operating environment, and precisely how to optimize not just for this architecture, but for this *machine*.

Hence the rise of JITs. They're particularly a feature of semi-compiled languages like Java and C#, which "compile" into an intermediate bytecode language. In the early days, that bytecode was interpreted at runtime, which was why Java had a reputation for being slow. But nowadays, the bytecode is further compiled at runtime, transformed into real machine language as the program is loading. Furthermore, the most sophisticated systems go further, *re*-compiling as the program is running to re-optimize it based on the observed behaviour.
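The "observe, then re-optimize" loop described above can be sketched as a toy. This is only an illustration of the control flow, under loud assumptions: a real JIT emits machine code and bases its decisions on much richer profiling, the class and function names are hypothetical, and the hand-unrolled Python loop stands in for an optimized compile without claiming to be faster in practice.

```python
def sum_generic(values):
    """Baseline path: one element per loop iteration."""
    total = 0
    for v in values:
        total += v
    return total


def sum_unrolled_by_4(values):
    """Stand-in for the 'optimized' path: body unrolled four times, plus a remainder loop."""
    total = 0
    n = len(values)
    limit = n - (n % 4)  # largest multiple of 4 that fits
    i = 0
    while i < limit:
        total += values[i] + values[i + 1] + values[i + 2] + values[i + 3]
        i += 4
    while i < n:  # leftover 0 to 3 elements
        total += values[i]
        i += 1
    return total


class TieredFunction:
    """Counts calls and switches to the optimized path once the function is 'hot'."""

    def __init__(self, baseline, optimized, hot_threshold=3):
        self.baseline = baseline
        self.optimized = optimized
        self.hot_threshold = hot_threshold  # illustrative value, not a real JIT default
        self.calls = 0
        self.tier = "baseline"

    def __call__(self, values):
        self.calls += 1
        if self.tier == "baseline" and self.calls > self.hot_threshold:
            self.tier = "optimized"  # stands in for runtime recompilation
        impl = self.optimized if self.tier == "optimized" else self.baseline
        return impl(values)
```

Wrapping the pair as TieredFunction(sum_generic, sum_unrolled_by_4) and calling it repeatedly runs the first three calls on the baseline path and every later call on the unrolled one, with the same result either way. The caller never sees the switch, which is the property that lets a runtime keep re-optimizing behind the program's back.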

The result is jrising's observation. Basically, while it's true that the *theoretical* limit of performance comes from assembler being written by someone who precisely understands the underlying architecture, the number of people who are good enough to program at that level is vanishingly small. (You might be one, but you're unusual in that respect -- in thirty years of programming, I doubt I've known half-a-dozen people who were that good at the low level, and they were all specialists in writing videogame renderers.)

*Most* C programmers write code that is efficient but not optimal, and the best JITs are now claiming (I haven't reality-checked the numbers, but the claim is common and plausible) to be able to significantly outperform all but the best, by performing extremely deep on-the-fly optimizations of things like memory caching. To get optimal performance, you need to be optimizing not just the code but the memory organization, and that is damned hard stuff to do by hand. No high-level languages make it at all straightforward, and most make it more or less impossible, but the JIT, which is watching the behaviour and adjusting on the fly, can do a pretty good job of it, moving stuff around so that memory that is being used together tends to be on related pages and getting cached together. That produces superior memory-access time and improved overall performance, since the program is spending more time hitting the L1 cache instead of RAM.
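The locality argument can be made concrete with a toy model. The sketch below is an assumption-heavy simulation, not a measurement: it replays an access pattern through a tiny least-recently-used cache of "pages" and counts misses. The page size, cache size, layouts, and all names are invented for illustration, and real L1 caches work on cache lines with set associativity rather than this simplified scheme.

```python
from collections import OrderedDict


def count_page_misses(addresses, page_size, cache_pages):
    """Replay an access trace through a small LRU cache of pages; return the miss count."""
    cache = OrderedDict()  # page number -> None, ordered least to most recently used
    misses = 0
    for addr in addresses:
        page = addr // page_size
        if page in cache:
            cache.move_to_end(page)  # mark as most recently used
        else:
            misses += 1
            cache[page] = None
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict the least recently used page
    return misses


def access_trace(layout, hot_objects, rounds):
    """Touch every hot object in order, `rounds` times, using an object -> address layout."""
    return [layout[obj] for _ in range(rounds) for obj in hot_objects]


# Eight objects that are always used together (hypothetical workload).
HOT_OBJECTS = list(range(8))
# Layout 1: each hot object sits on its own page (addresses 0, 8, 16, ...).
SCATTERED = {obj: obj * 8 for obj in HOT_OBJECTS}
# Layout 2: the hot objects are packed next to each other on a single page.
PACKED = {obj: obj for obj in HOT_OBJECTS}
```

With a page size of 8 and room for only 2 pages, ten rounds over the scattered layout miss on every one of the 80 accesses, while the packed layout misses once and then hits 79 times. Moving data that is used together onto the same page is the kind of rearrangement the comment credits to the JIT. Whether a particular runtime achieves it is a separate question that this toy does not answer.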

I don't know the embedded space well, as I mentioned before, so it's possible that this stuff simply doesn't matter as much where you are -- you're likely building the programs to be as cache-friendly as possible to begin with. But the number of applications where that is possible is *quite* small: most programs are simply too big and complex to make that level of memory control possible by hand. So the only practical way to get good memory optimization is to let the computer take care of it, with a sophisticated JIT working hand-in-glove with the hardware to keep the cache filled appropriately.

Caveat: all the above is talking rather above my level -- I work with this stuff, and stay reasonably up on the literature, but it's not a topic I know well. (In my space, I care far more about scalability than performance, so I just don't *care* much about this kind of cycle optimization. Threading is vastly more important to my life.) So this is my understanding of the state of the art in a field that is advancing quite rapidly...