hudebnik

A final exam in a course on video-game design. The exam was in the form of an amusement-park ride: as we rode through a darkened tunnel, things flashed on both sides, with no opportunity to pause or go back. The first section of the exam showed the icons of five well-known video games -- I don't remember exactly, but let's say Chess, Donkey Kong, Super Mario, etc. -- with numbers associated with each. Later sections of the exam named particular techniques studied in class, and asked which of the numbered games were most likely to use that technique and why.

The publisher of my textbook, Picturing Programs: An Introduction to Computer Programming, says 13 (print-on-demand) copies were sold in 2021, and they owe me GBP 88 in royalties. Woo hoo! Still got a way to go for the NY Times best-seller list....

Seriously, it's sorta surprising that even 13 print copies were sold, given that one can download the entire text (with color pictures, not the greyscale ones in the print edition) for free from the web site, and are invited to subsequently pay me what you think it's worth.

Since the textbook is written to go with a particular language and development environment (which are also a free download), I should probably update the textbook to make sure it still works with the current versions of those.

As Hillary pointed out in the third debate, any time Trump doesn't get his way he concludes that the system is rigged against him. Yes, this even applies to the third debate itself, where he alleges without proof that Hillary was given the questions in advance. Seriously, the six major topics were announced in advance, and anybody with half a brain could have made a decent guess as to what questions would be asked about those six topics. But I digress.

It occurs to me that another reason for Trump to talk about rigged elections and not accepting the results of the election is so he can call Hillary a hypocrite when Russian hackers deliver him the election and Hillary disputes it.

That sounds like a far-fetched scenario, but not impossible. We know that voting machines can be hacked to produce a total count that doesn't match the actual votes cast. We know that manual recounts are done (in most places) only when the reported vote total is very close, and even then political appointees may control how thorough the recount is. We know that some jurisdictions in the U.S. use electronic-only voting machines that make a manual recount impossible.

So if I were a security-cracker who wanted to shift the outcome of a U.S. election, I would pick a bunch of states where polling indicated my candidate was losing narrowly. On each voting machine, if a vote is cast for other than my preferred candidate, with 10% probability I record it instead as a vote for my preferred candidate. If the actual vote is 50% Clinton, 45% Trump, that hack is enough to reverse it to 50% Trump, 45% Clinton: a large enough margin to avoid a recount, but not so large as to be completely implausible.

One of the numerous fronts in the Programming Language Wars is "how fast does the code run?" This is a remarkably difficult question to answer, for reasons discussed here, but one can make certain general observations. As pointed out here, there's a top tier of languages that are typically compiled to native machine code without run-time garbage-collection, including C++, C, Fortran, Ada, and ATS. A second tier includes a bunch of languages that are (a) typically compiled to bytecode and then JITted, and/or (b) run-time garbage-collected: Java, LuaJIT, Julia, Haskell, Scala, Ocaml, C#, Go, Common Lisp (SBCL), Rust, Pascal, F#. I won't bother with the several more tiers right now.

The top-tier languages are all fairly mainstream, primarily-imperative, three-piece-suit affairs, except perhaps for ATS (with which I'm not familiar).

The second-tier languages include Scala, Ocaml, Common Lisp, and F#, all of which are primarily-functional, as well as Haskell, which is purely-functional (e.g. it has no assignment statement at all, and it takes advantage of this fact to do lazy evaluation and a variety of compiler optimizations), and a couple of primarily-imperative languages.

If you've read anything technical about my current employer, Google, you're probably familiar with the MapReduce framework for massively parallel computing: we alternate "map" phases (do the same thing to a gazillion data points independently) with "reduce" phases (combine a bunch of data points into fewer data points). You may or may not have heard that MapReduce has been deprecated internally for new code for a year or more: in its place, we use libraries that turn C++, Java, and Go into, effectively, primarily-functional languages with lazy evaluation (think Java Streams). MapReduce is still around, but as essentially a compilation target rather than something humans are supposed to actually write.

Of course, it would be a whole lot easier to write in a REAL functional language with lazy evaluation, rather than trying to retrofit C++ to play that role, but we've got an awful lot of existing C++ code....

shalmestere had left some pieces of paper lying around for me to see. I picked one up and concluded it was a recipe, written in 19th-century English, for compost -- in the gardening sense, not the preserved-root-vegetables sense. In fact, looking at the others, I realized there were at least a dozen different recipes for varietal composts, presumably for different garden conditions. And each one ended with a series of "if" statements and a list of book titles. At length I realized that the "if" statements were actually conditional-compilation directives (if the C preprocessor had been based on 19th-century English), and the book titles, all by the same 20th-century female author, were effectively include directives: if you ran the whole thing as a script, you would get the full text of all her novels, with the compost recipe inserted at all and only those places where it was appropriate to the setting.

Thanks to

jducoeur, this link to XKCD's explanation of the Heartbleed bug.

OMG: I didn't realize it was this simple and stupid. Reason #738 why no production code should be written in C or C++.

Graphics class: drew a diagram of a pinhole camera, with an image appearing upside-down on the back wall. Replaced the box with an eyeball. Asked "why is the image upside-down? Why not flipped left-right, or something else? Why should the direction of up-or-down be privileged? Light rays aren't significantly affected by gravity (except near a black hole, which we aren't)."

Did some more thought experiments until one student said "it's not upside-down, it's rotated." So we discussed rotation, reflection, scaling, etc. and I promised that all that stuff they learned in Linear Algebra would actually make sense and be useful this semester.

Computer Architecture class: discussed units, and why E = mc^2 makes sense but E = mc^3 couldn't possibly have been right. Word problem: I walk at 2 steps/second, each step covering 2.5 feet. I jog at 3 steps/second, each step covering 4 feet. If I alternate jogging 75 steps with walking 75 steps, how fast am I moving on average?

Solved the problem two different plausible ways, getting different plausible answers. Discussed which one was wrong, and why. What happens if I lengthen my jogging stride to 5 feet? Obviously this is a 25% increase in jogging speed, but what's the percentage increase in average speed? What if instead of alternating every 75 steps, I alternated every 30 seconds, or every 100 yards? Good discussion.

Data structures class: let's explore a vague problem. In the game of SubDivvy, the players agree on a positive integer to start with, and on each turn, the current player chooses a proper divisor of the current number, subtracts it from the current number, and that becomes the new "current number." Tell me everything you can about the behavior of this game.

Then commenced a series of student-led discussion. Debated definition of "proper divisor"; agreed to exclude the current number itself (or the game end in one turn), and negatives (or the game goes forever). Not sure whether to exclude 1. Defined loss of game as "there is no proper divisor," which means the number is 1 (if we allow 1 as a proper divisor) or prime (if we don't). Proved that current number only decreases, never reaches 0, and never goes negative. Proved that in the allow-1 version of the game, game always ends after at most n-1 turns; still working on minimum number of turns. Proved that (still in the allow-1 version) 1 is a losing configuration, 2 is winning, 3 is losing, and 4 can be won or lost depending on what you choose as the first move -- in other words, strategy does matter in this game (there are games in which it doesn't). Wait: we've been assuming there are 2 players, which I never said. What if there were 3, or 4...? At the bell, left students with the challenge of characterizing winning and losing game states. Good discussion.

If only every day of classes went this way....

I've spent part of the week moving boxes from my old office into my new office, unpacking things, finding places for things.... There are several boxes that never got unpacked from the previous office move (in part because I was moving into a smaller office and simply never found good places to put things, and in part because I realistically didn't need that stuff); I've found places for some of this, and thrown out some of it. Realistically, I'm never going to read that conference proceedings from 2004, or all those back issues of professional journals going back to the 1990's. And I probably don't need those photocopies of articles my grad-school professors assigned me to read: even if I did need to re-read those things or assign them to a student, I'm sure they're all on the Web.

Thursday evening I gave a talk to the Long Island Java Users' Group on the subject "Functional Programming in Java: Why and How". The Long Island Java Users' Group has, officially, 28 members, of whom about 7 were in attendance; I'm told this was the most well-attended meeting yet. So it was an intimate, informal affair: the pizza arrived about 2/3 of the way through my talk. At least three of the seven people in the audience had previously been to one of my Program By Design workshops, so they already had some grounding in functional programming, but may not have given much thought to doing it in what's not traditionally considered a "functional language". The last section of the talk was a preview of Java 1.8, which when it comes out in 2014 will have a number of features to make functional programming easier. Anyway, there were some good questions and good comments.

I spent Friday pre-cooking for an SCA business meeting hosted at our house. Hummus b'tahini, pasta with garlic yogurt sauce, olive tapenade, carrot slaw, roasted broccoli salad, larb gai, prosciutto-and-melon, and grilled lamb. (

shalmestere was at work, but she did a lot of the house-cleaning Thursday and Friday, made a batch of comfort-food-from-your-childhood lemon cake, and did most of the plating.) There were maybe twenty people, the food went over well, no last-minute disasters, and even the SCA business discussion was tolerable. And we now have leftovers for weeks, and no room in the fridge or the freezer (which is why I'm not at the farmer's market this morning).

I'm teaching a class on design and analysis of algorithms, and trying to have as much of the class as possible led by the students. All this week the students have been presenting homework solutions, and today one of the sharper students presented a comparison between a naive implementation and a cleverer, "optimized" implementation of the heapsort algorithm. He walked us through his code, both heapsorts and the timing code wrapped around them, ran through a toy-sized demonstration of each on the board, and then I said "So, do you have results?"

"Yes, I do." He ran the program: the naive implementation took 1.5 milliseconds, and the optimized implementation took about 30 microseconds. Beautiful!

He ran it again, with similar (though not identical) results. And again, and again. Pretty convincing... except that he was sorting an array of 8 numbers, on which even the stupidest heapsort shouldn't take 1.5 milliseconds. Furthermore, looking at the algorithms, I concluded that the speedup should be at best a factor of 2, not a factor of 50. So I asked him, on a whim, to swap the code segments to run the optimized implementation first, and the naive implementation second.

The optimized implementation took 1.5 milliseconds, and the naive one 30-some microseconds. Consistently. The presenter looked like he'd been hit with a pole-axe. The rest of the class were amused, but equally puzzled. We discussed possible explanations, and eventually concluded that it probably had to do with loading a Java class file from disk on the first demand for it. (1.5 ms actually sounds low for that, unless he has a solid-state drive.) After suggesting a couple of ways to correct for this, the students agreed to sort another array first, without timing it, and then try the timed runs. They also suggested moving the timing code in closer, so that it only measured the parts of the code that differed from one algorithm to the other.

The optimized implementation took 30-some microseconds, and the naive one 20-some microseconds. That again, and again, and again. This wasn't such a shock, really, as the presenter's own toy-sized demonstration had actually done more work under the "optimized" algorithm than the naive one; it was predicted to work better, in the average case, on reasonably large heaps. So we tried size 8000, and the naive algorithm still won. Then 10000000, and it was pretty close: the optimized algorithm won one match out of five, and lost only narrowly the rest of the time. Then 20000000, and the optimized algorithm won fairly consistently.

So some valuable lessons were learned about empirical efficiency-testing. Not what we set out to do, but you take teachable moments where you find them.

Current Mood: amused

Google computer scientist Neil Fraser spent some time in Vietnam and took the opportunity to visit some elementary-school CS classes. Here's his report.

Current Mood: impressed

Java's constructors have annoyed me for years. They're called and defined with a different syntax from any other method, and they can't be inherited. If objects of a bunch of classes are each supposed to be able to create duplicates or near-duplicates of themselves, each such method has to be written separately, no matter how similar they are, because they all call different constructors. And Java constructors always produce a new instance of the specified class -- they can't return an existing instance, they can't return an instance of a subclass of this one, and they certainly can't return an instance of another class that implements the same interface. All of which means they're heavily tied to implementation, and should not be exposed in an API.

One can get around some of this by using pseudo-constructors, aka factory methods. If the whole purpose of a factory method is to return a (possibly) new object, one must be able to call the method without already having an object of the class, so it has to be static. Unfortunately, static methods can't be abstract, so they can't be specified in an interface; if a dozen classes all have static methods with the exact same signature, you still have to specify them all separately.

Current Mood: annoyed

I teach computer programming, and I tell my students to always write test cases for their program before writing the program itself. In other words, before you've started solving the problem, you need to specify clear criteria for what it would mean to have solved it successfully: if I run the program on this input, I should get that result. This process helps my students clarify in their minds what the problem really is, makes it easier to solve the problem, and makes it less likely that students under last-minute time pressure will skip testing their programs.

More generally, when you're solving any kind of problem, it makes sense to specify clearly what results you hope to achieve and how you'll know when you've achieved them, before you start trying to come up with solutions. (If you're a politician, of course, you're less concerned with whether you successfully solved the problem than with the perception of whether you successfully solved the problem... so you're better off not specifying the desired results in advance, but retroactively defining them as whatever you accomplished, thus guaranteeing success. But we're not politicians here.)

Many of the problems I solve in daily life have to do with teaching. So it makes sense to specify what I expect students to learn, and how I'll tell whether they learned it, before starting to teach. It could even be argued that if I can't think of a way to check whether students achieved the desired learning outcomes, I don't really understand them myself well enough to teach them.

This is where we get the high-stakes, standardized tests so prevalent in modern education. Several things go wrong with this sort of testing:

The time spent on testing itself can start to detract from teaching. More testing is not necessarily better testing.

Even if an outcome is "measurable", that doesn't mean it's easily and inexpensively measurable. When learning outcomes are incorporated into a large-scale standardized test, there's an automatic bias in favor of things that are easily measured for lots and lots of students -- multiple-choice questions, essay questions with a checklist of points to address, etc. This may not accurately reflect many of the things we actually want students to learn.

When student testing is used to measure the effectiveness of individual teachers and schools (which, in theory, should be a perfectly good use of the data), it means most of the people involved in the testing process have an incentive to cheat: students, teachers, and local administrators. The only people with an incentive to see the tests administered fairly and accurately are those farthest from the testing process: the graders and the education reformers in government and central administration. Even if "cheating" doesn't take the form of getting "correct answers" illicitly and passing them around, it can mean "teaching to the test."

"Teaching to the test" is not a bad thing if the test actually reflects what you want students to learn. (Nobody would complain about a programmer "programming to the test," and in fact a prominent school of software development says you should almost never write code except to pass a specific currently-failing test.) But in combination with issue 2 above, the tests' existence skews the teaching process in favor of things that are easily tested, as above: memorizable objective facts and recipe-following are preferred over creativity and analysis.

Discuss. Any ideas for how to fix this? Have I framed the problem wrong in the first place?

Posted via LiveJournal app for iPhone.

Thanks to

osewalrus, Eben Moglen's keynote address at Freedom To Connect 2012:

The whole video is about an hour and a half -- an hour of prepared talk and half an hour of Q&A -- but to whet your appetite, here are some excerpts:

( excerpts )

Following up on this post...

Nobody in Monday's or Wednesday's class complained at all about doing the assignment in Scheme. In fact, they've finally buckled down to actually work on the problem, which was due two weeks ago. (This course isn't a prerequisite for anything else, so I can let deadlines slide if necessary to make sure everybody actually "gets it".)

I've put the lectures and new material on hold, and have been mostly just helping students get their programs working. One team realized, by the time they had passed one particular tricky test case, that they had used three higher-order functions in four lines; doing the same thing in Java or C++ would have been probably fifty lines of code.

Current Mood: amused

The students in my Principles of Programming Languages course have been kvetching because I asked them to do their interpreter project in Scheme. They only learned Scheme last semester, and they already know it's not a "real" language, so it's outrageous and arbitrary of me to ask them to write something substantial in it. I had pointed out that it makes parts of the assignment easier, and they replied (reasonably enough) "but if it takes more work to learn the language, we haven't really saved anything."

So yesterday in class I said "OK, I give in. This course isn't about Scheme, it's about interpreters and language features, so you're welcome to do this assignment in whatever language you like. Would you like to do it in Java?" Massive acclaim. "Let's work through Version 1 of the assignment [which they had all already done in Scheme] together in Java; once we've got that working, you can do all the subsequent versions on your own." I opened Version 1 in Scheme for reference, and started translating at the board.

The first datatype definition wasn't too hard. The 6 lines of Scheme code became 70 lines of Java, not counting blank lines and comments, but the Java code was mostly very familiar boilerplate, so that's not too bad. (Had to write constructors, getters, toString() methods, and equals() methods. Technically, should have overridden hashCode() too, but these students have never heard of hashCode().)

Then we got to a part of the Scheme code that relied on Scheme's built-in lists, which can contain anything, including other lists. Java has ArrayLists, but it's sort of a pain to work with heterogeneous and nested ArrayLists, so we started building a Java datatype to correspond to Scheme lists. That took about 90 lines of Java code, not counting comments, blank lines, and test cases. Except that the constructor wasn't very convenient to use: the students all agreed it really should take in a string of the form "(3 + (4 / 5))" or something like that. At which point the class ended.

These students haven't taken compiler construction. They've never written a lexical scanner, much less a hierarchical parser, so it didn't occur to them that this constructor, which approximates the functionality of Scheme's built-in "read" function, might be non-trivial. I just wrote it myself outside class, trying to keep things short, simple and clean: it took 88 lines for the reader itself, plus 87 for supporting data structures and 171 for test cases. (The reader is complicated enough that I really needed all those test cases, and many of them failed at least once.)

Anyway, we've just added 265 lines to the Java program to duplicate functionality that came for free in Scheme. Now we can get back to the original problem....

This will probably delay the course by a couple of days (especially if I don't give them my reader code), but if it conveys the lesson that sometimes it IS worth learning a new language in order to make the program easier, it'll be worth it.

[Followup Feb. 10:] At the beginning of class today, at least one student was already saying "let's go back to the Scheme version." I didn't; I wanted to use this program to motivate and illustrate a couple of Java programming patterns. So we got this minimal interpreter working (using the reader I had written), but we'll be back to Scheme on Monday.

Current Mood: amused

I'm teaching a Survey of Programming Languages course in the fall. The goals of the course are to teach students how to learn a language, and to introduce students to ideas and techniques they didn't see in their first-year Java courses; if they happen to learn a language that gets them a job, that's a bonus. They're familiar with (if not necessarily good at) class-based OOP and imperative programming; they haven't seen higher-order functions, closures, continuations, comprehensions, macros, or declarative programming in general. Most of them haven't seen any kind of parallel or multithreaded programming, nor network programming.

Traditionally, this course has been about 50% C++, 25% Scheme, and 25% Prolog, but the C++ content has been moved to another course. After polling the preregistered students by e-mail, I've decided to fill the gap with some reasonably-modern scripting language. Leading candidates so far are PHP, Python, Ruby, Lua, Erlang, and Scala.

I can make an unbiased choice among these because I don't really know any of them (although I've written some PHP-based server-side web scripts). So I'll be learning them just ahead of the students :-)

Any advice?

So I finished with what I had to lecture about, and told my students to spend the remaining 15 minutes of class working on their homework, while I walked around looking over their shoulders and offering help. Two of them were collaborating on a programming assignment and had turned it into a Google Wave, with a gadget that knows various programming languages (including Scheme, which we're using in this course) and automatically syntax-colors whatever you type. Kewl.

Current Mood: amused

So I'm trying to write up a solution to the homework assignment I just gave my students -- a binary search tree of (key,value) pairs in C++.

From my experience in Java and Scheme, I have a strong bias in favor of immutable data structures. So I wrote a BST class that presents a stateful "front end", but which is actually implemented in terms of immutable trees.

What should one be able to do with a BST-based dictionary? Obviously, create an empty one; insert a specified (key,value) pair; check whether a given key appears; look up the value associated with a given key; etc. I decided to leave out "remove" from the assignment, since that's harder to do to a pure BST (there are ways around it, but I didn't want to get that complicated). But it should be easy to replace the value associated with a given key.

Now, if the data structure were supposed to be mutable, I would just search through the BST, change the "value" field, and be done with it. If I were working in Scheme or Java, I would search through the BST, create a new node with the modified value but the same key and subtrees, and rebuild the answer from there on up; the old nodes will eventually get garbage-collected. So I try to do that in C++... but what happens to the old node? At some point it (and everything above it in the tree) will have to be deleted, or I'll get memory leaks. The obvious C++ destructor says "just before deleting me, delete both of my subtrees." But I just copied those subtree pointers to the new node, so they'll get clobbered.

Aha: I'll zero out those pointers before deleting the old node... but that counts as mutating, even though it's just a nanosecond before the node goes away completely. Maybe I should dumb-down the destructor so it DOESN'T automatically delete both subtrees; instead, I'll do that with an explicit call to a "cleanUp" member function before deleting. No, that doesn't work either because "cleanUp" counts as mutating, even though it's just before the node goes away completely.

The only alternative seems to be that I have to rebuild not only everything from the node-to-change UP, but also everything from that node DOWN. Which feels wasteful and inefficient -- exactly what C++ is NOT supposed to be.

Is it really true that in a non-garbage-collected language you can't have immutable data structures share substructures safely?

Side note: Oddly enough, the C++ compiler doesn't complain in the slightest about "delete this;" appearing in the middle of a member function.

Current Mood: puzzled

I just spent a week more-or-less lecturing from 9AM-5:30PM (plus an after-dinner session Wednesday night), for a baker's dozen of other computer science faculty interested in learning new ways to teach beginning programming. I'm still jazzed about the material, as are (apparently) a majority of the workshop participants, although several commented on the brutally fast pace (we covered the equivalent of a 3-credit college course in the first three days). And one bitched me out for saying "Forget what you know about sorting algorithms; just follow this recipe;" she felt this was insulting to women who have gotten where they are only by learning and knowing stuff. Perhaps a better way of putting it would have been "Try to see and do this assignment through the eyes of a beginning CS student, not somebody with a Ph.D. in CS."

Anyway, I got home Friday night utterly drained, but sorta triumphant at the same time: there were no major disasters, I didn't look like a total fool, and most of the participants seem to basically "get" the most important points of the workshop.

Today was spent recuperating, catching up with chores and bills that had fallen by the wayside, etc. Tomorrow: mow the lawn, and clear a bunch of stuff out of the garage so the professionals can replace the garage doors on Monday. Monday: go to the chiropractor, go to the bank for a cashier's check to pay the garage-door people, do some more cleaning and chores, prepare a class on ornamenting dance music, photocopy some sheet music, maybe work on the textbook or the Ostgardr web site redesign....

Current Mood: tired

A couple of meetings on campus, and worked on Da Book. Microsoft is the Anti-Christ: some of this stuff would be so much easier in LaTeX. I may yet switch back to LaTeX....

Profile

hudebnik

July 2025

S	M	T	W	T	F	S
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30	31

Syndicate

Page Summary

Active Entries

Style Credit

Style: Neutral Good for Practicality by timeasmymeasure

Expand Cut Tags

No cut tags

Page generated Jul. 6th, 2025 09:39 am

Entries tagged with computer programming

Dream journal

Annual e-mail from my publisher

Another reason for Trump to talk about rigged elections

Functional programming without a functional programming language

dream journal

Heartbleed

First day of class

The week in review

That was fun (classroom report)

CS education in Vietnam

Java annoyances

Education and testing

An inspiring talk about freedom, hacking, innovation, and economic recovery

Programming languages, continued

Comparing programming languages

Call for geek help: scripting languages

Brave new world...

C++ geekery

Da Werkshop iz Ovah

(no subject)