hudebnik ([personal profile] hudebnik) wrote, 2023-03-13 07:34 am

Re-reading George Boole

We had some airplane travel this past week, and I grabbed a book for leisure reading: George Boole's 1854 An Investigation of the Laws of Thought. I'd read parts of it many years ago, but didn't remember a lot of details and thought it would be worth a revisit.

The first half of the book introduces a new way of writing logical statements and arguments based on the familiar notation of algebra (hence the modern term "Boolean algebra"), while the second half addresses the theory of probability, largely by generalizing from logic's two values (true=1 and false=0) to numeric values in between. I'm still in the first half on this re-read, so I won't say anything more about the probability stuff now.

When Boole wrote, not much had changed in logic instruction since Aristotle. The essential components of logic were statements of the form "All X are Y", "Some X are Y", "No X are Y", "individual x is Y", and "X means the same thing as Y", which can be combined in various ways called "syllogisms". Students memorized which of these syllogisms constituted valid inferences (e.g. "All men are mortal", "Socrates is a man", therefore "Socrates is mortal") and which invalid (e.g. "All men are mortal", "All fish are mortal", therefore "All men are fish").

Boole observed that most of this was really about classes and categories of objects, and he proposed to represent each class of objects with a letter, e.g. x, y, z, as in algebra. Putting two letters next to one another, the same notation as multiplication in traditional algebra, would mean the class of objects that have both properties, what we would now call the "intersection" of sets. Putting two letters together with a "+" sign in between would mean the class of objects that have one property or the other; a "-" sign would mean the class of objects that have the former property but not the latter; and so on. He's careful to point out that although the symbols are familiar from their use with numbers, and they obey some of the same rules as numbers, these are not numbers, and we can't assume that they obey all the same rules. For example, from xy = xz one cannot validly conclude y = z. But then, one couldn't conclude that in numbers either unless one knew that x ≠ 0.
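To make this concrete, here's a quick sketch in Python (my gloss, obviously not Boole's notation), modeling classes as sets; the example classes are made up:

    # Classes as Python sets. Boole's xy is intersection; his x + y is
    # union (allowed only when the classes are disjoint); x - y is difference.
    men     = {"socrates", "plato"}
    mortals = {"socrates", "plato", "bucephalus"}
    horses  = {"bucephalus"}

    print(men & mortals)    # xy: men who are mortal -> {'socrates', 'plato'}
    print(men | horses)     # x + y: men and horses (disjoint, so he'd allow it)
    print(mortals - men)    # x - y: mortals that are not men

    # From xy = xz one cannot validly conclude y = z:
    x, y, z = {"a"}, {"a", "b"}, {"a", "c"}
    assert (x & y) == (x & z)   # both intersections are {'a'}
    assert y != z               # yet y and z differ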

The first place he makes what would now be considered a mistake is when he defines "+" to only be meaningful between two classes that are known to be mutually exclusive, e.g. even numbers and odd numbers, or men and women (remember, it's 1854). He's aware of the issue, and points out that some people might do things differently, but settles on this definition. As a result, he can say with confidence that x + y - y = x, which wouldn't be true using an inclusive definition of "+". On the downside, if I read him correctly, it means one can't tell whether a particular expression is meaningful without knowing the meanings of the individual symbols. If I write something about the class of "sentient beings and fish", secure in the knowledge that there are no sentient fish, and then a sentient fish is discovered, I have to go back and re-write all my algebra.
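In set terms (again my sketch, not his), the trade-off looks like this: the identity holds for disjoint classes, but fails under an inclusive union as soon as the classes overlap.

    evens, odds = {0, 2, 4}, {1, 3, 5}
    assert (evens | odds) - odds == evens    # disjoint: x + y - y = x holds

    # With an inclusive "+" on overlapping classes, the identity fails:
    x, y = {1, 2}, {2, 3}
    assert (x | y) - y == {1}    # element 2 was lost along with y
    assert (x | y) - y != x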

Likewise, he defines "-", as in x - y, to be meaningful only when y is contained in x. As a result, (x + y) - y is meaningful whenever y is disjoint from x, while (x - y) + y is meaningful whenever y is contained in x; it's quite possible for either one to be meaningful without the other, and impossible for both to be meaningful at once unless y is empty, and none of this seems to bother him. Indeed, he asserts that any sequence of additions and subtractions can be rearranged freely, for example x - y = (-y) + x, despite never having defined what "-y" means in its own right. This seems to me a serious problem, since there is no possible interpretation of "-y" that makes it behave as he wants it to behave: there is no class of objects which, when "added" to x, yields a result properly contained in x. But this doesn't bother him: as he warned earlier, there doesn't have to be an interpretation of every step in a chain of reasoning, as long as the starting and ending points are interpretable.
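One way to see the asymmetry concretely is to enforce his restrictions as explicit preconditions (hypothetical helpers of my own devising, not anything in the book):

    def boole_add(x, y):
        """Boole's +, defined only on mutually exclusive classes."""
        if x & y:
            raise ValueError("+ undefined: classes overlap")
        return x | y

    def boole_sub(x, y):
        """Boole's -, defined only when y is contained in x."""
        if not y <= x:
            raise ValueError("- undefined: y not contained in x")
        return x - y

    x, y = {1, 2}, {3, 4}                   # disjoint, and y not contained in x
    print(boole_sub(boole_add(x, y), y))    # (x + y) - y is meaningful: {1, 2}
    try:
        boole_add(boole_sub(x, y), y)       # (x - y) + y is not
    except ValueError as err:
        print(err)                          # "- undefined: y not contained in x"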

Anyway, from his notational definitions and straightforward observations about how logical statements behave in the real world, he concludes that xy = yx regardless of what classes x and y represent (analogous to the familiar "commutative" law of multiplication), x + y = y + x (analogous to the commutative law of addition), x(y + z) = xy + xz (analogous to the distributive law of multiplication over addition), and xx = x (which is decidedly not true of numbers, unless you restrict them to the values 0 and 1).

Extending the analogy, he observes that familiar numeric notation has two special values 0 and 1, distinguished by 0x = 0 and 1x = x regardless of x. In the logical setting, the same is true if 0 represents "Nothing" (or in modern terminology "the empty set") and 1 "Universe" or "Everything". He then writes 1 - x, reasonably enough, for "all the things that are not in class x". Since he has already observed that xx = x, and the former can reasonably be notated x², he has x² = x, hence x - x² = 0, hence x(1 - x) = 0, whose natural interpretation is that no object can both have and not-have the same property -- a standard law of logic, but not one he has previously assumed, and he's managed to derive it from the idempotence of "multiplication". Cute.
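Since the variables only ever take the values 0 and 1, all of these identities can be verified by brute force (a trivial modern check, not something Boole needed to do):

    from itertools import product

    for x, y, z in product([0, 1], repeat=3):
        assert x * y == y * x              # commutative "multiplication"
        assert x + y == y + x              # commutative "addition"
        assert x * (y + z) == x*y + x*z    # distributivity
        assert x * x == x                  # idempotence: true only of 0 and 1
        assert x * (1 - x) == 0            # the derived law of non-contradiction
    print("all identities hold over {0, 1}")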

Revisiting the question of inclusivity, he then translates "all the things with property x or y or both" as x + y(1-x), i.e. the things that are x, plus the things that are y but not x. Which works, but it strikes me, with my modern-logic training, as overly cumbersome. Extend it to three variables and you get things like x + y(1-x) + z(1-x)(1-y), and that way madness lies. Likewise, he translates "all the things with property x or y but not both" as x(1-y) + y(1-x).
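Cumbersome or not, the translations do check out numerically over 0/1 values (my verification, not his):

    from itertools import product

    for x, y, z in product([0, 1], repeat=3):
        assert x + y*(1 - x) == max(x, y)           # inclusive or
        assert x*(1 - y) + y*(1 - x) == (x != y)    # exclusive or
        # the cumbersome three-variable inclusive or:
        assert x + y*(1 - x) + z*(1 - x)*(1 - y) == max(x, y, z)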

Since classical logic is all about syllogisms, he then shows how to translate all the statements of classical syllogistic logic. "X means Y" becomes x = y. "All X are Y" is a little trickier, since he doesn't have a notation for "subset": he writes x = vy where v is a new class about which nothing is known except that it has elements in common with y (a stipulation that I suspect will get him in trouble later). Once he's got that trick, he can similarly write "No X are Y", or equivalently "All X are not Y", as x = v(1-y) and "Some X are Y" as vx = vy (again, v is assumed to have elements in common with the thing it's in front of; he's not clear on whether the two v's in this equation are supposed to be equal to one another).
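For contrast, the modern reformulation (not Boole's v-device) renders "All X are Y" as x(1-y) = 0, and the classic Barbara syllogism then falls out of a brute-force check over 0/1 values:

    from itertools import product

    # Barbara: "All X are Y", "All Y are Z", therefore "All X are Z".
    for x, y, z in product([0, 1], repeat=3):
        if x*(1 - y) == 0 and y*(1 - z) == 0:   # both premises hold
            assert x*(1 - z) == 0               # so does the conclusion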

I had originally written, above, that he translates "No X are Y" as xy = 0, but on re-reading I realize that he doesn't give that translation. Which is a pity, since it fits cleanly into his notation and isn't semantically problematic.

[Update, 24 March:]

I've read a bit farther, and I think I have a better idea why he insists on defining + and - with semantic restrictions: because he really wants to use familiar algebraic manipulations as though logical statements literally were numbers. He writes all sorts of algebraic expressions, like (x-y)/(y-2z) + 3z², evaluates them with specified values for the variables, and uses the results even when they involve 1/0 or 0/0. He wants to be able to do anything with logical variables that he can do with numeric variables, subject only to the constraint that every variable can take on only the values 0 and 1. So a definition that makes

x + y - y ≠ x ≠ x - y + y

would really mess up his plans, even though it has other philosophical benefits such as "if A and B are both meaningful expressions, then A + B and A - B are meaningful expressions regardless of the meanings of A and B." Indeed, he says very explicitly that he's OK with apparently meaningless expressions creeping in along the way as long as they disappear by the final conclusion. To me, this calls into question the validity of his inferences (if they don't even preserve meaningfulness, how can they possibly preserve truth?), but he's not trying to prove that his inferences are valid: he knows they're valid, because they're the familiar algebraic operations on numbers.
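His attitude is easy to emulate: treat the logical values as literal integers, and intermediate results may wander outside {0, 1} so long as the endpoints land back inside (my illustration, not an example from the book):

    for x in (0, 1):
        for y in (0, 1):
            intermediate = x + y           # can be 2: no class reads that way
            assert intermediate - y == x   # yet the endpoint is interpretable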

More later.

[personal profile] watersword 2023-03-13 04:27 pm (UTC)
I did not realize that Boolean logic is named after an actual person, and the knowledge delights me.