Saturday, September 30, 2006

Dancers of size

This week's Bay Guardian has the quintessential San Francisco story. It perfectly captures many of the things I love and hate about this ciry: the political ideals, the attitude towards sex, and the litigiousness.

Thursday, September 28, 2006

Crypto Misdemeanor

I fear that something like this will happen to me before the end of the term at IPAM.

xkcd comic by Randall Munroe

Saturday, September 23, 2006

More on Chavez and Thailand

According to an opinion poll from the Bangkok Post, more than 80% of the population supports the coup in Thailand. Meanwhile, the military junta is sending orders to the troops on the ground: Smile.

In his speech, Chavez expressed regret that he was never able to meet Chomsky before his death. Chomsky, of course, is alive and well, and his book that Chavez brandished at the UN shot up in the best-sellers list.

Wednesday, September 20, 2006

What's new in the world

Venezuela President Hugo Chavez spoke at the UN today. He first advertised a book of Chomsky's. Then "The devil was here yesterday," he said referring to President Bush, who had spoken earlier on his vision for the Middle East "it still smells of sulfur." He then made the sign of the cross. There is a video here.

Meanwhile, Thaksin Shinawatra, the Berlusconi of Thailand, was ousted in a surprisingly peaceful military coup. Here is an excellent blog coverage of the events in Bangkok. The highly respected King has endorsed the coup.

Tuesday, September 19, 2006

Protecting marriage

The Onion, whose fake news coverage has lately been quite disappointing, has a great article on protecting marriage from sharks. My favorite paragraph:
According to recent polls, only 22 percent of voters who live in shark-infested areas on either of the country's coasts say they are "very worried" about the damage sharks could wreak on married couples, while that number jumps to 86 percent in more conservative, landlocked, regions of the South and Midwest

A while ago, the Fafblog had a similar (and better) article on tainting the octupus.

What's new in Math

A group blog has been set up on the subject of concurrency theory. A plurality of contributors is named Luca, and Luca Cardelli has not even joined yet.

There is a new sum-product theorem for finite fields, improving previous results by Bourgain, Katz and Tao and by Konyagin. This is the kind of result that is used in some recent constructions of extractors.

The MacArthur "genius awards" have been announced. The awardees include Terry Tao, Manuel Blum's former student Luis von Ahn (of captcha and ESP game fame), and Berkeley computer scientist and Stanford areonautical engineer Claire Tomlin. Jon Kleinberg was one of last year's winners.

Harvard mathematician Shing-Tung Yau has set up a web page to reply to the New Yorker article by Sylvia Nasar and David Gruber. (Via Scott.) The page contains a letter to the New Yorker by Yau's attorney, a prominent Boston lawyer who has already won a \$2.1 million defamation suit against a newspaper. Don't miss tomorrow's webcast, at noon EDT - 9am in California. Try doing "whois" to see who registered the domain name.

Sunday, September 17, 2006

The Italians of Asia

It's the Koreans, according to this amusing article

Thunder rain sky spirit forbids to hit hand machine

Last March, before going to Beijing, I thought I would try to learn a few useful characters and sentences. As it happened, I did not have enough time to really learn anything useful before the trip, but I have been fascinated by the language ever since, and I have continued studying. I hope this is not a metaphor for the relationship between theory and practice in computing.

After having lived in the US for almost ten years, the way I pronounce interesting, pseudorandom, and other long words is still the butt of jokes, so I am under no illusion of ever speaking understandable Mandarin. I would like, however, to make some progress on reading and writing Chinese and on understanding Mandarin as spoken by a Beijinger or a Taiwanese.

(If you speak Chinese, either stop reading here, or by continuing reading, you pledge not to make fun of my neophyte enthusiasm.)

An educated Chinese speaker knows at least 5,000 characters, and a basic level of literacy corresponds to about 2,000 characters. I hope to eventually learn the 1,067 characters in the main part of this book. There is a method to the madness of so many characters. There are about 200 basic components, called radicals, of which all characters are made of. In the simplest cases, the radicals combine to give the meaning: for example the character 好(hao) is a combination of the radicals 女, "woman," and 子, "child," and it means "to love," "to be good," and also "good" as an adjective, or 安(an) is a combination of the radicals for "roof" and for "woman," and it means "peace." (There is peace if there is a woman in the house.) In other cases, one combines a similarly pronounced character, which suggests the pronounciation, with a radical that suggests the meaning. For example 客 (ke) means "guest" and contains the radicals for "roof," "to follow" and "mouth." The explanation is that if combines "roof," which suggests the meaning, with the character 各 (ge) which suggests the pronounciation. Why 各 (ge), which means "each," is made of "to follow" and "mouth," I have no idea.

Knowing many characters is not, however, enough to have a good vocabulary. Many words, in fact, are composed of two (sometimes three) characters. Sometimes, the combination makes perfect sense. For example, 电 (dian) means "electricity," 视 (shi) means "to look at" and 机 (ji) means "machine," hence 电视机 (dianshiji) "television." Or consider that 避 (bi) means "to avoid," 孕 (yun) means "(to be) pregnant" and 套 (tao) means "case" (as in pillowcase), hence 避孕套 (biyuntao). Other combinations are strange, for example 太 (tai) means "too" (as in "excessively"), but 太太 (taitai) means "wife," or 东 (dong) means "East," 西 (xi) means "West" and 东西 (dongxi) means "something."

Anyways, now that I have learnt a little bit of the language, I thought I would go back to some pictures of signs that I had taken in China and see if I could reconstruct what they meant.

So here is one sign:

I start by looking up the characters in a dictionary, but how do you look up a character in a dictionary? There is a shortcut if you know the pronounciation, but what about a character you know nothing about? We said each character is made of a set of radicals, and one radical is considered the "main" radical for the character. I don't quite understand how you recognize it, but at worst one can do trial and error. Another fact is that by looking at a character it is typically possible to reconstruct how it is supposed to be drawn, and how many strokes it takes to draw it. With this information (main radical and total number of strokes) you go to the dictionary, which has an index of radicals, and then, for each radical, all characters that have it as a main radical, ordered by number of strokes, and you find your character. It is interesting that the way we look up a word in a dictionary for an alphabetic language is essentially binary search; here, instead, we have more of a hash function that maps a character to the pair (radical,strokes), and collisions are handled by linear search.

Back to the picture. We have the characters
雷 (lei) 雨 (yu) 天 (tian) 气 (qi) 禁 (jin) 打 (da) 手 (shou) 机 (ji)

Where 雷 (lei) means "thunder" and 雨 (yu) means "rain," so together they are "thunderstorm." Then we have 天 (tian), which means "heaven" or "day," and, in this case, "sky" and 气 (qi) which means "breath," "energy" or "soul." Is it heavenly spirit? No, 天气 (tianqi) means "weather," and it's a two-character word. So the first part is sort of "thunderstorm weather." Then 禁 (jin) means "to forbid." 打 (da) means "to hit," and sometimes it means "to play," as in playing a musical instrument or, more generally, operating a machine, especially one that produces sound. 手 (shou) means "hand" and (remember the TV) 机 (ji) means machine. The "hand machine" 手机 (shouji) is a cell phone. So
It is forbidden to use cell phones during a thunderstorm


(If you can't see the characters in this entry, and you are using Windows XP, go to start->control panel->regional options->regional options->languages and check the "Install support for East Asian Languages" box. It just takes a few seconds.)

Saturday, September 16, 2006

The second time as farce

Tuscany is a fierce place. Locals are famous in Italy for their imaginatively blasphemous way of swearing, their biting sense of humour, and their propensity for practical jokes. Citizens of different cities have rivalries that go back hundreds of years, and in some cities, like Siena, there are centuries-old rivalries between neighborhoods. Thanks to books like this, however, many Americans have an image of Tuscany as an extended, mellow, countryside where gentlemen sit in the gardens of their villas dipping fresh produce into olive oil, in the time that is not consumed by flirting with foreign women.

In fact, the theme of idyllic, if backwards, countryside/small town recurs even in the few Italian movies that achieve wide distribution in the US. (For example Io non ho paura or, a long time ago, Academy Award-winning Nuovo cinema Paradiso.)

Sometimes, people who have to listen to me complain about the above, or who are planning a trip to Italy, ask me what movies they could watch to get a sense of what Italy is like. Unfortunately, my first recommendations (Il Caimano or Aprile by Nanni Moretti, anything with Alberto Sordi) cannot be found in the US. Two good choices are Caro diario and La meglio gioventu', but it is L'ultimo bacio which comes to mind first.

(Note: I am not talking about the best recent movies from Italy, which are definitely Ozpetek's movies, but the best movies about Italy.)

L'ultimo bacio is mostly about the character flaws of the four male protaganists, all in their late 20s. The movie was a sensation among my friends (who were also in their late 20s and early 30s when the movie was released), and it spoke to them very personally. They saw an unflattering image of themselves, but, at the same time, the movie is sympathetic to its characters. I had already lived abroad for several years when I saw the movie, and it still felt too close for comfort. This was perhaps the most intensely and specifically Italian movie I had seen in a long time.

Now, however, there is an American remake. This sounds as implausible as an Italian remake of American Beauty, and I wonder what the producers were thinking and whether the movie will work at all.

Tuesday, September 12, 2006

The trouble with "nerd pride"

When the movie The Revenge of the Nerds was released in Italy, the word "nerd" was not translated because it had no analog in Italian. American movies set in high schools or colleges would always bring very foreign notions, such as fraternities, cheerleaders, school cafeterias, elective classes in high school and so on. But, just like after watching a few pirate movies you get a sense of the conventions of the pirate lifestyle, after seeing a few of those movies, they started to make sense. The notion of nerd, however, was more difficult to figure out. To be sure, we have terms of abuse for the academically achieving, and high school and college students are fond of creating identities and cliques. Such identities, however, tend (I should say, tended, in the late 80s and early 90s, I don't know how things are now) to be defined more by class and by politics than by other factors.

Then I spent a year at MIT, I saw Richard Stallman, I heard stories about him, and I finally understood. And so came the realization: I am one of them! For the non-American, see here and here for an explanation.

After Dr. Free Ride launched a nerd-off, Sean Carroll wrote an essay on the matter. I agree with every single word. On the one hand, it is right that there is no shame in having a specialized technical knowledge, be it on the Klingon language, on gender and class in Elizabethan poetry, on the PCP theorem, or whatnot. On the other hand, social awkwardness and a certain strain of anti-intellectualism (both highly associated to the "nerd" identity) are not things to be promoted.

Besides, what is really poisonous is the notion that technical knowledge and social inadequacy have to go together. There are surely more important and complex reasons that, from K-12 to college to grad school, push girls and women away from the study of science and engineering, but this association certainly plays some role. And that's not all: we all know a few girls and women who fit, and even embrace, the "geek" and "nerd" stereotype. And, if you are reading this, I am sure you know many white and Asian men who do as well (probably, you are one yourself). How many black men do you know who do?

(Here, I am not trying to follow the lead of Governor Schwarzenegger and say that blacks have it "in their blood" to be cool, or that women have a "grace gene." The point is that the dynamics of peer pressure can be very different in different groups.)

What about the solution of nerdifying the world? I am all for a society that does not look down on specialized knowledge (of any kind), but I think we already have enough men with ponytails and witty T-shirts as it is.

Saturday, September 09, 2006

Entropy and the weak Regularity Lemma

If you remember my discussion of Szemeredi's proof of the Szemeredi Regularity Lemma, the proof goes by progressively refining an initial arbitrary partition until one gets an eps-regular partition. A potential function is used to prove that the number of refinement steps is at most poly(1/eps). I said the potential function measures "variance" or "disorder" in the densities of edges between blocks of the partition. I used this terminology because I was aware of an alternate proof by Tao where the Regularity Lemma is reformulated as a statement about random variables and the proof uses conditional entropy (but not in a straightforward way) as a potential function. Tao's proof gives a stronger version of the Regularity Lemma, similar to another strong version that was proved by Alon, Fischer, Krivelevich and Szegedy, motivated by applications to property testing.

I still cannot understand the intuition of the stronger statement, the role of the parameter "F" in Tao's proof, and the role of the "fine" partition. The probabilistic interpretation, however, leads to an amusing three-line proof of the weak Regularity Lemma. (As we shall see, unfortunately, one of the three lines is too hard for me to figure out.)

Before stating the weak Regularity Lemma, let us define the notion of weakly regular partition. Let G=(V,E) be a graph, (V1,...,Vk) be a partition of the vertices, and for any two blocks of the partition define the density
d(Vi,Vj) := e(Vi,Vj) / |Vi| * |Vj|
where e(Vi,Vj) is the number of edges between Vi and Vj; it is also convenient to have the notation
d(Vi,Vi) := 2*e(Vi) / |Vi|2
where e(Vi) is the number of edges within Vi.

We say that the partition (V1,...,Vk) is weakly eps-regular if for every two disjoint sets S,T of vertices we have that the quantity
sum{i,j} |V_i intersection S| * |V_j intersection T| * d(Vi,Vj)

approximates e(S,T) up to an additive error of at most eps |V|2.

Finally we can state the weak Regularity Lemma
Weak Szemeredi Regularity Lemma. For every eps there is a c(eps) such that every graph G=(V,E) admits a weakly eps-regular partition (V1,...,Vk) where k=2{poly(1/eps)}.

Note that the number of elements of the partition is singly exponential in 1/eps, instead of being a tower of exponentials. This is a considerably improvement, but the weaker property of the partition is not suitable for many applications, including to property testing and to additive combinatorics.

The observant reader should see how to modify the original proof to fit this new setting. (Start from an arbitrary partition, if it is not good, use the sets S and T to refine, and so on; now each refinement step increases the size of the partition only by a constant factor, and, as before, we only have poly(1/eps) refinements.)

What about the information-theoretic proof? It will be a simplification of Tao's proof to the case of the weak Regularity Lemma. Define the following random variables: X and Y are two independent and uniformly distributed vertices, and E is the 0/1 indicator of the event that (X,Y) is an edge.

We use the notation H(Z) for the entropy of a random variable Z, H(Z|W) as the entropy of Z conditioned on W, I(Z,W):= H(Z)-H(Z|W) for the mutual information of Z and W and I(Z,W|R):= H(Z|R)-H(Z|R,W) for the mutual information of Z and W conditioned on R.

Finally, we come to the proof. We ask the question: Are there boolean functions f1,g1:V -> {0,1} such that I(E, (f1(X),g1(Y))) > eps? That is, after someone picks X and Y at random, is there one bit of information we can ask about X and one bit of information we ask about Y that helps us guess whether there is an edge between X and Y? Equivalently, are there sets S and T such that knowing whether X is in S and whether Y is in T helps us guess whether there is an edge between X and Y? If not, it can be checked that the number of edges between any two sets S and T essentially depends only on the size of S and T, and V itself is a weakly poly(eps)-regular partition. (It is a "partition" with only one block.)

Otherwise, we ask whether there are functions f2,g2 such that
I(E,(f1(X),g1(X),f1(Y),g1(Y),f2(X),g2(Y))> 2eps

If not, then for every two boolean functions f2,g2, once we know f1(X),g1(X),f1(Y),g1(Y), it does not help to know f2(X) and g2(Y) in order to guess whether (X,Y) is an edge. This means that if we partition V into 4 blocks (corresponding to possible values of f1,g1), then for every two sets S,T (corresponding to f2,g2), the number of edges between S and T essentially only depends on the sizes of the intersections of S and T with the four blocks of the partition. Hence, the partition is weakly regular.

Alternatively, we keep going, but, if we increase the mutual information by eps at each step, we cannot go for more than 1/eps steps. So there must be a k < 1/eps and 2k functions f1,...,fk,g1,...,gk such that, for all functions f,g, if we know the 4k bits f1(X),f1(Y),...,gk(X),gk(Y), then we gain less than another eps bits of information about E if we are also told f(X) and g(Y). Someone familiar with inequalities about entropy (this is the step I am missing) should be able to infer that if we partition V into the 22k subsets corresponding to all possible values of fi and gi, such a partition must be eps'-weakly regular with eps'=poly(eps). (Probably about eps1/2.)

Maybe this is not really three lines (in fact it is possibly more complicated than the "variance" argument), but I feel that one can see what is going on much better than in the "variance" argument.

Road trip!

On Wednesday, I took my legally polluting car to a trip to Los Angeles. Driving from San Francisco to Los Angeles involves mostly driving on highway 5, or, more or less, going straight for a few hundred miles. One starts from San Francisco, with its 65 degrees in the summer, its environmentalist car mechanics, and its dramatic hills, and suddenly one finds himself into a plain desert (not desert as in having few people, but desert as in sandy desert) with 105 degrees. Further ahead, highway 5 cuts through more fertile, but still very hot, rural areas. It is impossible to understand California politics, the election of Reagan and Schwarzenegger as governors, the passing of Proposition 22 and Proposition 209, the 45% of votes for Bush in 2004, without realizing that there are a lot of Californians who do not live in big cities or in University towns, and who do not hate our freedoms.

This may not be a novel observation, but Los Angeles is big. In San Francisco, and certainly in Manhattan, the size of a neighborhood is defined, in part, by walking distance. If you cannot walk between two places, then it would not occur to consider them "close." Here an apartment advertised as being close to the beach, say, may easily be farther from the beach than the Mission is far the Marina (these are two neighborhoods in San Francisco that are antipodal in character and quite geographically far as well), or even farther than San Francisco is from Oakland. So far I have driven just between Santa Monica, Westwood, Beverly Hills, West Hollywood and Fairfax, already an exceedingly vast area, and, on an LA map, this is all but a sliver of the city. (In fact, these are all places that are "close" to each other.)

I have decided to live in Santa Monica, near the 3rd Street Promenade, a stretch of a few blocks that is closed to traffic and is pedestrian-only. Pedestrians? In LA? I was surprised too, but I am getting the impression that it's all tourists. By the way, in Santa Monica there is a group of five streets named after famous universities: there is Yale, Harvard, Princeton, Stanford and Berkeley. No MIT.

Tuesday, September 05, 2006

Worst invention ever

In California, all cars must periodically submit to a "smog check" to verify that the catalytic converter works and that emissions are within a certain range.

Today I bring my car in for the check. When I pick it up, I ask "Is the car ok? Does it pollute?" "Of course it pollutes," says the car technician guy who just charged me \$142.99 for ten minutes of work, "This [the car] is the worst invention ever made. It's destroying our planet."

Friday, September 01, 2006

FOCS 2006

The program of FOCS 2006 is online. According to persistent rumors, the conference will be held in Berkeley.

The program lists Xi Chen (陈 西?) and Xiaotie Deng (邓 小铁?) as winners of the best paper award, for resolving the complexity of the problem of computing Nash equilibria, one of the fundamental (formerly) open questions in computational game theory. Nicholas Harvey receives the best student paper award.

Update (9/4/96): registration is open.

Expander graphs and their applications

A few years ago, Nati Linial and Avi Wigderson taught a course on expander graphs. The course lecture notes have been edited into an article that will appear in the Notices Bulletin of the AMS.