Software engineers use a humorous approach to teach computers
how to predict our tastes.
Article by Regina Kirchweger
Illustrations by Emma Skurnick
This joke might not elicit even a little twitch on your face, whereas others might find themselves slapping their thighs in delight. Most probably your reaction depends on your cultural background, your environment and, of course, your sense of humor.
For millennia, philosophers and psychologists have racked their brains to define the sense of humor. As with all matters of taste, however, it is not that simple to predict what will make us laugh. But now Ken Goldberg, an engineering professor at the University of California, Berkeley, and his students have created a unique Web site, Jester 2.0 Jokes for Your Sense of Humor [1], that tries to do just that. Based on how users rate a set of sample jokes, Jester 2.0 recommends new ones that readers should progressively find more amusing. Doubting that a computer program could possibly know what I would find funny, I tried it. And to my surprise (and delight), after I had read and rated a succession of jokes, the jokes did become funnier - at least initially.
But besides tailoring jokes to personal preferences, Jester is primarily an experiment to come up with faster, more efficient ways of grouping people with similar tastes. Called "collaborative filtering" or "recommender systems," the technique is increasingly being used by Internet companies such as Amazon.com to record consumer preferences and boost their sales by recommending books to groups of people with similar likes and dislikes. But so far, online retailers rely mainly on previous purchases to generate recommendations, since the collaborative-filtering method breaks down quickly if people haven't read each book in a list of examples.
Jokes don't have that problem. "With jokes, you can form an opinion in about 30 seconds," says Goldberg. And most important, people love jokes. Nobody would rate an endless series of books or CDs just to allow a couple of strangers to improve their computer program. So, researchers are looking to humor as an easy way of categorizing people - with the added advantage that their subjects will have so much fun doing it that they won't realize they are being parceled, binned, and sorted.
Psychologists explain this desire to laugh by pointing to the positive functions of humor. Humor helps us cope with the harsh reality of existence and lets us keep a smiling attitude toward life and its imperfections.
In the form of jokes, humor releases negative feelings in a socially accepted form. Jokes about the boss usually don't get you fired. Sigmund Freud suggested that jokes we perceive as funny touch on anxiety-arousing themes, topics we usually don't talk about. Thus, a joke initially evokes feelings of anxiety, but the punch line suddenly relieves the negative feelings and transforms them into laughter. The pleasure of the joke derives from this sudden reduction in anxiety. Greater reductions in anxiety are associated with greater pleasure and mirth. But if the punch line doesn't come as a surprise, the effect is lost. Therefore, most jokes take an unexpected turn, as in the following example, taken from Jester's repertoire:
Clinton has a clearly different perspective from his aide, making him sound ludicrous, although the answer in itself is perfectly justifiable. The listener's expectation is set up in a different frame of mind: a bill to be passed, not paid, stupid. "Sense of humor plays into the ability to change the perspective," says Willibald Ruch, a psychologist at the University of Düsseldorf, Germany.
Ruch studies the connection between different personalities and their different senses of humor. Based on the structure of jokes he developed a taxonomy. He distinguishes two main categories of jokes: incongruent ones, which are solved by the feeling of "getting the point," and what he calls "nonsense jokes," such as this example:
Many researchers before Ruch had put forth theories of humor and laughter. Interestingly, however, the great majority of these theories did not specifically address individual differences in the sense of humor. They tried to explain why we laugh in certain situations and not in others. Ruch has studied people's reactions to jokes and finds a wide variation among different people. But nevertheless, he says, "I do believe that individuals are so consistent that you can make predictions."
The attraction of jokes, the wide variation in people's reactions, and a person's individual consistency in taste preference make humor an ideal model system for computer programmers who try to come up with more accurate and faster filtering systems. Recommender systems try to act like a friend who knows you well and can predict which kinds of jokes will make you laugh or which books you will probably like. They automate the "word of mouth" process and come up with suggestions for movies, restaurants, or in this particular case, jokes. And people love it.
When Goldberg's team posted the second version of the Jester Web site last year, its immediate popularity surprised Dhruv Gupta, a former student in Goldberg's lab and the main developer of the algorithm behind it. He spent long nights in front of the server, watching with fascination as people from all over the world logged in. "It would be night in Berkeley but day in, let's say, New Zealand, and people were logging on from there. As the day progressed west you could follow it, first Australia, then Japan, China, all over Asia and finally Europe," Gupta says. After wirednews.com wrote about Jester 2.0 in its online magazine, word of e-mail spread the news so fast that the server crumbled under the onslaught of hits. As of last September, more than 40,000 people had registered. And now, I am one of them.
But still, the idea that a machine could possibly know what I would find funny or which books I would enjoy sounded strange to me. So, how does it work? When I logged onto the Jester 2.0 Web page I was presented with 10 jokes and asked to rate them on a seamless scale from not funny to very funny. Some jokes are harmless:
Others contain sexual innuendo:
Perplexed, I asked myself what made these jokes so special that they would allow conclusions about my sense of humor. "If a joke had a wide range of ratings it was put in the initial set of 10 jokes," Gupta says. This allows the system to record information on many different tastes. There was no selection based on content or the structure of the joke.
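Gupta's selection rule can be pictured with a short sketch. It is only an illustration under stated assumptions: the article does not say how the "wide range of ratings" was measured, so the variance used here, the function name pick_gauge_jokes, and the sample ratings are all hypothetical.

```python
from statistics import pvariance


def pick_gauge_jokes(ratings_by_joke, n):
    """Return the ids of the n jokes whose ratings vary the most.

    Assumption: "wide range of ratings" is approximated by the
    population variance; the article does not name the measure.
    """
    spread = {joke: pvariance(scores) for joke, scores in ratings_by_joke.items()}
    return sorted(spread, key=spread.get, reverse=True)[:n]


# Made-up ratings on the -5..+5 scale, for illustration only.
sample = {
    "joke_a": [5, -5, 4, -4],  # divisive
    "joke_b": [1, 1, 0, 1],    # everyone shrugs
    "joke_c": [3, -2, 4, -3],  # fairly divisive
}

print(pick_gauge_jokes(sample, 2))
```

Jokes that split the audience tell the system more about a newcomer than jokes everyone rates alike, which is presumably why content played no part in the choice.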
The first step is to figure out whose ratings are close to your own by comparing which jokes you liked and which ones you didn't with those someone else liked or didn't like. The result then tells you who has pretty much the same sense of humor and who hasn't. And because one person's recommendation might not be perfect, Jester combines ratings from several people.
Most collaborative systems work with this principle, called "nearest neighbor analysis." Just how the nearest neighbor is found varies. In the case of Jester 2.0, each user becomes a point in a "10-dimensional joke-space," as Goldberg calls it. Each joke is used to define another dimension of an individual's funny bone, comparable to describing a real person with 10 different physical features: rather tall, almost black hair, light blue eyes, and so on.
Let's assume that I find the first joke hilariously funny and rate it +5 on a scale from -5 (bores me out of my pants) to +5 (I am still choking). I will be assigned a point on the x-axis, namely x=5. I couldn't care less about the next one and rate it a straight -5, which gives me y=-5. The third joke is somewhat entertaining and receives a +1, placing me at 1 on the z-axis, and so on till I end up being a point defined in 10 dimensions, along 10 different axes. Since I am sharing this joke space with more than 40,000 other users, I will have lots of neighbors close to me, consisting of people who rated the jokes similarly (though probably not identically). Other users are far away, depending on their individual responses to each joke.
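To make the joke-space idea concrete, here is a minimal sketch that treats each user as a list of 10 ratings and ranks other users by straight-line (Euclidean) distance. The distance measure, the function name nearest_neighbors, and the three sample users are assumptions made for illustration; the article does not say which measure Jester 2.0 uses.

```python
import math


def nearest_neighbors(me, others, k):
    """Return the k user ids whose rating vectors lie closest to `me`.

    Assumption: closeness is plain Euclidean distance in joke-space.
    """
    return sorted(others, key=lambda user: math.dist(me, others[user]))[:k]


# Ten ratings each, on the -5..+5 scale (made-up data).
me = [5, -5, 1, 0, 2, -1, 3, 4, -2, 0]
others = {
    "ann": [4, -4, 1, 0, 2, -1, 3, 4, -2, 1],
    "bob": [-5, 5, -1, 0, -2, 1, -3, -4, 2, 0],
    "cara": [5, -5, 0, 0, 2, 0, 3, 2, -2, 0],
}

print(nearest_neighbors(me, others, 2))
```

With 40,000 users, comparing every pair this way gets expensive, which is the motivation for the shortcut the Jester team took next.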
To cut down on computation, the Jester team reduced the 10 dimensions to two by collapsing all of this data onto a plane, like dots on a sheet of paper. They do it in a way that preserves the users' relative positions to each other. Having arrived at two dimensions, close neighbors are lumped together. One can imagine a grid consisting of tiny squares being laid over the dots. Goldberg and his team had expected clusters of users, sharing specific senses of humor. To their surprise, they found an even distribution all over the chart.
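The grid of tiny squares can be sketched in a few lines. This assumes the projection onto the plane has already been done (the article does not name the method), and the cell size, the function names, and the sample coordinates are hypothetical.

```python
import math
from collections import defaultdict


def cell_of(point, cell_size):
    """Map a 2-D point to the (column, row) index of its grid square."""
    x, y = point
    return (math.floor(x / cell_size), math.floor(y / cell_size))


def group_by_cell(points, cell_size):
    """Lump users together by the grid square their point falls into."""
    cells = defaultdict(list)
    for user, point in points.items():
        cells[cell_of(point, cell_size)].append(user)
    return dict(cells)


# Made-up 2-D positions, standing in for already-projected users.
projected = {"ann": (0.30, 0.10), "bob": (-0.70, 0.40), "cara": (0.45, 0.20)}
cells = group_by_cell(projected, 0.5)

print(cells)
```

Looking up a newcomer's square is a single calculation, so no distances to the other 40,000 users need to be computed at request time.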
After the nearest neighbors have been determined, Jester 2.0 moves on to another database containing 100 fresh jokes. Choosing from this selection, the computer presents me with the one my nearest neighbors (I am sharing my little square with them) found the funniest, namely:
Goldberg's experience with Jester 2.0 made him a valuable resource for a recently started company, NetCustomize.com, where he serves on the board of advisers. Like Jester 2.0, NetCustomize.com's eLOL (electronic Laugh Out Loud) [2] offers free personalized jokes based on ratings. The users can specify type and content and how often they would like to have jokes delivered to the desktop. Being rewarded with a good laugh, people are signing up by the tens of thousands for eLOL to receive their daily dose of humor, serving unwittingly as guinea pigs. For the company has other goals in mind than making their customers laugh.
The ratings provide the feedback used to improve the filtering algorithms that the company will use to recommend videos and music later this year. Yet the company is not planning to sell CDs or videos. "The idea is to tailor the banner advertisements to taste," says Shuki Nir, CEO of the New York-based company. "We started with eLOL to prove that taste really matters - and it does."
After accumulating a wealth of data, Jester 2.0 and eLOL might not only serve as training grounds for collaborative filtering systems but also offer interesting possibilities for personality psychologists and humor researchers. This may not happen soon, however. Several psychologists have already approached NetCustomize.com but were turned down. "Privacy is a major issue. We would distribute our aggregated data only to non-profit organizations if we think their proposed projects are useful," says Nir.
Goldberg offered to share his data with research groups studying the psychology of jokes. However, Ruch, for one, is not enthusiastic about Jester's results. "The sample is small and not representative. You have to keep in mind who is surfing the Internet and who has time to [rate jokes]," says Ruch. The typical eLOL user, as determined in a survey on behalf of NetCustomize, is an Internet-savvy male between the ages of 18 and 35.
The same probably holds true for Jester 2.0 and might explain the predominance of engineering jokes I received. After getting a good laugh out of the first one, the novelty factor started to wear off quickly when four out of six jokes poked fun at engineers. The saturation point was reached, as humor researchers call it. "One may like blonde jokes, but after the fifth you have had enough. I see that as a weakness of Jester 2.0," says Ruch. Humor psychologists are interested in the topic and the structure of jokes. For them, the main limitation of recommender systems is that they rely on strictly statistical methods. The content of the jokes has no influence. But what can developers of filtering algorithms learn?
The wide variety of joke preferences and the willingness of customers to rate jokes proved very helpful for eLOL's technicians, who constantly analyze the performance of their algorithms. Because subscribers are routinely asked to rate recommended jokes, their feedback is used to verify the accuracy of predictions and to improve them. But besides helping with the mathematics behind the filtering algorithm, joke preferences might yield deeper insights into the tastes of potential buyers.
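One simple way to picture such an accuracy check is the average gap between predicted and actual ratings. The sketch below is only an illustration: the article does not say which measure eLOL's technicians use, and the function name and numbers are made up.

```python
def mean_absolute_error(predicted, actual):
    """Average size of the prediction miss over jokes the user rated.

    Assumption: mean absolute error stands in for whatever accuracy
    measure is really used; jokes without feedback are ignored.
    """
    shared = predicted.keys() & actual.keys()
    if not shared:
        raise ValueError("no jokes have both a prediction and a rating")
    return sum(abs(predicted[j] - actual[j]) for j in shared) / len(shared)


# Made-up predictions and subscriber feedback on the -5..+5 scale.
predicted = {"j1": 3.0, "j2": -1.0, "j3": 2.0}
actual = {"j1": 4.0, "j2": -2.0}

print(mean_absolute_error(predicted, actual))
```

A shrinking error over time would suggest the algorithm is learning a subscriber's taste; a growing one would flag a change that made predictions worse.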
Ruch has examined different aspects of taste and how they correlate. "I do believe that a general correlation exists between aesthetic forms. The picture on the wall correlates with the jokes one likes," he says. Gary Larson and modern art? "People who like nonsense jokes are attracted by complexity," says Ruch. Before NetCustomize.com moves on to recommending music, its marketing experts should take a careful look at their eLOL subscribers, because Ruch has more to offer. In one of his studies he looked for a correlation between music taste and joke preferences and discovered that, "Jazz lovers tend to more chaotic things, in this case nonsense jokes." But for now, he hasn't been invited onto NetCustomize's board of advisers.
Although humor undoubtedly has its merits, it is not a panacea for all the problems faced by programmers and online shops relying on recommender systems. Online customers are not usually willing to spend an hour rating a whole series of example items. Also, a joke delivered for free doesn't hurt the bank account, no matter whether it turns out to be funny or not. "For higher valued items people normally prefer more information than simply 'people like you recommended this item,'" says Paul Resnick, professor at the University of Michigan's School of Information and developer of one of the first collaborative filtering systems that successfully predicted music tastes.
Therefore, the current recommender systems of companies like Amazon.com are based solely on previous purchases. That leads to phenomena like books on gardening cluttering up my personalized list. It all began with last year's birthday present for Mum. Even worse, the author, who turned out to be hopelessly boring, still pops up, even though I bought just one of her books.
Systems that provide feedback through rating would circumvent this problem, if buyers can be motivated to take the time and voice their opinions. But what about the new book or CD that hasn't been rated yet? Commercial recommenders will face this problem frequently with newly available items. And those are exactly the ones they would like to show up on personalized lists.
Arndt Kohr, research assistant at the Eurécom Institute in France, is trying to come up with a solution for this dilemma. In his Active WebMuseum [3] he doesn't rely only on statistical methods, as does Jester 2.0. He also takes content into account. The paintings in his "museum" are indexed based on color and structure and grouped accordingly. Predominantly red paintings are closer to each other than to a mostly yellow one. If somebody likes a particular picture, chances are high that he or she will like a picture that is close to the first one, even if nobody else has ever rated it. To Kohr's chagrin, a chronic shortage of visitors to his Web page who are willing to take the time to rate paintings makes progress slow for him. Nevertheless, he has found "clearly a correlation between color or structure preferences."
Kohr believes that content-based filtering can also reduce the chance that users miss new items that are far off the main path. "If you have a new movie that is very underground and very strange, and nobody has rated it yet, you can use the content to recommend it."
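The content-based idea can be sketched as follows. Everything here is hypothetical: the three-number color description, the function name closest_unrated, and the painting names are stand-ins for illustration, not the indexing the Active WebMuseum actually uses.

```python
import math


def closest_unrated(liked, catalog, rated):
    """Return the unrated item whose features are nearest to a liked item.

    Assumption: each item is described by a short list of numbers
    (here, average red, green, and blue) and closeness is Euclidean.
    """
    candidates = [name for name in catalog if name != liked and name not in rated]
    if not candidates:
        return None
    return min(candidates, key=lambda name: math.dist(catalog[liked], catalog[name]))


# Made-up average (red, green, blue) values per painting.
paintings = {
    "sunset": (0.9, 0.3, 0.2),      # mostly red
    "poppies": (0.8, 0.2, 0.1),     # mostly red
    "sunflowers": (0.9, 0.8, 0.1),  # mostly yellow
}

print(closest_unrated("sunset", paintings, rated={"sunset"}))
```

Because the suggestion depends only on the painting's own features, it works even when nobody has rated the new item yet.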
There have been a few empirical studies of the prediction quality of collaborative filtering systems, and as it turns out, they tend to be accurate most of the time. However, Resnick, the pioneer of recommender systems, has his own personal experience. "When [recommendations] are off they are more likely to be way off than with direct suggestions from friends. There's no common-sense check to make sure the recommendation isn't silly," he says. Collaborative filtering or not, there is still nothing like a good friend who laughs with us about the same old joke.
Text © 2000 Regina Kirchweger
Illustrations © 2000 Emma Skurnick