Project GAMR is live!

October 21, 2015

Occasionally I have written on these pages about my ideas on understanding players of games through analysis of their playstyles, as expressed in all kinds of gameplay variables. These ideas have been the basis of research I have been doing with some of my PhD students.

One of the most visionary and challenging approaches to research in this area was conceived by my PhD student Shoshannah Tekofsky — she aims to collect all kinds of data from hundreds of thousands of players of triple-A games, encompassing their demographics, personality, motivation, and psychological state, and combine this with data gathered from their actual game-playing, to gain insight into what drives game players, what they get out of games, and how their playstyle reflects who they are. Her vision is that you can gather such data if you manage to really connect to players and offer them something that they value.

She has been working on this concept for over a year now, not just at our own university, but mainly at the famous MIT Media Lab, in close collaboration with people from Riot Games (League of Legends), DICE (Battlefield), and Blizzard (World of Warcraft). She calls it “Project GAMR”.

And today Project GAMR went live!

You can visit it at http://projectgamr.com. Its Facebook page can be found at https://www.facebook.com/projectgamr, and its Twitter handle is @ProjectGAMR (#ProjectGAMR).


Artificially stupid ducks

June 16, 2014

“The Eugene Goostman chatbot passed the Turing test. So now we finally have real artificial intelligence.”

That is what was reported recently by many news outlets. Of course, it is horribly wrong.

A chatbot is not intelligent. A chatbot has no understanding of what it says. A chatbot simply delves into a database of previously stored sentences (usually automatically retrieved from the Internet), and loosely links them to what the person who is testing the bot is typing. It uses non sequiturs instead of actual answers, repeats a person’s statements back at him, and switches from topic to topic without rhyme or reason.
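To make that mechanism concrete, here is a minimal sketch in Python of the kind of retrieval trickery I mean. The stored sentences and the word-overlap scoring are invented for illustration; real chatbots use far larger corpora and fancier matching, but the principle is the same: no understanding, just lookup.

```python
import random

# A tiny stand-in for the "database of previously stored sentences";
# a real chatbot scrapes millions of these from the Internet.
CORPUS = [
    "I love talking about the weather.",
    "My grandfather once told me a story about that.",
    "The weather in Odessa is nice this time of year.",
    "Why do you say that?",
]

NON_SEQUITURS = [
    "Anyway, what is your favorite movie?",
    "Let us talk about something else.",
]

def reply(user_input: str) -> str:
    """Return the stored sentence sharing the most words with the input;
    if nothing overlaps at all, switch topics with a non sequitur."""
    words = set(user_input.lower().split())
    best = max(CORPUS, key=lambda s: len(words & set(s.lower().split())))
    if not words & set(best.lower().split()):
        return random.choice(NON_SEQUITURS)
    return best

print(reply("nice weather today"))  # loosely linked retrieval
print(reply("zygomorphic?"))        # no overlap, so: topic switch
```

Note that the sketch does not even strip punctuation. It does not matter, because the bot never needed to understand anything in the first place.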

The authors of Eugene Goostman gave their bot the backstory that it was a 13-year-old boy from Ukraine, whose native language was not English. The fact that he was supposed to be a foreigner was introduced to make members of the jury more forgiving of the irrational answers that the bot provided. The fact that he was supposed to be 13 years old was introduced to make members of the jury more forgiving of the nonsensical switching of topics and general lack of knowledge and understanding. If that cheap trick is considered acceptable, then we have had artificial intelligence for many years now.

I mean, I have a program wherein you can type any text that you want, and it will never respond. As such, it functions nicely as a replica of an autistic person. It would also be relatively easy to create a program that resembles a heavy sufferer of Tourette’s.

But even if the authors had not concocted this backstory, and were still able to fool 10 out of 30 judges — would we then have to conclude that Eugene Goostman is ‘real’ artificial intelligence? Would Alan Turing conclude that?

The answer is “no”. The Turing Test is one of the most misrepresented tests in the history of science. It is not a litmus test for artificial intelligence. It is merely an illustration of a philosophical stance that Alan Turing took.

The issue is as follows: how can we know whether a computer is intelligent or not? When Turing was alive, this topic was hotly debated amongst computer scientists and philosophers. Some claimed that a computer can never be ‘really intelligent’, as you can examine its programs and databases and (theoretically) derive exactly how it produces its answers. The counter-argument is that you can also open up a person’s brain and (theoretically) derive exactly how that person produces his answers. So what features would you want a computer to have, which allow you to unequivocally state that it is ‘really’ intelligent?

Alan Turing’s answer was: it is not important what is inside the computer; what is important is its behavior. If a computer’s behavior is indistinguishable from intelligent behavior, we should conclude that the computer is intelligent. Even if we could open up the computer, look inside, and point out some features that make us say: “You see that? That is how that intelligent behavior is generated!” that would only teach us something about how intelligence comes about, and would not invalidate the computer as an intelligent being (unless we open up the computer and see a human inside who provides all the answers, of course).

The Turing Test is only an illustration of Turing’s philosophical principle. He says that if a computer can converse so well that you cannot distinguish it from a human, then the computer converses as well as a human, and thus converses intelligently. There is no stipulation like ‘conversing for only 5 minutes’ or ‘the computer is allowed to limit the topics’ or ‘the computer should be forgiven for bad English’. Such stipulations would make no sense, because an intelligent conversation should demonstrate an understanding of the world. A chatbot that does not at least encompass a model of the world can never demonstrate such an understanding. Simply reflecting sentences that you pick off the Internet might fool some uninitiated people for a while (that is not too hard; ELIZA managed to fool Joseph Weizenbaum’s secretary in the mid-1960s), but it will fool nobody for longer stretches of time.

The whole point is that Turing wanted to introduce the Duck Test for artificial intelligence — if it looks, swims, and quacks like a duck, you should conclude that it is a duck. We now know that it is not hard to fool a couple of people for 5 minutes into thinking that just maybe that computer over there is actually a human. We can do that thanks to the enormous speed at which computers process data, and the huge storage capacity that modern computers have. But while it is, by itself, no easy task to make people think that a computer is conversing like a 13-year-old Ukrainian boy, succeeding at that task is not the same as succeeding at creating an artificial intelligence.

As written, the Turing Test is not a test of artificial intelligence. Turing’s principle, however, stands: the Duck Test is the only viable way of determining whether a computer is really intelligent. However, we should realize that the duck itself is much bigger and much more complex than Turing’s original illustration sketches.


Observation

November 6, 2012

How do you recognize a “young researcher”?

They talk about what they do in their “spare time”.


Profiling a player

September 7, 2010

My work focuses on artificial intelligence (AI) in games. Not only on the AI that determines NPC behavior, but also on AI that tries to understand the human player. The idea behind the latter is that if a game can gain an understanding of the human player, it can automatically adapt itself to cater to that specific player. For example, if a game manages to determine that the player has an interest in a certain NPC, it might change the game so that the role of this NPC increases, maybe by shifting tasks to this NPC from another NPC that the player has less interest in.
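To show how simple such an adaptation loop can be in principle, here is a minimal sketch in Python. The interest metric (counting voluntary interactions) and the task-shifting rule are placeholders of my own invention, not taken from any actual game.

```python
from collections import Counter

class PlayerModel:
    """Tracks how often the player voluntarily interacts with each NPC,
    as a crude proxy for the player's interest in that NPC."""
    def __init__(self):
        self.interactions = Counter()

    def record_interaction(self, npc: str):
        self.interactions[npc] += 1

    def favorite_npc(self):
        if not self.interactions:
            return None
        return self.interactions.most_common(1)[0][0]

def assign_quest_giver(model: PlayerModel, candidates: list, default: str) -> str:
    """Shift a task to the NPC the player shows the most interest in,
    falling back to the designer's default choice."""
    favorite = model.favorite_npc()
    return favorite if favorite in candidates else default

# The player keeps seeking out the blacksmith...
model = PlayerModel()
for npc in ["blacksmith", "innkeeper", "blacksmith", "blacksmith"]:
    model.record_interaction(npc)

# ...so the next quest is handed out by the blacksmith instead of the innkeeper.
print(assign_quest_giver(model, ["innkeeper", "blacksmith"], default="innkeeper"))
```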

We are also looking into possibilities to create a validated psychological profile by automatically observing a player’s behavior in a game. Psychologists usually employ introspective tests to build such a profile, but it is well known that the results of these tests are rather debatable. For instance, while it is assumed that a psychological profile changes only marginally over a few months, the difference between the profiles produced by two tests taken a few months apart might be radical. Our idea is that observation of the dozens of hours that someone plays a game might provide the means to build an accurate psychological profile of that person. And if the game is designed to build such a profile, it might go even faster.
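As a sketch of what ‘building a profile from observation’ could look like, consider the Python fragment below. Every gameplay measure and every mapping in it is a hypothetical placeholder of mine; a real study would have to validate each measure against an established psychological instrument before trusting it.

```python
def profile_from_telemetry(stats: dict) -> dict:
    """Turn aggregated gameplay observations into rough trait scores in [0, 1].
    All measures and mappings below are hypothetical placeholders."""
    hours = max(stats["hours_played"], 1)
    return {
        # Talking to many different NPCs as a (hypothetical) proxy for extraversion.
        "extraversion": min(stats["unique_npcs_talked_to"] / hours / 10, 1.0),
        # Finishing the optional quests one starts as a proxy for conscientiousness.
        "conscientiousness": stats["side_quests_done"] / max(stats["side_quests_seen"], 1),
        # Wandering off the critical path as a proxy for openness to experience.
        "openness": min(2 * stats["map_cells_explored"] / stats["map_cells_total"], 1.0),
    }

observed = {
    "hours_played": 20,
    "unique_npcs_talked_to": 120,
    "side_quests_done": 14,
    "side_quests_seen": 20,
    "map_cells_explored": 300,
    "map_cells_total": 1000,
}
print(profile_from_telemetry(observed))
# {'extraversion': 0.6, 'conscientiousness': 0.7, 'openness': 0.6}
```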

One can argue that a game is not suitable for building a psychological profile, as a game provides a fantasy, and a person might act differently in a fantasy than in real life. But that needn’t matter. If a player, while slightly provoked, kills off a whole village the first time he plays a game, that certainly is indicative of a specific personality type, even if he would react rather demurely to provocation in real life. One could even argue that when a situation involves no pressing outside influence (such as laws or peers), which is the case when playing a game, a person’s personality can truly come out.

The most convincing way to demonstrate our ideas is to use an actual, fairly recently published game. I think I have found that game in Fallout 3. Fallout 3 starts with a sequence of about one hour in which the player is born, creates his character as a baby, then has a birthday party as a ten-year-old, and finally must take a career aptitude test as a sixteen-year-old. The sequence has encounters with several NPCs, and multiple possible ways of responding, from friendly to aggressive to obnoxious. This sequence has three purposes: (1) it teaches the player the game mechanics; (2) it allows the player to design his character; and (3) it soaks the player in the atmosphere of the game. The sequence would be ideal for the game to get to know the player, but is not used for that purpose (apart from checking whether the player prefers melee combat over ranged combat). The reason is probably that the designers would not know what to do with knowledge about a player’s psychological profile. Or maybe they do, but designing the game as a static experience is already work enough. Making the game adaptable would require a huge amount of extra work that they simply cannot afford.
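Merely logging enough of such an introductory sequence to get to know the player need not be much work, though. Here is a small sketch; the event hooks and category names are invented for illustration, and are not how Fallout 3 actually records anything.

```python
from collections import Counter

class IntroObserver:
    """Logs the player's choices during an introductory sequence.
    The hooks and category names are invented for illustration."""
    def __init__(self):
        self.combat = Counter()  # "melee" vs "ranged" attacks
        self.tone = Counter()    # "friendly" / "aggressive" / "obnoxious" replies

    def on_attack(self, kind: str):
        self.combat[kind] += 1

    def on_dialogue_choice(self, tone: str):
        self.tone[tone] += 1

    def summary(self) -> dict:
        return {
            "prefers_melee": self.combat["melee"] > self.combat["ranged"],
            "dominant_tone": self.tone.most_common(1)[0][0] if self.tone else None,
        }

# A player who punches through the tutorial and picks mostly rude replies:
observer = IntroObserver()
for _ in range(5):
    observer.on_attack("melee")
observer.on_attack("ranged")
for tone in ["obnoxious", "obnoxious", "friendly"]:
    observer.on_dialogue_choice(tone)
print(observer.summary())  # {'prefers_melee': True, 'dominant_tone': 'obnoxious'}
```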

Interest in knowing a person’s character through games also exists in many domains outside game development. I am mainly thinking about serious games, which are usually employed to train a person for certain tasks. Training can be more effective if it is tailored to the trainee; not only to his skills and knowledge, but also to his character. And even commercial game developers can profit from further investigations in this area: some changes that certain player profiles would appreciate might be easy and cheap to implement, e.g., changes in the prevalence of music, the use of colors, or the required speed and thinking time. If I look at myself as an example, I know several games which I do not like as they are, but which I probably would enjoy very much with a few simple changes (Katamari Damacy, for instance). The reason that I do not like such games can often be found in my psychological profile, as many of my preferences are shared by people with a similar profile.

Research in this area happens only on a small scale. The reasons are that it is new, that it requires knowledge of rather diverse areas, such as artificial intelligence, psychology, computer science, and sociology, and that it is very time-consuming to execute. But I predict we will see more of it in the near future. I think there is a lot to gain, both for serious and commercial games.


Paper-thin quality

July 5, 2009

When I started doing research in artificial intelligence in games in 2001, I was one of the pioneers in this area. Such research was seen by many as a bit ‘frivolous.’ I am glad to say that since then, this attitude has changed, and nowadays quite a few universities have scientists working on aspects of games. That is a good thing, because games comprise many elements that make them interesting for research. I might talk more about that in a future post, but in this post I want to make some statements about conferences in game research.

In 2001, there were very few conferences on game research. It was hard to find a place to submit papers to. Naturally, with the increasing popularity of game research, the number of conferences increased too. Nowadays there are several dozen conferences and workshops in this area, and many more general conferences have a track on research in games. Personally, I am most interested in those conferences that focus exclusively on artificial intelligence in games, such as AIIDE and CIG.

Since I have contributed to quite a few of these conferences, I often get invited to act as referee. There was a time when I always accepted such an invitation, but nowadays I get so many of them that I sometimes have to refuse. This year I already reviewed about 30 papers for conferences, workshops, and journals. And I have noticed a disturbing trend.

The quality of the papers I get to review is decreasing rapidly. I am not entirely sure whether this is a general trend, or whether I have just been unlucky with the paper assignments, but I fear it is the former. Recently, I have been recommending rejection for about 80% of the papers that I get to review. And that is while I am a relatively ‘soft’ reviewer, who is usually willing to see if a paper is salvageable. Some of the grounds for rejection were:

  1. Being submitted to a conference for which the paper was not suitable;
  2. Not being about scientific research;
  3. Excruciatingly bad English;
  4. Unoriginality and blandness;
  5. Vagueness in reporting; and
  6. Drawing conclusions that do not stand up to scrutiny.

I have a distinct feeling that many of the papers that I get to review are written by students. I have nothing against that in principle; I actually applaud having students write and present papers. But it is the job of the students’ advisors to ensure that their papers are of acceptable quality. It seems to me that advisors are often keen to add their name to a paper, but not to assist in writing it.

Why do I get so many inferior papers to review? I think there are two main reasons.

The first reason is that game research is attractive, so many universities try to get on the bandwagon and do something in that area (nothing against that). Many universities also want to make a name for themselves, so they set up a conference or workshop around some theme in this area. But in attracting papers they have to compete with all the other conferences and workshops that spring up out there. Some researchers have so many conferences and workshops to which they can and want to send papers, that they spread their research very thin, or write up very small results, or encourage their students to quickly write up their bachelor’s or master’s thesis results in paper format.

The second reason, and the most devastating one, is that the past has shown that many conferences indeed accept inferior papers. There are conferences out there that accept literally everything that is submitted to them. I have had quite bad experiences with this. For instance, for a certain conference I refereed seven papers together with one other reviewer. We discussed those papers and together rejected four of the seven. In the end we found that the conference organizers had simply accepted all seven of them.

Why do conferences do this? One reason is that they do not get a sufficient number of submissions, and to fill up their program they lower their standards. Another reason is that accepting papers boosts attendance: at most universities there is a policy that you only get to visit conferences where you have a paper accepted.

But what does it mean for me as a reviewer? Reviews take time, but they are a necessary part of doing science. Every scientist has to pay his dues in this respect. As a scientist, I do reviews to help out my fellow scientists, to boost research quality, and to get the occasional glimpse of interesting, as yet unpublished research. What I expect from my fellow scientists is that they do their best to send in high-quality papers, with good research, grounded in science, focused on an appropriate research area, and preferably written in acceptable English.

How do I respond when I get a paper that is unacceptable? If the authors attempted to do good research and write it down well, I do not mind writing a review that helps them to improve their work to acceptable levels. But if it is obvious that they just sent in a hastily thrown together piece of trash, then frankly, I feel insulted. I am expected to spend valuable time, which I could also spend on doing research, on reading and criticizing something that the authors themselves should have improved before submitting. They seem to say that their time spent on writing their paper is more valuable than my time spent on reviewing it.

All teachers know students who only start studying for an exam after they have failed it one or more times. Evidently, such students aim to do the bare minimum needed for passing. I wonder whether this attitude also persists among some scientists, whose goal is to get published, no matter the quality of their work.

The main blame for this rests, in my opinion, with the conferences and workshops that set their standards too low. They encourage a bad attitude amongst authors. My view as a referee is that they should simply not enlist my services if they intend to accept everything anyway. Yes, it sounds nice if your conference is ‘refereed,’ but if you accept everything, in practice it is not.

I think that, at present, we are in a shake-out phase. There are too many conferences and workshops in game research. Those that accept too much low-quality work will die out after a short while, simply because serious scientists do not want to visit them any longer. It is therefore in the best interests of conferences to keep their quality standards high. In the end, quality will win out over volume.