Domination achieved

January 24, 2019

My post before this one was on the Global Gender Gap Index (GGGI), which is used by many news media, institutes, and even governments, as a foundation to argue that “women are disadvantaged compared to men in all countries in the world, and need to be awarded advantages to compensate for that.” I criticized the GGGI on three points, showing that it clearly promotes a feminist agenda rather than trying to fight for parity between the sexes. Two weeks after I wrote that post, an article appeared in the high-quality scientific journal PLOS One, on this very topic. The article by Gijsbert Stoet and David C. Geary, titled “A simplified approach to measuring national gender inequality,” not only makes the same arguments as I did, but also demonstrates, using literature references, that in many areas men fall behind women. The article proposes the Basic Index of Gender Inequality (BIGI), a more objective measurement of gender parity than the activist one used by the GGGI.

The BIGI is based on three components, namely (1) educational opportunities in childhood; (2) healthy life expectancy; and (3) overall life satisfaction. These three components share the fact that they are independent of life choices. For example, education is only examined in childhood, while tertiary education is excluded; the reason is that education in childhood is not a choice, while the decision to get into tertiary (university) education may be. So the fact that far more women than men go to university in developed countries is not counted as an advantage for women, as you cannot know whether this is because men are disadvantaged or because men on average simply do not like to go to university.

Education and life span are also components of the GGGI, but two notes should be made for the GGGI: (1) education is capped at 1.00, meaning that the fact that women are highly advantaged over men in developed countries in this respect is counted as ‘equality’; and (2) life expectancy is counted as equal when women live 6% longer than men, i.e., in a country where women only live 3% longer than men, the GGGI calls them ‘disadvantaged.’ The most remarkable thing, however, is that the third component, life satisfaction, is not even taken into account for the GGGI, while Stoet and Geary rightfully argue that “while it is very difficult to determine the degree to which men and women are disadvantaged in any particular aspect of life, an overall assessment of life satisfaction likely reflects the combination of advantages and disadvantages they have experienced, whatever they might be.”
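The effect of these two rules can be illustrated with a small sketch. It assumes, purely for illustration, that a sub-score is the female-to-male ratio divided by a parity benchmark and then capped at 1.00; the function name and the example ratios are hypothetical, and the actual GGGI computation may differ in its details.

```python
def capped_score(female_to_male_ratio, parity_benchmark=1.0, cap=1.0):
    """Hypothetical GGGI-style sub-score (an assumption, not the official formula).

    The ratio is expressed relative to a parity benchmark and then capped,
    so any female advantage beyond the benchmark is reported as 'equality'.
    """
    return min(female_to_male_ratio / parity_benchmark, cap)


# Education: a made-up ratio of 1.30 (a large female advantage) still scores 1.00.
education = capped_score(1.30)

# Healthy life expectancy: parity is only reached when women live 6% longer.
life_3pct = capped_score(1.03, parity_benchmark=1.06)  # below 1.00: 'disadvantaged'
life_6pct = capped_score(1.06, parity_benchmark=1.06)  # exactly 1.00: 'equal'

print(education, round(life_3pct, 3), life_6pct)
```

Under these assumptions, a country where women live 3% longer than men ends up with a score below 1.00 and is therefore counted as disadvantaging women, while an educational advantage of any size never raises the score above 1.00.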

Stoet and Geary use the BIGI to rate gender inequality in 134 countries. They also calculate the AADP (average absolute deviation from parity), which averages the absolute size of the deviations in the three components. It accounts for the fact that a country may have a BIGI close to zero (reflecting parity) while there are still high disadvantages for each of the genders, but in different areas. The “best” situation for a country is having both the BIGI and the AADP close to zero, meaning that men and women are treated in exactly the same way.
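The difference between the two measures can be made concrete with a minimal sketch. It assumes that each of the three components is expressed as a signed deviation from parity, that the BIGI is the plain average of those deviations, and that the AADP is the average of their absolute values. The sign convention (negative means women fall behind, positive means men fall behind), the function names, and the example numbers are all hypothetical.

```python
def bigi(deviations):
    """Average signed deviation from parity (assumed definition)."""
    return sum(deviations) / len(deviations)


def aadp(deviations):
    """Average absolute deviation from parity (assumed definition)."""
    return sum(abs(d) for d in deviations) / len(deviations)


# Made-up country: women fall behind in education, men fall behind in
# healthy life expectancy and life satisfaction.
example = [-0.06, 0.02, 0.04]

print(round(bigi(example), 6))  # close to zero: looks like parity overall
print(round(aadp(example), 6))  # clearly above zero: both sexes are disadvantaged somewhere
```

In this made-up case the disadvantages cancel out in the BIGI but not in the AADP, which is exactly the situation that the second measure is meant to expose.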

Stoet and Geary reach the unsurprising (to me, at least) conclusion that in underdeveloped countries women are usually disadvantaged compared to men, which is mainly the result of restricted education, while in developed countries women tend to be advantaged over men, which is mainly the result of a higher healthy life expectancy. It should be noted, however, that even though men tend to be disadvantaged compared to women in more developed countries, the higher the level of development in a country is, the closer it tends to be to complete gender equality.

Naturally, you can have a critical discussion about the components in the BIGI. However, the BIGI is a scale without an agenda: it tries to measure gender inequality in an objective way, rather than explicitly sell a biased message, as the GGGI does. One can only hope that research such as this makes governments in developed countries realize that the notion that women as a group are disadvantaged is not grounded in reality, and that letting radical feminists set their agenda is not a good idea.

Levels of sexism

December 15, 2018

Research has shown that at the highest level of secondary school in the Netherlands (VWO), large groups of children had left primary school with a recommendation for a lower level of secondary school. This holds for 21% of the girls and 14% of the boys. The reason for this difference is not known, but the cries of “it is sexism against girls, who are systematically underestimated by their primary school teachers” are already sounding.

I would just like to point out that these numbers could equally well be pointing at sexist attitudes towards boys. The advice of primary school teachers (predominantly women, by the way) is supposed to be leading in distributing children over secondary schools. This advice is far more often ignored for girls, providing them access to a higher level of school, than it is for boys. This sounds a lot like girls getting the benefit of the doubt far more often than boys get it. It just depends on which perspective you take: the primary school perspective which is holding girls back, or the secondary school perspective which is welcoming girls in.

In the end, however, I would like to stress that in this reporting it is explicitly stated that the reasons for the difference are unknown. Therefore, cries of “sexism,” whether it is against boys or girls, are at this time unwarranted.

Killer robots are here already

August 25, 2017

At the International Joint Conference on Artificial Intelligence 2017, an open letter was released, signed by over one hundred top scientists and industrialists in artificial intelligence, calling for a ban on the development of autonomous, artificially intelligent weapons, often referred to as “killer robots.”

This vapid gesture is equivalent to calling for a “ban on the development of knives that can be used to murder people.” The problem is that almost any device that can be taught behavior and is allowed to function autonomously can be employed as a “killer robot.” And all industrial artificial intelligence research advances the intelligence, learning ability, and autonomy of machines.

Elon Musk might be in favor of a ban on the development of killer robots, but his Tesla company works on autonomous self-driving cars. Recent terrorist activities have demonstrated how cars can be used as weapons. You only need to teach a car to hit people instead of avoiding them.

Mustafa Suleyman might want to stop research into killer robots, but at the same time his DeepMind company is the leader in deep-learning research, which aims at allowing machines to learn patterns and respond to them. Such pattern recognizers can be easily placed in smart missiles or weaponized robots to autonomously find viable targets.

Jerome Monceaux signed the letter, while simultaneously heading Aldebaran Robotics, which develops general-purpose robots that can be taught or programmed to do anything, including using weapons and going on a murder spree.

And the list goes on.

The whole point of artificial-intelligence research is to allow machines to do things that humans can do, preferably more efficiently and effectively, and preferably with a high degree of autonomy. Moreover, almost all modern artificial intelligence research is based on machine learning, i.e., teaching machines to behave in a particular way rather than directly programming them. Consequently, almost any artificial intelligence research can be used to teach machines to help people, or to behave as a weapon. This entails that machines that have the ability to operate as killer robots already exist.

Basically, the call for a ban on the development of killer robots amounts to a plea along the lines of: “Look, we are developing all this great technology which will bring fantastic benefits to humanity but please, please, please do not use it to murder people.” It is a call for sanity on the part of governments, the military, and terrorist organizations so that they won’t use the technology for evil. And we all know that the sanity of governments, military, and terrorists varies.

You cannot stop the possibility of (further) developing killer robots without a world-wide halt on artificial intelligence research altogether. I do not think that that is what any of these people who signed the letter, or anyone else, really wants. Or that it can be enforced, for that matter.

The best you can do is realize what artificial intelligence can be used for and then build in protections against misuse. For instance, autonomous self-driving cars should be strongly guarded against attempts to reprogram them. This is in the hands of Elon Musk and his competitors. Rather than calling for some kind of ban, they should do their jobs properly. And while I think they are trying to do a proper job, their call for a ban sounds like them trying to place the responsibility for misuse of their technology in the hands of others.

Any technology can be misused, and usually is. That is no reason not to develop beneficial technology. The benefits of autonomous artificial intelligence can be great. The dangers of it are lurking in the autonomy — technology which allows machines to operate autonomously, taking autonomous decisions on how to act, should be surrounded by stringent safeguards against the machines taking harmful decisions. But probably the biggest danger is not in the artificially intelligent machines themselves, but in the humans who place unwarranted trust in them to take autonomous decisions.

I applaud the fact that many influential people consider the dangers of artificial intelligence research seriously. The call for a ban, however, sounds like an after-the-fact plea.

Project GAMR is live!

October 21, 2015

Occasionally I have written on these pages on my ideas on understanding players of games through analysis of their playstyles as expressed in all kinds of gameplay variables. These ideas have been the basis of research I have been doing with some of my PhD students.

One of the most visionary and challenging approaches to research in this area was conceived by my PhD student Shoshannah Tekofsky. She aims to collect all kinds of data from hundreds of thousands of players of triple-A games, encompassing their demographics, personality, motivation, and psychological state, and combine this with data gathered from their actual game-playing, to gain insight into what drives game players, what they get out of games, and how their playstyle reflects their person. Her vision is that you can gather such data if you manage to really connect to players and offer them something that they value.

She has been working on this concept for over a year now, not just at our own university, but mainly at the famous MIT Media Lab, in close collaboration with people from Riot Games (League of Legends), DICE (Battlefield), and Blizzard (World of Warcraft). She calls it “Project GAMR”.

And today Project GAMR went live!

You can visit it at [link missing]. A Facebook page is found at [link missing]. The Twitter handle is @ProjectGAMR (#ProjectGAMR).

Artificially stupid ducks

June 16, 2014

“The Eugene Goostman chatbot passed the Turing test. So now we finally have real artificial intelligence.”

That is what was reported recently by many news outlets. Of course, it is horribly wrong.

A chatbot is not intelligent. A chatbot has no understanding of what it says. A chatbot simply delves into a database of previously stored sentences (usually automatically retrieved from the Internet), and loosely links them to what the person who is testing the bot is typing. It uses non sequiturs instead of actual answers, repeats a person’s statements back at him, and switches from topic to topic without rhyme or reason.

The authors of Eugene Goostman gave their bot the backstory that it was a 13-year-old boy from Ukraine, whose native language was not English. The fact that he was supposed to be a foreigner was introduced to make members of the jury more forgiving of the irrational answers that the bot provided. The fact that he was supposed to be 13 years old was introduced to make members of the jury more forgiving of the nonsensical switching of topics and general lack of knowledge and understanding. If that cheap trick is considered acceptable, then we have had artificial intelligence for many years now.

I mean, I have a program wherein you can type any text that you want, and it will never respond. As such, it functions nicely as a replica of an autistic person. It would also be relatively easy to create a program that resembles a heavy sufferer of Tourette’s.

But even if the authors had not made up this backstory, and were still able to fool 10 out of 30 judges, would we then have to conclude that Eugene Goostman is ‘real’ artificial intelligence? Would Alan Turing conclude that?

The answer is “no”. The Turing Test is one of the most misrepresented tests in the history of science. It is not a litmus test for artificial intelligence. It is merely an illustration of a philosophical stance that Alan Turing took.

The issue is as follows: how can we know whether a computer is intelligent or not? When Turing was alive, this topic was hotly debated amongst computer scientists and philosophers. Some claimed that a computer can never be ‘really intelligent’, as you can examine its programs and databases and (theoretically) derive exactly how it produces its answers. The counter-argument is that you can also open up a person’s brain and (theoretically) derive exactly how that person produces his answers. So what features would you want a computer to have, which allow you to unequivocally state that it is ‘really’ intelligent?

Alan Turing’s answer was: it is not important what is inside the computer; what is important is its behavior. If a computer’s behavior is indistinguishable from an intelligence, we should conclude that it is intelligent. Even if we could open up the computer, look inside, and point out some features that make us say: “You see that? That is how that intelligent behavior is generated!” that would only teach us something about how intelligence comes about, and would not invalidate the computer as an intelligent being (unless we open up the computer and see a human inside who provides all the answers, of course).

The Turing Test is only an illustration of Turing’s philosophical principle. He says that if a computer can converse so well that you cannot distinguish it from a human, then the computer converses as well as a human, and thus converses intelligently. There is no stipulation like ‘conversing for only 5 minutes’ or ‘the computer is allowed to limit the topics’ or ‘the computer should be forgiven for bad English’. Such stipulations would make no sense, because an intelligent conversation should demonstrate an understanding of the world. A chatbot that does not at least encompass a model of the world can never demonstrate an understanding. Simply reflecting sentences that you pick off the Internet might fool some uninitiated people for a while (that is not too hard, ELIZA managed to fool Joseph Weizenbaum’s secretary in 1964), but it will fool nobody for longer stretches of time.

The whole point is that Turing wanted to introduce the Duck Test for artificial intelligence: if it looks, swims, and quacks like a duck, you should conclude that it is a duck. We now know that it is not hard to fool a couple of people for 5 minutes into thinking that just maybe that computer over there is actually a human. We can do that due to the enormous speed that computers have achieved in processing data, and the huge storage capacity that modern computers have. But despite the fact that, by itself, it is not an easy task to make people think that a computer is conversing like a 13-year-old Ukrainian boy, succeeding at that task is not the same as succeeding at creating an artificial intelligence.

As written, the Turing Test is not a test of artificial intelligence. Turing’s principle, however, stands: the Duck Test is the only viable way of determining whether a computer is really intelligent. However, we should realize that the duck itself is much bigger and much more complex than Turing’s original illustration sketches.


November 6, 2012

How do you recognize a “young researcher”?

They talk about what they do in their “spare time”.

Profiling a player

September 7, 2010

My work focuses on artificial intelligence (AI) in games. Not only on the AI that determines NPC behavior, but also on AI that tries to understand the human player. The idea behind the latter is that if a game can gain an understanding of the human player, it can automatically adapt itself to cater to that specific player. For example, if a game manages to determine that the player has an interest in a certain NPC, it might change the game so that the role of this NPC increases, maybe by shifting tasks to this NPC from another NPC that the player has less interest in.

We are also looking into possibilities to create a validated psychological profile by automatically observing a player’s behavior in a game. Psychologists usually employ introspective tests to build such a profile, but it is a well-known fact that the results of these tests are rather debatable. For instance, while it is assumed that a psychological profile changes only marginally over a few months, the difference in profiles determined by two tests with a few months in between might be radical. Our idea is that observation of the dozens of hours that someone plays a game might provide the means to build an accurate psychological profile of that person. And if the game is designed to build such a profile, it might go even faster.

One can argue that a game is not suitable for building a psychological profile, as a game provides a fantasy, and a person might act differently in a fantasy than in real life. But that needn’t matter. If a player, while slightly provoked, kills off a whole village in-game when he is playing it for the first time, that certainly is indicative of a specific personality type, even if he would react rather demurely to provocation in real life. One could even argue that when a situation encompasses no pressing outside influence (such as laws or peers), which is the case when playing a game, a person’s personality can truly come out.

The most convincing way to demonstrate our ideas is to use an actual, fairly recently published game. I think I have found that game with Fallout 3. Fallout 3 starts with a sequence of about one hour in which the player gets born, creates his character as a baby, then has a birthday party as a ten-year-old, and finally must do a career aptitude test as a sixteen-year-old. The sequence has encounters with several NPCs, and multiple possible ways of responding, from friendly to aggressive to obnoxious. This sequence has three purposes: (1) it teaches the player the game mechanics; (2) it allows the player to design his character; and (3) it soaks the player in the atmosphere of the game. The sequence would be ideal for the game to get to know the player, but is not used for that purpose (apart from checking whether the player prefers melee combat over ranged combat). The reason is probably that the designers would not know what to do with knowledge about a player’s psychological profile. Or maybe they would, but designing the game as a static experience is already work enough. Making the game adaptable would require a huge amount of extra work that they simply cannot afford.

In many domains outside game development there is interest in getting to know a person’s character through games. I am mainly thinking about serious games, which are usually employed to train a person for certain tasks. Training can be more effective if it is tailored to the trainee; not only to his skills and knowledge, but also to his character. And even commercial game developers can profit from further investigations in this area: some changes that certain player profiles would appreciate might be easy and cheap to implement, e.g., changes in prevalence of music, use of colors, or required speed and thinking time. If I look at myself as an example, I know several games which I do not like as they are, but which I probably would enjoy very much with a few simple changes (Katamari Damacy, for instance). The reason that I do not like such games can often be found in my psychological profile, as many of my preferences are shared by people with a similar profile.

Research in this area happens only at a small scale. Reasons are that it is new, and it requires knowledge of rather diverse areas, such as artificial intelligence, psychology, computer science, and sociology. It is also very time-consuming to execute. But I predict we will see more of it in the near future. I think there is a lot to gain, both for serious and commercial games.

Paper-thin quality

July 5, 2009

When I started doing research in artificial intelligence in games in 2001, I was one of the pioneers in this area. Such kind of research was seen by many as a bit ‘frivolous.’ I am glad to say that since then, this attitude has changed, and nowadays quite a few universities have scientists work on aspects of games. That is a good thing, because games comprise many elements that make them interesting for research. I might talk more about that in a future post, but in this post I want to make some statements about conferences in game research.

In 2001, there were very few conferences on game research. It was hard to find a place to submit papers to. Naturally, with the increasing popularity of game research, the number of conferences increased too. Nowadays there are several dozen conferences and workshops in this area, and many more general conferences have a track on research in games. Personally, I am most interested in those conferences that focus exclusively on artificial intelligence in games, such as the AIIDE and the CIG.

Since I have contributed to quite a few of these conferences, I often get invited to act as referee. There was a time when I always accepted such an invitation, but nowadays I get so many of them that I sometimes have to refuse. This year I already reviewed about 30 papers for conferences, workshops, and journals. And I have noticed a disturbing trend.

The quality of the papers I get to review is decreasing rapidly. I am not entirely sure whether this is a general trend, or whether I have just been unlucky with the paper assignments, but I fear it is the former. Recently, I have been recommending rejections for about 80% of the papers that I get to review. And that is while I am a relatively ‘soft’ reviewer, who is usually willing to see if a paper is salvageable. Some of the grounds for rejection were:

  1. Submissions to conferences for which they are not suitable;
  2. Not being about scientific research;
  3. Excruciatingly bad English;
  4. Unoriginality and blandness;
  5. Vagueness in reporting; and
  6. Drawing conclusions that do not stand up to scrutiny.

I have a distinct feeling that many of the papers that I get to review are written by students. I have nothing against that in principle; I actually applaud having students write and present papers. But it is the job of students’ advisors to ensure that their papers are of acceptable quality. It seems to me that often advisors are keen to add their name to a paper but not assist in writing it.

Why do I get so many inferior papers to review? I think there are two main reasons.

The first reason is that game research is attractive, so many universities try to get on the bandwagon and do something in that area (nothing against that). Many universities also want to make a name for themselves, so they set up a conference or workshop around some theme in this area. But in getting papers they have to compete with all the other conferences and workshops that spring up out there. Some researchers have so many conferences and workshops that they can and want to send papers to, that they spread their research very thin, or write up very small results, or encourage their students to quickly write up their bachelor or master thesis results in paper format.

The second reason, and the most devastating one, is that the past has shown that many conferences indeed accept inferior papers. There are conferences out there that accept literally everything that is submitted to them. I have quite bad experiences with this. For instance, for a certain conference I refereed seven papers together with one other reviewer. We conversed about those papers and together rejected four of the seven. In the end we found that the conference organizers had simply accepted all seven of them.

Why do conferences do this? One reason is that they do not get a sufficient number of submissions, and to fill up their program they lower their standards. Another reason is that accepting papers boosts attendance: at most universities there is a policy that you only get to visit conferences where you have a paper accepted.

But what does it mean for me as a reviewer? Reviews take time, but they are a necessary part of doing science. Every scientist has to pay his dues in this respect. As a scientist, I do reviews to help out my fellow scientists, to boost research quality, and to get the occasional glimpse of interesting but yet unpublished research. What I expect from my fellow scientists is that they do their best to send in high-quality papers, with good research, grounded in science, focussed on an appropriate research area, and preferably written in acceptable English.

How do I respond when I get a paper that is unacceptable? If the authors attempted to do good research and write it down well, I do not mind writing a review that helps them to improve their work to acceptable levels. But if it is obvious that they just sent in a hastily thrown together piece of trash, then frankly, I feel insulted. I am expected to spend valuable time, which I could also spend on doing research, on reading and criticizing something that the authors themselves should have improved before submitting. They seem to say that their time spent on writing their paper is more valuable than my time spent on reviewing it.

All teachers know students who only start studying for exams after they failed one or more times. Evidently, such students aim to do the bare minimum needed for passing. I am wondering whether this attitude also persists with some scientists, whose goal it is to get published, no matter the quality of their work.

The main blame in this rests, in my opinion, with the conferences and workshops that set their standards too low. They encourage a bad attitude amongst authors. As a referee, my view is that they should simply not enlist my services if they intend to accept everything anyway. Yes, it sounds nice if your conference is ‘refereed,’ but if you accept everything, in practice it is not.

I think that, at present, we are in a shake-out phase. There are too many conferences and workshops in game research. Those that accept too much low-quality work will die out after a short while, simply because serious scientists do not want to visit them any longer. I think it is therefore in the best interests of conferences to keep their quality standards high. Quality over volume will persist in the end.