Posts Tagged ‘AI’

Musings on the Starcraft II AI Test …

February 22, 2019

So, today I’m going to muse on the Alpha Star AI playing Starcraft II that I kinda talked about here. These are just musings on the topic and so might be a bit inaccurate; I’m not doing any extra research and so am just relying on what Shamus and the commenters said about it (links to the posts are in the post I linked above).

As best I can understand it, the system was essentially a neural net that was trained using what were in my day called Genetic Algorithms (which may well have evolved a much “cooler” name since), where a number of agents played the game against each other, the best ones were kept to play against each other again, and so on and so forth. What I did way back in university, as part of my Honours Project, was a GA simulating a “Diplomacy Problem”, with various countries getting advantages or disadvantages based on whether or not they agreed with the others. There I created a set number of agents (50, if I recall correctly), ranked them by score, dropped the bottom 10, doubled the top 10 for the next run, and left the rest. I hope they did something similar, but at any rate the overall idea is the same: run the agents, see which ones get the best score, keep those, keep or introduce some new agents so that they can learn new behaviour, rinse, repeat.
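
Just to make the shape of that loop concrete, here’s a minimal sketch of that kind of selection scheme; evaluate() and mutate() are hypothetical stand-ins for actually playing the game and varying the agents, not anything Alpha Star itself does:

import random

# A minimal sketch of the selection scheme described above; evaluate() and
# mutate() are hypothetical stand-ins for actually playing the game.
POPULATION_SIZE = 50
KEEP_TOP = 10      # the top 10 get doubled for the next run
DROP_BOTTOM = 10   # the bottom 10 get dropped

def evaluate(agent):
    # Stand-in for "play the agent against the others and score it".
    return sum(agent)

def mutate(agent):
    # Stand-in for whatever variation lets the copies learn new behaviour.
    return [gene + random.uniform(-0.1, 0.1) for gene in agent]

def next_generation(population):
    ranked = sorted(population, key=evaluate, reverse=True)    # best first
    survivors = ranked[:POPULATION_SIZE - DROP_BOTTOM]         # drop the bottom 10
    children = [mutate(agent) for agent in ranked[:KEEP_TOP]]  # double the top 10
    return survivors + children                                # back to 50 agents

population = [[random.random() for _ in range(5)] for _ in range(POPULATION_SIZE)]
for generation in range(100):
    population = next_generation(population)
print(max(evaluate(agent) for agent in population))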

What this meant is that they needed agents that could play each other in order to generate the massive data sets that you need to train a neural net, which led them to limit the agents to playing as Protoss against players who are also playing as Protoss. Right away, this is a little unimpressive, as humans can learn to play as pretty much any combination of the races a lot faster than the agents learned to play as Protoss against Protoss. This also led me to comment on the posts that there’s a risk in trying to get it to learn to play as other races or against other races, because of the nature of neural nets.

The big advantage of neural nets is that you don’t need to program any rules or semantics into them to get them to solve problems. There aren’t really any rules or semantics in a neural net. Sure, there may be some in there somewhere, and it often acts like it has rules or semantics, but nothing internal to the system explicitly represents them. The system learns by semi-randomly adding and removing nodes and connections and adjusting the weights of connections, but the system doing that, at least in a pure neural net (supposedly Deep Learning systems combine the semantics of inference engines and the flexibility of neural nets, but I haven’t looked at them yet), doesn’t have any idea what actual rules or decisions those things are involved in.

Thus, a common problem with early neural nets was that when you decided to train one to do something different or learn anything new, there was always a risk that you’d break existing behaviour unless you also trained it on the old functionality at the same time, which is not how things seem to work in humans. You can limit that by restricting how much training can change the original net, but then it has a harder time learning anything new. Make it static and the machine can’t learn, but make it too free to change and it will forget lots of things it used to know.
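
As a toy illustration of that forgetting problem (my own sketch, nothing taken from the actual system), here’s a single-weight model trained by gradient descent on one task and then on a conflicting one; after the second round of training its error on the first task balloons:

# A toy illustration of the forgetting problem: a one-weight model trained by
# gradient descent on task A (y = 2x) and then on a conflicting task B
# (y = -3x). After the second round of training, its error on task A balloons.
# This is my own sketch, not anything from the Alpha Star system itself.
def train(weight, data, steps=200, rate=0.01):
    for _ in range(steps):
        for x, target in data:
            prediction = weight * x
            weight -= rate * 2 * (prediction - target) * x  # squared-error gradient step
    return weight

task_a = [(x, 2 * x) for x in range(1, 6)]    # the behaviour it learned first
task_b = [(x, -3 * x) for x in range(1, 6)]   # the new behaviour we train next

w = train(0.0, task_a)
error_a_before = sum((w * x - y) ** 2 for x, y in task_a)

w = train(w, task_b)                          # retrain on the new task only
error_a_after = sum((w * x - y) ** 2 for x, y in task_a)

print(error_a_before, error_a_after)          # the old task's error goes from ~0 to huge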

What this means for our agents is that teaching them to play as another race or against another race might cause them to forget important details about how to play as a Protoss against a Protoss. I opined that what they’d probably do instead is build separate agents for each case and then have a front-end, which could be an inference engine since this is all deterministic, pick which agent to use. After all, while there are nine different combinations (the AI playing each race potentially against each other race), the match-up is set at the beginning of the game, so it’s a pretty straightforward decision which agent to use, and there’s no real reason to teach the AI itself to figure out the ideal match-up given who it’s playing against. So this seems to me an easier way to go than trying to build a generic agent that can play all combinations, and it’s actually even less artificial than some of the other things that the agents were already committed to.
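
As a sketch of what I mean, the front-end could be as simple as a lookup keyed on the match-up; the per-match-up agents here are hypothetical stand-ins:

# Sketch of such a front-end: the per-match-up agents are hypothetical
# stand-ins, and the "inference" is nothing more than a lookup on the
# match-up, which is fixed at the start of the game.
RACES = ("Protoss", "Terran", "Zerg")

class MatchupAgent:
    def __init__(self, own_race, opponent_race):
        self.own_race = own_race
        self.opponent_race = opponent_race

    def act(self, game_state):
        return self.own_race + " vs " + self.opponent_race + " agent acting on " + game_state

# One agent per combination; nine in total.
agents = {(own, opp): MatchupAgent(own, opp) for own in RACES for opp in RACES}

def front_end(own_race, opponent_race, game_state):
    return agents[(own_race, opponent_race)].act(game_state)

print(front_end("Protoss", "Zerg", "opening build"))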

So, after the AI beat all the players in the first round, how did the one player get a rematch and beat it rather handily? What he did was adopt a strategy that the AI was vulnerable to: harassment. The player waited until the AI had built a big army and sent it off towards his base, and then sent a few units in to attack the AI’s base. The AI turned its army around to meet the threat, and he moved the units away. After they were chased off and/or destroyed, the AI started out again … and the player repeated the harassing attack. Rinse, repeat, and eventually win the game.

One issue with neural net type AIs is that since they learn through repetition over massive data sets, they don’t really have the ability to learn or adapt on the fly; they don’t learn much from any one event or run. Inference engines actually can learn on the fly, because their actions are driven by the premises and logic of their systems, and so if one event doesn’t turn out right they can immediately reassess their inferences. In this case, for example, the AI was probably bringing the army back because it anticipated a mass invasion that it needed to repel. A neural net won’t store that premise explicitly, but an inference engine will. So there’s a chance that an inference engine, after a few repetitions, would conclude that the harassment doesn’t indicate a mass invasion and learn to ignore it. Which, then, would leave it vulnerable to a “Cry Wolf” strategy: harass it until it learns to ignore the harassment, and then launch a full-scale attack to catch it napping. Which it could then learn to defend against as well, and so on and so forth.
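
Here’s a rough sketch of what that kind of explicit, revisable premise might look like; the names and the three-strikes threshold are mine, purely for illustration:

# Sketch of an explicit, revisable premise of the sort an inference engine
# could hold; the names and the three-strikes threshold are mine, purely
# for illustration.
class Defender:
    def __init__(self):
        self.false_alarms = 0
        self.treat_small_attacks_as_invasions = True   # explicit, inspectable premise

    def on_small_attack(self):
        if self.treat_small_attacks_as_invasions:
            return "recall the army"
        return "ignore it"

    def on_attack_resolved(self, was_real_invasion):
        if not was_real_invasion:
            self.false_alarms += 1
            if self.false_alarms >= 3:
                # After a few repetitions, revise the premise on the spot.
                self.treat_small_attacks_as_invasions = False

defender = Defender()
for _ in range(4):
    print(defender.on_small_attack())
    defender.on_attack_resolved(was_real_invasion=False)
# ... at which point a full-scale "Cry Wolf" attack would catch it napping.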

People in the comments asked if you could just teach it to ignore the harassment, but the problem with neural nets is that you can’t really teach them anything, at least not by explaining it or adding it as an explicit rule. Inference engines can be tweaked that way because they encode explicit rules, but neural nets don’t. To add a rule to the system you have to train it on data sets aimed at establishing that rule until it learns it. There are approaches that allow for feedback and training of that sort, from what I’ve seen (mostly through short presentations at work), but either those establish explicit rules that the system has to follow, even if they’re wrong, or else the rules can be overridden by later training and so would need to be trained and retrained. In short, you can explain things to an inference engine, but not really to a neural net. You can only let the net learn it itself or flat-out tell it the answer.

Neural nets, I think, excite people for two reasons. First, because they don’t generally have explicit rules, they can come up with unique correct answers that we, ourselves, can’t figure out, or that at least are extremely difficult for us to figure out. This makes them look more intelligent than we are for coming up with answers that we couldn’t see. Inference engines and expert systems can come up with novel solutions as well, but all of those systems can explain how they came to a conclusion and so seem less “mysterious”, in much the same way that when Sherlock Holmes explains his reasoning it seems less mysterious and, often, more of a “Why didn’t we see that?”. We aren’t that impressed by computers having access to all the data and never forgetting it or forgetting to consider any of it, since that’s kinda what they do, but we are impressed by what seem like leaps of intuition that we can’t match. The other reason is that they loosely resemble the structure of the human brain (anyone doing AI will tell you that they aren’t really that close, but as they are designed to model it in at least some ways the point still stands), and so people impressed by neuroscience will think that they’re closer to what we really do.

Personally, I’m more interested in the reasoning aspects of intelligence and in finding the algorithm we use rather than emulating the hardware, so I’m less impressed by them. Still, they do manage the pattern-matching aspects of intelligence well, and far better than more reasoning-based systems, which has led me to opine that the ideal AI has an inference engine front-end and a neural net back-end: the inference engine answers what it can and passes off anything else to the neural net, assesses the answer, adopts it if it seems to work, and retrains the net if it doesn’t. Again, some people commented that this seems like what Deep Learning does.
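
Here’s a rough sketch of that hybrid shape; the “net” is reduced to a placeholder object since the point is just the division of labour, and all the names are mine:

# Rough sketch of the hybrid idea: an inference engine front-end answers what
# it can from explicit rules, hands anything else to a back-end, assesses the
# answer, and retrains the back-end if it doesn't seem to work. The "net"
# here is just a placeholder object, not a real network.
class HybridAgent:
    def __init__(self, rules, net):
        self.rules = rules   # list of (condition, conclusion) pairs
        self.net = net       # anything with predict(question) and retrain(question)

    def answer(self, question, assess):
        for condition, conclusion in self.rules:
            if condition(question):
                return conclusion             # the inference engine handled it
        guess = self.net.predict(question)    # pass anything else to the net
        if assess(guess):                     # adopt the answer if it seems to work
            return guess
        self.net.retrain(question)            # otherwise retrain the back-end
        return None

class DummyNet:
    def predict(self, question):
        return "pattern-matched guess for: " + question
    def retrain(self, question):
        pass   # stand-in for another round of training

agent = HybridAgent(rules=[(lambda q: q == "2 + 2", "4")], net=DummyNet())
print(agent.answer("2 + 2", assess=lambda a: True))
print(agent.answer("what's the best opening build?", assess=lambda a: True))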

All of this starts to tie back into the heart of questions about AI leading all the way back to Searle: does the Alpha Star agent actually understand how to play Starcraft II? There’s no semantics to a neural net. You could take those agents and hook them up to something that is, say, trying to run a factory, and if the weights were correct the system could do that just as well (and people have indeed taken neural nets trained for one specific task, had them perform entirely different tasks, and noted that they can more or less work). So what does the agent actually understand about Starcraft II itself? Does it know what the units are and what they mean? It doesn’t have to, as it doesn’t really encode that information in the neural net itself. If you don’t have the semantics, do you really understand anything at all? With Searle’s Chinese Room, most will agree, at least, that the person inside the room is not doing anything intelligent by simply taking in a symbol, looking up the answer, and passing it back out. That person doesn’t understand Chinese. The error people find with the thought experiment is that it assumes the room itself can’t understand, or couldn’t if it had the right context and information. But all of that is semantic information about meaning. Does a neural net in and of itself ever have meanings? Does the Alpha Star agent store any semantic information at all, even to the extent that an inference engine does? Having the right output doesn’t guarantee meaning, especially if the same output can be used for things that mean something completely different. So does it have meaning? And if it doesn’t, does it really understand?

These may not be questions that the creators are worried about. They may simply want to build an AI to beat humans at a video game. But these questions will obviously be raised by these things, and the answers — and attempts to answer them — are of great interest to fields like Philosophy of Mind, Cognitive Science and Psychology.

Thanks, Shamus!

February 20, 2019

So, Shamus Young made two posts talking about Alpha Star, an attempt to create an AI that can play Starcraft II: how it managed to beat human players, and then how a human player exploited a tendency in it to beat it. There was a lot of discussion about that in the comments, and that made me want to do AI again after it being a … few years since my last attempt. And, of course, I clearly have lots of time to spare and no other projects that I want to look at that I could be doing instead of that. Thanks, Shamus!

Anyway, I went out and bought some books on the subject, two of which are detailed books about how to do AI in general and how to do Deep Learning in Python (the last is a technical book on Deep Learning that I would have already started reading except that it starts with Linear Algebra, which is not something I want to review while watching curling …). So I have that to get to, but in pondering it and reading the comments another idea percolated in me.

The AI work there focuses a lot on neural nets, as far as I can tell. Now, neural nets have been around for ages, and have waxed and waned in their popularity for AI due to their rather well-known weaknesses (I’ll talk more about that in general in a later post). But one thing that kept coming up, especially when the exploit was revealed, was “Can’t you just explain to it or make a rule in it to deal with that exploit?” And the answer is that you can’t really do that with neural nets, because they don’t explicitly encode rules and don’t really have an “Explain this to me” interface. What you can do is train them on various training sets until they get the right answers, and what often makes them appealing is that they can come to right answers that you can’t figure out the reasoning behind, which makes them look smarter even though they can’t figure out the reasoning behind them either. So, perhaps, they can be very intuitive, but they cannot learn by someone carefully explaining the situation to them.

But inference engines, in theory, can.

There’s also a potential issue with using a game like Starcraft II for this, because, as people have pointed out, the intelligent parts of it (the strategy) can get swamped by simple speed of movement or, in the vernacular, “clicking”. As is the case in curling, the best strategy in the world doesn’t matter if you can’t make the shots, and in this case while you’re working out that grand strategy someone who builds units faster and maneuvers them better will wipe you out. A Zerg rush isn’t a particularly good strategy, but if you build the units fast enough and can adjust their attack faster than your opponent can adjust, you might win even if your opponent is a better strategist than you are. In short, Starcraft II privileges tactical reasoning over broad strategic reasoning, and while tactical reasoning is important (arguably even more so in an actual battlefield situation), broad strategic reasoning seems more intelligent … especially when some of those tactical considerations come down to how quickly you can get orders to your units.

So what we’d want, if we really wanted intelligence, is a game where you have lots of time to think about the situation and reason it out. There’s a reason that chess is, or at least was, the paradigm for artificial intelligence (with Go recently making waves). But that game can be handled by look-ahead algorithms, and look-ahead algorithms aren’t a form of reasoning that humans can really use, because we just can’t remember that much (although it has been said that chess grandmasters do, in fact, employ a deeper look-ahead strategy than most people are capable of. And now I want to start playing chess again and learn how to play it better, in my obviously copious spare time). There’s also the issue that chess and Go are fairly static games (as far as I can tell, as I’m not a Go expert), so things proceed in a pretty orderly way from move to move and aren’t very chaotic or diverse.
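
By look-ahead I mean a standard minimax-style search over possible future moves, roughly like the sketch below; the game interface (moves / apply / is_over / evaluate) is something I’m assuming purely for illustration:

# A minimal depth-limited look-ahead (minimax), written against a generic
# game interface (moves / apply / is_over / evaluate) that I'm assuming here
# purely for illustration.
def look_ahead(game, state, depth, maximizing):
    if depth == 0 or game.is_over(state):
        return game.evaluate(state), None
    best_value = float("-inf") if maximizing else float("inf")
    best_move = None
    for move in game.moves(state):
        value, _ = look_ahead(game, game.apply(state, move), depth - 1, not maximizing)
        if (maximizing and value > best_value) or (not maximizing and value < best_value):
            best_value, best_move = value, move
    return best_value, best_move

Real chess engines add pruning and far better evaluation functions on top of this, but the basic shape is the same: a brute-force search of possible futures rather than anything much like human reasoning.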

Which got me thinking about the board games I have that have chaotic or random elements to them, like Battlestar Galactica or Arkham Horror. These games let you develop grand strategies, but are generally random enough that those grand strategies won’t necessarily work and you have to adjust on the fly to new situations. They’re also games that have set rules and strategies that you can explain to someone … or to an AI. So my general musings led me to a desire to build an inference engine type system that could play one of those sorts of games, but where I could explain to it what it did wrong, and see how things go. Ideally, I could have multiple agents running, explain more or less to each of them, and see how they work out. But the main components are games where you have set overall strategies that the agents can start with, and yet the agent also has to react to situations that call for deviations, and most importantly will try to predict the actions of other players so that it can hopefully learn to adjust those predictions when they don’t do what is expected.
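
At its simplest, that “explain it to the agent” interface could just be a way of adding an explicit rule that gets checked before the default strategy; everything in this sketch is a made-up placeholder for whichever game I end up picking:

# Sketch of the "explain it to the agent" idea: a correction becomes an
# explicit rule checked before the default strategy. Everything here is a
# made-up placeholder for whichever game I end up picking.
class ExplainableAgent:
    def __init__(self, default_strategy):
        self.default_strategy = default_strategy
        self.rules = []   # (condition, action) pairs added by explanation

    def explain(self, condition, action):
        # "When you see this situation, do this instead."
        self.rules.append((condition, action))

    def choose_action(self, situation):
        for condition, action in self.rules:
            if condition(situation):
                return action
        return self.default_strategy(situation)

agent = ExplainableAgent(default_strategy=lambda s: "follow the opening plan")
agent.explain(lambda s: s.get("under_attack"), "defend the weakest position")
print(agent.choose_action({"under_attack": True}))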

Now, other than picking a game to try to implement this way (Battlestar Galactica’s traitor mechanism is a bit much to start with, while Arkham Horror being co-operative means that you don’t have to predict other players much), the problem for me is that, well, I’m pretty sure that this sort of stuff has been done before. I’m not doing anything that unique other than in the games I’m choosing. So, if I did some research, I’d find all of that and get a leg up on doing it, at least. But a quick search for books didn’t give me anything for that specifically, a Google search will make it difficult to sort the dreck from the good stuff, and the more up-front research I try to do the less actual work I’ll be doing, and I want to do some work. Simple research is just plain boring to me when I’m doing it as a hobby. So my choices are to reinvent the wheel or else spend lots of time looking for things that might not be there or might not be what I want.

So, I’ll have to see.

Anyway, thanks Shamus for adding more things to my already overflowing list of things I want to do!

Cacheing and Intelligence

July 22, 2015

At one point in my Cognitive Science/Philosophy courses, we talked a bit about contextualism about language, which is the idea that we critically rely on various contexts to determine the meaning of a sentence. For example, if I say “I went to the bank yesterday”, the sentence itself is perfectly compatible with my going to the place where I keep my money or to the place beside the river. For the most part, we get the determination right, but most interesting to me are the cases where we get it spectacularly wrong. In the case where I first heard about this, for example, everyone in the room thought that the lecturer meant that the person should get on the desk, instead of looking for something that they could use on the desk. There are entire genres of comedy built entirely around someone failing to parse the right meaning out of a sentence, and having hilarity ensue. So we find that our ability to disambiguate words is both massively successful and shockingly terrible at times. What explains this ability?

To me, the main clue starts from the psychological process of “priming”. Essentially, this is the process where, if we are exposed to, say, a word that is related to another word in a list that we’ve already recently processed, we process that word faster than we would otherwise. So, for example, if you’re reading a list of words and come across the word “Doctor” and then not too much later come across the word “Nurse”, you process “Nurse” faster and more easily than you would if you hadn’t come across “Doctor” beforehand. This is hard to explain.

Being someone from both a philosophical and a computing background, I do have a suggestion for what could be going on here. In general, it seems to me that what we probably have is a combination of time-saving techniques that are common in computer science when loading time is an issue. First, if it is common for a bunch of things to all be referenced together, then instead of loading precisely the part you need and then immediately loading the other parts, you load the whole thing into memory and use it. If you don’t use all of it, you don’t lose much, because the cost is in the initial loading and seeking out of the object you’re looking for, not in loading its individual parts. The second is to keep things in memory that you have recently used, because you’re likely to want to use them again in a short period of time; this is usually called “cacheing”. There are a number of Cognitive Science AI theories that rely on storing and loading objects and contexts instead of, say, simply words, so all we need to do, then, is add cacheing.

I’ve written a little program to play with cacheing to show how priming could work using it. I won’t reproduce the program here because HTML wants to ignore leading spaces and Python critically depends on leading spaces, so it’s a lot of work to put a program here, but in general what the program does is set up a number of lists that contain various characters that have various traits. For my demo, I created one with David Eddings characters, one with Persona characters, and one with other characters. The lists are as follows:

[Kalten, Sparhawk, Sephrenia, Ulath, Ehlana]
[Akihiko, Dojima, Yukari, Junpei, Naoto, Adachi, Yu, Mitsuru]
[Sherlock Holmes]

I then set up some matching criteria that you can ask the system to look for. You can look to see if the character is a Knight, is Male, is a Fictional Character, Carries a Sword, is a Detective, or is a Video Game Character. And you can ask for multiple criteria to be matched. For example, this was my first set of criteria:

print(matchMemoryElement(["Video Game Character", "Carries A Sword"]))

And given the lists above, the first one that it finds is Junpei.

So what if I run that search and then run another one looking for an Eddings Character? Note that since I randomize the lists every time (to allow me to get odd results without having to plan things out), the lists on this run start as follows:

[Kalten, Ehlana, Ulath, Sparhawk, Sephrenia]
[Dojima, Akihiko, Naoto, Junpei, Yukari, Adachi, Yu, Mitsuru]
[Sherlock Holmes]

And the results are:

[Sephrenia, Dojima, Akihiko, Naoto, Junpei]
Junpei
[Sephrenia, Dojima, Akihiko, Naoto, Junpei]
Sephrenia

So we still find Junpei for the first criteria, as he’s still the first person in the lists that is both a video game character and carries a sword. But how come I found Sephrenia first for the Eddings character? She’s the last in the list; shouldn’t I have found Kalten first?

The reason is that five-element list that is printed out before the answer. That’s a cache, where I store the last five elements I’ve processed in case I need them again, so I don’t have to go back to the lists. In this case, I parsed through the whole list of Eddings characters, and then only got to the fourth element in the list of Persona characters before finding a match; then, when I tried to match the second set of criteria, it looked in the cache, found Sephrenia, and gave me that one … which would have been embarrassing if I was really looking for Kalten.
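
Since I didn’t reproduce the program itself, here’s a rough sketch of the lookup order it follows; the real program differs in its details, and here the characters would just be dictionaries with a name and a list of traits:

# Rough sketch of the lookup order the program follows; the real program
# differs in its details, and the names here are just for illustration.
CACHE_SIZE = 5
cache = []   # the last five characters processed, most recent last

def matches(character, criteria):
    return all(trait in character["traits"] for trait in criteria)

def match_memory_element(criteria, memory_lists):
    # Look in the cache first ...
    for character in cache:
        if matches(character, criteria):
            return character["name"]
    # ... and only go back to the full lists if nothing in the cache fits.
    for memory_list in memory_lists:
        for character in memory_list:
            cache.append(character)
            if len(cache) > CACHE_SIZE:
                cache.pop(0)   # erase the oldest entry to make room
            if matches(character, criteria):
                return character["name"]
    return None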

Let’s see what happens when instead of looking for an Eddings character, I look for a detective. The lists this time are:

[Ehlana, Sparhawk, Kalten, Sephrenia, Ulath]
[Yu, Junpei, Adachi, Yukari, Akihiko, Dojima, Naoto, Mitsuru]
[Sherlock Holmes]

And the results are:

[Sparhawk, Kalten, Sephrenia, Ulath, Yu]
Yu
[Sephrenia, Ulath, Yu, Junpei, Adachi]
Adachi

This time, there wasn’t a detective in the cache when it started, so it had to go back to the list to look for one, and ended up with Adachi.

Caches save loading time, because if you’ve already loaded an object and might use it again, you may be able to get it from the cache without having to load anything again. Also, despite the fact that the behaviour looks intelligent, it’s really quite simple, as all it does is store what you’ve loaded. Simple caches have no idea what you might load next, and don’t even have to intentionally cache things in case they might be needed again. All you need is a kind of white board that you just don’t erase, and a system that always looks on the white board first and, if nothing is there, erases some space and writes something else down. It’s a system that a brain could indeed implement by accident just by dealing with activation potentials. And yet, it has a lot of power to explain things like priming and the contextualization of language processing. I hope to delve more into this if I have some time, but for now this ought to do to give a quick idea of the potential of cacheing for AI.

Memory and random access lists …

December 4, 2014

When I was actively taking Cognitive Science courses, I took a course on Cognitive Psychology. Unless I’m misremembering (I’m a bit too lazy to look it up at the moment), one experiment we covered tried to determine whether, when given a list of numbers to search for a particular element, we iterate through the list and stop when we find the right one, or whether we iterate through the entire list regardless. Of course, all experience and common sense suggested that we’d stop when we found the right one, but the experiment showed that we seemed to access the entire list every time. The reasoning is that the experiment measured how long it took us to find an element, and compared the times for when it was, say, the first element in the list and when it was, say, the last element in the list. If we stopped when we found the element, you’d expect a significant difference between the time it takes to find it when it’s the first element and the time it takes when it’s the last element. You have to run it a bunch of times to avoid issues where one access might take more or less time than another due to factors that you can’t control for, but if you run it enough times you should always get this progression. And they didn’t see that. The times were, in general, pretty much flat regardless of which element in the list you were finding. So the conclusion was that we end up searching the entire list anyway instead of stopping when we find the right element.

Now, having a Computer Science background, I immediately saw a potential confound here. The reasoning holds if the model is that we simply iterate through the list of numbers and nothing else happens. However, if the model is that we first load the list into some sort of buffer and then iterate through it looking for the right answer, then whether this test works or not depends greatly on how long it takes to load that list into the buffer. After all, anyone who works with databases will know that, in order to find a particular element, you often load the instances into memory and then iterate through them, and that if you’re trying to make that process as efficient as possible it often doesn’t make sense to try to speed up iterating through the list, but rather to reduce the time it takes to load the information into the buffer in the first place.

Wanting to play a bit with Python anyway, I finally got around to writing a Python program that demonstrates this:

import random

def memoryArrayIterateTest(initialTime, timeBetweenAccesses, timesToRun):
    # This function iterates through a five-element memory list and calculates the time of access

    testList = [2, 3, 4, 5, 6]  # Start with 2 to make the difference between number and element clear
    timesList = [0, 0, 0, 0, 0]
    hitsList = [0, 0, 0, 0, 0]

    fudgeFactor = 0

    for x in range(0, timesToRun):
        number = random.randint(2, 6)
        # print(number)
        # fudgeFactor = random.randint(1, 5)
        accessTime = 0
        for i in range(0, 5):
            if(testList[i] == number):
                hitsList[i] = hitsList[i] + 1
                timesList[i] = timesList[i] + initialTime + fudgeFactor + accessTime
                break
            else:
                accessTime = accessTime + timeBetweenAccesses

    for y in range(0, 5):
        if(hitsList[y] != 0):  # Let's avoid dividing by 0
            s = "The time average at " + repr(y+1) + " is: " + repr(timesList[y]/hitsList[y])
            print(s)

Essentially, what this function does is create a five-element list running from 2 to 6, select an element from that list at random, and then iterate through the list looking for it. It takes an initial loading time, a time between accesses, and how many times you want to run it. It generates a random element as many times as you tell it to, and at the end of the day calculates the average access time for each position in the list.

I’ll keep my time between accesses at 1 and run it 1000 times. Let’s start by seeing what happens when the initial loading time is also 1:

>>> memoryArrayIterateTest(1,1,1000)
The time average at 1 is: 1.0
The time average at 2 is: 2.0
The time average at 3 is: 3.0
The time average at 4 is: 4.0
The time average at 5 is: 5.0

So here we get the nice progression, and a significant difference between the elements. So if the initial loading time is small, then we should see this sort of progression if we’re stopping when we find the element. Since the experiments don’t see it, it looks like that’s not what we do. But what happens when we say that the initial loading time is 1000?

>>> memoryArrayIterateTest(1000,1,1000)
The time average at 1 is: 1000.0
The time average at 2 is: 1001.0
The time average at 3 is: 1002.0
The time average at 4 is: 1003.0
The time average at 5 is: 1004.0

Now the time difference is insignificant; our numbers are almost flat, percentage-wise. Now what happens if I uncomment that fudge factor, adding in the fact that sometimes other factors will come into play on each run, differing from run to run?

>>> memoryArrayIterateTest(1000,1,1000)
The time average at 1 is: 1002.9009900990098
The time average at 2 is: 1003.8549222797927
The time average at 3 is: 1005.135
The time average at 4 is: 1006.1785714285714
The time average at 5 is: 1006.9377990430622
>>> memoryArrayIterateTest(1000,1,1000)
The time average at 1 is: 1002.9381443298969
The time average at 2 is: 1004.1609756097561
The time average at 3 is: 1005.0904522613065
The time average at 4 is: 1005.9368932038835
The time average at 5 is: 1006.969387755102
>>> memoryArrayIterateTest(1000,1,1000)
The time average at 1 is: 1002.8676470588235
The time average at 2 is: 1004.0449438202247
The time average at 3 is: 1004.9045454545454
The time average at 4 is: 1006.004854368932
The time average at 5 is: 1006.9375

Not a smoking gun (I was hoping to get wider time variances), but we do end up with some longer gaps and some shorter gaps, with some of them being essentially equal. This is probably because the random factors even out over more iterations, because if I run it with only 10:

>>> memoryArrayIterateTest(1000,1,10)
The time average at 1 is: 1003.25
The time average at 2 is: 1002.0
The time average at 3 is: 1004.5
The time average at 4 is: 1005.0
The time average at 5 is: 1005.0

Then I can get the first one taking longer than the second one. So if we do enough iterations, we can indeed correct for those random factors, most of the time. We won’t, however, correct for the initial loading time, and that’s still a major confound there.

We’d need to know whether there is an initial loading time before concluding that we don’t generally stop when we find the element we want while iterating through a list, and in my view the experience of what I do when I consciously do that trumps psychological experiments unless those experiments are free of serious confounds. So I’m skeptical about those results. The biggest objection you can make is that I still do get a progression, just not a significant one, and I’d have to see if the experiment found any progression at all. Which I’m not really going to do, because this was just a minor and interesting (at least to me) demonstration of a potential confound using Python. As I hope to do more AI programming in the near future, this was a nice way to run a little experiment and see some of the potential pitfalls of doing this sort of thing.