When I was actively taking Cognitive Science courses, I took a course on Cognitive Psychology. Unless I’m misremembering (I’m a bit too lazy to look it up at the moment), one experiment we covered tried to determine what we do when we’re given a list of numbers and asked to find a particular element: do we iterate through the list and stop when we find the right one, or do we iterate through the entire list regardless? Of course, all experience and common sense suggested that we’d stop when we found the right one, but the experiment showed that we seemed to access the entire list every time. The reasoning was this: the experiment measured how long it took us to find an element, and compared the times for when it was, say, the first element in the list against when it was, say, the last. If we stopped when we found the element, you’d expect a significant difference between the time it takes to find the first element and the time it takes to find the last one. You have to run it a bunch of times to wash out cases where one access takes more or less time than another due to factors you can’t control for, but if you run it enough times you should always get this progression. And they didn’t see that. The times were, in general, pretty much flat regardless of which element in the list you were finding. So the conclusion was that we end up searching the entire list anyway instead of stopping when we find the right element.

Now, having a Computer Science background, I immediately saw a potential confound here. This reasoning holds if the model is simply to iterate through the list of numbers and nothing else happens. However, if the model is to first load the list into some sort of buffer and then iterate through it looking for the right answer, then whether this test works depends greatly on how long it takes to load that list into the buffer. After all, anyone who works with databases knows that to find a particular element you often load the instances into memory and then iterate through them, and that if you’re trying to make that process as efficient as possible it often doesn’t make sense to speed up the iteration through the list; instead, you try to reduce the time it takes to load the information into the buffer.
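Before getting to the program itself, the two-stage model described above can be sketched as a simple timing equation (this is my framing of it, not anything from the experiment): the time to find the element at position i is the loading time plus i minus 1 steps of scan time.

```python
# A minimal sketch (my framing) of the two-stage model described above:
# first load the list into a buffer, then scan it in order until a match.
def search_time(load_time, step, position):
    # position is 1-based: the first element needs no scanning steps
    return load_time + (position - 1) * step

# With a tiny load time, position dominates the total:
print([search_time(1, 1, i) for i in range(1, 6)])     # [1, 2, 3, 4, 5]
# With a huge load time, the same progression all but disappears:
print([search_time(1000, 1, i) for i in range(1, 6)])  # [1000, 1001, 1002, 1003, 1004]
```

The absolute gap between positions is identical in both cases; only the loading time changes whether that gap stands out.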

Wanting to play a bit with Python anyway, I finally got around to writing a Python program that demonstrates this:

import random

def memoryArrayIterateTest(initialTime, timeBetweenAccesses, timesToRun):
    #This function iterates through a five element memory list and
    #calculates the average time of access for each position
    testList = [2,3,4,5,6] #Start with 2 to make the difference between number and element clear
    timesList = [0,0,0,0,0]
    hitsList = [0,0,0,0,0]
    fudgeFactor = 0
    for x in range(0, timesToRun):
        number = random.randint(2,6) #Pick the element to search for at random
        #fudgeFactor = random.randint(1,5) #Uncomment to add random per-trial noise
        accessTime = 0
        for i in range(0, 5):
            if testList[i] == number:
                hitsList[i] = hitsList[i] + 1
                timesList[i] = timesList[i] + initialTime + fudgeFactor + accessTime
                break
            else:
                accessTime = accessTime + timeBetweenAccesses
    for y in range(0, 5):
        if hitsList[y] != 0: #Let's avoid dividing by 0
            s = "The time average at " + repr(y+1) + " is: " + repr(timesList[y]/hitsList[y])
            print(s)

Essentially, what this function does is create a five element list from 2 – 6, select an element from that list at random, and then iterate through the list looking for it. It takes in an initial loading time, a time between accesses, and how many times you want to run it. It generates the element as many times as you tell it to, and at the end of the day calculates the average access time for each element in the list.

I’ll keep my access time at 1 and run it 1000 times. Let’s start by seeing what happens when the initial loading time is also 1:

>>> memoryArrayIterateTest(1,1,1000)

The time average at 1 is: 1.0

The time average at 2 is: 2.0

The time average at 3 is: 3.0

The time average at 4 is: 4.0

The time average at 5 is: 5.0

So here, we get the nice progression, and a significant difference between the elements. If the initial loading time is small, then we should see this sort of progression if we’re stopping when we find the element. Since the experiment didn’t find one, it looks like that’s not what we do. But what happens when we say that the initial loading time is 1000?

>>> memoryArrayIterateTest(1000,1,1000)

The time average at 1 is: 1000.0

The time average at 2 is: 1001.0

The time average at 3 is: 1002.0

The time average at 4 is: 1003.0

The time average at 5 is: 1004.0
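To put a number on how flat this is, here’s a quick check (my own addition, using the averages from the two runs above) of the spread between the slowest and fastest positions, relative to the fastest:

```python
# Relative spread between the slowest and fastest positions, as a fraction
# of the fastest time (my own addition, using the averages printed above).
def relative_spread(times):
    return (max(times) - min(times)) / min(times)

print(relative_spread([1.0, 2.0, 3.0, 4.0, 5.0]))                 # 4.0: a 400% difference
print(relative_spread([1000.0, 1001.0, 1002.0, 1003.0, 1004.0]))  # 0.004: a 0.4% difference
```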

Now the time difference is insignificant. Our numbers are almost flat, percentage-wise. Now what happens if I uncomment that fudge factor, so that on each trial some other factors come into play and add a different amount of time?

>>> memoryArrayIterateTest(1000,1,1000)

The time average at 1 is: 1002.9009900990098

The time average at 2 is: 1003.8549222797927

The time average at 3 is: 1005.135

The time average at 4 is: 1006.1785714285714

The time average at 5 is: 1006.9377990430622

>>> memoryArrayIterateTest(1000,1,1000)

The time average at 1 is: 1002.9381443298969

The time average at 2 is: 1004.1609756097561

The time average at 3 is: 1005.0904522613065

The time average at 4 is: 1005.9368932038835

The time average at 5 is: 1006.969387755102

>>> memoryArrayIterateTest(1000,1,1000)

The time average at 1 is: 1002.8676470588235

The time average at 2 is: 1004.0449438202247

The time average at 3 is: 1004.9045454545454

The time average at 4 is: 1006.004854368932

The time average at 5 is: 1006.9375

Not a smoking gun (I was hoping to get wider time variances), but we do end up with some longer gaps and some shorter gaps, with some of them being essentially equal. This is probably because the random factors even out over more iterations; if I run it with only 10:

>>> memoryArrayIterateTest(1000,1,10)

The time average at 1 is: 1003.25

The time average at 2 is: 1002.0

The time average at 3 is: 1004.5

The time average at 4 is: 1005.0

The time average at 5 is: 1005.0

Then I can get the first one taking longer than the second one. So if we do enough iterations, we can indeed correct for those random factors, most of the time. We won’t, however, correct for the initial loading time, and that’s still a major confound.
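Why the averaging works is just the law of large numbers: each trial’s fudge factor is random.randint(1,5), whose mean is 3, so with enough trials every position’s average simply shifts up by roughly the same amount instead of having its order scrambled. A quick sketch of this (my own addition):

```python
import random

# With enough draws, the per-trial noise random.randint(1, 5) averages out
# to its mean of 3, which is why the fudge factor shifts every position's
# average up by about the same amount rather than reordering them.
random.seed(42)  # fixed seed so the demonstration is repeatable
few = sum(random.randint(1, 5) for _ in range(10)) / 10
many = sum(random.randint(1, 5) for _ in range(100000)) / 100000
print(few, many)  # the large-sample mean should sit very close to 3
```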

We’d need to know whether there is an initial loading time before concluding that we don’t generally stop iterating through a list of elements when we find the one we want, and in my view the experience of what I do when I consciously do that trumps psychological experiments unless those experiments are free of serious confounds. So I’m skeptical about those results. The biggest objection you can make is that I still do get a progression, just not a significant one, and I’d have to see if the experiment found any progression at all. Which I’m not really going to do, because this was just a minor and (at least to me) interesting demonstration of a potential confound using Python. As I hope to do more AI programming in the near future, this was a nice way to run a little experiment and see the potential pitfalls of doing this sort of thing.
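If I did want to check for any progression at all, one way (my own sketch, not something from the original experiment) would be to fit a least-squares slope to average time versus list position; a reliably positive slope would reveal the per-access cost even when a large loading time dwarfs it.

```python
# Least-squares slope of average access time against 1-based list position
# (my own sketch). A reliably positive slope would indicate a progression
# even when a large loading time makes the raw numbers look flat.
def slope(ys):
    xs = list(range(1, len(ys) + 1))
    n = len(ys)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Rounded averages from one of the fudge-factor runs above: the slope
# recovers roughly the 1-unit time between accesses.
print(slope([1002.9, 1003.9, 1005.1, 1006.2, 1006.9]))
```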

Tags: AI
