Archive for the ‘Computers’ Category

More Net Neutrality

April 27, 2018

So, after the FCC in the United States dropped Net Neutrality regulations, a number of states have jumped on the bandwagon and moved to impose their own rules. Unfortunately, as I showed in an earlier post on the topic, most of them have no real idea how the Internet works, and Net Neutrality has become a buzzword covering so many different things that it has become meaningless. I’m going to look at California’s proposed law to show that (which is referenced in the above Wired link).

First, again, the “Fast Lane” idea is brought up, and completely misunderstood:

The bill also leaves open the possibility of offering “fast lanes” for select content, but only at a customer’s discretion. Essentially, a carrier could allow you to pick a few applications to prioritize. For example, if you want to make sure your family’s video streaming doesn’t cut into your Skype calls, you could, hypothetically choose to prioritize Skype. But it must leave the selections to the customers, and not allow companies to pay for preferential treatment.

Except, as I pointed out in the original post linked above, that was never how “Fast Lanes” were going to work. What was almost certainly going to happen was that additional infrastructure would be added in the core — more available bandwidth, faster equipment, or both — and the companies that needed it would pay to get access to it. This would likely be accomplished by marking the packets they sent with a tag in the header that the equipment would recognize and then route along the appropriate path. So all of this was going to be done at the end that was sending the videos, not the side receiving them. At the customer side, nothing would be added and nothing would change; everything important would happen long before the data reached the customer.
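As a rough illustration of the kind of header tagging I mean, IP already has a Differentiated Services field that a sender can set and that equipment along the path can match on when deciding which queue or path a packet gets. Here is a minimal Python sketch of marking traffic that way; the specific DSCP value, the address, and the assumption that anything downstream actually honours the mark are mine for the sake of the example, not a description of how any particular ISP does it.

import socket

# Minimal sketch: mark outgoing datagrams with a DSCP value (46 is "Expedited
# Forwarding", commonly used for latency-sensitive traffic). Whether any router
# along the way honours the mark is entirely up to the network operator.
# (Setting IP_TOS this way works on Linux; other platforms differ.)
DSCP_EF = 46
TOS_VALUE = DSCP_EF << 2  # DSCP sits in the upper six bits of the old TOS byte

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_VALUE)

# From here on, packets sent on this socket carry the marking in their IP header,
# which core equipment could use to steer them onto a faster or less loaded path.
sock.sendto(b"video chunk", ("192.0.2.10", 5000))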

This “allowed” plan is completely different, as it’s all receiver/requester side. There are only two reasons a customer could want this. The first is that the set-up inside their home that connects the router to their own equipment is too slow to handle all of that bandwidth. There isn’t anything the ISP can do about that, and I believe there are already software and router solutions that allow this sort of prioritization, so no ISP is going to provide a solution for that kind of problem. The other reason is that there isn’t enough bandwidth on the line from the outside equipment to the router to handle that load. In theory, an ISP could prioritize data at that point, although it would require them to deeply inspect the traffic, or at least map the sending IP address to a specific destination, to do so. However, their preferred solution for these cases would be for the customer to upgrade to a faster Internet connection, which would make them more money. So, since this would require extra work on their end and would potentially cost them the money that an upgrade would bring in, they would charge for it, and would generally charge more than the upgrade to a higher speed would cost (since they can already sell that without any extra work or software). Thus, this “possibility” will likely only be of use to people who are already at the highest speeds, or who have very specific conditions that require the prioritization to happen outside of their own routers and routing software. Somehow, I don’t see ISPs rushing to provide this (and I’m not even sure there is any great demand for it now).

So, what the law allows for does not in any way address the issue that “Fast Lanes” were meant to address — which was that applications that required a lot of incredibly reliable bandwidth required equipment investments that telcos might not see extra profit from making — and instead allows a solution that customers may not want and ISPs aren’t going to see huge benefits from adding. This does not seem like a well-informed law to me.

This gets even worse when we look at a relatively new thing that wasn’t part of most of the big Net Neutrality discussions: things like “zero rating” and other ways to exclude some data from a customer’s download cap:

Wiener’s bill explicitly prohibits carriers from using interconnection agreements to circumvent its net neutrality rules, and bans certain types of zero rating. Broadband providers would no longer be able to exempt their own services from data caps, and they wouldn’t be allowed to selectively choose to exempt outside services from data caps, regardless of whether those services pay for the privilege or not. That means Verizon would no longer be able to zero rate live streaming National Football League games, even if the NFL doesn’t pay for the privilege.

Zero rating has long been controversial. Proponents argue that zero rating is consumer friendly because it lets customers use more data. Critics argue that it lets carriers pick winners and losers on the internet, which goes against the idea of net neutrality. The Obama-era FCC leaned towards the latter view. Towards the end of former FCC chair Tom Wheeler’s tenure in late 2016, the agency warned AT&T and Verizon, which exempts its Go90 video service from data caps, that their zero-rating practices were likely anticompetitive. But one of the first things Republican FCC chair Ajit Pai did after he was appointed by President Trump was end the agency’s investigation into the companies.

The California bill would allow zero rating in some cases. For example, a carrier could exempt a category of apps or services, as long as the exemption applied to all services in that category. T-Mobile’s Binge On and Music Freedom, which zero rate a large number of streaming video and music services, might be allowed. T-Mobile claims that all services that meet its technical requirements can be included in the service. The California bill would also allow carriers to zero rate all data under certain circumstances. For example, a provider could let people use unlimited amounts of data at night, but charge for data use during the day.

Now, “zero rating”, in this way, is a problem, and so it’s surprising how little attention it got in the original Net Neutrality debates. While excluding certain things from download caps does indeed let customers use more applications, and more data-intensive applications, before hitting their cap and having to pay more, in general it isn’t likely that they will use that headroom to subscribe to two competing apps, especially if they have to pay for both. It’s far more likely that they will use it to subscribe to, say, both a video streaming service and an online game: without the exemption, since both use a lot of bandwidth, using both would blow past their cap and end up costing more than they are worth, so they’d have to give up at least one of them. By being able to exclude their own streaming service, say, ISPs give customers a reason to subscribe to theirs and not to their competitors’, which is something that regulation might reasonably want to address. But that, then, is more about a business practice than about Net Neutrality.

And the law itself pretty much allows for that, by allowing ISPs to exempt full categories or to have day and time exemptions. But what’s interesting — especially in light of what is allowed for “fast lanes” — is that it doesn’t mention making these exemptions available at the customer’s discretion. Being able to pay to exempt an MMO or a streaming service, say, from my download cap seems pretty useful, and isn’t mentioned at all, and if it isn’t in the law then it might not be allowed because it wouldn’t be “Net Neutral”. My experience with cable companies — and the reasons given for why they, in general, couldn’t just allow complete channel choice until, in Canada, the CRTC demanded they do it — is that they claimed the regulator’s rules wouldn’t let them do what they needed in order to provide it (my belief is that they were at least using the “Canadian content” rules as an excuse). Unless they see great money to be made from this, ISPs are likely to use the Net Neutrality rules as a reason not to do it, and to instead hold out for weakening Net Neutrality rules in other ways that benefit them along with doing this. Thus, this thing that would actually be really consumer-friendly won’t get off the ground until these laws are weakened.
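For concreteness, here is a toy sketch of how cap accounting with the kinds of exemptions the bill allows (a whole category, plus an off-peak window) might work. The categories, numbers, and time window are invented for illustration; this is not a description of any real ISP’s billing system.

from datetime import time

# Toy cap accounting with two kinds of exemption: a zero-rated category and an
# off-peak "free at night" window. All values here are made up.
ZERO_RATED_CATEGORIES = {"music streaming"}
OFF_PEAK_START, OFF_PEAK_END = time(1, 0), time(6, 0)

def billable_megabytes(usage_records):
    """usage_records: list of (category, megabytes, time_of_day) tuples."""
    total = 0
    for category, megabytes, when in usage_records:
        if category in ZERO_RATED_CATEGORIES:
            continue  # exempt: the whole category is zero rated
        if OFF_PEAK_START <= when <= OFF_PEAK_END:
            continue  # exempt: used during the off-peak window
        total += megabytes
    return total

usage = [("music streaming", 500, time(20, 0)),
         ("video streaming", 2000, time(21, 0)),
         ("video streaming", 3000, time(2, 30))]
print(billable_megabytes(usage))  # only the 2000 MB of peak-time video counts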

Now, there are technical reasons why day-and-time and certain service exemptions might be useful for ISPs. Large downloads during peak times can slow the system down for everyone, so encouraging people to do these things outside of peak hours can make things better for everyone. And if certain applications aren’t an issue — because they aren’t time sensitive and so don’t have to get there right away — then excluding them lets the ISP focus on the content that is causing technical issues. But the law references those cases and, at least in the article, doesn’t reference the customer-chosen exemption case — the analogue of what it allows for “fast lanes”, where customer choice would actually be useful, unlike for “fast lanes”. Again, this does not seem like a well-informed law to me.

Does the Internet need some regulation? Sure. But what has struck me almost every time this whole “Net Neutrality” thing comes up is how little the people advocating for it seem to understand the Internet, and yet their voices end up being the loudest in this whole thing. That is how we’ll end up with terrible laws that people then point to as proof that regulation is a bad thing that only hurts industry and consumers, and that’s not what we want. What we want is good, solid, well-informed regulation that works out best for everyone, ISPs and consumers alike. I don’t see us getting that this way.


Net Neutrality again …

December 22, 2017

So, the FCC in the United States has abandoned Net Neutrality, at least for now, and the panic has set in again. I was going to go through and talk about Net Neutrality because I have had sites — mainly “The Orbit” — crying wolf about the idea that sites like theirs might be blocked by their ISP and that we needed Net Neutrality to stop that, which struck me as odd considering that the original outcry over Net Neutrality was about “fast lanes”. So I thought I’d go out and read up on what the issues were, since things seemed inconsistent, and then talk about that … but, as it turns out, I’ve already done that. Twice. So I encourage everyone to read those posts for the issues around the “fast lane”, while here I’ll talk specifically about blocking sites.

There are essentially three places that a site can be blocked by an ISP. The first is at your end, when you request data from that site. The ISP can refuse to make that connection and thus deny you access to that site. There’s one issue here, though: you are paying for service from that ISP. If they deny you access to a site, they had better have a reason, or else they would be, at least potentially, violating the service contract. On top of that, if they do it, customers complain bitterly. Typically, the argument here is that if an ISP is the only one available for an area, then it doesn’t need to care about that, but that doesn’t really apply in most cases. So if an ISP is going to do this, it’s going to have to be important to them to do so. They aren’t going to do that for a small site like “The Orbit”.

The second place they can do it is at the site’s end. This pretty much runs afoul of the same service contracts as the end user, only more so. This is especially dangerous because you might end up, say, blocking a blog from WordPress, and WordPress is a bit bigger than one customer or even a number of them, and they aren’t going to take it lying down. If they make a big enough fuss, lots of people will pay attention. Again, if they do this, they’d have to have a really good reason.

The third place they can do it is when they are an intermediary in the connection. So, say, Verizon hands off to AT&T in the core and then it gets switched back to Verizon for the end customer. Putting aside the fact that an ISP would be stupid to block a site this way if they didn’t also block it for their own customers, a lot of features already rely on ISPs treating packets they get from other ISPs as if they had originated them themselves, without adding extra restrictions. If an ISP suddenly starts blocking access to sites when the two end users are not their customers, those end users will complain to the other ISP, who will have to do damage control, and is likely to at least threaten to start inhibiting the first ISP’s services in retaliation. So even if an ISP blocks sites for their own users, they aren’t likely to inhibit pass-through traffic.

All of this changes if they have sufficient reason, such as blocking problematic sites (child pornography being the obvious and uncontroversial example). Besides those cases, the most likely reason would be to give themselves a competitive advantage in some way. If they have a competing service, they can try to block their competitors so that they get more users than their competitors do. Of course, all the other ISPs will try the same thing, which will only make customers incredibly unhappy and likely to refuse to use any of them, and the negative publicity will likely force regulations blocking that. They’re more likely to cheat with “fast lanes”: charging huge fees for access, knowing that their own service would essentially be paying themselves and so they’d break even, while other sites would struggle with the cost and with keeping their site profitable while paying for it. Again, note that ISPs are quite likely to respect each others’ “fast lanes”, so it’s sites like Netflix, and any site that is independent of an ISP, that will feel the heat there. But the issue here is not “fast lanes” or Net Neutrality; it’s that companies that own ISPs can and do also own content providers, meaning they have a conflict of interest that they can exploit. We probably should focus more on dealing with consolidation than on worrying about Net Neutrality.

As I’ve said in my previous posts, no one really wants Net Neutrality. What we really want is protection from unfair business practices. Net Neutrality is a “motherhood” statement that people are using to get that, but when examined closely that’s not really the way to go, since we can see benefits from not having Net Neutrality and the concerns people are pushing aren’t that credible.

Crap, It Succeeded …

November 1, 2017

So, at work I am quite busily working on a feature where, essentially, what I’m trying to do is take an operation that used to be one-step and introduce a step into it where you can do that first step — which is most of the work a user might want to do — and then finish it later. This means that there are two major parts to the feature. The first is to do the first step so that everything is stored and there when we want to finish it off. The second is the operation to complete the original one-step operation in precisely the same way as the one-step operation did it. Thus, a lot of testing has to be done to ensure that the end result of my two-step operation is exactly the same as the end result of the one-step operation. Since there are a ton of different combinations, this is something that I need a lot of help from QA to do.

It also means that I can get into an interesting situation, which happened over the weekend. One specific scenario was failing, so I was working through the code and fixing up all the places where that failed. After I did that, it completely succeeded! But, I had to check to see that it did all the same things as the one-step operation, and things were looking a little funny, so I tried to create it using the one-step process … and it failed. After making sure that what I was doing wasn’t screwing something up, I then spent the next day trying to figure out where my code was going wrong and succeeding when it should have failed. I finally managed to successfully get it to fail and thus knew that my code was closer, at least, to being correct.

This is the second time on this feature where I had something succeeding when it should have failed, and so was incorrect. The other time, it seemed to work — meaning fail or succeed appropriately — for the QA people, so I ignored it as being something odd with my set-up. But it’s one of those odd cases where succeeding is really a failure and a failure would really be a success.

Of course, all error cases are like that. But this wasn’t supposed to be an error case. It just happened to be a failure case due to misconfiguration. And that always leads to that odd feeling of “Damn, it worked, so I did something wrong!”

Needing the big picture …

December 9, 2015

So, in an attempt to update my programming skills — I’ve spent most of my career in C/C++, only now really getting into Java — I’ve decided to start doing little projects in HTML/Javascript, which I’ve been poking around with over the past few weekends. And what I’ve noticed is that, for me, the Javascript is generally pretty easy. I built an XML file and loaded it into classes in a couple of hours. No, when I get stuck, it’s always on the HTML stuff: hooking it up to Javascript, adding panels, etc, etc.

And I, of course, didn’t buy a book on HTML because, hey, how hard could it be?

The issue, I think, is that for HTML — and for any UI — it’s pretty hard to just build a bunch of small pieces and stick them together and make it work. For Javascript — or Python, for that matter — it’s relatively easy to start with some small classes and functions, stick them together, and then just Google or search through the book to find an example of what you need to do at this very moment, stitch that in, and move on. With a UI, everything pretty much has to work inside an overall context, and you need that overall, “big picture” definition before you’ll be able to do the small things. Again, in Javascript I can read in my XML file without storing it in a class, and can store it in a class without having that be used anywhere. In short, I can work “bottom-up” if I want to, which means that I can break things down into small tasks that I can assemble into a working program later. But with the UI, if I don’t have the overall structure in place, then nothing will look right, and nothing will work.

(The fact that, like Weyoun, I have no sense of aesthetics doesn’t really help [grin]).

I have a book now, and after skimming through it a bit it looks like it will be able to teach me what I need to know to progress. So all I have to do is actually sit down and do it.

That might be harder than doing the HTML …

Cacheing and Intelligence

July 22, 2015

At one point in my Cognitive Science/Philosophy courses, we talked a bit about contextualism about language, which is the idea that we critically rely on various contexts to determine the meaning of a sentence. For example, if I say “I went to the bank yesterday”, the sentence itself is perfectly compatible with my going to that place where I keep my money or to the place beside the river. For the most part, we get the determination right, but most interesting to me are the cases where we get it spectacularly wrong. In the example where I first heard about this, everyone in the room thought that the lecturer meant that the person should get on the desk, instead of looking for something that they could use on the desk. There are entire genres of comedy built entirely around someone failing to parse the right meaning out of a sentence, and having hilarity ensue. So we find that our ability to disambiguate words is both massively successful and shockingly terrible at times. What explains this ability?

To me, the main clue starts from the psychological process of “priming”. Essentially, this is the process where if we are exposed to, say, a word that is related to another word in a list that we’ve already recently processed, we process that word faster than we would otherwise. So, for example, if you’re reading a list of words and come across the word “Doctor” and then not too much later come across the word “Nurse”, you process “Nurse” faster and more easily than you would if you hadn’t come across “Doctor” beforehand. This is hard to explain.

Being someone from both a philosophical and a computing background, I do have a suggestion for what could be going on here. In general, it seems to me that what we probably have is a combination of time-saving techniques that are common in computer science when loading time is an issue. First, if it is common for a bunch of things to all be referenced together, instead of loading precisely the part you need and no more and then immediately loading the other parts, you load the whole thing into memory and use it. If you don’t use all of it, you don’t lose much because the problem is the initial loading and seeking out the object you’re looking for, not loading the individual parts of it. The second thing is to store things in memory that you have recently used because you’re likely to want to use it again in a short period of time, which is often implemented by or called “cacheing”. There are a number of Cognitive Science AI theories that rely on storing and loading objects and contexts instead of, say, simply words, so all we need to do, then, is add cacheing.

I’ve written a little program to play with cacheing to show how priming could work using it. I won’t reproduce the program here because HTML wants to ignore leading spaces and Python critically depends on leading spaces, so it’s a lot of work to put a program here, but in general what the program does is set up a number of lists that contain various characters that have various traits. For my demo, I created one with David Eddings characters, one with Persona characters, and one with other characters. The lists are as follows:

[Kalten, Sparhawk, Sephrenia, Ulath, Ehlana]
[Akihiko, Dojima, Yukari, Junpei, Naoto, Adachi, Yu, Mitsuru]
[Sherlock Holmes]

I then set up some matching criteria that you can ask the system to look for. You can look to see if the character is a Knight, is Male, is a Fictional Character, Carries a Sword, is a Detective, or is a Video Game Character. And you can ask for multiple criteria to be matched. For example, this was my first criteria:

print(matchMemoryElement(["Video Game Character", "Carries A Sword"]))

And given the lists above, the first one that it finds is Junpei.

So what if I run that search and then run another one looking for an Eddings Character. Note that since I randomize the lists every time (to allow me to get odd results without having to plan things out), the lists on this run start as follows:

[Kalten, Ehlana, Ulath, Sparhawk, Sephrenia]
[Dojima, Akihiko, Naoto, Junpei, Yukari, Adachi, Yu, Mitsuru]
[Sherlock Holmes]

And the results are:

[Sephrenia, Dojima, Akihiko, Naoto, Junpei]
Junpei
[Sephrenia, Dojima, Akihiko, Naoto, Junpei]
Sephrenia

So we still find Junpei for the first criteria, as he’s still the first person in the lists that is both a video game character and carries a sword. But how come I found Sephrenia first for the Eddings character? She’s the last in the list; shouldn’t I have found Kalten first?

The reason is that 5 element list that is printed out before the answer. That’s a cache, where I store the last five elements I’ve processed in case I need them again so I don’t have to go back to the lists. In this case, I parsed through all of the Eddings characters list, and then only got to the fourth element in the list of Persona characters before finding one, and then when I tried to match the second set of criteria it looked in the cache, found Sephrenia, and gave me that one … which would have been embarrassing if I was really looking for Kalten.
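To give a rough idea of the shape of such a program, here is a much smaller stand-in, reconstructed by me purely for illustration (the character data is trimmed down and the trait sets are my own guesses), showing how little machinery the cache actually needs:

import random
from collections import deque

# A minimal sketch (not the original program) of how a "last things processed"
# cache can produce priming-like behaviour. Character traits are illustrative.
TRAITS = {
    "Sparhawk": {"Eddings Character", "Knight", "Male", "Fictional Character", "Carries A Sword"},
    "Kalten": {"Eddings Character", "Knight", "Male", "Fictional Character", "Carries A Sword"},
    "Sephrenia": {"Eddings Character", "Fictional Character"},
    "Junpei": {"Video Game Character", "Male", "Fictional Character", "Carries A Sword"},
    "Adachi": {"Video Game Character", "Male", "Fictional Character", "Detective"},
    "Sherlock Holmes": {"Fictional Character", "Male", "Detective"},
}

memoryLists = [["Kalten", "Sparhawk", "Sephrenia"], ["Junpei", "Adachi"], ["Sherlock Holmes"]]
cache = deque(maxlen=5)  # the "white board": the last five elements processed

def matchMemoryElement(criteria):
    wanted = set(criteria)
    for name in cache:                  # look on the white board first
        if wanted <= TRAITS[name]:
            return name
    for memList in memoryLists:         # otherwise go back to the long-term lists
        random.shuffle(memList)
        for name in memList:
            cache.append(name)          # everything we process lands in the cache
            if wanted <= TRAITS[name]:
                return name
    return None

print(matchMemoryElement(["Video Game Character", "Carries A Sword"]))
print(matchMemoryElement(["Eddings Character"]))  # may be answered from the cache

The deque holding the last five names is the entire “cache”; everything else is an ordinary linear search, which is the point.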

Let’s see what happens when instead of looking for an Eddings character, I look for a detective. The lists this time are:

[Ehlana, Sparhawk, Kalten, Sephrenia, Ulath]
[Yu, Junpei, Adachi, Yukari, Akihiko, Dojima, Naoto, Mitsuru]
[Sherlock Holmes]

And the results are:

[Sparhawk, Kalten, Sephrenia, Ulath, Yu]
Yu
[Sephrenia, Ulath, Yu, Junpei, Adachi]
Adachi

This time, there wasn’t a detective in the cache when it started, so it had to go back to the list to look for one, and ended up with Adachi.

Caches save loading time, because if you’ve already loaded an object and might use it again, you may be able to get it from the cache without having to load anything again. Also, despite the fact that the behaviour looks intelligent, it’s really quite simple, as all it does is store what you’ve loaded. Simple caches have no idea what you might load next, and don’t even have to intentionally cache things in case they might be needed again. All you need is a kind of white board that you just don’t erase, and a system that always looks on the white board first and, if nothing is there, erases some space and writes something else down. It’s a system that a brain could indeed implement by accident just by dealing with activation potentials. And yet, it has a lot of power to explain things like priming and the contextualization of language processing. I hope to delve more into this if I have some time, but for now this ought to give a quick idea of the potential of cacheing for AI.

NOBODY wants Net Neutrality …

February 26, 2015

So, from The NY Times, it looks like Net Neutrality is going to go through, kinda, sorta. The FCC is going to regulate the Internet as if it was a public good, which would allow it to impose net neutrality. And the summary of what it would prevent is this:

The F.C.C. plan would let the agency regulate Internet access as if it is a public good. It would follow the concept known as net neutrality or an open Internet, banning so-called paid prioritization — or fast lanes — for willing Internet content providers.

In addition, it would ban the intentional slowing of the Internet for companies that refuse to pay broadband providers. The plan would also give the F.C.C. the power to step in if unforeseen impediments are thrown up by the handful of giant companies that run many of the country’s broadband and wireless networks.

The ability to step in and say that providers can’t arbitrarily de-prioritize the content of companies that won’t play ball is good. However, no one wants the elimination of fast lanes. Even those who would never use a fast lane would rather a fee be tacked on for high-priority traffic than have all content providers pay for the infrastructure needed to provide it. If all I’m doing is simple file transfers, I don’t need a high Quality of Service throughout the Internet or low latencies; a short delay is not going to impact my service at all. For video, however, a delay or packets arriving out of order will hugely impact the service. Asking companies to pay for access to a priority tier that routes their traffic with low-latency, high-priority, high-bandwidth features helps them guarantee their services work as expected, while the companies that don’t care as much about that pay nothing and get standard service, which works for their needs. No one really wants all traffic treated the same, because different traffic has different requirements and so needs different features to work at its best. If you try to treat it all the same, no one is happy, because no one is getting the features they need.
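To put rough numbers on that difference, here is a toy Python comparison of how the same per-packet delays land on a bulk transfer versus a live stream. Everything in it is invented for illustration; it is not a model of any real network or service.

import random

# Toy illustration of why the same delays matter so differently for different traffic.
random.seed(1)
delays_ms = [random.uniform(10, 200) for _ in range(100)]  # per-packet network delay

# A bulk file transfer only cares when the last byte arrives: the whole download
# finishes roughly one packet-delay later than it otherwise would have.
print("file transfer finishes about", round(max(delays_ms)), "ms later, which nobody notices")

# A live video stream cares about every packet: anything that misses its (assumed)
# 50 ms playout deadline shows up as a stutter.
late = sum(1 for d in delays_ms if d > 50)
print(late, "of", len(delays_ms), "video packets miss the playout deadline")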

The fact is that video services are, as I’ve said before, both bandwidth intensive and in need of low latency and high priority. This is very expensive for ISPs to provide, requiring dedicated equipment that switches at a very high rate with an exceptionally low rate of dropped packets. As these services start to dominate, ISPs will have to provide some kind of infrastructure to handle them, or else the growing congestion will make those services unusable while also flooding out the services that didn’t care about that. Someone is going to have to pay for that infrastructure growth. The end user can’t, because they are paying for the line to their system, and that’s not where the infrastructure needs to be added. If ISPs try that, they will end up charging end users more for speeds that aren’t any higher and for the needs of content that they aren’t using. This will not go over well. Despite what people have claimed, the issue is not at the end user, but in the core, and ISPs will need to find a business case to expand the infrastructure in the core. Otherwise, their capital expenditures won’t result in an increase in revenue, and so they’ll simply end up losing money on the deal. It will not do the Internet any good to drive ISPs into losses trying to provide the services that customers want.

So if sites like Netflix want their content to get the features they need to make their customers happy, they’ll have to give ISPs some reason to provide those features. Trying to do it by standing on the “common good” or net neutrality won’t work, because ISPs will simply insist on treating everyone alike, as the regulations state, and so won’t treat Netflix traffic differently than anyone else’s … and different treatment is exactly what Netflix needs. They also won’t develop new features for traffic like Netflix’s, because there’s no profit in it for them. Both of these are totally consistent with Net Neutrality.

So, no, no one really wants Net Neutrality. This issue has been clouded by the reasonable desire to limit dishonest business practices so that people aren’t seeing that there are business practices that everyone wants that can’t be provided under strict Net Neutrality.

Memory and random access lists …

December 4, 2014

When I was actively taking Cognitive Science courses, I took a course on Cognitive Psychology. Unless I’m misremembering — I’m a bit too lazy to look it up at the moment — one experiment we covered tried to determine whether, when given a list of numbers to search for a particular element, we generally iterate through the list and stop when we find the right one, or whether we just iterate through the entire list regardless. Of course, all experience and common sense suggested that we’d stop when we found the right one, but the experiment showed that we seemed to access the entire list every time. The reasoning was that the experiment measured how long it took us to find an element, and compared the times for when it was, say, the first element in the list and when it was, say, the last element. If we stopped when we found the element, you’d expect a significant difference between the time it takes to find it when it’s the first element and the time it takes when it’s the last. You have to run it a bunch of times to avoid issues where one access takes more or less time than another due to factors you can’t control for, but if you run it enough times you should always get this progression. And they didn’t see that. The times were, in general, pretty much flat regardless of which element in the list you were finding. So the conclusion was that we end up searching the entire list anyway instead of stopping when we find the right element.

Now, having a Computer Science background, I immediately saw a potential confound here. This holds if the model is to simply iterate through the list of numbers and nothing else happens. However, if the model is to first load the list into some sort of buffer and then to iterate through it looking for the right answer, then whether this test would work or not depends greatly on how long it took to load that list into the buffer. After all, anyone who works with databases will know that often in order to find a particular element you will load the instances into memory and then iterate through them, and that if you’re trying to make that process as efficient as possible it often doesn’t make sense to try to speed up the time for iterating through the list, but instead try to reduce the time it takes to load the information into the buffer.

Wanting to play a bit with Python anyway, I finally got around to writing a Python program that demonstrates this:

import random

def memoryArrayIterateTest(initialTime, timeBetweenAccesses, timesToRun):

    # This function iterates through a five element memory list and calculates the time of access

    testList = [2, 3, 4, 5, 6]  # Start with 2 to make the difference between number and element clear
    timesList = [0, 0, 0, 0, 0]
    hitsList = [0, 0, 0, 0, 0]

    fudgeFactor = 0

    for x in range(0, timesToRun):

        number = random.randint(2, 6)
        # print(number)
        # fudgeFactor = random.randint(1,5)
        accessTime = 0
        for i in range(0, 5):

            if(testList[i] == number):

                hitsList[i] = hitsList[i] + 1
                timesList[i] = timesList[i] + initialTime + fudgeFactor + accessTime
                break

            else:

                accessTime = accessTime + timeBetweenAccesses

    # After all the runs, report the average access time for each position in the list
    for y in range(0, 5):

        if(hitsList[y] != 0):  # Let's avoid dividing by 0

            s = "The time average at " + repr(y+1) + " is: " + repr(timesList[y]/hitsList[y])
            print(s)

Essentially, what this function does is create a five-element list running from 2 to 6, select an element from that list at random, and then iterate through the list looking for it. It takes in an initial loading time, a time between accesses, and how many times you want to run it. It generates a new random target as many times as you tell it to, and then at the end of the day calculates the average access time for each element in the list.

I’ll keep my access time at 1 and run it 1000 times. Let’s start by seeing what happens when the initial loading time is also 1:

>>> memoryArrayIterateTest(1,1,1000)
The time average at 1 is: 1.0
The time average at 2 is: 2.0
The time average at 3 is: 3.0
The time average at 4 is: 4.0
The time average at 5 is: 5.0

So here, we get the nice progression, and a significant difference between the elements. So if the initial loading time is small, then we should see this sort of progression if we’re stopping when we find the element. Since we aren’t, it looks like that’s not what we do. But what happens when we say that the initial loading time is 1000?

>>> memoryArrayIterateTest(1000,1,1000)
The time average at 1 is: 1000.0
The time average at 2 is: 1001.0
The time average at 3 is: 1002.0
The time average at 4 is: 1003.0
The time average at 5 is: 1004.0

Now the time difference is insignificant. Our numbers are almost flat, percentage-wise. Now what happens if I uncomment that fudge factor line, so that on each iteration there are sometimes other factors that come into play, different each time?

>>> memoryArrayIterateTest(1000,1,1000)
The time average at 1 is: 1002.9009900990098
The time average at 2 is: 1003.8549222797927
The time average at 3 is: 1005.135
The time average at 4 is: 1006.1785714285714
The time average at 5 is: 1006.9377990430622
>>> memoryArrayIterateTest(1000,1,1000)
The time average at 1 is: 1002.9381443298969
The time average at 2 is: 1004.1609756097561
The time average at 3 is: 1005.0904522613065
The time average at 4 is: 1005.9368932038835
The time average at 5 is: 1006.969387755102
>>> memoryArrayIterateTest(1000,1,1000)
The time average at 1 is: 1002.8676470588235
The time average at 2 is: 1004.0449438202247
The time average at 3 is: 1004.9045454545454
The time average at 4 is: 1006.004854368932
The time average at 5 is: 1006.9375

Not a smoking gun — I was hoping to get wider time variances — but we do end up with some longer gaps and some shorter gaps, with some of them being essentially equal. This is probably because the random factors even out over more iterations; if I run it with only 10:

>>> memoryArrayIterateTest(1000,1,10)
The time average at 1 is: 1003.25
The time average at 2 is: 1002.0
The time average at 3 is: 1004.5
The time average at 4 is: 1005.0
The time average at 5 is: 1005.0

Then I can get the first one taking longer than the second one. So if we do enough iterations, we can indeed correct for those random factors, most of the time. We won’t, however, correct for the initial loading time, and that’s still a major confound there.

We’d need to know whether there is an initial loading time before concluding that we don’t generally stop when we find the element we want while iterating through a list, and in my view the experience of what I do when I consciously do that trumps psychological experiments unless those experiments are free of serious confounds. So I’m skeptical about those results. The biggest objection you can make is that I still do get a progression, just not a significant one, and I’d have to see if the experiment found any progression at all. Which I’m not really going to do, because this was just a minor and interesting — at least to me — demonstration of a potential confound using Python. As I hope to do more AI programming in the near future, this was a nice way to run a little experiment and see all of the potential pitfalls of doing this sort of thing.

Net Neutrality and the Core Network

May 8, 2014

Reading a tweet on Shamus Young’s site, I was directed to this youtube video by Vi Hart on Net Neutrality. And, in watching it, there are a few misconceptions in it that make sense from the perspective of someone who isn’t in a major ISP — meaning, the people who buy the hardware and maintain it to get all of that traffic from one place to another — but when you know what’s happening behind the scenes you can see that it isn’t quite right. Since I work in telecommunications myself — not at an ISP but at a company that supplies the ISPs, particularly in software that manages all of the equipment that you need to get traffic from one place to another — I thought I’d try to explain some of the things behind the scenes that I can do without, well, putting my job in jeopardy. Note that I don’t plan to say that major ISPs absolutely aren’t playing games in order to make more money, just to point out things that make the analysis and analogy misleading, and reasons why even ISPs that are playing things completely straight won’t like strict Net Neutrality.

The main analogy in the video is of a delivery company delivering books. It starts by setting up a person who is asking for delivery of books from two different companies, one a chain bookstore and one a small bookstore. The chain bookstore ships a lot more through the delivery company than the small bookstore does, and at some point the delivery company says that the chain bookstore is shipping too much stuff, so its deliveries will have to be delayed while the book from the small bookstore still ships — even if both are going to the same person. It is then suggested that the delivery company just buy more trucks, but that doesn’t appease them, and the company instead asks the chain bookstore for more money, which is presented as completely and totally unreasonable since, after all, doesn’t the person buying more stuff, or more people buying stuff, bring them more business? Then why would they want more money on top of that?

And then it gets into all sorts of stuff about the FCC that I don’t know much about. But to explain how this ends up being misleading, I first want to talk about where the complaint is. The video talks about main roads and stuff like that, but it mainly talks about driveways, which would be the last bit of fibre from the main line to your house. It also talks about ISPs simply being able to run more cable (which is why this is limited to major ISPs that do lay cable as opposed to those that simply use the existing infrastructure) to solve the problem. All of this misses the point that the complaint is not about the edge of the network — ie the part directly attached to you — but is instead about the core of the network, which is what ships massive amounts of data between cities, across countries, and around the world.

So, let’s start there. Imagine that you have 100 units of bandwidth available for any application that wants to get its data to your customers. This bandwidth has to be shared amongst all applications, and if I understand Net Neutrality properly the idea is that all applications should, ideally, be treated the same. So, let’s say that we have 10 applications that want to use that bandwidth. Ideally, we’d want all of them to use 10 units each, because then the line is used to its full capacity and everyone still gets what they need when they need it. In practice, pretty much all applications will be “bursty” in some way, busier at some times than at others (and, of course, there won’t just be 10, but let’s live with that simplification for now). But that’s an ideal breakdown.

Now, imagine that one particular application starts getting more popular or bandwidth intensive, and so starts using more than its 10 units on a regular basis. Let’s say that it starts using 30 units. This is maintainable as long as everyone else isn’t using their 10 units, their bursts are at low-usage times, or the data isn’t critically time sensitive and so it can wait for a while if everything is busy. So, for example, E-mails and texts tend to be easily scalable this way because if they start putting out too much bandwidth and things get full, all that happens is that they get delayed for a few minutes or hours until things clear out, and most of the time few will really notice. Of course, separating out these cases immediately breaks strict Net Neutrality; we have to introduce the notion of priorities to know what traffic can be delayed for a bit and what has to be sent right now.
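To make the priority idea concrete, here is a toy sketch of a link with 100 units of capacity per tick, where the time-sensitive traffic always goes out immediately and the delay-tolerant traffic simply queues until there is room. The application names and numbers are invented for illustration only.

from collections import deque

CAPACITY_PER_TICK = 100  # our imaginary link: 100 units of bandwidth per tick

# (name, units offered each tick, time-sensitive?); the numbers are invented
offered_per_tick = [
    ("video", 50, True),          # the bursty, high-priority application
    ("voice", 10, True),
    ("email", 30, False),         # delay-tolerant: can sit in a queue
    ("file transfer", 30, False),
]

backlog = deque()  # delay-tolerant units waiting for a quieter tick

for tick in range(3):
    remaining = CAPACITY_PER_TICK
    # Time-sensitive traffic goes out now, or the service degrades.
    for name, units, urgent in offered_per_tick:
        if urgent:
            remaining -= units
    # New delay-tolerant traffic joins the queue, then we drain whatever fits.
    for name, units, urgent in offered_per_tick:
        if not urgent:
            backlog.append((name, units))
    sent_later = []
    while backlog and remaining >= backlog[0][1]:
        name, units = backlog.popleft()
        remaining -= units
        sent_later.append(name)
    print("tick", tick, "low-priority sent:", sent_later, "still queued:", list(backlog))

Run it and the email and file-transfer backlog grows while the video burst lasts: nothing is dropped, it is just delayed, which is exactly the distinction that a strict treat-every-packet-identically rule has trouble expressing.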

Which leads us to Netflix. Video and voice are incredibly high priority in a network, because for them to be useful you need to make sure that the next segment of video — a packet, in IP — makes it there with a minimum of delay, at least relative to the last one you sent. If not, you get stuttering and a huge decrease in the quality of the service (in terms of the video, it gets “slow”). Voice, however, is fairly small, especially with all of the data compression that has been applied to it over the past few decades (which is the main reason why TDM and ATM networks tended to find T1-level bandwidth acceptable for phone calls, with OC3 level required for their core, both of which as far as I can tell are very small today). Video, however, uses a lot of bandwidth, and it’s bandwidth that has to get there as quickly as possible and cannot be delayed without greatly affecting the service.

So, going back to the example above, we have one application that can take up 30 to 50 units of our bandwidth — or possibly even more — and is also of the highest priority, so it will bump out everything else. Thus, what this risks — to return to the delivery truck analogy — is that the chain bookstore will fill up all of the trucks so that the small bookstore simply can’t get their books delivered, and since this is in the core and not on the edge that would be true even if they were delivering to the same person. (Part of this is because at the core no one really knows where the traffic is going to end up, and since the core is servicing all customers and is trying to move data between cities, there’s no real sense in trying to figure out who the end user is. You’re trying to get the data to London at that point, not to 123 Baker Street.) And this is obviously not a good thing.

Now, the comment is that this is increasing the business for the ISP, so why can’t they analogously simply buy more trucks? In this specific case, why can’t they lay more cable? Well, in general, laying cable’s not that easy, but even then it’s not just about laying cable. The biggest part of the expansion is buying all of the switching equipment that figures out all of the important things like how to get the data to London and what traffic has to be sent now and what can wait. This equipment is not cheap, and each of these switches can only handle a certain amount of traffic itself before you need a new one. So there’s a significant amount of capital that you have to expend to expand the network, and to do that you have to believe that that expenditure will make you more money.

But wait, doesn’t the Netflix explosion make the ISPs more money? Well, not necessarily. For many if not most people, their ISP plan budgets them a certain speed, a certain bandwidth, and a certain usage in a month. While video uses up a ton of bandwidth, most of the time that’s within the rate they’re supposed to get … and if it isn’t, then at the edge they themselves are slowed down and the problem is solved for them. So most existing customers are already paying for enough bandwidth to watch videos, even if they use all or most of it, and so won’t actually pay the ISP any more unless they go on a splurge on a limited plan … and if they notice that, they’ll cut back once they hit their limit. That doesn’t stop people from all deciding to watch a great Netflix video at the same time and flooding the core, and the ISP gets no more money from that than they are already getting. And the intermittent “use it heavily until we hit our limit and then drop it” pattern makes the expenditure worse, because they might end up with infrastructure that they need for two weeks out of a month and that doesn’t get used for the other two … and they still don’t get paid any more for having it.

Thus, the idea of charging high-priority, high-bandwidth applications — again, video in general and perhaps Netflix in particular — a fee to support additional infrastructure in the core that gets those applications the priority they need without screwing over everyone else. A gatekeeper at your driveway — the “fastlane” as the video talked about it — wouldn’t make sense, because at that point they already have one. I don’t claim that ISPs aren’t putting one there, and I’d agree that doing that isn’t sane. What they can do is allocate, out of their existing bandwidth, a fastlane in the core. That would have a similar effect to the gatekeeper at the door, but it would ensure that the high-priority, high-bandwidth applications get what they need (as long as they pay for it), that other applications get what they need, and that the ISP can tell when it needs to add more infrastructure (i.e. either the smaller applications are still crowding each other out, or those paying for the fastlane need more bandwidth to get the service they’re paying for), all inside a structure where the ISP actually does get more money as more of these services come online. But to the end user, all they’d notice is that the site was getting slow or stuttery, which looks exactly the same as if there were a gatekeeper at the edge (or the driveway).

Look, I’m as cynical about big business as the next guy. I’m not here to praise nor to bury the major ISPs. My goal here was to show the impact that services and applications like Netflix can have on a network, and so to show why maybe treating them differently isn’t so radical a notion after all. I mean, those services would indeed want to be treated differently themselves, because their traffic has to get there right away while E-mails and file transfers don’t, and so it’s also reasonable for ISPs to say that that — along with the large bandwidth requirements — gives them specific problems that they want to be able to resolve by treating that traffic differently. At the end of the day, I’m not advocating for or against Net Neutrality or the ISPs’ “fastlane” ideas, but am instead just pointing out a technological issue from the other side that might have an impact on the discussion.

Scripted Monotony …

September 18, 2013

Well, my updating on this blog has been rather low of late, which has pretty much been the norm. I hope to get back into doing more posts soon, and more regularly, but the past few weeks have been both hectic and boring. The reason they’ve been boring is because at work I’m doing very repetitive additions of lines from one file into another, with the format altered to meet the standards of the other system. Now, some people — myself included — would immediately think “Hey, can I write a script to do this for me instead of adding it all line by line manually?”. The problems were these:

1) The one file where I could write a script relatively easily to do the translation didn’t have very many entries, so doing it manually only took me a couple of days. Writing a script, testing it, and fixing any bugs in it — and there are usually bugs in your script the first time — would probably take at least a day, so there wasn’t much time saved there … and it could be a time drain if I ran into any unforeseen problems. Which leads to …

2) The other file had a lot more entries, but wasn’t as easy to write a script for. There were missing numbers in the entries and even duplicate entries, so I wouldn’t be able to parse it line by line, but would instead have had to take the number and use it directly. And the numbers didn’t align; I’d have to prepend a lot of data — although it was always the same prepend — onto the beginning of the final number. And the final text would also have to be translated from one solid variable-type name — say “thisIsOurText” — into a more human-readable form, like “This Is Our Text” (roughly the conversion sketched below) … but with acronyms and inconsistent capitalization, I couldn’t just use the caps to divide it up into words. Meaning that even after writing the script, I’d have to check over each line to make sure it was right anyway, and correct it, running into unforeseen problems, messing things up, and likely not saving much time.
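For what it’s worth, the naive version of that name conversion is only a few lines; the function name and examples below are mine, and the second example shows exactly why the acronyms were going to be the problem:

import re

# A rough sketch of the camelCase-to-words conversion described above, not the
# script I actually skipped writing. Splitting before every capital letter works
# for plain camelCase but falls apart on acronyms.
def camel_to_words(name):
    words = re.sub(r'(?<!^)(?=[A-Z])', ' ', name).split()
    return ' '.join(word.capitalize() for word in words)

print(camel_to_words("thisIsOurText"))  # This Is Our Text
print(camel_to_words("parseHTMLFile"))  # Parse H T M L File: acronyms break the rule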

Some designers would have written the script anyway, reasoning that at least it would be there for next time. I, however, don’t have that attitude, and prefer brute force unless it’s clear that the script will save a lot of time. It’s a curious form of laziness: even if the brute force method might take longer in the long run, I’m averse to putting in a lot of effort up front when, for this specific task at least, I’ll save some effort by just doing it by brute force.

A different problem is progress …

March 16, 2012

So, at work I’ve run into yet another one of the oddities of software design. I was checking out something that I’d never done before, and when I loaded it up one process — the one I needed to run — kept dying. I did a search on something else that does what I want it to do, and discovered that I wasn’t updating something that it was. I added it there, and then … every process either wouldn’t start or kept dying.

Progress!

I was getting it complaining about my definition on the North side, and so tried to fix that. Failed due to a typo. And then just before I left yesterday, I loaded it up and … it complained about the South side.

Progress!

The interesting thing is that this really counts as progress for software design. If the behaviour doesn’t change, then you haven’t gotten anywhere except that you might have eliminated one hypothesis as to what the real problem is. But if the behaviour changes, your change did something, and so that’s at least potentially progress. It stops you from simply banging your head against the wall because you can’t figure out why nothing ever changes.

In this case, it may turn out that most of the changes I made weren’t actually needed to fix my problem, except for the last one, but I likely would have had to make them eventually anyway. So, progress!